I have added a custom section to my module using the code below:
#[link_section = "my-data"]
pub static FIRST: [u8; 13] = *b"Hello, World!";
Also, in the translate_payload function in the module_environ module, I've added the code below to parse the custom section.
Payload::CustomSection(s) if s.name() == "my-data" => {
    let result = NameSectionReader::new(s.data(), s.data_offset())
        .map_err(|e| e.into())
        .and_then(|s| self.name_section(s));
    println!("read custom section error {:?}", result);
    if let Err(e) = result {
        log::warn!("failed to parse name section {:?}", e);
    }
}
However, I got the error below and couldn't fix it.
error Err(InvalidWebAssembly { message: "name entry extends past end of the code section", offset: 1124882 })
Can you please help me parse this new section?
I need to get the custom data during this step and change it.
I'm pretty sure the data you wrote is put directly in the custom section, so simply look at s.data() without doing anything with the NameSectionReader parser, which is for name sections.
Thanks, but how can I change it? My goal is to change this data while the module is being compiled and loaded into memory.
Is it possible to do that?
Custom sections are not loaded at runtime. If you want to change the size of the data, you are out of luck unless you ship the unlinked wasm files; then, when you want to change the data, change the wasm file containing the data and link everything again. If you don't need to change the size of the data, you can use #[no_mangle] and then get the address of the data by loading the value of the exported global with the same name as the static, and then change the data segment at this address (make sure to account for the fact that a data segment can specify a non-zero load address).
I only want to change the content, not the size, of the data (like Hello --> Secret).
Is it possible to apply this change right after loading the data from the binary file, in the from_binary function? I applied this change, but it didn't work: the final data shown during execution is still Hello.
yes, one could load the bytes of the module into memory, use a parser like the wasmparser crate to discover where the section resides in memory, mutate it as you describe (which shouldn't change how the module validates), and then load the module from the slice using from_binary if desired.
Note that the data in the custom section is otherwise ignored by wasmtime, so I'm not sure what that would accomplish
if you're using a data segment instead of a custom section, you could also locate and mutate it before from_binary (provided you don't change the size or offset) which would have an observable effect within the guest program
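As a rough illustration of the "locate the section, then mutate it in place" step described above, here is a minimal std-only sketch. It walks the top-level sections of a wasm binary by hand instead of using the wasmparser crate, so the helper names (read_u32_leb, find_custom_section, overwrite_custom_section) are hypothetical and not part of any library. It assumes a well-formed module and only allows same-length replacements.

```rust
use std::ops::Range;

/// Decode an unsigned LEB128 `u32` starting at `pos`.
/// Returns the value and the position just past the encoded bytes.
fn read_u32_leb(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift = 0u32;
    loop {
        // A u32 needs at most five LEB128 bytes (shifts 0, 7, ..., 28).
        if shift >= 35 {
            return None;
        }
        let byte = *bytes.get(pos)?;
        pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Find the custom section called `name` and return the byte range of its
/// payload (the bytes after the section name) within `module`.
fn find_custom_section(module: &[u8], name: &str) -> Option<Range<usize>> {
    // A module starts with the 4-byte magic "\0asm" and a 4-byte version.
    if module.len() < 8 || !module.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8;
    while pos < module.len() {
        // Each section is: id byte, LEB128 size, then `size` bytes of body.
        let id = module[pos];
        let (size, body_start) = read_u32_leb(module, pos + 1)?;
        let body_end = body_start.checked_add(size as usize)?;
        if body_end > module.len() {
            return None;
        }
        // Section id 0 is a custom section; its body starts with its name.
        if id == 0 {
            let (name_len, name_start) = read_u32_leb(module, body_start)?;
            let name_end = name_start.checked_add(name_len as usize)?;
            if name_end > body_end {
                return None;
            }
            if &module[name_start..name_end] == name.as_bytes() {
                return Some(name_end..body_end);
            }
        }
        pos = body_end;
    }
    None
}

/// Overwrite the payload of the named custom section in place.
/// Returns false if the section is missing or the lengths differ.
fn overwrite_custom_section(module: &mut [u8], name: &str, new_data: &[u8]) -> bool {
    match find_custom_section(module, name) {
        Some(range) if range.len() == new_data.len() => {
            module[range].copy_from_slice(new_data);
            true
        }
        _ => false,
    }
}
```

The patched buffer could then be handed to from_binary. As noted above, changing a custom section this way is not expected to alter what the guest sees at runtime.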
@Peter Huene Thanks for your reply.
This link is the real issue that I have. I asked there and this is the suggestion I was given.
I want to annotate a string in my Rust code and change it at load time. However, I changed the value before calling from_binary, but it didn't work. Do you know what the reason is?
What did you mean by the sentence "the custom section is otherwise ignored by wasmtime"? Do you mean it puts the data into the data section?
If I use the data section, is there any automated way to locate the data in the data segment by annotating it in the high-level code?
the data in a custom section not otherwise known to wasmtime ahead of time (i.e. anything other than the name or debug sections) is simply skipped over when parsing the module and isn't loaded into memory, so it has no impact on a guest program's execution
if the data is in the data section, you need a mechanism to locate it, so i don't know if you can put the address of the string in a custom section easily (haven't tried)
Sorry, I got confused. You said wasmtime skips the data in a custom section. If it does that, why can I use this data in my code (like printing the value of the string)?
I checked the wat format of the wasm module, and I could find the string in the .rodata section.
However, when I examined the byte slice that contains the loaded bytes, the offset of my data showed that it was placed in the custom section.
i'd have to see your guest program, but if you're printing a string in a guest program, it must be in linear memory; for a static string to be initialized, the initial value must be in a data segment from a data section (in wat, a (data ...) expression)
custom sections (should) never have any influence on the guest program execution and should be strippable with no observable change in behavior
so i suspect in your setup you have two copies of the "hello world" string data: one in a data section which is what gets put in linear memory and one in a custom section
and your modification to the copy in the custom section has no effect
I use spin and this is my simple code that returns the string in the body.
use anyhow::Result;
use std::str::from_utf8;
use spin_sdk::{
    http::{Request, Response},
    http_component,
};
/// A simple Spin HTTP component.
#[http_component]
fn hello_world(req: Request) -> Result<Response> {
    #[link_section = "my-data"]
    pub static FIRST: [u8; 13] = *b"Hello, World!";
    Ok(http::Response::builder()
        .status(200)
        .header("foo", "bar")
        .body(Some(from_utf8(&FIRST)?.into()))?)
}
I examined the byte slice, and there is only one copy of Hello, World! in it. I know that it is weird, but it made me confused too.
yeah, it should have two copies: one from the data section (the same as if you omitted the #[link_section] directive) and then another copy in the custom section, where the linker splatted the same bytes
for runtime behavior, only the former is respected
makes sense, but why is there only one Hello, World! in the bytes buffer, at the same offset as the custom section?
the #[link_section] directive on the static has no impact on how its data ends up in the data section
it may be worth stepping back to discuss what you're trying to do, then figure out whether a custom section is the best way to do it. it sounds to me like you want to have a wasm module providing an HTTP component, where the runtime environment that loads the wasm module can push some data into it before it starts handing HTTP requests off. is that right?
a simple (naive) way to make what you want work would be to prefix the string with something easily identifiable so that you could just look through every data segment for the prefix (note: data segments may be disjoint, but probably aren't in your case) and then change the postfix bytes to what you want (the guest program would have to skip the prefix bytes); then feed the modified module in memory to Wasmtime
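A minimal sketch of the prefix idea above, assuming the guest stores something like a marker followed by the real bytes in one static. For simplicity it scans the whole module buffer rather than each data segment individually, and it patches every occurrence of the marker (which would also cover a second copy in a custom section, if one exists). The function name patch_after_marker and the marker value used in the usage note are made up for illustration.

```rust
/// Overwrite the `new_payload.len()` bytes that follow each occurrence of
/// `marker` in `module`. The replacement is done in place, so sizes and
/// offsets elsewhere in the module are unchanged.
/// Returns how many occurrences were patched.
fn patch_after_marker(module: &mut [u8], marker: &[u8], new_payload: &[u8]) -> usize {
    if marker.is_empty() {
        return 0;
    }
    let mut patched = 0;
    let mut pos = 0;
    // Only consider positions where marker + payload still fit in the buffer.
    while pos + marker.len() + new_payload.len() <= module.len() {
        if module[pos..].starts_with(marker) {
            let start = pos + marker.len();
            module[start..start + new_payload.len()].copy_from_slice(new_payload);
            patched += 1;
            // Continue searching after the bytes that were just written.
            pos = start + new_payload.len();
        } else {
            pos += 1;
        }
    }
    patched
}
```

For example, with a guest static holding b"@@MARK@@Hello, World!", calling patch_after_marker(&mut bytes, b"@@MARK@@", b"Secret, World") before loading the module would swap the 13 payload bytes; the guest would read the string starting 8 bytes into the static.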
unfortunately rust prohibits casting a pointer in a constant expression, so i don't think you'll get data into a custom section that helps locate the relevant bytes faster
@Jamey Sharp This is only a simple example. As I mentioned before, I want to annotate some data in the high-level code and change it at load time. It is important to me to do that because the previous version should not be present in memory during execution. I need the address of the annotated string in the buffer.
so far it sounds to me like you just need the data to be a wasm export, so its offset in linear memory is stored somewhere that you can read it out of the binary
@Peter Huene thanks for the nice idea!
@Jamey Sharp You're right, Thanks!
indeed, one could have a function export that returns the offset in linear memory for the host to write the bytes to, then call that and modify the guest's linear memory before anything else that depends on the data is called
@Peter Huene, does it need to be a _function_ export? isn't a global export okay? I'm not clear on this part of the wasm spec.
@julia, if you can wait to overwrite memory until the module is instantiated (but still before you call any functions exported from the module), then I think you should be able to do that using only existing wasmtime APIs. if you want to write out a new .wasm file with the modified data, I suspect it's a little more tedious, but certainly feasible
yeah that'd work better, need to slap an export directive on the static
jamey's suggestion should work great for your use case
if you slap #[export_name = "foo"] on your static there, you'll see a global with the address exported
which actually you could just statically read if you want to, don't have to use the export
now that i think about it
so can still modify the module before compilation, just need to parse the exports, find the expected export for the index of the global, find the global and look at its initial value
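The "look at its initial value" step comes down to decoding a constant expression. In the wasm binary format a global's initializer is a short instruction sequence; for a static's address I would expect it to be `i32.const N` followed by `end`, though that is an assumption about what the toolchain emits. Below is a std-only sketch of decoding that one shape (the function name is hypothetical; a parser such as wasmparser exposes the same information without hand-decoding).

```rust
/// Decode a constant expression of the form `i32.const N; end`, returning N.
/// Any other shape (for example `global.get`) yields None.
fn decode_i32_const_expr(expr: &[u8]) -> Option<i32> {
    const I32_CONST: u8 = 0x41; // opcode for i32.const
    const END: u8 = 0x0b; // opcode that terminates an expression

    let (&opcode, rest) = expr.split_first()?;
    if opcode != I32_CONST {
        return None;
    }

    // The immediate is a signed LEB128 integer: 7 payload bits per byte,
    // least significant group first, high bit set on all but the last byte.
    let mut result: i64 = 0;
    let mut shift = 0u32;
    for (i, &byte) in rest.iter().enumerate() {
        // An i32 takes at most five bytes.
        if shift >= 35 {
            return None;
        }
        result |= i64::from(byte & 0x7f) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            // Bit 6 of the final byte is the sign bit; extend it upward.
            if byte & 0x40 != 0 {
                result |= -1i64 << shift;
            }
            // The expression must end right after the immediate.
            if rest.get(i + 1) != Some(&END) {
                return None;
            }
            if result < i64::from(i32::MIN) || result > i64::from(i32::MAX) {
                return None;
            }
            return Some(result as i32);
        }
    }
    None
}
```

The same shape is used for the offset expression of an active data segment, so one decoder can serve both lookups.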
once you know the string offset in linear memory, then you need to find the data segment(s) that initializes that offset and mutate the segment bytes (at the relative offset) before loading the module in Wasmtime; should be relatively straightforward to accomplish with wasmparser
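The offset arithmetic in that last step can be sketched as follows. The Segment struct and both functions are hypothetical helpers: they assume you have already collected, for each active data segment, its load address in linear memory, where its bytes start in the .wasm buffer, and its length (for example while iterating the data section with wasmparser). The sketch also assumes the static lies entirely within one segment.

```rust
use std::ops::Range;

/// One active data segment, as recorded while parsing the data section.
#[derive(Debug, Clone, Copy)]
struct Segment {
    /// Linear-memory address the segment is copied to at instantiation.
    memory_offset: u32,
    /// Position of the segment's first data byte inside the module buffer.
    file_pos: usize,
    /// Number of data bytes in the segment.
    len: usize,
}

/// Translate the linear-memory range `addr..addr + len` into the matching
/// range of the module buffer, if a single segment covers all of it.
fn file_range_for(segments: &[Segment], addr: u32, len: usize) -> Option<Range<usize>> {
    segments.iter().find_map(|seg| {
        // Relative offset of the static inside this segment; None if the
        // address lies below the segment's load address.
        let rel = addr.checked_sub(seg.memory_offset)? as usize;
        let end = rel.checked_add(len)?;
        if end <= seg.len {
            Some((seg.file_pos + rel)..(seg.file_pos + end))
        } else {
            None
        }
    })
}

/// Overwrite the initial value of the static at `addr` with `new_bytes`.
/// Returns false if no segment covers the whole range.
fn patch_static(module: &mut [u8], segments: &[Segment], addr: u32, new_bytes: &[u8]) -> bool {
    match file_range_for(segments, addr, new_bytes.len()) {
        Some(range) if range.end <= module.len() => {
            module[range].copy_from_slice(new_bytes);
            true
        }
        _ => false,
    }
}
```

Because only bytes inside an existing segment are rewritten, section sizes and segment offsets stay the same, which is the condition mentioned earlier for the module to keep validating.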
Thanks a lot!
I should read more to get the details of the idea and understand where I can apply the changes to modify the data.
I wanted to say thank you, I implemented the idea using wasmparser functions and modified the module before compilation. Thanks a lot!
julia has marked this topic as resolved.
Last updated: Dec 23 2024 at 14:03 UTC