Stream: wasmtime

Topic: ✔ translate a custom section


view this post on Zulip julia (Oct 25 2022 at 02:55):

I have added a custom section to my module using the code below:

#[link_section = "my-data"]
pub static FIRST: [u8; 13] = *b"Hello, World!";

Also, in the translate_payload function in the module_environ module, I've added the code below to translate the custom section.

Payload::CustomSection(s) if s.name() == "my-data" => {
    let result = NameSectionReader::new(s.data(), s.data_offset())
        .map_err(|e| e.into())
        .and_then(|s| self.name_section(s));
    println!("read custom section error {:?}", result);
    if let Err(e) = result {
        log::warn!("failed to parse name section {:?}", e);
    }
}

However, I got the error below and couldn't fix it.

error Err(InvalidWebAssembly { message: "name entry extends past end of the code section", offset: 1124882 })

Can you please help me parse this new section?
I need to get the custom data during this step and change it.

view this post on Zulip bjorn3 (Oct 25 2022 at 07:34):

I'm pretty sure the data you wrote is put directly in the custom section, so simply look at s.data() without going through NameSectionReader, which is the parser for name sections.
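
A minimal sketch of why no parser is needed, not the wasmtime implementation: in the binary format a custom section is section id 0, a LEB128 size, a LEB128-prefixed name, and then the raw payload. The helper names read_u32_leb and custom_section_payload are hypothetical, and the walker below uses only the standard library instead of wasmparser.

```rust
use std::ops::Range;

/// Decode an unsigned LEB128 u32 starting at `*pos`, advancing `*pos`.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift >= 35 {
            return None; // encoding too long for a u32
        }
    }
}

/// Return the byte range (within `wasm`) of the payload of the custom
/// section called `name`, i.e. the bytes that follow the section's name.
fn custom_section_payload(wasm: &[u8], name: &str) -> Option<Range<usize>> {
    // A module starts with the 4-byte magic "\0asm" and a 4-byte version.
    if wasm.len() < 8 || &wasm[..4] != b"\0asm" {
        return None;
    }
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_u32_leb(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        if end > wasm.len() {
            return None;
        }
        if id == 0 {
            // Custom section: name length, name bytes, then the raw payload.
            let mut p = pos;
            let name_len = read_u32_leb(wasm, &mut p)? as usize;
            let name_end = p.checked_add(name_len)?;
            if name_end > end {
                return None;
            }
            if &wasm[p..name_end] == name.as_bytes() {
                return Some(name_end..end);
            }
        }
        pos = end; // skip to the next section
    }
    None
}
```

For the static above, the returned range would cover exactly the 13 bytes of "Hello, World!"; inside translate_payload the same bytes are what s.data() hands back.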

view this post on Zulip julia (Oct 25 2022 at 15:58):

Thanks, but how can I change it? My goal is to change this data while the module is being compiled and loaded into memory.
Is it possible to do that?

view this post on Zulip bjorn3 (Oct 25 2022 at 19:13):

Custom sections are not loaded at runtime. If you want to change the size of the data, you are out of luck unless you ship the unlinked wasm files; then, whenever you want to change the data, you change the wasm file containing it and link everything again. If you don't need to change the size of the data, you can use #[no_mangle], get the address of the data by loading the value of the exported global with the same name as the static, and then change the data segment at this address (make sure to account for the fact that a data segment can specify a non-zero load address).

view this post on Zulip julia (Oct 25 2022 at 19:33):

I only want to change the content, not the size, of the data (like Hello --> Secret).
Is it possible to apply these changes right after loading the data from the binary file in the from_binary function? I applied this change, but it didn't work; the data shown during execution is still Hello.

view this post on Zulip Peter Huene (Oct 25 2022 at 20:39):

yes, one could load the bytes of the module into memory, use a parser like the wasmparser crate to discover where the section resides in memory, mutate it as you describe (which shouldn't change how the module validates), and then load the module from the slice using from_binary if desired.

view this post on Zulip Peter Huene (Oct 25 2022 at 20:41):

Note that the data in the custom section is otherwise ignored by wasmtime, so I'm not sure what that would accomplish

view this post on Zulip Peter Huene (Oct 25 2022 at 20:43):

if you're using a data segment instead of a custom section, you could also locate and mutate it before from_binary (provided you don't change the size or offset) which would have an observable effect within the guest program

view this post on Zulip julia (Oct 25 2022 at 20:57):

@Peter Huene Thanks for your reply.
This link is the real issue that I have. I asked there and they gave me this suggestion.
I want to annotate a string in my Rust code and change it at load time. However, I changed the value before calling from_binary, but it didn't work. Do you know the reason?
What did you mean by the sentence "the custom section is otherwise ignored by wasmtime"? Do you mean it puts the data into the data section?

If I use a data section, is there any automated way to locate the data in the data segment by annotating it in the high-level code?

I want to annotate some strings (mark them with a specific name) in Rust code and compile the code to wasm using the wasm32-wasi target. I need to know the offsets at which they are stored in the data section. I know that clang has an annotation capability like the code below: __attribute__((annotate("fff"))) void foo() {} Is there a similar solution that lets me extract the offsets using annotations? Thanks.

view this post on Zulip Peter Huene (Oct 25 2022 at 21:00):

the data in a custom section that isn't otherwise known to wasmtime ahead of time (unlike, e.g., the name or debug sections) is simply skipped over when parsing the module and isn't loaded into memory, so it has no impact on a guest program's execution

view this post on Zulip Peter Huene (Oct 25 2022 at 21:00):

if the data is in the data section, you need a mechanism to locate it, so i don't know if you can put the address of the string in a custom section easily (haven't tried)

view this post on Zulip julia (Oct 25 2022 at 21:16):

Sorry I got confused. You said wasmtime skips the data in a custom section. If it does that, why can I use this data in my code (like printing the value of string)?

I checked the wat format of the wasm module and could find the string in the .rodata section.
However, when I examined the byte slice that contains the loaded bytes, the offset of my data showed that it was placed in the custom section.

view this post on Zulip Peter Huene (Oct 25 2022 at 21:18):

i'd have to see your guest program, but if you're printing a string in a guest program, it must be in linear memory; for a static string to be initialized, the initial value must be in a data segment from a data section (in wat, a (data ...) expression)

view this post on Zulip Peter Huene (Oct 25 2022 at 21:19):

custom sections (should) never have any influence on the guest program execution and should be strippable with no observable change in behavior

view this post on Zulip Peter Huene (Oct 25 2022 at 21:19):

so i suspect in your setup you have two copies of the "hello world" string data: one in a data section which is what gets put in linear memory and one in a custom section

view this post on Zulip Peter Huene (Oct 25 2022 at 21:19):

and your modification to the copy in the custom section has no effect

view this post on Zulip julia (Oct 25 2022 at 21:21):

I use Spin, and this is my simple code that returns the string in the body.

use anyhow::Result;
use std::str::from_utf8;
use spin_sdk::{
    http::{Request, Response},
    http_component,
};

/// A simple Spin HTTP component.
#[http_component]
fn hello_world(req: Request) -> Result<Response> {
    #[link_section = "my-data"]
    pub static FIRST: [u8; 13] = *b"Hello, World!";


    Ok(http::Response::builder()
        .status(200)
        .header("foo",  "bar")
        .body(Some(from_utf8(&FIRST)?.into()))?)
}

view this post on Zulip julia (Oct 25 2022 at 21:23):

I examined the byte slice, and there is only one copy of Hello, World! in it. I know that it is weird, but it confused me too.

view this post on Zulip Peter Huene (Oct 25 2022 at 21:24):

yeah, it should have two copies: one from the data section (the same as if you omitted the #[link_section] directive) and then another copy in the custom section, where the linker splatted the same bytes

view this post on Zulip Peter Huene (Oct 25 2022 at 21:25):

for runtime behavior, only the former is respected

view this post on Zulip julia (Oct 25 2022 at 21:27):

makes sense, but why is there only one Hello, World! in the bytes buffer, at the same offset as the custom section?

view this post on Zulip Peter Huene (Oct 25 2022 at 21:28):

the #[link_section] directive on the static has no impact on how its data ends up in the data section

view this post on Zulip Jamey Sharp (Oct 25 2022 at 21:28):

it may be worth stepping back to discuss what you're trying to do, then figure out whether a custom section is the best way to do it. it sounds to me like you want to have a wasm module providing an HTTP component, where the runtime environment that loads the wasm module can push some data into it before it starts handing HTTP requests off. is that right?

view this post on Zulip Peter Huene (Oct 25 2022 at 21:30):

a simple (naive) way to make what you want work would be to prefix the string with something easily identifiable so that you could just look through every data segment for the prefix (note: data segments may be disjoint, but probably aren't in your case) and then change the postfix bytes to what you want (the guest program would have to skip the prefix bytes); then feed the modified module in memory to Wasmtime
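
A minimal sketch of this naive approach, with assumptions stated loudly: the function name patch_after_marker and the "@@MARK@@" marker used in the usage note are hypothetical, and for brevity it scans the whole module buffer rather than walking each data segment, so the marker must be unique in the file. It overwrites bytes in place and never changes the buffer length.

```rust
/// Find the first occurrence of `marker` in `module` and overwrite the bytes
/// that follow it with `replacement`. Returns the offset of the patched bytes,
/// or `None` if the marker is missing or the replacement would run past the
/// end of the buffer. The buffer length is never changed.
fn patch_after_marker(module: &mut [u8], marker: &[u8], replacement: &[u8]) -> Option<usize> {
    if marker.is_empty() {
        return None; // `windows(0)` would panic
    }
    // Offset just past the marker, i.e. where the payload starts.
    let start = module
        .windows(marker.len())
        .position(|window| window == marker)?
        + marker.len();
    let end = start.checked_add(replacement.len())?;
    module.get_mut(start..end)?.copy_from_slice(replacement);
    Some(start)
}
```

With a guest static such as *b"@@MARK@@Hello, World!", calling patch_after_marker(&mut bytes, b"@@MARK@@", b"Secret data!!") would swap the 13 payload bytes; the caller is responsible for passing a replacement of exactly the original payload length, and the guest has to skip the 8 marker bytes when it reads the string.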

view this post on Zulip Peter Huene (Oct 25 2022 at 21:33):

unfortunately rust prohibits casting a pointer in a constant expression, so i don't think you'll get data into a custom section that helps locate the relevant bytes faster

view this post on Zulip julia (Oct 25 2022 at 21:33):

@Jamey Sharp This is only a simple example. As I mentioned before, I want to annotate some data in the high-level code and change it at load time. This is important to me because the previous version should not be present in memory during execution. I need the address of the annotated string in the buffer.

view this post on Zulip Jamey Sharp (Oct 25 2022 at 21:35):

so far it sounds to me like you just need the data to be a wasm export, so its offset in linear memory is stored somewhere that you can read it out of the binary

view this post on Zulip julia (Oct 25 2022 at 21:37):

@Peter Huene thanks for the nice idea!
@Jamey Sharp You're right, Thanks!

view this post on Zulip Peter Huene (Oct 25 2022 at 21:38):

indeed, one could have a function export that returns the offset in linear memory for the host to write the bytes to, then call that and modify the guest's linear memory before anything else that depends on the data is called

view this post on Zulip Jamey Sharp (Oct 25 2022 at 21:40):

@Peter Huene, does it need to be a _function_ export? isn't a global export okay? I'm not clear on this part of the wasm spec.
@julia, if you can wait to overwrite memory until the module is instantiated (but still before you call any functions exported from the module), then I think you should be able to do that using only existing wasmtime APIs. if you want to write out a new .wasm file with the modified data, I suspect it's a little more tedious, but certainly feasible

view this post on Zulip Peter Huene (Oct 25 2022 at 21:41):

yeah that'd work better, need to slap an export directive on the static

view this post on Zulip Peter Huene (Oct 25 2022 at 21:41):

jamey's suggestion should work great for your use case

view this post on Zulip Peter Huene (Oct 25 2022 at 21:42):

if you slap #[export_name = "foo"] on your static there, you'll see a global with the address exported

view this post on Zulip Peter Huene (Oct 25 2022 at 21:42):

which actually you could just statically read if you want to, don't have to use the export

view this post on Zulip Peter Huene (Oct 25 2022 at 21:42):

now that i think about it

view this post on Zulip Peter Huene (Oct 25 2022 at 21:43):

so can still modify the module before compilation, just need to parse the exports, find the expected export for the index of the global, find the global and look at its initial value
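
A rough standard-library sketch of that lookup, under several assumptions: the module imports no globals (imports would shift the global index space), every global up to the target uses a plain `i32.const N; end` initializer, and the names Reader and exported_global_i32 are hypothetical. In practice wasmparser would do the section parsing.

```rust
/// Minimal cursor over a byte slice (hypothetical helper for this sketch).
struct Reader<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn byte(&mut self) -> Option<u8> {
        let b = *self.bytes.get(self.pos)?;
        self.pos += 1;
        Some(b)
    }
    /// Unsigned LEB128.
    fn u32(&mut self) -> Option<u32> {
        let (mut result, mut shift) = (0u32, 0);
        loop {
            let b = self.byte()?;
            result |= u32::from(b & 0x7f) << shift;
            if b & 0x80 == 0 {
                return Some(result);
            }
            shift += 7;
            if shift >= 35 {
                return None;
            }
        }
    }
    /// Signed LEB128, as used by the `i32.const` immediate.
    fn i32(&mut self) -> Option<i32> {
        let (mut result, mut shift) = (0i64, 0);
        loop {
            let b = self.byte()?;
            result |= i64::from(b & 0x7f) << shift;
            shift += 7;
            if b & 0x80 == 0 {
                if b & 0x40 != 0 {
                    result |= -1i64 << shift; // sign-extend
                }
                return Some(result as i32);
            }
            if shift >= 35 {
                return None;
            }
        }
    }
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let s = self.bytes.get(self.pos..self.pos.checked_add(n)?)?;
        self.pos += n;
        Some(s)
    }
}

/// Find the exported global `name` and return its `i32.const` initial value,
/// which for an exported static is its address in linear memory.
fn exported_global_i32(wasm: &[u8], name: &str) -> Option<i32> {
    if wasm.get(..4)? != b"\0asm" {
        return None;
    }
    let mut r = Reader { bytes: wasm, pos: 8 };
    let (mut global_index, mut globals) = (None, None);
    while r.pos < wasm.len() {
        let id = r.byte()?;
        let size = r.u32()? as usize;
        let body = r.take(size)?;
        match id {
            6 => globals = Some(body), // global section
            7 => {
                // Export section: (name, kind, index) entries.
                let mut e = Reader { bytes: body, pos: 0 };
                for _ in 0..e.u32()? {
                    let len = e.u32()? as usize;
                    let export_name = e.take(len)?;
                    let (kind, index) = (e.byte()?, e.u32()?);
                    // Export kind 3 is a global.
                    if kind == 3 && export_name == name.as_bytes() {
                        global_index = Some(index);
                    }
                }
            }
            _ => {}
        }
    }
    let mut g = Reader { bytes: globals?, pos: 0 };
    let count = g.u32()?;
    let target = global_index?;
    for i in 0..count {
        let (_valtype, _mutable) = (g.byte()?, g.byte()?);
        // Only the simple `i32.const N; end` initializer is handled.
        if g.byte()? != 0x41 {
            return None;
        }
        let value = g.i32()?;
        if g.byte()? != 0x0b {
            return None;
        }
        if i == target {
            return Some(value);
        }
    }
    None
}
```

The returned value is an address in linear memory, not an offset in the file; it still has to be mapped onto the data segment that initializes it.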

view this post on Zulip Peter Huene (Oct 25 2022 at 21:50):

once you know the string offset in linear memory, then you need to find the data segment(s) that initializes that offset and mutate the segment bytes (at the relative offset) before loading the module in Wasmtime; should be relatively straightforward to accomplish with wasmparser
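
A small sketch of the address arithmetic involved, assuming the active data segments have already been collected by a parser such as wasmparser. The Segment struct and file_offset_for are hypothetical names, and the sketch only handles a string that lies entirely inside one segment.

```rust
/// An active data segment as a parser might report it
/// (hypothetical struct for this sketch).
struct Segment {
    /// Linear-memory address the segment is copied to at instantiation.
    load_addr: u32,
    /// Offset of the segment's bytes within the .wasm file.
    file_offset: usize,
    /// Number of bytes in the segment.
    len: usize,
}

/// Translate the linear-memory range `addr..addr + len` into an offset in the
/// module file, if a single segment initializes that whole range.
fn file_offset_for(segments: &[Segment], addr: u32, len: usize) -> Option<usize> {
    segments.iter().find_map(|seg| {
        // Offset of the address relative to the start of this segment;
        // segments usually have a non-zero load address.
        let rel = addr.checked_sub(seg.load_addr)? as usize;
        let end = rel.checked_add(len)?;
        (end <= seg.len).then(|| seg.file_offset + rel)
    })
}
```

The resulting file offset is where the same-length replacement bytes would be written before the buffer is handed to from_binary.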

view this post on Zulip julia (Oct 25 2022 at 22:07):

Thanks a lot!
I should read more to get the details of the idea and understand where I can apply the changes to modify the data.

view this post on Zulip julia (Oct 28 2022 at 04:42):

I wanted to say thank you, I implemented the idea using wasmparser functions and modified the module before compilation. Thanks a lot!

view this post on Zulip Notification Bot (Oct 28 2022 at 04:42):

julia has marked this topic as resolved.


Last updated: Oct 23 2024 at 20:03 UTC