Stream: git-wasmtime

Topic: wasmtime / issue #9001 `out of bounds memory access` with...


Wasmtime GitHub notifications bot (Jul 24 2024 at 03:04):

FiveMovesAhead opened issue #9001:

I have a very simple wasm function that deserializes string data from memory:

#[no_mangle]
pub fn entry_point(ptr: u32) {
    let len: usize = unsafe { *(ptr as *const u32) } as usize;
    let ptr = ptr as *const u8;
    let data = unsafe { std::slice::from_raw_parts(ptr.add(4), len) };
    let text: String = bincode::deserialize(&data).expect("Failed to deserialize");
}

I get the instance's memory and write n kb of text data into it as follows:

let memory = instance
    .get_memory(&mut store, "memory")
    .expect("Failed to find memory");
...
let serialized_text = bincode::serialize(&"A".repeat(n * 1024)).expect("Failed to serialize");
{
    let data_mut = memory.data_mut(&mut store);
    for (i, b) in (serialized_text.len() as u32)
        .to_le_bytes()
        .into_iter()
        .enumerate()
    {
        data_mut[i] = b;
    }
    for (i, b) in serialized_text.into_iter().enumerate() {
        data_mut[i + 4] = b;
    }
}

I have a loop that increments n, and at exactly 1033kb of data I get an out-of-bounds memory access error:

...
Testing with 1030kb of data
Testing with 1031kb of data
Testing with 1032kb of data
Testing with 1033kb of data
SHOULD WORK: error while executing at wasm backtrace:
    0: 0x6852 - <unknown>!<wasm function 86>
    1: 0x94b0 - <unknown>!<wasm function 96>

Caused by:
    0: memory fault at wasm address 0x41414145 in linear memory of size 0x110000
    1: wasm trap: out of bounds memory access

Here is the repo with code to replicate the issue

Am I doing anything obviously wrong? The same issue also occurs with wasmi.

Wasmtime GitHub notifications bot (Jul 24 2024 at 08:20):

bjorn3 commented on issue #9001:

You are corrupting the memory of the wasm module by overwriting its stack and global data, which then results in the wasm module accessing an invalid address. If you want to pass a string into linear memory, you will have to add a function to the wasm module that allocates memory, and then call that function from the host to determine where to write the string.
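
The numbers in the trap message are consistent with this diagnosis. The following is only a rough sketch for checking them: the helper names `ascii_view` and `host_write_end` are made up for illustration, and the 8-byte length prefix assumes bincode 1.x's default encoding, which the thread does not confirm.

```rust
/// Size of one WebAssembly linear-memory page in bytes (64 KiB).
const WASM_PAGE_SIZE: u32 = 64 * 1024;

/// Values copied from the trap message in the issue.
const FAULT_ADDR: u32 = 0x4141_4145;
const MEMORY_SIZE: u32 = 0x11_0000;

/// Hypothetical helper: view a 32-bit value as four ASCII characters,
/// if all four of its bytes are printable.
fn ascii_view(value: u32) -> Option<String> {
    let bytes = value.to_le_bytes();
    if bytes.iter().all(|b| b.is_ascii_graphic()) {
        Some(bytes.iter().map(|&b| b as char).collect())
    } else {
        None
    }
}

/// Hypothetical helper: one past the last offset the host loops write for
/// `n_kib` KiB of text. Assumes the host's own 4-byte prefix plus an 8-byte
/// bincode length prefix (an assumption about the bincode version in use).
fn host_write_end(n_kib: u32) -> u32 {
    4 + 8 + n_kib * 1024
}
```

Here `MEMORY_SIZE / WASM_PAGE_SIZE` is 17 pages, `FAULT_ADDR` lies far beyond that, and `ascii_view(FAULT_ADDR - 4)` yields `AAAA`. One plausible reading is that the guest loaded a pointer that the payload had overwritten and then accessed a field 4 bytes past it; that is an inference from the address, not something stated in the thread. Under the stated assumption, `host_write_end(1033)` is still below `MEMORY_SIZE`, which would explain why the host-side writes themselves did not fail.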

Wasmtime GitHub notifications bot (Jul 24 2024 at 08:30):

FiveMovesAhead commented on issue #9001:

Thanks for your reply, @bjorn3! I see, so I am allocating so much memory that it's overflowing into addresses I shouldn't modify.

Would you be so kind as to give me an example of how to implement the solution that you just described?

Wasmtime GitHub notifications bot (Jul 24 2024 at 14:25):

alexcrichton commented on issue #9001:

When copying data into a WebAssembly module, a common approach is to export a malloc-like function from the guest. The host invokes it to get an appropriate pointer into the guest's memory, and the guest understands that once it receives the buffer it is also responsible for deallocating it.
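
A minimal guest-side sketch of that pattern might look like the following. The names `guest_alloc`, `guest_free`, and `consume_buffer` and the `ALIGN` value are hypothetical, not taken from the issue or from Wasmtime. The export attribute is left off so the sketch also compiles natively; in a real guest crate each function would carry the same attribute as `entry_point` in the issue. Counting bytes stands in for the `bincode::deserialize` call.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

/// Alignment used for every buffer handed to the host (arbitrary for this sketch).
const ALIGN: usize = 8;

/// Hypothetical export: reserve `len` bytes in guest memory and return their address.
/// Returns null when `len` is zero or the allocation fails.
pub extern "C" fn guest_alloc(len: usize) -> *mut u8 {
    if len == 0 {
        return ptr::null_mut();
    }
    match Layout::from_size_align(len, ALIGN) {
        // SAFETY: the layout has a non-zero size.
        Ok(layout) => unsafe { alloc(layout) },
        Err(_) => ptr::null_mut(),
    }
}

/// Hypothetical export: release a buffer previously returned by `guest_alloc(len)`.
pub unsafe extern "C" fn guest_free(buf: *mut u8, len: usize) {
    if buf.is_null() || len == 0 {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(len, ALIGN) {
        // SAFETY: the caller promises `buf` came from `guest_alloc(len)`.
        unsafe { dealloc(buf, layout) };
    }
}

/// Hypothetical entry point: reads a 4-byte little-endian length prefix, inspects
/// the payload, then frees the whole buffer, because the guest owns it once the
/// host has handed it over. Returns the number of `b'A'` bytes in the payload.
pub unsafe extern "C" fn consume_buffer(buf: *mut u8) -> usize {
    let mut prefix = [0u8; 4];
    // SAFETY: the caller promises `buf` points at `4 + len` bytes from `guest_alloc`.
    unsafe { ptr::copy_nonoverlapping(buf as *const u8, prefix.as_mut_ptr(), 4) };
    let len = u32::from_le_bytes(prefix) as usize;
    let count = {
        // SAFETY: the payload starts right after the 4-byte prefix.
        let payload = unsafe { std::slice::from_raw_parts(buf.add(4) as *const u8, len) };
        payload.iter().filter(|&&b| b == b'A').count()
    };
    // SAFETY: the buffer was allocated with exactly `4 + len` bytes.
    unsafe { guest_free(buf, len + 4) };
    count
}
```

On the host side, the matching steps would be to look up the allocation export (for example with `Instance::get_typed_func`), call it with the total length, copy the bytes to the returned offset (for example with `Memory::write`), and only then call the entry point with that offset instead of writing from offset 0.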

Wasmtime GitHub notifications bot (Jul 24 2024 at 15:25):

FiveMovesAhead closed issue #9001:

Wasmtime GitHub notifications bot (Jul 24 2024 at 15:25):

FiveMovesAhead commented on issue #9001:

See here for working solution by @Robbepop

Wasmtime GitHub notifications bot (Jul 24 2024 at 16:10):

cfallin commented on issue #9001:

> I am allocating so much memory that it's overflowing into addresses I shouldn't modify.

To add a bit, in case it helps: the approach in the code above is actually not allocating memory at all (from the point of view of the guest Wasm code); rather, it's splatting data on top of the heap, and below a certain size you are getting lucky that there is nothing else there. That's why the solution is to call the guest's malloc first: the guest is in charge of managing its memory layout, so we have to ask that allocator where to put the data.
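
The "getting lucky" effect can be reproduced with a toy model. Everything in this sketch is invented for illustration: `ToyGuest`, its layout constants, and its bump allocator are not how Wasmtime or a real Rust guest lay out memory. It only shows why writing from offset 0 works for small payloads and breaks for larger ones, while asking the guest's allocator for an offset avoids the problem.

```rust
/// Offset where the toy guest keeps a pointer it relies on (hypothetical layout).
const GUEST_PTR_SLOT: usize = 64;
/// First offset of the toy heap (hypothetical layout).
const HEAP_BASE: usize = 128;

/// Toy stand-in for a guest and its linear memory.
struct ToyGuest {
    memory: Vec<u8>,
    /// Next free heap offset, known only to the guest's allocator.
    heap_top: usize,
}

impl ToyGuest {
    /// Create a guest whose data area holds a valid pointer to `HEAP_BASE`.
    /// `size` must be larger than `HEAP_BASE` for this toy layout to fit.
    fn new(size: usize) -> Self {
        let mut memory = vec![0u8; size];
        memory[GUEST_PTR_SLOT..GUEST_PTR_SLOT + 4]
            .copy_from_slice(&(HEAP_BASE as u32).to_le_bytes());
        ToyGuest { memory: memory, heap_top: HEAP_BASE }
    }

    /// Guest-side bump allocator: returns an offset that is safe to write to.
    fn alloc(&mut self, len: usize) -> Option<usize> {
        let start = self.heap_top;
        let end = start.checked_add(len)?;
        if end > self.memory.len() {
            return None;
        }
        self.heap_top = end;
        Some(start)
    }

    /// Read back the pointer the guest stored in its own data area.
    fn guest_pointer(&self) -> u32 {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.memory[GUEST_PTR_SLOT..GUEST_PTR_SLOT + 4]);
        u32::from_le_bytes(raw)
    }

    /// Host-side raw write, analogous to indexing `memory.data_mut(..)` in the issue.
    fn host_write(&mut self, offset: usize, data: &[u8]) {
        self.memory[offset..offset + data.len()].copy_from_slice(data);
    }
}
```

In this model a 32-byte write at offset 0 leaves `guest_pointer()` untouched, while a 100-byte write at offset 0 turns it into `0x41414141`. Writing the same 100 bytes at the offset returned by `alloc(100)` leaves the pointer intact.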


Last updated: Dec 23 2024 at 12:05 UTC