I am interested in learning more about the wasmtime linear memory for a project. I have been going through memory.rs and mmap.rs a little, but I think I am lacking a bigger picture. Are there any online resources you can recommend (documentation going into more detail)?
A more specific question: What is the difference between the accessible_size and mapping_size (called for instance in accessible_reserved)? As far as I understand, the accessible size represents memory that is guaranteed to be accessible/usable, so it should always be allowed to dereference the pointer to the regions that are accessible. I am not sure what mapping_size is for.
Unfortunately there's not really documentation besides what's in the code, so the best way to know how things work is to study the implementation -- though we're happy to answer questions here too
accessible_size is indeed the size of memory that is legal to access -- it corresponds to the size of the Wasm heap (which can grow during runtime)
mapping_size is, as the name suggests, the size of the total memory region we reserve. This can be larger than accessible_size because we want to allow growth without relocating the heap
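As a rough illustration of the relationship described above, here is a minimal Rust sketch. The `ReservedRegion` type and its methods are hypothetical names made up for this example; they only model the two sizes and are not wasmtime's actual mmap type or API.

```rust
/// Hypothetical model of a reserved mapping (NOT wasmtime's real type).
/// Invariant: accessible_size <= mapping_size.
#[derive(Debug)]
struct ReservedRegion {
    /// Bytes that are legal to access right now (the current Wasm heap size).
    accessible_size: usize,
    /// Total bytes of address space reserved up front.
    mapping_size: usize,
}

impl ReservedRegion {
    /// Returns None if the accessible part would not fit in the reservation.
    fn new(accessible_size: usize, mapping_size: usize) -> Option<Self> {
        if accessible_size <= mapping_size {
            Some(Self { accessible_size, mapping_size })
        } else {
            None
        }
    }

    /// Grow the accessible part without moving the base address.
    /// Returns false when the reservation is too small, which is the case
    /// where a real implementation would have to relocate the heap.
    fn grow_in_place(&mut self, delta: usize) -> bool {
        match self.accessible_size.checked_add(delta) {
            Some(new_size) if new_size <= self.mapping_size => {
                self.accessible_size = new_size;
                true
            }
            _ => false,
        }
    }

    /// True if the byte range [offset, offset + len) is fully accessible.
    fn is_accessible(&self, offset: usize, len: usize) -> bool {
        offset
            .checked_add(len)
            .map_or(false, |end| end <= self.accessible_size)
    }
}
```

In this model, growing only moves the boundary between the accessible and the merely reserved part, so pointers into the heap stay valid as long as the reservation is large enough.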
Ok, thanks. What is the purpose/use of the pre_guard_bytes?
It's a configuration for guaranteed-unmapped memory before the start of linear memory. It's a small defense-in-depth mechanism against possible compiler bugs that accidentally access memory before the start of linear memory; otherwise it has no runtime effect
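To make the layout easier to picture, here is a small Rust sketch that classifies an address relative to the heap base. The `Layout` struct, the `Region` enum and the `classify` function are hypothetical names for this example, and the exact layout (pre-guard, then accessible bytes, then inaccessible trailing bytes) is an assumption based on the description above rather than a copy of wasmtime's code.

```rust
/// Hypothetical description of one linear-memory mapping (assumed layout):
/// [ pre-guard | accessible | trailing inaccessible (reserved + guard) ]
struct Layout {
    pre_guard_bytes: u64,
    accessible_bytes: u64,
    trailing_inaccessible_bytes: u64,
}

#[derive(Debug, PartialEq)]
enum Region {
    /// Unmapped bytes before the heap base: an access here faults.
    PreGuard,
    /// The Wasm heap itself: an access here is legal.
    Accessible,
    /// Reserved or guard bytes after the heap: an access here faults.
    TrailingGuard,
    /// Not part of this mapping at all; nothing is guaranteed.
    Outside,
}

/// Classify an address given as a signed offset from the heap base.
fn classify(layout: &Layout, rel: i64) -> Region {
    if rel < 0 {
        // Negative offsets are "before linear memory".
        if rel.unsigned_abs() <= layout.pre_guard_bytes {
            Region::PreGuard
        } else {
            Region::Outside
        }
    } else {
        let off = rel as u64;
        let end_of_mapping = layout
            .accessible_bytes
            .saturating_add(layout.trailing_inaccessible_bytes);
        if off < layout.accessible_bytes {
            Region::Accessible
        } else if off < end_of_mapping {
            Region::TrailingGuard
        } else {
            Region::Outside
        }
    }
}
```

A buggy address computation that ends up slightly below the heap base lands in `PreGuard` and faults instead of silently touching unrelated memory, which is the defense-in-depth effect mentioned above.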
Thanks! Another question: The bounds checking that ensures nothing is accessed beyond the bounds of the linear memory is done both at compile time and runtime, right? So far, I found validate_bounds in runtime/src/instance.rs (runtime) and bounds_check_and_compute_addr in cranelift/wasm/src/code_translator/bounds_checks.rs (compile-time). Are there any more locations relevant to bounds checking?
Runtime bounds are really validated by code generated by what you've listed as "compile-time" checks. Since addresses aren't known until runtime, it's not really possible to do bounds-checking purely at compile-time. In more detail, we have two kinds of bounds-checking, "dynamic" and "static" (the names may not be the best, but that's what they are). Static bounds-checking is implemented by mapping a virtual-memory region to be accessible only as far as the memory's length, and a "guard region" after, so if the guest accesses out-of-bounds, we get (and catch) a SIGSEGV. The guard needs to be as large as the 32-bit offset can allow
and then dynamic is what you describe, with actual comparison operators
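For reference, a "dynamic" check with actual comparison operators can be sketched in Rust roughly as follows. This is an illustration of the idea only: `dynamic_bounds_check` and `OutOfBounds` are hypothetical names, and the real check is emitted as Cranelift IR by the code translator, not written as a Rust function like this.

```rust
/// Trap marker for this sketch (hypothetical; not wasmtime's trap type).
#[derive(Debug, PartialEq)]
struct OutOfBounds;

/// Explicitly compare the end of the access against the current heap length.
/// On success, returns the offset from the heap base at which to access.
fn dynamic_bounds_check(
    index: u64,       // dynamic index operand from the Wasm program
    offset: u64,      // static offset immediate of the load/store
    access_size: u64, // number of bytes read or written
    heap_len: u64,    // current accessible size of the linear memory
) -> Result<u64, OutOfBounds> {
    // With 64-bit indices the additions themselves can overflow, so they
    // must be checked; otherwise a wrapped address could pass the compare.
    let start = index.checked_add(offset).ok_or(OutOfBounds)?;
    let end = start.checked_add(access_size).ok_or(OutOfBounds)?;
    if end <= heap_len {
        Ok(start)
    } else {
        Err(OutOfBounds)
    }
}
```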
Where in the code is this "guard region" implemented? Is it all of those special cases in bounds_checks.rs?
it's not really a single line of code we can point at; it's an overall design
the memory map creates the guard region
and then we compile in a way that has no dynamic bounds check, but rather adds a 32-bit offset
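The arithmetic behind that design can be sketched in Rust as below. The function names and the way the sizes are passed in are assumptions for this example; the sketch only shows why a 32-bit index plus a 32-bit offset cannot escape a large enough reservation plus guard, and it is not the logic wasmtime actually runs.

```rust
/// Offset from the heap base for a wasm32 access: the 32-bit index is
/// zero-extended and the 32-bit static offset is added, with no comparison.
fn static_effective_offset(index: u32, static_offset: u32) -> u64 {
    u64::from(index) + u64::from(static_offset)
}

/// True if every possible 32-bit index, combined with this static offset and
/// access size, stays inside the reserved bytes plus the guard bytes. In that
/// case an out-of-bounds access can only hit inaccessible pages and fault.
fn guard_covers_all_accesses(
    reserved_bytes: u64, // address space reserved for the heap (hypothetical parameter)
    guard_bytes: u64,    // inaccessible bytes after the reservation (hypothetical parameter)
    static_offset: u32,
    access_size: u64,
) -> bool {
    // Worst case: the largest index the guest can produce.
    let worst_case_end =
        static_effective_offset(u32::MAX, static_offset).checked_add(access_size);
    let mapped_end = reserved_bytes.checked_add(guard_bytes);
    match (worst_case_end, mapped_end) {
        (Some(end), Some(limit)) => end <= limit,
        _ => false,
    }
}
```

For example, with 4 GiB reserved and a 2 GiB guard, an access with a small static offset is covered, while one whose static offset exceeds the guard is not and would need an explicit check instead.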
Maybe some more context on why I am asking these questions: In a previous thread, I already mentioned that we are working on a prototype for adding MTE to wasm(time) for increased memory safety (it's part of a software stack that also involves/requires llvm to do some analysis). MTE requires aarch64 and 64 bit pointers, so we had to adapt wasmtime to use that, and that worked well. Since we no longer only have 32 bit addresses, we had to insert 64 bit runtime out of bounds checks. Now, we were thinking of a way to remove the overhead of these bounds checks. We came up with the idea of replacing these runtime bounds checks by using MTE (MTE adds tag bits to the upper bits of addresses) as well, by tagging the entire linear memory itself (with the stg instruction) and all pointers to the linear memory. Then, MTE would trap on a tag mismatch at runtime. We already realized that we should only be tagging the accessible linear memory (our changes here were mostly made to mmap.rs and memory.rs). We've also already modified the bounds checks in the functions/files that I mentioned in the previous message. However, it seems like we have missed adjusting some code related to the bounds checks (not sure if it's the "dynamic" or "static" bounds-checking you mentioned), since I still get an out of bounds exception when running a simple test program, probably because we haven't masked out the tag bits somewhere.
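To illustrate the suspected failure mode, here is a small Rust sketch of how an unmasked tag can make a comparison-based bounds check fail. It assumes the MTE logical tag sits in address bits 59:56; the helper names (`with_tag`, `strip_tag`, `tag_of`, `in_bounds`) are hypothetical and do not correspond to functions in wasmtime or in the prototype.

```rust
/// Assumed position of the 4-bit MTE logical tag: address bits 59:56.
const TAG_SHIFT: u32 = 56;
const TAG_MASK: u64 = 0xF << TAG_SHIFT;

/// Place a 4-bit tag into the upper bits of an address or index.
fn with_tag(addr: u64, tag: u8) -> u64 {
    (addr & !TAG_MASK) | (u64::from(tag & 0xF) << TAG_SHIFT)
}

/// Remove the tag so the value can be compared against sizes and bounds.
fn strip_tag(addr: u64) -> u64 {
    addr & !TAG_MASK
}

/// Extract the 4-bit tag from an address or index.
fn tag_of(addr: u64) -> u8 {
    ((addr & TAG_MASK) >> TAG_SHIFT) as u8
}

/// Plain comparison-based bounds check on an untagged index.
fn in_bounds(index: u64, access_size: u64, heap_len: u64) -> bool {
    index
        .checked_add(access_size)
        .map_or(false, |end| end <= heap_len)
}
```

With a nonzero tag, the tagged index is a huge number, so `in_bounds(tagged, 8, 65536)` is false even though `in_bounds(strip_tag(tagged), 8, 65536)` is true. Any remaining check that compares a tagged value without masking would report out of bounds in this way.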
you may be interested in this issue and its discussion as well: https://github.com/bytecodealliance/wasmtime/issues/6094
when you say you had to insert 64 bit addresses, do you mean you switched to using wasm64?
because wasm32 loads/stores already end up as 64 bit addresses after translation, so I am a bit confused
Yes, we switched to wasm64.
Last updated: Dec 23 2024 at 13:07 UTC