Hi guys, I want to ask about the relation between memory.grow and physical memory consumption. On Linux, when we call memory.grow, the runtime calls mmap to reserve the virtual memory space, but Linux shouldn't actually reserve physical memory for this operation until the new pages are accessed, right?
But if the above is true, then what is the downside of just initializing the linear memory to be 4GB (the entire 32-bit space) at the beginning?
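(For illustration, a minimal sketch of the lazy-commit behavior described above, using the libc crate on Linux; this is not Wasmtime code.)

    use std::ptr;

    fn main() {
        let len = 4usize << 30; // reserve 4 GiB of virtual address space
        unsafe {
            let base = libc::mmap(
                ptr::null_mut(),
                len,
                libc::PROT_NONE, // no access allowed yet, nothing committed
                libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
                -1,
                0,
            );
            assert_ne!(base, libc::MAP_FAILED);
            // Physical pages are only allocated once a range is made
            // accessible (e.g. via mprotect) and actually written to;
            // the reservation itself consumes no physical memory.
            libc::munmap(base, len);
        }
    }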
Using an entire 4 GB section of address space per instance can quickly run into limitations. For example, on common x86-64 systems userspace only has 47 bits of total address space available, which means you can't have more than 32768 wasm instances at once, which is a pretty low limit.
Somewhat older versions of Windows limit that to 43 bits, meaning you're limited to just 2048 wasm instances.
I see, this makes sense thank you!
@Coulson Liang I think there may be several inaccuracies in your description of how the opcodes work: memory.grow does an mprotect to make memory accessible. What @Jacob Lifshay says about address space limits is also true, and something we worry about, but in practice we still typically choose the VM-based approach because the alternative is explicit bounds checks, which are a nontrivial perf impact and so usually a worse option.
Finally -- I'm not sure what you mean by "initialize the linear memory to be 4GB at the beginning", but we're constrained by Wasm semantics: the linear memory size is defined by the initial size and any grow operations, and we must trap if accesses happen that are out of bounds, so we have to actually adjust memory permissions as it grows.
A downside for an embedder of starting linear memories at 4G is that the embedder can't easily keep track of what's paged in and what isn't. The memory.grow instruction can fail, and it provides a hook for the embedder to reject a requested growth as being too big, or possibly to block the instance entirely until more memory is available. If everything is mapped in, there's no easy way to hook an event of memory being paged in.
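(Hedged sketch of the kind of grow hook described here: Wasmtime exposes this to embedders as a resource limiter. Exact method signatures have varied across versions, so treat this as approximate rather than a definitive API reference.)

    use wasmtime::ResourceLimiter;

    struct MyLimiter {
        max_memory_bytes: usize,
    }

    impl ResourceLimiter for MyLimiter {
        // Called on every memory.grow; returning Ok(false) makes the
        // guest's memory.grow report failure (-1) instead of growing.
        fn memory_growing(
            &mut self,
            _current: usize,
            desired: usize,
            _maximum: Option<usize>,
        ) -> anyhow::Result<bool> {
            Ok(desired <= self.max_memory_bytes)
        }

        // Tables get the same treatment (signature approximate).
        fn table_growing(
            &mut self,
            _current: usize,
            _desired: usize,
            _maximum: Option<usize>,
        ) -> anyhow::Result<bool> {
            Ok(true)
        }
    }

    // Hooked up with Store::limiter, e.g.:
    //   store.limiter(|data| &mut data.limiter);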
Hey all, I'm working with Coulson. Really appreciate all the replies here!
@Chris Fallin Can you possibly direct me to where virtual memory space is reserved? I'm assuming this is for the 4GB region. This makes a lot of sense since I was concerned about how remap operations would work.
Grep for mmap in the wasmtime codebase; you'll find the abstractions in the runtime crate
(I hope that doesn't sound like a glib answer, but it really is that simple: we use mmap to reserve the space, so you can trace backward from there :-) )
in particular you might want to study the "on-demand allocator" first, it's simpler than the pooling allocator
Yeah yeah, here's the unified abstraction https://github.com/bytecodealliance/wasmtime/blob/main/crates/wasmtime/src/runtime/vm/mmap.rs
And here's the underlying implementation for unix systems
https://github.com/bytecodealliance/wasmtime/blob/main/crates/wasmtime/src/runtime/vm/sys/unix/mmap.rs
With default settings the mmap happens here
The default settings for various knobs there are here
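(For illustration, a hedged sketch of how an embedder might set those knobs; the method names here are from older Wasmtime releases and have been renamed over time, so check the Config documentation for your version.)

    use wasmtime::{Config, Engine, InstanceAllocationStrategy};

    fn make_engine() -> anyhow::Result<Engine> {
        let mut config = Config::new();
        // The simpler on-demand instance allocator (the default),
        // as opposed to the pooling allocator.
        config.allocation_strategy(InstanceAllocationStrategy::OnDemand);
        // Reserve the full 4 GiB per linear memory up front ("static"
        // memories), plus a guard region, so most loads/stores need no
        // explicit bounds checks and memories never move when grown.
        config.static_memory_maximum_size(4u64 << 30);
        config.static_memory_guard_size(2u64 << 30);
        Engine::new(&config)
    }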
If you look at strace for a single linear memory you should see an mmap of anonymous PROT_NONE memory which is 8G large (2G guard before, 4G region in the middle, 2G guard after). You'll then see mprotect calls to PROT_READ | PROT_WRITE to make things read/write as memory becomes accessible.
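(A minimal sketch of that reserve-then-commit sequence using the libc crate; it mirrors the description above and is not Wasmtime's actual implementation.)

    use std::ptr;

    const WASM_PAGE: usize = 64 * 1024;

    unsafe fn reserve_and_grow_once() {
        // One big PROT_NONE reservation: 2G guard + 4G linear memory + 2G guard.
        let total = 8usize << 30;
        let base = libc::mmap(
            ptr::null_mut(),
            total,
            libc::PROT_NONE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
            -1,
            0,
        );
        assert_ne!(base, libc::MAP_FAILED);

        // Linear memory starts after the leading 2 GiB guard.
        let linear_base = (base as usize + (2usize << 30)) as *mut libc::c_void;

        // memory.grow by one wasm page: flip protections on that range only.
        let rc = libc::mprotect(linear_base, WASM_PAGE, libc::PROT_READ | libc::PROT_WRITE);
        assert_eq!(rc, 0);
    }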
Wasmtime has no bindings to mremap right now, if that's what you're looking for
memory growth which moves linear memory (which again doesn't happen by default, requires non-default settings), happens here with a simple memcpy
Cool, thank both of you so much. This all makes a lot of sense now.
So as linear memory grows, that's essentially just expanding what's accessible via mprotect. Are there any other checks being done to make sure addresses are valid, or is that just handled by the underlying OS?
I think Coulson asked the same question in a different way but just want to be sure. I believe I saw in a doc somewhere that address checking similar to how NaCl handles SFI doesn't exist because of the overhead?
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator/bounds_checks.rs should answer your questions re: bounds checks -- study the cases in there. We do have modes where we use dynamic checks instead of virtual memory permissions, it's configurable, but VM-based (we call it a "static memory") is the default
Ok thank you, will try to go through this, it's quite a lot of code. What I think you're saying though is that the default VM-based static memory just makes sure addresses are within the 4GB reserved region?
It's a little more subtle than that, I'd recommend reading the code
there are details to do with offsets on the loads/stores for example, and a "guard region"
the main bit is the seven cases with comments that show inequalities; that shouldn't be too bad to read through
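(As an illustration of the kind of inequality those cases encode, a hedged, simplified sketch; the names are illustrative and not Cranelift's. With the default static memories and a large guard region, many of these explicit checks can be elided, and out-of-bounds accesses instead fault on the PROT_NONE guard pages.)

    /// An access of `size` bytes at `index + offset` is in bounds iff
    ///     index + offset + size <= current_memory_length
    fn access_in_bounds(index: u64, offset: u64, size: u64, memory_len: u64) -> bool {
        // Checked arithmetic so index + offset + size cannot wrap around.
        index
            .checked_add(offset)
            .and_then(|end| end.checked_add(size))
            .map(|end| end <= memory_len)
            .unwrap_or(false)
    }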
You can also check out online some various settings -- https://godbolt.org/z/sxzjTsMMG
you can see there how the CLI settings affect the codegen for loads/stores
although you have to sort of manually reassemble things in your head due to lack of mapping there
Locally you can use wasmtime explore to take a look at what wasm corresponds to what asm
Alex Crichton said:
memory growth which moves linear memory (which again doesn't happen by default, requires non-default settings), happens here with a simple memcpy
That is where I saw the doc say memory.grow() will relocate the base ptr, but actually it does not (on 64-bit Linux). Can I safely assume in this case that I can have a "long lived" pointer into memory, since the base ptr will not change? Thanks a lot.
Long context: I want to avoid copying the output data from guest to host, so I tried to store a pointer to the guest's memory (the output data) in the host's "MyBuffer". When this "MyBuffer" drops it will call the dealloc func in the guest to free the memory. That instance/store may also be reused after the pointer is saved in this "MyBuffer", meaning that memory.grow() will be called.
In general it's not safe to assume the pointer won't change. If possible I'd operate under the assumption that the pointer can change whenever wasm is called. Otherwise, though, it is possible to configure wasmtime specifically to ensure that the base pointer never changes.
Alex Crichton said:
In general it's not safe to assume the pointer won't change. If possible I'd operate under the assumption that the pointer can change whenever wasm is called. Otherwise, though, it is possible to configure wasmtime specifically to ensure that the base pointer never changes.
I see. Thanks. I think if we ensure that after we get the pointer no wasm calls will be made to that instance & store (and the store won't be dropped), that pointer also won't change? A follow-up question is, would this be the canonical way to avoid copying output data from wasm? In my use case, I would like the output to be zero-copy since it is large. That is the reason for my hacking above.
Sorry to add more to this thread, one more question though: when the wasm code contains malloc and free, will those be compiled to mprotect under the default settings?
I think if we ensure that after we get the pointer no wasm calls will be made to that instance & store (and the store won't be dropped), that pointer also won't change?
Correct, yeah. You'll also need to avoid growing memory. If possible it's recommended to use the safe Memory::data API or Memory::data_mut so you don't have to worry about these concerns, but that may also not be applicable in all situations.
And yes, it's expected that embedders should be able to borrow data directly from wasm, and that's what Memory::data enables (or raw access too).
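(A hedged sketch of borrowing guest memory through the safe accessor mentioned here; the helper name and the guest_ptr/guest_len parameters are illustrative, not Wasmtime API.)

    use wasmtime::{Memory, Store};

    // Borrow a region of the guest's linear memory without copying.
    // The returned slice borrows the store, so no wasm can run (and no
    // memory.grow can happen) while it is alive.
    fn read_guest_bytes<'a, T: 'a>(
        memory: &Memory,
        store: &'a Store<T>,
        guest_ptr: usize,
        guest_len: usize,
    ) -> Option<&'a [u8]> {
        let end = guest_ptr.checked_add(guest_len)?;
        memory.data(store).get(guest_ptr..end)
    }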
When the wasm code contains malloc and free, will those be compiled to mprotect under the default settings?
If I understand your question right, I believe the answer is "sort of". The wasm code itself probably has a malloc/free, for example from wasi-libc. This is not implemented with mprotect at all since it's a pure wasm-level abstraction. The wasm code for malloc, though, probably calls the memory.grow wasm instruction at some point to allocate more memory (that's a WebAssembly-level primitive). That is implemented with mprotect in Wasmtime.
There is, however, no equivalent to freeing memory in WebAssembly. Once an instance has memory it has no means of releasing it until the entire instance is destroyed.
but that may also not be applicable in all situations.
Yes that is my case... Thank you so much
Understood, the memory can only be released after the instance is dropped. Thank you.
But I am thinking about whether we can reuse the grown linear memory. Say we have:
    #[no_mangle]
    pub extern "C" fn alloc_and_return_ptr() -> *mut u8 {
        let mut buff = vec![0u8; 1024 * 1024 * 1024];
        let ptr = buff.as_mut_ptr();
        // Leak the allocation so it stays alive; the host frees it later
        // by calling `dealloc` with the same len/align.
        std::mem::forget(buff);
        ptr
    }

    #[no_mangle]
    pub unsafe extern "C" fn dealloc(ptr: *mut u8, len: usize, align: usize) {
        std::alloc::dealloc(
            ptr,
            std::alloc::Layout::from_size_align_unchecked(len, align),
        );
    }
Both functions are compiled to WASM and exported to the host. After calling alloc_and_return_ptr, say the physical memory grows by 1GB; calling dealloc on the pointer then does not cause the physical memory to be freed. But if we then call alloc_and_return_ptr again, I assume that 1GB will be reused and physical memory will not grow? My test shows it grows a little but I don't know where that growth comes from.
I am also wondering why we cannot release the 1GB of physical memory after calling dealloc, since it is mmap under the hood. I guess the answer is wasm's linear memory requirement.
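(To tie the thread together, a hedged host-side sketch of the zero-copy read discussed above; the export names follow the guest sketch earlier, the 1 GiB length is assumed to be known out of band, and exact wasmtime API details may vary by version.)

    use wasmtime::{Instance, Store};

    fn process_guest_output<T>(
        instance: &Instance,
        store: &mut Store<T>,
    ) -> anyhow::Result<()> {
        // Hypothetical exports matching the guest code sketched above.
        let alloc = instance
            .get_typed_func::<(), u32>(&mut *store, "alloc_and_return_ptr")?;
        let memory = instance
            .get_memory(&mut *store, "memory")
            .ok_or_else(|| anyhow::anyhow!("guest does not export `memory`"))?;

        let ptr = alloc.call(&mut *store, ())? as usize;
        let len = 1usize << 30; // must match what the guest allocated

        // Borrow the output in place: no copy, but no further wasm calls
        // or memory.grow while `bytes` is alive.
        let bytes = memory
            .data(&*store)
            .get(ptr..ptr + len)
            .ok_or_else(|| anyhow::anyhow!("guest pointer out of bounds"))?;
        let checksum: u64 = bytes.iter().map(|&b| u64::from(b)).sum();
        println!("output checksum: {checksum}");
        Ok(())
    }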