I am caching an (instance, store) tuple. I want to minimize allocations as much as possible, so for every Wasm exported function call I preallocate a large chunk of memory, and any time the host wants to send data to the module, this chunk is used. I am thinking of extending this to the lifetime of the instance; is that possible? My question is: if I allocate memory while caching and store its pointer, will the allocated memory remain valid across subsequent Wasm function calls if I use the same instance and store?
That sounds like a possible ABI. It's completely up to you to decide how your host code and your guest code cooperate -- if they both agree to use a single long-lived buffer to exchange data, that's fine.
Note that if the Wasm memory grows, the host-side address (the actual native address in the process) may change. So you would want to keep around the Wasm/guest-side address and access guest memory via the safe Rust APIs when needed.
Thanks @Chris Fallin for the response. My only doubt after reading the above message is: how will new allocators inside the Wasm module know about memory already allocated in previous runs? I assume each run of an exported Wasm function would set up a new allocator, right?
It's hard to really say more -- this design is up to you. You've specified that you'll pre-allocate guest memory -- presumably by calling a function in the guest? Maybe you want to pass a pointer to that same buffer each time you call the guest, then, I'm not sure. Or perhaps you could use global state in the guest.
The distinction I'm trying to draw is that this is purely a contract issue between your code (host side) and your code (guest side); Wasmtime doesn't care how you allocate guest memory; so you can invent whatever convention you want. E.g. "each run ... would setup new allocator right?" is a question we can't answer, because you're asking us what your code is doing. You can make it do whatever you want!
If you have questions about specific existing guest language runtimes and how they might interact with all this, we can try to help. With wasi-libc for example, you probably want to call the guest's malloc, then keep that pointer around and reuse it by passing it into each call.
ohh okay, I understand what you're saying @Chris Fallin. My bad; to clarify, the guest language is Rust, compiled to the wasm32-unknown-unknown target, and I am exporting a function called __new which allocates a ManuallyDrop<Vec<u8>>. I pass the pointer to it around when passing data from host to guest
so specifically my question would be then, whether this ptr would be valid and pointing to the same allocated buffer when I execute subsequent WASM functions using the same instance and store ?
Gotcha -- yes, the Wasm pointer will continue to be valid as long as the instance is alive. Note this is an offset into the Wasm heap. The host-side pointer that corresponds to where that data is stored in the host process may change, because the storage location of the Wasm heap may move when we resize it. But you can always use the Wasm pointer together with the Store and Memory to get at the current contents.
So to summarize: store the Instance, Store, and a u32 corresponding to the Wasm address. Hopefully that makes sense?
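A minimal sketch of why the Wasm-side offset stays valid while the host-side address may not. This is a toy model using only the standard library, not the Wasmtime API: `ToyMemory` and its methods are hypothetical names, and `grow` deliberately moves the storage to mimic a heap that relocates on resize. In real host code the analogous step would be to resolve the stored `u32` against the current `Memory` through the safe accessors each time, rather than caching a native pointer.

```rust
/// Toy stand-in for a Wasm linear memory (hypothetical, NOT the Wasmtime API).
pub struct ToyMemory {
    bytes: Vec<u8>,
}

/// Size of one Wasm page in bytes.
const PAGE_SIZE: usize = 65536;

impl ToyMemory {
    pub fn new(pages: usize) -> Self {
        ToyMemory { bytes: vec![0u8; pages * PAGE_SIZE] }
    }

    /// Grow by `delta` pages, copying into a fresh allocation so that the
    /// host-side address changes, as it may when a real heap is resized.
    pub fn grow(&mut self, delta: usize) {
        let mut bigger = vec![0u8; self.bytes.len() + delta * PAGE_SIZE];
        bigger[..self.bytes.len()].copy_from_slice(&self.bytes);
        self.bytes = bigger;
    }

    /// Native address of the start of the heap; only valid until the next grow.
    pub fn host_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }

    /// Bounds-checked write at a guest offset.
    pub fn write(&mut self, offset: u32, data: &[u8]) -> Result<(), &'static str> {
        let start = offset as usize;
        let end = start.checked_add(data.len()).ok_or("offset overflow")?;
        self.bytes
            .get_mut(start..end)
            .ok_or("out of bounds")?
            .copy_from_slice(data);
        Ok(())
    }

    /// Bounds-checked read at a guest offset.
    pub fn read(&self, offset: u32, len: usize) -> Result<&[u8], &'static str> {
        let start = offset as usize;
        let end = start.checked_add(len).ok_or("offset overflow")?;
        self.bytes.get(start..end).ok_or("out of bounds")
    }
}
```

With this model, writing at some offset, growing the memory, and reading at the same offset returns the same bytes, even though `host_ptr()` reports a different address before and after the grow.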
If you don't need this buffer to be resizeable (and note that if you do resize it, the old pointer may become invalid), consider using Box<[u8]>, plus Box::into_raw to turn it into a raw pointer you can pass to the host. That pointer should remain valid until the guest uses Box::from_raw to reclaim (and possibly drop) it.
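A rough guest-side sketch of the Box<[u8]> suggestion, assuming a fixed-size buffer. The function names `buffer_new` and `buffer_free` are made up for illustration; in a real wasm32-unknown-unknown guest they would additionally be exported (for example as `extern "C"` functions) so the host can call them.

```rust
/// Hypothetical guest helper: allocate a zeroed, fixed-size buffer and hand
/// ownership to the caller as a raw pointer. The buffer is not freed until
/// `buffer_free` is called with the same pointer and length.
pub fn buffer_new(len: usize) -> *mut u8 {
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    // Box::into_raw yields *mut [u8]; cast to a thin pointer to pass around.
    Box::into_raw(boxed) as *mut u8
}

/// Hypothetical guest helper: reclaim and drop a buffer from `buffer_new`.
///
/// # Safety
/// `ptr` and `len` must come from a single earlier `buffer_new(len)` call,
/// and the buffer must not have been freed already.
pub unsafe fn buffer_free(ptr: *mut u8, len: usize) {
    unsafe {
        let slice = std::ptr::slice_from_raw_parts_mut(ptr, len);
        drop(Box::from_raw(slice));
    }
}
```

Because a boxed slice never reallocates, the pointer returned by `buffer_new` keeps referring to the same bytes for as long as the guest does not call `buffer_free`.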
Or even just use alloc directly if you only ever plan to access the buffer via the raw pointer.
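For the raw-allocation variant, a sketch along these lines might work, assuming the buffer is only ever touched through the raw pointer. `raw_buffer_new` and `raw_buffer_free` are hypothetical names, and the memory returned by `alloc` is uninitialized, so it has to be written before it is read.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Hypothetical guest helper: allocate `len` uninitialized bytes directly
/// from the global allocator.
pub fn raw_buffer_new(len: usize) -> *mut u8 {
    // Zero-sized layouts must not be passed to `alloc`.
    assert!(len > 0, "zero-sized buffers are not supported in this sketch");
    let layout = Layout::array::<u8>(len).expect("buffer length too large");
    let ptr = unsafe { alloc(layout) };
    if ptr.is_null() {
        handle_alloc_error(layout);
    }
    ptr
}

/// Hypothetical guest helper: release a buffer from `raw_buffer_new`.
///
/// # Safety
/// `ptr` and `len` must match one earlier `raw_buffer_new(len)` call, and
/// the buffer must not have been freed already.
pub unsafe fn raw_buffer_free(ptr: *mut u8, len: usize) {
    let layout = Layout::array::<u8>(len).expect("buffer length too large");
    unsafe { dealloc(ptr, layout) };
}
```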
ohh much thanks @Chris Fallin for clearing this up! I guess the allocator and globals of the guest binary would belong to the Instance I presume that's why it remembers the allocated memory in subsequent exported function calls ?
thanks @Joel Dice for this detail. I should definitely put a check and evict the stored offset when memory resizes! If I understand it correctly the function is memory.grow(caller.as_context_mut(), num_pages as u64), right? This is the only way memory might resize, correct?
spino17 said:
ohh much thanks @Chris Fallin for clearing this up! I guess the allocator and globals of the guest binary would belong to the Instance I presume that's why it remembers the allocated memory in subsequent exported function calls ?
Basically yeah -- to make it more precise, a malloc-like allocator typically keeps its state in global variables (in the C/Rust sense) which for a Wasm target are compiled to accesses to specific addresses in the Wasm heap, and in data structures pointed to from those variables. So malloc's "next block to allocate from" data might be a Block* at some arbitrary Wasm address 0x1020 or something like that. As long as the instance exists, the data in the heap remains there; so when you next call into the guest, it has the same malloc state.
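To make that concrete, here is a toy model (standard library only, all names hypothetical) in which the allocator's "next free" word lives at address 0x1020 inside the instance's own heap. Each call reads and updates that word, so the state carries over from one call to the next for as long as the instance exists. Real allocators are far more involved, and this sketch does no out-of-memory checking.

```rust
/// Address in the toy heap where the allocator keeps its "next free" pointer
/// (arbitrary, echoing the 0x1020 example).
const NEXT_PTR_ADDR: usize = 0x1020;
/// First address the toy allocator hands out (arbitrary).
const HEAP_START: u32 = 0x2000;

/// Toy stand-in for a Wasm instance: just one page of linear memory.
pub struct ToyInstance {
    memory: Vec<u8>,
}

impl ToyInstance {
    /// Instantiate: initialize the allocator's global inside the heap.
    pub fn new() -> Self {
        let mut memory = vec![0u8; 65536];
        memory[NEXT_PTR_ADDR..NEXT_PTR_ADDR + 4].copy_from_slice(&HEAP_START.to_le_bytes());
        ToyInstance { memory }
    }

    /// One "exported function call": bump-allocate `size` bytes, rounded up
    /// to a multiple of 8, and return the guest offset of the block.
    pub fn call_malloc(&mut self, size: u32) -> u32 {
        let mut word = [0u8; 4];
        word.copy_from_slice(&self.memory[NEXT_PTR_ADDR..NEXT_PTR_ADDR + 4]);
        let ptr = u32::from_le_bytes(word);
        let next = ptr + ((size + 7) & !7);
        // Persist the updated allocator state back into the heap.
        self.memory[NEXT_PTR_ADDR..NEXT_PTR_ADDR + 4].copy_from_slice(&next.to_le_bytes());
        ptr
    }
}
```

Successive calls on the same `ToyInstance` return increasing offsets, while a freshly created instance starts again from `HEAP_START`, which mirrors the difference between reusing an instance and instantiating a new one.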
memory.grow(caller.as_context_mut(), num_pages as u64), right? This is the only way memory might resize, correct?
The main way memory will grow is usually by action of the guest -- the memory.grow Wasm opcode grows the heap. A malloc implementation will use this under the hood to grow memory as more is needed.
so @Chris Fallin, if the guest resizes the memory, then my stored offset might become invalid and I have no way to know it?
The "offset" (offset into heap, Wasm pointer) is still valid; the host-side pointer may not be. This bit from my earlier message above is hopefully useful:
the Wasm pointer will continue to be valid as long as the instance is alive. Note this is an offset into the Wasm heap. The host-side pointer that corresponds to where that data is stored in the host process may change, because the storage location of the Wasm heap may move when we resize it. But you can always use the Wasm pointer together with the Store and Memory to get at the current contents.
ahh ohh okay, got it! yeah, I get this offset by calling a Wasm exported function called __new, which allocates a ManuallyDrop<Vec<u8>> and returns its pointer, and this is the pointer I am storing along with the instance and store, so I believe this will still be valid
Yes. The key bit to realize is that the two sides (host and guest) have different notions of "pointer". To the code inside the Wasm module, the "offset into the heap" is the pointer: this u32 is what Rust or C pass around as actual memory addresses. It's only outside the Wasm module, in the host code, that you see it as an "offset". Wasm pointers (offsets) are as stable as the guest's memory allocation code defines them to be.
so this means that if I use Wasm pointers (offsets) to index into MemoryView at any point in time, before or after a resize, it will be pointing to the same expected memory buffer?
given I use only the safe APIs available
Yes, exactly
ahh this makes the whole picture clear about this and Wasmtime in general ! much thanks @Chris Fallin for keeping up the patience to answer these queries, really you have helped me avoiding weeks of head-scratching. Much appreciated and thanks again :)
No problem, best of luck!
Last updated: Dec 23 2024 at 13:07 UTC