I think a typical way for the host to pass data into the guest is to first call an exported alloc to allocate some memory in the guest, then copy the data into it as input. After the host calls some other exported functions to do computations over the input, it can call an exported dealloc to free the input in guest memory.
My question is: is it OK for the host not to call the exported dealloc? Instead, the functions doing those computations would take "ownership" of the input in guest memory (e.g., by unsafely wrapping those bytes in a Vec) and decide whether to free it. I tried this approach, but the guest memory seems to have been corrupted and errors occurred.
The reason for this requirement is that those "computation functions" may or may not pass the input through to the output zero-copy (the output still lives in guest memory), so having the host blindly free the input memory causes bugs.
All of the above is in the context of the Wasmtime Rust API.
The details will depend on the guest language but in general it should be fine to pass responsibility for deallocation to the guest.
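As a rough illustration of handing deallocation to the guest, here is a minimal guest-side sketch in Rust. The names alloc, dealloc, and sum_and_consume are hypothetical, and a real guest would also need to export them (for example with no_mangle). It assumes the host passes back exactly the pointer and length it got from alloc; a mismatch between the layout used to allocate and the layout used to free is one possible source of corrupted guest memory.

```rust
use std::ptr;

/// Hypothetical guest export: allocate `len` zeroed bytes and leak them to the host.
/// A boxed slice is used so the allocation size is exactly `len`.
pub extern "C" fn alloc(len: usize) -> *mut u8 {
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    Box::into_raw(boxed) as *mut u8
}

/// Hypothetical guest export: free a buffer the host never handed to a consumer.
///
/// Safety: `ptr` and `len` must come from a single earlier `alloc(len)` call,
/// and the buffer must not have been freed or consumed already.
pub unsafe extern "C" fn dealloc(ptr: *mut u8, len: usize) {
    // Rebuild the box with the same length it was created with, then drop it.
    drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)) });
}

/// Hypothetical computation that takes ownership of its input.
/// After this returns, the host must NOT call `dealloc` for the same buffer.
///
/// Safety: same requirements as `dealloc`.
pub unsafe extern "C" fn sum_and_consume(ptr: *mut u8, len: usize) -> u32 {
    // Reclaim ownership inside the guest; the Vec frees the buffer on drop.
    let input: Vec<u8> =
        unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)) }.into_vec();
    input.iter().map(|&b| u32::from(b)).sum()
}
```

With this split, the host calls alloc, copies the input in, and then calls either a consuming function or dealloc, never both for the same buffer.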
Note that any time the guest executes a memory.grow instruction and grows memory, any pointers the host might have to guest memory might become invalid. Also, it's quite dangerous to use e.g. Vec::from_raw_parts in the host using a pointer to guest memory; not only can memory.grow invalidate that pointer, but if the Vec is dropped, the host will try to deallocate memory as if it had been allocated by the Rust allocator in the host, which is certainly not the case.
As others have mentioned elsewhere, it is much better for the host to hold on to the offset into guest memory rather than a pointer. The host can then turn that offset into a pointer by adding it to the current guest memory base immediately prior to dereferencing it, which avoids the above issues.
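To make the offset-versus-pointer point concrete, here is a small self-contained sketch. It does not use the Wasmtime API: GuestMemory is a hypothetical stand-in that models linear memory as a Vec<u8> whose backing storage may move when it grows, and GuestRegion is a made-up name for what the host would keep.

```rust
const PAGE_SIZE: usize = 65536;

/// Hypothetical stand-in for a guest linear memory (not a Wasmtime type).
struct GuestMemory {
    bytes: Vec<u8>,
}

impl GuestMemory {
    fn new(pages: usize) -> Self {
        GuestMemory { bytes: vec![0; pages * PAGE_SIZE] }
    }

    /// Models `memory.grow`: the storage may be reallocated, changing the base address.
    fn grow(&mut self, delta_pages: usize) {
        let new_len = self.bytes.len() + delta_pages * PAGE_SIZE;
        self.bytes.resize(new_len, 0);
    }

    /// Resolve (offset, len) against the current base, with bounds checks.
    fn slice(&self, offset: usize, len: usize) -> Option<&[u8]> {
        let end = offset.checked_add(len)?;
        self.bytes.get(offset..end)
    }

    /// Copy `data` into memory at `offset`, failing if it would go out of bounds.
    fn write(&mut self, offset: usize, data: &[u8]) -> Option<()> {
        let end = offset.checked_add(data.len())?;
        self.bytes.get_mut(offset..end)?.copy_from_slice(data);
        Some(())
    }
}

/// What the host keeps around: an offset and a length, never a raw pointer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct GuestRegion {
    offset: usize,
    len: usize,
}

/// Write through a region, grow the memory, then read the same region back.
fn demo() -> Vec<u8> {
    let mut mem = GuestMemory::new(1);
    let region = GuestRegion { offset: 16, len: 5 };
    mem.write(region.offset, b"hello").expect("region is in bounds");
    mem.grow(4); // any pointer taken before this line could now dangle
    mem.slice(region.offset, region.len)
        .expect("region is in bounds")
        .to_vec()
}
```

Because the region is resolved against the current base on every access, it stays meaningful across the grow, which a cached pointer would not.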
Thanks @Joel Dice. I understand the issue. My trick to solve this is, first, to use the static memory strategy to mmap a large enough virtual memory region for the instance so the base address does not change. Second, the Vec has a customized allocator which contains an Arc<Instance> and the offset of the data in the guest address space. When the Vec is dropped, it calls an exported function to free the memory in the guest. I feel this is kind of hacky, but it works when I want to use a Vec in the host to wrap underlying data zero-copied from the guest.
My take is that just using the offset and indexing with .data[offset] is definitely safe and correct. But it needs another data structure (a new struct) to wrap it, and it cannot interoperate well with an ordinary Vec. (I am actually not using Vec but an Arrow Buffer, something like bytes::Bytes.)
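For what the "new struct" route might look like, here is a rough sketch of an owning handle that stores an offset and asks the guest to free the region on drop. Everything here is hypothetical: FakeGuest stands in for a real instance and only records dealloc calls instead of invoking an exported function, and GuestBytes is a made-up name.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an instance: linear memory plus a log of dealloc calls.
struct FakeGuest {
    memory: Vec<u8>,
    freed: Vec<(usize, usize)>,
}

impl FakeGuest {
    /// Stands in for calling the guest's exported dealloc function.
    fn dealloc(&mut self, offset: usize, len: usize) {
        self.freed.push((offset, len));
    }
}

/// Owning handle to a region of guest memory, identified by offset rather than pointer.
struct GuestBytes {
    guest: Arc<Mutex<FakeGuest>>,
    offset: usize,
    len: usize,
}

impl GuestBytes {
    /// Borrow the bytes without copying; the slice cannot outlive the closure.
    fn with_bytes<R>(&self, f: impl FnOnce(&[u8]) -> R) -> R {
        let guest = self.guest.lock().unwrap();
        f(&guest.memory[self.offset..self.offset + self.len])
    }
}

impl Drop for GuestBytes {
    fn drop(&mut self) {
        // Hand the region back to the guest exactly once, when the handle goes away.
        self.guest.lock().unwrap().dealloc(self.offset, self.len);
    }
}

/// Sum a four-byte region through the handle, then report what was freed.
fn demo_drop() -> (u32, Vec<(usize, usize)>) {
    let guest = Arc::new(Mutex::new(FakeGuest { memory: vec![0; 64], freed: Vec::new() }));
    guest.lock().unwrap().memory[8..12].copy_from_slice(&[1, 2, 3, 4]);

    let bytes = GuestBytes { guest: Arc::clone(&guest), offset: 8, len: 4 };
    let total = bytes.with_bytes(|b| b.iter().map(|&x| u32::from(x)).sum::<u32>());
    drop(bytes); // triggers the dealloc call

    let freed = guest.lock().unwrap().freed.clone();
    (total, freed)
}
```

The closure-based accessor is one way to keep borrowed slices from outliving a memory growth; it does not by itself give interoperability with Vec or an Arrow Buffer.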
Last updated: Jan 24 2025 at 00:11 UTC