Stream: general

Topic: Can host allocs guest memory, but guest frees it?


view this post on Zulip Xinyu Zeng (Dec 23 2024 at 09:31):

I think a typical way for the host to pass data into the guest is to first call an exported alloc to allocate some memory in the guest, then copy the data into it as input. After the host calls some other exported functions to do computations over the input, it can call an exported dealloc to free the input in guest memory.

My question is: is it OK for the host not to call the exported dealloc? Instead, the functions doing those computations would take "ownership" of the input in guest memory (e.g., by unsafely wrapping those bytes in a Vec) and decide whether to free it. I tried this approach, but the guest memory seems to get corrupted and errors occur.

The reason for this requirement is that those "computation functions" may or may not zero-copy the input into the output (the output still lives in guest memory). So if the host blindly frees the input memory, it can free data the output still uses, which causes bugs.

All of the above is in the context of the wasmtime Rust API.

view this post on Zulip Lann Martin (Dec 23 2024 at 14:48):

The details will depend on the guest language, but in general it should be fine to pass responsibility for deallocation to the guest.
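As a rough illustration of passing deallocation responsibility to the guest, here is a minimal sketch that assumes the guest is written in Rust. The names `guest_alloc`, `guest_process`, `guest_dealloc`, and the `reuse_input` flag are hypothetical and not taken from the thread; a real guest would also need to export them (for example with a `no_mangle` attribute). The buffer is a boxed slice so that the pointer and length alone are enough to rebuild the owner; with `Vec::from_raw_parts` the exact capacity would also have to match.

```rust
/// Hypothetical export: reserve `len` zeroed bytes in guest memory for the host to fill.
pub extern "C" fn guest_alloc(len: usize) -> *mut u8 {
    // A boxed slice has no spare capacity, so (ptr, len) fully describes it.
    let buf: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    Box::into_raw(buf) as *mut u8
}

/// Hypothetical export: takes ownership of the input at `ptr` and returns a
/// pointer to an output of the same length (ASCII-uppercased bytes).
///
/// # Safety
/// `ptr` and `len` must come from `guest_alloc(len)` and must not be used
/// or freed by the caller afterwards.
pub unsafe extern "C" fn guest_process(ptr: *mut u8, len: usize, reuse_input: u32) -> *mut u8 {
    // Rebuild the owning Box; from here on the guest decides when to free it.
    let mut input: Box<[u8]> =
        unsafe { Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, len)) };
    if reuse_input != 0 {
        // Zero-copy path: the output is the input allocation, so nothing is freed.
        input.make_ascii_uppercase();
        Box::into_raw(input) as *mut u8
    } else {
        // Copying path: build a fresh output; `input` is dropped (freed) on return.
        let output: Box<[u8]> = input.iter().map(|b| b.to_ascii_uppercase()).collect();
        Box::into_raw(output) as *mut u8
    }
}

/// Hypothetical export: free an output once the host is done with it.
///
/// # Safety
/// `ptr` and `len` must describe a live buffer returned by `guest_process`.
pub unsafe extern "C" fn guest_dealloc(ptr: *mut u8, len: usize) {
    drop(unsafe { Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, len)) });
}
```

With this shape the host never frees the input itself. It only frees whatever `guest_process` returned, so the zero-copy and copying paths look the same from the host's side.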

view this post on Zulip Joel Dice (Dec 23 2024 at 16:22):

Note that any time the guest executes a memory.grow instruction and grows memory, any pointers the host might have to guest memory might become invalid. Also, it's quite dangerous to use e.g. Vec::from_raw_parts in the host using a pointer to guest memory; not only can memory.grow invalidate that pointer, but if the Vec is dropped, the host will try to deallocate memory as if it had been allocated by the Rust allocator in the host, which is certainly not the case.

As others have mentioned elsewhere, it is much better for the host to hold on to the offset into guest memory rather than a pointer. The host can then turn that offset into a pointer by adding it to the current guest memory base immediately prior to dereferencing it, which avoids the above issues.
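The offset-based approach can be sketched as follows. This is a simulation using only the standard library, not the wasmtime API: `LinearMemory` is a hypothetical stand-in for a guest linear memory whose backing storage may move when it grows, and `GuestSlice` is a hypothetical host-side handle that stores an offset and length instead of a pointer.

```rust
/// Hypothetical stand-in for a guest linear memory (NOT the wasmtime API).
pub struct LinearMemory {
    pub bytes: Vec<u8>,
}

impl LinearMemory {
    pub fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Models `memory.grow`: the contents are copied into a fresh allocation,
    /// so any raw pointer taken before the call would now dangle.
    pub fn grow(&mut self, extra: usize) {
        let new_len = self.bytes.len() + extra;
        let mut moved = Vec::with_capacity(new_len);
        moved.extend_from_slice(&self.bytes);
        moved.resize(new_len, 0);
        self.bytes = moved;
    }
}

/// Host-side handle: an offset into guest memory, never a pointer.
#[derive(Clone, Copy, Debug)]
pub struct GuestSlice {
    pub offset: usize,
    pub len: usize,
}

impl GuestSlice {
    /// Resolve the offset against the memory as it is right now.
    /// Returns `None` if the range is out of bounds or overflows.
    pub fn read<'a>(&self, mem: &'a LinearMemory) -> Option<&'a [u8]> {
        let end = self.offset.checked_add(self.len)?;
        mem.bytes.get(self.offset..end)
    }

    /// Copy `data` into the region; returns `false` on a length or bounds mismatch.
    pub fn write(&self, mem: &mut LinearMemory, data: &[u8]) -> bool {
        if data.len() != self.len {
            return false;
        }
        let Some(end) = self.offset.checked_add(self.len) else {
            return false;
        };
        match mem.bytes.get_mut(self.offset..end) {
            Some(dst) => {
                dst.copy_from_slice(data);
                true
            }
            None => false,
        }
    }
}
```

Because the handle is resolved on every access, it stays valid after the memory grows, and a stale or corrupted offset results in `None` rather than a wild dereference.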

view this post on Zulip Xinyu Zeng (Dec 23 2024 at 16:47):

Thanks @Joel Dice. I understand the issue. My trick to work around it is, first, to use the static strategy to mmap a large enough virtual memory region for the instance so that the base address does not change. Second, the Vec has a custom allocator that holds an Arc<Instance> and the offset of the data in the guest address space. When the Vec is dropped, it calls an exported function to free the memory in the guest. I feel this is kind of hacky, but it works when I want to use a Vec in the host to wrap underlying data that is zero-copied from the guest.

view this post on Zulip Xinyu Zeng (Dec 23 2024 at 16:58):

My take is that just keeping the offset and indexing with .data[offset] is definitely safe and correct. But it needs another data structure (a new struct) to wrap it, and it cannot interoperate well with an ordinary Vec. (I am actually not using Vec but an Arrow Buffer, something like bytes::Bytes.)
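One possible shape for such a wrapper struct is sketched below, again as a standard-library simulation rather than real wasmtime code. `FakeGuest` is a hypothetical stand-in for a store plus instance, with `alloc` and `dealloc` modelling calls to the guest's exports; `GuestBuffer` is a hypothetical host-side owner that keeps only an offset and length and asks the guest to free the region when dropped. The `Arc<Mutex<..>>` is an assumption of this sketch, chosen because calling a wasmtime export needs mutable access to the store.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a store + instance pair (NOT the wasmtime API).
pub struct FakeGuest {
    pub memory: Vec<u8>,
    /// Regions (offset, len) that the simulated guest allocator considers live.
    pub live: Vec<(usize, usize)>,
    next: usize,
}

impl FakeGuest {
    pub fn new() -> Self {
        Self { memory: Vec::new(), live: Vec::new(), next: 0 }
    }

    /// Models calling the guest's exported alloc (a bump allocator that
    /// grows memory on demand, which may move the backing storage).
    pub fn alloc(&mut self, len: usize) -> usize {
        let offset = self.next;
        self.next += len;
        if self.memory.len() < self.next {
            self.memory.resize(self.next, 0);
        }
        self.live.push((offset, len));
        offset
    }

    /// Models calling the guest's exported dealloc.
    pub fn dealloc(&mut self, offset: usize, len: usize) {
        self.live.retain(|&region| region != (offset, len));
    }
}

/// Host-side owner of a guest region: stores an offset, frees via the guest.
pub struct GuestBuffer {
    guest: Arc<Mutex<FakeGuest>>,
    offset: usize,
    len: usize,
}

impl GuestBuffer {
    /// Allocate in the guest and copy `data` into the new region.
    pub fn new(guest: Arc<Mutex<FakeGuest>>, data: &[u8]) -> Self {
        let offset = {
            let mut g = guest.lock().unwrap();
            let offset = g.alloc(data.len());
            g.memory[offset..offset + data.len()].copy_from_slice(data);
            offset
        };
        Self { guest, offset, len: data.len() }
    }

    /// Run `f` over the bytes, resolving the offset against the current memory.
    pub fn with_bytes<R>(&self, f: impl FnOnce(&[u8]) -> R) -> R {
        let g = self.guest.lock().unwrap();
        f(&g.memory[self.offset..self.offset + self.len])
    }
}

impl Drop for GuestBuffer {
    fn drop(&mut self) {
        // Skip the free if the lock is poisoned rather than panic inside drop.
        if let Ok(mut g) = self.guest.lock() {
            g.dealloc(self.offset, self.len);
        }
    }
}
```

The closure-based `with_bytes` keeps the borrow of guest memory short, which is the trade-off mentioned above: the data cannot be handed out as an ordinary `Vec` or a long-lived slice, but no host pointer into guest memory outlives a single access.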


Last updated: Jan 24 2025 at 00:11 UTC