Hi, I am wondering what the canonical way is in wasmtime to read host memory without the copy cost. I think one way is to use an imported function/callback, but section 6 (page 8, right half) of this paper says that is also costly because of the conversion of function parameters and return values. It also mentions some tricks that let different virtual memory regions (e.g., one in the host and one in wasm) point to the same physical memory so that the copy is avoided. Is that even possible in wasmtime?
I'm not sure exactly what you mean by "read host memory"; the wasm sandbox very intentionally only allows guests to read their own memories. If you are looking for a way to avoid copies between guests and the host, currently the only really viable option for most guest languages / toolchains is for the guest to allocate a buffer, pass the buffer offset ("pointer", from the guest's perspective) to the host, and for the host to operate on that buffer directly via e.g. Memory::data_mut.
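To make the "guest allocates, host operates in place" idea concrete, here is a minimal stdlib-only sketch. It assumes the host already holds the guest's linear memory as a mutable byte slice (in Wasmtime that slice would come from Memory::data_mut); the names guest_slice_mut and host_uppercase_in_place are hypothetical, and a plain Vec<u8> stands in for the memory. The main point is that the offset and length come from the guest and must be bounds-checked before use.

```rust
/// Returns the guest buffer at `ptr..ptr + len` as a mutable slice, or `None`
/// if the range falls outside linear memory. `ptr` and `len` are supplied by
/// the guest, so they are untrusted and must be checked.
fn guest_slice_mut(memory: &mut [u8], ptr: u32, len: u32) -> Option<&mut [u8]> {
    let start = ptr as usize;
    let end = start.checked_add(len as usize)?;
    memory.get_mut(start..end)
}

/// Hypothetical host function: uppercases ASCII bytes in place inside the
/// guest's buffer, without copying the data out of linear memory.
fn host_uppercase_in_place(memory: &mut [u8], ptr: u32, len: u32) -> Result<(), &'static str> {
    let buf = guest_slice_mut(memory, ptr, len).ok_or("guest buffer out of bounds")?;
    buf.make_ascii_uppercase();
    Ok(())
}
```

With a real engine, the same function body would run inside a host import that receives the (ptr, len) pair as two integer arguments.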
@Xinyu Zeng the straightforward answer to your question would be that providing accessors as imports avoids the copy -- and it is asymptotically more efficient if you're reading only a small part of a buffer. This pattern occurs today in ad-hoc ways already -- for example wasi-http has a notion of handles to pieces of requests/responses (like sets of headers) and provides an API to query and mutate these; the whole body doesn't need to cross the boundary.
The cost issue of that (many tiny accessor calls) is a separate question and one that could be optimized if needed. A comment on the paper you link: it's important to understand what's "fundamental" and what's simply a property of an implementation, especially when discussing research papers. In that section they seem to be describing a limitation of V8 and/or the particular ABI they have chosen. But there's no reason that a Wasm engine in theory couldn't inline "accessor" functions across module boundaries, or the host/guest boundary, so that the access is as cheap as a local memory access. One could imagine an opaque reference as a handle in an API and accessors that provide access to parts of it. Wasmtime doesn't support such inlining today, either, but at least in the cross-module case, this is theoretically pretty clear (inlining IR across functions) and probably could be done.
Thanks a lot for the reply! I will try both approaches. Given the info I guess the "host operating via data_mut" method is the less costly one but it is hard to implement in my use cases.
Hi @Chris Fallin Could you refer me to the wasi-http code that uses imports to avoid the copy?
Look at the handling of headers for example — the “fields” resource type is a handle to data on the host side with individual get/set accessors. That’s a general pattern that is useful when you have “sparse” access to external data.
I am not very familiar with the component model... Maybe a more direct question: IIUC, this way of avoiding the copy using imports is only possible for small data, e.g., two i32 values, because the data has to be passed via function return values. Is that correct? I still do not understand how to zero-copy a whole buffer using imports...
Guest components cannot read each other's memories by design. The only way to do zero-copy in the networking sense today is for the host to directly write to one guest's memory.
In other words, zero-copy in this case would mean that the first time the data is written to memory (e.g. from the network), it should go straight into an already-allocated buffer that lives in the guest's linear memory.
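A rough sketch of "the first write lands in guest memory", using only the standard library. The assumptions: the guest's linear memory is available to the host as a mutable byte slice, the guest has already handed over an offset and length for a buffer it allocated, and any std::io::Read source stands in for the network socket. The function name read_into_guest is hypothetical.

```rust
use std::io::{self, Read};

/// Reads up to `len` bytes from `src` directly into guest linear memory at
/// offset `ptr`, so the data never sits in an intermediate host buffer.
/// Returns the number of bytes actually read.
fn read_into_guest(
    memory: &mut [u8],
    ptr: usize,
    len: usize,
    src: &mut impl Read,
) -> io::Result<usize> {
    let invalid = |msg| io::Error::new(io::ErrorKind::InvalidInput, msg);
    // The range is guest-controlled, so reject overflow and out-of-bounds.
    let end = ptr.checked_add(len).ok_or_else(|| invalid("offset overflow"))?;
    let buf = memory
        .get_mut(ptr..end)
        .ok_or_else(|| invalid("guest buffer out of bounds"))?;
    src.read(buf)
}
```

The host would then tell the guest how many bytes were written, and the guest reads them through an ordinary pointer into its own memory.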
[…] malloc-style function exported by the guest). In the component model, the function would be cabi_realloc.
[…] read(2), recv(2), or whatever […]
[…] free-style function) or reuse the buffer to read more data.
The approach I was trying to sketch above -- poorly, sorry, so I'll expand it more here! -- is kind of a path to zero-copy, but to be very clear, it's not something that would be efficient today without more work. @Xinyu Zeng the idea is that even for large buffers, you can expose them only through accessors. Consider, e.g., a "resource handle" for a buffer or string; and an imported function that, given that handle and an index, returns the byte at that index. That gives you access to all the data without copying it in wholesale.
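The host side of that handle-plus-accessor idea could look roughly like the sketch below. It is stdlib-only and not tied to any Wasmtime or component-model API: BufferTable, insert, len, and get_byte are hypothetical names, and a u32 key stands in for the resource handle. In a real embedding, len and get_byte would be exposed to the guest as imported functions.

```rust
use std::collections::HashMap;

/// Host-side table mapping opaque handles to buffers. The buffers stay in
/// host memory; the guest only ever sees the `u32` handle.
struct BufferTable {
    next: u32,
    buffers: HashMap<u32, Vec<u8>>,
}

impl BufferTable {
    fn new() -> Self {
        BufferTable { next: 0, buffers: HashMap::new() }
    }

    /// Registers a host buffer and returns the handle to give to the guest.
    fn insert(&mut self, data: Vec<u8>) -> u32 {
        let handle = self.next;
        self.next += 1;
        self.buffers.insert(handle, data);
        handle
    }

    /// Accessor: length of the buffer, or `None` for an unknown handle.
    fn len(&self, handle: u32) -> Option<u32> {
        self.buffers.get(&handle).map(|b| b.len() as u32)
    }

    /// Accessor: the byte at `index`, or `None` if the handle or index is invalid.
    fn get_byte(&self, handle: u32, index: u32) -> Option<u8> {
        self.buffers.get(&handle)?.get(index as usize).copied()
    }
}
```

Each get_byte call moves one byte across the boundary, which is why this only pays off for sparse access unless the engine can inline the accessor.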
There are a number of tradeoffs though:
[…] &[u8] or uint8_t* respectively. This one is pretty fundamental: zero copy means we don't put the data in the guest memory, but the guest memory is all the guest sees with its own pointer types, as others note above.
What I'm describing is in the same spirit as the "string imports" idea floating around in core Wasm spec discussions, and the idea is visible in a coarse granularity today (I mentioned headers in wasi-http because these are not copied wholesale into the guest; you use accessors to get pieces of them); but it's more of a direction we could follow in building out engine optimizations to make this efficient, and very much not something one can take off the shelf today. Sorry if this wasn't clear before!
Thanks @Lann Martin and @Joel Dice! I understand this approach, but sometimes it is not easy to apply, e.g., when the API of the "producer" of the data has already allocated its memory in the host and returns a Bytes.
Thanks a lot @Chris Fallin for the explanation. I kind of get what you mean. In an ideal world with perfect inlining, this imported function call (the "get byte" accessor) behaves exactly like a raw memory read. Because the guest does not have a "native view" of the buffer, if we want the guest to operate on a (large) buffer in the host, we may need some wrapper code that turns every byte read on the buffer into a call to that "get byte" accessor.
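The guest-side wrapper mentioned here might be sketched as follows. Everything in it is hypothetical: the HostAccessors trait stands in for the imported "length" and "get byte" functions, HostBuffer is the wrapper, and FakeHost exists only so the sketch can run without a Wasm engine.

```rust
/// Stand-in for the functions the guest would import from the host.
trait HostAccessors {
    fn len(&self, handle: u32) -> u32;
    fn get_byte(&self, handle: u32, index: u32) -> u8;
}

/// Guest-side view of a buffer that lives in host memory. Every read is
/// turned into an accessor call; no bytes are copied in bulk.
struct HostBuffer<'a, H: HostAccessors> {
    host: &'a H,
    handle: u32,
}

impl<'a, H: HostAccessors> HostBuffer<'a, H> {
    fn len(&self) -> u32 {
        self.host.len(self.handle)
    }

    /// Bounds-checked single-byte read.
    fn get(&self, index: u32) -> Option<u8> {
        if index < self.len() {
            Some(self.host.get_byte(self.handle, index))
        } else {
            None
        }
    }

    /// Iterates over the buffer, one accessor call per byte.
    fn bytes(&self) -> impl Iterator<Item = u8> + '_ {
        (0..self.len()).map(move |i| self.host.get_byte(self.handle, i))
    }
}

/// Fake host holding a single buffer, for local experimentation only.
/// It ignores the handle value.
struct FakeHost(Vec<u8>);

impl HostAccessors for FakeHost {
    fn len(&self, _handle: u32) -> u32 {
        self.0.len() as u32
    }
    fn get_byte(&self, _handle: u32, index: u32) -> u8 {
        self.0[index as usize]
    }
}
```

Code written against HostBuffer cannot take a &[u8] to the data, which is the "no native view" tradeoff in practice: existing libraries that expect slices would need to be adapted or fed through a copy.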
A bit late to the thread, but you may be interested in these issues:
@Christof Petig may have some insights into that as well, since he has been actively working on building a Component Model abstraction over mapping part of the linear memory of guests to host memory, if I understand the approach correctly.
Last updated: Jan 24 2025 at 00:11 UTC