Hi, I am wondering what is the canonical way in wasmtime to read host memory without the copy cost? I think one way is using imported function/callback but section 6 (page8,right half) of this paper says it is also costly because of the conversion of function parameters and return value. It also mentions some tricks to let different virtual memory (like one in host and the other in wasm) point to the same physical memory so the copy is saved. Is it even possible in wasmtime?
I'm not sure exactly what you mean by "read host memory"; the wasm sandbox very intentionally only allows guests to read their own memories. If you are looking for a way to avoid copies between guests and the host, currently the only really viable option for most guest languages / toolchains is for the guest to allocate a buffer, pass the buffer offset ("pointer", from the guest's perspective) to the host, and for the host to operate on that buffer directly via e.g. Memory::data_mut
.
@Xinyu Zeng the straightforward answer to your question would be that providing accessors as imports avoids the copy -- and it is asymptotically more efficient if you're reading only a small part of a buffer. This pattern occurs today in ad-hoc ways already -- for example wasi-http has a notion of handles to pieces of requests/responses (like sets of headers) and provides an API to query and mutate these; the whole body doesn't need to cross the boundary.
The cost issue of that (many tiny accessor calls) is a separate question and one that could be optimized if needed. A comment on the paper you link: it's important to understand what's "fundamental" and what's simply a property of an implementation, especially when discussing research papers. In that section they seem to be describing a limitation of V8 and/or the particular ABI they have chosen. But there's no reason that a Wasm engine in theory couldn't inline "accessor" functions across module boundaries, or the host/guest boundary, so that the access is as cheap as a local memory access. One could imagine an opaque reference as a handle in an API and accessors that provide access to parts of it. Wasmtime doesn't support such inlining today, either, but at least in the cross-module case, this is theoretically pretty clear (inlining IR across functions) and probably could be done.
Thanks a lot for the reply! I will try both approaches. Given the info I guess the "host operating via data_mut" method is the less costly one but it is hard to implement in my use cases.
Last updated: Nov 22 2024 at 16:03 UTC