Canonical way to read host memory without copy cost? · general

Hi, I am wondering what the canonical way is in wasmtime to read host memory without the copy cost. I think one way is to use an imported function/callback, but Section 6 (page 8, right half) of this paper says that is also costly because of the conversion of function parameters and return values. It also mentions some tricks to let different virtual memories (e.g., one in the host and the other in wasm) point to the same physical memory so that the copy is avoided. Is that even possible in wasmtime?

Lann Martin (Nov 15 2024 at 13:45):

I'm not sure exactly what you mean by "read host memory"; the wasm sandbox very intentionally only allows guests to read their own memories. If you are looking for a way to avoid copies between guests and the host, currently the only really viable option for most guest languages / toolchains is for the guest to allocate a buffer, pass the buffer offset ("pointer", from the guest's perspective) to the host, and for the host to operate on that buffer directly via e.g. Memory::data_mut.
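A minimal sketch of the pattern described above, using only the standard library: here a plain `&mut [u8]` stands in for the slice that `Memory::data_mut` gives the host, and `guest_slice` / `host_fill` are hypothetical helper names, not Wasmtime APIs. The guest is assumed to have allocated `len` bytes at offset `ptr` and passed both values to the host.

```rust
use std::io::{self, Read};

/// Bounds-checked view of the guest-provided region `[ptr, ptr + len)`.
/// `memory` stands in for the guest's linear memory as seen by the host.
fn guest_slice(memory: &mut [u8], ptr: u32, len: u32) -> Option<&mut [u8]> {
    let start = ptr as usize;
    // checked_add guards against overflow on hosts where usize is 32 bits.
    let end = start.checked_add(len as usize)?;
    // `get_mut` returns None if the range extends past the end of memory.
    memory.get_mut(start..end)
}

/// Hypothetical host function body: fill the guest's buffer straight from
/// `source`, with no intermediate host-side buffer.
/// Returns the number of bytes written into guest memory.
fn host_fill(
    memory: &mut [u8],
    ptr: u32,
    len: u32,
    source: &mut impl Read,
) -> io::Result<usize> {
    let buf = guest_slice(memory, ptr, len).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "guest buffer out of bounds")
    })?;
    source.read(buf)
}
```

The bounds check matters because the offset and length come from the guest and should be treated as untrusted input.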

Chris Fallin (Nov 15 2024 at 17:15):

@Xinyu Zeng the straightforward answer to your question would be that providing accessors as imports avoids the copy -- and it is asymptotically more efficient if you're reading only a small part of a buffer. This pattern occurs today in ad-hoc ways already -- for example wasi-http has a notion of handles to pieces of requests/responses (like sets of headers) and provides an API to query and mutate these; the whole body doesn't need to cross the boundary.

The cost issue of that (many tiny accessor calls) is a separate question and one that could be optimized if needed. A comment on the paper you link: it's important to understand what's "fundamental" and what's simply a property of an implementation, especially when discussing research papers. In that section they seem to be describing a limitation of V8 and/or the particular ABI they have chosen. But there's no reason that a Wasm engine in theory couldn't inline "accessor" functions across module boundaries, or the host/guest boundary, so that the access is as cheap as a local memory access. One could imagine an opaque reference as a handle in an API and accessors that provide access to parts of it. Wasmtime doesn't support such inlining today, either, but at least in the cross-module case, this is theoretically pretty clear (inlining IR across functions) and probably could be done.

Xinyu Zeng (Nov 16 2024 at 06:22):

Thanks a lot for the reply! I will try both approaches. Given the info, I guess the "host operating via data_mut" method is the less costly one, but it is hard to implement in my use cases.

Xinyu Zeng (Nov 26 2024 at 07:50):

Hi @Chris Fallin, could you point me to the wasi-http code that uses imports to avoid the copy?

Chris Fallin (Nov 26 2024 at 08:27):

Look at the handling of headers for example — the “fields” resource type is a handle to data on the host side with individual get/set accessors. That’s a general pattern that is useful when you have “sparse” access to external data.

Xinyu Zeng (Nov 26 2024 at 10:07):

I am not very familiar with the component model... Maybe a more direct question: IIUC, this way of avoiding copies using imports is only possible for small data, e.g., two i32 values, because the data has to be passed via function return values. Is that correct? I still do not understand how to zero-copy a whole buffer using imports...

Lann Martin (Nov 26 2024 at 13:40):

Guest components cannot read each other's memories by design. The only way to do zero-copy in the networking sense today is for the host to directly write to one guest's memory.

Joel Dice (Nov 26 2024 at 15:23):

In other words, zero-copy in this case would mean that the first time the data is written to memory (e.g. from the network), it should go straight into an already-allocated buffer that lives in the guest's linear memory.

Chris Fallin (Nov 26 2024 at 16:46):

The approach I was trying to sketch above -- poorly, sorry, so I'll expand it more here! -- is kind of a path to zero-copy, but to be very clear, it's not something that would be efficient today without more work. @Xinyu Zeng the idea is that even for large buffers, you can expose them only through accessors. Consider, e.g., a "resource handle" for a buffer or string; and an imported function that, given that handle and an index, returns the byte at that index. That gives you access to all the data without copying it in wholesale.
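To make the accessor idea concrete, here is a rough host-side sketch using only the standard library. `BufferTable`, `insert`, `len`, and `get_byte` are hypothetical names chosen for illustration; they are not part of Wasmtime or wasi-http. The table keeps the buffers on the host and hands out opaque integer handles, and each method models what the body of an imported function could look like.

```rust
use std::collections::HashMap;

/// Host-side table mapping opaque handles to buffers that never leave the host.
#[derive(Default)]
struct BufferTable {
    next_handle: u32,
    buffers: HashMap<u32, Vec<u8>>,
}

impl BufferTable {
    /// Register a host buffer and return the handle the guest will hold.
    fn insert(&mut self, data: Vec<u8>) -> u32 {
        let handle = self.next_handle;
        self.next_handle += 1;
        self.buffers.insert(handle, data);
        handle
    }

    /// Body of a hypothetical "buffer length" import.
    /// Returns None for an unknown handle.
    fn len(&self, handle: u32) -> Option<u32> {
        self.buffers.get(&handle).map(|b| b.len() as u32)
    }

    /// Body of a hypothetical "get byte" import: only a single byte
    /// crosses the host/guest boundary per call.
    /// Returns None for an unknown handle or an out-of-range index.
    fn get_byte(&self, handle: u32, index: u32) -> Option<u8> {
        self.buffers.get(&handle)?.get(index as usize).copied()
    }
}
```

As noted in the message above, each such call has a real cost today, so this shape pays off mainly when access is sparse.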

What I'm describing is in the same spirit as the "string imports" idea floating around in core Wasm spec discussions, and the idea is visible at a coarse granularity today (I mentioned headers in wasi-http because these are not copied wholesale into the guest; you use accessors to get pieces of them); but it's more of a direction we could follow in building out engine optimizations to make this efficient, and very much not something one can take off the shelf today. Sorry if this wasn't clear before!

Xinyu Zeng (Nov 27 2024 at 02:08):

Thanks @Lann Martin and @Joel Dice! I understand this approach, but sometimes it is not easy, e.g., when the API of the data "producer" has already allocated its memory on the host, such as by returning a Bytes.

Xinyu Zeng (Nov 27 2024 at 02:23):

Thanks a lot @Chris Fallin for the explanation. I kind of get what you mean. In an ideal world with perfect inlining, this imported function call (the "get byte" accessor) behaves exactly like a raw memory read. Because the guest does not have a "native view" of the buffer, if we want to operate on a (large) host buffer in the guest, we may need some wrapper code to turn every byte read of the buffer into a call to that "get byte" accessor.
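The wrapper code mentioned above might look roughly like the following sketch. Everything here is hypothetical: the `ByteAccessor` trait stands in for the imported "length" and "get byte" functions (in a real guest these would be imports, not a trait), and `SliceAccessor` is only a stand-in so the sketch can run natively.

```rust
/// Stand-in for the imported accessors a real guest would call.
trait ByteAccessor {
    fn len(&self) -> u32;
    fn get_byte(&self, index: u32) -> u8;
}

/// Guest-side view that hides the per-byte accessor calls behind an iterator.
struct HostBufferView<A: ByteAccessor> {
    accessor: A,
}

impl<A: ByteAccessor> HostBufferView<A> {
    /// Lazily yields each byte; every item is one accessor call,
    /// and the buffer is never copied wholesale.
    fn bytes<'s>(&'s self) -> impl Iterator<Item = u8> + 's {
        (0..self.accessor.len()).map(move |i| self.accessor.get_byte(i))
    }
}

/// Native stand-in: a plain slice playing the role of the host buffer.
struct SliceAccessor<'a>(&'a [u8]);

impl<'a> ByteAccessor for SliceAccessor<'a> {
    fn len(&self) -> u32 {
        self.0.len() as u32
    }
    fn get_byte(&self, index: u32) -> u8 {
        self.0[index as usize]
    }
}

/// Example guest logic written against the view rather than a `&[u8]`.
fn count_byte<A: ByteAccessor>(view: &HostBufferView<A>, needle: u8) -> usize {
    view.bytes().filter(|&b| b == needle).count()
}
```

Guest code written this way never sees a `&[u8]`, so existing libraries that expect slices would still need a copy; that is the "no native view" limitation described above.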

raskyld (Dec 15 2024 at 09:33):

@Christof Petig may have some insights on that as well, since he has been actively working on building a Component Model abstraction over mapping part of a guest's linear memory to host memory, if I understand the approach correctly.

Memory Sharing Discussion · Issue #19 · WebAssembly/memory-control

Introduction and Context There have been a number of conversations and questions raised about the possibility to share subsets of memory between multiple WebAssembly instances and the host environm...

Flat data representation proposal: Enables zero copy shared memory, zero allocation return types, binary serialization · Issue #398 · WebAssembly/component-model

This all started with defining zero copy shared memory over a WIT interface (channel is WIT resource, inspired by iceoryx2): let channel = Channel_u32::new("topic"); loop { let message = channel.al...

Xinyu Zeng (Nov 15 2024 at 04:24):