Hello all,
I am using Rust and WASM to implement a simulation engine. In short, the Rust host process handles graphics, config, etc. and WASM modules are doing the processing. My current architecture would call for modules to process many "large" (many MB) buffers as quickly as possible.
My problem therefore is passing these buffers around between the WASM modules. Resource types solve the main problem of being able to pass a buffer around by reference. But now my problem is: how do you get the data out of a Resource in an efficient manner? It seems you either have to play nice with the type system (get one record at a time or return an _owned_ list, which requires copying) or you return a memory offset and calculate the pointer in the module.
To highlight this, here's a toy Buffer WIT definition showing a few different ways to get at the data in a resource:
interface buffer-resource {
    resource buffer {
        constructor();
        get-single: func() -> float32;
        get-chunk: func() -> list<float32>;
        get-chunk-pointer: func() -> u32;
    }
}
If you try to implement this on the Host, you get something like this:
impl HostBuffer for BufferManager {
    fn new(&mut self) -> wasmtime::Result<Resource<Buffer>> {
        todo!()
    }
    fn get_single(
        &mut self,
        self_: Resource<Buffer>,
    ) -> wasmtime::Result<f32> {
        todo!()
    }
    fn get_chunk(
        &mut self,
        self_: Resource<Buffer>,
    ) -> wasmtime::Result<Vec<f32>> {
        todo!()
    }
    fn get_chunk_pointer(
        &mut self,
        self_: Resource<Buffer>,
    ) -> wasmtime::Result<u32> {
        todo!()
    }
    fn drop(&mut self, rep: Resource<Buffer>) -> wasmtime::Result<()> {
        todo!()
    }
}
Again, I'm calling out the owned Vec<f32>.
If trying to implement this in the guest, you get this:
impl GuestBuffer for Buffer {
    fn new() -> Self {
        todo!()
    }
    fn get_single(&self) -> f32 {
        todo!()
    }
    fn get_chunk(&self) -> wit_bindgen::rt::vec::Vec::<f32> {
        todo!()
    }
    fn get_chunk_pointer(&self) -> u32 {
        todo!()
    }
}
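To make the get-chunk-pointer option concrete, here is a minimal sketch of the bounds check that any consumer of a raw offset would need. It uses a plain byte slice as a stand-in for a linear memory; the helper name chunk_bytes and the explicit len parameter are assumptions for illustration and are not part of the WIT above or of any Wasmtime API.

```rust
/// Hypothetical helper: returns the bytes backing `len` f32 values that start
/// at byte `offset` in `memory`, or `None` if the range is out of bounds or
/// the size computation overflows.
fn chunk_bytes(memory: &[u8], offset: u32, len: u32) -> Option<&[u8]> {
    let start = offset as usize;
    // Each f32 occupies 4 bytes; guard against overflow before slicing.
    let byte_len = (len as usize).checked_mul(std::mem::size_of::<f32>())?;
    let end = start.checked_add(byte_len)?;
    // `get` returns None instead of panicking when the range is invalid.
    memory.get(start..end)
}
```

The returned slice borrows from the memory rather than copying it, which is the property the offset-based approach is after. The cost is that nothing in the type system ties the offset to a live allocation.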
What I _think_ the "best" way for my goals is to do the following:
However, I would really like to play nicer with the type system in the Component Model and avoid unsafe code. I would love to get some feedback on whether this is the right way or if there is some other approach that would be more idiomatic.
Thank you!
If your use case requires quickly sending MB of arrays and the current implementations aren't fast enough then the component model may not be a great fit for your use case right now. You've already found the "main" way of doing this which is to use resource handles, but as you've seen the component model has no primitive notion of a buffer at this time so the only option is to take/return list<f32>. Also as you've seen bindings generators change that to Vec<f32> which is an owned allocation which implies copies.
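As a plain-Rust illustration of why an owned return type implies a copy, the sketch below contrasts an owned Vec<f32> with a borrowed slice. The Buffer struct and both method names are hypothetical and exist only for this comparison; the borrowed form is what list<f32> cannot currently express across a component boundary.

```rust
/// Hypothetical buffer type used only for this comparison.
struct Buffer {
    data: Vec<f32>,
}

impl Buffer {
    /// Owned return: the caller gets its own allocation, so the data is copied.
    fn get_chunk_owned(&self) -> Vec<f32> {
        self.data.clone()
    }

    /// Borrowed return: the caller sees the same memory, with no copy.
    fn get_chunk_borrowed(&self) -> &[f32] {
        &self.data
    }
}
```

Comparing the pointers of the two results against the original data shows the difference: the owned chunk lives at a new address, while the borrowed chunk points at the original storage.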
It's technically possible to achieve what you want with the component model right now, but it will require you to not use bindings generators for the function that transfers things. For example Wasmtime has wasmtime::component::WasmList which is a list that lives in wasm and isn't copied. In Rust you'd have to export your own function which does not have a post-return to clean up the allocation because you wouldn't return an owned allocation. In that sense it's technically feasible to achieve zero-copy transfers, but it's not easy today and bindings generators are not currently built to enable this.
That's why I say that the component model may not be a great fit for this use case right now. If you're hesitant to hack on bindings generators a copy will be required today. If you feel ok diving into all the details here and hacking on bindings generators and/or writing code that's at the bindings generator level, then you can probably achieve this. It'll require a nontrivial amount of knowledge about how lifting/lowering all interact in the component model.
All that being said this is a use case I'd love to at least personally see enabled, so I'd be happy to help out with questions and guidance if you'd like. One thing I'll point out though:
What I _think_ the "best" way for my goals is to do the following:
One point to keep in mind is that the Component Model as an abstraction doesn't allow embedders to see the raw whole linear memory of the guest module. In that sense there's not even an unsafe way to implement what you outlined above. The "unsafe" way is WasmList<f32> (plus adding an as_le_slice method for f32 which doesn't currently exist) and then plumbing that through the host bindings. The guest side will need to be handwritten at this time.
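Since wasm linear memory is little-endian, the host-side work behind such an accessor amounts to interpreting raw little-endian bytes as f32 values. The sketch below shows a safe, allocation-free way to do that on an ordinary byte slice. It does not use WasmList, and the function name f32s_from_le_bytes is an assumption made for illustration.

```rust
/// Hypothetical helper: lazily decodes little-endian f32 values from raw
/// bytes without allocating. Trailing bytes that do not form a complete
/// 4-byte value are ignored.
fn f32s_from_le_bytes(bytes: &[u8]) -> impl Iterator<Item = f32> + '_ {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
}
```

Decoding value by value works on any host regardless of endianness or alignment. Reinterpreting the bytes directly as a slice of f32 would avoid even the per-element decode, but that is only valid on little-endian hosts with suitably aligned data, and it requires unsafe code.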
Thanks Alex for the detailed response.
I'm definitely interested in figuring out a way to do this and document it. If I get deep enough in the weeds I wouldn't mind contributing to the bindgen projects to make it work.
With that being said, it sounds like the things I need to be looking at are:
as_le_slice for WasmList<f32>
WasmList<f32> that does a mem::forget or similar
One point to keep in mind is that the Component Model as an abstraction doesn't allow embedders to see the raw whole linear memory of the guest module.
I was not aware of this. In my prototyping I must have just got lucky when passing either the pointer directly as a u64 or a memory offset as a u32.
I feel this is the use case I tried to address by proposing borrow<list<T>> as a return type.
This would enable reusing pre-allocated buffers in the guest without the convention to free the buffer after use.
While this is equivalent to manually returning an address (s32), plus a length here, it still enables other guest languages to make sense of the return value.
Of course a non-owned list is an alien data type in non-system languages. Sadly there also is no good way to express the lifetime of this buffer in wit.
Writing into a list provided by the guest as an argument (a guest side array as the "out" argument to the function) is a more memory safe way to express this, but now the host can't predict the address before the call. The function could still return the number of valid elements.
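In plain Rust terms, this "out" argument pattern looks roughly like the sketch below: the caller owns the destination buffer and the function reports how many elements it filled. The function name read_chunk and the cursor parameter are hypothetical. The sketch only illustrates the calling convention and is not how a component boundary would lower such a call.

```rust
/// Hypothetical producer: copies up to `out.len()` samples from `source`,
/// starting at `*cursor`, into the caller-provided `out` buffer. Returns the
/// number of valid elements written and advances the cursor by that amount.
fn read_chunk(source: &[f32], cursor: &mut usize, out: &mut [f32]) -> usize {
    // Treat a cursor at or past the end as "no data left" instead of panicking.
    let remaining = source.get(*cursor..).unwrap_or(&[]);
    let n = remaining.len().min(out.len());
    out[..n].copy_from_slice(&remaining[..n]);
    *cursor += n;
    n
}
```

This version still copies each element once, but the destination buffer can be allocated once and reused across calls, so no per-call allocation or free is needed.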
So perhaps a guest side resource is needed controlling the buffer lifetime (the host could retrieve the address via a method returning a borrow<list>, see above) and possibly a pollable would indicate new data valid, but then there is no good way to avoid overwriting data still used on the guest side without a locking mechanism.
It feels like this quickly evolves into a complex mechanism, perhaps working on preview3's stream<T> is the most elegant and near term solution. :thinking:
:thinking: perhaps iceoryx2 provides the right abstraction to model this after, I am going to take a closer look there
For reference, here is a link to the previous discussion: https://bytecodealliance.zulipchat.com/#narrow/stream/327223-wit-bindgen/topic/borrowing.20records.3F.20.28shared.20data.29/near/379740418
It looks like iceoryx2 is different from iceoryx 2.x (which I wrongly linked to); iceoryx2 (the faster Rust rewrite of iceoryx) lives at https://github.com/eclipse-iceoryx/iceoryx2
With that being said, it sounds like the things I need to be looking at are:
That seems about right! Note that you in theory should not need mem::forget since buffers can still be properly managed even without it (e.g. no copies made). I have not yet implemented this, however, so I can't say that for sure.
I was not aware of this. In my prototyping I must have just got lucky when passing either the pointer directly as a u64 or a memory offset as a u32.
I'll note that there's a distinction between core wasm and components here. Core wasm modules inside of a component can indeed export their memories; it's components themselves that can't export memories (they can only export primitives in the component model, which do not include memories).
It feels like this quickly evolves into a complex mechanism, perhaps working on preview3's stream<T> is the most elegant and near term solution.
This is definitely something AFAIK that stream<T> is intended to help solve. I'll note that stream<T> doesn't currently have a concrete design along the lines of "here's what you would do to solve this exact problem", but now's the right time to feed in design constraints!
Last updated: Dec 23 2024 at 13:07 UTC