In our embedding, we're using our own bindgen-like macro to generate bindings for our wasm modules to call into the host. It often happens that the host will read function parameters from the wasm memory, then write a result somewhere else in the wasm memory. In some situations, the wasm memory is thus both mutably borrowed and immutably borrowed (think for instance: streaming results as they're computed in the wasm memory, instead of stashing them before writing all of them at once in the wasm memory result sub-region).
We're trying to do this safely, but that basically means having several mutable references to the underlying wasm memory, as these gets/sets happen at different places in the code. It seems hard to do safely, as it likely means we'd need to implement some kind of dynamic checking ourselves to track which subregions of the wasm memory are borrowed at the same time, and panic whenever a written-to region is borrowed more than once.
Is this a problem others have encountered in practice, and if so, how have you dealt with it?
Wiggle was designed to solve this problem
It has a dynamic borrow checker
It won’t panic, but rather return an error if the borrowing rules are broken, because the input pointers are untrusted (controlled by Wasm)
If you are using interfaces that aren't (or can't be) defined with witx, you can steal the bits of wiggle that do this work; it should be relatively easy to reuse the GuestPtr parts of the crate without using the proc macro code generator
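For illustration, the kind of dynamic borrow checking described above can be sketched in a few lines of Rust. This is only a minimal sketch under stated assumptions, not wiggle's actual implementation or API: the names `BorrowChecker`, `BorrowError`, `borrow_shared`, `borrow_mut` and `release_all` are all hypothetical. It tracks byte ranges of a linear memory and returns an error (rather than panicking) when a guest-supplied range is out of bounds or overlaps an incompatible borrow.

```rust
use std::ops::Range;

/// Hypothetical error type: returned instead of panicking, because the
/// offsets and lengths come from untrusted guest code.
#[derive(Debug, PartialEq)]
enum BorrowError {
    OutOfBounds,
    Conflict,
}

/// Hypothetical tracker for borrowed subregions of one linear memory.
#[derive(Debug)]
struct BorrowChecker {
    mem_len: usize,
    shared: Vec<Range<usize>>,
    mutable: Vec<Range<usize>>,
}

/// Two half-open ranges overlap if each starts before the other ends.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

impl BorrowChecker {
    fn new(mem_len: usize) -> Self {
        BorrowChecker { mem_len, shared: Vec::new(), mutable: Vec::new() }
    }

    /// Validate a guest-supplied (offset, len) pair against the memory size,
    /// guarding against integer overflow.
    fn region(&self, offset: usize, len: usize) -> Result<Range<usize>, BorrowError> {
        let end = offset.checked_add(len).ok_or(BorrowError::OutOfBounds)?;
        if end > self.mem_len {
            return Err(BorrowError::OutOfBounds);
        }
        Ok(offset..end)
    }

    /// A shared borrow may overlap other shared borrows, but not a mutable one.
    fn borrow_shared(&mut self, offset: usize, len: usize) -> Result<(), BorrowError> {
        let r = self.region(offset, len)?;
        if self.mutable.iter().any(|m| overlaps(m, &r)) {
            return Err(BorrowError::Conflict);
        }
        self.shared.push(r);
        Ok(())
    }

    /// A mutable borrow may not overlap any other outstanding borrow.
    fn borrow_mut(&mut self, offset: usize, len: usize) -> Result<(), BorrowError> {
        let r = self.region(offset, len)?;
        if self.shared.iter().chain(self.mutable.iter()).any(|b| overlaps(b, &r)) {
            return Err(BorrowError::Conflict);
        }
        self.mutable.push(r);
        Ok(())
    }

    /// Drop every outstanding borrow, e.g. once a host call returns.
    fn release_all(&mut self) {
        self.shared.clear();
        self.mutable.clear();
    }
}
```

With a checker like this, a host function could register a shared borrow for its input parameters and a mutable borrow for its result region, and report a conflict back to the guest as an error if the two overlap. A real implementation would also need per-borrow release rather than only `release_all`.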
Thanks, will read more about it.
Ok, so we could use wiggle, directly via rewriting our bindings with wit, or indirectly via integrating wiggle in our code base.
Ah, here we go :-)
Wiggle is witx, not wit
But yeah
(I say that because it’s at a dead end and we likely won’t do any more real work on it, unless it’s to make concessions for adapting legacy stuff to wit)
It's dead as in complete and proven in production and red team tests to be solid, not as in bad, though :)
Ah interesting. Does wit have a similar mechanism built-in?
The wit-bindgen generator is currently able to largely sidestep this, since memory is never simultaneously mutably and immutably borrowed. With the component model no one ever gets a mutable view into memory, and it's the glue that manages writing to memory, which means this alias checking is all bypassed
it does mean, however, that host APIs tend to look different
one of the principles of the component model is that a component shouldn't be able to tell whether the other side of an interface is implemented by the host or by another component, and since components can't share memory with each other, the component model prohibits the kind of zero-copy optimization that Benjamin is doing—do I have that right?
I believe so.
Well, it's somewhat subtle. While you're right about inter-component communication, this I think has to do with host APIs, which are very different. The wit-bindgen bindings do in fact give zero-copy views into strings/list<u8>/etc; it's just that you don't get mutable windows even with the component model, since there's basically no safe way to do that.
although I will say it doesn't prohibit it, per se, just leaves such machinations up to the embedder to come up with. there's no representation of an "address" in the value types (unlike with the witx attributes), but a number is just a number; the embedder could figure out what linear memory to interact with (unsafely); another component can't access the same memory unless imported (e.g. a "shared everything libc" scheme)
Alternatively, with a stream type, hosts can read from and write to the buffers directly without special conventions.
Actually, reading over the shared everything libc explainer again, I guess it's up to the adapter to see that the source and destination memories are the same and pass the lowered pointers straight through without a realloc/copy?
for component-to-component (anyway, off topic for what Benjamin is talking about)
Last updated: Dec 23 2024 at 13:07 UTC