Stream: wasmtime

Topic: Safety when borrowing a subset of the wasm memory


view this post on Zulip Benjamin Bouvier (Aug 10 2022 at 15:06):

In our embedding, we're using our own bindgen-like macro to generate bindings for our wasm modules to call into the host. It often happens that the host will read function parameters from the wasm memory, then write a result somewhere else in the wasm memory. In some situations, the wasm memory is thus both mutably borrowed and immutably borrowed (think for instance: streaming results as they're computed in the wasm memory, instead of stashing them before writing all of them at once in the wasm memory result sub-region).

We're trying to do this safely, but that basically means having several mutable references to the underlying wasm memory, as these gets/sets happen at different places in the code. It seems hard to do safely, as it likely means we'd need to implement some kind of dynamic checking ourselves to track which subregions of the wasm memory are borrowed at the same time, and panic whenever a region that is written to is borrowed more than once.

Is this a problem others have encountered in practice, and if so, how have you dealt with it?

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:07):

Wiggle was designed to solve this problem

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:08):

It has a dynamic borrow checker

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:08):

It won’t panic, but rather return an error if the borrowing rules are broken, because the input pointers are untrusted (controlled by Wasm)
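
To make the idea concrete, here is a minimal sketch of a dynamic borrow checker over byte ranges of guest memory. It is not wiggle's actual API: `Region`, `BorrowChecker`, `BorrowHandle`, and `BorrowError` are hypothetical names chosen for illustration. It assumes the usual aliasing rules (any number of overlapping shared borrows, but a mutable borrow may not overlap anything), and it returns an error rather than panicking, since the offsets come from untrusted Wasm.

```rust
/// A byte range `[start, start + len)` in guest linear memory (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Region {
    pub start: u32,
    pub len: u32,
}

impl Region {
    /// Exclusive end offset, computed in u64 so that it cannot overflow.
    fn end(self) -> u64 {
        u64::from(self.start) + u64::from(self.len)
    }

    /// Two regions conflict only if both are non-empty and their ranges intersect.
    fn overlaps(self, other: Region) -> bool {
        self.len != 0
            && other.len != 0
            && u64::from(self.start) < other.end()
            && u64::from(other.start) < self.end()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum BorrowError {
    OutOfBounds(Region),
    AlreadyBorrowed(Region),
}

/// Token returned by a successful borrow; pass it to `release` when done.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BorrowHandle(usize);

pub struct BorrowChecker {
    memory_len: u64,
    next_id: usize,
    /// Active borrows as (id, region, is_mutable).
    active: Vec<(usize, Region, bool)>,
}

impl BorrowChecker {
    pub fn new(memory_len: u64) -> Self {
        BorrowChecker { memory_len, next_id: 0, active: Vec::new() }
    }

    fn borrow(&mut self, r: Region, mutable: bool) -> Result<BorrowHandle, BorrowError> {
        if r.end() > self.memory_len {
            return Err(BorrowError::OutOfBounds(r));
        }
        // A new mutable borrow conflicts with any overlapping borrow;
        // a new shared borrow conflicts only with overlapping mutable ones.
        let conflict = self
            .active
            .iter()
            .any(|&(_, other, other_mut)| (mutable || other_mut) && other.overlaps(r));
        if conflict {
            return Err(BorrowError::AlreadyBorrowed(r));
        }
        let id = self.next_id;
        self.next_id += 1;
        self.active.push((id, r, mutable));
        Ok(BorrowHandle(id))
    }

    pub fn shared_borrow(&mut self, r: Region) -> Result<BorrowHandle, BorrowError> {
        self.borrow(r, false)
    }

    pub fn mut_borrow(&mut self, r: Region) -> Result<BorrowHandle, BorrowError> {
        self.borrow(r, true)
    }

    pub fn release(&mut self, handle: BorrowHandle) {
        self.active.retain(|&(id, _, _)| id != handle.0);
    }
}
```

With this sketch, a host call could take a shared borrow on its input region and a mutable borrow on a disjoint output region at the same time, while an overlapping request from a misbehaving guest would surface as `Err(BorrowError::AlreadyBorrowed(..))` that the host can turn into a trap or an error code.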

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:10):

If you are using interfaces that aren’t (or can’t be) defined with witx, you can steal the bits of wiggle that do this work; it should be relatively easy to reuse the GuestPtr parts of the crate without using the proc macro code generator

view this post on Zulip Benjamin Bouvier (Aug 10 2022 at 15:11):

Thanks, will read more about it.
Ok, so we could use wiggle, directly via rewriting our bindings with wit, or indirectly via integrating wiggle in our code base.

view this post on Zulip Benjamin Bouvier (Aug 10 2022 at 15:11):

Ah, here we go :-)

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:22):

Wiggle is witx, not wit

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:22):

But yeah

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:23):

(I say that because it’s at a dead end and we likely won’t do any more real work on it, unless it’s to make concessions for adapting legacy stuff to wit)

view this post on Zulip Pat Hickey (Aug 10 2022 at 15:24):

it’s dead as in complete and proven in production and red team tests to be solid, not as in bad, though :)

view this post on Zulip Benjamin Bouvier (Aug 10 2022 at 16:22):

Ah interesting. Does wit have a similar mechanism built-in?

view this post on Zulip Alex Crichton (Aug 10 2022 at 19:10):

The wit-bindgen generator is currently able to largely sidestep this, since memory is never simultaneously mutably and immutably borrowed. With the component model no one ever gets a mutable view into memory; it's the glue that manages writing to memory, which means this alias checking is all bypassed

view this post on Zulip Alex Crichton (Aug 10 2022 at 19:10):

it does mean, however, that host APIs tend to look different
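
As a rough illustration of that pattern (not wit-bindgen's generated code), the sketch below has the host function take an immutable view of its argument and return an owned value, while a separate glue function performs the only write into guest memory. All names here (`host_uppercase`, `call_with_glue`, `GlueError`) are hypothetical, and the destination pointer is assumed to be supplied by the caller, whereas real glue would typically obtain it from the guest's allocator.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum GlueError {
    OutOfBounds,
}

/// Hypothetical host function: it only ever sees an immutable, zero-copy view
/// of the argument bytes and hands back an owned result.
fn host_uppercase(input: &[u8]) -> Vec<u8> {
    input.to_ascii_uppercase()
}

/// Hypothetical glue: reads the argument, calls the host function, then writes
/// the result back. The shared borrow of `memory` ends before the mutable
/// borrow begins, so no dynamic alias tracking is needed.
pub fn call_with_glue(
    memory: &mut [u8],
    arg_ptr: usize,
    arg_len: usize,
    ret_ptr: usize,
) -> Result<usize, GlueError> {
    let result = {
        // Bounds-check the untrusted pointer/length pair before slicing.
        let arg_end = arg_ptr.checked_add(arg_len).ok_or(GlueError::OutOfBounds)?;
        let input = memory.get(arg_ptr..arg_end).ok_or(GlueError::OutOfBounds)?;
        host_uppercase(input)
    }; // the immutable view of guest memory is dropped here

    let ret_end = ret_ptr.checked_add(result.len()).ok_or(GlueError::OutOfBounds)?;
    let dst = memory.get_mut(ret_ptr..ret_end).ok_or(GlueError::OutOfBounds)?;
    dst.copy_from_slice(&result);
    Ok(result.len())
}
```

The cost of this shape is that the host cannot stream results into guest memory while still reading its inputs: the result is materialized on the host side first and copied in afterwards.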

view this post on Zulip Jamey Sharp (Aug 10 2022 at 19:44):

one of the principles of the component model is that a component shouldn't be able to tell whether the other side of an interface is implemented by the host or by another component, and since components can't share memory with each other, the component model prohibits the kind of zero-copy optimization that Benjamin is doing—do I have that right?

view this post on Zulip Peter Huene (Aug 10 2022 at 19:49):

I believe so.

view this post on Zulip Alex Crichton (Aug 10 2022 at 19:53):

Well, it's somewhat subtle: while you're right about inter-component communication, this I think has to do with host APIs, which are very different. The wit-bindgen bindings do in fact give zero-copy views into strings/list<u8>/etc; it's just that you don't get mutable windows even with the component model, since there's basically no safe way to do that.

view this post on Zulip Peter Huene (Aug 10 2022 at 19:53):

although I will say it doesn't prohibit it, per se, just leaves such machinations up to the embedder to come up with. there's no representation of an "address" in the value types (unlike with the witx attributes), but a number is just a number; the embedder could figure out what linear memory to interact with (unsafely); another component can't access the same memory unless imported (e.g. a "shared everything libc" scheme)

view this post on Zulip Dan Gohman (Aug 10 2022 at 19:57):

Alternatively, with a stream type, hosts can read from and write to the buffers directly without special conventions.

view this post on Zulip Peter Huene (Aug 10 2022 at 20:07):

Actually, reading over the shared everything libc explainer again, I guess it's up to the adapter to see that the source and destination memories are the same and pass the lowered pointers straight through without a realloc/copy?

view this post on Zulip Peter Huene (Aug 10 2022 at 20:08):

for component-to-component (anyway, off topic for what Benjamin is talking about)


Last updated: Nov 22 2024 at 17:03 UTC