tl;dr We're hoping to find safer ways to access slices and strs in Wiggle trait implementations, particularly in the presence of invalidating operations like yielding. But we don't want to mess up what's already working well for wasi-common.
Hey folks, this week I've been looking at how to start rolling out Wiggle-generated code to more hostcall implementations in our Wasm execution daemon. I've run into a couple points of additional friction vs writing hostcall implementations manually. I'm hoping y'all might have some ideas for how we might be able to reduce this friction?
The first problem is one that I've actually managed to sidestep for now: Lucet supports a family of yield functions on its VM context object which are quite useful for integrating Lucet execution with async Rust runtimes. For safety, these require an exclusive &mut
reference to the VM context, but Wiggle only provides a shared &
reference to the user-provided context type. I've sidestepped this issue for the moment by converting async calls within hostcalls to blocking calls using futures::executor::block_on
. This works for now, but is not ideal because it prevents those blocked threads from handling other async work.
The bigger problem at the moment is that it's currently quite painstaking to get access to slices and strs in Wiggle implementations. It requires using the honor system with GuestBorrows, plus explicit unsafe to dereference a raw pointer. My understanding is that this is pretty uncommon in wasi-common compared to dealing with [pointers to] scalars, enums, and structs, but in our Wasm daemon the situation is practically the reverse: we're dealing with slices and strings in almost all of our hostcalls. If it was just a few functions in special situations, I would feel better about rolling this out with appropriate signposting and warnings, but right now we're holding off until the common case no longer requires this manually-checked reasoning.
In Lucet, we address this with dynamic borrow checking on heap accessors. This works out nicely, because any operations on the VM context that might invalidate the view of the heap such as yielding or growing the heap (a safe callback interface is still a TODO for us; we don't use callbacks in our daemon) require exclusive &mut
access, and thus the borrow checker prevents the Ref
s and RefMut
s from staying live during these operations. The hostcall implementor is then free to reacquire updated views of these resources on the other side of the invalidating operation.
I tried noodling around a bit adding RefCell
s to my Wiggle context, and to my Wiggle GuestMemory
implementation, in order to achieve a similar effect, but I'm running into a couple problems:
GuestPtr
s contain shared references to the GuestMemory
, and are provided to the implementation as function arguments. This means that even if we change the shared &
reference to a dynamically-checked Ref
, we have no way of re-acquiring that borrow after relinquishing it for an invalidating operation that requires an exclusive &mut
or RefMut
reference.
I haven't found a workable solution for having the GuestMemory
and the user-provided Wiggle context share borrow state. Having a single object produce both our views into memory, and mediate access to invalidating operations is how we get machine-checked assurance that we're not changing the world out from under ourselves. So naturally, I tried to make these values share a borrow state through RefCell
in order to recover these semantics.
What I found was that it's very difficult to nest RefCell
s in a way that allows functions to return a Ref
or RefMut
from the inner cell. While you can map over a Ref
, the closure you map with must return a &U
, it can't return Ref<'_, U>
. This makes sense because of the ownership logic of the dynamic borrows involved here, and I'm guessing could be worked around with a custom variant of RefCell
, but :grimacing:
So I'm stuck wishing for three things, and realizing they're probably incompatible with the current Wiggle API:
GuestMemory
and user-provided context be the same object, with no indirection through RefCell
, RwLock
, etc.&mut
access to that object in the body of my Wiggle trait implementations.GuestPtr
s on the other side of an invalidating operation, or have them be safe to hold across such an operation.Do y'all have ideas for how we might be able to accommodate these wishes without having to tear down what we've already built and what works well for wasi-common?
@acfoltzer FWIW I believe I'm "at fault" for most of the current restrictions in wiggle, but that was mostly just me rationalizing Rust's memory model with executing arbitrary wasm code. Given that there's not a huge amount of wiggle (hah) room without larger changes.
For example if you've got a &mut T
that points into wasm memory, then it's memory unsafe to just call back into the wasm. Nothing about GuestBorrows
will help fix that unsoundness. This could be fixable, however, if you've got a larger system which dynamically prevents reentering the wasm once you have a &mut T
(and then you'd also dynamic-borrow-check to make sure the &mut T
didn't overlap with anything)
Yep, and that's exactly what we have in our daemon
it seems like a fundamentally different approach to what Wiggle is currently doing, though, which led to this wall of text :upside_down:
It sounds like some of these may be solvable with codegen tweaks? My intention was that wiggle would generate 100% safe things, but if y'all have extra checking in place that wiggle isn't aware of then it would be pushing the unsafe boundary outwards
but yeah if y'all are statically preventing reentering wasm then that changes a whole lot about wiggle
yeah, it's a bit of a hack but it works pretty nicely in practice
from a bindings-generator point of view which is purely consuming wasmtime
APIs, it's impossible to provide that guarantee statically though
e.g. if other users of wiggle come along
it does require some pretty gross stuff under the hood though, most notoriously https://github.com/bytecodealliance/lucet/blob/master/lucet-runtime/lucet-runtime-internals/src/vmctx.rs#L177
we might have to wait for Pat to come back from vacation to really get to the point of making any decisions, because he's much more familiar with the API constraints you have on the wasmtime
side
I may have also just mistaken the purpose of wiggle somewhat
one of my fallback ideas though is to see about factoring the witx consumption and reasoning about Rust host-side types into a common module, and just doing different codegen for Lucet
I was viewing it from a lens of "anyone who's binding *.witx
will want to use this to bind host functions"
but if it's instead "fastly is binding *.witx
with these extra assumptions" then wiggle should look very different
it sounds like y'all are making assumptions about the wasm execution which other projects may not always be able to make
hmm, do you have an example of that?
which changes quite a bit how nice a bindings library can be
oh so I don't think we have any other examples of wiggle in use, but I'm imagining a use case which allows reentrancy
e.g. you call into the host and then it calls you back during that host call
the possibility of that happening is a fundamental assumption of wiggle right now, but it sounds like y'all prevent that via other means
but no one's getting too too much use out of wiggle right now, so we can also just document how it's safely used
and change it to assume no reentrancy
oh, you mean that some other thread in the host calls into the same Wasm module?
rather than the hostcall implementation doing the callback on the same thread?
nah not another thread, but the same thread itself
like some funky syscall where you call host.call_me_back(my_fn_pointer)
and then the host, during that function call, calls my_fn_pointer
so the native stack looks like [top] -> wasm code -> [hostcall] -> wasm code -> ...
where any borrows that [hostcall]
has active are immediately invalidated by executing wasm code
right. our assumption is that you can do that, but only through a function that takes &mut Vmctx
so that outstanding heap references must be dropped
(that function doesn't exist yet in our Vmctx
interface though; yield
is a better example right now)
ah ok, that's sort of the same thing though, the risk of allowing reentering wasm during a hostcall is active borrows
exactly
but y'all are dynamically guaranteeing there are no active borrows with Rust's lifetime system (I think?)
that assumption wasn't baked into wiggle b/c that's an application-specific restriction you have
but wiggle could of course be rewritten or updated with that restriction
and simply say "applications gotta have something like this"
that's right. Rust will complain if the heap view Ref<'_, [u8]>
is live when you try to use &mut Vmctx
@acfoltzer Would it be possible to build around wiggle
, here wiggle
staying largely the same and you providing something more specific like @Alex Crichton is pointing out?
that is what I was trying to do with the RefCell
experiments I describe above
maybe I just got myself backed into a corner with it, but I couldn't figure out how to make it work out
Oh and for the record, although perhaps I misunderstood you, we do quite a few calls to get a string out of guest in wasi-common
ah, Pat was describing it as a less common case in wasi-common than it is in our daemon
oh it may be so
(also to clarify I didn't really understand what you were mentioning about RefCell and things above)
in wasi-common
any syscall that includes "path" requires fetching a string
Alex Crichton said:
(also to clarify I didn't really understand what you were mentioning about RefCell and things above)
hah sorry. that was a bit more of a brain dump than I had intended. I was basically trying to recover the connection between the lifetimes of the active borrows with the lifetime of the user-provided context type
and as far &
vs &mut
I thought that was the desired requirement so that we can easily stick any context object in an Rc
/Arc
ahh, that's good to know. we do not have that as a restriction in Lucet, so we've always had the hostcall context as a &mut
But also to be clear, this is again sort of just shaping wiggle as a library that makes minimal assumptions
yeah, that makes sense
Wiggle right now is designed to assume a maximally malicious host which wants to stress everything
but it still wants to provide a 100% safe interface
so the trivial reason for requiring &
for user contexts is that wasm can call the host twice on the stack
which makes it sort of like TLS, so you can only get a shared reference
once you start making other assumptions though these requirements go away
it sounds like y'all are making enough assumptions that wiggle is just getting in the way
I think the assumptions you're making though might be quite difficult for other applications to make as well, so I'm not sure if there's a nice reusable binding layer to share unfortunately
I think I agree with @Alex Crichton on this
Perhaps we could figure out if we could make wiggle
-like interface pluggable into wasi-common
rather than specifically requiring it?
yeah, and it's possible this may just come down to us having faulty assumptions about the safety of our system. for example if you have a stack with [Rust] -> [Wasm] -> [Hostcall] -> [Wasm] -> [Hostcall] then you have two &mut Vmctx
on the stack pointing to the same place. our argument has been this is okay because the second is entirely contained within the first, but maybe that's naïve
Not sure if that makes sense but then you guys could have a more specialised version of it and everything would be fine
yeah, I'm leaning toward something like that
@acfoltzer I think that's a really fine line to walk there
I wouldn't say with certainty that it's safe or unsafe
I'm glad we agree :laughing:
You can have two &mut T
pointers on the stack that point to the same thing, but in safe rust rustc/llvm can see the pointers getting passed around
the unsafety would happen I believe if they can't see the pointers passed around
like:
let x: &mut Context = ...;
let y = x.field1;
foobar(); // may read/mutate `x`
assert_eq!(x.field1, y);
if rustc sees you pass x
to foobar
, then it knows it has to reload x.field1
after the call
if it doesn't see you pass x
to foobar
then it could assume it's impossible to change, so there's no need to reload x.field1
but if x.field1
was mutated in foobar
then that's an issue
clearly if you do foobar(x)
rustc/llvm will see through all that, but if it's actually some_wasm_call()
it's not clear if LLVM/rustc can see enough to assume that pointers might change
that makes sense, and certainly makes our current interface unsafe. but a callback interface that requires a &mut Vmctx
would let Rust see the pointer being used
these are also sort of just vague thoughts though, I'm definitely not like on the unsafe guidelines wg or anything like that
yeah, I'm glad to see that your thinking resembles mine quite a bit, though
fine line indeed
okay, well thank you for the extra context about the whys of the current interface. I think I'll have to wait for Pat to get back and chat with him about what our strategy should be. right now I'm leaning toward doing additional Lucet-specific codegen (we already do a bit in lucet-wiggle
), but this is probably worth talking about more at the meeting next week
given that wasmtime/lucet integration is on our roadmap, it'd be good to understand exactly what the assumptions are in each system, and know how we can reconcile them
acfoltzer said:
okay, well thank you for the extra context about the whys of the current interface. I think I'll have to wait for Pat to get back and chat with him about what our strategy should be. right now I'm leaning toward doing additional Lucet-specific codegen (we already do a bit in
lucet-wiggle
), but this is probably worth talking about more at the meeting next week
Sounds good! Oh and would you mind pinging a couple of links in my direction where this problem space appears in lucet? I used to keep track of you changes guys but sadly not for a while now...
or if they can't be reconciled, how to turn the difference into one of the "knobs" we've discussed before for modularizing the systems
@Jakub Konka I unfortunately can't share the daemon code where we're doing this hostcall implementation, if that's what you mean
the lucet-wiggle
stuff though is just a thin wrapper around wiggle
crates. you can see the core idea here: https://github.com/bytecodealliance/lucet/blob/master/lucet-wiggle/generate/src/lib.rs
sorry, meant to link to this line: https://github.com/bytecodealliance/lucet/blob/master/lucet-wiggle/generate/src/lib.rs#L64
acfoltzer said:
Jakub Konka I unfortunately can't share the daemon code where we're doing this hostcall implementation, if that's what you mean
Oh, sure thing, I actually meant the lucet stuff mainly and lucet-wiggle in particular :-)
acfoltzer said:
sorry, meant to link to this line: https://github.com/bytecodealliance/lucet/blob/master/lucet-wiggle/generate/src/lib.rs#L64
Awesome, thanks!
thank you all for taking a look :heart:
Last updated: Nov 22 2024 at 16:03 UTC