alexcrichton requested dicej for a review on PR #10043.
alexcrichton opened PR #10043 from alexcrichton:vm-safe
to bytecodealliance:main
:
This commit is an internal refactoring of Wasmtime's runtime to prepare
to execute Pulley in MIRI. Currently today this is not possible because
Pulley does not properly respect either strict or permissive provenance
models. The goal of this refactoring is to enable fixing this in a
future commit that doesn't touch everything in the codebase. Instead
everything is touched here in this commit.The basic problem with Pulley is that it is incompatible with the strict
provenance model of Rust which means that we'll be using "exposed
provenance" APIs to satisfy Rust's soundness requirements. In this model
we must explicitly callptr.expose_provenance()
on any pointers
which are exposed to compiled code. Arguably we should also be already
doing this for natively-compiled code but I am not certain about how
strictly this is required.Currently in Wasmtime today we call
ptr.expose_provenance()
nowhere.
It also turns out, though, that we share quite a few pointers in quite a
few places with compiled code. This creates a bit of a problem! The
solution settled on in this commit looks like:
A new marker trait,
VmSafe
, is introduced. This trait is used to
represent "safe to share with compiled code" types and enumerates some
properties such as defined ABIs, primitives wrappers match primitive
ABIs, and notably "does not contain a pointer".A new type,
VmPtr<T>
, is added to represent pointers shared with
compiled code. Internally for now this is justSendSyncPtr<T>
but in
the future it will beusize
. By usingSendSyncPtr<T>
it shouldn't
actually really change anything today other than requiring a lot of
refactoring to get the types to line up.The core
vmctx_plus_offset*
methods are updated to require
T: VmSafe
. Previously they allowed anyT
which is relatively
dangerous to store any possible Rust type in Cranelift-accessible
areas.These three fundamental changes were introduced in this commit. All
further changes were refactoring necessary to get everything working
after these changes. For example many types invmcontext.rs
, such as
VMFuncRef
, have changed to usingVmPtr<T>
instead ofNonNull<T>
or
*mut T
. This is a pretty expansive change which resulted in touching a
lot of places.One premise of
VmPtr<T>
is that it's non-null. This was an
additional refactoring that updated a lot of places where previously
*mut T
was used and now eitherVmPtr<T>
orNonNull<T>
is used.In the end the intention is that
VmPtr<T>
is used whenever pointers
are store in memory that can be accessed from Cranelift. When operating
inside of the runtimeNonNull<T>
orSendSyncPtr<T>
is preferred
instead.As a final note, no provenance changes have actually happened yet. For
example this doesn't fix Pulley in MIRI. What it does enable, though, is
that the future commit to fix Pulley in MIRI will be much smaller with
this having already landed.
alexcrichton requested wasmtime-core-reviewers for a review on PR #10043.
alexcrichton commented on PR #10043:
Note that this is temporarily based on https://github.com/bytecodealliance/wasmtime/pull/10040 which accounts for the first few commits.
alexcrichton updated PR #10043.
cfallin commented on PR #10043:
The basic problem with Pulley is that it is incompatible with the strict
provenance model of RustI'll link here that we have at least one idea (#9060) that would make Pulley compatible with strict provenance, albeit as a major refactor to the way that Wasmtime's data structures work. No argument that with the design today it is basically incompatible with strict provenance.
alexcrichton commented on PR #10043:
Oh true good point! I'm a bit skeptical of the feasibility of such an implementation route though...
One thing I've also been wondering about though, which is a bit related, is let's ignore Pulley and what's our provenance story for Cranelift-compiled x64 code? For example is it possible in Rust's model to still have strict provenance when JIT code is in the mix? Are we required regardless to use "expose provenance" as a primitive to say "modifications to this memory are happening behind you're back". I'm not sure myself.
alexcrichton updated PR #10043.
alexcrichton updated PR #10043.
cfallin commented on PR #10043:
It seems to me at least (with my compiler optimizer hat on) that separate compilation is the ultimate "provenance black hole" -- calling JIT code is no different from, say, calling a C library, which could legally return any pointer derived from anything reachable that we pass into it. As long as Rust supports linking to code that is not compiled by the same rustc invocation (i.e., does not adopt a fully closed-world assumption), its provenance model has to support the "outside world" escape hatch that calling JIT code would also fall into. And unless strict provenance is going to say "you are no longer allowed to link C libraries, and all indirect calls have to be statically devirtualizable" that's as far as an analysis is going to get.
bjorn3 commented on PR #10043:
For example is it possible in Rust's model to still have strict provenance when JIT code is in the mix? Are we required regardless to use "expose provenance" as a primitive to say "modifications to this memory are happening behind you're back". I'm not sure myself.
Calling jitted code is no different from FFI. The foreign code may only access pointers for which it could have received provenance (either through exposed provenance or through a pointer that was passed in) Unluke with Pulley the compiler doesn't have visibility into what the foreign code does, so it for exanple doesn't know if you are loading an integer (erasing provenance) or loading a pointer (preserving provenance if it exists) or which of the available provenances was used for the memory operation. As such it has to assume that the foreign code picked the right provenance to avoid UB if the right provenance could have been picked in the first place due to being exposed to the foreign code.
bjorn3 edited a comment on PR #10043:
For example is it possible in Rust's model to still have strict provenance when JIT code is in the mix? Are we required regardless to use "expose provenance" as a primitive to say "modifications to this memory are happening behind you're back". I'm not sure myself.
Calling jitted code is no different from FFI. The foreign code may only access pointers for which it could have received provenance (either through exposed provenance or through a pointer that was passed in) Unluke with Pulley the compiler doesn't have visibility into what the foreign code does, so it for exanple doesn't know if you are loading an integer (erasing provenance) or loading a pointer (preserving provenance if it exists) or which of the available provenances was used for the memory operation. As such it has to assume that the foreign code picked the right provenance to avoid UB if the right provenance could have been picked in the first place due to being exposed to the foreign code.
Edit: What @cfallin said.
alexcrichton commented on PR #10043:
Ok that makes sense to me yeah. If this becomes a problem in the future we can perhaps have a different implementation of
VmPtr
depending on if pulley is turned on or not. Without pulley we'd require strict provenance but with pulley we'd require only permissive provenance.
alexcrichton updated PR #10043.
github-actions[bot] commented on PR #10043:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"Thus the following users have been cc'd because of the following labels:
- fitzgen: wasmtime:ref-types
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
alexcrichton updated PR #10043.
alexcrichton updated PR #10043.
alexcrichton commented on PR #10043:
Ok this is now rebased and I believe should be ready for review @dicej
alexcrichton commented on PR #10043:
(I can also tag someone else in if you'd prefer too)
(I can also tag someone else in if you'd prefer too)
I think that would be a good idea. I don't have any experience with the Pulley code so far and don't feel qualified to review something like this -- at least not yet.
alexcrichton updated PR #10043.
...although browsing the code, maybe there's not much here that's Pulley-specific? In which case I could give it a shot.
alexcrichton requested fitzgen for a review on PR #10043.
alexcrichton commented on PR #10043:
Happy to go either way yeah, it's true yeah that while the main motivation here is Pulley that's mostly for a subsequent PR running miri and pulley stuff. This PR itself is mostly just refactoring vm internals towards a
VmPtr<T>
type and aVmSafe
trait. I've tagged @fitzgen here but if you'd like to review I'm sure Nick won't mind heh
fitzgen submitted PR review:
LGTM modulo a couple nitpicks
fitzgen created PR review comment:
In these comments there are a lot of references to Cranelift, but it could also be Winch. Might be worth rewording "Cranelift-compiled code" to "compiled Wasm code" or something.
fitzgen created PR review comment:
This comment is going to be out of date as soon as your next PR lands. Is it even worth including here? Seems like it will be easy to forget to remove down the line, might as well not even add it in the first place...
dicej submitted PR review.
dicej created PR review comment:
Just checking: is this sound? Do we know for a fact (e.g. based on Rust language sematnics) that the
&mut []
pointer will be valid outside of this statement?
dicej created PR review comment:
And if it _is_ sound, I assume that's only true because it's an empty array literal and thus has a
'static
lifetime?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
To the best of my knowledge, which is limited, this is sound even if the compiler doesn't use a
'static
lifetime here. My understanding is that there's no soundness issue having a pointer to something that's deallocated, for example malloc'ing some data, saving the pointer, and then freeing the data. So even if the array here isn't'static
it's safe to have a pointer just pointing to some random address on the stack.The other reason I think this is safe is because this is creating
NonNull<[T]>
which not only has a pointer but also a length. For&mut []
the length is always zero, and that means that the pointer, invalid though it may be, is never accessed.So all in all I think it's safe because the pointer is never accessed due to the zero length on the pointer, regardless of whether rustc extends this to
'static
or not
alexcrichton updated PR #10043.
alexcrichton has enabled auto merge for PR #10043.
alexcrichton merged PR #10043.
Last updated: Jan 24 2025 at 00:11 UTC