Stream: git-wasmtime

Topic: wasmtime / issue #9060 Idea: put VM data structures in ar...


Wasmtime GitHub notifications bot (Aug 01 2024 at 19:35):

cfallin opened issue #9060:

As part of the discussion on #9015 / #9026, we discussed handling of VM data structures -- the vmctx struct, tables, function references, and the like -- that are touched both by runtime code (in Rust) and by generated code compiled from the Wasm. There are issues related to strict pointer provenance because pointers to these data structures are exposed to the generated code, and/or the Pulley interpreter, without strict provenance (either through the Pulley bytecode, or through the machine code we invoke that is entirely outside of the domain of Rust's semantics).

It occurs to me that one way to solve this would be to make all VM data structures use relative pointers -- e.g., u32 offsets -- in an arena (per store? per engine?) whose base pointer is a parameter both to the generated code and to the Pulley interpreter. We then trivially have strict provenance because there is only one pointer -- and whatever we need to do to preserve provenance (keep it as a pointer in the Pulley interpreter loop; and "expose" it as we pass it to generated code) is localized and manageable.
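To make the relative-pointer idea concrete, here is a minimal sketch, not code from Wasmtime. The names `Arena`, `Offset`, and `follow_link` are hypothetical, and the 8-byte "vmctx" and 16-byte "table" layouts are invented for illustration. It uses a safe `Vec<u8>` in place of the raw base pointer that would actually be handed to generated code or to Pulley.

```rust
/// Hypothetical arena: a single allocation in which every internal
/// reference is a u32 byte offset rather than a machine address.
pub struct Arena {
    bytes: Vec<u8>,
}

/// Hypothetical relative pointer: only meaningful together with an arena base.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Offset(pub u32);

impl Arena {
    pub fn new() -> Self {
        Arena { bytes: Vec::new() }
    }

    /// Bump-allocate `len` zeroed bytes at a 4-byte-aligned offset.
    pub fn alloc(&mut self, len: u32) -> Offset {
        let start = (self.bytes.len() + 3) & !3;
        self.bytes.resize(start + len as usize, 0);
        Offset(start as u32)
    }

    /// Store a little-endian u32 at `at + field`.
    pub fn store_u32(&mut self, at: Offset, field: u32, value: u32) {
        let i = (at.0 + field) as usize;
        self.bytes[i..i + 4].copy_from_slice(&value.to_le_bytes());
    }

    /// Load a little-endian u32 from `at + field`.
    pub fn load_u32(&self, at: Offset, field: u32) -> u32 {
        let i = (at.0 + field) as usize;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[i..i + 4]);
        u32::from_le_bytes(buf)
    }
}

/// Build a made-up "vmctx" whose first field holds the offset of a made-up
/// "table", then follow that link using only the arena base plus offsets.
pub fn follow_link() -> u32 {
    let mut arena = Arena::new();
    let vmctx = arena.alloc(8);
    let table = arena.alloc(16);
    arena.store_u32(table, 4, 0xfeed); // second 4-byte slot of the table
    arena.store_u32(vmctx, 0, table.0); // the link is an offset, not an address
    let table_ref = Offset(arena.load_u32(vmctx, 0));
    arena.load_u32(table_ref, 4)
}
```

Because every link is an offset, the only real pointer in play is the arena base, so that is the one place where provenance has to be handled.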

If provenance were the only benefit, that may not be so interesting; but there are a few others as well:

So it seems we can get (i) fully strict provenance, (ii) better safety, (iii) other interesting new abstractions like whole-engine snapshotting, if we pay this cost. Something to consider later if any of these needs becomes interesting?

Wasmtime GitHub notifications bot (Aug 02 2024 at 18:02):

alexcrichton commented on issue #9060:

How would this be reflected in CLIF? All loads/stores would have to be derived from a small set of "base pointers", such as the base of each linear memory in a module and the arena holding vmctx/tables/etc. For example, Pulley might have a load-from-memory-zero instruction or a load-from-arena instruction, but how would it know which to choose from a CLIF load?

(and possibly another area for "stack memory of this function", which raises the question of how to detect loads/stores to the stack)
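A small sketch may help show why this question matters. Everything here is hypothetical: `Region`, `Load`, and `Machine` are illustrative names, not Pulley opcodes or CLIF constructs. The sketch assumes each load carries an explicit region tag, which is exactly the information a plain CLIF load does not provide.

```rust
/// Hypothetical address spaces an interpreter might distinguish.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Region {
    Arena,
    Memory0,
    Stack,
}

/// Hypothetical load instruction that names its region explicitly.
#[derive(Clone, Copy, Debug)]
pub struct Load {
    pub region: Region,
    pub offset: u32,
}

/// Hypothetical interpreter state with one backing buffer per region.
pub struct Machine {
    pub arena: Vec<u8>,
    pub memory0: Vec<u8>,
    pub stack: Vec<u8>,
}

impl Machine {
    /// Resolve the region tag to a base, then read one byte.
    /// Returns `None` when the offset is out of bounds for that region.
    pub fn load_u8(&self, op: Load) -> Option<u8> {
        let base: &[u8] = match op.region {
            Region::Arena => &self.arena,
            Region::Memory0 => &self.memory0,
            Region::Stack => &self.stack,
        };
        base.get(op.offset as usize).copied()
    }
}
```

The same offset yields different bytes depending on the tag, so whatever lowers CLIF to such instructions would need a way to recover the region for every load and store.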

Wasmtime GitHub notifications bot (Aug 02 2024 at 18:14):

cfallin commented on issue #9060:

There are at least two ways, I think:

I kind of like the first more -- it's less intrusive to Cranelift (in a way that avoids complexity-footguns around e.g. alias analysis with separate address spaces) at the cost of a little more constraint on memory layout (but then we're already saying we're going to put everything in some arena).

Wasmtime GitHub notifications bot (Aug 02 2024 at 18:21):

cfallin commented on issue #9060:

(For clarity on Option 1: the base pointer is implicit and affects all loads/stores; we wouldn't add an extra argument.)

Wasmtime GitHub notifications bot (Aug 02 2024 at 18:21):

cfallin edited a comment on issue #9060:

(For clarity on Option 1: the base pointer is implicit and affects all loads/stores; we wouldn't add an extra argument to load/store instructions.)
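One way to picture an implicit base is the toy interpreter below. It is only a sketch under assumptions: `Inst` and `run` are hypothetical names, the four byte-sized registers are invented, and a bounds-checked slice stands in for the real base pointer. The point it illustrates is that the load/store instructions carry an address operand but no base operand.

```rust
/// Hypothetical instructions: each access names an address but no base.
#[derive(Clone, Copy, Debug)]
pub enum Inst {
    Load { dst: usize, addr: u32 },
    Store { src: usize, addr: u32 },
}

/// Run `program` against `region`, the single implicit base for every access.
/// Returns `None` if a register index or an address is out of range.
pub fn run(program: &[Inst], region: &mut [u8], regs: &mut [u8; 4]) -> Option<()> {
    for inst in program {
        match *inst {
            Inst::Load { dst, addr } => {
                // The base is supplied by the interpreter, not by the instruction.
                let value = *region.get(addr as usize)?;
                *regs.get_mut(dst)? = value;
            }
            Inst::Store { src, addr } => {
                let value = *regs.get(src)?;
                *region.get_mut(addr as usize)? = value;
            }
        }
    }
    Some(())
}
```

Since the base never appears in the instruction stream, the instruction format stays the same whichever region is installed.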


Last updated: Nov 22 2024 at 16:03 UTC