Stream: git-wasmtime

Topic: wasmtime / issue #13112 Should accesses to the GC heap us...


Wasmtime GitHub notifications bot (Apr 15 2026 at 21:32):

fitzgen opened issue #13112:

Random thought while reading this, shouldn't notrap and aligned be absent here? Given the sandboxing strategy, I'd expect this not to be asserted as aligned and, additionally, to be allowed to trap.

_Originally posted by @alexcrichton in https://github.com/bytecodealliance/wasmtime/pull/13107#discussion_r3088485306_
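
As a rough illustration of the alternative raised in this issue, the sketch below models a GC heap access that makes neither assumption: the offset may be unaligned, and an out-of-bounds access produces a well-defined trap. The names `Trap` and `gc_heap_load_u32` are hypothetical, and the explicit bounds check only stands in for whatever mechanism a real runtime would use; this is not Wasmtime's or Cranelift's implementation.

```rust
/// Hypothetical trap type for this sketch; not Wasmtime's own `Trap`.
#[derive(Debug, PartialEq)]
enum Trap {
    OutOfBounds,
}

/// Load a little-endian `u32` from a GC heap modeled as a plain byte slice.
/// No alignment is assumed, and an out-of-bounds offset traps instead of
/// reading outside the modeled sandbox.
fn gc_heap_load_u32(heap: &[u8], offset: usize) -> Result<u32, Trap> {
    // Guard against overflow when computing the end of the 4-byte access.
    let end = offset.checked_add(4).ok_or(Trap::OutOfBounds)?;
    // `get` returns `None` when the range is not fully inside the heap.
    let bytes = heap.get(offset..end).ok_or(Trap::OutOfBounds)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Ok(u32::from_le_bytes(buf))
}
```

With a six-byte heap `[1, 2, 3, 4, 5, 6]`, a load at the unaligned offset 1 succeeds, while a load at offset 4 returns `Err(Trap::OutOfBounds)` because it would need bytes 4..8.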

Wasmtime GitHub notifications bot (Apr 15 2026 at 21:32):

fitzgen added the wasm-proposal:gc label to Issue #13112.

Wasmtime GitHub notifications bot (Apr 15 2026 at 21:52):

fitzgen commented on issue #13112:

What I'm worried about is UB-of-sorts where we're telling Cranelift that this load is always aligned and never traps and then at runtime, assuming there's a bug in either Cranelift or Wasmtime's GC, that's violated (in theory causing UB). I'm wary to bucket this under having a known set of possible outcomes because we're effectively violating a core assumption and I'm not sure we can enumerate all the outcomes. By analogy, the vec OOB isn't UB to hit the #[cold] block naturally, but here I'm worried that it would be UB somehow to hit a trap here.

I could see this argument for aligned.

I don't think it applies to notrap though.

Wasmtime GitHub notifications bot (Apr 15 2026 at 22:08):

cfallin edited a comment on issue #13112:

I guess there's a question of what we intend notrap and aligned to mean. We document them here as:

So by the docs, treating Cranelift as opaque, I think we're actually safe to use aligned today (because at worst, we define that the load/store traps or returns a wrong result for a load; neither of those is UB or propagates beyond the intended sandbox). But we are not safe to use notrap today if we want to be robust to accidental sandbox violations because we simply say that the IR asserts that a trap will not occur; we don't say what happens if it does.

We could define notrap more precisely to mean "definitely will not have trap metadata, and may or may not cause a SIGSEGV, and if it does, may or may not occur at exactly the right point; if it does SEGV, it will do so at the given address". That permits all of our intended optimizations (code motion, dead load/store removal), and is still constrained enough that we can reason about that behavior interacting with the Wasmtime runtime: in particular, "may still SEGV, may not, but will not alter address" would let us be reasonably sure about holding the sandbox boundary.

All that said, my take: I think we should reason about GC accesses the same way we do about Wasm guest loads/stores. I realize that's giving up optimization opportunity, but it is closer to the original intent of the idea of our GC sandboxing: we should "just" be using linear memories under the hood.

Separately, though, we should probably think about making post-trap state unobservable as an option; and if we do that, we can then (still) do dead store/load removal, and store reordering (between other sequence points like opaque hostcalls), not only for the GC heap but also for user linear memories as well.
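
To make the trade-off in this comment easier to scan, here is a small sketch that only restates its claims in code form. The `Opt` enum and the `permitted` function are hypothetical names invented for illustration; they are not part of Cranelift, and the mapping reflects what the comment suggests rather than any implemented or documented behavior.

```rust
/// Optimizations named in the comment above (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Opt {
    CodeMotion,
    DeadAccessRemoval,
    StoreReordering,
}

/// Which optimizations the comment suggests would be available for an access.
/// `notrap` models the flag as the comment proposes to define it, and
/// `post_trap_state_unobservable` models the proposed opt-in mode.
fn permitted(notrap: bool, post_trap_state_unobservable: bool) -> Vec<Opt> {
    if notrap {
        // "code motion, dead load/store removal"
        vec![Opt::CodeMotion, Opt::DeadAccessRemoval]
    } else if post_trap_state_unobservable {
        // "dead store/load removal, and store reordering"
        vec![Opt::DeadAccessRemoval, Opt::StoreReordering]
    } else {
        // Treated like ordinary Wasm guest loads/stores: none of the above.
        vec![]
    }
}
```

Under this reading, `permitted(false, false)` is empty, which is the optimization opportunity the comment acknowledges giving up when GC accesses are treated like guest loads and stores.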

Wasmtime GitHub notifications bot (Apr 16 2026 at 00:52):

alexcrichton commented on issue #13112:

Personally I agree with @cfallin's conclusion of treating GC loads/stores the same as wasm loads/stores; I feel that fits our sandboxing model best. W.r.t. the optimization concerns you have @fitzgen, my naive assumption is "that's what binaryen is for" or some sort of optimization pass. For example we expect LLVM-optimized-wasm to be suitable for "ok we can't move these memory ops", so I feel like we should have a similar expectation for GC-using wasm where it should be optimal coming in, not rely on Cranelift to clean up the wasm itself. To me Cranelift is responsible for primarily cleaning up Wasmtime's runtime abstractions, e.g. the base pointer of linear memory and hoisting that out, but not for cleaning up the input wasm.

Wasmtime GitHub notifications bot (Apr 16 2026 at 13:24):

tschneidereit commented on issue #13112:

> For example we expect LLVM-optimized-wasm to be suitable for "ok we can't move these memory ops", so I feel like we should have a similar expectation for GC-using wasm where it should be optimal coming in

I'm not sure how well this will hold up for too much longer, fwiw, perhaps in particular for GC: there are lots of languages that already don't go through LLVM, and then there are some (including Rust!) that do go through LLVM, but not through wasm-opt, which LLVM itself pretty much assumes to be part of its optimization pipeline.

Long-term, my guess is that we'll have to add at least the most obvious optimizations that would happen in wasm-opt if used, and perhaps some of what's happening in LLVM itself, too.


Last updated: May 03 2026 at 22:13 UTC