Stream: git-wasmtime

Topic: wasmtime / PR #11930 Debug: implement call injection to i...


Wasmtime GitHub notifications bot (Oct 24 2025 at 00:49):

cfallin opened PR #11930 from cfallin:wasmtime-debug-signal-inject-calls to bytecodealliance:main:

(Stacked on top of #11921)

This repurposes the code from #11826 to "inject calls": when in a signal
handler, we can update the register state to redirect execution upon
signal-handler return to a special hand-written trampoline, and this
trampoline can save all registers and enter the host, just as if a
hostcall had occurred.

As before, this is Linux-only in its current draft; I still need to add macOS and Windows support. Putting this up to show how a few loose ends in #11921 get used.

Wasmtime GitHub notifications bot (Oct 24 2025 at 01:03):

cfallin commented on PR #11930:

I'll note for brainstorming purposes that the current problem in front of me is how to rework macOS' Mach ports-based signal handler to work with this. To recap a bit what the requirements on each platform are, and how this "call injection" works:

The basic need is to inject enough state into the register context, along with redirecting PC, that a stub can take control, find the Store state, invoke any debug event handler, then restore all context and return to the guest if it's a resumable trap (which this PR doesn't have, but we will have in a few more PRs for breakpoints).

One can see how this is a little tricky. The approach I've taken that is at least Windows and Linux-compatible is to update only registers, not the stack (because Windows); inject args into the registers; save off the original register values and PC to the VMStoreContext (which we have via TLS in the signal handler); then in the trampoline, save all regs to the stack, and copy the original values of the injected registers back from the store to the stack save-frame.
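The register-only scheme described above can be modeled in a few lines. This is a minimal sketch, not the PR's code: `RegContext`, `SavedRegs`, `inject_call`, and `fix_up_save_frame` are hypothetical names, the context holds only three registers, and the "store" is a plain struct rather than the real VMStoreContext reached via TLS.

```rust
/// Simplified stand-in for the register state a signal handler can edit.
/// A real context (e.g. `ucontext_t` on Linux) holds many more registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegContext {
    pub pc: u64,
    pub arg0: u64,
    pub arg1: u64,
}

/// Hypothetical slice of per-store state used to stash clobbered values.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SavedRegs {
    pub pc: u64,
    pub arg0: u64,
    pub arg1: u64,
}

/// Signal-handler side: save what is about to be clobbered, then redirect.
/// Only registers are touched; the guest stack is left alone.
pub fn inject_call(
    ctx: &mut RegContext,
    saved: &mut SavedRegs,
    trampoline_pc: u64,
    args: [u64; 2],
) {
    *saved = SavedRegs { pc: ctx.pc, arg0: ctx.arg0, arg1: ctx.arg1 };
    ctx.arg0 = args[0];
    ctx.arg1 = args[1];
    ctx.pc = trampoline_pc;
}

/// Trampoline side: after all registers have been spilled to a save frame,
/// overwrite the injected ones with their original values so that restoring
/// the frame resumes the guest exactly where it was interrupted.
pub fn fix_up_save_frame(frame: &mut RegContext, saved: &SavedRegs) {
    frame.pc = saved.pc;
    frame.arg0 = saved.arg0;
    frame.arg1 = saved.arg1;
}
```

The point of the model is the ordering: the handler saves before it clobbers, and the trampoline patches the save frame rather than the live registers, so the host call still sees the injected arguments.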

macOS inverts most of the "can do" and "can't do" bits: we can push to the stack (unlike Windows) but we can't read TLS, so we have nowhere to save state that we clobber when redirecting other than to push it to the stack. So probably the best we can do is to push the original register values to the guest stack ourselves from the exception handler thread.
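As a rough illustration of the macOS variant, the sketch below pushes the clobbered values onto a simulated guest stack instead of stashing them in store state. Everything here is an assumption made for illustration: `GuestThread`, `inject_call_via_stack`, and `resume_from_stack` are hypothetical, the stack is a `Vec<u64>` indexed by word, and ABI details such as alignment are ignored.

```rust
/// Toy model of a suspended guest thread as seen from a separate
/// exception-handler thread.
#[derive(Debug)]
pub struct GuestThread {
    pub pc: u64,
    pub arg0: u64,
    /// Word index into `stack`; the stack grows toward lower indices.
    pub sp: usize,
    pub stack: Vec<u64>,
}

/// Handler side: push the original PC and argument register onto the guest
/// stack, then redirect the thread to the trampoline.
pub fn inject_call_via_stack(
    t: &mut GuestThread,
    trampoline_pc: u64,
    arg: u64,
) -> Result<(), &'static str> {
    if t.sp < 2 || t.sp > t.stack.len() {
        return Err("not enough guest stack for the saved state");
    }
    t.sp -= 1;
    t.stack[t.sp] = t.pc;
    t.sp -= 1;
    t.stack[t.sp] = t.arg0;
    t.arg0 = arg;
    t.pc = trampoline_pc;
    Ok(())
}

/// Trampoline side: pop the saved values in reverse order to resume the guest.
pub fn resume_from_stack(t: &mut GuestThread) -> Result<(), &'static str> {
    if t.sp + 2 > t.stack.len() {
        return Err("saved state missing from guest stack");
    }
    t.arg0 = t.stack[t.sp];
    t.sp += 1;
    t.pc = t.stack[t.sp];
    t.sp += 1;
    Ok(())
}
```

No thread-local lookup is needed in this model: the handler only needs the thread's register state and its stack memory, which matches the constraint described for the Mach exception handler thread.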

Of course this means that we need a slightly different stub for macOS (for x86-64 and aarch64 both); and we'll need a slightly different stub for Windows/x86-64 too because of fastcall when we call the host code.

One more thing about the riscv64 stub: it saves all of the V-extension state, because vector registers are separate from float registers, but unlike our other three architectures, we don't unconditionally assume that vector registers are present. So technically, to run with V disabled and debugging enabled (if we care about that), we need an alternate riscv64 stub too that elides that bit. Note that we need to care because we have to save everything, not just the ABI callee-saves, because we're "interrupting" with no regalloc cooperation.
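To make the growing stub matrix concrete, here is a small sketch of how the variants mentioned so far might be selected. The enums and `select_stub` are hypothetical and exist only to tabulate the cases from this comment; the actual PR may organize its stubs quite differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Os { Linux, MacOs, Windows }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arch { X86_64, Aarch64, Riscv64, S390x }

/// Hypothetical labels for the distinct hand-written stubs discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StubKind {
    /// Saved state lives in the store; only registers are rewritten.
    RegisterOnly,
    /// macOS: saved state is pushed onto the guest stack.
    StackPush,
    /// Windows/x86-64: the host is entered with the fastcall convention.
    WindowsFastcall,
    /// riscv64 without the V extension: vector save/restore is elided.
    Riscv64NoVector,
}

pub fn select_stub(os: Os, arch: Arch, has_vector_ext: bool) -> StubKind {
    match (os, arch) {
        (Os::MacOs, Arch::X86_64 | Arch::Aarch64) => StubKind::StackPush,
        (Os::Windows, Arch::X86_64) => StubKind::WindowsFastcall,
        (_, Arch::Riscv64) if !has_vector_ext => StubKind::Riscv64NoVector,
        _ => StubKind::RegisterOnly,
    }
}
```

Each extra arm stands for another hand-written assembly stub that has to save every register, which is the maintenance cost being weighed in this thread.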

All of this to say: I am starting to think that the efficiency advantage of "trap-based implicit hostcalls", with all that entails (breakpoints that are just break instructions we can patch in), may not be worth the complexity and maintenance burden. The alternative is to go with hostcall-based-traps universally. (We still do need the wonky raw *mut dyn VMStore for the Pulley case, because Pulley does seem to unconditionally rely on interpreter traps on at least the divide instruction.)

Partly that would make me sad, but on the other hand, it would make me quite happy too: it would mean that we are one PR away from breakpoints if we go with the bitmask scheme, or two if we still patch in a call (self-modifying code but not trapping).

I'm happy to go either way, and these stubs were quite fun to write, but with my "not impossible to maintain" hat on, I think I know the better answer...

(cc @alexcrichton and @fitzgen for thoughts)

Wasmtime GitHub notifications bot (Oct 24 2025 at 01:22):

cfallin commented on PR #11930:

Quick napkin math on efficiency if we abandon call injection on signals:

The upshot of all that is that it's much more portable and easier to reason about, and the latter at least is in short supply otherwise with everything else we're adding for debugging. One could see this as "hostcalls everywhere" as in debug RFC v1, except with SMC to avoid overhead until patched in.

Wasmtime GitHub notifications bot (Oct 24 2025 at 03:48):

github-actions[bot] commented on PR #11930:

Subscribe to Label Action

cc @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "pulley", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

</details>

Wasmtime GitHub notifications bot (Nov 01 2025 at 18:21):

cfallin commented on PR #11930:

Closing, as this is pushed to "post-MVP debugging" due to all the above complexities; I will keep the branch around to mine for the good bits later as needed.

Wasmtime GitHub notifications bot (Nov 01 2025 at 18:21):

cfallin closed without merge PR #11930.


Last updated: Dec 06 2025 at 06:05 UTC