Stream: git-wasmtime

Topic: wasmtime / issue #11896 Debugging: build an async debugge...


Wasmtime GitHub notifications bot (Oct 21 2025 at 17:16):

cfallin opened issue #11896:

In general when designing our guest debugger functionality, we would like to balance a few requirements:

We have plans to place the debugger implementation mostly inside a Wasm component, which gives us a little more flexibility to have an "ugly API" underneath, but even still, the closer we get the native host API and paradigm to what the eventual Wasm API describes, the less painful and error-prone the glue will be.

All of these requirements generally push toward a "coroutine"-style async design. In our RFC and in a draft PR (#11826), we have sketched out a general approach to a debug API in which a "debugger" and a "debuggee" are two entities that bounce control back and forth. This is naturally rendered in Rust as an async API that yields a stream of "debug events", with the debuggee stopped whenever an event is received and running whenever the debugger is polling for the next event. Such an API allows for a nice debugger implementation style: it can keep its main loop in one place, and access the store directly when the debuggee is paused.
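As a rough illustration of that main-loop style, here is a minimal, self-contained sketch. It is not the API from the draft PR: `DebugEvent`, `ToyStore`, `DebugSession`, `store_mut`, and the tiny `block_on` executor are all hypothetical names invented for this example, and the "debuggee" is just a counter loop rather than real Wasm.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical event type; the real debug events may look different.
#[derive(Debug, PartialEq)]
enum DebugEvent {
    Breakpoint { pc: u32 },
    Exited,
}

/// Toy stand-in for the store: one local and a program counter.
struct ToyStore {
    local0: i64,
    pc: u32,
}

/// Hypothetical session that owns the store while the debuggee runs.
struct DebugSession {
    store: ToyStore,
    breakpoints: Vec<u32>,
    program_len: u32,
}

impl DebugSession {
    /// Run the debuggee until the next event. Each toy "instruction"
    /// increments `local0`; a breakpoint fires after the instruction at
    /// a breakpoint pc has executed.
    async fn next(&mut self) -> DebugEvent {
        while self.store.pc < self.program_len {
            let pc = self.store.pc;
            self.store.local0 += 1;
            self.store.pc += 1;
            if self.breakpoints.contains(&pc) {
                return DebugEvent::Breakpoint { pc };
            }
        }
        DebugEvent::Exited
    }

    /// Usable only between events, i.e. while the debuggee is paused,
    /// because `next()` holds the `&mut self` borrow while it runs.
    fn store_mut(&mut self) -> &mut ToyStore {
        &mut self.store
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: poll in a loop until ready (fine for this toy,
/// which never actually blocks).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// The debugger keeps its whole main loop in one place.
fn run_debugger() -> Vec<(u32, i64)> {
    block_on(async {
        let mut session = DebugSession {
            store: ToyStore { local0: 0, pc: 0 },
            breakpoints: vec![1, 3],
            program_len: 5,
        };
        let mut seen = Vec::new();
        loop {
            match session.next().await {
                DebugEvent::Breakpoint { pc } => {
                    // Debuggee is paused: inspect and mutate the store.
                    let store = session.store_mut();
                    seen.push((pc, store.local0));
                    store.local0 += 100;
                }
                DebugEvent::Exited => break,
            }
        }
        seen
    })
}
```

Calling `run_debugger()` returns the `(pc, local0)` pairs observed at each pause. In this toy the borrow checker enforces the "stopped vs. running" split trivially because nothing suspends inside `next()`; the soundness problem described next arises once a real fiber yield is involved.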

Unfortunately, through a bunch of conversations, we have determined that this is not sound as implemented in that draft PR. The PR "teleports" a borrow of the Store outward from an async yield point, where it performs a fiber yield, back to a DebugSession (wrapping the store) on which an async fn next() was invoked to get the next debug event. The idea was that the next() invocation exclusively owns the store while we pass control back to the guest; when it returns, we can return ownership of the store back to the debugger; this is more-or-less like passing a mutable reborrow of the store to a hostcall, except that we plumb it back out to the surface. We could even get the provenance right by passing (via a raw pointer) the reborrow outward. However...

Unfortunately, Future combinators and dropped futures are a thing, and there is a bad case with a "host code sandwich". Consider: debugger context calls Wasm, calls async host code, calls Wasm. In the second Wasm activation, we hit a debug event. We could yield all the way back up to the debugger and pass a reborrowed Store; but that yield control flow passes through the async host code, by way of a Poll::Pending. That async host code may implement some arbitrary future combinator that chooses to (for example) drop the future, in which case we have a dangling reference to the store and the rest of the debug state we were supposed to examine (e.g. stack frames). One could try to patch this up by holding fibers via reference counting and keeping the fibers alive when paused for debugging; but at that point, we have discovered that...

... we are reinventing a bunch of mechanisms in the component-model async implementation. In particular, (i) the Accessor mechanism allows for ownership passing of the Store (timeslicing such that access only exists during one poll, with no borrows persisted across suspends) in a way that is already vetted; and (ii) the task model gives us a first-class way to note that a stack is paused for debugging, and keep it alive. (I'm less sure about the details of (ii), but in principle, the concurrent scheduler is a little tiny OS kernel and we can build the moral equivalent of ptrace pauses there, I think.)
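To make the "timeslicing" idea concrete, here is a small sketch under stated assumptions. `ToyStore`, `ToyAccessor`, and `guest_task` are hypothetical and only imitate the shape of the idea (closure-scoped access, nothing borrowed across a suspend); this is not Wasmtime's actual `Accessor` type or signature. The second half of `demo` plays the role of the "host code sandwich": a suspended task is dropped by whoever was polling it, and nothing dangles because no store borrow outlived a poll.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Toy stand-in for the store.
struct ToyStore {
    counter: u32,
}

/// Hypothetical accessor: store access exists only inside `with`.
#[derive(Clone)]
struct ToyAccessor {
    store: Rc<RefCell<ToyStore>>,
}

impl ToyAccessor {
    /// The closure is synchronous and `R` cannot name the borrow's
    /// lifetime, so no reference to the store survives past this call,
    /// and therefore none survives across an `.await`.
    fn with<R>(&self, f: impl FnOnce(&mut ToyStore) -> R) -> R {
        let mut guard = self.store.borrow_mut();
        f(&mut *guard)
    }
}

/// Future that returns `Pending` once, then `Ready`: a stand-in for
/// "paused at a debug event".
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Toy guest: touch the store, pause, touch the store again.
async fn guest_task(accessor: ToyAccessor) {
    accessor.with(|s| s.counter += 1);
    YieldOnce(false).await;
    accessor.with(|s| s.counter += 10);
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn demo() -> (u32, u32, u32) {
    let accessor = ToyAccessor {
        store: Rc::new(RefCell::new(ToyStore { counter: 0 })),
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Run the guest until it pauses.
    let mut task = Box::pin(guest_task(accessor.clone()));
    assert!(task.as_mut().poll(&mut cx).is_pending());
    // Guest is suspended: the "debugger" takes its own slice of access.
    let at_pause = accessor.with(|s| {
        s.counter *= 2;
        s.counter
    });
    // Resume the guest to completion.
    assert!(task.as_mut().poll(&mut cx).is_ready());
    let at_end = accessor.with(|s| s.counter);

    // A second guest pauses and is then dropped, as an arbitrary
    // combinator might do. The store stays valid and accessible.
    let mut dropped = Box::pin(guest_task(accessor.clone()));
    assert!(dropped.as_mut().poll(&mut cx).is_pending());
    drop(dropped);
    let after_drop = accessor.with(|s| s.counter);

    (at_pause, at_end, after_drop)
}
```

The toy uses `Rc<RefCell<_>>` purely to keep the example short; how the real mechanism hands the store to each poll is outside what this sketch claims to show.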

Given all that, the eventual plan is something like:

Wasmtime GitHub notifications bot (Oct 21 2025 at 17:16):

cfallin commented on issue #11896:

cc @alexcrichton @fitzgen -- if I missed anything from our discussion, feel free to add!

Wasmtime GitHub notifications bot (Oct 21 2025 at 17:59):

fitzgen commented on issue #11896:

Thanks for writing this up!

One small clarification:

It should be possible to provide access to the Store to the debugger, including mutability. This is needed for eventual "mutable debugger commands" (e.g., updating locals' values) but also even for any access to GC objects (because of the root-set).

Note that even reading Wasm state, without mutating it, will mutate store internals (caches and arenas and such) and therefore requires an AsStoreContextMut, e.g.:

So giving the debugger APIs mutable access to the Store isn't something that can be delayed until we get around to adding support for debugging GC objects or for mutation from the debugger; it must be part of the initial support for reading Wasm state.
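A toy sketch of why a pure "read" can still need mutable access, assuming a store that keeps an explicit root set. `ToyStore`, `Rooted`, and `read_gc_ref` are hypothetical names for illustration only and do not correspond to Wasmtime's types or methods.

```rust
/// Toy stand-in for a store with a GC heap and a root set.
struct ToyStore {
    heap: Vec<String>,
    /// Heap indices the host is currently keeping alive.
    roots: Vec<usize>,
}

/// Handle to a rooted object: an index into `ToyStore::roots`.
struct Rooted(usize);

impl ToyStore {
    /// Reading a GC reference out of a frame slot registers a root so the
    /// object stays alive while the host holds it. Guest-visible state is
    /// unchanged, but store internals are mutated, hence `&mut self`.
    fn read_gc_ref(&mut self, heap_index: usize) -> Rooted {
        self.roots.push(heap_index);
        Rooted(self.roots.len() - 1)
    }

    /// Once rooted, dereferencing needs only shared access in this toy.
    fn get(&self, rooted: &Rooted) -> &str {
        &self.heap[self.roots[rooted.0]]
    }
}

fn inspect() -> (String, usize) {
    let mut store = ToyStore {
        heap: vec!["hello".to_string(), "world".to_string()],
        roots: Vec::new(),
    };
    // A read-only debugger command still goes through `&mut store`.
    let rooted = store.read_gc_ref(1);
    let value = store.get(&rooted).to_string();
    (value, store.roots.len())
}
```

Here `inspect()` never changes what the guest would observe, yet the root set grows by one entry, which is the kind of internal mutation the comment above describes.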

Wasmtime GitHub notifications bot (Oct 21 2025 at 18:29):

cfallin commented on issue #11896:

Yep, and to your point that it's already needed: the Store::debug_frames API that already landed as part of the debug instrumentation requires a mutable context, in order to read out (and root) GC refs from the stack. So we'd even have to regress on what we already have in order to build an immutable variant.

Wasmtime GitHub notifications bot (Oct 21 2025 at 19:53):

alexcrichton commented on issue #11896:

cc @dicej on this as well

I suspect that in the near future my job is going to be to reconcile the two async models we have in Wasmtime (e.g. call_{async,concurrent}) and to get the component-model-async bits to more-or-less work in core wasm as well. I have talked with all y'all about this but wanted to write it down here too.


Last updated: Dec 06 2025 at 07:03 UTC