wasmtime / PR #11769 Wasmtime: implement debug instrument... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #11769 Wasmtime: implement debug instrument...

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:37):

cfallin opened PR #11769 from cfallin:wasmtime-debug-instrumentation to bytecodealliance:main:

This PR implements ideas from the [recent RFC] to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR still has a few unsatisfying bits that I intend to address:

The "stack view" performs some O(n) work when the view is initially
taken, computing some internal data per frame. This is forced by the
current design of Backtrace, which takes a closure and walks that
closure over stack frames eagerly (rather than work as an
iterator). It's got some impressive iterator-chain stuff going on
internally, so refactoring it to the latter approach might not
be too bad, but I haven't tackled it yet.

A O(1) stack view, that is, one that does work only for frames as
the host API is used to walk up the stack, is desirable because some
use-cases may want to quickly examine e.g. only the deepest
frame (say, running with a breakpoint condition that needs to read a
particular local's value after each step).

It includes a new Config::compiler_force_inlining() option that is
used only for testing that we get the correct frames after
inlining. I couldn't get the existing flags to work on a Wasmtime
config level and suspect there may be an existing bug there; I will
try to split out a fix for it.

This PR renames the existing debug option to native_debug, to
distinguish it from the new approach.

(Stacked on #11768.)

[recent RFC]: https://github.com/bytecodealliance/rfcs/pull/44

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:37):

cfallin requested wasmtime-default-reviewers for a review on PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:37):

cfallin requested fitzgen for a review on PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:37):

cfallin requested wasmtime-core-reviewers for a review on PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:37):

cfallin requested wasmtime-compiler-reviewers for a review on PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:38):

cfallin commented on PR #11769:

(Marking this as a draft until I resolve the two issues above; posting to show how the Cranelift side is used in practice, in case that's useful.)

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:39):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:41):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 03:49):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:03):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:05):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:11):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:11):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:17):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:41):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 04:42):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 05:07):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 01 2025 at 06:01):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 02 2025 at 22:39):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 02 2025 at 22:48):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 02 2025 at 23:17):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 03 2025 at 04:37):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 03 2025 at 05:03):

cfallin edited PR #11769:

This PR implements ideas from the [recent RFC] to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR renames the existing debug option to native_debug, to
distinguish it from the new approach.

(Stacked on #11768 and #11783, and depends on #11784.)

[recent RFC]: https://github.com/bytecodealliance/rfcs/pull/44

Wasmtime GitHub notifications bot (Oct 03 2025 at 05:03):

cfallin has marked PR #11769 as ready for review.

Wasmtime GitHub notifications bot (Oct 03 2025 at 05:06):

cfallin commented on PR #11769:

Alright, I've rebased this on the latest #11768, and updated it to take an iterator-based approach instead. The borrowing is a little tricky but does work out: we implement a custom iterator up the stack frames (StackView), which owns the store while alive, and a mutable borrow of that iterator must be passed into methods on the objects it returns (FrameView) to actually read values. All of that gives a safe, lazy (O(1)) way of walking the stack.

In #11783 (underlying iterator that this refactor used) I chose not to replicate all of the complex logic to walk all activations and continuation stack-chains; for now, this API walks only the last activation that entered the host. That seems sufficient for a basic debugger application to me, but I can update it as desired.

This is now stacked on #11768 and #11783, and depends on the just-filed #11784 to be resolved somehow.

Wasmtime GitHub notifications bot (Oct 03 2025 at 15:58):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 03 2025 at 16:01):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 03 2025 at 20:12):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 03 2025 at 20:13):

cfallin edited PR #11769:

This PR implements ideas from the [recent RFC] to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR renames the existing debug option to native_debug, to
distinguish it from the new approach.

(Stacked on #11768, and depends on #11784.)

[recent RFC]: https://github.com/bytecodealliance/rfcs/pull/44

Wasmtime GitHub notifications bot (Oct 03 2025 at 20:17):

cfallin commented on PR #11769:

Rebased on #11783 now that it's landed.

Also a few more refined thoughts on

I chose not to replicate all of the complex logic to walk all activations and continuation stack-chains; for now, this API walks only the last activation that entered the host. That seems sufficient for a basic debugger application to me, but I can update it as desired.

I am realizing in building the next step that we probably do want a "debug session" to apply only to one Wasm activation -- if for no other reason than that the API is much more semantically clear when exits back to the host are atomic events rather than somehow "translucent" (host code invisible but nested Wasm invocations visible). And there are weird corner cases where, for example, the recursively invoked host code itself tries to set up a (nested) debug context that we'd want to rule out anyway. So: the limit to introspect only the most recent activation is a feature, not a bug!

Wasmtime GitHub notifications bot (Oct 06 2025 at 20:04):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 06 2025 at 20:05):

cfallin commented on PR #11769:

And rebased out #11768 as well; now this only depends on #11784 (either accepting the compiler_force_inlining option added here, or doing something else).

Wasmtime GitHub notifications bot (Oct 06 2025 at 20:05):

cfallin edited PR #11769:

This PR implements ideas from the [recent RFC] to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR renames the existing debug option to native_debug, to
distinguish it from the new approach.

(Depends on #11784.)

[recent RFC]: https://github.com/bytecodealliance/rfcs/pull/44

Wasmtime GitHub notifications bot (Oct 06 2025 at 21:40):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 06 2025 at 21:40):

cfallin edited PR #11769:

This PR implements ideas from the [recent RFC] to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR renames the existing debug option to native_debug, to
distinguish it from the new approach.

[recent RFC]: https://github.com/bytecodealliance/rfcs/pull/44

Wasmtime GitHub notifications bot (Oct 06 2025 at 21:40):

cfallin commented on PR #11769:

Rebased out temporary config fix to #11784 as well now that #11797 merged.

Wasmtime GitHub notifications bot (Oct 06 2025 at 23:45):

github-actions[bot] commented on PR #11769:

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

[ ] If you added a new Config method, you wrote extensive documentation for
it.

<details>

Our documentation should be of the following form:

```text
Short, simple summary sentence.

More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.

Is this method fallible? If so, when can it return an error?

Can this method panic? If so, when does it panic?

Example

Optional example here.
```

</details>

[ ] If you added a new Config method, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.

<details>

For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.

Often, all that is required of you is to ensure that there is a knob for this
configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one
of its nested structs).

Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

</details>

[ ] If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the
<code>.github/label-messager.json</code> configuration file.

Learn more.

</details>

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton submitted PR review:

These are some initial thoughts I had from reading this last night and just now. I wanted to take a closer look at debug.rs but I stopped once I realized that I think the current API proposed here is unsound. Assuming you agree with my comment below I'll take another look once it's updated.

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Could this go in before_translate_operator rather than being an additional "before" hook?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Would it make sense to store this in stack directly? That'd avoid the need to always try to keep the two in sync

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Is this safety comment still relevant? Naively I'd expect this to be a safe function since getting a backtrace of a store should always be a safe operation

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Passing thought about this -- how do stack maps come into play here with GC? E.g. if we implement a moving GC, wouldn't all of these values need to be roots for the GC?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

This is technically a breaking change for the CLI and our existing documentation around this. I don't have a great idea of what to do instead, but I'm wary of breaking existing users of our debug info without prior warning

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

s/EXTERNREF/FUNCREF/

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Instead of duplicating this check from below, could this be refactored either into a loop outside of this conditional or some sort of helper method?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

If debug is the default here, should it also be in the default list for the wasmtime crate?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

This scoping seems like something we'll want to change perhaps? For example maybe the entire iterator holds AutoAssertNoGc? It seems like if a GC happened in the middle of a stack walk that things would probably go bad...

There's also the question of the scope that these rooted values are pushed onto. It seems like that should be a pseudo-requirement of a caller? Or maybe just documented? I'm not sure if there's too much we can do to help this from a caller's perspective...

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

Would it make sense to pass these around as a unit-of-sorts if they all live in some sort of FuncEnvironment structure or similar? They're otherwise being passesd around pretty frequently and it's a lot of boilerplate to skip over

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

I think this is going to have to change one way or another to make this sound. Some things I can think of to break soundness with this are:

I think a FrameView could get paired with the wrong Store.

A FrameView could be persisted outside of a stack trace and then later read once the frames are all gone (or with the wrong store)

Personally I think it would be worth trying to consolidate the unsafe here in this module as much as possible. Ideally we'd just have one tiny "thing", whatever that is, which does the unsafe bits of actually reading from the stack and, ideally, we'd be able to read the definition of this "thing" and trivially conclude "yep that's fine".

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

GIven that Val is self-describing in its type, could the ValType return be omitted?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

GIven that wasm modules are generally all u32-heavy on the encoding side, could this be a u32 and local would also take a u32 below?

Wasmtime GitHub notifications bot (Oct 07 2025 at 22:36):

alexcrichton created PR review comment:

To make this more amenable to strict provenance (hypothetically one day being supported in Wasmtime...) instead of passing addresses around as usize could they be passed around as *const u8 or similar?

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:00):

cfallin commented on PR #11769:

Regarding unsoundness of FrameView being unattached to a lifetime:

Hmm. I agree that as written this API permits those patterns; I had been mostly concerned with keeping the stack "frozen" with the StackView iterator, then realized that the Store needs to be kept separate from whatever the iterator yields (because you cannot yield a &mut _ more than once from an iterator if they overlap, and the alternative would otherwise be to have the store mut-borrow be reborrowed by FrameView, but that's not viable here).

I suspect that the easiest sound option may be to give up on providing an actual iterator, that literally yields externalizable values (FrameView) for each frame, but rather integrate the reading-of-the-frame methods into the StackView itself. That cleanly disallows any of the "save it for later" or "use it with the wrong store" cases because the StackView holds the store borrow.

It does mean a user would look something like
while !stack_view.done() {
  let instance = stack_view.instance();
  let func = stack_view.defined_func_index();
  let value = stack_view.local(0);
  // ...
  stack_view.move_up();
}
What do you think?

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:03):

alexcrichton commented on PR #11769:

Yeah that matches my intuition as well where the disconnect of the iterator value from the store felt a bit fishy. "Lending iterators" is definitely something that's not natural/easy to do in Rust, but I agree that having a custom API to handle this is reasonable. Best-in-class ergonomics seems ok to not have as a goal for a debugging-exclusive API.

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:04):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:04):

cfallin created PR review comment:

It's possible; it would mean moving things around so that stack.push*() helpers, and all other manipulations of stack.stack, directly create a stack-shape node as well, which involves deeper refactoring. I was trying to keep changes to the code translator to a minimum in the initial PR. I'll play with this though!

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:08):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 07 2025 at 23:08):

cfallin created PR review comment:

Sure, I was trying to invent a policy to keep the old debugging approach separate by renaming it to "native debugging", but I'm happy to swap things around so existing native debug remains debug-info and the new thing becomes guest-debug or something like that. I guess I was leaning on our "semver lets us change things each major release" policy but I agree CLI stability here may be desirable.

What do you think about guest-debug as a general moniker for the new approach?

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:03):

cfallin created PR review comment:

I believe so: the difference between this and Backtrace::trace for example is that the latter doesn't give you a mutable store while it's doing the tracing. In theory, a maximally evil client could take the owned store, translate that to full ownership of the raw stack, and mutate the frames that we're walking through.

That said, we don't provide any other APIs on a store today (to my knowledge!) that could mutate the stack arbitrarily, e.g. on-stack replacement of some form; so perhaps we can kick this can down the road and re-assign responsibility for safety to whichever mechanism eventually does that. Happy to do what you feel is best!

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:10):

cfallin created PR review comment:

Yes, I think so; I'll add a note about this.

I can see two options:

We could extend the "tail is new and needs to be synced" state to arbitrary dirtyness, and mark all ref-typed values as dirty after a safepoint; this would then force the stack sync to re-write all (potentially updated) GC refs to the instrumentation slots.

We could extend the instrumentation frame-shape metadata to be able to refer directly to other stackslots; then we could let the spilling that already happens for safepoints also give us full debug visibility.

I favor the second overall but probably want to defer this for now, to get the basic functionality working...

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:16):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:16):

cfallin created PR review comment:

Yes, assert-no-gc seems reasonable; I initially was thinking I didn't want to impose that limitation because a mutating debugger interface may want to allow evaluations that could allocate other things, but that's far-future stuff; happy to defer it.

(And actually, just to note it, this intersects with your above point about GC rooting too; in the presence of a moving GC we'd have to be careful to have debug metadata refer directly to the existing user-stack-map for that to work, rather than re-spill values to instrumentation after a safepoint, because while still walking the stack that re-spill code wouldn't run.)

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:17):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:17):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:17):

cfallin created PR review comment:

(Updated in 2df6a9140c.)

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:17):

cfallin created PR review comment:

Fixed in 2df6a9140c per the sketch below.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:17):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:18):

cfallin created PR review comment:

Good point; I think this is left over from something earlier (was I trying to return a precise type including runtime GC types? I'm not sure) but it's simpler not to. Removed in 2df6a9140c.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:18):

cfallin created PR review comment:

Fixed via removal in 2df6a9140c.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:19):

cfallin created PR review comment:

Note added in 2df6a9140c; to be addressed in a more thorough way later.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:19):

cfallin created PR review comment:

(Kept this but feel free to ask me to remove or reword if you feel differently.)

Wasmtime GitHub notifications bot (Oct 08 2025 at 00:20):

cfallin commented on PR #11769:

Pushed a fix to the API to encapsulate the frame with the iterator; will address the other bits later tonight or tomorrow. Thanks!

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:46):

cfallin created PR review comment:

OK, I've pushed on this a bit locally, but I think it makes things much worse:

a lot of the operand stack management assumes it can take slices of &[Value] and even mutate those in some cases, so although (Value, FrameStackShape) is the natural type for a stack entry (or rather (Value, Option<FrameStackShape>) since instrumentation is optional), that alters a ton of code;

push/pop methods on stacks require the builder now, which creates a huge diff that alters every operator translation (it's pretty mechanical, but still);

the distinction between stack and stack_shape actually corresponds an important algorithmic detail: we let them go out of sync (specifically, stack_shape can always be shorter than stack if not all values are flushed) then do the sync at the end of a given operator translation. This is important because we get the specific type from the validator's view of the Wasm state and it only makes sense to query that after the operator translation has updated the stack completely to match the current program point, rather than at each individual push/pop.

The last one is the most fundamental, the other two are "only" big mechanical refactors, but for all these reasons I think the current approach is the most workable (and the doc-comment here does a not terrible job explaining the invariant imho).

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:50):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:51):

cfallin created PR review comment:

Sure, merged this (and the after-hook too) into existing hooks in 32b15c30f8.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:54):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:55):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:55):

cfallin created PR review comment:

Sure, addressed in 645dfe1470.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:56):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:56):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 03:56):

cfallin created PR review comment:

Sure, added in 618bd094ef.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin created PR review comment:

So currently stack and func_env are kept separate throughout the whole translator for what I suspect are deep borrowing-related reasons; it just so happens that we need stack in a few more places. srcloc on the other hand is in theory known by the builder anywhere we have that, so I've added an accessor and removed it in a few places in e19e867b0d.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin created PR review comment:

Sure, addressed in 21c3b4fe4b (and used with_exposed_provenance to go from the FP that comes in from other runtime machinery as a usize to the *const u8 we use here).

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:20):

cfallin created PR review comment:

Sure, addressed in 011f656292.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:34):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:35):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:35):

cfallin created PR review comment:

Alright, I've renamed some things in f660ba9296 so the current (native DWARF) approach remains under the same option names on the CLI and Config and the new thing is "guest debugging".

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:35):

cfallin commented on PR #11769:

I believe I've addressed all review comments so far; thanks!

Wasmtime GitHub notifications bot (Oct 08 2025 at 04:36):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 09 2025 at 07:37):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 09 2025 at 15:23):

alexcrichton created PR review comment:

Nah yeah I understand the rationale for the rename and I agree that if we were to start from scratch I'd prefer to rename this, but also yeah the CLI generally needs more stability than the major version bumps technically allow so I think it'll cause unnecessary churn to rename the CLI option.

That being said I still like the idea of classifying debugging as "native debugging" or "guest debugging" internally in Wasmtime. For example the new tunables option debug_guest and using guest-debug here in the CLI sounds reasonable to me. I think it'd be reasonable to "deprecate" this -Ddebug-info option in favor of -Dnative-debug where "deprecate" is in quotes because we don't actually warn on deprecations, but we could technicaly accept both.

Wasmtime GitHub notifications bot (Oct 09 2025 at 15:23):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Oct 09 2025 at 19:38):

fitzgen created PR review comment:

Similarly to how you already merged the debug instrumentation before/after each op into the existing before/after each op hook, can we put the debug_instrumentation_at_{start,end} stuff inside the existing {before,after}_translate_function hooks?

Wasmtime GitHub notifications bot (Oct 09 2025 at 19:38):

fitzgen submitted PR review:

Didn't get to finish looking over everything yet, but what I've seen looks great, one minor nitpick below.

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton submitted PR review:

:+1: from me, but @fitzgen should probably also sign off too

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Could the tunables option here retain the "native" part of the name? To mirror the debug_guest option I think it'd be reasonable to rename this as debug_native too

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

s/teh/the/

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

If most of this module operated on &StoreOpaque rather than &mut StoreOpaque that might be a way to avoid this Vec too perhaps

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Could this, and maybe stackwalking in general, actually take &StoreOpaque, notably not mutable? That way I think stack iteration could be 100% safe since mutations are impossible while the store is held, and it'd also prevent a GC since that requires a mutable store

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Could the documentation here link to the types in the public API which are enabled when this option is turned on?

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Mind adding a "SAFETY" comment about why this is safe?

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Ah ok yeah that all makes sense, thanks for pushing on this regardless!

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Could this point to a spot in code translation and/or Cranelift where offset 0 is the vmctx? (and ideally point from there to here too)

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Ah ok that makes sense to me yeah, but can the safety comment here be worded specifically around how the store is a public field? The current docs feel a bit inaccurate because the store isn't yielded on each iteration nor is it possible to store this iterator, return to wasm, and then resume the iterator later.

(also unresolving for visibility)

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Mind also filing an issue for this? And maybe making a tag for guest-debug related things? Mostly want to make sure we don't lose track of this as this seems pretty important and if it's just a comment we'll inevitably forget it a few months from now.

(unresolving this conversation for visibility)

Wasmtime GitHub notifications bot (Oct 09 2025 at 20:34):

alexcrichton created PR review comment:

Shouldn't this be a bug or debug assert if a program point isn't found?

Wasmtime GitHub notifications bot (Oct 10 2025 at 17:58):

cfallin created PR review comment:

The tricky thing here is that reading out a GC reference actually does mutate the store: it roots the reference and the root-set is in the store. This is not super-satisfying to me either but it's forced by the design today I think...

Wasmtime GitHub notifications bot (Oct 10 2025 at 17:58):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 10 2025 at 21:40):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Oct 10 2025 at 21:40):

alexcrichton created PR review comment:

Ah good point, and yeah that forces our hand here.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:44):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:46):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:46):

cfallin created PR review comment:

Done in 0b20becd0c!

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:48):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:48):

cfallin created PR review comment:

Done in 690d88c2fb!

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:53):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 00:53):

cfallin created PR review comment:

Done (to debug_native) in 59df486fca.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin created PR review comment:

Done in 6b48c9fd61.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin created PR review comment:

Done in 6b48c9fd61.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin created PR review comment:

Done; added a few references in 6b48c9fd61.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin created PR review comment:

Yes, I think this was left over from some earlier thinking before I realized that debug instrumentation should be uniform across the whole activation because we're in one store with one set of settings... removed in

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:03):

cfallin edited PR review comment.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:04):

cfallin created PR review comment:

(see other comment about mutability -- unfortunately I don't think we can change this at the moment)

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:04):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:05):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:05):

cfallin created PR review comment:

OK cool -- I'll add such deprecation to my TODO list for the public-facing interface once I get there on my marathon (I don't think we should do it before we have the alternative thing available) :-)

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:07):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:07):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:07):

cfallin created PR review comment:

Oops, sorry, I missed that outdated wording -- fixed in ef26d17428.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:16):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:16):

cfallin created PR review comment:

Filed #11841, but as I wrote that, I realized that even for our current non-moving GC this is a problem, because DCE could remove the part of the program that would otherwise root a value (e.g., for a Wasm program that takes an anyref as a parameter, pushes it on the operand stack, then drops it at the end; that's live for the debugger but won't be in any user stack maps).

I'll see about adding debug metadata traversal to tracing directly, so every ref-typed value at the current program point gets traced...

Wasmtime GitHub notifications bot (Oct 13 2025 at 01:27):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 06:06):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 06:09):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 06:09):

cfallin created PR review comment:

OK, I've actually implemented the proper rooting in 66e650522e. I have a basic test with a GC ref semantically-live-but-optimizing-compiler-dead across a hostcall, with that hostcall observing the value in the slot. A really proper test would actually invoke a GC on the store during that hostcall, which I can't do in this PR yet (the followup gives a store context that would let me do that) but hopefully this is sufficient for now.

Wasmtime GitHub notifications bot (Oct 13 2025 at 06:09):

cfallin commented on PR #11769:

Alright, I think this is ready for a final look -- thanks for the feedback so far!

Wasmtime GitHub notifications bot (Oct 13 2025 at 17:43):

alexcrichton created PR review comment:

Musing about this, and I realize this isn't supported right now anyway, but will this be enough to support modifying the stack or locals during a debugging session. I was naively thinking "no" because locals can have a more refined type as well as values on the stack. Effectively the wasm validator understands a local/stack entry to have a particular type, and the runtime value may be that type or a subtype. Mutating the stack should allow anything that's a subtype of the known type, but by only recording the top type here we lose the knowledge of precisely what type it needs to be.

I suppose bringing it back to this PR: I would be surprised if FrameValType continues to support future use cases as-is. In the end what I might naively expect is that FrameValType becomes literally WasmValType. Do you think it might be worthwhile to go ahead and plumb that to avoid a future conversion? (and of course deferring this to an issue is also fine)

Wasmtime GitHub notifications bot (Oct 13 2025 at 17:43):

alexcrichton submitted PR review:

One final thought, but doesn't change the fact that I'm still :+1: on this landing as-is

Wasmtime GitHub notifications bot (Oct 13 2025 at 17:47):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 13 2025 at 17:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 13 2025 at 17:53):

cfallin created PR review comment:

Interesting; yeah, I think you're right.

The driving factor here so far (as I'm sure you intuited) is that I didn't want to pull runtime types down into environ; but I guess the environ-level metadata can use module-level type indices and then we translate that at runtime via the engine type registry, like everywhere else.

I've created #11847 to track. I think that's a bit much to put on top of this PR, IMHO, but happy to do it once needed for mutation. It'd also give a nice ability to query the precise type (in runtime terms) on any local/stack slot.

Wasmtime GitHub notifications bot (Oct 13 2025 at 22:32):

fitzgen submitted PR review:

r=me with the bit about stack tracing using stack maps instead of also debug info (https://github.com/bytecodealliance/wasmtime/issues/11841#issuecomment-3399148128) and then also a couple nitpicks/bikesheds that I am happy to be convinced otherwise about

Thanks Chris!

Wasmtime GitHub notifications bot (Oct 13 2025 at 22:32):

fitzgen created PR review comment:

Bikeshedding: So this doesn't view the whole stack anymore, just one frame in the linked list that makes up the stack, so what do you think of FrameView instead? And it is for the purposes of debugging, so maybe DebugFrameView? Or maybe just DebugFrame?

Wasmtime GitHub notifications bot (Oct 13 2025 at 22:32):

fitzgen created PR review comment:

Placeholder for https://github.com/bytecodealliance/wasmtime/issues/11841#issuecomment-3399148128

Wasmtime GitHub notifications bot (Oct 13 2025 at 22:32):

fitzgen created PR review comment:

This name is assuming that the stack grows down, which isn't really defined at the Wasm level.

How about
pub fn parent(&'a mut self) -> Option<DebugFrame<'a>> {
    ...
}
?

Or if we want to keep it cursor-style, rather than returning a new frame, maybe we should call it move_to_parent and also call the type DebugStackCursor. (I think View implies a more slice-like, O(1)-access thing than Cursor does).

What I like about returning a new frame, over the cursor style, is that it is very clear when you've reached to initial stack frame and there is nothing else to traverse. I personally just find it more natural as well. (The lifetimes should make it so that you can't have two DebugFrames alive at the same time for the same stack anyways).

Wasmtime GitHub notifications bot (Oct 13 2025 at 22:32):

fitzgen created PR review comment:

    /// The flattened list of (symbol name, FuncKey, compiled
    /// function) triples, as they will be laid out in the object file.

Wasmtime GitHub notifications bot (Oct 14 2025 at 04:49):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 04:55):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 05:02):

cfallin commented on PR #11769:

@fitzgen

the bit about stack tracing using stack maps instead of also debug info (https://github.com/bytecodealliance/wasmtime/issues/11841#issuecomment-3399148128)

I built this out in e462fda5eb but I'm not super happy about it for three reasons:

user stack maps were emitted only on calls (as they were designed for); we now need to add them to sequence point instructions, and in the future will need to add them to every trapping instruction, because on trap, we yield to the debugger, and the instrumentation-slot values need to remain rooted until we eventually tear down (and with mutable access to the store that could include a GC). So that means that e.g. every load/store, and every ALU instruction emit case on x86-64 (because mem folding).

That would be fine if we handle it in one place (take stack map before emitting every inst), but now we have a pretty big blowup of metadata space. The stack map will contain an entry for every ref-typed local, and duplicate these for every Wasm memory instruction -- quadratic space. The debug frame state in contrast stores the types and offsets for all locals once.

There's the awkward bit about zero-length sequence points and duplicated stack maps; again not fatal, we can handle it, but shows how this is pushing what stackmaps were designed for a bit.

The asymptotic efficiency bit is the fundamental one I think.

I can make an argument in principle for GC tracing using debug metadata as well: debug state has different requirements (much more frequent state points, and many of them are identical i.e. locals), and is logically a separate kind of rooting area (albeit one per frame), just as globals and user roots are separate areas.

What do you think? Happy to revert the last commit if you agree.

Wasmtime GitHub notifications bot (Oct 14 2025 at 05:03):

cfallin edited a comment on PR #11769:

@fitzgen

the bit about stack tracing using stack maps instead of also debug info (https://github.com/bytecodealliance/wasmtime/issues/11841#issuecomment-3399148128)

I built this out in e462fda5eb but I'm not super happy about it for three reasons:

user stack maps were emitted only on calls (as they were designed for); we now need to add them to sequence point instructions, and in the future will need to add them to every trapping instruction, because on trap, we yield to the debugger, and the instrumentation-slot values need to remain rooted until we eventually tear down (and with mutable access to the store that could include a GC). So that means e.g. every load/store, and every ALU instruction emit case on x86-64 (because mem folding).

That would be fine if we handle it in one place (take stack map before emitting every inst), but now we have a pretty big blowup of metadata space. The stack map will contain an entry for every ref-typed local, and duplicate these for every Wasm memory instruction -- quadratic space. The debug frame state in contrast stores the types and offsets for all locals once.

There's the awkward bit about zero-length sequence points and duplicated stack maps; again not fatal, we can handle it, but shows how this is pushing what stackmaps were designed for a bit.

The asymptotic efficiency bit is the fundamental one I think.

I can make an argument in principle for GC tracing using debug metadata as well: debug state has different requirements (much more frequent state points, and many of them are identical i.e. locals), and is logically a separate kind of rooting area (albeit one per frame), just as globals and user roots are separate areas.

What do you think? Happy to revert the last commit if you agree.

Wasmtime GitHub notifications bot (Oct 14 2025 at 05:04):

cfallin edited a comment on PR #11769:

@fitzgen

the bit about stack tracing using stack maps instead of also debug info (https://github.com/bytecodealliance/wasmtime/issues/11841#issuecomment-3399148128)

I built this out in e462fda5eb but I'm not super happy about it for three reasons:

user stack maps were emitted only on calls (as they were designed for); we now need to add them to sequence point instructions, and in the future will need to add them to every trapping instruction, because on trap, we yield to the debugger, and the instrumentation-slot values need to remain rooted until we eventually tear down (and with mutable access to the store that could include a GC). So that means e.g. every load/store, and every ALU instruction emit case on x86-64 (because mem folding).

That would be fine if we handle it in one place (take stack map before emitting every inst), but now we have a pretty big blowup of metadata space. The stack map will contain an entry for every ref-typed local, and duplicate these for every Wasm instruction -- quadratic space. The debug frame state in contrast stores the types and offsets for all locals once.

There's the awkward bit about zero-length sequence points and duplicated stack maps; again not fatal, we can handle it, but shows how this is pushing what stackmaps were designed for a bit.

The asymptotic efficiency bit is the fundamental one I think.

I can make an argument in principle for GC tracing using debug metadata as well: debug state has different requirements (much more frequent state points, and many of them are identical i.e. locals), and is logically a separate kind of rooting area (albeit one per frame), just as globals and user roots are separate areas.

What do you think? Happy to revert the last commit if you agree.

Wasmtime GitHub notifications bot (Oct 14 2025 at 18:09):

fitzgen commented on PR #11769:

Overall, the latest commit looks like what I expected, and seems good to me.

The asymptotic efficiency bit is the fundamental one I think.

It is true that the total size of stack maps is technically O(num_live_gc_refs * num_safepoints). However, I don't think this will actually be an issue in practice because

Each stack map is already a packed bitmap, minimizing the effect of num_live_gc_refs to a degree

num_live_gc_refs is not actually large in practice

num_safepoints is only large when debugging in practice

All of this only affects code size in edge cases, not run time performance

But if we are concerned about this when adding debugging instrumentation's stack map entries, then I think we should also be concerned about stack maps as they exist today. It is certainly already possible to construct a Wasm that just does a ton of calls (making num_safepoints grow) with a ton of live GC refs on the stack. So if we are concerned about this, then we should make the stack maps in general have prefix sharing similar to what you've done for debug info. (Also I'm sure it is also possible to construct Wasm inputs that have similar Wasm:native code size blow ups in general due to Cranelift being an optimizing/non-template compiler with remat, demand-based egraph elaboration in the face of conditional branches, etc.)

But I don't really want to keep holding up your debugging work, especially since you have so many PRs stacked on top of this one already. So if you still prefer the previous approach, then go for it. I won't draw any lines in the sand here.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:02):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:03):

cfallin commented on PR #11769:

OK, yeah, I pushed on this a bit more to see if I could make it nicer; I do agree that in principle stack maps should be the sole owner of "GC refs on the stack", and also that any efficiency concerns with them could be addressed directly.

That said I was trying to harmonize the stackmap emission a bit so it isn't special-cased for sequence_point and calls only, since we'll ultimately need them on many different kinds of instructions; and working through that ended up calling for reinvention of a lot of what debug tags already handle, e.g. pre- and post-points on instructions, and seemed to be going against the grain in general. So I think I'll keep it as-is but we can always try this refactor later. Thanks for the push to try this in any case!

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:04):

cfallin created PR review comment:

(Resolving for now but we can try this refactor later if we can make supporting stackmaps across all instructions nicer)

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:04):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:11):

cfallin created PR review comment:

Fixed in 4fdcd4eb1b, thanks!

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:12):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:12):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:12):

cfallin created PR review comment:

Ah, I like the cursor naming here; switched to DebugFrameCursor, an accessor debug_frames() on the store, and move_to_parent(). Thanks! Updated in 4fdcd4eb1b.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:13):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:13):

cfallin created PR review comment:

Updated to DebugFrameCursor per below in 4fdcd4eb1b.

Wasmtime GitHub notifications bot (Oct 14 2025 at 22:13):

cfallin has enabled auto merge for PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 23:50):

cfallin updated PR #11769.

Wasmtime GitHub notifications bot (Oct 14 2025 at 23:51):

cfallin has enabled auto merge for PR #11769.

Wasmtime GitHub notifications bot (Oct 15 2025 at 00:27):

cfallin merged PR #11769.

Last updated: Feb 24 2026 at 06:21 UTC