Stream: git-wasmtime

Topic: wasmtime / PR #12176 Debugging: allow frame cursor to vis...


view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:44):

cfallin requested fitzgen for a review on PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:44):

cfallin opened PR #12176 from cfallin:debugging-host-sandwich to bytecodealliance:main:

In the initial design for the DebugFrameCursor, I was concerned about the effects of host async on the stability of visiting earlier activations (see also the discussion of async combinators in #11896). The basic hypothesized problem was that when Wasm calls host-code calls Wasm, the sequence of activations on the stack is not even stable between async polls; so any debugger hook, which is an async function, should not be allowed to hold a frame cursor across a yield point since it could become invalidated if the next poll stacks up the activations differently.

In further conversations it's become clear that this is not actually a possibility, for the simple reason that the inner re-entrant activations into the same store take full ownership (mutably reborrow) that store, and that mut reborrow becomes part of the future; so the exact chain of activations will remain in the same sequence when re-polled. Said another way, it is impossible at any given level of async host-code to create more than one future that re-enters the same store and somehow poll those in different orders at different times. The worst that a host-code async combinator can do is drop the future that re-enters the store. This drops and invalidates whatever frames a cursor held over a yield might be referencing, but it also drops the async invocation of the debugger hook itself, and due to lifetimes the cursor cannot escape that hook, so everything is still sound.

This PR thus updates the DebugFrameCursor to visit all activations. I've generalized the backtrace code a bit to enable this, and built an internal StoreBacktrace that is an iterator over all activations associated with the store.

At the DebugFrameCursor (public API) level, the two basic choices were to present a sentinel for host frame(s) explicitly and make all Wasm-specific accessors return Option<T>, or skip over host frames. I opted for the latter, with move_to_parent() returning an enum value now that indicates whether it moved to a new activation.

A note regarding the async component ABI: once debugging is possible within a run_concurrent environment, it will again be the case that a single frame cursor should see only one activation, because each (re)-entry into the store becomes a new task, if my understanding is correct. At that time, we should build an API that lets the debugger see the activation for each task separately. That's a simpler model ultimately, and it will be nice when we move to it, but as long as we have the sync component ABI with async host code and the ability to stack activations as we do today, we need to provide the debugger this visibility.

(Aside: why does the debugger need to see more than one activation? In addition to presenting a weird and incoherent view of the world to the user if we don't, it is also necessary to implement the "next" (step-over) debugger action, because otherwise a call to a host function that re-enters the store may lead to a state with fewer, but completely disjoint, stack frames on the "one latest activation" from which it's not possible to reason about whether we've left the called-into function yet.)

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:44):

cfallin requested wasmtime-core-reviewers for a review on PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:44):

cfallin updated PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:47):

cfallin updated PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 07:50):

cfallin updated PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 19:09):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 19:09):

fitzgen created PR review comment:

Kind of feel like this constructor should be safe and the StoreBacktrace::store_mut method should be unsafe, since that is the thing that actually lets you get the mutable store and potentially mutate the stack.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 21:09):

cfallin updated PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 21:10):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 21:10):

cfallin created PR review comment:

Ah, that's a great point -- updated in 0e6ec66e1d.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 21:10):

cfallin has enabled auto merge for PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 21:25):

cfallin updated PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 17 2025 at 22:06):

cfallin merged PR #12176.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 18 2025 at 20:35):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 18 2025 at 20:35):

alexcrichton created PR review comment:

Could you clarify what you're worried about with the unsafe here? I can't actually imagine anything myself that would, in safe code, do something that violates a stack trace. While the stack is sort of morally mutatable/owned by the store we don't allow arbitrary mutations anywhere. Allowing arbitrary mutations itself would be quite unsafe due to return pointers and such on the stack.

Given that I would, perhaps naively, expect this to be able to be a safe function, which would in theory help clean up callers of this a bit too

view this post on Zulip Wasmtime GitHub notifications bot (Dec 18 2025 at 20:44):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 18 2025 at 20:44):

cfallin created PR review comment:

I remember discussing this safety condition in an earlier PR, too, where we decided to keep this unsafe (constructing an iterator over the stack has always had an unsafe, it isn't new in this PR). It's forward-looking to APIs that might, e.g., allow for early returns or exception-throws to be enacted from within the debugger. That would undoubtedly require using a raise-libcall-like strategy to resume to the new execution point after returning into the breakpoint trampoline, but might edit stack frames in the meantime. It seems fine to me to say that we put unsafe on such APIs when they exist instead. Happy to clean this up in upcoming work.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 18:44):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 18:44):

alexcrichton created PR review comment:

I have vague recollections of that, but if this is actually unsafe, even in a forward looking way, then this needs to be deleted:

impl<'a, T: 'static> AsContextMut for DebugFrameCursor<'a, T> {
    fn as_context_mut(&mut self) -> StoreContextMut<'_, Self::Data> {
        StoreContextMut(self.iter.store.0)
        // SAFETY: `StoreContextMut` does not provide any methods that
        // could remove frames from the stack, so the iterator remains
        // valid.
        unsafe { StoreContextMut(self.iter.store_mut().0) }
    }
}

You're saying that looking forward we'll allow editing the stack which would make this genuinely unsafe, but this impl also is only possible to exist if there's not actually any unsafety.

Something about this I think needs to be reconciled. Ideally the iterator would actually borrow the stack and would force a .collect or something like that into a vector so we wouldn't have to deal with these sorts of questions. I don't have all this in my head to know what the best route forward is, but I do think that something here needs to change as I believe this PR is otherwise self-contradictory where it's trying to future-proof against something that it's also saying can't ever happen.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 18:45):

alexcrichton edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 19:09):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 19:09):

cfallin created PR review comment:

I'm not sure I follow -- the suggestion I gave above was to make any methods that actually edit the stack unsafe rather than the cursor construction. Wouldn't that also be sufficient? In other words, you can get a StoreContextMut safely, you can create a cursor safely, but using mutable access to the store for nefarious bits that invalidate a cursor is still unsafe.

A few other constraints:

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 21:46):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 21:46):

alexcrichton created PR review comment:

With stack editing things as unsafe, this API here is safe, right? That's mostly what I am trying to clarify.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 21:55):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 21:55):

cfallin created PR review comment:

Yep, exactly. So my suggestion from above is

It seems fine to me to say that we put unsafe on such APIs when they exist instead. Happy to clean this up in upcoming work.

So (i) this function is safe, (ii) DebugFrameCursor can still safely give us a StoreContextMut, and (iii) unsafe is used in the future for anything that might invalidate a cursor by editing frames. I think we're agreeing on that final state but let me know if I'm missing something :-)

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 22:07):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2025 at 22:07):

alexcrichton created PR review comment:

Sounds good! Sorry just wasn't clear to me from your original comment, but all good nonetheless


Last updated: Jan 09 2026 at 13:15 UTC