cfallin opened issue #12486:
In our "bottom-half" API that provides access to a debuggee `Store`'s execution state via the native Wasmtime API, we have the `Store::debug_frames` method that returns a `DebugFrameCursor<'a, T>`. This cursor follows established iterator idioms in Rust: it borrows the `Store` for the duration of its existence, thus freezing its state (disallowing execution) and making stack frames safe to traverse and hold pointers into.

However, as we reflect this API into a form that can be used by a "top half" that is built in Wasm (per our current RFC consensus, with current informal discussions to tweak bits of how this works but not fundamentally abandon the notion of using a Wasm component), we need to follow the ownership semantics of WIT.
In particular, this means that if we reify both the debuggee (the `Store` being debugged) and a handle to a particular frame as resources, we need to be able to put types in the `ResourceTable` that are standalone. A resource can have a parent-child relationship with another, but cannot "borrow" (and freeze) another via a lifetime parameter. In fact this would not even be representable in a Rust host-side implementation: lifetime parameters must bottom out at some actual stack frame; they cannot refer to some value with dynamic lifetime extent.

The above is all well and good and unsurprising: WIT simplifies the available semantic range of borrowing/ownership considerably. We'll instead need to enforce the restrictions dynamically: something like, if a `frame` resource exists, then use of any API on the `debuggee` will trap.

Unfortunately, implementing even this is difficult, because of the above-mentioned limitation: a borrow needs to correspond to some frame; without a single hostcall instance that corresponds to the whole stack traversal, the only other option is to correspond to the lifetime of the entire resource table.
The net effect of this is that if we want to have a "dynamic checks" variant of the API exposed at any level, we need to fundamentally rethink the stack-iterator idea. We instead need a notion of "frame handles" that refer to frozen stack frames, just as we have instance handles, table handles, and the like today; the handles are `Copy`- or `Clone`-able things that are morally indexes into the `Store`. Unlike instances, however, we need more complex invalidation rules: we need a way to make a frame handle invalid (and dynamically fail if used) once execution resumes. We could do this by, internally, keeping a "frame handle table" and a generation number on each slot. This fits better with the semantic abstraction level of an API with an arbitrary number of handles (i.e., WIT resources).

The alternative to all of this is to force an eager capture of the entire stack, all local values, and all operand-stack values on every pause; then provide an API over these `Vec`s held in resources rather than actually lazily querying the paused machine stack.

The other major question that arises is: would this be better if we wrote the top half of the debugger in a native Rust implementation instead, using the existing Wasmtime APIs? Yes and no: yes, in the sense that we have tests today showing how to use the API effectively to walk the stack with a stack iterator. No, in the sense that we likely want to have a single top-level event loop, and always hold the debuggee `Store`; to use the iterator effectively we'd need a state machine where we take the iterator and poll the inbound requests/commands from the debugger connection, moving the iterator or exiting to the outer loop (ending the inner borrow) when leaving "stack traversal mode". It's much cleaner to be "modeless" and allow handles to frames to coexist with all other handles.

So basically I believe we need to do this refactor regardless, and I don't see a way around it unless we do a very inefficient copy on every pause.

cc @alexcrichton and @fitzgen for thoughts?
cfallin edited issue #12486:
In our "bottom-half" API that provides access to a debuggee `Store`'s execution state via the native Wasmtime API, we have the `Store::debug_frames` method that returns a `DebugFrameCursor<'a, T>`. This cursor follows established iterator idioms in Rust: it borrows the `Store` for the duration of its existence, thus freezing its state (disallowing execution) and making stack frames safe to traverse and hold pointers into.

However, as we reflect this API into a form that can be used by a "top half" that is built in Wasm (per our current RFC consensus, with current informal discussions to tweak bits of how this works but not fundamentally abandon the notion of using a Wasm component), we need to follow the ownership semantics of WIT.

In particular, this means that if we reify both the debuggee (the `Store` being debugged) and a handle to a particular frame as resources, we need to be able to put types in the `ResourceTable` that are standalone. A resource can have a parent-child relationship with another, but cannot "borrow" (and freeze) another via a lifetime parameter. In fact this would not even be representable in a Rust host-side implementation: lifetime parameters must bottom out at some actual stack frame; they cannot refer to some value with dynamic lifetime extent.

The above is all well and good and unsurprising: WIT simplifies the available semantic range of borrowing/ownership considerably. We'll instead need to enforce the restrictions dynamically: something like, if a `frame` resource exists, then use of any API on the `debuggee` will trap.

Unfortunately, implementing even this is difficult, because of the above-mentioned limitation: a borrow needs to correspond to some host frame, not a data structure; without a single hostcall instance that corresponds to the whole stack traversal, (EDIT) we cannot create a resource that is stored in a data structure that holds this borrow. (Said more simply: this implies a captured lifetime on whatever dynamic table the API is backed by; that lifetime itself also refers to the table; that's a self-reference and can't work.)
The net effect of this is that if we want to have a "dynamic checks" variant of the API exposed at any level, we need to fundamentally rethink the stack-iterator idea. We instead need a notion of "frame handles" that refer to frozen stack frames, just as we have instance handles, table handles, and the like today; the handles are `Copy`- or `Clone`-able things that are morally indexes into the `Store`. Unlike instances, however, we need more complex invalidation rules: we need a way to make a frame handle invalid (and dynamically fail if used) once execution resumes. We could do this by, internally, keeping a "frame handle table" and a generation number on each slot. This fits better with the semantic abstraction level of an API with an arbitrary number of handles (i.e., WIT resources).

The alternative to all of this is to force an eager capture of the entire stack, all local values, and all operand-stack values on every pause; then provide an API over these `Vec`s held in resources rather than actually lazily querying the paused machine stack.

The other major question that arises is: would this be better if we wrote the top half of the debugger in a native Rust implementation instead, using the existing Wasmtime APIs? Yes and no: yes, in the sense that we have tests today showing how to use the API effectively to walk the stack with a stack iterator. No, in the sense that we likely want to have a single top-level event loop, and always hold the debuggee `Store`; to use the iterator effectively we'd need a state machine where we take the iterator and poll the inbound requests/commands from the debugger connection, moving the iterator or exiting to the outer loop (ending the inner borrow) when leaving "stack traversal mode". It's much cleaner to be "modeless" and allow handles to frames to coexist with all other handles.

So basically I believe we need to do this refactor regardless, and I don't see a way around it unless we do a very inefficient copy on every pause.

cc @alexcrichton and @fitzgen for thoughts?
cfallin commented on issue #12486:
Extra wrinkle: frame-handle invalidation needs to happen not only on resuming execution but also on dropping a future representing Wasm execution. I suspect what we'll want is something like a "current execution generation" counter on a `Store` that is incremented on every return from a hostcall and in a drop guard in the fiber glue around async Wasm invocations. Then every `Frame` carries (i) an unsafe raw `usize` pointing to the frame in the actual stack, (ii) a store ID, and (iii) the execution generation in that store.
fitzgen commented on issue #12486:
In general, the more heavily we lean on Rust's type system, the more impedance mismatches like this we will encounter (and similarly for e.g. exposing things in C APIs). For APIs that we expect to be used primarily from other languages or via WIT, we should probably design the APIs with this in mind going forward.
> I suspect what we'll want is something like a "current execution generation" counter on a `Store` that is incremented on every return from hostcall and in a drop guard in the fiber glue around async Wasm invocations.

Is this not equivalent to the Wasm's exit PC and (FP or SP)? When paired with the store ID, this seems like it should be a way to validate whether a debugging stack frame iterator is still valid or not without any additional overhead on top of what we already track today.
alexcrichton commented on issue #12486:
I definitely agree with @fitzgen that if we leverage Rust's uniqueness it falls down in other integrations. That being said, I'm not sure there's really a good alternative for a frame iterator here, as ideally we wouldn't add more infrastructure around Wasm entry/exit to invalidate handles/etc.
An idea to maybe thread this needle: in the WIT for guests, a frame iterator could be modeled where the construction of the frame iterator takes the debuggee store by-`own`. This models how nothing can be done while a debugger is iterating frames, and then when iteration is finished there'd be a way to get back the store. To model this on the host, what we sort of want is typed coroutines, where a coroutine closes over the store. Resuming the coroutine is done for the various API calls that a frame iterator can do, and the yields correspond to those results. We don't have coroutines in Rust, however, so the next closest way to model this on the host would be an async task with a channel going in/out.

The WIT, for example, might look like:
```wit
resource stack {
  constructor(store: store);
  up: func() -> bool;
  down: func() -> bool;
  finish: static func(stack: stack) -> store;
  instance: func() -> instance;
  func-index: func() -> u32;
  get-local: func(idx: u32) -> result<wasm-value, ...>;
  set-local: func(idx: u32, val: wasm-value) -> result<_, ...>;
  // ...
}
```

The implementation would look something like:
- On calling the `constructor`, an async task is created which owns the store and starts the frame iteration.
- A `stack` resource internally owns a channel to this task.
- Each method on `stack` sends a message over this channel.
- Each message also has a oneshot going back saying "here's the result of the thing you just asked for".
- The async task would be a `for` loop over all the messages in the channel, synchronously handling each message. This would `break` when `finish` is called or the channel is closed (e.g. `stack` is dropped).

That would enable all the borrowing we have in today's API (which accurately models in Rust how this works), would avoid any quadratic behavior, and in theory could map OK to a C-like API too.
cfallin commented on issue #12486:
> Is this not equivalent to the Wasm's exit PC and (FP or SP)? When paired with the store id, this seems like it should be a way to validate whether a debugging stack frame iterator is still valid or not without any additional overhead on top of what we already track today.
Not exactly, because one could exit from the same PC/FP/SP in multiple successive calls (in fact it's even likely if one has e.g. a loop in the same function making the same call). If nothing above that frame has changed, then handles to higher frames may still be valid by chance, but e.g. the same function could have been called via some other path with the same frame sizes.
To be concrete: e.g. we have `e`, `f`, `g`, `h`, all with minimal frame sizes (e.g. 16 bytes for the saved FP/return-address pair only). We have `e -> f -> h` -> hostcall; create some frame handles, resume. Then `h` returns, `f` returns, and `e` calls again: `e -> g -> h` -> hostcall. Same FP and SP in the stack; same PC; but a handle to the middle frame (`f`'s) should be invalidated.
cfallin commented on issue #12486:
To Alex's sketch: I like the idea of passing ownership of the store (in my prototype, `debuggee`, so I'll call it that) to model what is actually possible at the WIT API level. That said, I think the implementation sketched above has complexity significantly higher than I would have hoped for. Question: is the cost of one `u64` increment on hostcall return too much to bear?
cfallin commented on issue #12486:
The other tricky bit about the API that truly takes ownership of the store (`debuggee`) is that it needs to essentially duplicate all of the API surface of the rest of the reflection API, too: (i) working with Wasm values as read from stack frames may require use of the store (e.g. for GC refs), and (ii) the debugger is free to query other state while reading out the stack.

I guess that's the fundamental thing: a protocol like gdbstub or DAP has a notion of either "read stack frame `i` local `j`" (gdbstub with Wasm extensions) or "give me a handle to stack frame `i`" then "with stack handle `f` read local `j`" (DAP). I believe we'll thus want to implement this by essentially caching a stack frame handle for each level, as requested. This lets the client interleave walking up the stack with querying other state. That cache is then invalidated whenever we resume execution. Basically, I think the iterator paradigm is fundamentally too restrictive for what the protocols require, and emulating what the protocols require will require quadratic behavior if we are stuck with an iterator interface.
alexcrichton commented on issue #12486:
Needing to do store-related things while you otherwise wouldn't have ownership to the debuggee definitely sinks the idea of passing ownership -- it's definitely not worth it to duplicate API surface area.
I think I don't fully understand what the proposal is here for a non-iterator-like interface, then? My hope was that we wouldn't need more `unsafe` code to deal with stack frames, but moving away from something iterator-like seems like it will require some form of `unsafe` somewhere. There are sort of a few layers here -- what the debugging protocol wants, how the debugger is implemented, what Wasmtime's WIT provides, how Wasmtime implements the WIT, and the base-level API Wasmtime provides. Could you detail a bit more what you're thinking w.r.t. specifically the base-level API Wasmtime provides?

I'm not worried about a `u64` increment/decrement specifically; what I'm worried about is further expanding the `unsafe` scope here. The iterator approach, at least to me, is pretty trivially safe. Anything beyond that will require further special care/consideration, which isn't a dealbreaker, but I'd be surprised if it boiled down to just a single increment/decrement/check and that's it.
cfallin commented on issue #12486:
Yes, the proposal is more or less what's written up above:
> We instead need a notion of "frame handles" that refer to frozen stack frames, just as we have instance handles, table handles, and the like today; the handles are `Copy`- or `Clone`-able things that are morally indexes into the `Store`.
These handles capture:
> Then every `Frame` carries (i) an unsafe raw `usize` pointing to the frame in the actual stack, (ii) a store ID, (iii) the execution generation in that store.
The raw address is unsafe, but the unsafety is bounded by the check that the frame is still valid -- given by the combination of the store ID matching a passed-in store, the generation matching the current generation on that passed-in store, and the generation being updated whenever we give control of the stack back to Wasm code (by returning) or drop it (by dropping a fiber).
Then we have a `debug_frame(&self) -> Option<Frame>` on `Store` that gives the innermost frame (if any); any frame has a `fn parent(&self) -> Option<Frame>`; and any `Frame` has accessors for locals, operand stack, instance, and all the rest, as the iterator does today, that take `&mut Store` explicitly. `Frame` itself would be at least `Clone` (maybe even `Copy`?) as e.g. `Instance` is today.
cfallin commented on issue #12486:
(Slight complication with reentrancy and also continuations: we may actually want `debug_frame_exits(&self) -> impl Iterator<Item = Frame>` or something like that to give all Wasm exit frames, so we don't have to explicitly unroll the iterator that traverses all of those as part of the stack iteration currently. Then the `parent()` impl is just following one link in the chain of one activation, up to a saved entry FP.)
alexcrichton commented on issue #12486:
Ah ok, thanks, makes sense. I agree that the idea of a copy-able `Frame` is appealing, as it helps disentangle the lifetimes here and is basically the same dynamic check that the rest of the API has. I'd be surprised if this only needed a single raw `usize` to identify the frame itself; for example, it probably also needs the original entry FP to know when to stop walking, but that's fine to iron out details over time.

How are you imagining that the generation number is going to work? The simplest implementation would invalidate all historical frames as soon as Wasm starts running, but it also sounds like you might be thinking that lower-down activations on the stack may stay valid.

> the unsafety is bounded by the check that the frame is still valid

This is partly what I'm worried about, but I think it might just be something we have to stomach. The store ID is simple enough to check, but we already maintain a surprisingly large amount of state around Wasm calls, and this would be "yet one more piece of state to check". For example, this is all the state we currently maintain, which is different from fiber state, and neither handles returning to Wasm from the host. I realize that this'll "just" be a matter of putting some increments in the right places, but the consequence of forgetting an increment is a use-after-free CVE, where in most other places it's a denial-of-service CVE with a crash of some kind. Basically, I'd love to try really hard to lean on existing state rather than adding yet more state. I don't know how that can be done, and I realize that in isolation the easiest thing to do is to add one more piece of state, but this is at least where my original concern was coming from.
fitzgen commented on issue #12486:
FWIW, in SpiderMonkey we had a fairly fancy cache for captured `SavedFrame` objects: https://searchfox.org/firefox-main/rev/33fd6bd39c625067a29f153adce6a4646e45750f/js/src/vm/Activation.h#46

This wasn't used directly for the `Debug.Frame` debugging API, IIRC, but we could do something similar.
cfallin commented on issue #12486:
> I'd be surprised if this only needed a single raw `usize` to identify the frame itself, for example it probably also needs the original entry fp to know when to stop walking, but that's fine to iron out details over time.
Right, yeah, you're right of course -- we need two `usize`s for the current FP and entry FP.

> Basically I'd love to try really hard to lean on existing state rather than adding yet-more-state. I don't know how that can be done, and I realize how in isolation the easiest thing to do is to add-one-more-piece-of-state, but this is at least where my original concern was coming from.
I was thinking we'd keep the version per `Store` (or I guess `VMStoreContext`, to make it reachable from trampoline-land), not `EntryStoreContext`, as this is scoped to the lifetime of the store, not one activation. You're right that we need to capture all cases where we might return control to execution that could unwind a stack. I was hoping that we'd have a narrow waist where we could do that around the same place that we deal with `HostResult`s in `traphandlers.rs`, and then a drop guard in one place in the fiber entry point that owns the fiber stack. If we think this is workable, I'll put together a draft PR and of course welcome your comments where I've forgotten it :-)
cfallin commented on issue #12486:
To be explicit, too: bumping the version number on every return into Wasm means that, e.g., if we are suspended at some breakpoint and then we manually invoke another Wasm function from the debugger at that point, that invalidates our cache. But the "cache" framing is intentional here, as we can always rebuild the cache by counting up the frames from our current point again. So one `u64`, and not, e.g., versions per level or something like that.
alexcrichton commented on issue #12486:
The best narrow waist here might actually be the trampolines themselves, given how tightly we control entry/exit into Wasm. With a `VMStoreContext`-reachable location, the increment could be folded into the entry trampolines, and then any return from a host/libcall would handle the increment back into Wasm. That'd also handle the case where we're raising a trap, longjmp'ing, etc.

The only other case I can think of is related to signal handlers, so there could also be an increment on the "returned with exception" path of entry trampolines too.
cfallin closed issue #12486:
Last updated: Feb 24 2026 at 04:36 UTC