alexcrichton requested fitzgen for a review on PR #13320.
alexcrichton opened PR #13320 from alexcrichton:fallible-gc to bytecodealliance:main:
This commit is an attempt to harden Wasmtime in the face of GC heap corruption by downgrading panics to returned errors. Normal operation should never hit any of these paths and in theory this is all dead code. The intention, however, is to further downgrade the severity of GC heap corruption from a DoS to, in theory, maybe not even a CVE at all.
This commit is inspired by the transition done for component-model-async recently too where many `assert!`'d conditions and panics were translated into `bail_bug!` within Wasmtime. This returns a special kind of error in release mode and panics in debug mode. The rationale behind this is that, like component-model-async, the GC implementation is the intersection of:
- Easy for guests to control.
- Difficult to guarantee 100% correctness of the host.
- Low consequences if corruption is detected.
- Easy to generate a trap via `?` to propagate upwards.
In this situation the goal here is to more aggressively return errors, in release mode, rather than panic which risks a quick DoS of embedders. The ideal goal is for GC heap corruption to not be a DoS at all, but we're not quite ready to make that commitment just yet.
Many methods in this commit were refactored to return `Result`, and many implementations internally within the GC implementation have been updated to use `bail_bug!` or similar to downgrade panics to errors. Note that in debug mode (or `cfg(debug_assertions)`) all of these are still panics.
cc #13216
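For readers unfamiliar with the pattern, here is a minimal sketch of the idea described above: a bounds-checked heap read that panics in debug builds and returns an error in release builds, with callers propagating the error via `?`. The names `HeapBug`, `bail_bug_sketch!`, `read_u32`, and `sum_fields` are hypothetical and exist only for illustration; this is not Wasmtime's actual `bail_bug!` macro or GC heap API.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not Wasmtime's real error type).
#[derive(Debug, PartialEq)]
pub struct HeapBug(pub String);

impl fmt::Display for HeapBug {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "internal bug detected: {}", self.0)
    }
}

impl std::error::Error for HeapBug {}

/// Hypothetical stand-in for a `bail_bug!`-style macro: panic when
/// `debug_assertions` is enabled, otherwise return an error from the
/// enclosing function.
macro_rules! bail_bug_sketch {
    ($($arg:tt)*) => {{
        let msg = format!($($arg)*);
        if cfg!(debug_assertions) {
            panic!("bug: {}", msg)
        } else {
            return Err(HeapBug(msg))
        }
    }};
}

/// Read a little-endian `u32` at a possibly corrupt offset, using checked
/// arithmetic and `get` rather than indexing with an unchecked range.
pub fn read_u32(heap: &[u8], offset: usize) -> Result<u32, HeapBug> {
    let end = match offset.checked_add(4) {
        Some(end) => end,
        None => bail_bug_sketch!("offset {} overflows", offset),
    };
    let bytes = match heap.get(offset..end) {
        Some(bytes) => bytes,
        None => bail_bug_sketch!(
            "range {}..{} is out of bounds for a heap of {} bytes",
            offset,
            end,
            heap.len()
        ),
    };
    // `bytes` is known to have exactly four elements at this point.
    Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// A caller that propagates any detected corruption upwards with `?`.
pub fn sum_fields(heap: &[u8], a: usize, b: usize) -> Result<u32, HeapBug> {
    let x = read_u32(heap, a)?;
    let y = read_u32(heap, b)?;
    Ok(x.wrapping_add(y))
}
```

In a release build, `sum_fields(&heap, 0, 6)` on an 8-byte heap would return an `Err` rather than panicking; in a debug build the same call panics so the bug is loud during development.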
alexcrichton requested wasmtime-core-reviewers for a review on PR #13320.
alexcrichton updated PR #13320.
alexcrichton updated PR #13320.
alexcrichton updated PR #13320.
github-actions[bot] added the label wasmtime:ref-types on PR #13320.
github-actions[bot] added the label wasmtime:api on PR #13320.
github-actions[bot] commented on PR #13320:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"
Thus the following users have been cc'd because of the following labels:
- fitzgen: wasmtime:ref-types
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
bjorn3 commented on PR #13320:
If you are worried about DoS due to bugs as an embedder, why wouldn't you be using catch_unwind already? catch_unwind in part exists to reduce the blast radius of bugs.
alexcrichton commented on PR #13320:
There's a few reasons for that. One is that embedders can indeed already use `catch_unwind` and that's their prerogative, but this is a guarantee we'd like to have Wasmtime provide (as opposed to requiring embedders to provide it). When doing that, however, the API surface area of Wasmtime is quite large and would require `catch_unwind` pretty much everywhere. Additionally `catch_unwind` as a mitigation only works with `-Cpanic=unwind`, and I'd like to see us support these sorts of mitigations in `-Cpanic=abort` embeddings as well.
bjorn3 commented on PR #13320:
> When doing that, however, the API surface area of Wasmtime is quite large and would require catch_unwind pretty much everywhere.

In the case of, for example, an HTTP server you would only need a single `catch_unwind` around the request handler. You almost certainly aren't going to be able to do recovery more fine-grained than returning a 500 from the entire request anyway. Similarly for most other embedders, I would assume. If you can do recovery at all it would require throwing away the entire Store, as clearly the state stored in the Store is now corrupt. (By the way, maybe mark the Store as poisoned when this happens and return an error/panic whenever you call into it again?)

> Additionally catch_unwind as a mitigation only works with -Cpanic=unwind, and I'd like to see us support these sorts of mitigations in -Cpanic=abort embeddings as well.

If you use panic=abort you give up the ability to do recovery for programming errors (panics) more fine-grained than the entire process. The entire reason panic=unwind exists afaik is to allow recovery from programming errors without an external restart, similar to how in Erlang you crash a single process (green thread) when a bug or error happens that is not locally recoverable without bringing down the entire BEAM VM. https://erlang.org/pipermail/erlang-questions/2003-March/007870.html
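To make the embedder-side approach concrete, here is a minimal sketch of a single `catch_unwind` around a request handler, combined with the poisoning idea from the parenthetical. `Session` and `handle_request` are hypothetical names invented for this sketch; `Session` merely stands in for per-request state such as a Store and is not a Wasmtime type. The sketch assumes the program is built with the default `panic=unwind`; under `panic=abort` the process would terminate instead of reaching the error branch.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical per-request state, standing in for something like a Store.
pub struct Session {
    poisoned: bool,
}

/// Hypothetical handler body; panics on one path to simulate a host bug.
fn handle_request(path: &str) -> String {
    if path == "/corrupt" {
        panic!("simulated bug while serving {}", path);
    }
    format!("hello from {}", path)
}

impl Session {
    pub fn new() -> Self {
        Session { poisoned: false }
    }

    /// Returns an HTTP-like status code and body. A panic in the handler is
    /// caught once, at the request boundary, and turned into a 500.
    pub fn serve(&mut self, path: &str) -> (u16, String) {
        // Once a panic has been observed the state is assumed corrupt, so
        // every later call is refused instead of touching it again.
        if self.poisoned {
            return (500, String::from("session poisoned by an earlier panic"));
        }
        match panic::catch_unwind(AssertUnwindSafe(|| handle_request(path))) {
            Ok(body) => (200, body),
            Err(_) => {
                self.poisoned = true;
                (500, String::from("internal server error"))
            }
        }
    }
}
```

With this shape, the first `serve("/corrupt")` returns a 500 and every subsequent call on the same `Session` is rejected, so recovery amounts to discarding the session and creating a fresh one.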
alexcrichton commented on PR #13320:
That all makes sense, yes, but panics in the GC heap due to corruption are still CVEs in Wasmtime (e.g. DoS vectors). The purpose here is to pave the road to making these not even a CVE.
alexcrichton updated PR #13320.
alexcrichton updated PR #13320.
alexcrichton updated PR #13320.
alexcrichton commented on PR #13320:
@fitzgen would you be ok taking a look at this? (unsure if this fell through the inbox cracks) or if you wanted to take some more time
:thumbs_up: fitzgen submitted PR review:
Sorry, this did indeed get lost, so thanks for the ping!
:speech_balloon: fitzgen created PR review comment:
Update the doc comments for all of these methods?
:speech_balloon: fitzgen created PR review comment:
Update the docs for these trait methods as necessary too?
:speech_balloon: fitzgen created PR review comment:
Perhaps we should move these to `wasmtime_core`?
alexcrichton updated PR #13320.
alexcrichton updated PR #13320.
alexcrichton has enabled auto merge for PR #13320.
alexcrichton has disabled auto merge for PR #13320.
alexcrichton updated PR #13320.
alexcrichton added PR #13320 Always use fallible accesses of the GC heap to the merge queue.
alexcrichton removed PR #13320 Always use fallible accesses of the GC heap from the merge queue.
alexcrichton added PR #13320 Always use fallible accesses of the GC heap to the merge queue.
github-merge-queue[bot] removed PR #13320 Always use fallible accesses of the GC heap from the merge queue.
alexcrichton added PR #13320 Always use fallible accesses of the GC heap to the merge queue.
:check: alexcrichton merged PR #13320.
alexcrichton removed PR #13320 Always use fallible accesses of the GC heap from the merge queue.
Last updated: Jun 01 2026 at 09:49 UTC