wasmtime / PR #13469 Fix safepoint stack slot reuse · git-wasmtime

Wasmtime GitHub notifications bot (May 24 2026 at 04:51):

angelnereira opened PR #13469 from angelnereira:fix-gc-safepoint-slot-reuse to bytecodealliance:main:

Summary

Fixes #13461.

The safepoint spiller walks instructions backwards and can free a stack slot for a value defined by a safepoint instruction before assigning stack-map slots for values live across that same safepoint. If the freed slot is reused, the stack map can point at a slot that contains the instruction result rather than the value that must remain live across the call.

This changes the rewrite order so safepoint stack-map entries are assigned before result slots for that instruction are freed for reuse. It also adds the GC regression test from the issue to cover the null-reference trap that exposed this.
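The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual cranelift-frontend code: `Inst`, `Slots`, `assign_slots`, and `example` are hypothetical names, values are plain integers, and each value holds at most one slot. The backward walk either frees the defined value's slot before assigning stack-map slots (the order that allows the bad reuse) or assigns stack-map slots first.

```rust
use std::collections::HashMap;

/// One instruction in a toy straight-line block (hypothetical model).
struct Inst {
    /// Value defined by this instruction, if any.
    def: Option<u32>,
    /// Values live across this instruction if it is a safepoint
    /// (empty for non-safepoints).
    live_across: Vec<u32>,
}

#[derive(Default)]
struct Slots {
    assigned: HashMap<u32, u32>, // value -> slot it currently holds
    free: Vec<u32>,              // slots available for reuse
    next: u32,                   // next fresh slot index
    history: HashMap<u32, u32>,  // value -> slot it was given
}

impl Slots {
    fn get_or_create(&mut self, v: u32) -> u32 {
        if let Some(&s) = self.assigned.get(&v) {
            return s;
        }
        // Prefer a freed slot, otherwise allocate a fresh one.
        let s = match self.free.pop() {
            Some(s) => s,
            None => {
                let s = self.next;
                self.next += 1;
                s
            }
        };
        self.assigned.insert(v, s);
        self.history.insert(v, s);
        s
    }

    /// Walking backwards, a def is where the value's live range begins,
    /// so its slot becomes reusable for earlier instructions.
    fn free_def(&mut self, v: u32) {
        if let Some(s) = self.assigned.remove(&v) {
            self.free.push(s);
        }
    }
}

/// Walk the block backwards and return the slot given to each value.
fn assign_slots(insts: &[Inst], safepoint_first: bool) -> HashMap<u32, u32> {
    let mut slots = Slots::default();
    for inst in insts.iter().rev() {
        if safepoint_first {
            for &v in &inst.live_across {
                slots.get_or_create(v);
            }
        }
        if let Some(d) = inst.def {
            slots.free_def(d);
        }
        if !safepoint_first {
            for &v in &inst.live_across {
                slots.get_or_create(v);
            }
        }
    }
    slots.history
}

/// v1 = call ...; v2 = call ... (v1 live across); call ... (v2 live across)
fn example() -> Vec<Inst> {
    vec![
        Inst { def: Some(1), live_across: vec![] },
        Inst { def: Some(2), live_across: vec![1] },
        Inst { def: None, live_across: vec![2] },
    ]
}

fn main() {
    let buggy = assign_slots(&example(), false);
    let fixed = assign_slots(&example(), true);
    // Freeing the def first lets v1 take over v2's slot at the same call.
    assert_eq!(buggy[&1], buggy[&2]);
    // Assigning stack-map slots first keeps the two values apart.
    assert_ne!(fixed[&1], fixed[&2]);
}
```

In this model the second call both defines v2 and has v1 live across it, so the two values must not share a slot at that instruction. Only the order of the two steps inside the loop differs between the two runs.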

Testing

cargo fmt --all -- --check

cargo test -p cranelift-frontend safepoints

cargo build -p wasmtime-cli

target/debug/wasmtime wast -Wgc -Wfunction-references -Wwide-arithmetic -Wsimd -Wthreads -Wreference-types /tmp/safepoint-reload-aliased-ref-null.wast

WASMTIME_TEST_GC_KEYWORDS=safepoint-reload-aliased-ref-null cargo test -p wasmtime-cli --test wast safepoint-reload-aliased-ref-null

Wasmtime GitHub notifications bot (May 24 2026 at 04:51):

angelnereira requested fitzgen for a review on PR #13469.

Wasmtime GitHub notifications bot (May 24 2026 at 04:51):

angelnereira requested wasmtime-compiler-reviewers for a review on PR #13469.

Wasmtime GitHub notifications bot (May 24 2026 at 04:51):

angelnereira requested wasmtime-core-reviewers for a review on PR #13469.

Wasmtime GitHub notifications bot (May 24 2026 at 07:50):

gfx commented on PR #13469:

Confirmed — this fixes #13461 on our end.

We originally hit the null reference trap in Wado, whose compiler emits Wasm GC code with (ref null $t) values that stay live across call safepoints.

What I did:

Applied this PR's safepoints.rs reorder on top of v45.0.0 in our wasmtime fork: wado-lang/wasmtime@gfx/wasmtime-45 — fix commit 4e32285 (on top of v45.0.0 plus the regression .wast test).

Built the Wado compiler against that fork via path deps: wado-lang/wado#1175.

Ran our full e2e suite: the 7 tests that previously trapped with wasm trap: null reference (all on our JSON de/serialization paths) now pass, and the suite is green. Before this fix the exact same wasm bytes ran fine on wasmtime 44 and trapped on 45, which is what led us to bisect to #13228.

Thanks for the quick turnaround.

Wasmtime GitHub notifications bot (May 24 2026 at 10:18):

github-actions[bot] added the label cranelift on PR #13469.

Wasmtime GitHub notifications bot (May 26 2026 at 17:59):

fitzgen commented on PR #13469:

Hi @angelnereira, thanks for the PR. However, it seems like the write-up is AI-generated text. Please review https://github.com/bytecodealliance/governance/blob/main/AI_TOOL_POLICY.md. In particular, regardless of whether you are using AI as a tool yourself, you must review and own that output, and you must not simply use AI output for comments, PR descriptions, etc. You must fully own your contributions and take responsibility for them, not foist the work of understanding what the LLM did on project maintainers.

Wasmtime GitHub notifications bot (May 26 2026 at 18:04):

:repeat: fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 26 2026 at 18:04):

:speech_balloon: fitzgen created PR review comment:

This test is too large and effectively useless. Maintainers cannot understand it, and if it ever regressed in the future due to a reintroduction of this bug or one like it, we wouldn't be able to diagnose what is happening. Please make a test case that is no more than ~100 lines long, which you should be able to do as the contributor taking responsibility for your own pull request and understanding the bug it is fixing.

Wasmtime GitHub notifications bot (May 26 2026 at 18:10):

angelnereira commented on PR #13469:

Hi @angelnereira, thanks for the PR. However, it seems like the write-up is AI-generated text. Please review https://github.com/bytecodealliance/governance/blob/main/AI_TOOL_POLICY.md. In particular, regardless of whether you are using AI as a tool yourself, you must review and own that output, and you must not simply use AI output for comments, PR descriptions, etc. You must fully own your contributions and take responsibility for them, not foist the work of understanding what the LLM did on project maintainers.

Thanks for pointing this out.

My native language is Spanish, so I sometimes use tools to help translate or synthesize comments in English. That said, I understand the concern and the policy: I am responsible for fully reviewing, understanding, and owning anything I post.

I’ll be more careful going forward and make sure my comments and PR descriptions are written and reviewed by me, and only posted when I can fully stand behind them.

Sorry for the noise, and thanks for the clarification.

Wasmtime GitHub notifications bot (May 26 2026 at 18:19):

fitzgen commented on PR #13469:

My native language is Spanish, so I sometimes use tools to help translate or synthesize comments in English. That said, I understand the concern and the policy: I am responsible for fully reviewing, understanding, and owning anything I post.

To be clear, using an LLM to translate a human-written comment from Spanish to English is perfectly fine. Having the LLM write an English comment based on a Spanish prompt is not.

Thanks! Appreciate that you are receptive to this feedback.

Wasmtime GitHub notifications bot (May 27 2026 at 00:10):

angelnereira updated PR #13469.

Wasmtime GitHub notifications bot (May 27 2026 at 00:11):

angelnereira commented on PR #13469:

Thanks for the review. I removed the large WAST regression test and replaced it with a focused unit test for the safepoint spiller.

The new test directly checks the slot-reuse condition fixed here: a safepoint result's stack slot must not be reused for another value that is live across that same safepoint.

I verified it with:

cargo test -p cranelift-frontend safepoint_reserves_live_slots_before_freeing_result_slots

cargo test -p cranelift-frontend safepoints

Wasmtime GitHub notifications bot (May 27 2026 at 00:16):

angelnereira updated PR #13469.

Wasmtime GitHub notifications bot (May 27 2026 at 13:32):

:memo: fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 27 2026 at 13:32):

:speech_balloon: fitzgen created PR review comment:

This test is a little too low-level to be very useful, because the bug is not in the low-level get-or-create-stack-slot APIs; it is in the order in which those APIs are called when rewriting the whole function based on the liveness analysis.

It should be possible to make a test at the whole CLIF function level which asserts the expected output CLIF after running the safepoint spiller via assert_eq_output!(...), similar to e.g. the needs_stack_map_and_loop test in this test module, but which exercises this bug and checks for regressions. If I understand correctly, what is needed is something like this:

block0(v0: i64):
    v1 = call f(v0)
    ;; v1 needs inclusion in stack maps
    v2 = call f(v1)
    ;; v2 needs inclusion in stack maps
    v3 = call f(v2)
    return v3

That is, we have two values that need inclusion in stack maps, have the same type and non-overlapping live ranges and therefore could possibly reuse the same stack slot, and the live range for one ends at a safepoint.

This might not be the exact shape necessary to trigger the bug. It might require another call or that the values have longer live ranges across additional safepoints. I'm not exactly sure, but you should be able to come up with something based off this initial starting point. Basically just look at the low-level API call sequence you're currently making and craft a CLIF function that will trigger that same low-level API call sequence.

Please make sure that the invalid stack slot reuse is present in this test without the fix, and then that the invalid stack slot reuse goes away after the fix is reapplied.
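The conditions listed in this comment (non-overlapping live ranges, one range ending at a safepoint) can be checked mechanically on a toy encoding of the suggested block. The sketch below is an assumption-laden model, not Cranelift's liveness analysis: `Inst`, `live_range`, `safepoints_crossed`, and `suggested_shape` are hypothetical names, a live range is simply the span from the defining instruction to the last use, and a value counts as crossing a safepoint only when the call lies strictly inside that span.

```rust
/// Toy instruction: optional result, argument values, and whether it is a call.
struct Inst {
    def: Option<u32>,
    uses: Vec<u32>,
    is_safepoint: bool,
}

/// Inclusive (def index, last-use index) for `v`, or None if `v` is not
/// defined by an instruction in this block (e.g. a block parameter).
fn live_range(insts: &[Inst], v: u32) -> Option<(usize, usize)> {
    let start = insts.iter().position(|i| i.def == Some(v))?;
    let end = insts
        .iter()
        .rposition(|i| i.uses.contains(&v))
        .unwrap_or(start);
    Some((start, end))
}

/// Indices of safepoints strictly inside the live range of `v`,
/// i.e. calls that the value has to survive in this toy model.
fn safepoints_crossed(insts: &[Inst], v: u32) -> Vec<usize> {
    match live_range(insts, v) {
        Some((start, end)) => (start + 1..end)
            .filter(|&i| insts[i].is_safepoint)
            .collect(),
        None => Vec::new(),
    }
}

/// Encoding of the block sketched in the review comment.
fn suggested_shape() -> Vec<Inst> {
    vec![
        Inst { def: Some(1), uses: vec![0], is_safepoint: true }, // v1 = call f(v0)
        Inst { def: Some(2), uses: vec![1], is_safepoint: true }, // v2 = call f(v1)
        Inst { def: Some(3), uses: vec![2], is_safepoint: true }, // v3 = call f(v2)
        Inst { def: None, uses: vec![3], is_safepoint: false },   // return v3
    ]
}

fn main() {
    let insts = suggested_shape();
    let v1 = live_range(&insts, 1).unwrap();
    let v2 = live_range(&insts, 2).unwrap();
    // v1's range ends where v2's begins, and that instruction is a call.
    assert!(v1.1 <= v2.0);
    assert!(insts[v1.1].is_safepoint);
    // Under this toy definition neither value crosses a call in between.
    assert!(safepoints_crossed(&insts, 1).is_empty());
    assert!(safepoints_crossed(&insts, 2).is_empty());
}
```

Under these assumptions neither v1 nor v2 is live across an intervening call in the block as written, which fits the caveat that the shape may need another call or longer live ranges before it exercises the slot reuse.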

Wasmtime GitHub notifications bot (May 27 2026 at 17:11):

vouillon commented on PR #13469:

I think this is superseded by #13480. The root cause is that loop-invariant values aren't tracked properly in the rewrite walk. Reordering rewrite_safepoint and rewrite_def only covers the case where the def is also a safepoint. In #13480's regression test the slot is freed by a plain def (an iconst), so the reorder doesn't help. #13480 fixes the general case by reserving the slots for loop-invariant values up front, after which the order of the two calls no longer matters.

Wasmtime GitHub notifications bot (May 27 2026 at 20:04):

:cross_mark: fitzgen closed without merge PR #13469.

Wasmtime GitHub notifications bot (May 27 2026 at 20:04):

fitzgen commented on PR #13469:

Closing in favor of https://github.com/bytecodealliance/wasmtime/pull/13498 but if you can create a test case that still fails and isn't fixed by that PR, then please open a new PR/issue! Thanks

Last updated: Jul 29 2026 at 05:03 UTC