Stream: cranelift

Topic: stack overflow after adding simplification rules (#12799)


Hyunbin Kim (Mar 19 2026 at 08:33):

Hi!

This is about stack overflow failures observed in CI for #12799.

The failure looks like this:

     Running tests/all/main.rs (target/debug/deps/all-049eddf3751e9505)

running 104 tests

thread '<unknown>' (16230) has overflowed its stack
fatal runtime error: stack overflow, aborting
error: test failed, to rerun pass `-p wasi-common --test all`

Caused by:
  process didn't exit successfully: `/home/runner/work/wasmtime/wasmtime/target/debug/deps/all-049eddf3751e9505` (signal: 6, SIGABRT: process abort signal)
Error: Process completed with exit code 101.

This shows up in multiple jobs, including:

So this does not look like a single crate-specific issue; it seems like a shared compile/rewrite path.
I tried to narrow this down by doing a rough binary search on CI.

So far, I found that the stack overflow starts appearing after I added the bit-operation simplification rules in bitops.isle, specifically in this range.

I have not been able to narrow it down further yet, because removing a contiguous block of the recently added rules was not enough to make the failure disappear. It looks like the issue may involve multiple rules in bitops.isle, rather than a single obviously bad rule.
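When a failure may depend on several rules at once, a plain binary search can stall, because neither half reproduces the problem on its own. One general alternative is to drop candidates one at a time and keep only those whose removal makes the failure go away. The sketch below shows that idea in isolation. The rule names and the `fails_with` predicate are hypothetical stand-ins (in practice each check would be a full build and test run), and nothing here reflects the actual contents of bitops.isle.

```rust
/// Hypothetical stand-in for "build with this rule set and run the tests".
/// Here the failure needs BOTH `rule_c` and `rule_f` to be present.
fn fails_with(rules: &[&str]) -> bool {
    rules.contains(&"rule_c") && rules.contains(&"rule_f")
}

/// Greedy one-at-a-time reduction: try removing each item in turn and
/// keep the removal whenever the failure still reproduces without it.
/// A single pass is enough when adding rules never hides the failure.
fn minimize<T: Clone>(items: &[T], fails: impl Fn(&[T]) -> bool) -> Vec<T> {
    let mut kept: Vec<T> = items.to_vec();
    let mut i = 0;
    while i < kept.len() {
        let mut candidate = kept.clone();
        candidate.remove(i);
        if fails(&candidate) {
            // Still failing without this item, so it is not needed.
            kept = candidate;
        } else {
            // Removing this item hides the failure, so keep it.
            i += 1;
        }
    }
    kept
}

fn main() {
    // Hypothetical rule names, for illustration only.
    let rules = [
        "rule_a", "rule_b", "rule_c", "rule_d",
        "rule_e", "rule_f", "rule_g", "rule_h",
    ];
    let culprits = minimize(&rules, fails_with);
    println!("rules needed to reproduce: {culprits:?}");
}
```

This needs one check per candidate, so it is slower than bisection, but it does not assume that a single rule or a single contiguous range is responsible.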

Does this ring a bell as a known pattern with ISLE/egraph rule growth, especially in older toolchains / MSRV / specific Linux CI environments? Also, if anyone has suggestions on which recent bitops rule families are the most suspicious to remove, that would be very helpful.

Since I am still very new to this project, any observations/pointers/related context would be helpful.

Thanks!

Chris Fallin (Mar 19 2026 at 09:29):

We have a rewrite depth limit, so it at least shouldn't be the obvious thing of infinite recursion in rules or whatnot. If you have a reliable reproduction, please do file an issue and we can look at it (a bunch of us are at Wasm IO at the moment, so responses may be slow). In the meantime, if reverting the entire PR (or set of PRs) fixes the issue, let's definitely do that so we get back to a good state on main.
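As a rough illustration of what a rewrite depth limit guards against, here is a toy rewriter with a rule that always matches its own output. The `Expr` type, the `commute` rule, and the value of `REWRITE_DEPTH_LIMIT` are all made up for this sketch; it is not Cranelift's implementation and says nothing about how the real limit is enforced.

```rust
/// Toy expression type for this sketch; NOT Cranelift's IR.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(&'static str),
    And(Box<Expr>, Box<Expr>),
}

/// Hypothetical limit, chosen arbitrarily for illustration.
const REWRITE_DEPTH_LIMIT: usize = 5;

/// A deliberately bad rule: (and a b) => (and b a).
/// Its output matches the same rule again, so without a limit
/// re-rewriting the result would never terminate.
fn commute(e: &Expr) -> Option<Expr> {
    match e {
        Expr::And(a, b) => Some(Expr::And(b.clone(), a.clone())),
        _ => None,
    }
}

/// Apply the rule, then recursively re-rewrite the result, giving up
/// once the nesting depth reaches the limit.
fn rewrite(e: Expr, depth: usize, applied: &mut usize) -> Expr {
    if depth >= REWRITE_DEPTH_LIMIT {
        return e; // limit reached: stop and keep what we have
    }
    match commute(&e) {
        Some(next) => {
            *applied += 1;
            rewrite(next, depth + 1, applied)
        }
        None => e,
    }
}

fn main() {
    let e = Expr::And(Box::new(Expr::Var("x")), Box::new(Expr::Var("y")));
    let mut applied = 0;
    let out = rewrite(e, 0, &mut applied);
    // The rule fires once per level until the limit stops it.
    println!("{applied} rule applications, result: {out:?}");
}
```

With the limit in place the recursion stops after a fixed number of nested rewrites, even though the rule itself would apply forever.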


Last updated: Mar 23 2026 at 16:19 UTC