wasmtime / issue #13204 Cranelift: follow-up analysis of ... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #13204 Cranelift: follow-up analysis of ...

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

The original problem was shown here: #13068.

Problem

After testing the fuzzer-generated input foo.clif, we identified three rule groups introduced by #12926 (commit 1578325) as candidates:

A: ((x + y) - (x + z)) -> (y - z)

B: ((x - z) - (y - z)) -> (x - y)

C: ((x - y) - (x - z)) -> (z - y)

Rule B and Rule C are referenced by nickname below.

The regression disappears as soon as both Rule B and Rule C are removed from 1578325.

<details>
<summary>benchmark details</summary>

Baseline commit: 1578325

Measurement command: CRANELIFT_FILETESTS_THREADS=1 /usr/bin/time -p <clif-util> test <foo.clif>

Runs per scenario: 5

scenario removed groups median real (s) mean real (s) min max stdev stalled slow vs baseline

minus_group_ab A + B 31.11 31.196 30.96 31.52 0.228 5 5 1.03x

minus_group_ac A + C 31.21 31.230 31.02 31.63 0.218 5 5 1.03x

minus_group_bc B + C 0.17 0.164 0.14 0.18 0.016 0 0 0.01x

minus_group_abc A + B + C 0.15 0.156 0.14 0.18 0.016 0 0 0.00x

Leaving either Rule B or Rule C active still blows up, so each of them is independently sufficient to trigger the regression. Group A has no measurable impact for this testcase.

</details>

Observation

foo.clif summary
    ;; --- foo.clif:124-136  (ishl chain sharing v214) ---
    v214 = ishl v211, v213
    v215 = ishl v0,   v214
    v216 = ishl v215, v214
    v217 = ishl v216, v214
    v218 = ishl v217, v214
    v219 = ishl v218, v214
    v220 = ishl v219, v214
    v221 = ishl v220, v214
    v222 = ishl v221, v214
    v223 = ishl v222, v214
    v224 = ishl v223, v214
    v225 = ishl v224, v214
    v226 = ishl v225, v214

    ;; --- foo.clif:155  (selects.isle:4 aliases v242 to v226) ---
    v242 = select v4, v226, v226

    ;; --- foo.clif:186-242  (48-step isub(v, v) chain on i64x2) ---
    v247 = isub v242, v242
    v248 = isub v247, v247
    v249 = isub v248, v248
    v250 = isub v249, v249

    ... 43 more steps ...

    v293 = isub v292, v292
    v294 = isub v293, v293
The fuzzer combined two structural ingredients:

an ishl chain sharing a common shift amount (v214),

a long isub.i64x2(v, v) chain where vector typing blocks the scalar-only short-circuit prior to #13063.

Additional observations:

The regression was resolved after vector support in #13063 (f2807a1), which makes this rule effectively work on vectors too:

isle ;; x - x --> 0 (rule (simplify (isub (ty_int ty) x x)) (subsume (iconst_u ty 0)))

Once isub.i64x2 v v collapses to 0 at the root, Rule B/C lose their fuel.

Under 1578325 (Rule B/C active, pre-#13063 behavior), this rule fires ~3,000,000 times on a single rule:

isle ;; (x<<z) - (y<<z) --> (x-y)<<z (rule (simplify (isub ty (ishl ty x z) (ishl ty y z))) (ishl ty (isub ty x y) z))
Let us call Rule D.

Causality

On the chain v_{k+1} = isub(v_k, v_k) with v_k = isub(v_{k-1}, v_{k-1}), each v_{k+1} unfolds to:

isub(isub(v_{k-1}, v_{k-1}), isub(v_{k-1}, v_{k-1}))

That shape matches Rule B and Rule C, so each fire can union v_k's leader into v_{k+1}'s e-class. Equivalently, each chain step absorbs its predecessor's e-class.

Example

Take v252 = v251 - v251 after prior chain compression has already begun.

Each ≡ means "same e-class":
v252 ≡ v251 - v251
     ≡ (v250 - v250) - (v250 - v250)          ; unfold v251
     ≡ v250 - v250                            ; Rule B (or C)
     ≡ v251                                   ; equivalent value already exists
Result: v251 is added into v252's e-class. Repeating this at earlier steps pulls in more predecessors transitively.

Because those absorbed e-classes already contain ishl-form alternatives, Rule D can also fire on the same top-level:
v252 ≡ v251 - v251
     ≡ (A << v214) - (A << v214)               ; pick ishl-form alternatives
     ≡ (A - A) << v214                         ; Rule D
     ≡ ((v225 - v225) - (v225 - v225)) << v214
     ≡ (v225 - v225) << v214                   ; Rule B (or C)
     ≡ A << v214
     ...
This is the key feedback loop:

Rule B/C at step k+1 absorbs step k's e-class.

That gives Rule D more valid bindings at step k+1.

Rule D creates more ishl-form enodes, which are then absorbed at step k+2 by Rule B/C.

So Rule B/C do not need many fires themselves; they expand downstream match pools for Rule D.

Without Rule B/C
Rule D can still fire, but v_{k+1} = isub(v_k, v_k) mostly stays single-path: step k+1 only sees alternatives created inside v_k itself, so each step quickly hits a local limit. Inductively, v_k has same single choice.

With Rule B/C
Step k+1 can absorb step k's e-class, so v_{k+1} carries not only its own forms but also alternatives already present in v_k (and then from v_{k-1}, v_{k-2}, ... through repeated absorption). That is why later nodes inherit earlier-node choices and must search a much larger space.

Limits for e-graphs

REWRITE_LIMIT, ECLASS_ENODE_LIMIT, and MATCHES_LIMIT are local caps (per recursion stack / per e-class / per ISLE call), not global per-function budgets.

ECLASS_ENODE_LIMIT blocks additional unions for one e-class, but does not impose a global function-wide cap.

MATCHES_LIMIT truncates returned values per call, but many calls still occur across 48 independent top-level chain nodes.

We also noticed a off-by-one:

REWRITE_LIMIT = 5 is checked as if rewrite_depth > 5 before the increment and its initial value is 0, so rewrites actually run up to 6. If the intent is to cap at 5, the check should be >= 5.

With no function-global or per-rule budget, each top-level can replay the same cascade and totals accumulate.

Per-top-level Rule D fire counts (first few chain steps):

top-level fires

v247 6

v248 6

v249 44

v251 138

This growth pattern aligns with the observed 3.3M total fires.

Trace evidence

Trace evidence that v247's leader v830 is absorbed into v248 and v249 e-classes (combined.log, trace-log build):
1123:  optimizing inst inst132 orig result v247 gave v830
1269:  -> returned from ISLE: v248 -> [v830, v852]
1904:  -> returned from ISLE: v249 -> [v830, v852, v932, v939, v944]
v830 in both later return lists is direct evidence that later chain steps absorb prior chain e-classes.

Open question

My current interpretation is that this blow-up is not only a bug in one rewrite rule, but may be an example of a more fundamental limitation of e-graph style rewriting: even if saturation would eventually be reached, the search space can become too large for compile-time use before that happens.

Rule B/C makes later v_{k+1} nodes inherit alternatives from earlier v_k nodes, and Rule D then explores those accumulated alternatives. The current limits bound local recursion, per-e-class growth, and per-call result count, but not the total number of rule applications across the function.

Is this the right way to interpret the issue: an expected theoretical limitation of e-graph-based rewriting as applied in Cranelift, rather than only a bug in this specific rewrite rule?

scenario	removed groups	median real (s)	mean real (s)	min	max	stdev	stalled	slow	vs baseline
minus_group_ab	A + B	31.11	31.196	30.96	31.52	0.228	5	5	1.03x
minus_group_ac	A + C	31.21	31.230	31.02	31.63	0.218	5	5	1.03x
minus_group_bc	B + C	0.17	0.164	0.14	0.18	0.016	0	0	0.01x
minus_group_abc	A + B + C	0.15	0.156	0.14	0.18	0.016	0	0	0.00x

top-level	fires
v247	6
v248	6
v249	44
v251	138

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

myunbin added the bug label to Issue #13204.

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

myunbin added the cranelift label to Issue #13204.

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

cfallin commented on issue #13204:

Thanks, @myunbin, for the writeup.

A few high-level questions, as this writeup uses terminology I'm not familiar with:

"Take v252 = v251 - v251 after prior chain compression has already begun." --> What does "chain compression" mean? This is not a term I'm familiar with nor one that we use in the aegraph implementation.

You keep writing "absorb's X's e-class". This seems to suggest more of a classical or egg-style e-graph, where e-classes are merged and uses are rewritten (and a union-find redirects uses to the merged e-class). As you probably know, the aegraph implementation uses union nodes so we don't "absorb another e-class"; we define a new value that represents a merged e-class.

Then a meta-question: I haven't worked through all of the details above yet, but this an extremely verbose writeup, with unfamiliar terminology and some aspects above that sound like misunderstandings (or, at least, an odd way of describing things). I have to ask bluntly: did you use an LLM to produce this analysis? If so, could you please work through it as a human and try to condense it to something that shows understanding of our local terminology and addresses the above questions?

Thanks!

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

cfallin edited a comment on issue #13204:

Thanks, @myunbin, for the writeup.

A few high-level questions, as this writeup uses terminology I'm not familiar with:

"Take v252 = v251 - v251 after prior chain compression has already begun." --> What does "chain compression" mean? This is not a term I'm familiar with nor one that we use in the aegraph implementation.

You keep writing "absorbs X's e-class". This seems to suggest more of a classical or egg-style e-graph, where e-classes are merged and uses are rewritten (and a union-find redirects uses to the merged e-class). As you probably know, the aegraph implementation uses union nodes so we don't "absorb another e-class"; we define a new value that represents a merged e-class.

Then a meta-question: I haven't worked through all of the details above yet, but this an extremely verbose writeup, with unfamiliar terminology and some aspects above that sound like misunderstandings (or, at least, an odd way of describing things). I have to ask bluntly: did you use an LLM to produce this analysis? If so, could you please work through it as a human and try to condense it to something that shows understanding of our local terminology and addresses the above questions?

Thanks!

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:27):

myunbin edited issue #13204:

The original problem was shown here: #13068.

Problem

After testing the fuzzer-generated input foo.clif, we identified three rule groups introduced by #12926 (commit 1578325) as candidates:

rule A: ((x + y) - (x + z)) -> (y - z)

rule B: ((x - z) - (y - z)) -> (x - y)

rule C: ((x - y) - (x - z)) -> (z - y)

The regression disappears as soon as both Rule B and Rule C are removed from 1578325.

<details>
<summary>benchmark details</summary>

Baseline commit: 1578325

Measurement command: CRANELIFT_FILETESTS_THREADS=1 /usr/bin/time -p <clif-util> test <foo.clif>

Runs per scenario: 5

scenario removed groups median real (s) mean real (s) min max stdev stalled slow vs baseline

minus_group_ab A + B 31.11 31.196 30.96 31.52 0.228 5 5 1.03x

minus_group_ac A + C 31.21 31.230 31.02 31.63 0.218 5 5 1.03x

minus_group_bc B + C 0.17 0.164 0.14 0.18 0.016 0 0 0.01x

minus_group_abc A + B + C 0.15 0.156 0.14 0.18 0.016 0 0 0.00x

Leaving either Rule B or Rule C active still blows up, so each of them is independently sufficient to trigger the regression. Group A has no measurable impact for this testcase.

</details>

Observation

foo.clif summary
    ;; --- foo.clif:124-136  (repeated ishl values sharing v214) ---
    v214 = ishl v211, v213
    v215 = ishl v0,   v214
    v216 = ishl v215, v214
    v217 = ishl v216, v214
    v218 = ishl v217, v214
    v219 = ishl v218, v214
    v220 = ishl v219, v214
    v221 = ishl v220, v214
    v222 = ishl v221, v214
    v223 = ishl v222, v214
    v224 = ishl v223, v214
    v225 = ishl v224, v214
    v226 = ishl v225, v214

    ;; --- foo.clif:155  (selects.isle:4 aliases v242 to v226) ---
    v242 = select v4, v226, v226

    ;; --- foo.clif:186-242  (48 repeated isub(v, v) values on i64x2) ---
    v247 = isub v242, v242
    v248 = isub v247, v247
    v249 = isub v248, v248
    v250 = isub v249, v249

    ... 43 more steps ...

    v293 = isub v292, v292
    v294 = isub v293, v293
The fuzzer has two main structure:

Repeated ishl values sharing a common shift amount (v214),

Repeated isub.i64x2(v, v) (48 times).

Additional observations:

The regression was resolved after vector support in #13063 (f2807a1):

;; x - x --> 0 (rule (simplify (isub (ty_int ty) x x)) (subsume (iconst_u ty 0)))

isub.i64x2 v v collapses to 0 at the root.

Under 1578325 (Rule B/C active, pre-#13063 behavior), this rule fires ~3,000,000 times on a single rule:

;; (x<<z) - (y<<z) --> (x-y)<<z (rule (simplify (isub ty (ishl ty x z) (ishl ty y z))) (ishl ty (isub ty x y) z))
Let us call Rule D.

Causality

From v_{k+1} = isub(v_k, v_k), where v_k = isub(v_{k-1}, v_{k-1}), one level of unfolding gives:
v_{k+1}
  = isub(v_k, v_k)
  = isub(isub(v_{k-1}, v_{k-1}), isub(v_{k-1}, v_{k-1}))
  = isub(v_{k-1}, v_{k-1})                    ; Rule B
Result is the earlier expression, v_k. This means that rules can return the earlier value as an equivalent value while optimizing v_{k+1}. This equivalence is represented by a Union value, unless the rewrite subsumes the original or hits the eclass-size limit.

For example:
v252 = isub(v251, v251)
v251 = isub(v250, v250)

--

v252 = isub(v251, v251)
     = isub(isub(v250, v250), isub(v250, v250))
     = isub(v250, v250)                           ; Rule B
     = v251
Rule B/C can make a later repeated isub(v, v) value equivalent to an earlier one.
(rule (simplify (isub ty (ishl ty x z) (ishl ty y z))) (ishl ty (isub ty x y) z))
Rule D matches when both operands can be viewed as ishl expressions with the same shift amount. For example,
v247 = isub(v242, v242)
     = isub(v226, v226)
     = isub(ishl(v225, v214), ishl(v225, v214))
     = ishl(isub(v225, v225), v214)              ; Rule D
B/C add the extra effect that a later value can become equivalent to an earlier repeated isub value:
v248 = isub(v247, v247)
     = isub(isub(v242, v242), isub(v242, v242))
     = isub(v242, v242)                          ; Rule B
     = v247
So later values can see both locally-created forms and forms already associated with earlier values such as v247.

Without Rule B/C, Rule D can still fire. But later repeated isub(v, v) values do not get rewritten back to earlier repeated isub values. Rule D is also bounded by the rewrite-depth limit.

With Rule B/C, later values can also be equivalent to earlier values, so Rule D gets many more valid bindings.

In short:

Rule B/C can rewrite a later isub(v, v) to a form already represented by an earlier value.

Rule D then has more equivalent forms to match against.

Limits for egraph rewriting

The existing limits are local:

REWRITE_LIMIT bounds recursive optimization.

ECLASS_ENODE_LIMIT bounds the size of one tree.

MATCHES_LIMIT truncates results from one ISLE call.

Each of the 48 isub(v, v) values is optimized separately. Rule B can rewrite a later one back to the previous isub(v, v) value, so each local rewrite can match forms recorded for earlier values.

Additional off-by-one: REWRITE_LIMIT = 5 is checked before incrementing rewrite_depth(which starts from 0), so the maximum observed depth is 6.

Evidence

Per-top-level Rule D fire counts (first few repeated isub(v, v) values):

top-level fires

v247 6

v248 6

v249 44

v251 138

This growth pattern accumulates to observed 3.3M total fires.

Trace evidence that an earlier optimized value is returned again for later values(trace-log):
1123:  optimizing inst inst132 orig result v247 gave v830
1269:  -> returned from ISLE: v248 -> [v830, v852]
1904:  -> returned from ISLE: v249 -> [v830, v852, v932, v939, v944]
v830 appearing in later return lists shows that an earlier optimized value is returned again as an equivalent value for later inputs.

Open question

Our interpretation is that this is not just a problem with one rewrite rule, but a compile-time risk in how e-graphs(and Cranelift's aegraph) expose accumulated equivalent alternatives during matching.

Does this interpretation sound reasonable, or would you think about this failure mode differently?

scenario	removed groups	median real (s)	mean real (s)	min	max	stdev	stalled	slow	vs baseline
minus_group_ab	A + B	31.11	31.196	30.96	31.52	0.228	5	5	1.03x
minus_group_ac	A + C	31.21	31.230	31.02	31.63	0.218	5	5	1.03x
minus_group_bc	B + C	0.17	0.164	0.14	0.18	0.016	0	0	0.01x
minus_group_abc	A + B + C	0.15	0.156	0.14	0.18	0.016	0	0	0.00x

top-level	fires
v247	6
v248	6
v249	44
v251	138

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:34):

myunbin commented on issue #13204:

Hi @cfallin, first, I apologize for using unfamiliar terminology and vague concepts.

Some of the wording came from terms we use internally in my current project. Since English is not my first language, relying on an LLM for translation also seems to have made parts of the text less precise. I have now tried to make it as concise as possible.

Although I used an LLM while preparing this analysis, it was only for auxiliary tasks such as writing scripts and polishing text. The experiments and root-cause analysis were done by self.

If anything in my interpretation seems wrong, unclear, or incomplete, I would appreciate your feedback.

Thanks!

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:43):

myunbin commented on issue #13204:

Oh, and regarding the questions:

By "chain compression", I was referring to the repeated pattern v_k = isub(v_{k-1}, v_{k-1}) in the testcase.

By "absorbs", I meant that the newly created aegraph value can include the result of an earlier value. I have updated the wording to avoid that implication.

Again, sorry for the poor word choices.
Thanks!

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:45):

myunbin edited a comment on issue #13204:

Hi @cfallin, first, I apologize for using unfamiliar terminology and vague concepts.

Some of the wording came from terms we use internally in my current project. Since English is not my first language, relying on an LLM for translation also seems to have made parts of the text less precise. I have now tried to make it as concise as possible.

Although I used an LLM while preparing this analysis, it was only for auxiliary tasks such as writing scripts and polishing text. The experiments and root-cause analysis were done by self.

I've updated the original writeup. If anything still seems wrong or worth discussing, I would appreciate your feedback.

Thanks!

Wasmtime GitHub notifications bot (May 01 2026 at 19:25):

cfallin commented on issue #13204:

I've read through this again and I think I understand the failure mode better now. Thanks for writing up the example.

My very-simplified view is: the recursive x_k+1 = x_k - x_k, rewriting x_k+1 as an alias of x_k instead of 0, while valid, creates very large eclasses so the rules that match on these eclasses (such as the shift-rule) blow up their search time because they evaluate every combination of inputs.

subsume does its job once we introduce the x - x == 0 simplification rule; but separately, this is a new growth dimension for eclasses. If I am reading this right, the chain of subtracts apparently creates an eclass larger than the eclass enode-count limit (ECLASS_ENODE_LIMIT)? Is that what you observed? Naively I would have expected the limit to kick in. I'll poke at this further.

In any case, aside from use of subsume wherever we have a clear "collapsing optimization" that deletes part of the program (e.g. x-x = 0), I wonder if we should implement some sort of fuel throughout the ISLE-generated matching logic. It's always valid not to optimize a given value, so we can quit at any point. That would put a hard bound on this kind of unexpected rule interaction.

Wasmtime GitHub notifications bot (May 15 2026 at 23:12):

cfallin closed issue #13204:

The original problem was shown here: #13068.

Problem

After testing the fuzzer-generated input foo.clif, we identified three rule groups introduced by #12926 (commit 1578325) as candidates:

rule A: ((x + y) - (x + z)) -> (y - z)

rule B: ((x - z) - (y - z)) -> (x - y)

rule C: ((x - y) - (x - z)) -> (z - y)

The regression disappears as soon as both Rule B and Rule C are removed from 1578325.

<details>
<summary>benchmark details</summary>

Baseline commit: 1578325

Measurement command: CRANELIFT_FILETESTS_THREADS=1 /usr/bin/time -p <clif-util> test <foo.clif>

Runs per scenario: 5

scenario removed groups median real (s) mean real (s) min max stdev stalled slow vs baseline

minus_group_ab A + B 31.11 31.196 30.96 31.52 0.228 5 5 1.03x

minus_group_ac A + C 31.21 31.230 31.02 31.63 0.218 5 5 1.03x

minus_group_bc B + C 0.17 0.164 0.14 0.18 0.016 0 0 0.01x

minus_group_abc A + B + C 0.15 0.156 0.14 0.18 0.016 0 0 0.00x

Leaving either Rule B or Rule C active still blows up, so each of them is independently sufficient to trigger the regression. Group A has no measurable impact for this testcase.

</details>

Observation

foo.clif summary
    ;; --- foo.clif:124-136  (repeated ishl values sharing v214) ---
    v214 = ishl v211, v213
    v215 = ishl v0,   v214
    v216 = ishl v215, v214
    v217 = ishl v216, v214
    v218 = ishl v217, v214
    v219 = ishl v218, v214
    v220 = ishl v219, v214
    v221 = ishl v220, v214
    v222 = ishl v221, v214
    v223 = ishl v222, v214
    v224 = ishl v223, v214
    v225 = ishl v224, v214
    v226 = ishl v225, v214

    ;; --- foo.clif:155  (selects.isle:4 aliases v242 to v226) ---
    v242 = select v4, v226, v226

    ;; --- foo.clif:186-242  (48 repeated isub(v, v) values on i64x2) ---
    v247 = isub v242, v242
    v248 = isub v247, v247
    v249 = isub v248, v248
    v250 = isub v249, v249

    ... 43 more steps ...

    v293 = isub v292, v292
    v294 = isub v293, v293
The fuzzer has two main structure:

Repeated ishl values sharing a common shift amount (v214),

Repeated isub.i64x2(v, v) (48 times).

Additional observations:

The regression was resolved after vector support in #13063 (f2807a1):

;; x - x --> 0 (rule (simplify (isub (ty_int ty) x x)) (subsume (iconst_u ty 0)))

isub.i64x2 v v collapses to 0 at the root.

Under 1578325 (Rule B/C active, pre-#13063 behavior), this rule fires ~3,000,000 times on a single rule:

;; (x<<z) - (y<<z) --> (x-y)<<z (rule (simplify (isub ty (ishl ty x z) (ishl ty y z))) (ishl ty (isub ty x y) z))
Let us call Rule D.

Causality

From v_{k+1} = isub(v_k, v_k), where v_k = isub(v_{k-1}, v_{k-1}), one level of unfolding gives:
v_{k+1}
  = isub(v_k, v_k)
  = isub(isub(v_{k-1}, v_{k-1}), isub(v_{k-1}, v_{k-1}))
  = isub(v_{k-1}, v_{k-1})                    ; Rule B
Result is the earlier expression, v_k. This means that rules can return the earlier value as an equivalent value while optimizing v_{k+1}. This equivalence is represented by a Union value, unless the rewrite subsumes the original or hits the eclass-size limit.

For example:
v252 = isub(v251, v251)
v251 = isub(v250, v250)

--

v252 = isub(v251, v251)
     = isub(isub(v250, v250), isub(v250, v250))
     = isub(v250, v250)                           ; Rule B
     = v251
Rule B/C can make a later repeated isub(v, v) value equivalent to an earlier one.
(rule (simplify (isub ty (ishl ty x z) (ishl ty y z))) (ishl ty (isub ty x y) z))
Rule D matches when both operands can be viewed as ishl expressions with the same shift amount. For example,
v247 = isub(v242, v242)
     = isub(v226, v226)
     = isub(ishl(v225, v214), ishl(v225, v214))
     = ishl(isub(v225, v225), v214)              ; Rule D
B/C add the extra effect that a later value can become equivalent to an earlier repeated isub value:
v248 = isub(v247, v247)
     = isub(isub(v242, v242), isub(v242, v242))
     = isub(v242, v242)                          ; Rule B
     = v247
So later values can see both locally-created forms and forms already associated with earlier values such as v247.

Without Rule B/C, Rule D can still fire. But later repeated isub(v, v) values do not get rewritten back to earlier repeated isub values. Rule D is also bounded by the rewrite-depth limit.

With Rule B/C, later values can also be equivalent to earlier values, so Rule D gets many more valid bindings.

In short:

Rule B/C can rewrite a later isub(v, v) to a form already represented by an earlier value.

Rule D then has more equivalent forms to match against.

Limits for egraph rewriting

The existing limits are local:

REWRITE_LIMIT bounds recursive optimization.

ECLASS_ENODE_LIMIT bounds the size of one tree.

MATCHES_LIMIT truncates results from one ISLE call.

Each of the 48 isub(v, v) values is optimized separately. Rule B can rewrite a later one back to the previous isub(v, v) value, so each local rewrite can match forms recorded for earlier values.

Additional off-by-one: REWRITE_LIMIT = 5 is checked before incrementing rewrite_depth(which starts from 0), so the maximum observed depth is 6.

Evidence

Per-top-level Rule D fire counts (first few repeated isub(v, v) values):

top-level fires

v247 6

v248 6

v249 44

v251 138

This growth pattern accumulates to observed 3.3M total fires.

Trace evidence that an earlier optimized value is returned again for later values(trace-log):
1123:  optimizing inst inst132 orig result v247 gave v830
1269:  -> returned from ISLE: v248 -> [v830, v852]
1904:  -> returned from ISLE: v249 -> [v830, v852, v932, v939, v944]
v830 appearing in later return lists shows that an earlier optimized value is returned again as an equivalent value for later inputs.

Open question

Our interpretation is that this is not just a problem with one rewrite rule, but a compile-time risk in how e-graphs(and Cranelift's aegraph) expose accumulated equivalent alternatives during matching.

Does this interpretation sound reasonable, or would you think about this failure mode differently?

scenario	removed groups	median real (s)	mean real (s)	min	max	stdev	stalled	slow	vs baseline
minus_group_ab	A + B	31.11	31.196	30.96	31.52	0.228	5	5	1.03x
minus_group_ac	A + C	31.21	31.230	31.02	31.63	0.218	5	5	1.03x
minus_group_bc	B + C	0.17	0.164	0.14	0.18	0.016	0	0	0.01x
minus_group_abc	A + B + C	0.15	0.156	0.14	0.18	0.016	0	0	0.00x

top-level	fires
v247	6
v248	6
v249	44
v251	138

Last updated: Jul 29 2026 at 05:03 UTC

Stream: git-wasmtime

Topic: wasmtime / issue #13204 Cranelift: follow-up analysis of ...

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Problem

Observation

`foo.clif` summary

Causality

Example

Limits for e-graphs

Trace evidence

Open question

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:27):

Problem

Observation

`foo.clif` summary

Causality

Limits for egraph rewriting

Evidence

Open question

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:34):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:43):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:45):

Wasmtime GitHub notifications bot (May 01 2026 at 19:25):

Wasmtime GitHub notifications bot (May 15 2026 at 23:12):

Problem

Observation

`foo.clif` summary

Causality

Limits for egraph rewriting

Evidence

Open question

Stream: git-wasmtime

Topic: wasmtime / issue #13204 Cranelift: follow-up analysis of ...

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Problem

Observation

foo.clif summary

Causality

Example

Limits for e-graphs

Trace evidence

Open question

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:26):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

Wasmtime GitHub notifications bot (Apr 27 2026 at 04:59):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:27):

Problem

Observation

foo.clif summary

Causality

Limits for egraph rewriting

Evidence

Open question

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:34):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:43):

Wasmtime GitHub notifications bot (Apr 27 2026 at 06:45):

Wasmtime GitHub notifications bot (May 01 2026 at 19:25):

Wasmtime GitHub notifications bot (May 15 2026 at 23:12):

Problem

Observation

foo.clif summary

Causality

Limits for egraph rewriting

Evidence

Open question

`foo.clif` summary

`foo.clif` summary

`foo.clif` summary