scottmcm opened PR #10489 from scottmcm:eq-add
to bytecodealliance:main
:
Noticed these looking at niched discriminant calculations, because
discr(a) == discr(b)
sometimes expands tofrobble(a) + OFFSET == frobble(b) + OFFSET
, where it's quite helpful to simplify tofrobble(a) == frobble(b)
instead.
scottmcm requested abrown for a review on PR #10489.
scottmcm requested wasmtime-compiler-reviewers for a review on PR #10489.
scottmcm requested fitzgen for a review on PR #10489.
scottmcm updated PR #10489.
scottmcm requested wasmtime-core-reviewers for a review on PR #10489.
scottmcm submitted PR review.
scottmcm created PR review comment:
A case in the wild, here: was
a + -1 == 0
, nowa == 1
.
scottmcm submitted PR review.
scottmcm created PR review comment:
Remind me: is this the kind of place where it makes sense to
subsume
so it stops looking at furtheriadd
/isub
rules?
scottmcm updated PR #10489.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, in this case it seems reasonable IMHO -- we're completely removing an op; it feels similar to cprop, where we also subsume because we have unambiguous improvements.
scottmcm updated PR #10489.
cfallin submitted PR review:
Nice, thanks!
cfallin merged PR #10489.
fitzgen submitted PR review.
fitzgen created PR review comment:
What if we could apply not only this rule, but also cprop the whole thing away? In that case, we would get two rewritten values marked as subsume, and would choose between then arbitrarily, but we would always want the constant right? Should we only be using subsume for constant results? Remembering the rules around subsumption is always frought for me...
cfallin submitted PR review.
cfallin created PR review comment:
"Nested subsumes" will actually work out fine I think, with the new implementation I merged last week: at each level there is only one value returned that is subsumed. Another way to think of it is that the outer subsume takes an argument that comes from the inner rewrite's result, which is (because of the inner subsume) the one constant value, so that constant value flows all the way outward.
In general I think "subsume confusion" is a result of our earlier more ad-hoc approach (as soon as we see it, stop); part of the goal of the rewrite was to make this simpler and more understandable, to the point that subsume should mean "return this one result from this level of rewrite" and that's it; so we shouldn't be afraid of using it IMHO.
fitzgen submitted PR review.
fitzgen created PR review comment:
And this rule and cprop can never both happen at the same level, because cprop uses
subsume
?That is, in this example
v0 = { iconst 2; isub 5, 3 } v1 = { iconst 3; isub 6, 3 } v2 = icmp eq v0, v1
we would hit the issue that both cprop and this rule would match and
subsume
, but we can't actually get that example because thev0
andv1
eclasses would never contain both theiconst
s and theisub
s because cprop alwayssubsume
s?That's pretty subtle.
cfallin submitted PR review.
cfallin created PR review comment:
Right, yeah; I guess the important points here are:
- subsume is purely heuristic; it's always safe to use, it will never result in a miscompilation; and
- it's actually important to use it when we can, i.e. when unambiguously better rewrites exist, because it results in better canonicalization;
- usage of subsume compounds goodness, in the sense that eliminating unnecessary representations in args can eliminate rule firings later that rewrite that (unambiguously worse) representation.
In other words, if we did have the example above, there wouldn't be anything wrong -- we just see two different rules claiming they have the best rewrite, so we take both, and we have not as much canonicalization. (Cost-based extraction will still pick the constant in the end.)
So I still think the advice is generally "use it whenever rewrite result is unambiguously better", and trust the framework -- in other words, I would hope we don't have to game out all of these possible outcomes when considering whether to use the feature.
cfallin submitted PR review.
cfallin created PR review comment:
(After chewing on this more to try to explain better:)
Maybe the clearest way to see "subsume" is: it is a one-level prioritization. It's fine if multiple rules match and put results at this higher priority. We get the most goodness if we're consistent in the use of the priority, so we trim whenever appropriate, and don't get this redundant rule application. But this isn't correctness-bearing at all. Rule authors shouldn't need to worry about it -- the sublety you point out in the example is only "subtle" in the sense that the prioritizations compound and we get a good answer out at the end, but this doesn't need to be visible sublety. Regalloc is also really subtle and can interact in weird ways with IR shapes and various choices we make, but we don't worry about it because it "just works" -- should be the same here.
(I'm trying hard to defend "subsume" and see it used where appropriate because it's important for good results; I think giving up on it is a mistake and so I want to quash any folkloric "maybe don't use it" ideas that float around; and I do think there is latent confusion from our earlier approaches that is still haunting us that is no longer applicable!)
scottmcm submitted PR review.
scottmcm created PR review comment:
Hmm, and I think that if we do end up in that scenario and subsume from
1 + 3 == 2 + 3
to1 == 2
, that's ok because we'll still then apply the const-comparison rule to that and get thefalse
in the end anyway.That said, @cfallin, if subsume isn't an "almost never" thing, I wonder if it'd be good to separate out -- probably just with different macros, or something, not actually changing the behaviours or the rule runner -- the "this is an equivalent structure" rules from the "this is a better structure" rules.
While as I understand things it doesn't work this way, from your explanation it makes me think of like → vs ⇌ in chemistry, for whether things are one-way or equilibria. It feels like what you're describing is that we want both
a + 0 → a
rules and2 × a ⇌ a + a
rules, with actually rather more → rules now than used to be recommended.
cfallin submitted PR review.
cfallin created PR review comment:
Sure, one option would be to build two separate top-level entry points:
simplify
vsequivalent
or something. "Simplify" (our current word choice) actually to me carries the implication of directionality, while the "not sure, either could work, keep both" semantics associated with equality saturation is the latter. That's isomorphic to what we've done withsubsume
, maybe cleaner in some sense (conveyed via placement in a decision tree rather than a side-effect to a side-list). I don't feel too strongly about it either way...
Last updated: Apr 19 2025 at 20:03 UTC