afonso360 opened PR #6843 from afonso360:opt-select
to bytecodealliance:main
:
:wave: Hey,
I was playing around with a project that uses cranelift as it's backend, and found some IR patterns where we could do better optimizations.
This PR adds a few optimizations:
We try to remove
{u,s}extend
andireduce
from the controlling input ofselect
. In many cases we can avoid them sinceselect
supports pretty much any input type.When we have
(select c (iconst -1) (iconst 0))
we can remove theselect
if we know that the controlling value already generates a-1
or0
. I've implemented this foricmp
andfcmp
* I wanted to add a base rule that would turn any other value intobmask
, but it ended up messing with theicmp
andfcmp
rules, and I'm not sure how to deal with that.Delete consecutive
bmask
's- Delete consecutive
bswap
'sI've ran sightglass on these and it shows no difference. I've also fuzzed this change for a couple of hours and it also didn't trigger anything.
afonso360 requested abrown for a review on PR #6843.
afonso360 requested wasmtime-compiler-reviewers for a review on PR #6843.
jameysharp submitted PR review:
This has some overlap with my PR #6214. I haven't gotten back to that in a while and I'm glad you're picking this up! Although you've marked this a draft, I got excited and reviewed it anyway since I'd reasoned through some of this already.
jameysharp submitted PR review:
This has some overlap with my PR #6214. I haven't gotten back to that in a while and I'm glad you're picking this up! Although you've marked this a draft, I got excited and reviewed it anyway since I'd reasoned through some of this already.
jameysharp created PR review comment:
I think these rules can be simpler.
Both
sextend
anduextend
preserve whether their argument is non-zero, so removing the extend from select-of-extend is valid regardless of whether the condition came from fcmp/icmp or something else. In my PR I noticed thatineg
andbmask
also preserve whether the argument is non-zero and can be eliminated inside select for the same reason.That's not true of
ireduce
, but you can use a different rule there: I believe it's safe to eliminate eitherireduce
orsextend
(but notuextend
) when they're applied tobmask
. At least according tobmask
's definition, it can produce any integer type regardless of the type of its argument. Hopefully the backends implement that faithfully.
jameysharp created PR review comment:
I think
(select c -1 0)
can be rewritten to(bmask c)
, regardless of the widths of eitherc
or the result type, and even ifc
doesn't matchicmp
orfcmp
. Then all the other rules we're discussing that matchbmask
can apply.I'm not sure if it's useful, but I guess the other pattern we could extract from these rules is that when
bmask
is applied to eithericmp
orfcmp
, it's a no-op if the result type isI8
, or it's equivalent tosextend
for any wider scalar type.I always forget that
icmp
andfcmp
returnI8
when given any width of scalar. That's kinda weird and had me very confused about these rules. Maybe we should change that someday.There's a minor efficiency improvement if you reuse unchanged nodes from the input. For example, bind the
fcmp
to a name using an at-pattern (cmp @ (fcmp fty cc x y)
) and then use that name instead of the full pattern on the right-hand side. This avoids having to look up the original node in the GVN map, since we already have the node ID from doing the pattern match. In theory, ISLE could do this transformation automatically, as long as we guarantee that the extractor and constructor for the same term are always inverses of each other.
afonso360 created PR review comment:
Separating this into
select->bmask
andbmask+icmp->icmp
is a really neat way of simplifying these optimizations! Thanks!
afonso360 submitted PR review.
afonso360 updated PR #6843.
afonso360 updated PR #6843.
jameysharp requested jameysharp for a review on PR #6843.
afonso360 updated PR #6843.
afonso360 updated PR #6843.
afonso360 updated PR #6843.
afonso360 has marked PR #6843 as ready for review.
afonso360 updated PR #6843.
jameysharp submitted PR review:
Looks great and I'm hopeful that CI will pass this time. :grin:
I'd love to see some Sightglass numbers from this. (Someday I will remember how to use /bench_x64; is today that day?) With the wide range of cases that you've covered now, I imagine these patterns might fire fairly often, and it would be exciting if this made a measurable difference.
As future work, I think there might be value in a corresponding
falsy
helper that can be used to swap arguments toselect
. If nothing else, there's a dedicated wasm instruction that corresponds to(eq ty v 0)
so I bet we'd see that match frequently.(rule (simplify (select ty v t f)) (if-let c (falsy v)) (select ty c f t))
jameysharp created PR review comment:
It just occurred to me that "left" and "right" are not very helpful descriptions of the operands to
select
. How about this?(rule (simplify (select ty v t f)) (if-let c (truthy v)) (select ty c t f))
jameysharp submitted PR review:
Looks great and I'm hopeful that CI will pass this time. :grin:
I'd love to see some Sightglass numbers from this. (Someday I will remember how to use /bench_x64; is today that day?) With the wide range of cases that you've covered now, I imagine these patterns might fire fairly often, and it would be exciting if this made a measurable difference.
As future work, I think there might be value in a corresponding
falsy
helper that can be used to swap arguments toselect
. If nothing else, there's a dedicated wasm instruction that corresponds to(eq ty v 0)
so I bet we'd see that match frequently.(rule (simplify (select ty v t f)) (if-let c (falsy v)) (select ty c f t))
jameysharp created PR review comment:
Nice job spotting
(ne v 0)
as also truthy!I'm having a little trouble following the constraints on this rule. I think what you're doing is that the operands to
icmp
must both have the same type, so iftruthy
unwrapsv
to a different type then the rule doesn't apply. Is that right?Contrary to what I said before about re-using the existing
iconst
, I think this rule will be better if we reconstruct the constant at whatever typec
has. In the case where the types do match, we'll go through the GVN map and discover that the newiconst
is identical to the old one, which is a little silly; but the rule will also succeed when the types don't match, making it more useful.(rule (simplify (ne cty v (iconst _ (u64_from_imm64 0)))) (if-let c (truthy v)) (if-let (value_type ty) c) (ne cty c (iconst ty (u64_from_imm64 0))))
That said, I believe this rule is also correct the way you wrote it; feel free to merge as-is.
In either case, it would be nice to add a comment, either saying that we're constraining
value_type
to be equal to the type ofzero
, or saying that we're extracting the type ofc
to make sure that the zero has the same type.
afonso360 updated PR #6843.
afonso360 updated PR #6843.
afonso360 submitted PR review.
afonso360 created PR review comment:
Oh, that's really neat! I don't know how I didn't spot that.
afonso360 edited PR review comment.
afonso360 merged PR #6843.
Last updated: Dec 23 2024 at 12:05 UTC