alexcrichton edited PR #5709 from wasm-bandnot
to main
:
I was curious to see the effect of #5701 on the compilation of spidermonkey.wasm and much to my dismay I saw that the
andn
instruction was never generated. It turns out, however, that WebAssembly does not have a "bnot" equivalent instruction fori32
ori64
, nor a "band_not" instruction. To trigger the optimization added there I added a few egraph simplification rules to pattern match xor against -1 tobnot
and then take a sequence of ands/nots and turn that into a singleband_not
instruction.I'm not so certain that this is the best way to do this, however. There's a lot of optimization rules around just bnot and just band, but most of them don't mention the fused form of
band_not
(or other forms likebor_not
). I tried adding a simple "band_not is equivalent to (band (bnot))" rule but it ended up not generating theband_not
instruction. It seems like the best way forward might be to remove the fused instructions from clif entirely, instead pattern matching in backends for those that support fused instructions. If that's the way to go I'm happy to remove theband_not
fusing in algebraic.isle and can push on that later.
alexcrichton edited PR #5709 from wasm-bandnot
to main
:
I was curious to see the effect of #5701 on the compilation of spidermonkey.wasm and much to my dismay I saw that the
andn
instruction was never generated. It turns out, however, that WebAssembly does not have a "bnot" equivalent instruction fori32
ori64
, nor a "band_not" instruction. To trigger the optimization added there I added a few egraph simplification rules to pattern match xor against -1 tobnot
.I then additionally noticed that the lowering added in #5701 was for the clif
band_not
instruction, but this was never actually used by wasm. Additionally egraphs didn't fuseband
-of-bnot
into one instruction, so the lowering still never fired. To address this issue I've added legalizations now fromband_not
intoband
-of-bnot
in addition tobor_not
andbxor_not
. This involved updating all backends with fused lowerings to do more pattern-matching on the top-level operation instead of matching onb*_not
since those now never reach the backends.After all this I've added a
*.wat
test to assert thatbnot
shows up and at least for x86_64 the specialized lowering ofband_not
is triggered.Along the way I noticed a few things:
- I cleaned up some riscv64 bits where
band_not
unconditionally usedgen_andn
but internallygen_andn
tested for codegen features. Instead the lowering of(band ... (bnot ..))
now explicitly tests for codegen features and falls back to normal rules if the feature forandn
isn't available.- I think the
bxor_not
instruction was buggy on s390x int that it was calculating(bnot (bxor x y))
instead of(bxor x (bnot y))
. The tests got updated as a result. I'm not so certain that the remaining lowerings are correct since the instructions are calledNotXor32
, for example, but I haven't tested.
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
Tiny whitespace fix:
(rule 9 (lower (has_type ty @ (multi_lane _bits _lane) (band (bnot y) x)))
alexcrichton updated PR #5709 from wasm-bandnot
to main
.
alexcrichton merged PR #5709.
Last updated: Nov 22 2024 at 17:03 UTC