cfallin opened PR #4088 from x64-u64-constants to main:
Another isel improvement from staring at disassemblies of spidermonkey.wasm compilation results: I noticed quite a few uses of constants of the form
0xC0DD259: movabsq $-9223372036854775808,%r9
where r9 is used exactly once. This is a consequence of the "emit constants right before use" logic in the x64 backend that is meant to reduce register pressure (otherwise, we get one movabs at the top of the function and then a spill/reload).

There's a better way though: a bunch of these uses are in reg/reg instructions that could be mem/reg, so we could load directly from a constant pool. So instead of
orq %r9, %rax
after the above, we could do
orq OFFSET(%rip), %rax
This avoids the need to use a temporary register, reducing register pressure, and also deduplicates constants (because our constant pool is deduplicated) vs. movabs-per-use-site.

This PR improves performance on meshoptimizer by 1%. Surprisingly, and frustratingly, it does not actually make SpiderMonkey faster. I am starting to realize that SpiderMonkey.wasm on the current compilation is cache-bound, and so further improvements in isel may not translate to further performance improvements (at least at the individual lowering level); we need better mid-end opts to advance further. But its hot blocks are a good source of examples of awkward codegen regardless :-)

Builds on top of #4080.
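
The deduplication point above can be illustrated with a small sketch. This is not Cranelift's actual constant pool: `ConstPool`, `offset_of`, and `size_bytes` are hypothetical names chosen for the example, and the pool is assumed to hold only 8-byte entries. It shows that a value requested from several use sites occupies a single slot, where a movabs per use site would embed the immediate each time.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified constant pool holding 64-bit constants.
/// (An illustration only; not the data structure used by the x64 backend.)
#[derive(Default)]
struct ConstPool {
    /// Pool contents in emission order, one 8-byte slot per entry.
    entries: Vec<u64>,
    /// Maps a constant value to its index in `entries`.
    index: HashMap<u64, usize>,
}

impl ConstPool {
    /// Returns the byte offset of `value` within the pool,
    /// appending a new slot only if the value is not already present.
    fn offset_of(&mut self, value: u64) -> usize {
        let next = self.entries.len();
        let idx = *self.index.entry(value).or_insert(next);
        if idx == next {
            // The value was not in the pool before this call.
            self.entries.push(value);
        }
        idx * 8
    }

    /// Total size of the pool in bytes.
    fn size_bytes(&self) -> usize {
        self.entries.len() * 8
    }
}
```

Requesting 0x8000000000000000 (the bit pattern of the -9223372036854775808 immediate above) from two different use sites returns the same offset both times, so the pool grows by only one slot.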
Also fixes an emission bug in pinsrd where RIP-relative addresses were off by one (due to an extra byte at the end of the instruction, not accounted for in the relocation). We'll never hit this currently because no lowerings use pinsrd with RIP-relative data (only used for the constant pool), but with this PR it's now relevant.
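
The off-by-one described above follows from how x86-64 resolves RIP-relative operands: the 32-bit displacement is added to the address of the next instruction, so any bytes encoded after the displacement field (such as the 8-bit immediate of pinsrd) must be included in the calculation. The sketch below is an illustration under that assumption; `rip_relative_disp` and `effective_addr` are hypothetical helpers, not functions from the backend.

```rust
/// Computes the disp32 for a RIP-relative operand.
/// `disp_field_addr` is the address of the 4-byte displacement field and
/// `trailing_bytes` is the number of instruction bytes after that field
/// (for example, 1 for an instruction ending in an 8-bit immediate).
/// Hypothetical helper for illustration.
fn rip_relative_disp(disp_field_addr: u64, trailing_bytes: u64, target_addr: u64) -> i32 {
    // RIP-relative addressing is relative to the end of the instruction.
    let next_inst = disp_field_addr + 4 + trailing_bytes;
    let delta = target_addr as i64 - next_inst as i64;
    assert!(delta >= i32::MIN as i64 && delta <= i32::MAX as i64);
    delta as i32
}

/// Address the CPU actually accesses for a given encoded displacement.
fn effective_addr(disp_field_addr: u64, trailing_bytes: u64, disp: i32) -> u64 {
    let next_inst = disp_field_addr + 4 + trailing_bytes;
    (next_inst as i64 + disp as i64) as u64
}
```

If the displacement is computed as though no bytes followed it (`trailing_bytes = 0`) while the instruction really ends with a one-byte immediate, the access lands one byte past the intended constant.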
cfallin updated PR #4088 from x64-u64-constants to main.
cfallin has marked PR #4088 as ready for review.
cfallin updated PR #4088 from x64-u64-constants to main.
fitzgen submitted PR review.
cfallin merged PR #4088.
Last updated: Dec 23 2024 at 12:05 UTC