cfallin opened PR #4080 from x64-amodes to main:
This draft PR modifies the x64 backend's addressing-mode lowering to use both scaled indexing, e.g. [rax+rbx*4], and to consolidate offsets, so x+y+8+12 can become [rax+rbx+20].
Unfortunately this doesn't yet appear to be quite enough to improve on this really annoying sequence that occurs frequently in SpiderMonkey.wasm's hot blocks:
0xC064911: movq %rsi,%rdx
0xC064914: addl $3, %edx
0xC064917: movzbq 0(%r9,%rdx),%rdx
the issue being that edx+3, as a 32-bit add, can wrap around and so this is not quite equivalent to 3(%r9, %rsi) (assuming we could prove rsi is zero-extended). @abrown or others, if you have ideas on a better lowering here, I'd be happy to hear them!
This PR builds on top of #4072, #4078, and #4079; only the last commit is new.
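The wrap-around concern in the PR description can be made concrete with a small standalone sketch. It is not Cranelift code: the function names `addr_with_32bit_add` and `addr_folded` are hypothetical, and the sketch assumes, as the description does, that rsi already holds a zero-extended 32-bit value. It models the emitted three-instruction sequence and the hoped-for folded form as plain integer arithmetic and shows an input on which they disagree.

```rust
/// Effective address computed by the sequence from the PR description:
///   movq %rsi,%rdx ; addl $3,%edx ; movzbq 0(%r9,%rdx),%rdx
/// The `addl` is a 32-bit add, and its result is zero-extended into %rdx.
fn addr_with_32bit_add(r9: u64, rsi: u64) -> u64 {
    let edx = (rsi as u32).wrapping_add(3); // 32-bit add, wraps modulo 2^32
    r9.wrapping_add(edx as u64) // zero-extend, then add the base
}

/// Effective address of the folded form `3(%r9,%rsi)`, where the
/// whole sum is evaluated in 64 bits (hypothetical lowering).
fn addr_folded(r9: u64, rsi: u64) -> u64 {
    r9.wrapping_add(rsi).wrapping_add(3)
}

fn main() {
    let base = 0x1000_0000u64;

    // For a small index the two forms agree.
    assert_eq!(addr_with_32bit_add(base, 16), addr_folded(base, 16));

    // Near the top of the 32-bit range the 32-bit add wraps,
    // so the two forms compute different addresses.
    let idx = 0xFFFF_FFFEu64;
    println!("32-bit add: {:#x}", addr_with_32bit_add(base, idx)); // 0x10000001
    println!("folded:     {:#x}", addr_folded(base, idx)); // 0x110000001
}
```

The folded form is therefore only a valid replacement if the lowering can also prove that the 32-bit add does not overflow, which is a separate fact from rsi being zero-extended.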
cfallin updated PR #4080 from x64-amodes to main.
cfallin updated PR #4080 from x64-amodes to main.
iximeow created PR review comment:
not having to change this makes me seriously question something, somewhere. the 67 prefix should have made this sequence decode as lea 48(%r12d,%r13d,4), %eax?
iximeow submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, this is our internal pretty-printing and doesn't go through any external (dis)assembler, so it isn't necessarily guaranteed to have the correct register names per the operand size. Or at least for this quick experiment/hack I didn't bother to hack the EA prettyprinting to take an OperandSize.
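As a rough illustration of what size-aware pretty-printing of an effective address might look like, here is a minimal standalone sketch. It is not the Cranelift implementation: the `OperandSize` enum below is a two-variant stand-in for the real type, and `gpr_name` and `amode_to_string` are hypothetical helpers that take raw hardware register encodings (0 to 15).

```rust
/// Stand-in for an operand-size type; only the two sizes relevant
/// to address computation are modeled here (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OperandSize {
    Size32,
    Size64,
}

/// AT&T-style name of a general-purpose register, given its hardware
/// encoding and the size at which it is being used.
fn gpr_name(enc: u8, size: OperandSize) -> String {
    const NAMES64: [&str; 8] = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"];
    const NAMES32: [&str; 8] = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"];
    match (enc, size) {
        (0..=7, OperandSize::Size64) => format!("%{}", NAMES64[enc as usize]),
        (0..=7, OperandSize::Size32) => format!("%{}", NAMES32[enc as usize]),
        (8..=15, OperandSize::Size64) => format!("%r{}", enc),
        (8..=15, OperandSize::Size32) => format!("%r{}d", enc),
        _ => panic!("invalid GPR encoding {}", enc),
    }
}

/// Prints `offset(base,index,scale)` with register names chosen
/// according to the address size.
fn amode_to_string(offset: i32, base: u8, index: u8, scale: u8, size: OperandSize) -> String {
    format!(
        "{}({},{},{})",
        offset,
        gpr_name(base, size),
        gpr_name(index, size),
        scale
    )
}

fn main() {
    // r12 has encoding 12 and r13 has encoding 13.
    println!("{}", amode_to_string(48, 12, 13, 4, OperandSize::Size64)); // 48(%r12,%r13,4)
    println!("{}", amode_to_string(48, 12, 13, 4, OperandSize::Size32)); // 48(%r12d,%r13d,4)
}
```

With a 32-bit address size, the same amode prints with the `d`-suffixed register names that the review comment expected to see.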
iximeow submitted PR review.
iximeow created PR review comment:
oh! i have a ~bridge~ disassembler to sell you, potentially :)
cfallin updated PR #4080 from x64-amodes to main.
cfallin updated PR #4080 from x64-amodes to main.
cfallin updated PR #4080 from x64-amodes to main.
cfallin updated PR #4080 from x64-amodes to main.
cfallin has marked PR #4080 as ready for review.
Last updated: Nov 22 2024 at 17:03 UTC