alexcrichton opened PR #5986 from lea to main:
This commit adds a rule for the lowering of `iadd` to use `lea` for 32 and 64-bit addition. The theoretical benefit of `lea` over the `add` instruction is that the `lea` variant can emulate a 3-operand instruction which doesn't destructively modify one of its operands. Additionally the `lea` operation can fold in other components such as constant additions and shifts.

In practice, however, if `lea` is unconditionally used instead of `iadd` it ends up losing 10% performance on a local `meshoptimizer` benchmark. My best guess as to what's going on here is that my CPU's dedicated units for address computation are all overloaded while the ALUs are basically idle in a memory-intensive loop. Previously, when the ALU was used for `add` and the address units for stores/loads, it in theory pipelined things better (most of this is me shooting in the dark). To prevent the performance loss here I've updated the lowering of `iadd` to use either `lea` or `add` depending on how "complicated" the `Amode` is. Simple ones like `a + b` or `a + $imm` continue to use `add` (and the subsequent hypothetical extra `mov` necessary into the result). More complicated ones like `a + b + $imm` or `a + b << c + $imm` use `lea` as it can remove the need for extra instructions. Locally at least this fixes the performance loss relative to unconditionally using `lea`.

One note is that this adds an `OperandSize` argument to the `MInst::LoadEffectiveAddress` variant to add an encoding for 32-bit `lea` in addition to the preexisting 64-bit encoding.

Additionally this PR has a prior commit which is a "no functional changes intended" update to the `Amode` computation in the x64 backend to rely less on recursion and avoid blowing the stack at compile time for very long chains of the `iadd` instruction.
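As a rough illustration of the heuristic described above, the choice between `add` and `lea` could be sketched as follows. This is not the code from the PR: the `Amode` enum here is a simplified, hypothetical stand-in for the x64 backend's real type, registers are plain `u8` numbers, and `Lowering` and `choose_lowering` are names invented for this sketch. The description does not say what happens for `a + (b << c)` with no immediate, so treating it as a `lea` case is an assumption.

```rust
/// Hypothetical, simplified addressing mode (not Cranelift's real `Amode`).
/// Registers are represented by plain numbers for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Amode {
    /// `base + offset`
    ImmReg { base: u8, offset: i32 },
    /// `base + (index << shift) + offset`
    ImmRegRegShift { base: u8, index: u8, shift: u8, offset: i32 },
}

/// Which machine instruction the sketch picks for an `iadd`.
#[derive(Debug, PartialEq, Eq)]
enum Lowering {
    Add,
    Lea,
}

/// Pick `add` for "simple" address shapes and `lea` for "complicated" ones.
fn choose_lowering(amode: Amode) -> Lowering {
    match amode {
        // `a + $imm`: a single `add` with an immediate is enough.
        Amode::ImmReg { .. } => Lowering::Add,
        // `a + b`: a plain two-register `add`.
        Amode::ImmRegRegShift { shift: 0, offset: 0, .. } => Lowering::Add,
        // `a + b + $imm`, `a + (b << c) + $imm`, and (by assumption)
        // `a + (b << c)`: `lea` folds several operations into one instruction.
        Amode::ImmRegRegShift { .. } => Lowering::Lea,
    }
}
```

With these definitions, `choose_lowering(Amode::ImmReg { base: 0, offset: 8 })` yields `Lowering::Add`, while an `ImmRegRegShift` with a nonzero shift or offset yields `Lowering::Lea`.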
alexcrichton updated PR #5986 from lea to main.
fitzgen submitted PR review.
fitzgen submitted PR review.
fitzgen created PR review comment:
This says "higher-priority" but the actual priority given is less than the other cases. Something doesn't add up here.
fitzgen created PR review comment:
;; instruction to fold multiple operations into one. The actual determination
alexcrichton updated PR #5986 from lea to main.
alexcrichton created PR review comment:
With a negative priority though I think this is higher than the prior two?
(I can switch to giving all positive priority too)
alexcrichton submitted PR review.
alexcrichton requested fitzgen for a review on PR #5986.
fitzgen submitted PR review.
fitzgen merged PR #5986.
Last updated: Dec 06 2025 at 06:05 UTC