Stream: git-wasmtime

Topic: wasmtime / PR #5986 x64: Add lea-based lowering for iadd


view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:06):

alexcrichton opened PR #5986 from lea to main:

This commit adds a rule for the lowering of iadd to use lea for 32
and 64-bit addition. The theoretical benefit of lea over the add
instruction is that the lea variant can emulate a 3-operand
instruction which doesn't destructively modify on of its operands.
Additionally the lea operation can fold in other components such as
constant additions and shifts.

In practice, however, if lea is unconditionally used instead of iadd
it ends up losing 10% performance on a local meshoptimizer benchmark.
My best guess as to what's going on here is that my CPU's dedicated
units for address computation are all overloaded while the ALUs are
basically idle in a memory-intensive loop. Previously when the ALU was
used for add and the address units for stores/loads it in theory
pipelined things better (most of this is me shooting in the dark). To
prevent the performance loss here I've updated the lowering of iadd to
conditionally sometimes use lea and sometimes use add depending on
how "complicated" the Amode is. Simple ones like a + b or a + $imm
continue to use add (and its subsequent hypothetical extra mov
necessary into the result). More complicated ones like a + b + $imm or
a + b << c + $imm use lea as it can remove the need for extra
instructions. Locally at least this fixes the performance loss relative
to unconditionally using lea.

One note is that this adds an OperandSize argument to the
MInst::LoadEffectiveAddress variant to add an encoding for 32-bit
lea in addition to the preexisting 64-bit encoding.

Additionally this PR has a prior commit which is a "no functional changes intended" update to the Amode computation in the x64 backend to rely less on recursion and avoid blowing the stack at compile time for very-long-chains of the iadd instruction.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:19):

alexcrichton updated PR #5986 from lea to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:53):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:53):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:53):

fitzgen created PR review comment:

This says "higher-priority" but the actual priority given is less than the other cases. Something doesn't add up here.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 21:53):

fitzgen created PR review comment:

;; instruction to fold multiple operations into one. The actual determination

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 02:05):

alexcrichton updated PR #5986 from lea to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 02:07):

alexcrichton created PR review comment:

With a negative priority though I think this is higher than the prior two?

(I can switch to giving all positive priority too)

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 02:07):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 02:10):

alexcrichton requested fitzgen for a review on PR #5986.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 17:13):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 15 2023 at 17:50):

fitzgen merged PR #5986.


Last updated: Oct 23 2024 at 20:03 UTC