alexcrichton opened PR #5895 from `fmls` to `main`:
This commit adds lowerings to the AArch64 backend for the `fmls` instruction which is intended to be leveraged in the relaxed-simd proposal for WebAssembly. This should hopefully allow for a teeny-bit-more efficient codegen for this operator instead of using the `fmla` instruction plus a negation instruction.
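As a rough illustration of the arithmetic involved (not the actual Cranelift lowering), the scalar sketch below models `fmla` as a fused `acc + x * y` and `fmls` as a fused `acc - x * y` using Rust's `f32::mul_add`. The function names are hypothetical and exist only for this example; the real instructions operate on vector lanes.

```rust
/// Hypothetical scalar model of `fmla`: acc + x * y with a single rounding.
fn fmla_model(acc: f32, x: f32, y: f32) -> f32 {
    x.mul_add(y, acc)
}

/// Hypothetical scalar model of `fmls`: acc - x * y with a single rounding.
fn fmls_model(acc: f32, x: f32, y: f32) -> f32 {
    (-x).mul_add(y, acc)
}

/// The two-instruction sequence the PR description mentions:
/// a separate negation followed by `fmla`.
fn fneg_then_fmla(acc: f32, x: f32, y: f32) -> f32 {
    let neg_x = -x; // stands in for the extra negation instruction
    fmla_model(acc, neg_x, y)
}
```

Under this model both paths compute the same value, so the benefit of selecting `fmls` is one fewer instruction rather than a different result.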
jameysharp submitted PR review.
jameysharp created PR review comment:
I suppose if both `x` and `y` are `fneg` then this can emit `fmla` instead of `fneg` + `fmls`, right? But I guess that's a rewrite we ought to do in the egraph optimizations instead.
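A minimal sketch of the rewrite being suggested, assuming a toy expression tree rather than Cranelift's actual IR or its ISLE rule syntax. The `Expr` type and both function names are hypothetical. The rewrite relies on the identity `(-x) * (-y) + z == x * y + z`, which holds for fused multiply-add because negation is exact and the two sign flips cancel in the product.

```rust
/// Hypothetical toy expression tree; not Cranelift's real IR types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(&'static str),
    Fneg(Box<Expr>),
    /// Fused multiply-add: x * y + z.
    Fma(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Rewrite fma(fneg(a), fneg(b), z) into fma(a, b, z).
/// Any other expression is returned unchanged.
fn simplify_double_neg(e: Expr) -> Expr {
    match e {
        Expr::Fma(x, y, z) => match (*x, *y) {
            // Both multiplicands are negated: the negations cancel.
            (Expr::Fneg(a), Expr::Fneg(b)) => Expr::Fma(a, b, z),
            // Otherwise rebuild the node as it was.
            (x, y) => Expr::Fma(Box::new(x), Box::new(y), z),
        },
        other => other,
    }
}

/// Numeric spot check of the identity behind the rewrite.
fn identity_holds(x: f32, y: f32, z: f32) -> bool {
    (-x).mul_add(-y, z) == x.mul_add(y, z)
}
```

With a rewrite like this applied before instruction selection, the backend would see a plain `fma` and could pick `fmla` without a dedicated lowering rule for the doubly negated case.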
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Indeed! The x64 rules actually end up implementing that (they enable sort of switching back and forth given their structure) but it wasn't as obvious to do here - x64 uses a helper that manages sinking a load as well which adds a fair number of permutations.
I'll send a follow-up which implements the egraph optimization.
alexcrichton merged PR #5895.
Last updated: Nov 22 2024 at 16:03 UTC