alexcrichton opened PR #5819 from more-avx
to main
:
This commit goes through
inst.isle
and adds a corresponding AVX lowering for most SSE lowerings. I opted to skip instructions where the SSE lowering didn't read/modify a register, such asroundps
. I think that AVX will benefit these instructions when there's load-merging since AVX doesn't require alignment, but I've deferred that work to a future PR.Otherwise though in this PR I think all (or almost all) of the 3-operand forms of AVX instructions are supported with their SSE counterparts. This should ideally improve codegen slightly by removing register pressure and the need for
movdqa
between registers. I've attempted to ensure that there's at least one codegen test for all the new instructions.As a side note, the recent capstone integration into
precise-output
tests helped me catch a number of encoding bugs much earlier than otherwise, so I've found that incredibly useful in tests!<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
I'm not wild about the argument-dependent regalloc metadata shape here, and ideally I'd prefer us to split out variants like
XmmRmiRVexConst
that are used instead whenever the intent of an instruction is to produce a constant 0/1/whatever, and that assert the correct setup for that; but that's a separate refactor you don't necessarily need to do in this PR. Just noting as I see the existing pattern expanding...
cfallin created PR review comment:
Given that, could we split this into two enum variants, one with
XmmMem
and one withGprMem
, to keep the type safety?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sure, I was following this precedent but I can split out too.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sure yeah, I was just following the preexisting pattern for the xmm instructions, but I can try to update those too in a subsequent PR.
alexcrichton updated PR #5819 from more-avx
to main
.
alexcrichton updated PR #5819 from more-avx
to main
.
cfallin submitted PR review.
alexcrichton updated PR #5819 from more-avx
to main
.
alexcrichton merged PR #5819.
Last updated: Jan 24 2025 at 00:11 UTC