alexcrichton opened PR #6045 from more-extend-elision to main:
I've confirmed locally that `pextr{b,w,d}` all zero the upper bits of the full 64-bit register, which means that the zero-extend of an `extractlane` operation can be elided in more cases, including 8-to-64-bit casts as well as 32-to-64.

This helps elide a few extra `mov`s in a loop I was looking at and gave a modest corresponding increase in performance (my guess is that this comes mostly from the slightly decreased code size rather than from the removed `mov`s themselves).
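To illustrate why the extend is redundant, here is a minimal sketch in plain Rust. It is not Cranelift code and not the lowering rule changed in this PR; `pextrb_model`, `pextrd_model`, and `uextend_from` are hypothetical helpers that model the behavior described above, assuming the extract instructions leave every bit above the lane width zeroed in the 64-bit destination.

```rust
// Hypothetical model of `pextrb`: the selected 8-bit lane lands in the low
// bits of a 64-bit register and all higher bits are zero (assumption taken
// from the PR description).
fn pextrb_model(lanes: [u8; 16], lane: usize) -> u64 {
    u64::from(lanes[lane % 16])
}

// Same idea for `pextrd` with 32-bit lanes.
fn pextrd_model(lanes: [u32; 4], lane: usize) -> u64 {
    u64::from(lanes[lane % 4])
}

// Model of a zero-extend from `bits` wide to 64 bits: keep the low `bits`
// bits and clear the rest. `bits` must be in 1..=64.
fn uextend_from(bits: u32, reg: u64) -> u64 {
    reg & (u64::MAX >> (64 - bits))
}

fn main() {
    let bytes = [0xffu8; 16];
    let dwords = [0xdead_beefu32; 4];

    // The extend leaves the extracted value unchanged, so it can be dropped.
    for lane in 0..16 {
        let r = pextrb_model(bytes, lane);
        assert_eq!(uextend_from(8, r), r);
    }
    for lane in 0..4 {
        let r = pextrd_model(dwords, lane);
        assert_eq!(uextend_from(32, r), r);
    }
    println!("zero-extends after extract are no-ops in this model");
}
```

In this model the extend is the identity on every extracted value, which is the property that lets the extra `mov` be skipped.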
abrown submitted PR review.
abrown created PR review comment:
Not sure you wanted to check this one in.
alexcrichton updated PR #6045 from more-extend-elision to main.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Whoops, indeed!
alexcrichton has enabled auto merge for PR #6045.
alexcrichton merged PR #6045.
Last updated: Nov 22 2024 at 16:03 UTC