alexcrichton opened PR #5982 from pextr
to main
:
This commit improves adds support to Cranelift to emit
pextr{b,w,d,q}
with a memory destination, merging a store-of-extract operation into one instruction. Additionally AVX support is added for thepextr*
instructions.I've additionally tried to ensure that codegen tests and runtests exist for all forms of these instructions too.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
alexcrichton requested abrown for a review on PR #5982.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
This is
0b11101110
... is that right?
abrown created PR review comment:
Looking at the other cases, I started to worry that one of these instructions actually zeroes bits we do not want to zero. I think it is this case; from the
MOVSD
documentation:Legacy version: When the source and destination operands are XMM registers, bits MAXVL:64 of the destination operand remains unchanged. When the source operand is a memory location and destination operand is an XMM registers, the quadword at bits 127:64 of the destination operand is cleared to all 0s, bits MAXVL:128 of the destination operand remains unchanged.
VEX and EVEX encoded register-register syntax: Moves a scalar double precision floating-point value from the second source operand (the third operand) to the low quadword element of the destination operand (the first operand). Bits 127:64 of the destination operand are copied from the first source operand (the second operand). Bits (MAXVL-1:128) of the corresponding destination register are zeroed
So it is the legacy SSE case where a
movsd 0(%rdi) ....
would zero too many bits but actually in the AVX code we could merge these twovmovsd
into one. Is there a way to do that?
abrown created PR review comment:
Missing comma after
$1
?
abrown created PR review comment:
I checked and
u8_from_uimm8
does not guarantee thatn
is in the right range (e.g., 0 to 15 here). Did you see any validation that would prevent wrongn
here at the CLIF level or do we rely exclusively on the Wasm-level construction? If there is nothing and it ends up being too tricky to do here in ISLE, maybeemit.rs
could gain someassert!
s or something like that.
abrown created PR review comment:
:+1:
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I think this is the block in the verifier which validates that lanes are always inbounds, so I think we should be good on that front. I can add some extra asserts though to the backend too.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Indeed!
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Going by this which I realize isn't official but has been what I've been using so far, my read is that
VEX.128.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3
has different semantics thanVEX.128.F2.0F 10 /r: VMOVSD xmm1, m64
so I think that AVX and SSE match here where if you use the register-to-register form it preserves the upper bits but if you use the memory-to-register form it always zeros the upper bits, so I don't think we can fuse?I'll admit though that this is all real subtle and if the official docs are different I wouldn't be too surprised.
alexcrichton updated PR #5982 from pextr
to main
.
alexcrichton updated PR #5982 from pextr
to main
.
alexcrichton merged PR #5982.
Last updated: Nov 22 2024 at 17:03 UTC