julian-seward1 opened PR #2308 from arm64-simd-bitmask
to main
:
The
bitmask.{8x16,16x8,32x4}
instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than it would be for (eg) x64.Main changes are:
the relevant testsuite test (
simd_boolean.wast
) has been enabled.at the CLIF level, add a new instruction
vhigh_bits
, into which these wasm
instructions are to be translated.in the wasm->CLIF translation (code_translator.rs), translate into
vhigh_bits
. This is straightforward.in the CLIF->AArch64 translation (lower_inst.rs), translate
vhigh_bits
into equivalent sequences of AArch64 instructions. There is a different
sequence for each of the{8x16, 16x8, 32x4}
variants.All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:
Add two new families of AArch64 instructions:
VecShiftImm
(vector shift by
immediate) andVecExtract
(effectively a double-length vector shift)To the existing AArch64 family
VecRRR
, add azip1
variant. To the
VecLanesOp
family add anaddv
variant.Add supporting code for the above changes to AArch64 instructions:
- getting the register uses (
aarch64_get_regs
)- mapping the registers (
aarch64_map_regs
)- printing instructions
emitting instructions (
impl MachInstEmit for Inst
). The handling of
VecShiftImm
is a bit complex.emission tests for new instructions and variants.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
julian-seward1 updated PR #2308 from arm64-simd-bitmask
to main
:
The
bitmask.{8x16,16x8,32x4}
instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than it would be for (eg) x64.Main changes are:
the relevant testsuite test (
simd_boolean.wast
) has been enabled.at the CLIF level, add a new instruction
vhigh_bits
, into which these wasm
instructions are to be translated.in the wasm->CLIF translation (code_translator.rs), translate into
vhigh_bits
. This is straightforward.in the CLIF->AArch64 translation (lower_inst.rs), translate
vhigh_bits
into equivalent sequences of AArch64 instructions. There is a different
sequence for each of the{8x16, 16x8, 32x4}
variants.All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:
Add two new families of AArch64 instructions:
VecShiftImm
(vector shift by
immediate) andVecExtract
(effectively a double-length vector shift)To the existing AArch64 family
VecRRR
, add azip1
variant. To the
VecLanesOp
family add anaddv
variant.Add supporting code for the above changes to AArch64 instructions:
- getting the register uses (
aarch64_get_regs
)- mapping the registers (
aarch64_map_regs
)- printing instructions
emitting instructions (
impl MachInstEmit for Inst
). The handling of
VecShiftImm
is a bit complex.emission tests for new instructions and variants.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
julian-seward1 requested yurydelendik for a review on PR #2308.
akirilov-arm submitted PR Review.
akirilov-arm submitted PR Review.
akirilov-arm created PR Review Comment:
Please, use
lower_constant_f128()
.lower_splat_const()
would be even better, but it won't be available until #2310 is merged.
akirilov-arm created PR Review Comment:
Same comment about
Inst::MovToFpu
.
akirilov-arm created PR Review Comment:
This should be just
Inst::MovToFpu
. I still think that the constant should be handled inlower_constant_f128()
- we can add a special case for it, since we know that it is always going to be used for bitmask extraction, but we can do that later.
julian-seward1 updated PR #2308 from arm64-simd-bitmask
to main
:
The
bitmask.{8x16,16x8,32x4}
instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than it would be for (eg) x64.Main changes are:
the relevant testsuite test (
simd_boolean.wast
) has been enabled.at the CLIF level, add a new instruction
vhigh_bits
, into which these wasm
instructions are to be translated.in the wasm->CLIF translation (code_translator.rs), translate into
vhigh_bits
. This is straightforward.in the CLIF->AArch64 translation (lower_inst.rs), translate
vhigh_bits
into equivalent sequences of AArch64 instructions. There is a different
sequence for each of the{8x16, 16x8, 32x4}
variants.All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:
Add two new families of AArch64 instructions:
VecShiftImm
(vector shift by
immediate) andVecExtract
(effectively a double-length vector shift)To the existing AArch64 family
VecRRR
, add azip1
variant. To the
VecLanesOp
family add anaddv
variant.Add supporting code for the above changes to AArch64 instructions:
- getting the register uses (
aarch64_get_regs
)- mapping the registers (
aarch64_map_regs
)- printing instructions
emitting instructions (
impl MachInstEmit for Inst
). The handling of
VecShiftImm
is a bit complex.emission tests for new instructions and variants.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
The code sequences here were chosen in part so as to be as close as feasible to what the SM wasm Baseline compiler generates. I would be happy to use the improved constant-generation facilities as provided by #2310, but would prefer to make that change only once it is landed and stable.
akirilov-arm submitted PR Review.
akirilov-arm created PR Review Comment:
FYI in #2310
lower_constant_f128()
generates exactly the same sequence.
akirilov-arm submitted PR Review.
akirilov-arm submitted PR Review.
akirilov-arm created PR Review Comment:
Shouldn't this be
TxN
?Any128
could be a scalar if I am not mistaken.
akirilov-arm created PR Review Comment:
Similarly here - value of
imm4
?
akirilov-arm created PR Review Comment:
Perhaps add the values of
size
,is_shr
, andimm
to the message? It should make it easier to diagnose panics.
akirilov-arm submitted PR Review.
yurydelendik submitted PR Review.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
I added a FIXME note in the
Opcode::VhighBits
lowering rule that refers to #2310. That comment also already refers to #2296.
julian-seward1 updated PR #2308 from arm64-simd-bitmask
to main
:
The
bitmask.{8x16,16x8,32x4}
instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than it would be for (eg) x64.Main changes are:
the relevant testsuite test (
simd_boolean.wast
) has been enabled.at the CLIF level, add a new instruction
vhigh_bits
, into which these wasm
instructions are to be translated.in the wasm->CLIF translation (code_translator.rs), translate into
vhigh_bits
. This is straightforward.in the CLIF->AArch64 translation (lower_inst.rs), translate
vhigh_bits
into equivalent sequences of AArch64 instructions. There is a different
sequence for each of the{8x16, 16x8, 32x4}
variants.All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:
Add two new families of AArch64 instructions:
VecShiftImm
(vector shift by
immediate) andVecExtract
(effectively a double-length vector shift)To the existing AArch64 family
VecRRR
, add azip1
variant. To the
VecLanesOp
family add anaddv
variant.Add supporting code for the above changes to AArch64 instructions:
- getting the register uses (
aarch64_get_regs
)- mapping the registers (
aarch64_map_regs
)- printing instructions
emitting instructions (
impl MachInstEmit for Inst
). The handling of
VecShiftImm
is a bit complex.emission tests for new instructions and variants.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
julian-seward1 merged PR #2308.
Last updated: Dec 23 2024 at 12:05 UTC