wasmtime / PR #2308 CL/aarch64 back end: implement the wa... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #2308 CL/aarch64 back end: implement the wa...

Wasmtime GitHub notifications bot (Oct 22 2020 at 14:30):

julian-seward1 opened PR #2308 from arm64-simd-bitmask to main:

The wasm {i8x16,i16x8,i32x4}.bitmask instructions do not map neatly to any
single AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than the equivalent change would be for, e.g., x64.

Main changes are:

- The relevant testsuite test (simd_boolean.wast) has been enabled.

- At the CLIF level, add a new instruction vhigh_bits, into which these wasm
  instructions are to be translated.

- In the wasm->CLIF translation (code_translator.rs), translate into
  vhigh_bits. This is straightforward.

- In the CLIF->AArch64 translation (lower_inst.rs), translate vhigh_bits
  into equivalent sequences of AArch64 instructions. There is a different
  sequence for each of the {8x16, 16x8, 32x4} variants.

All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:

- Add two new families of AArch64 instructions: VecShiftImm (vector shift by
  immediate) and VecExtract (effectively a double-length vector shift).

- To the existing AArch64 family VecRRR, add a zip1 variant. To the
  VecLanesOp family, add an addv variant.

- Add supporting code for the above changes to AArch64 instructions:

  - getting the register uses (aarch64_get_regs)

  - mapping the registers (aarch64_map_regs)

  - printing instructions

  - emitting instructions (impl MachInstEmit for Inst). The handling of
    VecShiftImm is a bit complex.

  - emission tests for new instructions and variants.
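For readers unfamiliar with the operation, here is a rough scalar model in Rust of what vhigh_bits computes for the 8x16 case, together with one plausible way the shift, ext, zip1 and addv pieces mentioned above can be combined to get the same result. This is only an illustrative sketch and not the code from the PR: both function names are hypothetical, and the actual instruction sequence in lower_inst.rs may differ.

```rust
/// Reference semantics (hypothetical helper, not from the PR):
/// bit i of the result is the top (sign) bit of lane i.
fn vhigh_bits_8x16(lanes: [u8; 16]) -> u16 {
    let mut mask = 0u16;
    for (i, lane) in lanes.iter().enumerate() {
        mask |= u16::from(*lane >> 7) << i;
    }
    mask
}

/// Scalar emulation of an assumed SIMD-style sequence
/// (shift, and, ext, zip1, addv). Hypothetical, for illustration only.
fn vhigh_bits_8x16_simd_style(lanes: [u8; 16]) -> u16 {
    // Arithmetic shift right by 7: each lane becomes 0xFF or 0x00.
    let s = lanes.map(|b| ((b as i8) >> 7) as u8);

    // AND with per-lane weights 1, 2, 4, ..., 128 (repeated for both halves).
    let mut w = [0u8; 16];
    for i in 0..16 {
        w[i] = s[i] & (1u8 << (i % 8));
    }

    // ext #8 with both sources equal: rotate the vector by 8 bytes,
    // so the upper half lands in the lower half.
    let mut e = [0u8; 16];
    for i in 0..16 {
        e[i] = w[(i + 8) % 16];
    }

    // zip1 on bytes: interleave the low halves of w and e, giving eight
    // 16-bit lanes of the form (w[i + 8] << 8) | w[i].
    let mut z = [0u8; 16];
    for i in 0..8 {
        z[2 * i] = w[i];
        z[2 * i + 1] = e[i];
    }

    // addv over the eight 16-bit lanes: every lane contributes distinct
    // bits, so the sum equals the bitwise OR, i.e. the final mask.
    let mut sum = 0u16;
    for i in 0..8 {
        sum = sum.wrapping_add(u16::from_le_bytes([z[2 * i], z[2 * i + 1]]));
    }
    sum
}
```

The two functions should agree on every input; the second just mirrors, step by step, how a vector unit without a dedicated bitmask instruction can arrive at the same value.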

Wasmtime GitHub notifications bot (Oct 22 2020 at 15:10):

julian-seward1 updated PR #2308 from arm64-simd-bitmask to main.

Wasmtime GitHub notifications bot (Oct 22 2020 at 15:49):

julian-seward1 requested yurydelendik for a review on PR #2308.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:37):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:37):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:37):

akirilov-arm created PR Review Comment:

Please use lower_constant_f128(). lower_splat_const() would be even better, but it won't be available until #2310 is merged.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:37):

akirilov-arm created PR Review Comment:

Same comment about Inst::MovToFpu.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:37):

akirilov-arm created PR Review Comment:

This should be just Inst::MovToFpu. I still think that the constant should be handled in lower_constant_f128() - we can add a special case for it, since we know that it is always going to be used for bitmask extraction, but we can do that later.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:38):

julian-seward1 updated PR #2308 from arm64-simd-bitmask to main.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:49):

julian-seward1 submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:49):

julian-seward1 created PR Review Comment:

The code sequences here were chosen in part to be as close as feasible to what the SpiderMonkey (SM) wasm Baseline compiler generates. I would be happy to use the improved constant-generation facilities provided by #2310, but would prefer to make that change only once it has landed and is stable.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:53):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 17:53):

akirilov-arm created PR Review Comment:

FYI in #2310 lower_constant_f128() generates exactly the same sequence.

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:14):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:14):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:14):

akirilov-arm created PR Review Comment:

Shouldn't this be TxN? Any128 could be a scalar if I am not mistaken.

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:14):

akirilov-arm created PR Review Comment:

Similarly here - value of imm4?

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:14):

akirilov-arm created PR Review Comment:

Perhaps add the values of size, is_shr, and imm to the message? It should make it easier to diagnose panics.
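To illustrate the suggestion, a minimal sketch in Rust of an immediate-shift encoder whose panic message reports the offending values. The function name and signature are hypothetical and are not the PR's code; lane_bits stands in for the size mentioned above. The field computation follows the AArch64 immh:immb convention for vector shifts by immediate (2 * lane size - shift for right shifts, lane size + shift for left shifts).

```rust
/// Compute the 7-bit immh:immb field for an AArch64 vector shift by
/// immediate. Hypothetical helper for illustration; not from the PR.
fn vec_shift_imm_field(lane_bits: u32, is_shr: bool, imm: u32) -> u32 {
    match (lane_bits, is_shr) {
        // Right shifts accept amounts 1..=lane_bits.
        (8 | 16 | 32 | 64, true) if (1..=lane_bits).contains(&imm) => 2 * lane_bits - imm,
        // Left shifts accept amounts 0..lane_bits.
        (8 | 16 | 32 | 64, false) if imm < lane_bits => lane_bits + imm,
        // Include every input in the message so a panic can be diagnosed
        // without a debugger.
        _ => panic!(
            "invalid vector shift: lane_bits={}, is_shr={}, imm={}",
            lane_bits, is_shr, imm
        ),
    }
}
```

With this shape, an out-of-range request such as a right shift by 0 on 8-bit lanes fails with all three values in the message rather than a bare "unsupported" panic.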

Wasmtime GitHub notifications bot (Oct 22 2020 at 22:17):

akirilov-arm submitted PR Review.

Wasmtime GitHub notifications bot (Oct 23 2020 at 00:16):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Oct 23 2020 at 02:22):

julian-seward1 submitted PR Review.

Wasmtime GitHub notifications bot (Oct 23 2020 at 02:22):

julian-seward1 created PR Review Comment:

I added a FIXME note in the Opcode::VhighBits lowering rule that refers to #2310. That comment also already refers to #2296.

Wasmtime GitHub notifications bot (Oct 23 2020 at 02:50):

julian-seward1 updated PR #2308 from arm64-simd-bitmask to main.

Wasmtime GitHub notifications bot (Oct 23 2020 at 03:26):

julian-seward1 merged PR #2308.

Last updated: Apr 18 2025 at 00:13 UTC