Stream: git-wasmtime

Topic: wasmtime / issue #6600 riscv64: Improve SIMD `ExtAddPairw...


Wasmtime GitHub notifications bot (Jun 17 2023 at 12:51):

afonso360 opened issue #6600:

:wave: Hey,

Feature

Currently we don't have any special lowerings for the ExtAddPairwise family of WASM instructions, so we generate a quite poor implementation.

Benefit

We can get better codegen on this backend for these instructions.

Implementation

cranelift-wasm translates these as iadd_pairwise(uwiden_low(x), uwiden_high(x)) and similar variations.
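
As a rough scalar model of that CLIF sequence (a sketch only, not Cranelift code): the Rust function names below simply mirror the opcodes, the `I8x16`/`I16x8` aliases are made up for illustration, and only the unsigned i8x16 to i16x8 variant is shown. Both widenings read the same input vector, since the Wasm instruction is unary.

```rust
// Hypothetical lane types for illustration; not Cranelift types.
type I8x16 = [u8; 16];
type I16x8 = [u16; 8];

/// Zero-extend the low 8 lanes (0..8) to 16 bits.
fn uwiden_low(x: I8x16) -> I16x8 {
    std::array::from_fn(|i| x[i] as u16)
}

/// Zero-extend the high 8 lanes (8..16) to 16 bits.
fn uwiden_high(x: I8x16) -> I16x8 {
    std::array::from_fn(|i| x[i + 8] as u16)
}

/// Add adjacent lane pairs: the first half of the result comes from `a`,
/// the second half from `b`.
fn iadd_pairwise(a: I16x8, b: I16x8) -> I16x8 {
    std::array::from_fn(|i| {
        if i < 4 {
            a[2 * i].wrapping_add(a[2 * i + 1])
        } else {
            let j = i - 4;
            b[2 * j].wrapping_add(b[2 * j + 1])
        }
    })
}

/// The sequence as translated: widen both halves of `x`, then pairwise add.
fn extadd_pairwise_u(x: I8x16) -> I16x8 {
    iadd_pairwise(uwiden_low(x), uwiden_high(x))
}
```

Each output lane `i` ends up as `x[2*i] + x[2*i + 1]` computed at 16 bits, so the sum cannot overflow.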

We use the generic iadd_pairwise implementation, which is quite large. However, with the widening instructions as inputs, we can generate much better code.

Since we know we are going to discard half of the input register elements, we can use a single vrgather.vv on each input to reshuffle them and then use a vwadd.vv to do the sum.

This is pretty much what V8 emits for these instructions.
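
One plausible reading of the gather-then-widening-add idea, as a scalar Rust sketch. Everything here is an assumption for illustration: `gather8` stands in for a `vrgather.vv` whose index vector selects the even (or odd) lanes, `widening_add` stands in for the widening add (the unsigned case is modeled, which on RVV would be `vwaddu.vv` rather than `vwadd.vv`), and none of these names exist in Cranelift or V8.

```rust
// Hypothetical lane types for illustration.
type I8x16 = [u8; 16];
type I16x8 = [u16; 8];

/// Models a gather restricted to the 8 lanes we keep: out[i] = src[idx[i]].
fn gather8(src: I8x16, idx: [usize; 8]) -> [u8; 8] {
    std::array::from_fn(|i| src[idx[i]])
}

/// Models an unsigned widening add: out[i] = zext(a[i]) + zext(b[i]).
fn widening_add(a: [u8; 8], b: [u8; 8]) -> I16x8 {
    std::array::from_fn(|i| a[i] as u16 + b[i] as u16)
}

/// Sketch of the proposed lowering: two gathers and one widening add.
fn extadd_pairwise_lowered(x: I8x16) -> I16x8 {
    let even = gather8(x, [0, 2, 4, 6, 8, 10, 12, 14]);
    let odd = gather8(x, [1, 3, 5, 7, 9, 11, 13, 15]);
    widening_add(even, odd)
}

/// Direct definition of the unsigned pairwise extended add, for comparison.
fn extadd_pairwise_reference(x: I8x16) -> I16x8 {
    std::array::from_fn(|i| x[2 * i] as u16 + x[2 * i + 1] as u16)
}
```

Each gather keeps only half of the source lanes, which is why the result fits in the narrower operand that the widening add expects.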

Alternatives

We don't need to do this. I also don't know how often these instructions get used in real code.

Wasmtime GitHub notifications bot (Jun 17 2023 at 12:51):

afonso360 labeled issue #6600:


Wasmtime GitHub notifications bot (Jun 17 2023 at 12:55):

afonso360 edited issue #6600:

Alternatives

We don't need to do this; the current lowerings are working as intended. I also don't know how often these instructions get used in real code.

Wasmtime GitHub notifications bot (Jun 22 2023 at 14:08):

afonso360 labeled issue #6600:



Last updated: Dec 23 2024 at 12:05 UTC