Stream: git-wasmtime

Topic: wasmtime / issue #7186 riscv64: Implement Remaining Vecto...

Wasmtime GitHub notifications bot (Oct 08 2023 at 11:28):

afonso360 opened issue #7186:

:wave: Hey,

This is a general issue for tracking the missing instructions from the RISC-V Vector extension. We have implemented most of them; however, there are a couple of categories that are still missing.

Widening and Narrowing Operations

[x] Vector Widening Integer Add/Subtract

[ ] Vector Widening Integer Multiply Instructions

[ ] Vector Narrowing Integer Right Shift Instructions

[ ] Vector Widening Integer Multiply-Add Instructions

[ ] Vector Widening Floating-Point Add/Subtract Instructions

[ ] Vector Widening Floating-Point Multiply

[ ] Widening Floating-Point/Integer Type-Convert Instructions

[ ] Narrowing Floating-Point/Integer Type-Convert Instructions

[ ] Vector Widening Floating-Point Fused Multiply-Add Instructions

I have implemented Vector Widening Integer Add/Subtract in #6542 and #6555.

I have a branch that contains Vector Widening Integer Multiply Instructions in riscv-simd-widening-mul; however, before merging that I think we should rework our approach towards these instructions.

Matching one of these instructions takes a lot of rules, mostly due to how many combinations we can form with {s,u}widen_{low,high}, so before adding any more instructions I think we need to add an extractor that can match all of these patterns.

As an example vwmulsu.vv requires 8 rules to match all combinations of {s,u}widen_{low,high}.
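To make the rule count concrete, here is a rough Rust sketch that enumerates operand patterns for a signed-by-unsigned widening multiply. The breakdown used here (2 halves for the signed operand, 2 halves for the unsigned operand, 2 operand orders) is an assumption about where the 8 comes from; the issue does not spell it out. The types `Half`, `Sign`, `Widen` and the function `mixed_sign_mul_patterns` are hypothetical names for illustration and are not part of Cranelift.

```rust
/// Which half of the input vector a widen operation reads.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Half {
    Low,
    High,
}

/// Whether the widen sign-extends or zero-extends its lanes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Sign {
    Signed,
    Unsigned,
}

/// One operand pattern, e.g. `swiden_low` or `uwiden_high`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Widen {
    sign: Sign,
    half: Half,
}

impl Widen {
    /// Spell the pattern the way the issue writes it.
    fn name(self) -> String {
        let s = match self.sign {
            Sign::Signed => "s",
            Sign::Unsigned => "u",
        };
        let h = match self.half {
            Half::Low => "low",
            Half::High => "high",
        };
        format!("{s}widen_{h}")
    }
}

/// Enumerate (lhs, rhs) operand patterns for a multiply with one
/// sign-extended and one zero-extended operand.
/// ASSUMPTION: every low/high mix and both operand orders need a rule.
fn mixed_sign_mul_patterns() -> Vec<(Widen, Widen)> {
    let halves = [Half::Low, Half::High];
    let mut out = Vec::new();
    for signed_half in halves {
        for unsigned_half in halves {
            let s = Widen { sign: Sign::Signed, half: signed_half };
            let u = Widen { sign: Sign::Unsigned, half: unsigned_half };
            // Multiplication is commutative, so either operand may come first.
            out.push((s, u));
            out.push((u, s));
        }
    }
    out
}
```

Under that assumption the list has 8 entries. An extractor that reports the sign and half of an operand would let a single rule cover all of them.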

Vector Multiple Register Move

We have implemented vmv1r.v as our move instruction; it moves one register into another. However, we also have vmv2r.v / vmv4r.v / vmv8r.v. These instructions move n consecutive registers.

As an example, vmv2r.v v10, v12 copies v10=v12; v11=v13.

These instructions have a constraint that the source and destination registers must be "aligned" to the number of the instruction.

So, vmv2r.v v11, v15 would be an illegal instruction.

[ ] Whole Vector Register Move
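The alignment constraint described above can be sketched as a small Rust check. This is only a model of the rule, not backend code: `whole_reg_move` is a hypothetical helper that takes the group size and the two register numbers and returns the individual register copies, or `None` when the operands are not valid.

```rust
/// Model `vmv<nr>r.v vd, vs`: returns the (destination, source) register
/// number pairs that get copied, or `None` if the operands are invalid.
fn whole_reg_move(nr: u8, vd: u8, vs: u8) -> Option<Vec<(u8, u8)>> {
    // Only the group sizes 1, 2, 4 and 8 exist.
    if !matches!(nr, 1 | 2 | 4 | 8) {
        return None;
    }
    // There are 32 vector registers, v0 through v31.
    if vd >= 32 || vs >= 32 {
        return None;
    }
    // Both register numbers must be multiples of the group size.
    if vd % nr != 0 || vs % nr != 0 {
        return None;
    }
    // Copy register i of the source group into register i of the destination group.
    Some((0..nr).map(|i| (vd + i, vs + i)).collect())
}
```

With this model, `whole_reg_move(2, 10, 12)` yields the copies v10=v12 and v11=v13, while `whole_reg_move(2, 11, 15)` is rejected because neither 11 nor 15 is a multiple of 2.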

Vector Reduction Operations

[ ] Vector Single-Width Integer Reduction Instructions

[ ] Vector Widening Integer Reduction Instructions

[ ] Vector Single-Width Floating-Point Reduction Instructions

[ ] Vector Widening Floating-Point Reduction Instructions

I'm not entirely sure how we can best match these operations. We don't have the equivalent instructions in cranelift.

These instructions perform an operation, such as a sum, across an entire vector register and return a scalar value. We have two of these implemented, vredminu and vredmaxu, which are used when lowering vany_true or vall_true.
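As a scalar illustration of why unsigned min/max reductions are enough for vall_true and vany_true, here is a minimal Rust model. It is a sketch of the idea only: the function names are hypothetical, lanes are modelled as a `u32` slice, and the real instructions also take a scalar starting value that this model leaves out. The actual instruction sequence used by the backend may differ.

```rust
/// Scalar model of an unsigned-minimum reduction over all lanes.
/// An empty input returns the identity for `min`.
fn reduce_min_u(lanes: &[u32]) -> u32 {
    lanes.iter().copied().min().unwrap_or(u32::MAX)
}

/// Scalar model of an unsigned-maximum reduction over all lanes.
/// An empty input returns the identity for `max`.
fn reduce_max_u(lanes: &[u32]) -> u32 {
    lanes.iter().copied().max().unwrap_or(0)
}

/// All lanes are non-zero exactly when the smallest lane is non-zero.
fn vall_true(lanes: &[u32]) -> bool {
    reduce_min_u(lanes) != 0
}

/// Some lane is non-zero exactly when the largest lane is non-zero.
fn vany_true(lanes: &[u32]) -> bool {
    reduce_max_u(lanes) != 0
}
```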

Special Load / Store Addressing Modes

We currently implement Vector Unit-Stride Instructions for all our Loads and Stores. These are fairly simple in that they load vl elements that are a fixed number of bits apart. These match the semantics of the load and store instructions in cranelift.

The remaining Addressing Modes might be beneficial, but I have no idea how we could match them. Some of these might not be that useful.

[ ] Vector Strided Instructions

[ ] Vector Indexed Instructions

[ ] Unit-stride Fault-Only-First Loads

[ ] Vector Load/Store Segment Instructions

[ ] Vector Load/Store Whole Register Instructions
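To show how the first two unimplemented modes differ from the unit-stride form we use today, here is a small Rust model of the address calculation. It is a simplification under stated assumptions: memory is a byte slice, elements are little-endian `u32`, strides and offsets are unsigned byte counts, and masking and faults are ignored. All function names are hypothetical.

```rust
/// Read one little-endian u32 element starting at byte address `addr`.
fn read_u32(mem: &[u8], addr: usize) -> u32 {
    u32::from_le_bytes([mem[addr], mem[addr + 1], mem[addr + 2], mem[addr + 3]])
}

/// Unit-stride: element i lives at base + i * 4 (elements are contiguous).
fn load_unit_stride(mem: &[u8], base: usize, vl: usize) -> Vec<u32> {
    (0..vl).map(|i| read_u32(mem, base + i * 4)).collect()
}

/// Strided: element i lives at base + i * stride_bytes.
fn load_strided(mem: &[u8], base: usize, stride_bytes: usize, vl: usize) -> Vec<u32> {
    (0..vl).map(|i| read_u32(mem, base + i * stride_bytes)).collect()
}

/// Indexed: element i lives at base + offsets[i], with offsets in bytes.
fn load_indexed(mem: &[u8], base: usize, offsets: &[usize]) -> Vec<u32> {
    offsets.iter().map(|&off| read_u32(mem, base + off)).collect()
}
```

In this model a strided load with a stride of 4 bytes returns the same elements as the unit-stride load, which is one way to see why only the unit-stride form lines up directly with a plain vector load.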

Vector Integer Division

We don't actually support vector sdiv / udiv; nevertheless, RISC-V includes instructions for them in Vector Integer Divide Instructions. I'm including this here mostly for completeness.

Wasmtime GitHub notifications bot (Oct 08 2023 at 11:28):

afonso360 added the cranelift:area:riscv64 label to Issue #7186.

Wasmtime GitHub notifications bot (Oct 08 2023 at 11:28):

afonso360 edited issue #7186:

:wave: Hey,

This is a general issue for tracking the missing instructions from the RISC-V Vector extension. We have implemented most of them; however, there are a couple of categories that are still missing.

Widening and Narrowing Operations

[x] Vector Widening Integer Add/Subtract

[ ] Vector Widening Integer Multiply Instructions

[ ] Vector Narrowing Integer Right Shift Instructions

[ ] Vector Widening Integer Multiply-Add Instructions

[ ] Vector Widening Floating-Point Add/Subtract Instructions

[ ] Vector Widening Floating-Point Multiply

[ ] Widening Floating-Point/Integer Type-Convert Instructions

[ ] Narrowing Floating-Point/Integer Type-Convert Instructions

[ ] Vector Widening Floating-Point Fused Multiply-Add Instructions

I have implemented Vector Widening Integer Add/Subtract in #6542 and #6555.

I have a branch that contains Vector Widening Integer Multiply Instructions in riscv-simd-widening-mul; however, before merging that I think we should rework our approach towards these instructions.

Matching one of these instructions takes a lot of rules, mostly due to how many combinations we can form with {s,u}widen_{low,high}, so before adding any more instructions I think we need to add an extractor that can match all of these patterns.

As an example vwmulsu.vv requires 8 rules to match all combinations of {s,u}widen_{low,high}.

Vector Multiple Register Move

We have implemented vmv1r.v as our move instruction; it moves one register into another. However, we also have vmv2r.v / vmv4r.v / vmv8r.v. These instructions move n consecutive registers.

As an example, vmv2r.v v10, v12 copies v10=v12; v11=v13.

These instructions have a constraint that the source and destination registers must be "aligned" to the number of the instruction.

So, vmv2r.v v11, v15 would be an illegal instruction.

[ ] Whole Vector Register Move

Vector Reduction Operations

[ ] Vector Single-Width Integer Reduction Instructions

[ ] Vector Widening Integer Reduction Instructions

[ ] Vector Single-Width Floating-Point Reduction Instructions

[ ] Vector Widening Floating-Point Reduction Instructions

I'm not entirely sure how we can best match these operations. We don't have the equivalent instructions in cranelift.

These instructions perform an operation, such as a sum, across an entire vector register and return a scalar value. We have two of these implemented, vredminu and vredmaxu, which are used when lowering vany_true or vall_true.

Special Load / Store Addressing Modes

We currently implement Vector Unit-Stride Instructions for all our Loads and Stores. These are fairly simple in that they load vl elements that are a fixed number of bits apart. These match the semantics of the load and store instructions in cranelift.

The remaining Addressing Modes might be beneficial, but I have no idea how we could match them. Some of these might not be that useful.

[ ] Vector Strided Instructions

[ ] Vector Indexed Instructions

[ ] Unit-stride Fault-Only-First Loads

[ ] Vector Load/Store Segment Instructions

[ ] Vector Load/Store Whole Register Instructions

Vector Integer Division

We don't actually support vector sdiv / udiv; nevertheless, RISC-V includes instructions for them in Vector Integer Divide Instructions. I'm including this here mostly for completeness.

[ ] Vector Integer Divide Instructions

Wasmtime GitHub notifications bot (Oct 08 2023 at 11:30):

afonso360 edited issue #7186:

:wave: Hey,

This is a general issue for tracking the missing instructions from the RISC-V Vector extension. We have implemented most of them; however, there are a couple of categories that are still missing.

Widening and Narrowing Operations

[x] Vector Widening Integer Add/Subtract

[ ] Vector Widening Integer Multiply Instructions

[ ] Vector Narrowing Integer Right Shift Instructions

[ ] Vector Widening Integer Multiply-Add Instructions

[ ] Vector Widening Floating-Point Add/Subtract Instructions

[ ] Vector Widening Floating-Point Multiply

[ ] Widening Floating-Point/Integer Type-Convert Instructions

[ ] Narrowing Floating-Point/Integer Type-Convert Instructions

[ ] Vector Widening Floating-Point Fused Multiply-Add Instructions

I have implemented Vector Widening Integer Add/Subtract in #6542 and #6555.

I have a branch that contains Vector Widening Integer Multiply Instructions in riscv-simd-widening-mul; however, before merging that I think we should rework our approach towards these instructions.

Matching one of these instructions takes a lot of rules, mostly due to how many combinations we can form with {s,u}widen_{low,high}, so before adding any more instructions I think we need to add an extractor that can match all of these patterns.

As an example vwmulsu.vv requires 8 rules to match all combinations of {s,u}widen_{low,high}.

Vector Multiple Register Move

We have implemented vmv1r.v as our move instruction; it moves one register into another. However, we also have vmv2r.v / vmv4r.v / vmv8r.v. These instructions move n consecutive registers.

As an example, vmv2r.v v10, v12 copies v10=v12; v11=v13.

These instructions have a constraint that the source and destination registers must be "aligned" to the number of the instruction. vmv2r.v v11, v15 would be an illegal instruction.

[ ] Whole Vector Register Move

Vector Reduction Operations

[ ] Vector Single-Width Integer Reduction Instructions

[ ] Vector Widening Integer Reduction Instructions

[ ] Vector Single-Width Floating-Point Reduction Instructions

[ ] Vector Widening Floating-Point Reduction Instructions

I'm not entirely sure how we can best match these operations. We don't have the equivalent instructions in cranelift.

These instructions perform an operation, such as a sum, across an entire vector register and return a scalar value. We have two of these implemented, vredminu and vredmaxu, which are used when lowering vany_true or vall_true.

Special Load / Store Addressing Modes

We currently implement Vector Unit-Stride Instructions for all our Loads and Stores. These are fairly simple in that they load vl elements that are a fixed number of bits apart. These match the semantics of the load and store instructions in cranelift.

The remaining Addressing Modes might be beneficial, but I have no idea how we could match them. Some of these might not be that useful.

[ ] Vector Strided Instructions

[ ] Vector Indexed Instructions

[ ] Unit-stride Fault-Only-First Loads

[ ] Vector Load/Store Segment Instructions

[ ] Vector Load/Store Whole Register Instructions

Vector Integer Division

We don't actually support vector sdiv / udiv; nevertheless, RISC-V includes instructions for them in Vector Integer Divide Instructions. I'm including this here mostly for completeness.

[ ] Vector Integer Divide Instructions

Wasmtime GitHub notifications bot (Oct 08 2023 at 12:37):

afonso360 edited issue #7186:

:wave: Hey,

This is a general issue for tracking the missing instructions from the RISC-V Vector extension. We have implemented most of them; however, there are a couple of categories that are still missing.

Widening and Narrowing Operations

[x] Vector Widening Integer Add/Subtract

[ ] Vector Widening Integer Multiply Instructions

[ ] Vector Narrowing Integer Right Shift Instructions

[ ] Vector Widening Integer Multiply-Add Instructions

[ ] Vector Widening Floating-Point Add/Subtract Instructions

[ ] Vector Widening Floating-Point Multiply

[ ] Widening Floating-Point/Integer Type-Convert Instructions

[ ] Narrowing Floating-Point/Integer Type-Convert Instructions

[ ] Vector Widening Floating-Point Fused Multiply-Add Instructions

I have implemented Vector Widening Integer Add/Subtract in #6542 and #6555.

I also have a branch that contains Vector Widening Integer Multiply Instructions in riscv-simd-widening-mul; however, before merging that I think we should rework our approach towards these instructions.

Matching one of these instructions takes a lot of rules, mostly due to how many combinations we can form with {s,u}widen_{low,high}, so before adding any more instructions I think we need to add an extractor that can match all of these patterns.

As an example vwmulsu.vv requires 8 rules to match all combinations of {s,u}widen_{low,high}.

Vector Multiple Register Move

We have implemented vmv1r.v as our move instruction; it moves one register into another. However, we also have vmv2r.v / vmv4r.v / vmv8r.v. These instructions move n consecutive registers.

As an example, vmv2r.v v10, v12 copies v10=v12; v11=v13.

These instructions have a constraint that the source and destination registers must be "aligned" to the number of the instruction. vmv2r.v v11, v15 would be an illegal instruction.

[ ] Whole Vector Register Move

Vector Reduction Operations

[ ] Vector Single-Width Integer Reduction Instructions

[ ] Vector Widening Integer Reduction Instructions

[ ] Vector Single-Width Floating-Point Reduction Instructions

[ ] Vector Widening Floating-Point Reduction Instructions

I'm not entirely sure how we can best match these operations. We don't have the equivalent instructions in cranelift.

These instructions perform an operation, such as a sum, across an entire vector register and return a scalar value. We have two of these implemented, vredminu and vredmaxu, which are used when lowering vany_true or vall_true.

Special Load / Store Addressing Modes

We currently implement Vector Unit-Stride Instructions for all our Loads and Stores. These are fairly simple in that they load vl elements that are a fixed number of bits apart. These match the semantics of the load and store instructions in cranelift.

The remaining Addressing Modes might be beneficial, but I have no idea how we could match them. Some of these might not be that useful.

[ ] Vector Strided Instructions

[ ] Vector Indexed Instructions

[ ] Unit-stride Fault-Only-First Loads

[ ] Vector Load/Store Segment Instructions

[ ] Vector Load/Store Whole Register Instructions

Vector Integer Division

We don't actually support vector sdiv / udiv; nevertheless, RISC-V includes instructions for them in Vector Integer Divide Instructions. I'm including this here mostly for completeness.

[ ] Vector Integer Divide Instructions

Last updated: Apr 18 2025 at 08:04 UTC