abrown opened Issue #1412:
Feature
As discussed in https://github.com/WebAssembly/simd/issues/192, the Wasm SIMD `bitselect` instruction could be lowered to one of the x86 `BLEND*` family of instructions instead of using 3-4 instructions.
Benefit
Potentially faster code and smaller code size.
Implementation
If we know that the `bitselect` control mask comes from a comparison instruction, then within each lane the bits will be either all 0s or all 1s. In that case we can use a `BLEND*` instruction on x86, which otherwise has no bit-level selection like ARM's `VBSL`.
Alternatives
https://github.com/WebAssembly/simd/issues/192 also proposed making this information explicit in the Wasm SIMD spec: comparisons would return a mask type, so that later instructions could apply the optimization above without tracing the origin of the control mask, possibly through function calls. Unfortunately, this is not likely to happen soon ("a lot of work").
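The equivalence that the Implementation section relies on can be illustrated with a small scalar model. This is only a sketch, not Cranelift code: `V128`, `bitselect`, `blend_bytes`, and `cmp_eq` are hypothetical helpers written for this example, and `blend_bytes` assumes a byte-granular blend that consults only the top bit of each mask byte, in the style of `PBLENDVB`.

```rust
/// A 128-bit vector modeled as 16 byte lanes (hypothetical helper type).
type V128 = [u8; 16];

/// Bit-level select: each result bit comes from `a` where the mask bit is 1
/// and from `b` where it is 0.
fn bitselect(a: V128, b: V128, mask: V128) -> V128 {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = (a[i] & mask[i]) | (b[i] & !mask[i]);
    }
    out
}

/// Byte-granular blend (assumed PBLENDVB-style behavior): only the top bit of
/// each mask byte is consulted, and the whole byte is taken from `a` or `b`.
fn blend_bytes(a: V128, b: V128, mask: V128) -> V128 {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = if mask[i] & 0x80 != 0 { a[i] } else { b[i] };
    }
    out
}

/// Byte-lane equality comparison: 0xFF where the lanes are equal, 0x00 otherwise.
fn cmp_eq(x: V128, y: V128) -> V128 {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = if x[i] == y[i] { 0xFF } else { 0x00 };
    }
    out
}

fn main() {
    let a: V128 = [0xAA; 16];
    let b: V128 = [0x55; 16];

    // Build two inputs that differ in every even lane.
    let mut x: V128 = [0; 16];
    for i in 0..16 {
        x[i] = i as u8;
    }
    let mut y = x;
    for i in (0..16).step_by(2) {
        y[i] = y[i].wrapping_add(1);
    }

    // A mask produced by a comparison is uniform within each lane,
    // so both operations agree.
    let mask = cmp_eq(x, y);
    assert_eq!(bitselect(a, b, mask), blend_bytes(a, b, mask));

    // An arbitrary mask mixes bits within a lane, so the blend is not a
    // valid replacement.
    let arbitrary: V128 = [0x0F; 16];
    assert_ne!(bitselect(a, b, arbitrary), blend_bytes(a, b, arbitrary));
}
```

The second assertion is why the lowering needs to know where the mask came from: without that knowledge, the bit-level sequence is the only safe choice.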
alexcrichton transferred Issue #1412.
Last updated: Jan 24 2025 at 00:11 UTC