afonso360 opened PR #8133 from afonso360:riscv-fix-bitselect-cmp
to bytecodealliance:main
:
:wave: Hey
We have a special lowering that allows us to fuse a
bitselect
with a comparison instruction. This saves us a few instructions due to the mismatch that exists between native RISC-V masks and WASM masks.Native RISC-V masks have a single bit per lane, whereas WASM masks have all bits in a lane set to 1.
The lowering for
bitselect+bitcast+{i,f}cmp
avoids the need to generate the WASM mask, by directly using the comparison mask withvmerge
.The bug that this fixes was that when we introduce a
bitcast
in the middle, the comparison and the merge may have different types with different lanes. And if that happens thevmerge
will only look at the first n bits of the mask. n being the number of lanes currently configured.This commit ensures that they are always equal by using the same type for both vmerge and the comparison instruction.
I also manually checked all other uses of
gen_{f,i}cmp_mask
and they are all using the same type in the subsequent instructions.With this fix we no longer really care about the type of the
bitselect
as long as it has the same bitlength as the type of{i,f}cmp
, which I think is enforced by the verifier. (i.e. We would have the same bug ifbitselect.i8x16+icmp.i8x8
was legal.)Thanks for reporting this @candymate!
afonso360 requested elliottt for a review on PR #8133.
afonso360 requested wasmtime-compiler-reviewers for a review on PR #8133.
afonso360 edited PR #8133:
:wave: Hey
We have a special lowering that allows us to fuse a
bitselect
with a comparison instruction. This saves us a few instructions due to the mismatch that exists between native RISC-V masks and WASM masks.Native RISC-V masks have a single bit per lane, whereas WASM masks have all bits in a lane set to 1.
The lowering for
bitselect+bitcast+{i,f}cmp
avoids the need to generate the WASM mask, by directly using the native comparison mask withvmerge
.The bug that this fixes was that when we introduce a
bitcast
in the middle, the comparison and the merge may have different types with different lanes. And if that happens thevmerge
will only look at the first n bits of the mask. n being the number of lanes currently configured.This commit ensures that they are always equal by using the same type for both vmerge and the comparison instruction.
I also manually checked all other uses of
gen_{f,i}cmp_mask
and they are all using the same type in the subsequent instructions.With this fix we no longer really care about the type of the
bitselect
as long as it has the same bitlength as the type of{i,f}cmp
, which I think is enforced by the verifier. (i.e. We would have the same bug ifbitselect.i8x16+icmp.i8x8
was legal.)Thanks for reporting this @candymate!
afonso360 updated PR #8133.
alexcrichton submitted PR review:
Thanks for the quick fix!
alexcrichton merged PR #8133.
Last updated: Dec 23 2024 at 12:05 UTC