afonso360 edited issue #6890:
:wave: Hey,
.clif Test Case

    test interpret
    test run
    target x86_64 has_sse41

    function %bitselect_vconst_f64x2(f64x2, f64x2) -> f64x2 {
    block0(v1: f64x2, v2: f64x2):
        v0 = vconst.f64x2 0xFF00000000000000FF00000000000000
        v3 = bitselect v0, v1, v2
        return v3
    }
    ; run: %bitselect_vconst_f64x2(0x11111111111111111111111111111111, 0x00000000000000000000000000000000) == 0x11000000000000001100000000000000

Steps to Reproduce

    clif-util test ./the-above.clif

Expected Results
The test to pass.
Actual Results

    Running `/home/afonso/git/wasmtime/target/debug/clif-util test ./lmao2.clif`
    ERROR cranelift_filetests::concurrent > FAIL: run
    FAIL ./lmao2.clif: run

    Caused by:
        Failed test: run: %bitselect_vconst_f64x2(0x11111111111111111111111111111111, 0x00000000000000000000000000000000) == 0x11000000000000001100000000000000, actual: 0x11111111111111111111111111111111
    1 tests
    Error: 1 failure

Versions and Environment
Cranelift version or commit: main
Operating system: Linux
Architecture: x86_64
Extra Info
This issue is caused by this optimization to `bitselect`. It checks if every byte in the `vconst` is `0xFF` or `0x00`, which it is in this case, but then emits a `blend` instruction for whatever type the original `bitselect` was issued with.

This is correct for `i8x16`, but not for any type with a larger lane size.

This does not affect wasmtime, since wasmtime always bitcasts the inputs to `bitselect` into an `i8x16` before the operation, which is the only type for which this works.

We also currently don't remove bitcasts in the midend, so this won't get accidentally converted into a `bitselect.f64x2`.
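To make the mismatch concrete, here is a small Rust model of the two behaviours, run on the values from the test case. It is only a sketch: `bitselect` and `blend` are hypothetical helper names modelling the semantics on a `u128`, not Cranelift code, and `blend` assumes the usual rule that a whole lane is picked based on the top bit of the corresponding mask lane.

```rust
/// Bitwise select: each result bit comes from `x` where the mask bit is 1,
/// otherwise from `y`.
fn bitselect(mask: u128, x: u128, y: u128) -> u128 {
    (mask & x) | (!mask & y)
}

/// Hypothetical model of a blend with `lane_bits`-wide lanes (8, 32 or 64):
/// each lane is taken whole from `x` if the top bit of the mask lane is set,
/// otherwise whole from `y`.
fn blend(mask: u128, x: u128, y: u128, lane_bits: u32) -> u128 {
    let lane_mask: u128 = (1u128 << lane_bits) - 1;
    let mut out = 0u128;
    let mut shift = 0;
    while shift < 128 {
        // Only the most significant bit of the mask lane is inspected.
        let msb_set = (mask >> (shift + lane_bits - 1)) & 1 == 1;
        let src = if msb_set { x } else { y };
        out |= src & (lane_mask << shift);
        shift += lane_bits;
    }
    out
}

fn main() {
    let mask = 0xFF00000000000000FF00000000000000u128;
    let x = 0x11111111111111111111111111111111u128;
    let y = 0u128;

    // The bitwise result is what the test expects.
    assert_eq!(bitselect(mask, x, y), 0x11000000000000001100000000000000);
    // A blend over 64-bit lanes reproduces the wrong "actual" value.
    assert_eq!(blend(mask, x, y, 64), 0x11111111111111111111111111111111);
    // A blend over 8-bit lanes agrees with the bitwise result.
    assert_eq!(blend(mask, x, y, 8), bitselect(mask, x, y));
}
```

With 64-bit lanes only bits 63 and 127 of the mask are consulted, and both are set here, so every bit of `x` is selected.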
afonso360 commented on issue #6890:
> I think bitcast isn't the cause of this problem, right? Should the issue title say bitselect.f64x2 instead?

Oops, yes, bitselect, not bitcast!
jameysharp commented on issue #6890:
I'm fine with converting `bitselect` with a constant mask into `shuffle` if you think that's best, @alexcrichton.

But if we don't do that then I'd like to think through the rest of what you said more carefully. I don't see why `vconst_all_ones_or_all_zeros` needs to do anything but a byte-oriented pattern-match, so long as the result is always a byte-oriented blend. As long as both use the same definition of what a "lane" is, then we only fire this rule when the MSB of a lane is the same as all the other bits in the lane, so the intended type doesn't matter. Choosing byte-sized lanes means the rule can match more cases while still giving correct results.
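The argument above can be sketched in Rust as follows. This is a model under stated assumptions, not the actual lowering rule: `all_bytes_zero_or_ones`, `byte_blend` and `lower_bitselect` are hypothetical names, and `byte_blend` assumes a byte-wise blend that selects on the top bit of each mask byte.

```rust
/// Bitwise select: result bit comes from `x` where the mask bit is 1, else `y`.
fn bitselect(mask: u128, x: u128, y: u128) -> u128 {
    (mask & x) | (!mask & y)
}

/// Byte-oriented pattern match: every byte of the constant is 0x00 or 0xFF.
fn all_bytes_zero_or_ones(mask: u128) -> bool {
    mask.to_le_bytes().iter().all(|&b| b == 0x00 || b == 0xFF)
}

/// Hypothetical model of a byte-wise blend: each byte comes from `x` when the
/// top bit of the matching mask byte is set, otherwise from `y`.
fn byte_blend(mask: u128, x: u128, y: u128) -> u128 {
    let (m, xs, ys) = (mask.to_le_bytes(), x.to_le_bytes(), y.to_le_bytes());
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = if m[i] & 0x80 != 0 { xs[i] } else { ys[i] };
    }
    u128::from_le_bytes(out)
}

/// Sketch of the rule: use the byte blend only when the byte-oriented match
/// succeeds, regardless of the vector type the bitselect was written with.
fn lower_bitselect(mask: u128, x: u128, y: u128) -> u128 {
    if all_bytes_zero_or_ones(mask) {
        byte_blend(mask, x, y)
    } else {
        bitselect(mask, x, y)
    }
}

fn main() {
    let x = 0x0123456789ABCDEF_FEDCBA9876543210u128;
    let y = 0xA5A5A5A5A5A5A5A5_5A5A5A5A5A5A5A5Au128;

    // Masks made only of 0x00/0xFF bytes: the byte blend equals bitselect.
    for mask in [
        0xFF00000000000000FF00000000000000u128,
        0x00FF00FF00FF00FF00FF00FF00FF00FFu128,
        u128::MAX,
        0,
    ] {
        assert!(all_bytes_zero_or_ones(mask));
        assert_eq!(lower_bitselect(mask, x, y), bitselect(mask, x, y));
    }

    // A byte such as 0x80 fails the match, so the blend is never used for it.
    assert!(!all_bytes_zero_or_ones(0x80));
}
```

Because the match and the blend agree that a lane is one byte, the top bit of each mask byte equals every other bit in that byte whenever the rule fires, which is why the type of the original `bitselect` no longer matters.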
alexcrichton commented on issue #6890:
Good point! Also sorry I should have read your comment more closely and more carefully. I believe you're correct and always using `x64_pblendvb` here is sufficient, and my point about turning things into `shuffle` is orthogonal.

(sorry I should have learned by now to read things completely)
abrown closed issue #6890.
Last updated: Dec 13 2025 at 21:03 UTC