I was surprised to see:
v55 = ishl.i32x4 v8, v25
and
arg 1 (v25) with type i32x4 failed to satisfy type set ValueTypeSet { lanes: BitSet(1), ints: BitSet(248), floats: BitSet(0), refs: BitSet(0), dynamic_lanes: BitSet(0) }
when both AVX2 and NEON support per-lane shift amounts.
https://developer.arm.com/documentation/102159/0400/Shifting-left-and-right
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3707,3729,3724,3708,6143&text=mm256_sllv
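A small sketch for reading the type set in that error. It assumes each BitSet stores base-2 logarithms, so that bit n being set means the value 2^n is allowed; that is my reading of Cranelift's ValueTypeSet, not something stated in this thread. The helper name `decode_log2_set` is hypothetical.

```rust
/// Hypothetical helper: lists the powers of two whose exponents are set in `mask`.
/// Assumes bit n of the mask stands for the value 2^n.
fn decode_log2_set(mask: u32) -> Vec<u32> {
    (0..32u32)
        .filter(|&bit| (mask & (1u32 << bit)) != 0) // keep exponents whose bit is set
        .map(|bit| 1u32 << bit)                     // turn exponent n into 2^n
        .collect()
}
```

Under that assumption, `decode_log2_set(1)` gives `[1]` (only single-lane, i.e. scalar, types) and `decode_log2_set(248)` gives `[8, 16, 32, 64, 128]` (integer widths in bits). That would explain the failure: the shift amount has to be a scalar integer, and v25 is an i32x4.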
Yes, Cranelift doesn't currently support per-lane shifts. Much of its SIMD support comes from WebAssembly's SIMD instruction set, which also doesn't have this instruction.
That being said, I don't think anyone would be opposed to adding such an instruction.
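To make the distinction concrete, here is a scalar model in Rust of the two shift flavors on four 32-bit lanes. It is only a sketch: the function names are hypothetical, and masking the shift amount to the lane width (via `wrapping_shl`) is an assumption chosen for simplicity. Real hardware differs on out-of-range amounts, so a real instruction would need to pin that down.

```rust
/// Uniform shift: every lane is shifted left by the same scalar amount.
/// This models the form where the shift amount is a single scalar value.
fn shl_uniform(x: [u32; 4], amt: u32) -> [u32; 4] {
    // wrapping_shl masks `amt` to the lane width (an assumption of this sketch).
    x.map(|lane| lane.wrapping_shl(amt))
}

/// Per-lane shift: lane i of `x` is shifted left by lane i of `amts`.
/// This models the variable-shift form discussed above.
fn shl_per_lane(x: [u32; 4], amts: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = x[i].wrapping_shl(amts[i]);
    }
    out
}
```

For example, `shl_uniform([1, 2, 3, 4], 1)` yields `[2, 4, 6, 8]`, while `shl_per_lane([1, 2, 3, 4], [0, 1, 2, 3])` yields `[1, 4, 12, 32]`.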
Ok, I guess I have to get my hands dirty then
One thing perhaps worth pointing out: AFAIK there are very few, if any, instructions implemented for avx2. There's lots of stuff for avx and a few for avx512*, but if you're getting your hands dirty, a few things may need to be plumbed around for avx2.
avx512 but not avx2? That's odd
avx512 isn't exhaustively bound; there are only special cases for a few instructions that have a single-instruction equivalent in avx512 but need a larger fallback on older processors.
Last updated: Jan 24 2025 at 00:11 UTC