afonso360 opened PR #6397 from afonso360:riscv-extract-splat to bytecodealliance:main:
:wave: Hey,
This PR implements both `extractlane` and `splat`.
`splat` is fairly simple in that we have a bunch of move instructions that by default splat the source X or F register into the vector register. These are `vmv.v.x`, `vfmv.v.f` and `vmv.v.i`, for X, F and immediate sources. The only noteworthy thing about these instructions is that they have weird encodings, and I've added two new instruction formats to deal with this. `vmv.v.i` has no source register operands, only a destination register. The other ones could maybe be encoded using the existing Imm5 instruction format, but it was becoming a bit weird keeping all of the variations together.
For `extractlane` we have two additional move instructions, `vfmv.f.s` and `vmv.x.s`. These move element 0 of the source vector into the appropriate X or F register. For extracting other elements we use `vslidedown`, which moves all elements of a vector down by n positions, and then emit the appropriate move into the destination register.
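The lowering strategy described above can be modeled on plain arrays. The following is a minimal sketch of the semantics only, not the backend code from this PR: the function names are hypothetical, a fixed-size array stands in for a vector register, and filling lanes that slide in from past the end with `T::default()` is a simplifying assumption.

```rust
/// Model of `splat`: broadcast one scalar into every lane, which is what
/// `vmv.v.x`, `vfmv.v.f` and `vmv.v.i` do for X, F and immediate sources.
fn splat<T: Copy, const N: usize>(x: T) -> [T; N] {
    [x; N]
}

/// Model of `vslidedown`: lane `i` of the result is lane `i + offset` of the
/// source. Lanes that would read past the end get `T::default()` (assumption).
fn slide_down<T: Copy + Default, const N: usize>(v: [T; N], offset: usize) -> [T; N] {
    let mut out = [T::default(); N];
    for i in 0..N {
        if let Some(&e) = v.get(i + offset) {
            out[i] = e;
        }
    }
    out
}

/// Model of `vmv.x.s` / `vfmv.f.s`: move element 0 into a scalar register.
fn move_lane0<T: Copy, const N: usize>(v: [T; N]) -> T {
    v[0]
}

/// Model of `extractlane`: lane 0 is a plain move; any other lane is first
/// slid down to position 0 and then moved out.
fn extract_lane<T: Copy + Default, const N: usize>(v: [T; N], lane: usize) -> T {
    if lane == 0 {
        move_lane0(v)
    } else {
        move_lane0(slide_down(v, lane))
    }
}
```

For example, `extract_lane([10, 20, 30, 40], 3)` slides the vector down by three lanes and then reads lane 0, giving `40`.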
afonso360 requested elliottt for a review on PR #6397.
afonso360 requested wasmtime-compiler-reviewers for a review on PR #6397.
afonso360 requested wasmtime-default-reviewers for a review on PR #6397.
alexcrichton submitted PR review:
Nice!
alexcrichton created PR review comment:
IIRC the aarch64 backend does some trickery along these lines where it iteratively halves the size of a constant if it's splatted, which may serve as good inspiration for supporting this.
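The halving idea mentioned here can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the aarch64 backend's actual code: the function names are hypothetical, the constant is taken to be a 64-bit pattern, and the `fits_simm5` check assumes the goal is to find a lane value small enough for the 5-bit signed immediate of `vmv.v.i`.

```rust
/// Repeatedly halve the lane width while both halves of the pattern are
/// equal. Returns the narrowest lane width in bits (64, 32, 16 or 8) and the
/// lane value that, when splatted, reproduces `value`.
fn narrowest_splat(value: u64) -> (u32, u64) {
    let mut bits = 64u32;
    let mut v = value;
    while bits > 8 {
        let half = bits / 2;
        let mask = (1u64 << half) - 1;
        let lo = v & mask;
        let hi = (v >> half) & mask;
        if lo != hi {
            break; // the halves differ, so this is not a splat of a narrower lane
        }
        bits = half;
        v = lo;
    }
    (bits, v)
}

/// Whether a lane value fits in a 5-bit signed immediate (range -16..=15).
fn fits_simm5(v: i64) -> bool {
    (-16..=15).contains(&v)
}
```

With this sketch, `0x0101_0101_0101_0101` narrows to an 8-bit lane holding `1`, which would fit an immediate splat, while `0x1234_1234_1234_1234` stops at a 16-bit lane holding `0x1234`.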
alexcrichton created PR review comment:
Out of curiosity, is there a particular motivation for having this helper here vs inlining it into the lowering of `extractlane`?
afonso360 created PR review comment:
Mostly to avoid having two rules for float and integer in the slide down cases.
`vslidedown` is generic across integer and float elements, but `vmv` is not, so we recursively call this rule to decide the correct `vmv` instruction to use.
That being said, we can probably inline the `vslidedown` rules and have just a generic `vmv` that decides the correct instruction based on the type.
alexcrichton created PR review comment:
Oh sorry, to clarify, I mean that you've got 4 cases of `gen_extractlane` here, but why not have those 4 cases be cases on `lower (extractlane ..)`?
afonso360 created PR review comment:
If I inline it directly into `(lower (extractlane ..))` then it would become 6 rules, right? I would have to duplicate `vslidedown.vi`+`vmv`, `vslidedown.vi`+`vfmv`, `vslidedown.vx`+`vmv` and `vslidedown.vx`+`vfmv`, which are currently just 2 rules.
Or can I then recursively call the lower instructions directly?
afonso360 edited PR review comment.
alexcrichton created PR review comment:
Ah I apologize I missed that crucial bit of this being a recursive rule! In that case yeah definitely makes sense as a standalone decl.
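The rule-count argument in this exchange can be illustrated with a small Rust model of a recursive helper. This is a sketch only: the real helper is written in ISLE, and the `Inst` and `ElemKind` types, the `LoadImm` step, and the 1 to 31 range for the `.vi` form (assuming a 5-bit unsigned immediate) are hypothetical stand-ins rather than the PR's definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElemKind {
    Int,
    Float,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Inst {
    VmvXS,                       // move element 0 to an X register
    VfmvFS,                      // move element 0 to an F register
    VslidedownVI { offset: u8 }, // slide down by an immediate
    LoadImm { value: u8 },       // materialize the offset in an X register
    VslidedownVX,                // slide down by an X register
}

/// Four cases in total: two type-specific moves for lane 0, and two
/// type-agnostic slide-down cases that recurse with lane 0 to pick the move.
fn gen_extractlane(kind: ElemKind, lane: u8) -> Vec<Inst> {
    match (kind, lane) {
        (ElemKind::Int, 0) => vec![Inst::VmvXS],
        (ElemKind::Float, 0) => vec![Inst::VfmvFS],
        (_, 1..=31) => {
            let mut insts = vec![Inst::VslidedownVI { offset: lane }];
            insts.extend(gen_extractlane(kind, 0));
            insts
        }
        _ => {
            let mut insts = vec![Inst::LoadImm { value: lane }, Inst::VslidedownVX];
            insts.extend(gen_extractlane(kind, 0));
            insts
        }
    }
}
```

Because the two slide-down arms never inspect `kind` themselves, they do not need to be written once per element type. Inlining the helper without recursion would turn those two arms into four, giving the six rules mentioned above.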
afonso360 edited PR review comment.
afonso360 updated PR #6397.
afonso360 has enabled auto merge for PR #6397.
afonso360 merged PR #6397.
Last updated: Dec 13 2025 at 19:03 UTC