afonso360 edited PR #6876:
:wave: Hey,
This is a followup to #6874, where f{min,max}_pseudo was removed and replaced with bitselect + fcmp. Here we optimize that pattern into a mask generation instruction and a vmerge.vvm that merges both inputs.
This allows us to avoid the quite long sequence for bitselect (4 instructions) and also the vector mask expansion (1 instruction) in these patterns.
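To illustrate why the two forms are interchangeable, here is a minimal scalar sketch in Rust. It is not the Cranelift lowering itself: the helper names (fcmp_lt_wide, bitselect, fcmp_lt_mask, merge) are hypothetical, the 4-lane f32 vector is an assumption, and the operand order assumes the Wasm-style pseudo-minimum, pmin(x, y) = if y < x { y } else { x }. The first path models fcmp producing a full-width lane mask that feeds a bitwise bitselect; the second models a compare that yields one mask bit per lane followed by a merge that picks each lane by its mask bit, which is roughly what a mask-generating compare plus vmerge.vvm does.

```rust
// Illustrative scalar model only; none of these helpers are Cranelift APIs.
type V = [f32; 4];

/// Compare as a full-width lane mask: all ones where a[i] < b[i], else zero.
fn fcmp_lt_wide(a: V, b: V) -> [u32; 4] {
    std::array::from_fn(|i| if a[i] < b[i] { u32::MAX } else { 0 })
}

/// Bitwise select: (c & x) | (!c & y) on the raw lane bits.
fn bitselect(c: [u32; 4], x: V, y: V) -> V {
    std::array::from_fn(|i| {
        f32::from_bits((c[i] & x[i].to_bits()) | (!c[i] & y[i].to_bits()))
    })
}

/// Compare producing one mask bit per lane (bit i set where a[i] < b[i]).
fn fcmp_lt_mask(a: V, b: V) -> u8 {
    (0..4).fold(0, |m, i| m | (((a[i] < b[i]) as u8) << i))
}

/// Per-lane merge: take on_true[i] where mask bit i is set, else on_false[i].
fn merge(mask: u8, on_true: V, on_false: V) -> V {
    std::array::from_fn(|i| {
        if (mask >> i) & 1 == 1 { on_true[i] } else { on_false[i] }
    })
}

/// Pseudo-minimum via the bitselect + fcmp pattern.
fn pmin_bitselect(x: V, y: V) -> V {
    bitselect(fcmp_lt_wide(y, x), y, x)
}

/// Pseudo-minimum via mask generation + merge.
fn pmin_merge(x: V, y: V) -> V {
    merge(fcmp_lt_mask(y, x), y, x)
}

fn main() {
    let x = [1.0, 5.0, -0.0, f32::NAN];
    let y = [2.0, 3.0, 0.0, 1.0];
    let a = pmin_bitselect(x, y);
    let b = pmin_merge(x, y);
    // Compare raw bits so that NaN and signed-zero lanes are checked exactly.
    for i in 0..4 {
        assert_eq!(a[i].to_bits(), b[i].to_bits());
    }
}
```

Because the compare result is only ever all ones or all zeros per lane, the bitwise select never mixes bits from both inputs, so selecting whole lanes by a single mask bit gives the same answer without materializing the wide mask.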
For tests here I'm relying mostly on Wasmtime's wast test suite.
alexcrichton submitted PR review:
Aha I see, and makes sense, thanks for explaining! In that case I don't think a pattern similar to x64 fits well here, so I think what you've written is the way to go :+1:
alexcrichton merged PR #6876.
Last updated: Nov 22 2024 at 16:03 UTC