Stream: git-wasmtime

Topic: wasmtime / PR #6876 riscv64: Optimize `bitselect+cmp` cod...


Wasmtime GitHub notifications bot (Aug 21 2023 at 23:14):

afonso360 opened PR #6876 from afonso360:riscv-bitselect-opt to bytecodealliance:main:

:wave: Hey,

This is a followup to #6874, which removed f{min,max}_pseudo and replaced it with bitselect+fcmp. Here we optimize that pattern into a mask generation instruction and a vmerge.vvm that merges both inputs.

This allows us to avoid the quite long sequence for bitselect (4 instructions) and also mask expansion (1 instruction) in these patterns.

Wasmtime GitHub notifications bot (Aug 21 2023 at 23:14):

afonso360 requested wasmtime-compiler-reviewers for a review on PR #6876.

Wasmtime GitHub notifications bot (Aug 21 2023 at 23:14):

afonso360 requested jameysharp for a review on PR #6876.

Wasmtime GitHub notifications bot (Aug 21 2023 at 23:14):

afonso360 edited PR #6876:

:wave: Hey,

This is a followup to #6874 where f{min,max}_pseudo was removed and replaced with bitselect+fcmp. Here we optimize that pattern into a mask generation instruction and a vmerge.vvm that merges both inputs.

This allows us to avoid the quite long sequence for bitselect (4 instructions) and also mask expansion (1 instruction) in these patterns.

Wasmtime GitHub notifications bot (Aug 21 2023 at 23:15):

afonso360 edited PR #6876:

:wave: Hey,

This is a followup to #6874 where f{min,max}_pseudo was removed and replaced with bitselect+fcmp. Here we optimize that pattern into a mask generation instruction and a vmerge.vvm that merges both inputs.

This allows us to avoid the quite long sequence for bitselect (4 instructions) and also mask expansion (1 instruction) in these patterns.

For tests here I'm relying mostly on Wasmtime's wast test suite.

Wasmtime GitHub notifications bot (Aug 21 2023 at 23:16):

afonso360 edited PR #6876:

:wave: Hey,

This is a followup to #6874 where f{min,max}_pseudo was removed and replaced with bitselect+fcmp. Here we optimize that pattern into a mask generation instruction and a vmerge.vvm that merges both inputs.

This allows us to avoid the quite long sequence for bitselect (4 instructions) and also vector mask expansion (1 instruction) in these patterns.

For tests here I'm relying mostly on Wasmtime's wast test suite.
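
To illustrate why the rewrite described above is sound, here is a minimal scalar sketch in Rust. It is not Cranelift or ISLE code: the `Lanes` type and the functions `fcmp_lt_mask`, `bitselect`, and `cmp_then_merge` are hypothetical names made up for this example, and a 4-lane `f32` vector and a "less than" comparison are assumed. The idea it models is that when the condition of a bitselect comes from a lane-wise compare, every lane of the condition is either all ones or all zeros, so the bitwise select gives the same result as picking whole lanes under a one-bit-per-lane mask, which is roughly what a mask-producing compare followed by a vector merge does.

```rust
// Hypothetical scalar model of a 4-lane f32 vector; not Cranelift code.
type Lanes = [f32; 4];

/// Models a lane-wise `fcmp lt` whose result is expanded to a full-width
/// mask: all ones in a lane where `a < b`, all zeros otherwise.
fn fcmp_lt_mask(a: Lanes, b: Lanes) -> [u32; 4] {
    std::array::from_fn(|i| if a[i] < b[i] { u32::MAX } else { 0 })
}

/// Models `bitselect`: `(c & x) | (!c & y)` on the raw bits of each lane.
fn bitselect(c: [u32; 4], x: Lanes, y: Lanes) -> Lanes {
    std::array::from_fn(|i| {
        f32::from_bits((c[i] & x[i].to_bits()) | (!c[i] & y[i].to_bits()))
    })
}

/// Models the optimized form: a compare that yields one bit per lane,
/// followed by a merge that picks a whole lane from `x` or `y`.
fn cmp_then_merge(a: Lanes, b: Lanes, x: Lanes, y: Lanes) -> Lanes {
    let mask: [bool; 4] = std::array::from_fn(|i| a[i] < b[i]);
    std::array::from_fn(|i| if mask[i] { x[i] } else { y[i] })
}
```

In this model, `bitselect(fcmp_lt_mask(a, b), x, y)` and `cmp_then_merge(a, b, x, y)` agree for every input, including NaN and signed-zero comparisons, because the comparison result is the only thing that feeds the mask.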


Last updated: Nov 22 2024 at 16:03 UTC