alexcrichton opened PR #6496 from alexcrichton:x64-sse2-splat to bytecodealliance:main:
Update the lowerings for 64-bit splats to use `pshufd` instead, as LLVM does.
alexcrichton requested abrown for a review on PR #6496.
alexcrichton requested wasmtime-compiler-reviewers for a review on PR #6496.
alexcrichton updated PR #6496.
alexcrichton updated PR #6496.
alexcrichton updated PR #6496.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
Something feels off here: I look at the top of the file and see `has_ssse3` but then here `movddup` is not used. Is it that the rule ordering should be reversed? I.e., the `movddup` rule should be higher priority than the `pshufd` rule?
alexcrichton created PR review comment:
Ah yes, this is intended. I ended up changing the translation of splat from the previous `movddup` to `pshufd` in the default case. I did that because it seems it's what LLVM does for me in Rust, but I don't know if there's a specific reason to use `movddup` vs `pshufd` (happy to add lowerings for both of course).
abrown created PR review comment:
Ok, looking into this more I think this is the right change: we use `PSHUFD` for the XMM-to-XMM splats (slightly higher throughput) and `MOVDDUP` for the MEM-to-XMM splats (slightly lower latency and higher throughput).
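For readers following along, the two instructions discussed above can be illustrated with a small software model. This is only a sketch and not the Cranelift lowering itself: the `Xmm` type and the helpers `splat64_from_register` and `splat64_from_memory` are hypothetical names made up for illustration, and the `0b01_00_01_00` (0x44) immediate is an assumption about which `pshufd` shuffle yields a 64-bit splat, not something stated in the PR.

```rust
/// Software model of a 128-bit XMM register as four 32-bit lanes (lane 0 is the lowest).
type Xmm = [u32; 4];

/// Model of `pshufd`: each 2-bit field of `imm8` selects the source lane
/// that is copied into the corresponding destination lane.
fn pshufd(src: Xmm, imm8: u8) -> Xmm {
    let mut dst = [0u32; 4];
    for (i, lane) in dst.iter_mut().enumerate() {
        let sel = ((imm8 >> (2 * i)) & 0b11) as usize;
        *lane = src[sel];
    }
    dst
}

/// Model of `movddup`: duplicate a 64-bit value into both halves of the register.
fn movddup(src_low64: u64) -> Xmm {
    let lo = src_low64 as u32;
    let hi = (src_low64 >> 32) as u32;
    [lo, hi, lo, hi]
}

/// Hypothetical helper: 64-bit splat of a value already in a register.
/// The assumed immediate 0b01_00_01_00 picks lanes [0, 1, 0, 1].
fn splat64_from_register(src: Xmm) -> Xmm {
    pshufd(src, 0b01_00_01_00)
}

/// Hypothetical helper: 64-bit splat of a value loaded from memory.
fn splat64_from_memory(mem: &u64) -> Xmm {
    movddup(*mem)
}
```

In this model both helpers produce the same lanes for the same low 64 bits, so the choice between them is purely about the latency and throughput of the real instructions for register versus memory operands.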
abrown submitted PR review.
abrown merged PR #6496.
Last updated: Jan 24 2025 at 00:11 UTC