Stream: git-wasmtime

Topic: wasmtime / PR #6323 x64: Add non-SSE 4.1 lowerings for `i...


view this post on Zulip Wasmtime GitHub notifications bot (May 02 2023 at 02:07):

alexcrichton requested abrown for a review on PR #6323.

view this post on Zulip Wasmtime GitHub notifications bot (May 02 2023 at 02:07):

alexcrichton opened PR #6323 from alexcrichton:x64-insertlane-sse2 to bytecodealliance:main:

This commit avoids the use of pinsr* when SSE 4.1 is enabled by using alternative means of inserting values into vectors.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 02 2023 at 02:07):

alexcrichton requested wasmtime-compiler-reviewers for a review on PR #6323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 17:09):

abrown submitted PR review:

LGTM but see comments... (Sorry about the delay; I had my comments pending and forgot to submit it).

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 17:09):

abrown submitted PR review:

LGTM but see comments... (Sorry about the delay; I had my comments pending and forgot to submit it).

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 17:09):

abrown created PR review comment:

Before I look at the rest of the PR, I just wanted to recall something that bit me long ago: if moving from another XMM register, the bottom lane is merged with the rest; but if loading from memory, the upper lanes are zeroed. We just need to make sure Cranelift doesn't load-coalesce into this instruction.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 17:09):

abrown created PR review comment:

The way this reads, we shuffle the lane: n >> 2? Why? The shuffle immediates seem to line up with the original n, not n >> 2.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 17:09):

abrown created PR review comment:

So no load-coalescing after all, right?

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 18:08):

alexcrichton updated PR #6323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 18:08):

alexcrichton created PR review comment:

Indeed! The x64_movss_regmove helper has a doc block above it for x64_movsd_regmove covering why load coalesing is disallowed there.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 18:09):

alexcrichton created PR review comment:

I've updated the doc block above this to hopefully help clear up, but the low 2 bits of n are where in a 32-bit lane the value goes and the upper 2 bits are which 32-bit lane is chosen for the final pshufd (which the documentation was definitely not at all clearly saying)

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 18:09):

alexcrichton has enabled auto merge for PR #6323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2023 at 19:25):

alexcrichton merged PR #6323.


Last updated: Jan 24 2025 at 00:11 UTC