Stream: git-wasmtime

Topic: wasmtime / PR #5931 x64: Add more support for more AVX in...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2023 at 22:33):

alexcrichton opened PR #5931 from x64-more-avx to main:

One aspect of AVX that I have just recently become aware of is that there's apparently a performance penalty associated with mixing AVX and SSE intructions. One reason for this is that AVX instructions which operate over 128-bit values always zero the higher-than-128-bits of each register operate on. SSE instructions, however, don't do this. This means that false dependencies can be created between instructions because SSE instructions look like they're intentionally preserving higher bits where AVX instructions intentionally zero them. According to this stackoverflow question the processor also tracks whether an instruction has been executed and there's a "scary red line" for mixing AVX/SSE.

On the local meshoptimizer benchmark this PR doesn't actually have any effect on the generate code's performance, or not one that I can measure. In that sense this is more of a hygiene thing than anything else.

Specifically the changes here were to refactor many ISLE helpers that were generating instructions with SseOpcode.XXX manually to instead using the instruction helpers which will use the AVX variant if enabled. Additionally more AVX instructions were added for moving data to/from memory and such.

I don't think this 100% handles all the SSE instructions cranelift can generate when AVX is enabled, but it at least raises the bar further and removes a bunch of cases of SSE-generated instructions when AVX is enabled.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 08 2023 at 20:49):

alexcrichton requested abrown for a review on PR #5931.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 20:26):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 20:26):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 20:26):

abrown created PR review comment:

Maybe a rename?

      (xmm_movrm_sse (SseOpcode.Movdqu) addr data))

Back when I was adding things like this, I tried to keep a convention of x64_<instruction> but it's been a while.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 20:26):

abrown created PR review comment:

A rename might help here:

(rule 1 (x64_xor_vector $F32 x y) (x64_xorps x y))

Not sure about the x64_ part (see my other comment), but I don't think this is exclusively SSE anymore and so we could go with xor_vector to distinguish from the scalar version.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 22:41):

alexcrichton updated PR #5931 from x64-more-avx to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 22:42):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 22:42):

alexcrichton created PR review comment:

I opted to drop the _sse suffix since the other xmm_* helpers don't have that, although it might be good to go back and rename Xmm* to Sse and Xmm*Vex to Avx* perhaps.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 22:42):

alexcrichton has enabled auto merge for PR #5931.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 23:24):

alexcrichton updated PR #5931 from x64-more-avx to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 23:25):

alexcrichton has disabled auto merge for PR #5931.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 09 2023 at 23:40):

alexcrichton has enabled auto merge for PR #5931.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 00:42):

alexcrichton merged PR #5931.


Last updated: Nov 22 2024 at 16:03 UTC