wasmtime / PR #5982 x64: Improve memory support in `{inse... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #5982 x64: Improve memory support in `{inse...

Wasmtime GitHub notifications bot (Mar 10 2023 at 18:22):

alexcrichton opened PR #5982 from pextr to main:

This commit improves adds support to Cranelift to emit pextr{b,w,d,q} with a memory destination, merging a store-of-extract operation into one instruction. Additionally AVX support is added for the pextr* instructions.

I've additionally tried to ensure that codegen tests and runtests exist for all forms of these instructions too.

Wasmtime GitHub notifications bot (Mar 10 2023 at 18:22):

alexcrichton requested abrown for a review on PR #5982.

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown created PR review comment:

This is 0b11101110... is that right?

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown created PR review comment:

Looking at the other cases, I started to worry that one of these instructions actually zeroes bits we do not want to zero. I think it is this case; from the MOVSD documentation:

Legacy version: When the source and destination operands are XMM registers, bits MAXVL:64 of the destination operand remains unchanged. When the source operand is a memory location and destination operand is an XMM registers, the quadword at bits 127:64 of the destination operand is cleared to all 0s, bits MAXVL:128 of the destination operand remains unchanged.

VEX and EVEX encoded register-register syntax: Moves a scalar double precision floating-point value from the second source operand (the third operand) to the low quadword element of the destination operand (the first operand). Bits 127:64 of the destination operand are copied from the first source operand (the second operand). Bits (MAXVL-1:128) of the corresponding destination register are zeroed

So it is the legacy SSE case where a movsd 0(%rdi) .... would zero too many bits but actually in the AVX code we could merge these two vmovsd into one. Is there a way to do that?

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown created PR review comment:

Missing comma after $1?

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown created PR review comment:

I checked and u8_from_uimm8 does not guarantee that n is in the right range (e.g., 0 to 15 here). Did you see any validation that would prevent wrong n here at the CLIF level or do we rely exclusively on the Wasm-level construction? If there is nothing and it ends up being too tricky to do here in ISLE, maybe emit.rs could gain some assert!s or something like that.

Wasmtime GitHub notifications bot (Mar 10 2023 at 23:52):

abrown created PR review comment:

:+1:

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:21):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:21):

alexcrichton created PR review comment:

I think this is the block in the verifier which validates that lanes are always inbounds, so I think we should be good on that front. I can add some extra asserts though to the backend too.

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:22):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:22):

alexcrichton created PR review comment:

Indeed!

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:28):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:28):

alexcrichton created PR review comment:

Going by this which I realize isn't official but has been what I've been using so far, my read is that VEX.128.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3 has different semantics than VEX.128.F2.0F 10 /r: VMOVSD xmm1, m64 so I think that AVX and SSE match here where if you use the register-to-register form it preserves the upper bits but if you use the memory-to-register form it always zeros the upper bits, so I don't think we can fuse?

I'll admit though that this is all real subtle and if the official docs are different I wouldn't be too surprised.

Wasmtime GitHub notifications bot (Mar 11 2023 at 02:31):

alexcrichton updated PR #5982 from pextr to main.

Wasmtime GitHub notifications bot (Mar 13 2023 at 16:27):

alexcrichton updated PR #5982 from pextr to main.

Wasmtime GitHub notifications bot (Mar 13 2023 at 20:21):

alexcrichton merged PR #5982.

Last updated: Apr 18 2025 at 16:03 UTC