Stream: git-wasmtime

Topic: wasmtime / PR #9853 pulley: Implement integer vector comp...

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton opened PR #9853 from alexcrichton:pulley-simd-compare to bytecodealliance:main:

More wast tests passing.

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton requested cfallin for a review on PR #9853.

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton requested wasmtime-compiler-reviewers for a review on PR #9853.

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton requested dicej for a review on PR #9853.

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton requested wasmtime-core-reviewers for a review on PR #9853.

Wasmtime GitHub notifications bot (Dec 18 2024 at 18:41):

alexcrichton requested wasmtime-default-reviewers for a review on PR #9853.

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:09):

cfallin submitted PR review:

Thanks! Looks right to me; I checked over most of this for copy-pastos but am additionally trusting the runtests as a backstop. Item of stray curiosity below but nothing to block on.

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:09):

cfallin created PR review comment:

Stray curiosity: did you happen to look if LLVM can autovectorize this? It sure would be neat to have vector op implementations bottom out in native vector instructions when Pulley runs on a SIMD-capable host...

(No worries if not, it's not the main goal, but if it inspires anything then all the better)

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:20):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:20):

alexcrichton created PR review comment:

Heh I've been double-checking this along the way for most of the simd opcodes. The good news is yes! LLVM does a pretty good job at auto-vectorizing all these methods.

For example vaddi32x4 looks like this:

0000000000000000 <_ZN97_$LT$pulley_interpreter..interp..Interpreter$u20$as$u20$pulley_interpreter..decode..OpVisitor$GT$9vaddi32x417h24bc21fe57c19519E>:
   0:   48 8b 07                mov    (%rdi),%rax
   3:   89 f1                   mov    %esi,%ecx
   5:   40 0f b6 d6             movzbl %sil,%edx
   9:   c1 ee 04                shr    $0x4,%esi
   c:   81 e6 f0 0f 00 00       and    $0xff0,%esi
  12:   c1 e9 0c                shr    $0xc,%ecx
  15:   81 e1 f0 0f 00 00       and    $0xff0,%ecx
  1b:   c1 e2 04                shl    $0x4,%edx
  1e:   c5 f9 6f 04 30          vmovdqa (%rax,%rsi,1),%xmm0
  23:   c5 f9 fe 04 08          vpaddd (%rax,%rcx,1),%xmm0,%xmm0
  28:   c5 f9 7f 04 10          vmovdqa %xmm0,(%rax,%rdx,1)
  2d:   31 c0                   xor    %eax,%eax
  2f:   c3                      ret

and the method here looks like this:

0000000000000000 <_ZN105_$LT$pulley_interpreter..interp..Interpreter$u20$as$u20$pulley_interpreter..decode..ExtendedOpVisitor$GT$7veq8x1617h73aa3ce30a2d51abE>:
   0:   48 8b 07                mov    (%rdi),%rax
   3:   89 f1                   mov    %esi,%ecx
   5:   40 0f b6 d6             movzbl %sil,%edx
   9:   c1 ee 04                shr    $0x4,%esi
   c:   81 e6 f0 0f 00 00       and    $0xff0,%esi
  12:   c1 e9 0c                shr    $0xc,%ecx
  15:   81 e1 f0 0f 00 00       and    $0xff0,%ecx
  1b:   c1 e2 04                shl    $0x4,%edx
  1e:   c5 f9 6f 04 30          vmovdqa (%rax,%rsi,1),%xmm0
  23:   c5 f9 74 04 08          vpcmpeqb (%rax,%rcx,1),%xmm0,%xmm0
  28:   c5 f9 7f 04 10          vmovdqa %xmm0,(%rax,%rdx,1)
  2d:   31 c0                   xor    %eax,%eax
  2f:   c3                      ret
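
For reference, a minimal sketch of the kind of lane-wise source that can compile down to a single vector instruction like the ones in the listings. The function names mirror the opcodes, but the bodies are hypothetical stand-ins written for illustration, not Pulley's actual implementation, and whether LLVM vectorizes them depends on the target and optimization level:

```rust
/// Hypothetical stand-in for a lane-wise 32-bit add over a 128-bit vector.
/// Wrapping arithmetic matches the modular semantics of a wasm `i32x4.add`.
fn vaddi32x4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}

/// Hypothetical stand-in for a lane-wise byte equality comparison.
/// Each result lane is all ones (0xff) when the inputs match and zero otherwise.
fn veq8x16(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    std::array::from_fn(|i| if a[i] == b[i] { 0xff } else { 0x00 })
}
```

Because every lane is computed independently with a fixed trip count, an optimizer is free to replace the whole loop with one native vector operation on a SIMD-capable host.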

Most of the instructions in both listings decode BinaryOperands<VReg>, which packs three 5-bit register indices into a 16-bit value. The vector operation itself lowers to a single instruction (vpaddd or vpcmpeqb), so the lowering is pretty close to optimal.
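
To illustrate the packing, here is a minimal sketch of three 5-bit indices stored in a 16-bit value. The struct name, field names, and bit order (dst in the low bits) are assumptions made for this example and may not match Pulley's real encoding:

```rust
/// Hypothetical operand triple; each field is a register index in 0..32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PackedOperands {
    dst: u8,
    src1: u8,
    src2: u8,
}

/// Mask selecting the low 5 bits of a value.
const FIELD_MASK: u16 = 0x1f;

/// Pack the three indices into bits 0..5, 5..10, and 10..15 (assumed layout).
/// Out-of-range inputs are truncated to 5 bits.
fn pack(ops: PackedOperands) -> u16 {
    (ops.dst as u16 & FIELD_MASK)
        | ((ops.src1 as u16 & FIELD_MASK) << 5)
        | ((ops.src2 as u16 & FIELD_MASK) << 10)
}

/// Recover the three indices with shifts and masks.
fn unpack(bits: u16) -> PackedOperands {
    PackedOperands {
        dst: (bits & FIELD_MASK) as u8,
        src1: ((bits >> 5) & FIELD_MASK) as u8,
        src2: ((bits >> 10) & FIELD_MASK) as u8,
    }
}
```

Decoding costs a shift and a mask per field, which is the same flavor of work as the shr/and/shl sequence that precedes the vector instruction in the disassembly.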

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:21):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:21):

cfallin created PR review comment:

Nice, that's great!

Wasmtime GitHub notifications bot (Dec 18 2024 at 19:28):

cfallin merged PR #9853.

Last updated: Apr 16 2025 at 22:03 UTC