Stream: git-wasmtime

Topic: wasmtime / issue #3945 Cranelift: extra `movdqa` when low...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 18 2022 at 18:28):

abrown opened issue #3945:

#3886 ports CLIF icmp operations to ISLE but in the process added new, unnecessary moves to the lowering:

https://github.com/bytecodealliance/wasmtime/blob/e92cbfb28383fdd53b5a48bdd99f4ea48c57fa78/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L61-L64

The two initial movdqa instructions are not needed with the proper use of registers; my current thinking is that ISLE's move elision may not be working as it should but this could be due to re-ordering of operands in a lowering rule that ISLE cannot resolve.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 18 2022 at 18:28):

abrown commented on issue #3945:

This may be similar to #3744.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 23 2022 at 20:10):

alexcrichton labeled issue #3945:

#3886 ports CLIF icmp operations to ISLE but in the process added new, unnecessary moves to the lowering:

https://github.com/bytecodealliance/wasmtime/blob/e92cbfb28383fdd53b5a48bdd99f4ea48c57fa78/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L61-L64

The two initial movdqa instructions are not needed with the proper use of registers; my current thinking is that ISLE's move elision may not be working as it should but this could be due to re-ordering of operands in a lowering rule that ISLE cannot resolve.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 23 2022 at 20:10):

alexcrichton labeled issue #3945:

#3886 ports CLIF icmp operations to ISLE but in the process added new, unnecessary moves to the lowering:

https://github.com/bytecodealliance/wasmtime/blob/e92cbfb28383fdd53b5a48bdd99f4ea48c57fa78/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L61-L64

The two initial movdqa instructions are not needed with the proper use of registers; my current thinking is that ISLE's move elision may not be working as it should but this could be due to re-ordering of operands in a lowering rule that ISLE cannot resolve.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 23 2022 at 20:10):

alexcrichton labeled issue #3945:

#3886 ports CLIF icmp operations to ISLE but in the process added new, unnecessary moves to the lowering:

https://github.com/bytecodealliance/wasmtime/blob/e92cbfb28383fdd53b5a48bdd99f4ea48c57fa78/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L61-L64

The two initial movdqa instructions are not needed with the proper use of registers; my current thinking is that ISLE's move elision may not be working as it should but this could be due to re-ordering of operands in a lowering rule that ISLE cannot resolve.

view this post on Zulip Wasmtime GitHub notifications bot (May 03 2022 at 15:43):

alexcrichton commented on issue #3945:

With the new regalloc the current test looks like:

https://github.com/bytecodealliance/wasmtime/blob/f85047b0842d26b4ac9797356a130c7697ec934c/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L47-L49

I'm not familiar enough with these instructions to know whether this is fixed though (there's still a single movdqa)

view this post on Zulip Wasmtime GitHub notifications bot (May 03 2022 at 16:12):

abrown commented on issue #3945:

I mean... this just looks wrong: I can never quite remember Cranelift's order of operands when using the x64 backend but our use of % makes this look like AT&T syntax (so destination on the right). %xmm0 and %xmm1 are probably used for ABI arguments so it looks like we copy %xmm0 to %xmm5 in order to use the value twice--so far so good. But then the problem: I think what we need to do is to a) find the maximum between %xmm5 and %xmm1 and put that into %xmm5 and b) equality-compare %xmm0 and %xmm5 and put that into %xmm0 for the return. but the last two instructions there don't quite say that! Looks like it should be:

pmaxsw %xmm5, %xmm1, %xmm5
pcmpeqw %xmm0, %xmm5, %xmm0

But even this is a bit silly because those SIMD instructions aren't really being used in their three-operand form.


@alexcrichton, I think this issue is fixed though :grinning_face_with_smiling_eyes:. The extra move is gone but we should resolve this debug output question...

view this post on Zulip Wasmtime GitHub notifications bot (May 03 2022 at 16:13):

abrown edited a comment on issue #3945:

I mean... this just looks wrong: I can never quite remember Cranelift's order of operands when using the x64 backend but our use of % makes this look like AT&T syntax (i.e., destination on the right). %xmm0 and %xmm1 are probably used for ABI arguments so it looks like we copy %xmm0 to %xmm5 in order to use the value twice--so far so good. But then the problem: I think what we need to do is to a) find the maximum between %xmm5 and %xmm1 and put that into %xmm5 and b) equality-compare %xmm0 and %xmm5 and put that into %xmm0 for the return. but the last two instructions there don't quite say that! Looks like it should be:

pmaxsw %xmm5, %xmm1, %xmm5
pcmpeqw %xmm0, %xmm5, %xmm0

But even this is a bit silly because those SIMD instructions aren't really being used in their three-operand form.


@alexcrichton, I think this issue is fixed though :grinning_face_with_smiling_eyes:. The extra move is gone but we should resolve this debug output question...

view this post on Zulip Wasmtime GitHub notifications bot (May 03 2022 at 16:26):

cfallin commented on issue #3945:

@abrown the output of clif-util compile on the function is:

Disassembly of 21 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   66 0f 6f e8             movdqa  xmm5, xmm0
   8:   66 0f ee e9             pmaxsw  xmm5, xmm1
   c:   66 0f 75 c5             pcmpeqw xmm0, xmm5
  10:   48 89 ec                mov     rsp, rbp
  13:   5d                      pop     rbp
  14:   c3                      ret

so this looks right in actuality (result is put in xmm0). I think it's a pretty-printing bug (argument swap) actually -- PR incoming to fix that.

The extra move is indeed elided now, so I think we can close this particular issue.

view this post on Zulip Wasmtime GitHub notifications bot (May 03 2022 at 16:26):

cfallin closed issue #3945:

#3886 ports CLIF icmp operations to ISLE but in the process added new, unnecessary moves to the lowering:

https://github.com/bytecodealliance/wasmtime/blob/e92cbfb28383fdd53b5a48bdd99f4ea48c57fa78/cranelift/filetests/filetests/isa/x64/simd-comparison-legalize.clif#L61-L64

The two initial movdqa instructions are not needed with the proper use of registers; my current thinking is that ISLE's move elision may not be working as it should but this could be due to re-ordering of operands in a lowering rule that ISLE cannot resolve.


Last updated: Jan 24 2025 at 00:11 UTC