Stream: git-wasmtime

Topic: wasmtime / PR #4704 x64: Lower fcvt_to_{u,s}int{,_sat} in...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:15):

elliottt edited PR #4704 from trevor/x64-fcvt-to-uint to main:

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:17):

elliottt updated PR #4704 from trevor/x64-fcvt-to-uint to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:20):

elliottt edited PR #4704 from trevor/x64-fcvt-to-uint to main:

Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.

This work uncovered the fact that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from F32X4 to I32X4. I didn't want to tackle implementing those here, and have preserved the current behavior instead.
<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:21):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:21):

elliottt created PR review comment:

Looking through the diff to the precise-output tests, it seems like regalloc2 is introducing an unnecessary move for the lowering of fcvt_to_sint_sat when given vector-typed arguments. I'm not sure how easy this would be to remove, but it might warrant some investigation.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:23):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:23):

elliottt created PR review comment:

The move on line 468 seems unnecessary, as line 467 could be cvttps2dq %xmm0, %xmm0 instead.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:24):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:24):

elliottt created PR review comment:

I was having trouble figuring out how to make a zero in a register, as the following snippet caused a panic in regalloc2:

(tmp2 WritableXmm (temp_writable_xmm))
(tmp2 Xmm (x64_pxor tmp2, tmp2))

It seems that using a temp register without first having assigned to it is not expected. Is there an easy way to make a zero that I'm missing?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:34):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:35):

elliottt edited PR #4704 from trevor/x64-fcvt-to-uint to main:

Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.

I realized while porting this lowering that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from F32X4 to I32X4. I didn't want to tackle implementing those here, and have preserved the current behavior instead.
<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 18:47):

elliottt has marked PR #4704 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2022 at 19:00):

elliottt edited PR #4704 from trevor/x64-fcvt-to-uint to main:

Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.

I realized while porting this lowering that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from F32X4 to I32X4. I didn't want to tackle implementing those here, and have preserved the current behavior instead. That bug is tracked in #4693.
<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 15:57):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 15:57):

cfallin created PR review comment:

The pxor instruction takes both args as sources, so expects them to be defined; so regalloc2 is correctly panic'ing here that there is a use of an undefined value.

The issue is that we haven't special-cased the "xor of any value with itself gives zero" semantics of the instruction: RA2 doesn't know (can't know) that this particular instruction is invariant to its input, so it's fine to use an undefined value.

I think we do special-case this for at least one other xor variant (XmmUninitializedConst enum arm, IIRC?) -- we'd want to do the same for pxor here.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 16:15):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 16:15):

fitzgen created PR review comment:

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/isa/x64/inst/mod.rs#L1725-L1726

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/isa/x64/inst/mod.rs#L770

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:40):

elliottt updated PR #4704 from trevor/x64-fcvt-to-uint to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:46):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:46):

elliottt created PR review comment:

I rebased on #4709 to pull in the changes to produces_const, and it removed some unnecessary movs as a side-benefit :D

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:47):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:47):

elliottt created PR review comment:

This move was indeed unnecesary, and rebasing on #4709 to fix the interaction between the ISLE xmm_rm_r constructor and produces_const removed it.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 18:48):

elliottt deleted PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 19:08):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 19:08):

cfallin created PR review comment:

Is this TODO still active?

In general it's good to understand why spurious moves occur, but in some cases rearranging the ops causes things to shift somewhat and it happens; especially when the old handwritten code was making multiple defs on one temp which forced values into the same location (a sort of accidental constraint). If we get one extra move in a long conversion sequence it's not the end of the world, IMHO.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 19:08):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 19:53):

elliottt updated PR #4704 from trevor/x64-fcvt-to-uint to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 21:59):

elliottt updated PR #4704 from trevor/x64-fcvt-to-uint to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:45):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:45):

elliottt created PR review comment:

If it's not bad to include that move there then I'll remove the TODO :+1:

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:45):

elliottt created PR review comment:

Removing the TODO: it's not a big deal to add a single move here.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:45):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:46):

elliottt created PR review comment:

We might be able to remove this in the future, but for now the additional move doesn't seem too costly.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:46):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:48):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:48):

elliottt created PR review comment:

I've also rebased this on #4714, as it turns out that treating the cmpps with the same source registers is not correct. As we couldn't reliably add the move instruction in the translation, we couldn't avoid using a random value in the xmm register chosen for the cmpps instruction. As a result, we would get spurious errors, as it would compare random values against themselves, and sometimes those random values would be NaN.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:48):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:50):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 15 2022 at 22:51):

elliottt updated PR #4704 from trevor/x64-fcvt-to-uint to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 16:03):

elliottt merged PR #4704.


Last updated: Oct 23 2024 at 20:03 UTC