elliottt edited PR #4704 from trevor/x64-fcvt-to-uint
to main
:
- Add tests for fcvt_to_uint
- Lower fcvt_to_{u,s}int{,_sat}
- Add tests for fcvt_to_uint with vector arguments
- Lower fcvt_to_uint for f32x4
- Lower
fcvt_to_uint_sat
in ISLE<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt updated PR #4704 from trevor/x64-fcvt-to-uint
to main
.
elliottt edited PR #4704 from trevor/x64-fcvt-to-uint
to main
:
Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.
This work uncovered the fact that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from
F32X4
toI32X4
. I didn't want to tackle implementing those here, and have preserved the current behavior instead.
<!--Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt submitted PR review.
elliottt created PR review comment:
Looking through the diff to the precise-output tests, it seems like regalloc2 is introducing an unnecessary move for the lowering of
fcvt_to_sint_sat
when given vector-typed arguments. I'm not sure how easy this would be to remove, but it might warrant some investigation.
elliottt submitted PR review.
elliottt created PR review comment:
The move on line 468 seems unnecessary, as line 467 could be
cvttps2dq %xmm0, %xmm0
instead.
elliottt submitted PR review.
elliottt created PR review comment:
I was having trouble figuring out how to make a zero in a register, as the following snippet caused a panic in regalloc2:
(tmp2 WritableXmm (temp_writable_xmm)) (tmp2 Xmm (x64_pxor tmp2, tmp2))
It seems that using a temp register without first having assigned to it is not expected. Is there an easy way to make a zero that I'm missing?
elliottt edited PR review comment.
elliottt edited PR #4704 from trevor/x64-fcvt-to-uint
to main
:
Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.
I realized while porting this lowering that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from
F32X4
toI32X4
. I didn't want to tackle implementing those here, and have preserved the current behavior instead.
<!--Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt has marked PR #4704 as ready for review.
elliottt edited PR #4704 from trevor/x64-fcvt-to-uint
to main
:
Migrate the lowering for the four instructions fcvt_to_{u,s}int{,_sat} to ISLE.
I realized while porting this lowering that we currently don't support the unsaturating versions of both instructions for vector types, and that we currently only support vector conversions from
F32X4
toI32X4
. I didn't want to tackle implementing those here, and have preserved the current behavior instead. That bug is tracked in #4693.
<!--Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR review.
cfallin created PR review comment:
The
pxor
instruction takes both args as sources, so expects them to be defined; so regalloc2 is correctly panic'ing here that there is a use of an undefined value.The issue is that we haven't special-cased the "xor of any value with itself gives zero" semantics of the instruction: RA2 doesn't know (can't know) that this particular instruction is invariant to its input, so it's fine to use an undefined value.
I think we do special-case this for at least one other xor variant (XmmUninitializedConst enum arm, IIRC?) -- we'd want to do the same for pxor here.
fitzgen submitted PR review.
fitzgen created PR review comment:
elliottt updated PR #4704 from trevor/x64-fcvt-to-uint
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
I rebased on #4709 to pull in the changes to
produces_const
, and it removed some unnecessarymov
s as a side-benefit :D
elliottt submitted PR review.
elliottt created PR review comment:
This move was indeed unnecesary, and rebasing on #4709 to fix the interaction between the ISLE
xmm_rm_r
constructor andproduces_const
removed it.
elliottt deleted PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
Is this TODO still active?
In general it's good to understand why spurious moves occur, but in some cases rearranging the ops causes things to shift somewhat and it happens; especially when the old handwritten code was making multiple defs on one temp which forced values into the same location (a sort of accidental constraint). If we get one extra move in a long conversion sequence it's not the end of the world, IMHO.
cfallin submitted PR review.
elliottt updated PR #4704 from trevor/x64-fcvt-to-uint
to main
.
elliottt updated PR #4704 from trevor/x64-fcvt-to-uint
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
If it's not bad to include that move there then I'll remove the TODO :+1:
elliottt created PR review comment:
Removing the TODO: it's not a big deal to add a single move here.
elliottt submitted PR review.
elliottt created PR review comment:
We might be able to remove this in the future, but for now the additional move doesn't seem too costly.
elliottt submitted PR review.
elliottt submitted PR review.
elliottt created PR review comment:
I've also rebased this on #4714, as it turns out that treating the
cmpps
with the same source registers is not correct. As we couldn't reliably add the move instruction in the translation, we couldn't avoid using a random value in the xmm register chosen for thecmpps
instruction. As a result, we would get spurious errors, as it would compare random values against themselves, and sometimes those random values would beNaN
.
elliottt edited PR review comment.
elliottt edited PR review comment.
elliottt updated PR #4704 from trevor/x64-fcvt-to-uint
to main
.
elliottt merged PR #4704.
Last updated: Jan 24 2025 at 00:11 UTC