elliottt opened PR #4722 from trevor/x64-widen
to main
:
Lower
uwiden_high
,uwiden_low
,swiden_high
,swiden_low
,snarrow
, andunarrow
in ISLE.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt has marked PR #4722 as ready for review.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
I know this is pre-existing but it seems to me we should clarify this before we merge; what exactly does this mean? Is there something in the Wasm translator that generates code with an explicitly incorrect type?
Likewise ignoring the second argument below seems outright wrong: if the semantics of
snarrow
are that it combines lanes of the two vectors, the lanes ofb
should show up somewhere.I suspect this is a weird shoehorning of different semantics into a particular combo of ops (ie, the frontend generates exactly this shape, and the backend agrees that it means something different than what the actual instruction semantics, composed, would mean. We should fix this if so as it's a correctness bug waiting to happen!
A way to find out if this is the case is: what happens if we remove this specialization? We have generic lowerings for
snarrow
andfcvt_to_sint_sat
; do those two lowerings, composed together, give a correct result for whatever code is meant to hit this rule?cc @abrown @jlb6740 for more thoughts on this (you may remember more history here?)
elliottt created PR review comment:
I think this comment from
lower.rs
might shed some light: it seems like this combination ofsnarrow
andfcvt_to_sint_sat
is implementing the behavior oftrunc_sat_f64x2
:
https://github.com/bytecodealliance/wasmtime/blob/0a71df6a37b4f116eeb7e041ca434a544c5463bb/cranelift/codegen/src/isa/x64/lower.rs#L701-L707
elliottt submitted PR review.
elliottt edited PR review comment.
elliottt created PR review comment:
Here's where the instructions that this special case handles are introduced:
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L1789-L1795The second argument to
snarrow
is always all zeros, so perhaps we could check that as well to better catch this case?
elliottt submitted PR review.
cfallin created PR review comment:
I still feel like I don't fully understand what's going on here; unfortunately that comment doesn't clarify much for me. Why is there a type mismatch if we're composing the float-to-int conversion (which produces an I64x2) with an int-to-int narrow? In other words what change is the TODO proposing in the instruction definitions?
I'm also curious whether removing this special case results in correct execution still; if not then that's a sign that we're trying to fit extra meaning into the combo here that shouldn't exist.
cfallin submitted PR review.
elliottt edited PR review comment.
elliottt submitted PR review.
elliottt created PR review comment:
I think the comment that I migrated over on line 3282 is suggesting that the intent of this lowering is to lower the trunc_sat instance I mentioned above, thus the type mismatch as trunc_sat takes a floating point number and produces an integer. The comment from lower.rs backs this up by suggesting that this is the translation of
i32x4.trunc_sat_f64x2_s_zero
, and the translation of that operation produces the sequence that this rule matches.I think that if we implement the signed and unsigned narrowings for
I64X2
we could probably remove this special case and have the general case work, but just removing it now will cause test failures. It might be worth keeping the special case if it turns out that it avoids some unnecessary work, andi32x4.trunc_sat_f64x2_{s,u}_zero
is a common enough operation.
elliottt updated PR #4722 from trevor/x64-widen
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
I've clarified the purpose of the special case, and restricted it further to cases where the rhs of the
snarrow
is a zero vector which should match the code generated by the front end. I've also filed #4734 about the missingi64x2
cases forsnarrow
andunarrow
.
elliottt requested cfallin for a review on PR #4722.
cfallin submitted PR review.
elliottt merged PR #4722.
Last updated: Dec 23 2024 at 12:05 UTC