jlb6740 opened PR #2783 from implement_64_packed_promotelow_demote
to main
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 requested abrown for a review on PR #2783.
jlb6740 requested cfallin and abrown for a review on PR #2783.
abrown submitted PR Review.
abrown created PR Review Comment:
For this one, I think it does make sense to keep the
_low
since it indicates which lanes get promoted, but I would drop thev
since the_low
already implies lanes.
abrown created PR Review Comment:
I would suggest trying to fit this under
fdemote
: the semantics are very close and we could document onfdemote
that, when used on vector types, it's conversions are placed in the lowest lanes. E.g., when demoting lanes 0 and 1 of anf64x2
, the results are placed in lanes 0 and 1 of thef32x4
). We will also want to document that lanes 2 and 3 are zeroed out.
abrown submitted PR Review.
abrown created PR Review Comment:
This allows
fpromote.f64x2
andfpromote_low.f64x2
to compile to the same x64 instruction, which I don't think is intended. Perhaps assert that the vectors haveop == Opcode::FpromoteLow
and the scalars haveop == Opcode::Fpromote
. (This comment also applies below if you don't do the "fvdemote -> fdemote" renaming).
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 submitted PR review.
jlb6740 created PR review comment:
Yes, I don't disagree about combining fdemote and fvdemote feeling like the natural thing to do. In fact I tried that first as but doing that failed due to the constraint requiring the lane sizes remain the same. Trying to support both both f32.demote and f32x4.demote_f64x2_zero with fdemote would require us to make this instruction definition less specific. In that case comments about zeroing lanes would not apply to the general instruction here as it only applies when lowering f32x4.demote_f64x2_zero? If not combined then certainly adding that zeroing part makes sense. Even though the language is there and seemly necessary in instruction.rs, frankly I don't really see the concept of lanes with the definition of f32.demote. But that's a different subject. What do you think about the implication of making the fdemote instruction comments less specific if we are trying to cover the two different lowerings?
jlb6740 submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
Considering only the lower half of the register, the low lanes in `x` are interpreted as single precision floats that are then converted to double precision floats.
abrown submitted PR review.
abrown created PR review comment:
I think we still need to think about this comment...
abrown created PR review comment:
I also see
.constraints(vec![WiderOrEq(Float.clone(), FloatTo.clone())]),
used in a few places, which might be a more generic approach.
abrown created PR review comment:
let FxN = &TypeVar::new( "FxN", "A vector floating point number", TypeSetBuilder::new() .floats(Interval::All) .simd_lanes(Interval::All) .includes_scalars(false) .build(), ); let x = &Operand::new("x", FxN); let a = &Operand::new("a", FxN.merge_lanes());
I think if we want to be more precise with the types here we should use a "vector float" type (e.g. build a new
FxN
) and tell the type system using.merge_lanes()
that the result will have the same number of bits, just half the number of lanes, like we do in the documentation.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 submitted PR review.
jlb6740 created PR review comment:
Reverted back to having a separate lowering for fvdemote after some offline discussion with @abrown. The impact of adding a constraint was not clear. For example, I thought if adding the constraint:
.constraints(vec![WiderOrEq(Float.clone(), FloatTo.clone())]),
this meant that x needed to be wider or equal to a. When I did this the spec test passed as expected. However, when I reversed the constraint to say
.constraints(vec![WiderOrEq(FloatTo.clone(), Float.clone())]),
This also passed. For this reason I just left it out which is what the original patch submission did (as written in the notes).
jlb6740 submitted PR review.
jlb6740 created PR review comment:
Note the original patch used fvpromote_low. This was to be consistent with fvdemote. However there was a desire to rename fvpromote_low to fpromote_low and also git rid of fvdemote (combine with fdemote). Now that fvdemote is back, the naming is inconsistent here but that may be OK.
jlb6740 edited PR review comment.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
Same type comments apply here.
abrown created PR review comment:
"fvpromote_low",
abrown created PR review comment:
We need to make sure that we constrain this so that the following is not possible:
v1 = fconst.f64x2 ... v2 = fvdemote.f32 v1
Right now this looks possible, but we shouldn't be able to demote from a vector to a scalar (doesn't make sense). If we add the
WiderOrEq(x, a)
constraint, though, the types allowed should be, e.g.,f64x2
->f32x2
, butf32x2
isn't really a valid type in WebAssembly (it is in CLIF).I think we should use the
.merge_lanes()
approach here so that the types aref64x2
->f32x4
and then in the instruction description we have to specify that the top two lanes are zeroed out. Or alternately, just explicitly put the types in.
abrown created PR review comment:
With Fvdemote there is no constraint on having the result type have the same
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
jlb6740 submitted PR review.
jlb6740 created PR review comment:
I've created new types with explicit lanes. Interesting enough after doing this I had to update the state push call in code_translator.rs. I also could have used split_lanes here though that would have been less constrained and semantically speaking just sounds vague on what that implies.
jlb6740 updated PR #2783 from implement_64_packed_promotelow_demote
to main
.
abrown submitted PR review.
cfallin submitted PR review.
abrown merged PR #2783.
Last updated: Dec 23 2024 at 12:05 UTC