abrown opened PR #1990 from `trunc-sat-unsigned-again` to `main`:

This replaces #1822; it consists of the same functionality but removes the AVX512 instruction lowering for the time being. There are two reasons for this:
- the default MXCSR rounding is round to nearest even, which does not match the semantics required by `i32x4.trunc_sat_f32x4_u`. We can then use embedded rounding control but lose the ability to specify the vector length, so the instruction would operate on 512 bits, which we should discuss (@sunfishcode has reported issues with 512-bit vectors in Spidermonkey)
- the output of `VCVTPS2UDQ` for negative lanes is `0xFFFFFFFF` (I had thought it would be `0x00000000`); this can be resolved with the following sequence: `v0 = pxor ...; v2 = fcmp gte v1, v0 (gte ensures they are ordered); v3 = vcvtps2udq v1; v4 = band v2, v3`. However, I would like to look at this a little bit more before submitting a separate PR for it (this is the reason for keeping the legalization in `enc_tables.rs` and under `narrow_avx`, BTW).
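
The masking sequence from the second bullet can be modelled lane by lane. This is a minimal scalar sketch in Rust, not the Cranelift lowering. The names `model_vcvtps2udq_lane`, `fixed_lane`, and `fixed_x4` are hypothetical. The model assumes truncating conversion and treats every negative, NaN, or too-large lane as producing `0xFFFFFFFF`, which simplifies the behavior reported in the PR. Rust's `as u32` cast saturates and maps NaN to 0, so it can serve as a reference for the intended saturating result.

```rust
/// Hypothetical scalar model of one `VCVTPS2UDQ` lane (an assumption, not
/// the exact hardware definition): lanes that cannot be represented as u32
/// (negative, NaN, or >= 2^32) yield 0xFFFFFFFF, as reported in the PR.
fn model_vcvtps2udq_lane(x: f32) -> u32 {
    if x.is_nan() || x < 0.0 || x >= 4_294_967_296.0 {
        0xFFFF_FFFF
    } else {
        // In-range value: assume truncation toward zero.
        x as u32
    }
}

/// One lane of the sequence from the PR description:
/// v0 = pxor ...; v2 = fcmp gte v1, v0; v3 = vcvtps2udq v1; v4 = band v2, v3
fn fixed_lane(v1: f32) -> u32 {
    // v0: all-zero lane, as produced by xor-ing a register with itself.
    let v0 = 0.0_f32;
    // v2: all ones when v1 >= v0. The comparison is false for NaN, so
    // unordered lanes get an all-zero mask.
    let v2: u32 = if v1 >= v0 { 0xFFFF_FFFF } else { 0 };
    // v3: raw conversion, which is 0xFFFFFFFF for negative and NaN lanes.
    let v3 = model_vcvtps2udq_lane(v1);
    // v4: the mask clears negative and NaN lanes to zero.
    v2 & v3
}

/// Applies the per-lane model to a four-lane vector.
fn fixed_x4(v: [f32; 4]) -> [u32; 4] {
    v.map(fixed_lane)
}
```

Under this model, `fixed_x4([-1.0, 0.5, 2.9, f32::NAN])` gives `[0, 0, 2, 0]`, the same values `as u32` produces per lane, while a lane above the `u32` range keeps `0xFFFFFFFF` because its mask is all ones.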
abrown requested julian-seward1 for a review on PR #1990.
abrown updated PR #1990 from `trunc-sat-unsigned-again` to `main`.
julian-seward1 submitted PR Review.
abrown updated PR #1990 from `trunc-sat-unsigned-again` to `main`.
abrown merged PR #1990.
Last updated: Dec 23 2024 at 12:05 UTC