Stream: git-wasmtime

Topic: wasmtime / PR #4722 x64: Lower widening and narrowing ope...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 19:45):

elliottt opened PR #4722 from trevor/x64-widen to main:

Lower uwiden_high, uwiden_low, swiden_high, swiden_low, snarrow, and unarrow in ISLE.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 20:24):

elliottt has marked PR #4722 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:52):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:52):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:52):

cfallin created PR review comment:

I know this is pre-existing but it seems to me we should clarify this before we merge; what exactly does this mean? Is there something in the Wasm translator that generates code with an explicitly incorrect type?

Likewise ignoring the second argument below seems outright wrong: if the semantics of snarrow are that it combines lanes of the two vectors, the lanes of b should show up somewhere.

I suspect this is a weird shoehorning of different semantics into a particular combo of ops (ie, the frontend generates exactly this shape, and the backend agrees that it means something different than what the actual instruction semantics, composed, would mean. We should fix this if so as it's a correctness bug waiting to happen!

A way to find out if this is the case is: what happens if we remove this specialization? We have generic lowerings for snarrow and fcvt_to_sint_sat; do those two lowerings, composed together, give a correct result for whatever code is meant to hit this rule?

cc @abrown @jlb6740 for more thoughts on this (you may remember more history here?)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:58):

elliottt created PR review comment:

I think this comment from lower.rs might shed some light: it seems like this combination of snarrow and fcvt_to_sint_sat is implementing the behavior of trunc_sat_f64x2:
https://github.com/bytecodealliance/wasmtime/blob/0a71df6a37b4f116eeb7e041ca434a544c5463bb/cranelift/codegen/src/isa/x64/lower.rs#L701-L707

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:58):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 23:59):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:07):

elliottt created PR review comment:

Here's where the instructions that this special case handles are introduced:
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L1789-L1795

The second argument to snarrow is always all zeros, so perhaps we could check that as well to better catch this case?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:07):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:10):

cfallin created PR review comment:

I still feel like I don't fully understand what's going on here; unfortunately that comment doesn't clarify much for me. Why is there a type mismatch if we're composing the float-to-int conversion (which produces an I64x2) with an int-to-int narrow? In other words what change is the TODO proposing in the instruction definitions?

I'm also curious whether removing this special case results in correct execution still; if not then that's a sign that we're trying to fit extra meaning into the combo here that shouldn't exist.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:10):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:11):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:36):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 00:36):

elliottt created PR review comment:

I think the comment that I migrated over on line 3282 is suggesting that the intent of this lowering is to lower the trunc_sat instance I mentioned above, thus the type mismatch as trunc_sat takes a floating point number and produces an integer. The comment from lower.rs backs this up by suggesting that this is the translation of i32x4.trunc_sat_f64x2_s_zero, and the translation of that operation produces the sequence that this rule matches.

I think that if we implement the signed and unsigned narrowings for I64X2 we could probably remove this special case and have the general case work, but just removing it now will cause test failures. It might be worth keeping the special case if it turns out that it avoids some unnecessary work, and i32x4.trunc_sat_f64x2_{s,u}_zero is a common enough operation.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 17:56):

elliottt updated PR #4722 from trevor/x64-widen to main.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 17:58):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 17:58):

elliottt created PR review comment:

I've clarified the purpose of the special case, and restricted it further to cases where the rhs of the snarrow is a zero vector which should match the code generated by the front end. I've also filed #4734 about the missing i64x2 cases for snarrow and unarrow.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 17:59):

elliottt requested cfallin for a review on PR #4722.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 18:34):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2022 at 18:53):

elliottt merged PR #4722.


Last updated: Jan 24 2025 at 00:11 UTC