abrown requested cfallin and julian-seward1 for a review on PR #2489.
abrown opened PR #2489 from load-zero
to main
:
Like #2480, this should not be merged until #2432 is resolved and x64 Wasm SIMD spec tests are turned back on. The
simd_load_zero.wast
test is now in-tree since #2481 but will have to be enabled in thebuild.rs
file.
abrown requested cfallin and julian-seward1 for a review on PR #2489.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
nit: missing trailing newline
abrown updated PR #2489 from load-zero
to main
:
Like #2480, this should not be merged until #2432 is resolved and x64 Wasm SIMD spec tests are turned back on. The
simd_load_zero.wast
test is now in-tree since #2481 but will have to be enabled in thebuild.rs
file.
abrown updated PR #2489 from load-zero
to main
:
Like #2480, this should not be merged until #2432 is resolved and x64 Wasm SIMD spec tests are turned back on. The
simd_load_zero.wast
test is now in-tree since #2481 but will have to be enabled in thebuild.rs
file.
abrown updated PR #2489 from load-zero
to main
:
Like #2480, this should not be merged until #2432 is resolved and x64 Wasm SIMD spec tests are turned back on. The
simd_load_zero.wast
test is now in-tree since #2481 but will have to be enabled in thebuild.rs
file.
abrown has marked PR #2489 as ready for review.
cfallin submitted PR Review.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Did you work out what the issue was with the earlier version of this test that could not coalesce?
abrown submitted PR Review.
abrown created PR Review Comment:
I debug-ed to here and realized that indeed the insertion of the block was preventing
input_to_reg_mem
from seeing the input as mergeable. I realized I didn't understand how the coloring is working, but my hazy mental model makes me think that the logic should not be "allow sinking if the color is +1" but rather "allow sinking if the color is 0 or +1". I mean, I haven't looked too closely at the logic, but from the comments insrc/machinst/lower.rs
, I would expect the following code to coalesce, but it does not:function %load32_zero(i64) -> i32x4 { block0(v0: i64): v1 = load.i32 v0 v2 = scalar_to_vector.i32x4 v1 jump block1 block1: return v2 }
In any case, the optimization here is a separate issue than implementing the load*_zero instruction itself.
cfallin submitted PR Review.
cfallin created PR Review Comment:
So the reason for the +1 correction is that the test is comparing the input color of the possibly-merged load to the color at the merging point; and because loads are effectful (in this model), the output color of the load is input+1. So the test is really just "output color of load is current color". That said, it's possible that the new-block increment is happening at the wrong place such that we disallow this case -- can debug further in a bit...
cfallin submitted PR Review.
cfallin created PR Review Comment:
(To clarify the above: I'm fine with this PR merging as-is, but please feel free to create a tracking issue for this coalescing case if still unsolved!)
abrown created PR Review Comment:
Ok, opened #2562 to track this; will merge as-is to keep things moving along.
abrown submitted PR Review.
abrown merged PR #2489.
Last updated: Jan 24 2025 at 00:11 UTC