meithecatte added the bug label to Issue #8959.
meithecatte added the cranelift label to Issue #8959.
meithecatte opened issue #8959:
.clif
Test Casetest optimize set opt_level=speed target x86_64 function %f(i32, i32, i32) -> i32 { block0(v0: i32, v1: i32, v2: i32): v3 = imul.i32 v0, v1 brif v2, block1, block2 block1: return v3 block2: v4 = iconst.i32 1 v5 = iadd.i32 v3, v4 return v5 } ;; check that the two occurrences of v3 are still the same value after optimization ; check: return v3 ; check: iadd.i32 v3, v4
Steps to Reproduce
clif-util test meow.clif
Expected Results
The
imul
stays inblock0
.Actual Results
The
imul
is needlessly duplicated betweenblock1
andblock2
.Versions and Environment
Cranelift version or commit: de29ce35982eba2060709491309ccc93b397168c (main at the time of writing)
Operating system: Linux
Architecture: x86_64
cfallin commented on issue #8959:
That's true, but unfortunately I can't see a low-cost of way avoiding this: one would have to ask "is this value demanded in every subtree of the domtree from this point", at every point, and move it there; in the extreme, requiring an upward-pass on the domtree. Remember that we're reconstructing code placement as we recreate the CFG, we don't have any notion that there was a single
imul
in the parent block previously.I would consider this an issue in the same category as the "instruction scheduling" general concern (#6159 / #6260): the current algorithm is working as designed and produces correct code, but better heuristics could be better.
I'm curious if you have an idea for a better algorithm here?
Last updated: Dec 23 2024 at 12:05 UTC