cfallin opened PR #4061 from lowering-unique-use
to main
:
This PR addresses the longstanding issue with loads trying to merge
into compares on x86-64, and more generally, with the lowering
framework falsely recognizing "single uses" of one op by
another (which would normally allow merging of side-effecting ops like
loads) when there is indirect duplication.To fix this, we replace the direct
value_uses
count with a
transitive notion of uniqueness (not unlike Rust's&
/&mut
and how
a&mut
downgrades to&
when accessed through another&
!). A
value is used multiple times transitively if it has multiple direct
uses, or is used by another op that is used multiple times
transitively.The canonical example of badness is:
v1 := load v2 := ifcmp v1, ... v3 := selectif v2, ... v4 := selectif v2, ...
both
v3
andv4
effectively merge theifcmp
(v2
), so even
though the use ofv1
is "unique", it is codegenned twice. This is
why wecan't have nice thingscan't merge loads into
compares (#3953).There is quite a subtle and interesting design space around this
problem and how we might solve it. See the long doc-comment on
ValueUseState
in this PR for more justification for the particular
design here. In particular, this design deliberately simplifies a bit
relative to an "optimal" solution: some uses can become unique
depending on merging, but we don't design our data structures for such
updates because that would require significant extra costly
tracking (some sort of transitive refcounting). For example, in the
above, ifselectif
somehow did not mergeifcmp
, then we would only
codegen theifcmp
once into its result register (and use that
register twice); then the load is uniquely used, and could be
merged. But that requires transitioning from "multiple use" back to
"unique use" with careful tracking as we do pattern-matching, which
I've chosen to make out-of-scope here for now. In practice, I don't
think it will matter too much (and we can always improve later).With this PR, we can now re-enable load-op merging for compares.
The first commit makes the core lowering logic change, while the
second commit makes the x64 backend changes; the last two commits
update filetests.@fitzgen, I'll note in particular that this differs from the color-based
approach I had described earlier today to you; I realized that doesn't
quite work, and I think the above is probably the best we can do without
significantly more heavyweight bookkeeping. Let me know if you've got
any better ideas though!
cfallin requested fitzgen for a review on PR #4061.
cfallin requested abrown for a review on PR #4061.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
I think the comments here can be removed.
abrown created PR review comment:
/// into flags users (`selectif`, `trueif`, branches) and can
abrown created PR review comment:
/// codegen-ed in multiple places, where it is incorporated as a
I don't even know if this is more grammatically correct.
cfallin updated PR #4061 from lowering-unique-use
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
I went with
codegen'd
. Someday dictionaries will catch up with compiler engineer jargon and tell us the right answer; until then that seems right to me :-)
cfallin submitted PR review.
cfallin created PR review comment:
Fixed!
cfallin created PR review comment:
Fixed!
cfallin submitted PR review.
cfallin updated PR #4061 from lowering-unique-use
to main
.
cfallin updated PR #4061 from lowering-unique-use
to main
.
cfallin updated PR #4061 from lowering-unique-use
to main
.
cfallin updated PR #4061 from lowering-unique-use
to main
.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
let mark_all_uses_as_multiple = |value_ir_uses: &mut SecondaryMap<Value, ValueUseState>,
I think this might make it more clear.
abrown created PR review comment:
// a DFS. We iterate over all instructions and mark their args
abrown created PR review comment:
So the invariant is that if
value
is used multiple times, the inputs[i]
to the instruction that createdvalue
are also used multiple times. Perhaps we should add adebug_assert!
here that at least the next level up[i]
are indeed already marked as multiple, ifvalue
has them. Everything looks correct but it might be smart to add something like this to keep things correct.
abrown created PR review comment:
As I considered what this might look like, I started to wonder as well if we even need the stack structure? Why not use the actual stack instead?
fn mark_all... (value_ir_uses: &mut SecondaryMap<...>, value: Value, f: ...) { if let ValueDef::Result(src_inst, _) = f.dfg.value_def(value) { for input in f.dfg.inst_args(src_inst).iter() { if value_ir_uses[input] == ValueUseState::Multiple { debug_assert!(...); // seems like we could factor out the ValueDef::Result / inst_args.iter into a helper and use it here } else { mark_all...(value_ir_uses, value); } } }
Not sure it works with mutability rules and all that; just thought I would mention it.
cfallin updated PR #4061 from lowering-unique-use
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
Added the
debug_assert
, thanks!Re: explicit stack vs. recursion, I wrote it this way to avoid running out of stack when long chains exist in the input... preemptively addressing a future fuzzbug I guess :-) This is a practice I picked up in SpiderMonkey (no recursion bounded by user-controlled input sizes) and seems like a good practice in general that I think we follow in most places.
cfallin merged PR #4061.
Last updated: Dec 23 2024 at 12:05 UTC