wasmtime / PR #4061 Cranelift: fix #3953: rework single/m... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #4061 Cranelift: fix #3953: rework single/m...

Wasmtime GitHub notifications bot (Apr 21 2022 at 05:47):

cfallin opened PR #4061 from lowering-unique-use to main:

This PR addresses the longstanding issue with loads trying to merge
into compares on x86-64, and more generally, with the lowering
framework falsely recognizing "single uses" of one op by
another (which would normally allow merging of side-effecting ops like
loads) when there is indirect duplication.

To fix this, we replace the direct value_uses count with a
transitive notion of uniqueness (not unlike Rust's &/&mut and how
a &mut downgrades to & when accessed through another &!). A
value is used multiple times transitively if it has multiple direct
uses, or is used by another op that is used multiple times
transitively.

The canonical example of badness is:
    v1 := load
    v2 := ifcmp v1, ...
    v3 := selectif v2, ...
    v4 := selectif v2, ...
both v3 and v4 effectively merge the ifcmp (v2), so even
though the use of v1 is "unique", it is codegenned twice. This is
why we ~~can't have nice things~~ can't merge loads into
compares (#3953).

There is quite a subtle and interesting design space around this
problem and how we might solve it. See the long doc-comment on
ValueUseState in this PR for more justification for the particular
design here. In particular, this design deliberately simplifies a bit
relative to an "optimal" solution: some uses can become unique
depending on merging, but we don't design our data structures for such
updates because that would require significant extra costly
tracking (some sort of transitive refcounting). For example, in the
above, if selectif somehow did not merge ifcmp, then we would only
codegen the ifcmp once into its result register (and use that
register twice); then the load is uniquely used, and could be
merged. But that requires transitioning from "multiple use" back to
"unique use" with careful tracking as we do pattern-matching, which
I've chosen to make out-of-scope here for now. In practice, I don't
think it will matter too much (and we can always improve later).

With this PR, we can now re-enable load-op merging for compares.
The first commit makes the core lowering logic change, while the
second commit makes the x64 backend changes; the last two commits
update filetests.

@fitzgen, I'll note in particular that this differs from the color-based
approach I had described earlier today to you; I realized that doesn't
quite work, and I think the above is probably the best we can do without
significantly more heavyweight bookkeeping. Let me know if you've got
any better ideas though!

Wasmtime GitHub notifications bot (Apr 21 2022 at 05:47):

cfallin requested fitzgen for a review on PR #4061.

Wasmtime GitHub notifications bot (Apr 21 2022 at 05:47):

cfallin requested abrown for a review on PR #4061.

Wasmtime GitHub notifications bot (Apr 21 2022 at 15:54):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Apr 21 2022 at 15:54):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Apr 21 2022 at 15:54):

abrown created PR review comment:

I think the comments here can be removed.

Wasmtime GitHub notifications bot (Apr 21 2022 at 15:54):

abrown created PR review comment:

/// into flags users (`selectif`, `trueif`, branches) and can

Wasmtime GitHub notifications bot (Apr 21 2022 at 15:54):

abrown created PR review comment:

/// codegen-ed in multiple places, where it is incorporated as a
I don't even know if this is more grammatically correct.

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:24):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin created PR review comment:

I went with codegen'd. Someday dictionaries will catch up with compiler engineer jargon and tell us the right answer; until then that seems right to me :-)

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin created PR review comment:

Fixed!

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin created PR review comment:

Fixed!

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Apr 21 2022 at 19:25):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 22 2022 at 01:23):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 22 2022 at 01:40):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 22 2022 at 17:58):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown created PR review comment:

        let mark_all_uses_as_multiple = |value_ir_uses: &mut SecondaryMap<Value, ValueUseState>,

I think this might make it more clear.

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown created PR review comment:

        // a DFS. We iterate over all instructions and mark their args

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown created PR review comment:

So the invariant is that if value is used multiple times, the inputs [i] to the instruction that created value are also used multiple times. Perhaps we should add a debug_assert! here that at least the next level up [i] are indeed already marked as multiple, if value has them. Everything looks correct but it might be smart to add something like this to keep things correct.

Wasmtime GitHub notifications bot (Apr 22 2022 at 22:35):

abrown created PR review comment:

As I considered what this might look like, I started to wonder as well if we even need the stack structure? Why not use the actual stack instead?

fn mark_all... (value_ir_uses: &mut SecondaryMap<...>, value: Value, f: ...) {
  if let ValueDef::Result(src_inst, _) = f.dfg.value_def(value) {
    for input in f.dfg.inst_args(src_inst).iter() {
      if value_ir_uses[input] == ValueUseState::Multiple {
        debug_assert!(...); // seems like we could factor out the ValueDef::Result / inst_args.iter into a helper and use it here
      } else {
        mark_all...(value_ir_uses, value);
      }
  }
}

Not sure it works with mutability rules and all that; just thought I would mention it.

Wasmtime GitHub notifications bot (Apr 23 2022 at 00:30):

cfallin updated PR #4061 from lowering-unique-use to main.

Wasmtime GitHub notifications bot (Apr 23 2022 at 00:32):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Apr 23 2022 at 00:32):

cfallin created PR review comment:

Added the debug_assert, thanks!

Re: explicit stack vs. recursion, I wrote it this way to avoid running out of stack when long chains exist in the input... preemptively addressing a future fuzzbug I guess :-) This is a practice I picked up in SpiderMonkey (no recursion bounded by user-controlled input sizes) and seems like a good practice in general that I think we follow in most places.

Wasmtime GitHub notifications bot (Apr 23 2022 at 01:00):

cfallin merged PR #4061.

Last updated: Apr 18 2025 at 19:03 UTC