Stream: git-wasmtime

Topic: wasmtime / PR #4163 Add a basic alias analysis with redun...


view this post on Zulip Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin requested fitzgen for a review on PR #4163.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin opened PR #4163 from alias-analysis to main:

This PR adds a basic alias analysis, and optimizations that use it.
This is a "mid-end optimization": it operates on CLIF, the
machine-independent IR, before lowering occurs.

The alias analysis (or maybe more properly, a sort of memory-value
analysis) determines when it can prove a particular memory
location is equal to a given SSA value, and when it can, it replaces any
loads of that location.

This subsumes two common optimizations:

Both of these optimizations rely on a "last store" analysis that is a
sort of coloring mechanism, split across disjoint categories of abstract
state. The basic idea is that every memory-accessing operation is put
into one of N disjoint categories; it is disallowed for memory to ever
be accessed by an op in one category and later accessed by an op in
another category. (The frontend must ensure this.)

Then, given this, we scan the code and determine, for each
memory-accessing op, when a single prior instruction is a store to the
same category. This "colors" the instruction: it is, in a sense, a
static name for that version of memory.

This analysis provides an important invariant: if two operations access
memory with the same last-store, then no other store can alias in the
time between that last store and these operations. This must-not-alias
property, together with a check that the accessed address is *exactly
the same* (same SSA value and offset), and other attributes of the
access (type, extension mode) are the same, let us prove that the
results are the same.

Given last-store info, we scan the instructions and build a table from
"memory location" key (last store, address, offset, type, extension) to
known SSA value stored in that location. A store inserts a new mapping.
A load may also insert a new mapping, if we didn't already have one.
Then when a load occurs and an entry already exists for its "location",
we can reuse the value. This will be either RLE or St-to-Ld depending on
where the value came from.

Note that this does work across basic blocks: the last-store analysis
is a full iterative dataflow pass, and we are careful to check dominance
of a previously-defined value before aliasing to it at a potentially
redundant load. So we will do the right thing if we only have a
"partially redundant" load (loaded already but only in one predecessor
block), but we will also correctly reuse a value if there is a store or
load above a loop and a redundant load of that value within the loop, as
long as no potentially-aliasing stores happen within the loop.

Fixes #4131.

Passes tests and runs SpiderMonkey correctly locally; benchmarks TBD.

Creating this as a draft for early feedback; will likely refine the comments
and explanations a bit more, and benchmark this.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 05:56):

bjorn3 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 05:56):

bjorn3 created PR review comment:

It would be nice if all stack slots would be their own category until their address is leaked using stack_addr.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 06:16):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 06:16):

cfallin created PR review comment:

Sure, that would be nice, but entails an escape analysis, which is outside the scope of this PR. Happy to discuss further (and review PRs!) once this is in, of course.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 09:02):

bjorn3 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 09:02):

bjorn3 created PR review comment:

Even stopping optimization entirely if stack_addr is used on a stack slot would likely help cg_clif a lot. Having each stack slot whose address is never leaked be optimized independently is the most important for cg_clif. I'm fine with leaving that for a later PR.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 16:46):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 18:44):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 18:44):

cfallin has marked PR #4163 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:15):

lqd submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:15):

lqd created PR review comment:

Tiny typo

//! to the abstract state category occurred, *and* no other trapping

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

For clarity:

    /// Accesses only the "heap" part of abstract state. Used for alias

Same for the other abstract state parts.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

Slightly unfortunate that we can't easily differentiate between two different memories or two different tables. I don't think it will matter too much in practice, but unfortunate none the less.

File a follow up issue for having an alias analysis bit for each memory/table instead of lumping all memories and all tables together?

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

I think this already exists as https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/instructions/enum.InstructionData.html#method.memflags no?

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

Could you add some comments about what cases this is testing and what is expected? That will be really helpful for folks updating tests for unrelated reasons and trying to determine whether the new precise output is valid or not. Also useful for reviewers trying to figure out which cases do or don't have tests ;)

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

These are becoming very verbose. Can we make set_table et al take and return Self so that we can chain the methods like

ir::MemFlags::trusted().set_table()

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

I think these tests would be a lot easier to read as CLIF to CLIF tests. That way we don't have to figure out what changes are from unrelated passes or just related to aarch64-specific lowering, or even have to know aarch64 to understand these tests that are logically portable and architecture-independent.

Adding test alias_analysis that just does alias analysis followed by GVN by calling Context::remove_redundant_loads should be really straightforward. Should be able to pretty much copy+paste the test simple_gvn implementation:

https://github.com/bytecodealliance/wasmtime/blob/0a0c232a14e651f1457ced0ad0c950771de2f3c3/cranelift/filetests/src/test_simple_gvn.rs

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

File a follow up issue to track/discuss dead-store elimination?

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin created PR review comment:

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin created PR review comment:

That's a great idea, thanks -- done!

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin created PR review comment:

Done (here and in other new tests)!

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin created PR review comment:

Ah yes, I had missed that; using that one now, thanks.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin created PR review comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:14):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:14):

cfallin created PR review comment:

Sure, good idea; I added with_<flag> methods to MemFlags.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:19):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:19):

cfallin created PR review comment:

Sure! There are efficiency reasons why this is not trivial -- bits in a bitset (MemFlags is a u8, but even with a u32 or u64 that's still a small fixed number of heaps/tables), and the vector of last-stores (does it scale arbitrarily?). But maybe there's a hybrid approach (small number of privileged heaps/tables) or maybe we can find a way to scale it well enough.

Filed #4166.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:23):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2022 at 23:23):

cfallin created PR review comment:

Sure, #4167.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Can we add a test for loads of the same address but of different types? I.e. something like

v1 = load.i32 v0
v2 = load.f32 v0
;; check: ... the second load was not eliminated ...

Probably would be good to test signed vs unsigned extensions and 8-to-32 vs 16-to-32 extensions as well.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Can we also have tests for the other fence-semantics instructions too? Just tiny functions for each of them like

v1 = load.i32 v0
<fence instruction>
v2 = load.i32 v0
;; check: ...

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

It might make sense to reduce indentation here, since this is basically the rest of the function:

let (address, offset, ty) = match inst_addr_offset_type(pos.func, inst) {
    None => continue,
    Some(x) => x,
};

// ...

Totally up to you.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Just to double check ourselves

fn get_ext_opcode(op: Opcode) -> Option<Opcode> {
    debug_assert!(op.can_load() || op.can_store());
    match op {

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin updated PR #4163 from alias-analysis to main.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin created PR review comment:

Done! Added filetests/alias/fence.clif.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin created PR review comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin created PR review comment:

Done! Added filetests/alias/extends.clif.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:40):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 19:40):

cfallin created PR review comment:

Ah, there's actually a final call to update_state below the long if-let, which this would skip, and factoring the body out is a bit complex because of mutable borrows of self. Left it as-is for now, but thanks in any case!

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2022 at 20:19):

cfallin merged PR #4163.


Last updated: Oct 23 2024 at 20:03 UTC