wasmtime / PR #4163 Add a basic alias analysis with redun... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #4163 Add a basic alias analysis with redun...

Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin requested fitzgen for a review on PR #4163.

Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin opened PR #4163 from alias-analysis to main:

This PR adds a basic alias analysis, and optimizations that use it.
This is a "mid-end optimization": it operates on CLIF, the
machine-independent IR, before lowering occurs.

The alias analysis (or maybe more properly, a sort of memory-value
analysis) determines when it can prove a particular memory
location is equal to a given SSA value, and when it can, it replaces any
loads of that location.

This subsumes two common optimizations:

Redundant load elimination: when the same memory address is loaded two
times, and it can be proven that no intervening operations will write
to that memory, then the second load is redundant and its result
must be the same as the first. We can use the first load's result and
remove the second load.

Store-to-load forwarding: when a load can be proven to access exactly
the memory written by a preceding store, we can replace the load's
result with the store's data operand, and remove the load.

Both of these optimizations rely on a "last store" analysis that is a
sort of coloring mechanism, split across disjoint categories of abstract
state. The basic idea is that every memory-accessing operation is put
into one of N disjoint categories; it is disallowed for memory to ever
be accessed by an op in one category and later accessed by an op in
another category. (The frontend must ensure this.)

Then, given this, we scan the code and determine, for each
memory-accessing op, when a single prior instruction is a store to the
same category. This "colors" the instruction: it is, in a sense, a
static name for that version of memory.

This analysis provides an important invariant: if two operations access
memory with the same last-store, then no other store can alias in the
time between that last store and these operations. This must-not-alias
property, together with a check that the accessed address is *exactly
the same* (same SSA value and offset), and other attributes of the
access (type, extension mode) are the same, let us prove that the
results are the same.

Given last-store info, we scan the instructions and build a table from
"memory location" key (last store, address, offset, type, extension) to
known SSA value stored in that location. A store inserts a new mapping.
A load may also insert a new mapping, if we didn't already have one.
Then when a load occurs and an entry already exists for its "location",
we can reuse the value. This will be either RLE or St-to-Ld depending on
where the value came from.

Note that this does work across basic blocks: the last-store analysis
is a full iterative dataflow pass, and we are careful to check dominance
of a previously-defined value before aliasing to it at a potentially
redundant load. So we will do the right thing if we only have a
"partially redundant" load (loaded already but only in one predecessor
block), but we will also correctly reuse a value if there is a store or
load above a loop and a redundant load of that value within the loop, as
long as no potentially-aliasing stores happen within the loop.

Fixes #4131.

Passes tests and runs SpiderMonkey correctly locally; benchmarks TBD.

Creating this as a draft for early feedback; will likely refine the comments
and explanations a bit more, and benchmark this.

Wasmtime GitHub notifications bot (May 18 2022 at 23:58):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 05:56):

bjorn3 submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 05:56):

bjorn3 created PR review comment:

It would be nice if all stack slots would be their own category until their address is leaked using stack_addr.

Wasmtime GitHub notifications bot (May 19 2022 at 06:16):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 06:16):

cfallin created PR review comment:

Sure, that would be nice, but entails an escape analysis, which is outside the scope of this PR. Happy to discuss further (and review PRs!) once this is in, of course.

Wasmtime GitHub notifications bot (May 19 2022 at 09:02):

bjorn3 submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 09:02):

bjorn3 created PR review comment:

Even stopping optimization entirely if stack_addr is used on a stack slot would likely help cg_clif a lot. Having each stack slot whose address is never leaked be optimized independently is the most important for cg_clif. I'm fine with leaving that for a later PR.

Wasmtime GitHub notifications bot (May 19 2022 at 16:46):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 18:44):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 18:44):

cfallin has marked PR #4163 as ready for review.

Wasmtime GitHub notifications bot (May 19 2022 at 19:15):

lqd submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 19:15):

lqd created PR review comment:

Tiny typo

//! to the abstract state category occurred, *and* no other trapping

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

For clarity:
    /// Accesses only the "heap" part of abstract state. Used for alias
Same for the other abstract state parts.

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

Slightly unfortunate that we can't easily differentiate between two different memories or two different tables. I don't think it will matter too much in practice, but unfortunate none the less.

File a follow up issue for having an alias analysis bit for each memory/table instead of lumping all memories and all tables together?

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

I think this already exists as https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/instructions/enum.InstructionData.html#method.memflags no?

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

Could you add some comments about what cases this is testing and what is expected? That will be really helpful for folks updating tests for unrelated reasons and trying to determine whether the new precise output is valid or not. Also useful for reviewers trying to figure out which cases do or don't have tests ;)

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

These are becoming very verbose. Can we make set_table et al take and return Self so that we can chain the methods like
ir::MemFlags::trusted().set_table()

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

I think these tests would be a lot easier to read as CLIF to CLIF tests. That way we don't have to figure out what changes are from unrelated passes or just related to aarch64-specific lowering, or even have to know aarch64 to understand these tests that are logically portable and architecture-independent.

Adding test alias_analysis that just does alias analysis followed by GVN by calling Context::remove_redundant_loads should be really straightforward. Should be able to pretty much copy+paste the test simple_gvn implementation:

https://github.com/bytecodealliance/wasmtime/blob/0a0c232a14e651f1457ced0ad0c950771de2f3c3/cranelift/filetests/src/test_simple_gvn.rs

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

fitzgen created PR review comment:

File a follow up issue to track/discuss dead-store elimination?

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 19:27):

cfallin created PR review comment:

Thanks!

Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin created PR review comment:

That's a great idea, thanks -- done!

Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:03):

cfallin created PR review comment:

Done (here and in other new tests)!

Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin created PR review comment:

Ah yes, I had missed that; using that one now, thanks.

Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (May 19 2022 at 23:13):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:14):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:14):

cfallin created PR review comment:

Sure, good idea; I added with_<flag> methods to MemFlags.

Wasmtime GitHub notifications bot (May 19 2022 at 23:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:19):

cfallin created PR review comment:

Sure! There are efficiency reasons why this is not trivial -- bits in a bitset (MemFlags is a u8, but even with a u32 or u64 that's still a small fixed number of heaps/tables), and the vector of last-stores (does it scale arbitrarily?). But maybe there's a hybrid approach (small number of privileged heaps/tables) or maybe we can find a way to scale it well enough.

Filed #4166.

Wasmtime GitHub notifications bot (May 19 2022 at 23:23):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 19 2022 at 23:23):

cfallin created PR review comment:

Sure, #4167.

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Can we add a test for loads of the same address but of different types? I.e. something like
v1 = load.i32 v0
v2 = load.f32 v0
;; check: ... the second load was not eliminated ...
Probably would be good to test signed vs unsigned extensions and 8-to-32 vs 16-to-32 extensions as well.

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Can we also have tests for the other fence-semantics instructions too? Just tiny functions for each of them like
v1 = load.i32 v0
<fence instruction>
v2 = load.i32 v0
;; check: ...

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

It might make sense to reduce indentation here, since this is basically the rest of the function:
let (address, offset, ty) = match inst_addr_offset_type(pos.func, inst) {
    None => continue,
    Some(x) => x,
};

// ...
Totally up to you.

Wasmtime GitHub notifications bot (May 20 2022 at 18:22):

fitzgen created PR review comment:

Just to double check ourselves

fn get_ext_opcode(op: Opcode) -> Option<Opcode> {
    debug_assert!(op.can_load() || op.can_store());
    match op {

Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin updated PR #4163 from alias-analysis to main.

Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 19:38):

cfallin created PR review comment:

Done! Added filetests/alias/fence.clif.

Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 19:39):

cfallin created PR review comment:

Done! Added filetests/alias/extends.clif.

Wasmtime GitHub notifications bot (May 20 2022 at 19:40):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (May 20 2022 at 19:40):

cfallin created PR review comment:

Ah, there's actually a final call to update_state below the long if-let, which this would skip, and factoring the body out is a bit complex because of mutable borrows of self. Left it as-is for now, but thanks in any case!

Wasmtime GitHub notifications bot (May 20 2022 at 20:19):

cfallin merged PR #4163.

Last updated: Apr 16 2025 at 19:03 UTC