cfallin opened PR #27 from cranelift-egraphs
to main
:
This RFC proposes to add a middle-end optimization framework to Cranelift based on e-graphs. A middle-end
optimization framework works to improve the target-independent IR (intermediate representation), CLIF, before Cranelift translates it to machine instructions. Cranelift already has some basic ptimizations in this category: Global Value Numbering (GVN), Loop-Independent Code Motion (LICM), Constant folding, and most recently, a very basic form of alias analysis and redundant-load elimination. However, the current situation falls short in several ways.The RFC will first describe how the current situation suffers from three problems: how to order and interleave these optimizations as we build more of them (the phase-ordering problem), how to ensure they are correct when they are all delicate hand-written code (the verification problem), and how to make it easier to build up a large body of known simplifications without tedious handwritten code. It will then describe how an e-graph-based framework could address all
three of these problems. It will describe some novel contributions in how to encode control flow in e-graphs (necessary in order to use e-graphs to optimize whole function bodies) developed during initial experimentation. Then finally it will discuss how we can use our existing rewrite-rule DSL, ISLE, to describe rewrites on the e-graph.
bjorn3 created PR review comment:
*hash-consing
bjorn3 submitted PR review.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
*function
bjorn3 submitted PR review.
bjorn3 created PR review comment:
*using
eqrion created PR review comment:
I've not had a chance to read this paper yet [1], but do you know if PEG's are another graph-based IR in this category?
eqrion submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Thanks for the link -- I wasn't aware of PEGs previously, but they look to be in the same family as the other approaches! The theta-nodes for loops (see e.g. Fig. 2 in the POPL 2009 paper) indicate to me that these are region-based, like VSDGs, so the same discussion points apply (vs. the retained-CFG-skeleton approach here). I see also in the description that they compute each node at most once, and as a result do partial-redundancy elimination, whereas we take an approach trying to preserve our existing GVN semantics and only compute a value when we know it will be used, which is generally safer from an optimization-always-improves-perf point of view.
jameysharp submitted PR review.
jameysharp created PR review comment:
I haven't checked whether this applies to Program Expression Graphs specifically, but similar dataflow-based IRs like the Value Dependence Graph don't preserve non-termination. A loop that doesn't terminate might not define any values used later and would then be removed, "because CFG values that are not demanded do not appear." Also, the "Equality Saturation" paper you cited has a whole side thing (section 6.1) about handling side-effects like heap access, because that's another challenge for purely dataflow IRs.
This is what distinguishes the Value State Dependence Graph and the newer Regionalized VSDG. Treating state and termination dependencies as additional pseudo-dataflow edges means any topological sort of the graph is a valid serialization.
I assume Chris' approach preserves termination and side effects (at the cost of missing some optimization opportunities that might be safe) but I haven't looked at it carefully enough yet to understand how it does that. From what I've heard so far I suspect it's different than anything I've seen in the literature.
cfallin updated PR #27 from cranelift-egraphs
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, the approach proposed in this RFC preserves both termination and side-effects in a somewhat simple way: by just keeping that part of the CFG around. Or more precisely, it maintains a "skeleton", which is a list of eclass IDs in each basic block of side-effecting ops in original order. This is then used during the scoped elaboration to schedule the pure ops back into the spaces between them. One nice property of this, also, is that we stay close enough to the original CFG that we could actually do the roundtrip in a way that never has to move instruction data (memory traffic is expensive!) except when a node is truly duplicated: the skeleton's instructions are stored in the original
ir::Function
, and the original copy of each e-node remains in their::Function
as well.
cfallin updated PR #27 from cranelift-egraphs
to main
.
avanhatt submitted PR review.
avanhatt submitted PR review.
avanhatt created PR review comment:
Total nit: loop-invariant not loop-independent
avanhatt created PR review comment:
the re-inventing the wheel problem :D
avanhatt created PR review comment:
Nit: if we're being precise, in the egraph itself, the references are to other eclasses, not other enodes. Unless aegraphs differ from egg-style egraphs here?
avanhatt created PR review comment:
Ruler would be another good external cite here: https://www.mwillsey.com/papers/ruler
avanhatt created PR review comment:
Nit: ID not defined at this point
avanhatt created PR review comment:
Might be worth mentioning here that verification results still need to trust the rule application engine, but that in the rule-based world, you get to centralize the core that you need to trust.
avanhatt created PR review comment:
Basic question (sorry): is there an external source that detailed the operations/behavior on a scoped hashmap? When I've implemented GVN in the past it's been with a standard hashmap. I want to think of the scoped hashmap as conceptually equivalent to a stack of hashmaps, but I don't think that's quite right when a value spans multiple scopes.
cfallin updated PR #27 from cranelift-egraphs
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
I just added a paragraph to describe this -- thanks! The basic idea is that it's like a lexical-scoping lookup: inserts add to deepest scope, lookups scan scopes from inner to outer.
cfallin submitted PR review.
cfallin created PR review comment:
Updated, thanks!
cfallin submitted PR review.
cfallin created PR review comment:
Updated, to "same canonical representation"
cfallin created PR review comment:
Added!
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, excellent point.
cfallin submitted PR review.
cfallin created PR review comment:
Fixed!
bjorn3 submitted PR review.
avanhatt submitted PR review.
uweigand submitted PR review.
elliottt submitted PR review.
cfallin merged PR #27.
cfallin edited PR #27 from cranelift-egraphs
to main
:
This RFC proposes to add a middle-end optimization framework to Cranelift based on e-graphs. A middle-end
optimization framework works to improve the target-independent IR (intermediate representation), CLIF, before Cranelift translates it to machine instructions. Cranelift already has some basic ptimizations in this category: Global Value Numbering (GVN), Loop-Independent Code Motion (LICM), Constant folding, and most recently, a very basic form of alias analysis and redundant-load elimination. However, the current situation falls short in several ways.The RFC will first describe how the current situation suffers from three problems: how to order and interleave these optimizations as we build more of them (the phase-ordering problem), how to ensure they are correct when they are all delicate hand-written code (the verification problem), and how to make it easier to build up a large body of known simplifications without tedious handwritten code. It will then describe how an e-graph-based framework could address all
three of these problems. It will describe some novel contributions in how to encode control flow in e-graphs (necessary in order to use e-graphs to optimize whole function bodies) developed during initial experimentation. Then finally it will discuss how we can use our existing rewrite-rule DSL, ISLE, to describe rewrites on the e-graph.
Last updated: Nov 22 2024 at 17:03 UTC