fitzgen opened issue #5181:
It should be enabled by default. We should also define what the requirements we need to meet to enable it by default are.
cc @cfallin @jameysharp
fitzgen labeled issue #5181:
It should be enabled by default. We should also define what the requirements we need to meet to enable it by default are.
cc @cfallin @jameysharp
fitzgen labeled issue #5181:
It should be enabled by default. We should also define what the requirements we need to meet to enable it by default are.
cc @cfallin @jameysharp
jameysharp commented on issue #5181:
As far as I know there are two specific reasons we haven't done this yet:
- Compile speed regresses a little bit with egraphs turned on.
- We were encountering some fuzzing failures.
We have some ideas that might get the egraphs optimizer to parity with the current implementation and we want to play with those.
cfallin commented on issue #5181:
Indeed; to add a little more, the overall approach I want to take is to integrate the egraph data structure more completely into the
DataFlowGraph
. Basically, thePrimaryMap
ofValueDef
s can take the place of the "eclass nodes" incranelift-egraph
currently: alongside the other kinds of defs (instruction output, blockparam, alias), we can have a "union" kind. Then the multi-definition aspect of the aegraph can be represented directly in the CLIF.Then the only other difference is that we can take "pure" insts and not place them in blocks before elaboration. So the egraph-mode becomes:
- Allow multiple definitions of a value in the CLIF, by introducing a "union" kind of value (write this in the CLIF text format as e.g.
v3 <- v1, v2
analogously to aliases currently);- Apply rewrite rules as soon as values are introduced, as we do in egraph-mode today (optimizations-during-build);
- Do not place "pure" instructions in the
Layout
until elaboration. Elaboration consists of (i) choosing a canonical value for each value (a value->value map), and (ii) puttingInst
s in theLayout
, possibly duplicating where needed;- Modifying the hashing in GVN a bit so that we hash insts based on canonical values (via a union-find forest) rather than latest-value-in-eclass values.
There are a bunch of nice outcomes from that (and once I saw it, I realized we basically must go this way, and a separate egraph doesn't make sense anymore):
- Any CLIF is also almost an aegraph, trivially / by construction. (The lack of placement for pure nodes is the only difference).
- There is no copying/translation of information between egraph nodes and CLIF instructions.
- We get all the debugging benefits of the text format and other utilities surrounding CLIF.
I want to build all of this first, and suspect that when I do, the perf regressions we're seeing should go away. The reason for holding back before then is both because of the regression but also because if the design is going to change, we should do that before we rely on it as the default path.
I'm currently 100%-coding-time on another project but will hopefully be able to come back to this sometime soon-ish; @jameysharp and I have talked about collaborating on its implementation as well.
bjorn3 commented on issue #5181:
Would it be possible to replace
Layout
with a different type while the function is in aegraph form, but keepDataFlowGraph
in both cases? This would extend to more alternative representations like VSDG/sea of nodes or something with structured control flow (for eg wasm or spirv generation) with minimal (or no) changes to the core ir while at the same time minimizing code duplication.
cfallin commented on issue #5181:
Yep, we could go further in that direction as well; that's an interesting direction to explore.
cfallin closed issue #5181:
It should be enabled by default. We should also define what the requirements we need to meet to enable it by default are.
cc @cfallin @jameysharp
Last updated: Jan 24 2025 at 00:11 UTC