wasmtime / PR #7922 Remove the use of the union-find stru... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #7922 Remove the use of the union-find stru...

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:12):

elliottt opened PR #7922 from elliottt:trevor/remove-union-find-from-elaborator to bytecodealliance:main:

Remove the UnionFind argument to Elaborator::new, and from the Elaborator structure, relying instead on the value_to_best_value table when computing canonical values.

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Co-authored-by: L. Pereira <lpereira@fastly.com>

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:46):

elliottt edited PR #7922:

Remove the UnionFind argument to Elaborator::new, and from the Elaborator structure, relying instead on the value_to_best_value table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:46):

elliottt edited PR #7922:

Remove the UnionFind argument to Elaborator::new, and from the Elaborator structure, relying instead on the value_to_best_value table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:46):

elliottt has marked PR #7922 as ready for review.

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:46):

elliottt requested cfallin for a review on PR #7922.

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:46):

elliottt requested wasmtime-compiler-reviewers for a review on PR #7922.

Wasmtime GitHub notifications bot (Feb 12 2024 at 23:52):

elliottt edited PR #7922:

Remove the UnionFind argument to Elaborator::new, and from the Elaborator structure, relying instead on the value_to_best_value table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).

Additionally, we compared the assembly produced with and without this change, and found the difference in generated code to be mostly negligible (lea instead of add, or a move between registers added or removed). The difference in the length of the disassembled output was only +15 lines, which out of 2212651 is pretty good.

Wasmtime GitHub notifications bot (Feb 13 2024 at 00:13):

cfallin submitted PR review:

Thanks for taking this on!

It's reassuring to see no changes to the compile-tests' outputs. I remembered one bit of the original thinking after our discussion earlier: IIRC, we had wanted to track elaborated values by canonical-value (union-find result) because otherwise, elaborating a use of an "early" node in the class will not necessarily see a later node that was union'd into the class -- it will result in redundant elaboration.

I suspect this mostly doesn't show up in practice because eager rewriting means that all subsequent uses are of the latest (highest-numbered) rewrite, if any rewrites occurred, so the best-value propagation can give the best for the whole class at that node. But it'd be good to verify that is what you all had been thinking as well, and add a comment describing this somewhere. What do you think?

Wasmtime GitHub notifications bot (Feb 13 2024 at 00:13):

cfallin submitted PR review:

Thanks for taking this on!

It's reassuring to see no changes to the compile-tests' outputs (EDIT: and so little in real code). I remembered one bit of the original thinking after our discussion earlier: IIRC, we had wanted to track elaborated values by canonical-value (union-find result) because otherwise, elaborating a use of an "early" node in the class will not necessarily see a later node that was union'd into the class -- it will result in redundant elaboration.

I suspect this mostly doesn't show up in practice because eager rewriting means that all subsequent uses are of the latest (highest-numbered) rewrite, if any rewrites occurred, so the best-value propagation can give the best for the whole class at that node. But it'd be good to verify that is what you all had been thinking as well, and add a comment describing this somewhere. What do you think?

Wasmtime GitHub notifications bot (Feb 13 2024 at 00:32):

elliottt commented on PR #7922:

The goal that we had in removing the use of the union find structure was to enforce the assumption that it was valid to refer to an inner node of the union tree as a subset of an eclass. Using the union find results from the rewriting pass over the whole function during elaboration may end up merging results from different branches in the dominator tree into the same eclass, which would mean that we might canonicalize to a node higher in the union tree than we originally intended. If that situation occurred, we would be at the mercy of the cost function for correctness. I think that with the changes you proposed to how we manage access to the gvn map as well, we'll be able to relax the new guideline we introduced for using subsume when the RHS forgets values from the LHS in egraph rules.

As to where we would add a comment, I'm not sure the best place to document this. Do we already document our use of sub-trees of the union tree to name subsets of the eclass? I think that's a super elegant implementation detail that's worth documenting, and we could call out careful management of eclasses in the same place.

Wasmtime GitHub notifications bot (Feb 13 2024 at 00:52):

cfallin commented on PR #7922:

OK, yeah, this is confirming to me why we didn't see the scoping issue originally: I think I had been imagining the canonical value to be the "first" and hence dominate all others in the eclass, but, indeed, you're correct that out-of-order canonical values can cause references that require the cost function (or the domtree-range scoping described in #7891) to handle. Thanks for "elaborating" on that!

Perhaps we could add to the comment on value_to_elaborated_value?

Incidentally, this gives me another thought: if we could somehow canonicalize toward dominating values (up the domtree), this problem would also go away. I'm not sure how to do that efficiently though, so I think this plus domtree-range scoping (to avoid delicate subsume rules) feels like the right way to make all this fully robust.

Wasmtime GitHub notifications bot (Feb 13 2024 at 19:05):

elliottt updated PR #7922.

Wasmtime GitHub notifications bot (Feb 13 2024 at 19:59):

elliottt merged PR #7922.

Last updated: Apr 18 2025 at 08:04 UTC