elliottt opened PR #7922 from elliottt:trevor/remove-union-find-from-elaborator
to bytecodealliance:main
:
Remove the UnionFind argument to
Elaborator::new
, and from theElaborator
structure, relying instead on thevalue_to_best_value
table when computing canonical values.Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Co-authored-by: L. Pereira <lpereira@fastly.com><!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt edited PR #7922:
Remove the UnionFind argument to
Elaborator::new
, and from theElaborator
structure, relying instead on thevalue_to_best_value
table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt edited PR #7922:
Remove the
UnionFind
argument toElaborator::new
, and from theElaborator
structure, relying instead on thevalue_to_best_value
table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt has marked PR #7922 as ready for review.
elliottt requested cfallin for a review on PR #7922.
elliottt requested wasmtime-compiler-reviewers for a review on PR #7922.
elliottt edited PR #7922:
Remove the
UnionFind
argument toElaborator::new
, and from theElaborator
structure, relying instead on thevalue_to_best_value
table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).Additionally, we compared the assembly produced with and without this change, and found the difference in generated code to be mostly negligible (
lea
instead ofadd
, or a move between registers added or removed). The difference in the length of the disassembled output was only +15 lines, which out of2212651
is pretty good.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
cfallin submitted PR review:
Thanks for taking this on!
It's reassuring to see no changes to the compile-tests' outputs. I remembered one bit of the original thinking after our discussion earlier: IIRC, we had wanted to track elaborated values by canonical-value (union-find result) because otherwise, elaborating a use of an "early" node in the class will not necessarily see a later node that was union'd into the class -- it will result in redundant elaboration.
I suspect this mostly doesn't show up in practice because eager rewriting means that all subsequent uses are of the latest (highest-numbered) rewrite, if any rewrites occurred, so the best-value propagation can give the best for the whole class at that node. But it'd be good to verify that is what you all had been thinking as well, and add a comment describing this somewhere. What do you think?
cfallin submitted PR review:
Thanks for taking this on!
It's reassuring to see no changes to the compile-tests' outputs (EDIT: and so little in real code). I remembered one bit of the original thinking after our discussion earlier: IIRC, we had wanted to track elaborated values by canonical-value (union-find result) because otherwise, elaborating a use of an "early" node in the class will not necessarily see a later node that was union'd into the class -- it will result in redundant elaboration.
I suspect this mostly doesn't show up in practice because eager rewriting means that all subsequent uses are of the latest (highest-numbered) rewrite, if any rewrites occurred, so the best-value propagation can give the best for the whole class at that node. But it'd be good to verify that is what you all had been thinking as well, and add a comment describing this somewhere. What do you think?
elliottt commented on PR #7922:
The goal that we had in removing the use of the union find structure was to enforce the assumption that it was valid to refer to an inner node of the union tree as a subset of an eclass. Using the union find results from the rewriting pass over the whole function during elaboration may end up merging results from different branches in the dominator tree into the same eclass, which would mean that we might canonicalize to a node higher in the union tree than we originally intended. If that situation occurred, we would be at the mercy of the cost function for correctness. I think that with the changes you proposed to how we manage access to the gvn map as well, we'll be able to relax the new guideline we introduced for using subsume when the RHS forgets values from the LHS in egraph rules.
As to where we would add a comment, I'm not sure the best place to document this. Do we already document our use of sub-trees of the union tree to name subsets of the eclass? I think that's a super elegant implementation detail that's worth documenting, and we could call out careful management of eclasses in the same place.
cfallin commented on PR #7922:
OK, yeah, this is confirming to me why we didn't see the scoping issue originally: I think I had been imagining the canonical value to be the "first" and hence dominate all others in the eclass, but, indeed, you're correct that out-of-order canonical values can cause references that require the cost function (or the domtree-range scoping described in #7891) to handle. Thanks for "elaborating" on that!
Perhaps we could add to the comment on
value_to_elaborated_value
?Incidentally, this gives me another thought: if we could somehow canonicalize toward dominating values (up the domtree), this problem would also go away. I'm not sure how to do that efficiently though, so I think this plus domtree-range scoping (to avoid delicate subsume rules) feels like the right way to make all this fully robust.
elliottt updated PR #7922.
elliottt merged PR #7922.
Last updated: Dec 23 2024 at 12:05 UTC