Stream: git-wasmtime

Topic: wasmtime / PR #5800 egraphs: fix handling of effectful-bu...


view this post on Zulip Wasmtime GitHub notifications bot (Feb 16 2023 at 06:12):

cfallin opened PR #5800 from fix-egraph-effectful-gvn to main:

This PR addresses #5796: currently, ops that are effectful, i.e., remain in the side-effecting skeleton (which we keep in the Layout while the egraph exists), but are idempotent and thus mergeable by a GVN pass, are not handled properly.

GVN is still possible on effectful but idempotent ops precisely because our GVN does not create partial redundancies: it removes an instruction only when it is dominated by an identical instruction. An isntruction will not be "hoisted" to a point where it could execute in the optimized code but not in the original.

However, there are really two parts to the egraph implementation that produce this effect: the deduplication on insertion into the egraph, and the elaboration with a scoped hashmap. The deduplication lets us give a single name (value ID) to all copies of an identical instruction, and then elaboration will re-create duplicates if GVN should not hoist or merge some of them.

Because deduplication need not worry about dominance or scopes, we use a simple (non-scoped) hashmap to dedup/intern ops as "egraph nodes".

When we added support for GVN'ing effectful but idempotent ops (#5594), we kept the use of this simple dedup'ing hashmap, but these ops do not get elaborated; instead they stay in the side-effecting skeleton. Thus, we inadvertently created potential for weird code-motion effects.

The proposal in #5796 would solve this in a clean way by treating these ops as pure again, and keeping them out of the skeleton, instead putting "force" pseudo-ops in the skeleton. However, this is a little more complex than I would like, and I've realized that @jameysharp's earlier suggestion is much simpler: we can keep an actual scoped hashmap separately just for the effectful-but-idempotent ops, and use it to GVN while we build the egraph. In effect, we're fusing a separate GVN pass with the egraph pass (but letting it interact corecursively with egraph rewrites. This is in principle similar to how we keep a separate map for loads and fuse this pass with the egraph rewrite pass as well.

Note that we can use a ScopedHashMap here without the "context" (as needed by CtxHashMap) because, as noted by @jameysharp, in practice the ops we want to GVN have all their args inline. Equality on the InstructinoData itself is conservative: two insts whose struct contents compare shallowly equal are definitely identical, but identical insts in a deep-equality sense may not compare shallowly equal, due to list indirection. This is fine for GVN, because it is still sound to skip any given GVN opportunity (and keep the original instructions).

Fixes #5796.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Feb 16 2023 at 06:12):

cfallin requested jameysharp for a review on PR #5800.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 16 2023 at 23:04):

cfallin updated PR #5800 from fix-egraph-effectful-gvn to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:39):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:39):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:39):

jameysharp created PR review comment:

The reason for needing the controlling type to be part of the hash key seems subtle to me. I suspect all the instructions currently classed as is_mergeable_for_egraph have their type fully determined by their operands, so I think this isn't strictly necessary today. But it does seem like a good idea to do it just in case, and also it's consistent with the key type for the pure GVN map.

I'm not sure there's any action necessary here, just noting something that confused me for a minute.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:39):

jameysharp created PR review comment:

I think it's worth copying some portion of your commit message into a comment here:

Note that we can use a ScopedHashMap here without the "context" (as needed by CtxHashMap) because in practice the ops we want to GVN have all their args inline. Equality on the InstructionData itself is conservative: two insts whose struct contents compare shallowly equal are definitely identical, but identical insts in a deep-equality sense may not compare shallowly equal, due to list indirection. This is fine for GVN, because it is still sound to skip any given GVN opportunity (and keep the original instructions).

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:49):

cfallin updated PR #5800 from fix-egraph-effectful-gvn to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:50):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:50):

cfallin created PR review comment:

Ah, yes, good point -- I've added a comment at the def (below the comment you suggested below as well), and actually on the def of the ordinary gvn_map as well.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:50):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:50):

cfallin created PR review comment:

Good idea, added; thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:52):

cfallin edited PR #5800 from fix-egraph-effectful-gvn to main:

This PR addresses #5796: currently, ops that are effectful, i.e., remain in the side-effecting skeleton (which we keep in the Layout while the egraph exists), but are idempotent and thus mergeable by a GVN pass, are not handled properly.

GVN is still possible on effectful but idempotent ops precisely because our GVN does not create partial redundancies: it removes an instruction only when it is dominated by an identical instruction. An isntruction will not be "hoisted" to a point where it could execute in the optimized code but not in the original.

However, there are really two parts to the egraph implementation that produce this effect: the deduplication on insertion into the egraph, and the elaboration with a scoped hashmap. The deduplication lets us give a single name (value ID) to all copies of an identical instruction, and then elaboration will re-create duplicates if GVN should not hoist or merge some of them.

Because deduplication need not worry about dominance or scopes, we use a simple (non-scoped) hashmap to dedup/intern ops as "egraph nodes".

When we added support for GVN'ing effectful but idempotent ops (#5594), we kept the use of this simple dedup'ing hashmap, but these ops do not get elaborated; instead they stay in the side-effecting skeleton. Thus, we inadvertently created potential for weird code-motion effects.

The proposal in #5796 would solve this in a clean way by treating these ops as pure again, and keeping them out of the skeleton, instead putting "force" pseudo-ops in the skeleton. However, this is a little more complex than I would like, and I've realized that @jameysharp's earlier suggestion is much simpler: we can keep an actual scoped hashmap separately just for the effectful-but-idempotent ops, and use it to GVN while we build the egraph. In effect, we're fusing a separate GVN pass with the egraph pass (but letting it interact corecursively with egraph rewrites). This is in principle similar to how we keep a separate map for loads and fuse this pass with the egraph rewrite pass as well.

Note that we can use a ScopedHashMap here without the "context" (as needed by CtxHashMap) because, as noted by @jameysharp, in practice the ops we want to GVN have all their args inline. Equality on the InstructionData itself is conservative: two insts whose struct contents compare shallowly equal are definitely identical, but identical insts in a deep-equality sense may not compare shallowly equal, due to list indirection. This is fine for GVN, because it is still sound to skip any given GVN opportunity (and keep the original instructions).

Fixes #5796.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 01:52):

cfallin has enabled auto merge for PR #5800.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 02 2023 at 02:46):

cfallin merged PR #5800.


Last updated: Jan 24 2025 at 00:11 UTC