jameysharp opened issue #6109:
Feature
Integrate
cranelift/codegen/src/remove_constant_phis.rs
into the egraph pass instead of running it as a separate pass beforehand.Benefit
Equality saturation may reveal that multiple values are equivalent even though they were defined by different instructions, so it's useful to defer checking whether a block param is defined by equivalent values at every block-call until then. But aliasing block parameters to a common definition can expose more patterns for equality saturation. Therefore this optimization is most effective if it's interleaved with equality saturation.
Integrating this makes #6106 and #5908 more useful: both of those optimizations can prune control-flow edges, making it more likely that a block parameter has only one definition.
Also, during the egraph pass we're iterating over all blocks and updating all instructions anyway, so folding another pass in avoids extra traversals.
Challenges
The "remove constant phis" pass is based on a forward dataflow analysis, which computes a fixpoint over the whole function. The egraph pass visits each basic block only once without reaching a fixpoint. Without revisiting loops, this pass can only be an approximation. However, that's probably good enough, at least for now. Someday we may offer a mode where the egraph does run multiple iterations, when more thorough optimization is desired.
Even in the absence of loops, the egraph pass doesn't currently guarantee that all predecessors of a block have been visited before the block itself, only that a block is visited before any block it dominates. Without that guarantee, we'll often miss optimization opportunities. I think we could establish this guarantee but it's currently tricky. Fixing #5908 may make this more feasible because I think then we'll have more freedom in block visit order; #5800 limited the traversal order to dom-tree pre-order.
Implementation
On starting equality saturation for a block, loop over the block parameters. For each one, check the corresponding argument in all block-calls which reference this block. If they're all the same value, replace the block parameter with an alias to that value, removing that position from the block header and from all the block-calls.
One improvement is that if some predecessor passes the block parameter itself as the argument, then ignore that predecessor for this purpose. In other words, assume that the block parameter always comes from the same definition, then try to disprove that assumption. (It isn't immediately clear to me whether the existing "remove constant phis" pass implements this improvement; I think it does, but if so it's expressed in a way I don't entirely understand.)
A further improvement is to identify groups of block parameters that can be merged into one. (Again, I _think_ the current implementation does this.) For example, in the following program,
v1
andv2
can both be replaced withv0
:block1(v0: i32): jump block2(v0, v0) block2(v1: i32, v2: i32): ; ... jump block2(v2, v1)
Alternatives
We can leave this pass separate, or even run it both before and after the egraph pass. We could even keep the standalone pass while also adding similar functionality to the egraph pass; the standalone pass can resolve identities which require a full fixpoint, and the egraph pass can resolve loop-free cases using the additional information from equality saturation.
jameysharp labeled issue #6109:
Feature
Integrate
cranelift/codegen/src/remove_constant_phis.rs
into the egraph pass instead of running it as a separate pass beforehand.Benefit
Equality saturation may reveal that multiple values are equivalent even though they were defined by different instructions, so it's useful to defer checking whether a block param is defined by equivalent values at every block-call until then. But aliasing block parameters to a common definition can expose more patterns for equality saturation. Therefore this optimization is most effective if it's interleaved with equality saturation.
Integrating this makes #6106 and #5908 more useful: both of those optimizations can prune control-flow edges, making it more likely that a block parameter has only one definition.
Also, during the egraph pass we're iterating over all blocks and updating all instructions anyway, so folding another pass in avoids extra traversals.
Challenges
The "remove constant phis" pass is based on a forward dataflow analysis, which computes a fixpoint over the whole function. The egraph pass visits each basic block only once without reaching a fixpoint. Without revisiting loops, this pass can only be an approximation. However, that's probably good enough, at least for now. Someday we may offer a mode where the egraph does run multiple iterations, when more thorough optimization is desired.
Even in the absence of loops, the egraph pass doesn't currently guarantee that all predecessors of a block have been visited before the block itself, only that a block is visited before any block it dominates. Without that guarantee, we'll often miss optimization opportunities. I think we could establish this guarantee but it's currently tricky. Fixing #5908 may make this more feasible because I think then we'll have more freedom in block visit order; #5800 limited the traversal order to dom-tree pre-order.
Implementation
On starting equality saturation for a block, loop over the block parameters. For each one, check the corresponding argument in all block-calls which reference this block. If they're all the same value, replace the block parameter with an alias to that value, removing that position from the block header and from all the block-calls.
One improvement is that if some predecessor passes the block parameter itself as the argument, then ignore that predecessor for this purpose. In other words, assume that the block parameter always comes from the same definition, then try to disprove that assumption. (It isn't immediately clear to me whether the existing "remove constant phis" pass implements this improvement; I think it does, but if so it's expressed in a way I don't entirely understand.)
A further improvement is to identify groups of block parameters that can be merged into one. (Again, I _think_ the current implementation does this.) For example, in the following program,
v1
andv2
can both be replaced withv0
:block1(v0: i32): jump block2(v0, v0) block2(v1: i32, v2: i32): ; ... jump block2(v2, v1)
Alternatives
We can leave this pass separate, or even run it both before and after the egraph pass. We could even keep the standalone pass while also adding similar functionality to the egraph pass; the standalone pass can resolve identities which require a full fixpoint, and the egraph pass can resolve loop-free cases using the additional information from equality saturation.
jameysharp labeled issue #6109:
Feature
Integrate
cranelift/codegen/src/remove_constant_phis.rs
into the egraph pass instead of running it as a separate pass beforehand.Benefit
Equality saturation may reveal that multiple values are equivalent even though they were defined by different instructions, so it's useful to defer checking whether a block param is defined by equivalent values at every block-call until then. But aliasing block parameters to a common definition can expose more patterns for equality saturation. Therefore this optimization is most effective if it's interleaved with equality saturation.
Integrating this makes #6106 and #5908 more useful: both of those optimizations can prune control-flow edges, making it more likely that a block parameter has only one definition.
Also, during the egraph pass we're iterating over all blocks and updating all instructions anyway, so folding another pass in avoids extra traversals.
Challenges
The "remove constant phis" pass is based on a forward dataflow analysis, which computes a fixpoint over the whole function. The egraph pass visits each basic block only once without reaching a fixpoint. Without revisiting loops, this pass can only be an approximation. However, that's probably good enough, at least for now. Someday we may offer a mode where the egraph does run multiple iterations, when more thorough optimization is desired.
Even in the absence of loops, the egraph pass doesn't currently guarantee that all predecessors of a block have been visited before the block itself, only that a block is visited before any block it dominates. Without that guarantee, we'll often miss optimization opportunities. I think we could establish this guarantee but it's currently tricky. Fixing #5908 may make this more feasible because I think then we'll have more freedom in block visit order; #5800 limited the traversal order to dom-tree pre-order.
Implementation
On starting equality saturation for a block, loop over the block parameters. For each one, check the corresponding argument in all block-calls which reference this block. If they're all the same value, replace the block parameter with an alias to that value, removing that position from the block header and from all the block-calls.
One improvement is that if some predecessor passes the block parameter itself as the argument, then ignore that predecessor for this purpose. In other words, assume that the block parameter always comes from the same definition, then try to disprove that assumption. (It isn't immediately clear to me whether the existing "remove constant phis" pass implements this improvement; I think it does, but if so it's expressed in a way I don't entirely understand.)
A further improvement is to identify groups of block parameters that can be merged into one. (Again, I _think_ the current implementation does this.) For example, in the following program,
v1
andv2
can both be replaced withv0
:block1(v0: i32): jump block2(v0, v0) block2(v1: i32, v2: i32): ; ... jump block2(v2, v1)
Alternatives
We can leave this pass separate, or even run it both before and after the egraph pass. We could even keep the standalone pass while also adding similar functionality to the egraph pass; the standalone pass can resolve identities which require a full fixpoint, and the egraph pass can resolve loop-free cases using the additional information from equality saturation.
jameysharp labeled issue #6109:
Feature
Integrate
cranelift/codegen/src/remove_constant_phis.rs
into the egraph pass instead of running it as a separate pass beforehand.Benefit
Equality saturation may reveal that multiple values are equivalent even though they were defined by different instructions, so it's useful to defer checking whether a block param is defined by equivalent values at every block-call until then. But aliasing block parameters to a common definition can expose more patterns for equality saturation. Therefore this optimization is most effective if it's interleaved with equality saturation.
Integrating this makes #6106 and #5908 more useful: both of those optimizations can prune control-flow edges, making it more likely that a block parameter has only one definition.
Also, during the egraph pass we're iterating over all blocks and updating all instructions anyway, so folding another pass in avoids extra traversals.
Challenges
The "remove constant phis" pass is based on a forward dataflow analysis, which computes a fixpoint over the whole function. The egraph pass visits each basic block only once without reaching a fixpoint. Without revisiting loops, this pass can only be an approximation. However, that's probably good enough, at least for now. Someday we may offer a mode where the egraph does run multiple iterations, when more thorough optimization is desired.
Even in the absence of loops, the egraph pass doesn't currently guarantee that all predecessors of a block have been visited before the block itself, only that a block is visited before any block it dominates. Without that guarantee, we'll often miss optimization opportunities. I think we could establish this guarantee but it's currently tricky. Fixing #5908 may make this more feasible because I think then we'll have more freedom in block visit order; #5800 limited the traversal order to dom-tree pre-order.
Implementation
On starting equality saturation for a block, loop over the block parameters. For each one, check the corresponding argument in all block-calls which reference this block. If they're all the same value, replace the block parameter with an alias to that value, removing that position from the block header and from all the block-calls.
One improvement is that if some predecessor passes the block parameter itself as the argument, then ignore that predecessor for this purpose. In other words, assume that the block parameter always comes from the same definition, then try to disprove that assumption. (It isn't immediately clear to me whether the existing "remove constant phis" pass implements this improvement; I think it does, but if so it's expressed in a way I don't entirely understand.)
A further improvement is to identify groups of block parameters that can be merged into one. (Again, I _think_ the current implementation does this.) For example, in the following program,
v1
andv2
can both be replaced withv0
:block1(v0: i32): jump block2(v0, v0) block2(v1: i32, v2: i32): ; ... jump block2(v2, v1)
Alternatives
We can leave this pass separate, or even run it both before and after the egraph pass. We could even keep the standalone pass while also adding similar functionality to the egraph pass; the standalone pass can resolve identities which require a full fixpoint, and the egraph pass can resolve loop-free cases using the additional information from equality saturation.
Last updated: Nov 22 2024 at 17:03 UTC