wasmtime / PR #3709 Cranelift: Fix cold-blocks-related lo... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #3709 Cranelift: Fix cold-blocks-related lo...

Wasmtime GitHub notifications bot (Jan 21 2022 at 00:37):

cfallin opened PR #3709 from cold-blocks-dead-code-bug to main:

If a block is marked cold but has side-effect-free code that is only
used by side-effectful code in non-cold blocks, we will erroneously fail
to emit it, causing a regalloc failure.

This is due to the interaction of block ordering and lowering: we rely
on block ordering to visit uses before defs (except for backedges) so
that we can effectively do an inline liveness analysis and skip lowering
operations that are not used anywhere. This "inline DCE" is needed
because instruction lowering can pattern-match and merge one instruction
into another, removing the need to generate the source instruction.

Unfortunately, the way that I added cold-block support in #3698 was
oblivious to this -- it just changed the block sort order. For
efficiency reasons, we generate code in its final order directly, so it
would not be tenable to generate it in e.g. RPO first and then reorder
cold blocks to the bottom; we really do want to visit in the same order
as the final code.

This PR fixes the bug by moving the point at which cold blocks are sunk
to emission-time instead. This is cheaper than either trying to visit
blocks during lowering in RPO but add to VCode out-of-order, or trying
to do some expensive analysis to recover proper liveness. It's not clear
that the latter would be possible anyway -- the need to lower some
instructions depends on other instructions' isel results/merging
success, so we really do need to visit in RPO, and we can't simply lower
all instructions as side-effecting roots (some can't be toplevel nodes).

The one downside of this approach is that the VCode itself still has
cold blocks inline; so in the text format (and hence compile-tests) it's
not possible to see the sinking. This PR adds a test for cold-block
sinking that actually verifies the machine code. (The test also includes
an add-instruction in the cold path that would have been incorrectly
skipped prior to this fix.)

Fortunately this bug would not have been triggered by the one current
use of cold blocks in #3699, because there the only operation in the
cold block was an (always effectful) call instruction. The worst-case
effect of the bug in other code would be a regalloc panic; no silent
miscompilations could result.

Depends on #3708.

Wasmtime GitHub notifications bot (Jan 21 2022 at 00:37):

cfallin updated PR #3709 from cold-blocks-dead-code-bug to main.

Wasmtime GitHub notifications bot (Jan 21 2022 at 00:37):

cfallin edited PR #3709 from cold-blocks-dead-code-bug to main:

If a block is marked cold but has side-effect-free code that is only
used by side-effectful code in non-cold blocks, we will erroneously fail
to emit it, causing a regalloc failure.

This is due to the interaction of block ordering and lowering: we rely
on block ordering to visit uses before defs (except for backedges) so
that we can effectively do an inline liveness analysis and skip lowering
operations that are not used anywhere. This "inline DCE" is needed
because instruction lowering can pattern-match and merge one instruction
into another, removing the need to generate the source instruction.

Unfortunately, the way that I added cold-block support in #3698 was
oblivious to this -- it just changed the block sort order. For
efficiency reasons, we generate code in its final order directly, so it
would not be tenable to generate it in e.g. RPO first and then reorder
cold blocks to the bottom; we really do want to visit in the same order
as the final code.

This PR fixes the bug by moving the point at which cold blocks are sunk
to emission-time instead. This is cheaper than either trying to visit
blocks during lowering in RPO but add to VCode out-of-order, or trying
to do some expensive analysis to recover proper liveness. It's not clear
that the latter would be possible anyway -- the need to lower some
instructions depends on other instructions' isel results/merging
success, so we really do need to visit in RPO, and we can't simply lower
all instructions as side-effecting roots (some can't be toplevel nodes).

The one downside of this approach is that the VCode itself still has
cold blocks inline; so in the text format (and hence compile-tests) it's
not possible to see the sinking. This PR adds a test for cold-block
sinking that actually verifies the machine code. (The test also includes
an add-instruction in the cold path that would have been incorrectly
skipped prior to this fix.)

Fortunately this bug would not have been triggered by the one current
use of cold blocks in #3699, because there the only operation in the
cold block was an (always effectful) call instruction. The worst-case
effect of the bug in other code would be a regalloc panic; no silent
miscompilations could result.

Wasmtime GitHub notifications bot (Jan 21 2022 at 00:38):

cfallin requested fitzgen for a review on PR #3709.

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:14):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:14):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:14):

fitzgen created PR review comment:

A very small percentage of blocks will be cold, does it make sense to use a sparse HashSet rather than dense a Vec with an entry for every block here?

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:14):

fitzgen created PR review comment:

Nitpick: either way I would personally call this cold_blocks rather than cold_block

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:47):

cfallin updated PR #3709 from cold-blocks-dead-code-bug to main.

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 21 2022 at 18:48):

cfallin created PR review comment:

Good points, thanks! Using a hashset now.

Wasmtime GitHub notifications bot (Jan 21 2022 at 19:18):

cfallin merged PR #3709.

Last updated: Apr 18 2025 at 04:04 UTC