cfallin opened issue #4133:
In some cases, it is possible to perform better optimizations and instruction selection if one knows that only part of a value is needed (demanded bits). At other times, it is possible to do better if one knows that the operation or instruction actually generating a value sets more bits than are necessary in the destination register (defined-bits).
We should develop these analyses and use them during optimization and during lowering. Some examples:
- Bitmasks can be simplified or elided when demanded-bits allows: for example,
and x, 1
with only the LSB demanded is justx
.- An extend operator can be elided if the upper (extended-into) bits are not demanded.
- An extend operator can be elided if the upper bits are already actually defined by the chosen producer instruction(s).
The demanded-bits analysis should be at the CLIF level as it is a machine-independent concept (what bits do the CLIF semantics of the use(s) actually depend on). The defined-bits analysis is fundamentally a lowered-MachInst-level concept, as it has to do (in our case, since no bits within the type are undefined) with bits above a value in a register (e.g., upper 32 for an
I32
a 64-bit register).
cfallin labeled issue #4133:
In some cases, it is possible to perform better optimizations and instruction selection if one knows that only part of a value is needed (demanded bits). At other times, it is possible to do better if one knows that the operation or instruction actually generating a value sets more bits than are necessary in the destination register (defined-bits).
We should develop these analyses and use them during optimization and during lowering. Some examples:
- Bitmasks can be simplified or elided when demanded-bits allows: for example,
and x, 1
with only the LSB demanded is justx
.- An extend operator can be elided if the upper (extended-into) bits are not demanded.
- An extend operator can be elided if the upper bits are already actually defined by the chosen producer instruction(s).
The demanded-bits analysis should be at the CLIF level as it is a machine-independent concept (what bits do the CLIF semantics of the use(s) actually depend on). The defined-bits analysis is fundamentally a lowered-MachInst-level concept, as it has to do (in our case, since no bits within the type are undefined) with bits above a value in a register (e.g., upper 32 for an
I32
a 64-bit register).
cfallin labeled issue #4133:
In some cases, it is possible to perform better optimizations and instruction selection if one knows that only part of a value is needed (demanded bits). At other times, it is possible to do better if one knows that the operation or instruction actually generating a value sets more bits than are necessary in the destination register (defined-bits).
We should develop these analyses and use them during optimization and during lowering. Some examples:
- Bitmasks can be simplified or elided when demanded-bits allows: for example,
and x, 1
with only the LSB demanded is justx
.- An extend operator can be elided if the upper (extended-into) bits are not demanded.
- An extend operator can be elided if the upper bits are already actually defined by the chosen producer instruction(s).
The demanded-bits analysis should be at the CLIF level as it is a machine-independent concept (what bits do the CLIF semantics of the use(s) actually depend on). The defined-bits analysis is fundamentally a lowered-MachInst-level concept, as it has to do (in our case, since no bits within the type are undefined) with bits above a value in a register (e.g., upper 32 for an
I32
a 64-bit register).
fitzgen commented on issue #4133:
The defined-bits analysis is fundamentally a lowered-MachInst-level concept, as it has to do (in our case, since no bits within the type are undefined) with bits above a value in a register
I just realized I've been saying "defined bits" in conversation with you when I meant "known bits". I was about to comment that there is nothing fundamentally register-oriented about it, but of course I was talking about the wrong thing.
For posterity/clarity, _known_ bits is a forwards analysis where we logically keep track of this join semi-lattice per bit in a value:
? / \ 0 1
and we would have transfer functions that do things like
????0011 bor ???????? ------------ ??????11
It's like per-bit (rather than per-value) partial evaluation.
This is totally something we can do at the CLIF level. Nothing here is fundamentally dealing with registers here.
In addition to being useful for peepholes, we can do neat things like remove CFG edges that we know will never be taken (look at the known bits for a
br_table
's block index argument and remove edges to the table's blocks that conflict with those known bits). The CFG edge trimming is something that is definitely more of a CLIF thing than a mach inst thing.
cfallin commented on issue #4133:
That totally makes sense too! Per-bit partial evaluation (or per-bit constant folding I guess) is a good description. A value with all known bits graduates to a full
iconst
then and can enable further opts.
Last updated: Dec 23 2024 at 12:05 UTC