cfallin opened PR #2576 from cmp-load-bug
to main
:
On x64, the new backend generates
cmp
instructions at their use-sites
when possible (when the icmp that generates a boolean is known) so that
the condition flows directly through flags rather than a materialized
boolean. E.g., bothbint
(boolean to int) andselect
(conditional
select) instruction lowerings invokeemit_cmp()
to do so.Load-op fusion in
emit_cmp()
nominally allowedcmp
to use itscmp reg, mem
form.However, the mergeable-load condition (load has only single use) was not
adequately checked. Consider the sequence:v2 = load.i64 v1 v3 = icmp eq v0, v2 v4 = bint.i64 v3 v5 = select.i64 v3, v0, v1
The load
v2
is only used in theicmp
atv3
. However, the cmp will
be separately codegen'd twice, once for thebint
and once for the
select
.Prior to this fix, the above example would result in the load at
v2
sinking to thecmp
just above theselect
; we then emit anothercmp
for thebint
, but the load has already been used once so we do not
allow merging. We thus (i) expect the register forv2
to contain the
loaded value, but (ii) skip the codegen for the load because it has been
sunk. This results in a regalloc error (unexpected livein) as the
unfilled register is upward-exposed to the entry point.Because of this, we need to accept only the reg, reg form in
emit_cmp()
(and the FP equivalent). We could get marginally better
code by tracking whether thecmp
we are emitting comes from an
icmp
/fcmp
with only one use; but IMHO simplicity is a better rule
here when subtle interactions occur.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested fitzgen and abrown for a review on PR #2576.
cfallin requested fitzgen and abrown for a review on PR #2576.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Maybe a comment saying something like "see comment above
let rhs = put_input_in_reg(...)
inemit_cmp
for why we force RHS into a register" because this is a bit subtle?
fitzgen submitted PR Review.
cfallin updated PR #2576 from cmp-load-bug
to main
:
On x64, the new backend generates
cmp
instructions at their use-sites
when possible (when the icmp that generates a boolean is known) so that
the condition flows directly through flags rather than a materialized
boolean. E.g., bothbint
(boolean to int) andselect
(conditional
select) instruction lowerings invokeemit_cmp()
to do so.Load-op fusion in
emit_cmp()
nominally allowedcmp
to use itscmp reg, mem
form.However, the mergeable-load condition (load has only single use) was not
adequately checked. Consider the sequence:v2 = load.i64 v1 v3 = icmp eq v0, v2 v4 = bint.i64 v3 v5 = select.i64 v3, v0, v1
The load
v2
is only used in theicmp
atv3
. However, the cmp will
be separately codegen'd twice, once for thebint
and once for the
select
.Prior to this fix, the above example would result in the load at
v2
sinking to thecmp
just above theselect
; we then emit anothercmp
for thebint
, but the load has already been used once so we do not
allow merging. We thus (i) expect the register forv2
to contain the
loaded value, but (ii) skip the codegen for the load because it has been
sunk. This results in a regalloc error (unexpected livein) as the
unfilled register is upward-exposed to the entry point.Because of this, we need to accept only the reg, reg form in
emit_cmp()
(and the FP equivalent). We could get marginally better
code by tracking whether thecmp
we are emitting comes from an
icmp
/fcmp
with only one use; but IMHO simplicity is a better rule
here when subtle interactions occur.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done, thanks!
cfallin merged PR #2576.
Last updated: Nov 22 2024 at 16:03 UTC