elliottt commented on issue #5031:
@uweigand, could you take a closer look at my changes to the s390x backend? In particular the changes in c01d860, as they increased the instruction count for the
bmask
translation by quite a bit.
elliottt edited a comment on issue #5031:
@uweigand, could you take a closer look at my changes to the s390x backend? In particular the changes to the lowering of
bmask
, as they increased the instruction count for the translation by quite a bit.
uweigand commented on issue #5031:
@uweigand, could you take a closer look at my changes to the s390x backend? In particular the changes to the lowering of
bmask
, as they increased the instruction count for the translation by quite a bit.Well, in part this is due to changed semantics: the old
bmask
would take a boolean type as input, which (except for$B1
) already used only the values 0 and -1. So the semantics ofbmask
was simply a no-op or a sign-extend. The newbmask
I understand now needs to take any arbitrary integer input, so it actually has to perform a comparison against zero.That said, I think the new implementation could still be improved, in particular the
$I128
cases. However, instead of re-implementing the wheel here, why don't you simply use the existingicmp_val
helper to emit the actual comparison?Maybe something alone the lines of
(rule (lower (has_type oty (bmask x @ (value_type ity)))) (lower_bool_to_mask oty (icmp_val $true (IntCC.NotEqual) x (imm ity 0)))
with a new helper
lower_bool_to_mask
analogous to the oldlower_bool
...
uweigand commented on issue #5031:
Following up on myself, it would be even simpler and better to use the existing
value_nonzero
helper, which will fold in previous comparisons if possible:(rule (lower (has_type ty (bmask x))) (lower_bool_to_mask ty (value_nonzero x)))
elliottt commented on issue #5031:
Following up on myself, it would be even simpler _and_ better to use the existing
value_nonzero
helper, which will fold in previous comparisons if possible:
(rule (lower (has_type ty (bmask x))) (lower_bool_to_mask ty (value_nonzero x)))
This is much nicer! I'll rework the changes to add
lower_bool_to_mask
instead.
uweigand commented on issue #5031:
This is much nicer! I'll rework the changes to add
lower_bool_to_mask
instead.Thanks! The s390x parts LGTM now.
Last updated: Jan 24 2025 at 00:11 UTC