afonso360 opened PR #5546 from select-optimizations to main:
:wave: Hey,
This PR adds some egraph rules to optimize a select + icmp into the appropriate version of the min/max instructions.
We transform the following code:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = icmp sgt v0, v1
        v3 = select v2, v0, v1
        return v3
    }
Into:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = smax v0, v1
        return v2
    }
Which is usually a single instruction in the backends.
I've been running this in fuzzgen for a while and it hasn't complained so far!
afonso360 edited PR #5546 from select-optimizations to main:
:wave: Hey,
This PR adds some egraph rules to optimize a select + icmp into the appropriate version of the min/max instructions.
We transform the following code:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = icmp sgt v0, v1
        v3 = select v2, v0, v1
        return v3
    }
Into:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = smax v0, v1
        return v2
    }
I've been running this in fuzzgen for a while and it hasn't complained so far!
afonso360 edited PR #5546 from select-optimizations to main:
:wave: Hey,
This PR adds some egraph rules to optimize a select + icmp into the appropriate version of the min/max instructions.
We transform the following code:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = icmp sgt v0, v1
        v3 = select v2, v0, v1
        return v3
    }
Into:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = smax v0, v1
        return v2
    }
This is beneficial at least for the riscv backend, where we can emit the max instruction in these cases instead of branching.
I've been running this in fuzzgen for a while and it hasn't complained so far!
afonso360 edited PR #5546 from select-optimizations to main:
:wave: Hey,
This PR adds some egraph rules to optimize a select + icmp into the appropriate version of the min/max instructions.
We transform the following code:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = icmp sgt v0, v1
        v3 = select v2, v0, v1
        return v3
    }
Into:
    function %select_sgt_to_smax(i32, i32) -> i32 {
    block0(v0: i32, v1: i32):
        v2 = smax v0, v1
        return v2
    }
This is beneficial at least for the RISC-V backend, where we can emit the max instruction in these cases instead of branching.
I've been running this in fuzzgen for a while and it hasn't complained so far!
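The equivalence behind the rewrite described above can be checked exhaustively on a small integer type. The following is a minimal sketch, not code from the PR: the function names `select_sgt`, `smax`, and `rewrite_is_sound` are hypothetical, and `i8` stands in for the `i32` used in the CLIF example so that every input pair can be enumerated.

```rust
/// Models `v2 = icmp sgt v0, v1; v3 = select v2, v0, v1` on 8-bit values.
/// (Hypothetical helper; i8 is an assumption made to keep the check exhaustive.)
fn select_sgt(v0: i8, v1: i8) -> i8 {
    let v2 = v0 > v1; // icmp sgt: signed greater-than
    if v2 { v0 } else { v1 } // select: pick v0 when the condition holds
}

/// Models `v2 = smax v0, v1`: the signed maximum of both operands.
fn smax(v0: i8, v1: i8) -> i8 {
    v0.max(v1)
}

/// Returns true if both forms agree on every pair of i8 inputs.
fn rewrite_is_sound() -> bool {
    (i8::MIN..=i8::MAX)
        .all(|a| (i8::MIN..=i8::MAX).all(|b| select_sgt(a, b) == smax(a, b)))
}
```

Calling `rewrite_is_sound()` from a `main` function or a unit test walks all 65,536 input pairs. This only illustrates the signed greater-than case; it is not a substitute for the fuzzgen runs mentioned in the PR description.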
jameysharp submitted PR review.
afonso360 updated PR #5546 from select-optimizations to main.
afonso360 updated PR #5546 from select-optimizations to main.
cfallin submitted PR review.
cfallin created PR review comment:
@jameysharp and I talked a little further about this today and we agreed that we probably don't want to double the number of icmp nodes unconditionally; it's cheaper to have more rules, some with inverted matching forms on the left-hand sides, than it is to grow the egraph and then have to match on the result of that. Would you be willing to remove this rule and compose it into the below manually instead?
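One way to read the suggestion above: rather than first rewriting every icmp into an inverted twin and then matching a single select pattern, each select rule also accepts the form with swapped select operands. The sketch below models that idea in Rust under stated assumptions. `IntCC`, `MinMax`, `match_select`, `eval`, and `rules_are_sound` are hypothetical names, operands are plain integer ids, only the signed conditions are covered, and this is not the ISLE code from the PR.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IntCC { Sgt, Sge, Slt, Sle }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MinMax { Smin, Smax }

/// Hypothetical matcher for `select (icmp cc x y), t, f`, with operands as ids.
/// Both the direct and the swapped select forms are matched here, so no
/// inverted icmp node ever needs to be created.
fn match_select(cc: IntCC, x: u32, y: u32, t: u32, f: u32) -> Option<MinMax> {
    use IntCC::*;
    let direct = t == x && f == y; // select cond, x, y
    let swapped = t == y && f == x; // select cond, y, x
    match (cc, direct, swapped) {
        (Sgt | Sge, true, _) => Some(MinMax::Smax),
        (Slt | Sle, true, _) => Some(MinMax::Smin),
        (Sgt | Sge, _, true) => Some(MinMax::Smin),
        (Slt | Sle, _, true) => Some(MinMax::Smax),
        _ => None,
    }
}

/// Evaluates a signed comparison on concrete i8 values.
fn eval(cc: IntCC, a: i8, b: i8) -> bool {
    match cc {
        IntCC::Sgt => a > b,
        IntCC::Sge => a >= b,
        IntCC::Slt => a < b,
        IntCC::Sle => a <= b,
    }
}

/// Checks every rule form against the select semantics for all i8 pairs.
fn rules_are_sound() -> bool {
    for cc in [IntCC::Sgt, IntCC::Sge, IntCC::Slt, IntCC::Sle] {
        for swapped in [false, true] {
            // Operand ids: x = 0, y = 1.
            let (t, f) = if swapped { (1, 0) } else { (0, 1) };
            let op = match match_select(cc, 0, 1, t, f) {
                Some(op) => op,
                None => return false,
            };
            for a in i8::MIN..=i8::MAX {
                for b in i8::MIN..=i8::MAX {
                    let vals = [a, b];
                    let selected = if eval(cc, a, b) { vals[t as usize] } else { vals[f as usize] };
                    let expected = match op {
                        MinMax::Smax => a.max(b),
                        MinMax::Smin => a.min(b),
                    };
                    if selected != expected {
                        return false;
                    }
                }
            }
        }
    }
    true
}
```

The trade-off in this sketch mirrors the one in the comment: the matcher carries more arms, but the set of comparison nodes stays the same size.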
cfallin submitted PR review.
afonso360 updated PR #5546 from select-optimizations to main.
afonso360 updated PR #5546 from select-optimizations to main.
jameysharp merged PR #5546.
Last updated: Dec 23 2024 at 12:05 UTC