cfallin opened PR #2043 from csel-opt
to main
:
Previously, we simply compared the input bool to 0, which forced the
value into a register (usually via a cmp and cset), zero-extended it,
etc. This patch performs the same pattern-matching that branches do to
directly perform the cmp and use its flag results with the csel.On the
bz2
benchmark, the runtime is affected as follows (measuring
withperf stat
, using wasmtime with its cache enabled, and taking the
second run after the first compiles and populates the cache):pre:
1117.232000 task-clock (msec) # 1.000 CPUs utilized 133 context-switches # 0.119 K/sec 1 cpu-migrations # 0.001 K/sec 5,041 page-faults # 0.005 M/sec 3,511,615,100 cycles # 3.143 GHz 4,272,427,772 instructions # 1.22 insn per cycle <not supported> branches 27,980,906 branch-misses 1.117299838 seconds time elapsed
post:
1003.738075 task-clock (msec) # 1.000 CPUs utilized 121 context-switches # 0.121 K/sec 0 cpu-migrations # 0.000 K/sec 5,052 page-faults # 0.005 M/sec 3,224,875,393 cycles # 3.213 GHz 4,000,838,686 instructions # 1.24 insn per cycle <not supported> branches 27,928,232 branch-misses 1.003440004 seconds time elapsed
In other words, with this change, on
bz2
, we see a 6.3% reduction in
executed instructions.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested bnjbvr for a review on PR #2043.
cfallin requested bnjbvr and julian-seward1 for a review on PR #2043.
julian-seward1 submitted PR Review.
cfallin merged PR #2043.
Last updated: Dec 23 2024 at 13:07 UTC