fitzgen opened PR #4536 from reuse-hash-set-in-can-optimize-var-lookup
to main
:
First, we switch from a
BTreeSet
to aHashSet
because clearing aBTreeSet
will deallocate the btree's nodes but clearing aHashSet
will not deallocate
the backing hash table, saving the space to reuse for future insertions.Then, we reuse the same set (and therefore the same allocation) across every
call tocan_optimize_var_lookup
.This results in a 1.22x to 1.32x speed up on various Sightglass benchmarks:
compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 39478181.76 ± 3441880.32 (confidence = 99%) main.so is 0.75x to 0.79x faster than reuse-set.so! reuse-set.so is 1.27x to 1.32x faster than main.so! [160128343 172174751.09 213325968] main.so [115055695 132696569.33 200782128] reuse-set.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm Δ = 22576954.88 ± 1830771.68 (confidence = 99%) main.so is 0.77x to 0.81x faster than reuse-set.so! reuse-set.so is 1.25x to 1.29x faster than main.so! [100449245 106820149.65 118628066] main.so [77039172 84243194.77 128168647] reuse-set.so compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 664533554.97 ± 22109170.05 (confidence = 99%) main.so is 0.81x to 0.82x faster than reuse-set.so! reuse-set.so is 1.22x to 1.23x faster than main.so! [3549762523 3640587103.35 3798662501] main.so [2793335181 2976053548.38 3192950484] reuse-set.so
(Benchmark results for
cycles
, rather than wall time, are ~identical.)<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
fitzgen requested cfallin for a review on PR #4536.
fitzgen updated PR #4536 from reuse-hash-set-in-can-optimize-var-lookup
to main
.
fitzgen requested jameysharp for a review on PR #4536.
jameysharp submitted PR review.
jameysharp created PR review comment:
Huh, moving the
insert
after changingcurrent
doesn't seem right. Did you mean to do that?A pre-existing nit:
insert
returns true ifcontains
would have returned false. So assuming you don't want to change the order here those two calls can be collapsed intoif !self.visited.insert(current) { ...
fitzgen created PR review comment:
Huh, moving the
insert
after changingcurrent
doesn't seem right. Did you mean to do that?Have to to appease the borrow checker, since
predecessors
has a shared borrow ofself
.
fitzgen submitted PR review.
fitzgen created PR review comment:
Ah but I did introduce a bug, I see now what you mean, will fix in a sec.
fitzgen submitted PR review.
fitzgen updated PR #4536 from reuse-hash-set-in-can-optimize-var-lookup
to main
.
fitzgen requested jameysharp for a review on PR #4536.
jameysharp submitted PR review.
jameysharp created PR review comment:
The simplest fix is to move the
visited
check before thepredecessors.len()
checks: if we've visited this block before, it already passed thepredecessors.len()
checks.
jameysharp submitted PR review.
cfallin submitted PR review.
fitzgen merged PR #4536.
Last updated: Dec 23 2024 at 12:05 UTC