alexcrichton opened PR #4051 from less-arc-clone to main:
This commit implements an optimization to help improve concurrently
creating instances of a module on many threads simultaneously. One
bottleneck measured here has been the reference count modification on
`Arc<HostFunc>`. Each host function stored within a `Linker<T>` is
wrapped in an `Arc<HostFunc>` structure, and when any of those host
functions are inserted into a store the reference count is incremented.
When the store is dropped the reference count is then decremented.

This ends up meaning that when a module imports N functions it ends up
doing 2N atomic modifications over the lifetime of the instance. For
embeddings where the `Linker<T>` is rarely modified but instances are
frequently created this can be a surprising bottleneck to creating many
instances.

A change implemented here is to optimize the instantiation process when
using an `InstancePre<T>`. An `InstancePre` serves as an opportunity to
take the list of items used to instantiate a module and wrap them all up
in an `Arc<[T]>`. Everything is going to get cloned into a `Store<T>`
anyway so to optimize this the `Arc<[T]>` is cloned at the top-level and
then nothing else is cloned internally. This continues to, however,
preserve a strong reference count for all contained items to prevent
them from being deallocated.

A new variant of
`FuncKind` was added for host functions which is
effectively stored via `*mut HostFunc`. This variant is unsafe to create
and manage and has been documented internally.

Performance-wise the overall impact of this change is somewhat minor.
It's already a bit esoteric if this atomic increment and decrement are a
bottleneck due to the number of concurrent instances being created. In
my measurements I've seen that this can reduce instantiation time by up
to 10% for a module that imports two dozen functions. For larger modules
with more imports this is expected to have a larger win.
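
The reference-counting difference described in this PR can be illustrated with a minimal, self-contained Rust sketch. This is not Wasmtime's actual implementation: `HostFunc` here is a hypothetical stand-in struct, and `instantiate_per_item` and `instantiate_shared` are made-up helper names. It only shows that cloning each `Arc<HostFunc>` touches N reference counts, while cloning one outer `Arc<[T]>` touches a single count and still keeps every contained item alive.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a host function; not wasmtime's real `HostFunc`.
struct HostFunc {
    name: &'static str,
}

/// Per-item strategy: clone every `Arc<HostFunc>`, which costs one atomic
/// increment per import now and one atomic decrement per import on drop.
fn instantiate_per_item(imports: &[Arc<HostFunc>]) -> Vec<Arc<HostFunc>> {
    imports.iter().cloned().collect()
}

/// Shared-slice strategy: clone only the outer `Arc<[T]>`, which costs a
/// single atomic increment regardless of how many imports it contains.
fn instantiate_shared(imports: &Arc<[Arc<HostFunc>]>) -> Arc<[Arc<HostFunc>]> {
    Arc::clone(imports)
}

fn main() {
    let funcs: Vec<Arc<HostFunc>> = ["read", "write", "exit"]
        .iter()
        .map(|&name| Arc::new(HostFunc { name }))
        .collect();

    // Per-item cloning bumps the count of every individual function.
    let store_items = instantiate_per_item(&funcs);
    assert_eq!(Arc::strong_count(&funcs[0]), 2);
    drop(store_items);
    assert_eq!(Arc::strong_count(&funcs[0]), 1);

    // Wrap the list once (the role the PR gives to `InstancePre`).
    let shared: Arc<[Arc<HostFunc>]> = funcs.into();

    // Cloning the outer `Arc` leaves the inner counts untouched.
    let store_items = instantiate_shared(&shared);
    assert_eq!(Arc::strong_count(&shared), 2);
    assert_eq!(Arc::strong_count(&shared[0]), 1);

    // The clone alone keeps every contained item alive.
    drop(shared);
    assert_eq!(store_items[0].name, "read");
}
```

In this sketch the per-instance cost drops from 2N atomic operations to 2, which is the effect the description attributes to wrapping the import list in an `Arc<[T]>`.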
cfallin submitted PR review.
alexcrichton merged PR #4051.
Last updated: Jan 24 2025 at 00:11 UTC