cfallin opened PR #3733 from lazy-anyfuncs
to main
:
Currently, in the instance initialization path, we build an "anyfunc"
for every imported function and every function in the module. These
anyfuncs are used as function references in tables, both within a module
and across modules (via function imports).Building all of these is quite wasteful if we never use most of them.
Instead, it would be better to build only the ones that we use.This commit implements a technique to lazily initialize them: at the
point that a pointer to an anyfunc is taken, we check whether it has
actually been initialized. If it hasn't, we do so. We track whether an
anyfunc has been initialized with a separate bitmap, and we only need to
zero this bitmap at instantiation time. (This is important for
performance too: even just zeroing a large array of anyfunc structs can
be expensive, at the microsecond scale, for a module with thousands of
functions.)Keeping the information around that is needed for this lazy init
required a bit of refactoring in the InstanceAllocationRequest; a few
fields now live in anArc
held by the instance while it exists.This was split out from #3699 (I'll rebase it out of that PR next).
The benign-race trickery to make both instantiation and the access-time check
fast is a little hairy, but I left a long comment that tries to explain why
it works. More eyeballs on this (and/or thoughts on if there is a better
way to do this in semi-idiomatic Rust --UnsafeCell
? something else?)
would be welcome!<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested alexcrichton for a review on PR #3733.
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin edited PR #3733 from lazy-anyfuncs
to main
:
Currently, in the instance initialization path, we build an "anyfunc"
for every imported function and every function in the module. These
anyfuncs are used as function references in tables, both within a module
and across modules (via function imports).Building all of these is quite wasteful if we never use most of them.
Instead, it would be better to build only the ones that we use.This commit implements a technique to lazily initialize them: at the
point that a pointer to an anyfunc is taken, we check whether it has
actually been initialized. If it hasn't, we do so. We track whether an
anyfunc has been initialized with a separate bitmap, and we only need to
zero this bitmap at instantiation time. (This is important for
performance too: even just zeroing a large array of anyfunc structs can
be expensive, at the microsecond scale, for a module with thousands of
functions.)Keeping the information around that is needed for this lazy init
required a bit of refactoring in the InstanceAllocationRequest; a few
fields now live in anArc
held by the instance while it exists.This was split out from #3697 (I'll rebase it out of that PR next).
The benign-race trickery to make both instantiation and the access-time check
fast is a little hairy, but I left a long comment that tries to explain why
it works. More eyeballs on this (and/or thoughts on if there is a better
way to do this in semi-idiomatic Rust --UnsafeCell
? something else?)
would be welcome!<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin updated PR #3733 from lazy-anyfuncs
to main
.
cfallin edited PR #3733 from lazy-anyfuncs
to main
:
During instance initialization, we build two sorts of arrays eagerly:
We create an "anyfunc" (a
VMCallerCheckedAnyfunc
) for every function
in an instance.We initialize every element of a funcref table with an initializer to
a pointer to one of these anyfuncs.Most instances will not touch (via call_indirect or table.get) all
funcref table elements. And most anyfuncs will never be referenced,
because most functions are never placed in tables or used with
ref.func
. Thus, both of these initialization tasks are quite wasteful.
Profiling shows that a significant fraction of the remaining
instance-initialization time after our other recent optimizations is
going into these two tasks.This PR implements two basic ideas:
The anyfunc array can be lazily initialized as long as we retain the
information needed to do so. A zero in the func-ptr part of the tuple
means "uninitalized"; a null-check and slowpath does the
initialization whenever we take a pointer to an anyfunc.A funcref table can be lazily initialized as long as we retain a link
to its corresponding instance and function index for each element. A
zero in a table element means "uninitialized", and a slowpath does the
initialization.The use of all-zeroes to mean "uninitialized" means that we can use fast
memory clearing techniques, like madvise(DONTNEED) on Linux or just
freshly-mmap'd anonymous memory, to get to the initial state without
a lot of memory writes.Funcref tables are a little tricky because funcrefs can be null. We need
to distinguish "element was initially non-null, but user stored explicit
null later" from "element never touched" (ie the lazy init should not
blow away an explicitly stored null). We solve this by stealing the LSB
from every funcref (anyfunc pointer): when the LSB is set, the funcref
is initialized and we don't hit the lazy-init slowpath. We insert the
bit on storing to the table and mask it off after loading.Performance effect on instantiation in the on-demand allocator (pooling
allocator effect should be similar as the table-init path is the same):sequential/default/spidermonkey.wasm time: [71.886 us 72.012 us 72.133 us] sequential/default/spidermonkey.wasm time: [22.243 us 22.256 us 22.270 us] change: [-69.117% -69.060% -69.000%] (p = 0.00 < 0.05) Performance has improved.
So, 72µs to 22µs, or a 69% reduction.
Last updated: Nov 22 2024 at 16:03 UTC