alexcrichton opened PR #3791 from minor-instantiate-wins
to main
:
I've been poring over profiles of instantiation recently and this is the final batch (for now at least) of optimizations of low-ish hanging fruit for speeding up instantiation. At this level the fruit is starting to appear significantly higher in the tree and at least for me this is basically at the limit of what I think makes sense to optimize for now. In the
instantiation.rs
benchmark theWasiCtx::new
call is starting to become a pretty nontrivial portion of the entire benchmark which while we could make faster shows just how fast instantiation is at this point.Each optimization is split out to a separate commit below, but at a high level the optimizations implemented here are:
- The
FuncData
type is shrunk to avoid extraneous movement of hundreds of bytes per-instantiation.- Type information is no longer stored inline in
VMContext
but rather externally inModule
to provide constant-time instantiation.- Space for functions being inserted into a store is now pre-reserved to minimize reallocations and memory movement during the instantiation process.
Each improvement here is relatively modest but overall this improves the instantiation of spidermonkey/rustpython modules by 20-25%, dropping them to 3.5 and 3.9us on my machine.
The remaining profile I'm having a tough time finding what else to move out of the way and optimize. One of the highest parts of the profile now is all the
Arc<T>
reference count management, but doing something different there would be a pretty major overhaul I think and honestly this feels like a great problem to have. There's probably some minor tweaks we can make to data structures inStore<T>
to make them more optimal, but as things go it feels like we're in a pretty good place as-is.
alexcrichton updated PR #3791 from minor-instantiate-wins
to main
.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
We probably limit
sig_index
here sufficiently, but just in case, out of principle,checked_mul()
here?
bjorn3 submitted PR review.
bjorn3 created PR review comment:
Would it make sense to remove this variant and always use an
Arc
?
alexcrichton updated PR #3791 from minor-instantiate-wins
to main
.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Perhaps one day if this gets too complicated, but otherwise
Arc
management is already pretty high in the performance profile so I'd prefer to not add more.
alexcrichton merged PR #3791.
Last updated: Dec 23 2024 at 13:07 UTC