alexcrichton opened PR #2820 from faster-stack-maps
to main
:
This commit fixes a perf issue I was seeing in some local benchmarking
where registration of stack maps took a nontrivial amount of time for a
module that didn't even use stack maps! The fix here is to largely just
do the same thing as #2811 which is to use the in-memory data structures
of aCompiledModule
rather than rebuilding different data structures
when a module is registered with aStore
.The
StackMapRegistry
type now takes a lookup object for a range of
pcs, simplifying registration. Registration additionally doesn't even
happen if a module is pre-determined to not have any stack maps inside
of it. This trait object encapsulates all the logic that was previously
used in the rebuilt data structures and also leverages conveniences like
CompiledModule::func_by_pc
added recently as well.With this all combined it means that modules which don't have stack maps
now skip this registration step entirely and modules with stack maps
should do much less work during registration as well.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
alexcrichton requested fitzgen for a review on PR #2820.
fitzgen submitted PR Review.
peterhuene created PR Review Comment:
Question: why the trait over using a method on
CompiledModule
(similar tofunc_by_pc
andfunc_info
)?
peterhuene submitted PR Review.
peterhuene edited PR Review Comment.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Only because the
CompiledModule
lives in thewasmtime_jit
crate which thiswasmtime_runtime
crate doesn't have access to. Otherwise though there's definitely no need for dynamic trait dispatch here.
alexcrichton updated PR #2820 from faster-stack-maps
to main
.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Makes sense. I guess the alternative would be to use a "lookup" trait that the runtime interacts with (it would just need the
lookup_stack_map
method), freeing the registry implementation to live in thewasmtime
crate, but that's just trading dynamic-dispatch during module instantiation to dynamic-dispatch at externref gc.
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
If we did move the registry implementation to
wasmtime
, I think this data structure could be a generic one replacingStoreFrameInfo
too, basically just a "whichCompiledModule
should I be talking to given this pc?" which responds with an answer that we then use to lookup the appropriate information (gimmie frame info, gimmie trap info, gimmie stack maps, etc.)
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
In fact, given externref gc being a relatively expensive (and hopefully very infrequent) operation and the cost of doing trait object dispatch per frame would be negligible, I would argue
VMContext
should have*const dyn StackMapLookup
rather than*mut StackMapRegistry
.This frees us to remove
StackMapRegistry
andStoreFrameInfo
in favor of a singleCompiledModuleRegistry
per store which impls the runtime'sStackMapLookup
and can lookup frame and trap information as well.Thoughts on this approach? Perhaps for another PR?
peterhuene edited PR Review Comment.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Hm good points! I'll look to add this to this PR, seems reasonable to do here!
alexcrichton closed without merge PR #2820.
Last updated: Jan 24 2025 at 00:11 UTC