fitzgen added the cranelift:goal:optimize-speed label to Issue #9352.
fitzgen added the wasmtime:ref-types label to Issue #9352.
fitzgen added the performance label to Issue #9352.
fitzgen opened issue #9352:
Currently the
TypeRegistry::is_subtype
method requires taking a read lock to access the supertypes arrays that allow for O(1) subtype checking. At the time of writing, this doesn't power theref.{test,cast}
instructions because I haven't gotten around to implementing them yet, but it is expected that it will for the very first MVP implementation of them. It is also used in the id-to-funcref conversion for the funcrefs side table (see https://github.com/bytecodealliance/wasmtime/pull/9341).Acquiring a lock during Wasm execution is, of course, not ideal for horizontal scaling within a process.
We can avoid locks for most cases by creating a shared lock-free arena of supertype indices (potentially built on top of a linear memory) where the arena's free list and mapping of type id to pointer-to-supertype-array are behind a lock, but accessing already-allocated supertypes arrays doesn't require any locking. (The supertypes arrays are immutable for as long as they exist, which makes accessing them without locking safe.)
(Actually, as I write this out, this is basically what we are doing in
TypeRegistry
today already, except we're allocating supertypes arrays via the global allocator and they aren't all in the same shared arena. But there is no reason why we couldn't start exposing the pointers to the supertypes arrays to Wasm today, modulo some#[repr(C)]
sprinkling and a few little things like that).Our various operations would then proceed as follows:
- Creating a new module: take the lock, allocate any supertype arrays in the arena that aren't already in there, and then for each of the module's types, grab a pointer to its supertypes array that can be used without locking.
ref.{test,cast}
and other instructions that do subtyping checks: Look up the operand's type in the module's types-to-supertypes-array mapping, which doesn't require locking. If the supertypes array exists, use it to test subtyping and we're done.
- If the supertypes array doesn't exist for this type, which is possible when given some external
struct
instance whose type isn't defined in this module but could be a subtype of a type defined in this module for example, then we still have to fall back toTypeRegistry::is_subtype
and locking. We could always throw an LRU cache in front of this to help delay avoid repeated locking. Open to more ideas for this degenerate case. This gets a bit hard for the super degenerate cases where a Wasm module defines astruct
type, calls a host function, the host function then dynamically defines a new type (after the module is created and instantiated!) that subtypes the module's type and returns an instance of it, and then the wasm does a subtype check on that instance.- id-to-funcref conversion: works basically the same as the
ref.{test,cast}
instructions, we have access to the wasm module from the libcall and can use the module's pointers to the supertypes arrays without locking the wholeTypeRegistry
.- Dropping a module: opposite of taking a module. Takes a lock, decrefs/removes the module's types and deallocates any now-unused supertypes arrays.
xpbowler commented on issue #9352:
Hi @fitzgen, I want to give a shot at this! are there any pointers you would give to get started (I'm a newbie, haha)
fitzgen commented on issue #9352:
Hi @xpbowler,
I don't really have anything on top of what I wrote in the OP, but I must warn you that this is probably not the best task to take on as a first contribution as it is pretty involved. It also isn't clear that this is the lowest hanging fruit for wasm gc perf right now.
Something that might be a little bit more straight forward, but could then blossom out into lots of different and interesting paths, would be to
- take a look at some potential GC benchmarks (listed here
- get them running under Wasmtime (ideally under the sightglass benchmark harness and committed in sightglass's benchmark suite)
- profile the new benchmark(s) and see where time is spent and what our codegen for those sections looks like
- report the results in the wasm GC perf umbrella issue, come up with new optimization ideas, prioritize existing optimization ideas based on those results, etc...
fitzgen edited a comment on issue #9352:
Hi @xpbowler,
I don't really have anything on top of what I wrote in the OP, but I must warn you that this is probably not the best task to take on as a first contribution as it is pretty involved. It also isn't clear that this is the lowest hanging fruit for wasm gc perf right now.
Something that might be a little bit more straight forward, but could then blossom out into lots of different and interesting paths, would be to
- take a look at some potential GC benchmarks (listed here)
- get them running under Wasmtime (ideally under the sightglass benchmark harness and committed in sightglass's benchmark suite)
- profile the new benchmark(s) and see where time is spent and what our codegen for those sections looks like
- report the results in the wasm GC perf umbrella issue, come up with new optimization ideas, prioritize existing optimization ideas based on those results, etc...
xpbowler commented on issue #9352:
Thanks, I'll take a look at those instead to get started!
xpbowler deleted a comment on issue #9352:
Thanks, I'll take a look at those instead to get started!
Last updated: Dec 23 2024 at 12:05 UTC