jameysharp added the wasmtime label to Issue #8212.
jameysharp added the performance label to Issue #8212.
jameysharp opened issue #8212:
Feature
I would like to remove the callee vmctx field from `VMFuncRef`. Anywhere that we currently have a pointer to a `VMFuncRef`, we would instead have a pair of that pointer plus a callee vmctx pointer.
Benefit/Implementation
Currently any use of `ref.func` in Wasmtime is compiled to a libcall, including table initialization from an element segment. If the function in question is declared within the current module, then the libcall has to initialize a `VMFuncRef` structure within the vmctx and then return a pointer to that structure. (If it's an import, then we have a `VMFuncRef` pointer in the corresponding import, and can just return that.)
This is currently necessary because `VMFuncRef` includes the callee's vmctx pointer, which is not known until instantiation time.
But all of the other fields are constant for a particular function once the module is loaded: The type ID is determined by the engine based on what other modules were loaded previously, and the function pointers are relocated according to the load address of the module, but none of that changes afterward.
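
To make the split concrete, here is a rough sketch of the two layouts. This is not Wasmtime's actual definition: the type names, the field names, and the choice of three code pointers are assumptions, picked so that the per-instance structure comes out to the five words mentioned in this issue.

```rust
use std::mem::size_of;

/// Opaque stand-in for a per-instance context (hypothetical name).
#[repr(C)]
pub struct VMContext {
    _private: [u8; 0],
}

/// Simplified picture of the current layout. Only `vmctx` is unknown
/// until instantiation; the rest is fixed once the module is loaded.
#[repr(C)]
pub struct FuncRefWithVmctx {
    pub code: [*const u8; 3], // assumed number of entry points
    pub type_id: u32,         // engine-assigned type ID
    pub vmctx: *mut VMContext,
}

/// Proposed per-module structure: the same fields minus `vmctx`.
#[repr(C)]
pub struct FuncRefShared {
    pub code: [*const u8; 3],
    pub type_id: u32,
}

/// Proposed fat pointer stored wherever a funcref pointer lives today.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct FatFuncRef {
    pub func_ref: *const FuncRefShared,
    pub vmctx: *mut VMContext,
}

/// Size of `T` measured in pointer-sized words.
pub fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

/// Pair a shared funcref with the instance it should run in.
pub fn pair_with_vmctx(func_ref: *const FuncRefShared, vmctx: *mut VMContext) -> FatFuncRef {
    FatFuncRef { func_ref, vmctx }
}
```

Under these assumptions the fat pointer is two words, so each table element grows by one word compared with a plain pointer, while the instantiation-dependent part shrinks to just the vmctx half of the pair.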
So removing the callee vmctx field means we can initialize all the `VMFuncRef` structures when the module is loaded. Then we can keep a single array of them attached to the module, and remove the space reserved for them in each vmctx. So although tables will need an additional word per element in each instance due to the fat-pointer representation, I think that's more than offset by removing five words per funcref from the vmctx for every instance.
At that point, `ref.func` on a locally declared function just needs to get the address of a constant index into that module-global array of `VMFuncRef`s, and pair it with the current vmctx. So `ref.func` should compile to a base-pointer load and an add for locally declared functions (compared to two loads for imported functions). And that base-pointer load will be `notrap` and `readonly`, so it can be subject to GVN and LICM if `ref.func` is used multiple times or in a loop.
Similarly, initializing tables from element segments can be fast: after loading the `VMFuncRef` array base pointer once, the address of each locally declared function's `VMFuncRef` can be computed by adding a compile-time constant offset. Maybe it's fast enough to remove the lazy-init optimization entirely, as in #8002.
Alternatives
In #8195 I suggested an alternative representation for read-only funcref tables. Now I've learned that the type IDs aren't known until the module is loaded so that plan doesn't work as written, but the tables could still be quickly unpacked when the module is loaded.
The above proposal is more general than #8195: I believe this should work equally well in all WebAssembly modules and components, even when the table is writable, or an active element segment is applied to an imported table, or an element segment uses `global.get`. This proposal also doesn't require trampolines for imported functions like that one did. On the other hand, the read-only tables proposal might speed up module loading slightly relative to this plan.
cc: @fitzgen @alexcrichton @cfallin
Last updated: Dec 23 2024 at 12:05 UTC