wasmtime / PR #3733 Implement lazy funcref table and anyf... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #3733 Implement lazy funcref table and anyf...

Wasmtime GitHub notifications bot (Feb 03 2022 at 07:56):

cfallin edited PR #3733 from lazy-anyfuncs to main:

During instance initialization, we build two sorts of arrays eagerly:

We create an "anyfunc" (a VMCallerCheckedAnyfunc) for every function
in an instance.

We initialize every element of a funcref table with an initializer to
a pointer to one of these anyfuncs.

Most instances will not touch (via call_indirect or table.get) all
funcref table elements. And most anyfuncs will never be referenced,
because most functions are never placed in tables or used with
ref.func. Thus, both of these initialization tasks are quite wasteful.
Profiling shows that a significant fraction of the remaining
instance-initialization time after our other recent optimizations is
going into these two tasks.

This PR implements two basic ideas:

The anyfunc array can be lazily initialized as long as we retain the
information needed to do so. A zero in the func-ptr part of the tuple
means "uninitalized"; a null-check and slowpath does the
initialization whenever we take a pointer to an anyfunc.

A funcref table can be lazily initialized as long as we retain a link
to its corresponding instance and function index for each element. A
zero in a table element means "uninitialized", and a slowpath does the
initialization.

The use of all-zeroes to mean "uninitialized" means that we can use fast
memory clearing techniques, like madvise(DONTNEED) on Linux or just
freshly-mmap'd anonymous memory, to get to the initial state without
a lot of memory writes.

Funcref tables are a little tricky because funcrefs can be null. We need
to distinguish "element was initially non-null, but user stored explicit
null later" from "element never touched" (ie the lazy init should not
blow away an explicitly stored null). We solve this by stealing the LSB
from every funcref (anyfunc pointer): when the LSB is set, the funcref
is initialized and we don't hit the lazy-init slowpath. We insert the
bit on storing to the table and mask it off after loading.

Performance effect on instantiation in the on-demand allocator (pooling
allocator effect should be similar as the table-init path is the same):
sequential/default/spidermonkey.wasm
                        time:   [71.886 us 72.012 us 72.133 us]

sequential/default/spidermonkey.wasm
                        time:   [22.243 us 22.256 us 22.270 us]
                        change: [-69.117% -69.060% -69.000%] (p = 0.00 < 0.05)
                        Performance has improved.
So, 72µs to 22µs, or a 69% reduction.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

Running across this I'm a bit confused because I thought that this would be handled elsewhere. Poking around I'm also surprised to not see any modifications to initialize_tables. Running this PR as well shows that this is still quite hot on instantiation.

Overall it looks like Table::get, the API, is lazy, but element initialization isn't lazy yet?

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

I was personally hoping we could get away from this entirely and avoid needing this, but I don't think that's going to be the case. If instead we lean into this, I've got two concerns about this approach:

Performance wise SharedSignatures::Table(val.clone()) happens to create this SharedSignatures which is a vec clone/allocation that otherwise doesn't really need to happen. That's something I think we should avoid.

Otherwise conceptually InstanceAllocationRequest I feel is getting pretty confusing where it's basically just &wasmtime_jit::CompiledModule but sort-of-not because we don't want to talk about all the types in one crate. This has performance costs because everything is stored in its own separate Arc which means allocation/deallocation is lots of little Arc adjustments instead of simply one. It also has maintainability issues because so much of this has to change every time we add a new field here (lots to plumb through here as you see).

I feel that the changes made in this PR are somewhat past the breaking point that this is in need of a refactoring before landing. If you'd prefer I think it's ok to land the refactoring ahead of time but doing it as part of this PR is also fine by me.

What I'm thinking though is that we'd create a trait like this in wasmtime_runtime:
trait Module {
    fn env_module(&self) -> &wasmtime_environ::Module;
    fn unique_id(&self) -> Option<CompiledModuleId>;
    fn memfds(&self) -> Option<&ModuleMemFds>;
    fn functions(&self) -> &PrimaryMap<DefinedFuncIndex, FunctionInfo>;
    fn shared_signatures(&self) -> SharedSignatures<'_>;
    fn wasm_data(&self) -> &[u8];
    fn image_base(&self) -> usize;
}
and then InstanceAllocationRequest would contain:
pub struct InstanceAllocationRequest<'a> {
    pub module: &'a Arc<dyn wasmtime_runtime::Module>,
    // ... anything else not related to modules
}
I believe this will remove the need for management of Arc for each individual field all over the place and allow us to move towards having just one Arc inside of wasmtime::Module. I don't know whether wasmtime_runtime::Module would be implemented for wasmtime::Module or wasmtime_jit::CompiledModule. I think ideally CompiledModule would be used but I don't know if it has all the pieces internally necessary, it may be either wasmtime::Module or wasmtime::ModuleInner that implements this (or something like that).

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

Since this is the slow path this function might be a good candidate for #[cold] (not that I think it will have any effect other than preventing inlining and making asm a bit more readable should we happen to come across it)

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

Instead of using memset here could the alloc above instead be alloc_zeroed?

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

In general InstanceHandle is intended to roughly represent an owned instance, so I don't think that it's the best choice for this type parameter.

I think that at the Table layer here it's best for get to return enum { None, NotInitialized, Some(element) } and for an upper layer to handle the lazy initialization. Notably I think that Instance would be a good place to handle the "do the get, if that fails, do a set" since that's where the data for the lazily initialized slot will be stored anyway.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

AFAIK this statement has no effect, it's still possible to use vmctx_data after this since rustc will automatically reborrow for this call.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

It's worth pointing out as well that this has ramifications on other parts of the system because currently every Store has a ModuleRegistry which keeps all instantiated modules alive for their backing data (e.g. keeping the wasm_data: *mut [u8] alive. With this dyn Module concept that would no longer be necessary really and we might be able to simplify ModuleRegistry in the future as well, but that's probably best left to a future PR.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

Right now I think that this may be detrimental to the on-demand allocator because it either does a too-large memset right now for the entire VMContext or it will need to call calloc, which also does a too-large memset.

An alternative design, though, would be to pass a flag into this function whether the memory is known to be zero'd. That way we could conditionally zero out the array of VMCallerCheckedAnyfunc contents based on whether it was pre-zeroed already, making the pooling allocator efficiently use madvise while the on-demand allocator still initializes as little memory as possible.

After saying this, though, I think it may actually be best to take this incrementally? What do you think about using memset here to zero memory and benchmarking later how much faster using madvise is for the pooling allocator? I'm a little worried about the madvise traffic being increased for the pooling allocator since we already know it's a source of slow tlb-shootdowns, and I'm afraid that any cost of that will be hidden by the other performance gains in this PR so we can't accurately measure the strategy of memset-on-initialize vs madvise-to-zero.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

This seems like it could be a footgun waiting to happen in the sense that it promotes a shared reference, &Instance, to an InstanceHandle which can be trivially used to get &mut Instance. Ideally, though, this function wouldn't need to be added at all. I'm hoping that my suggestion below about moving the Instance arguments out of Table itself will help avoid the need for this function.

Wasmtime GitHub notifications bot (Feb 03 2022 at 16:16):

alexcrichton created PR review comment:

Could this use a debug_assert! that everything is zero'd?

Wasmtime GitHub notifications bot (Feb 03 2022 at 17:41):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 03 2022 at 17:53):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 04 2022 at 02:46):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 04 2022 at 02:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 04 2022 at 02:51):

cfallin created PR review comment:

(resolving, as I didn't actually push the latest changes that did the lazy init when you did this review; sorry!)

Wasmtime GitHub notifications bot (Feb 04 2022 at 02:55):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:25):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:30):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:30):

cfallin created PR review comment:

Adjusted to from_instance_mut and take the &mut Instance. Still needed at the moment as the lazy-table-init API needs to be accessible both from the runtime/libcall and from the upper wasmtime crate layer -- so I create an InstanceHandle temporarily inside the libcall. This seems reasonably safe but if there is a better way to encapsulate/reconfigure I'm all ears!

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:30):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:31):

cfallin created PR review comment:

I've refactored basically along these lines -- it cleaned a lot of things up. Thanks for the idea!

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:33):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:33):

cfallin created PR review comment:

Reworked this bit along the lines you suggest:

the TableElement has an UninitFunc arm;

There is a separate TableAndOwningInstance handle-like struct that encapsulates a Table and an Instance and knows how to do lazy init (and can hand out projected borrows to the underlying table);

We get this TableAndOwningInstance in the various spots where a table is looked up on an instance.

The design is still ever so slightly awkward (see comment on InstanceHandle::from_instance_mut) because it needs to be available both at the libcall layer and in the wasmtime::Table implementation, but this is a lot cleaner for sure.

Wasmtime GitHub notifications bot (Feb 05 2022 at 01:42):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 02:48):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:02):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:13):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:13):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:13):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:13):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:13):

cfallin created PR review comment:

Ah, yes, added.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:14):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:14):

cfallin created PR review comment:

Switched instead to pass in a "not prezeroed" flag as suggested below, so we can zero just the anyfunc area.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:15):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:15):

cfallin created PR review comment:

Ah, ok; I was trying to be as lifetime-correct with the aliasing as possible but I suppose it was just paranoia. Removed this zeroing in any case.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:19):

cfallin created PR review comment:

Modified to take a prezeroed flag as suggested -- good idea!

I'm a little hesitant to settle on a memset-only design even for the pooling allocator, as my earlier measurements with a large module (spidermonkey.wasm) showed that that became really quite hot; that's why I had gone to the initialized-bitmap instead (1 bit vs 24 bytes to zero --> 192x less). But I will benchmark this to be sure on Monday.

Alternatively, I think that if we move to a single-slot-per-instance design for the pooling allocator, such that we have one madvise for everything (memories, tables), stretching that range to include a few extra pages for the Instance and VMContext should be no big deal (it will share one mmap-lock acquire and one set of IPIs), so I suspect that madvise-cost may not have to be a huge consideration in the future. Of course getting there implies removing uffd first, since it depends on the separate-pools design, so we would have to make that decision first.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:25):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 03:54):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 06:33):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 05 2022 at 07:16):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Can this Arc<Module> get folded into the runtime_info as well? (returned as &Module since there shouldn't be a need to wrap this in an Arc any more)

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

With recent refactorings does this still need to be in a separate Arc?

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

I think that this map iteration can probably be entirely replaced with self.get_caller_checked_anyfunc().unwrap_or(ptr::null_mut()) now perhaps? (no need to comment that we're specifically forcing a lazy init ideally)

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

The ownership here feels pretty weird to me and I'm not sure works out well. For example TableAndOwningInstance isn't connected to &mut self so it's pretty easy to violate borrowing rules here.

Overall though I think it'd ideally be best to remove the need for this. Could the lazy_init methods on TableAndOwningInstance be moved to methods on Instance here?

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

If you're feeling extra-ambitious this would also be a good item to move into ModuleRuntimeInfo since it's the same for all modules. (this of course is fine to leave for later though)

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Were you able to test/benchmark to see the affect of madvise-vs-memset and see which is faster?

Given our recent measurements about ipis and madvise I would actually imagine that always-memset may be the approach we want here

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

I think this can probably go back to the way it was, right? e.g. the trait would be:
trait ModuleRuntimeInfo {
    fn shared_signatures(&self) -> SharedSignatures<'_>;
}

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

This might be something worth benchmarking as well perhaps. Now that I see this I see that this is an indirect function call in a loop (and with this current PR is actually creating-and-destroying an Arc each iteration of the loop here).

I think this can probably be fixed by moving the shared_signatures call above the loop (but it's still good to remove the Arc traffic). Otherwise though in the future the table built here is static-per-module so we could probably also store a long-lived table in the module itself and have a level of indirection here like we do for the builtin functions (although that's of course all better left for a future PR)

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

If possible I think it'd be best to avoid adding this impl, and if the signature of the trait is changed like I mentioned above I think this can be avoided.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Could this use a method on VMOffsets or some other similar helper. I still really want to implement the change where we shrink this table, and hardcoding that the size of the table is the size of instance.module.functions.len() is somewhat non-obvious here and perhaps disconnected from other calculations about the size of this table.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

To avoid the clone() here can the module be provided as an argument? I think that's done for tables/memories since we have a copy of the module already in the InstanceAllocationRequest which can be separately borrowed

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

If this sticks around (maybe-pre-zeroed, which I still think should be measured first before we commit to it) then I think it should be a named enum parameter instead of a comment-on-each-callsite-with-bool parameter.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

I'm only this far in reviewing and haven't gotten to the table module yet, but I don't think that the REF_MASK or w/e encoding is done internally should be open-coded here, instead this seems like something that the method on elem.into_raw() would handle

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Could this have .expect or a comment indicating why the .unwrap() is ok?

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

If it works out I think it'd be best to return this as &Arc<MemoryMemFd> instead of returning the Arc-by-value, since that's generally more flexible and delays the reference counting until it really needs to happen.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Does this need to be a separate allocation? I was expecting this could do impl ModuleRuntimeInfo for ModuleInner or similar.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

Having ModuleRuntimeInfo for ModuleInner would remove the need for the unsafety here entirely I think since we'd never need &'static [u8] for the data.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

I commented above, but ideally I think this would not be pub(crate) and would instead be local to this module.

Additionally I think the definition here could be !REF_INIT_BIT?

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

The Module-entry-points I already always find a bit confusing since there's so many, but if setting the name is really so onerous that we need a differently-delegated entry point for each one that feels like overkill for such a minor feature. I'm not sure if a named module is really used all that much, if ever, to the point that if it's necessary to have brand new entry points all over the place here to juggle the Arc and such it seems like we should just remove the ability to customize the name.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

This is one reason I think that this should do ModuleRuntimeInfo for ModuleInner, I don't think there's any reaosn to duplicate the information within a module, instead it should all reside in one location and it's always referenced from there.

Wasmtime GitHub notifications bot (Feb 07 2022 at 18:40):

alexcrichton created PR review comment:

In the same manner that we have enum MemoryInitialization with one variant being paged, I think it would be a better design to have enum TableInitialization with one variant being Vec<Option<FuncIndex>> for tables where we statically know how the initialization happens. That would also shift more of this work to compile time where I think it probably better fits as well.

Wasmtime GitHub notifications bot (Feb 07 2022 at 19:24):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 07 2022 at 19:24):

cfallin created PR review comment:

Sure, the intent was to try not to force too much turning-inside-out in calling code, but I can invert this if needed; the Instance is available everywhere this is used I think.

The intent here was to try to bundle all of the relevant bits together for extra safety: otherwise, you specify the index once when getting the Table but then again when accessing bits of the Instance as it locates the correct info in its lazy-init structures.

The ownership concern I think should be addressed if we hold a &'a mut Instance with the usual 'a-tied-to-self pattern yeah? This was originally written this way because I had to use InstanceHandles to expose to the wasmtime crate but now it's all wrapped up again so that's no longer an issue.

Anyway I'll play with this a bit more...

Wasmtime GitHub notifications bot (Feb 07 2022 at 19:24):

cfallin edited PR review comment.

Wasmtime GitHub notifications bot (Feb 07 2022 at 19:30):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 07 2022 at 19:30):

alexcrichton created PR review comment:

In some sense Instance is "the package" in this case so I'd prefer to avoid adding more package types than necessary. The ownership in all of wasmtime-runtime is pretty murky to be honest and probably isn't 100% safe from something like a miri perspective, but if we can I'd prefer to stick to similar idioms used elsewhere, so something that needs table and module information would stick to Instance whereas only-needs-table things would stick to Table

Wasmtime GitHub notifications bot (Feb 07 2022 at 23:56):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 07 2022 at 23:57):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:01):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:01):

cfallin created PR review comment:

The reason for this is that it's mirroring the logic that occurs in the generated code. TableElement::into_raw() produces the form that is stored in the table, which includes the tag-bits; while the return value from the libcall produces the form that we load out of the table, which is normally masked by a discrete and instruction in the generated code.

I'll add a comment describing this so it's clear why it's here though!

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:10):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:11):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:14):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:14):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:14):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:14):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 01:15):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:49):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:53):

cfallin created PR review comment:

Done, refactored away from this design.

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:57):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:57):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 07:57):

cfallin created PR review comment:

Removed!

Wasmtime GitHub notifications bot (Feb 08 2022 at 15:36):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 15:36):

alexcrichton created PR review comment:

FWIW I don't think we should remove this just because we can, I was pointing this out to show that I think it's fine to remove this if it's onerous for us to maintain. I suspect that if the ModuleRuntimeInfo doesn't live as a separate structure then the internal refactoring which led to this being not possible or more complicated is probably no longer there, in which case I don't think there's as much of a reason to remove this and I think we should probably keep it.

Wasmtime GitHub notifications bot (Feb 08 2022 at 17:33):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 17:37):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:16):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:17):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:18):

cfallin created PR review comment:

Done, this refactor makes things much cleaner -- thanks!

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin created PR review comment:

Done.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin created PR review comment:

Done.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:19):

cfallin created PR review comment:

Removed!

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:20):

cfallin created PR review comment:

I actually managed to remove SharedSignatures altogether; the runtime-info trait just provides a lookup function and the implementation used by special-purpose modules (default callee, etc) can do its own "return one signature" thing.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:20):

cfallin created PR review comment:

Removed this Arc clone altogether!

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:22):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:22):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:22):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:23):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:23):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 08 2022 at 23:25):

cfallin created PR review comment:

I think I might save this one for later as it involves a bit more cross-crate work (the wasmtime crate needs to see the VMOffsets and own it in the wasmtime::Module); but it's a good idea.

Wasmtime GitHub notifications bot (Feb 09 2022 at 01:10):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 01:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 01:11):

cfallin created PR review comment:

Done! This is also a nice cleanup, thanks.

Wasmtime GitHub notifications bot (Feb 09 2022 at 04:17):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 05:20):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 05:55):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Something I've sort of forgotten until now, it might be good to confirm that this doesn't have too much of a performance hit. My main worry here would be around call_indirect. I suspect that call_indirect itself is indeed getting slower here, but I'm less interested in that than an overall performance profile of a module. Could you run some tests with like a JS markdown renderer to see if there's a meaningful decrease in performance or whether it's in the noise? My hunch is that any slowdown here is in the noise for any module, but it's probably good to confirm that before landing.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Reading this again, could the constants here be extracted to something in the wasmtime_environ crate to be shared between compile-time and runtime?

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Could this use a style more similar to MemoryInitialization to make sure that we're consistent between the two? Where one variant of TableInitialization is "list of initializers" and the second variant is "precomputed per-table lists"

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Personally I try to avoid methods like this for not-so-commonly-used types since specifically asking for one variant almost defeats the purpose of exhaustive matching in Rust. In other words we'd lose any benefit where if a variant is added to this enum we can't rely on the compiler to tell us to go update all the various places, instead we have to audit all callers of these methods to determine if they need to handle the new case.

If possible can these helpers get removed in favor of a match on call-sites with comments for why one or the other case is ignored?

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

It might be worth double-checking but I think that Option<FuncIndex> is actually a 64-bit value whereas FuncIndex is a 32-bit value. To make this more compact I think we could represent None with FuncIndex::reserved_value().

Personally I'm not a fan of one value of FuncIndex meaning "none" but it's the design we already have and changing that is beyond the scope of this PR. Despite my personal thoughts, though, it does lend itself well to this use case.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

I think most other places we check this by using module.defined_table_index(index), so coul that be used here instead?

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Actually now that I think about this in relation to my above comment I don't think that this is quite right in that element segments can have ref.null as a particular element which is represented with FuncIndex::reserved_value() which gets set to Some here as well. That's probably handled gracefully elsewhere but in any case I think it'd be good to avoid the Option<FuncIndex> here

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

This seems to use a fallible get in combination with take_while which ends up doing two rounds of bounds-checks? Could this instead always do get and break on out-of-bounds?

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

I think this may be able to drop the pub(crate) now?

At least for me I find it useful to have as few implementation details as possible as pub or pub(crate) because it means you don't have to worry about anyone outside this module using this and you only have to audit this one module.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

FWIW here I think instead of inserting something into self.result.module.table_initialization it might be more ergonomic to build a local list of initializers (sort of like how this built it previously into self.result.module) and then after the loop is finished it simply overwrites what was previously located at self.result.module.table_initialization, which because we only deal with valid modules at this point is guaranteed to be an empty list.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

This is probably less cold now to the point that now it's better to remove this.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Idiomatically I think this'd be better named into_traitobj

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

FWIW the + 'static can be removed here (and in other locations probably) since if you don't mention anything that's the default.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

I think this comment may be missing leading & on the types because otherwise this is saying that the cast happening here is the one that's not possible.

Wasmtime GitHub notifications bot (Feb 09 2022 at 16:34):

alexcrichton created PR review comment:

Could this use *into = VMCallerCheckedAnyfunc { ... } to statically assert that we initialize all fields?

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:31):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:31):

cfallin created PR review comment:

I tried that at first the but issue is that the mode decision is per-table; so the "FuncTable" mode would then need an inner enum for each table to give the list of indices or the original list of initializers, which just reinvents this enum.

It looks like the MemoryInitialization mechanism gets around this by only converting to paged mode if all memories are compatible? I'm not sure if we want to do that for tables: it would create a perf cliff where adding one initializer for one imported table slows the whole instantiation down. Open to other ideas though -- thoughts?

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:38):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:38):

alexcrichton created PR review comment:

I think the same argument applies to multi-memory where one out-of-bounds-initializer shouldn't necessarily create a cliff for the other memories. In that sense I'd prefer that we're consistent and fix this later if necessary (either by going the route you've done here or something similar). None of this is really that much of an issue today since modules generally have at most only one funcref table.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin created PR review comment:

Ah, it's a borrowing thing (we take a &mut self.module.table_initialization via the iterator); but I've moved this to a .skip on the iter, which is I think a pattern used elsewhere.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:42):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:43):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 19:43):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:41):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin created PR review comment:

Actually now with refactor this went away and we can use defined_table_index as you suggest.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin created PR review comment:

Ah, yes, better!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin created PR review comment:

Right, good catch, fixed!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:48):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:53):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:53):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 20:53):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 09 2022 at 21:03):

cfallin updated PR #3733 from lazy-anyfuncs to main.

Wasmtime GitHub notifications bot (Feb 09 2022 at 21:12):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 09 2022 at 21:12):

cfallin created PR review comment:

Just tested with a local Viceroy build running a markdown render service that takes 1.2s to run; delta is in the noise (+/- 1% either way, varies by run).

Wasmtime GitHub notifications bot (Feb 09 2022 at 21:56):

cfallin merged PR #3733.

Last updated: Apr 16 2025 at 22:03 UTC