wasmtime / PR #8011 Define garbage collection rooting APIs · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #8011 Define garbage collection rooting APIs

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen opened PR #8011 from fitzgen:rooting-api to bytecodealliance:main:

Rooting prevents GC objects from being collected while they are actively being used.

We have a few sometimes-conflicting goals with our GC rooting APIs:

Safety: It should never be possible to get a use-after-free bug because the user misused the rooting APIs, the collector "mistakenly" determined an object was unreachable and collected it, and then the user tried to access the object. This is our highest priority.

Moving GC: Our rooting APIs should moving collectors (such as generational and compacting collectors) where an object might get relocated after a collection and we need to update the GC root's pointer to the moved object. This means we either need cooperation and internal mutability from individual GC roots as well as the ability to enumerate all GC roots on the native Rust stack, or we need a level of indirection.

Performance: Our rooting APIs should generally be as low-overhead as possible. They definitely shouldn't require synchronization and locking to create, access, and drop GC roots.

Ergonomics: Our rooting APIs should be, if not a pleasure, then at least not a burden for users. Additionally, the API's types should be Sync and Send so that they work well with async Rust.

For example, goals (3) and (4) are in conflict when we think about how to support (2). Ideally, for ergonomics, a root would automatically unroot itself when dropped. But in the general case that requires holding a reference to the store's root set, and that root set needs to be held simultaneously by all GC roots, and they each need to mutate the set to unroot themselves. That implies Rc<RefCell<...>> or Arc<Mutex<...>>! The former makes the store and GC root types not Send and not Sync. The latter imposes synchronization and locking overhead. So we instead make GC roots indirect and require passing in a store context explicitly to unroot in the general case. This trades worse ergonomics for better performance and support for moving GC and async Rust.

Okay, with that out of the way, this module provides two flavors of rooting API. One for the common, scoped lifetime case, and another for the rare case where we really need a GC root with an arbitrary, non-LIFO/non-scoped lifetime:

RootScope and Rooted<T>: These are used for temporarily rooting GC objects for the duration of a scope. Upon exiting the scope, they are automatically unrooted. The internal implementation takes advantage of the LIFO property inherent in scopes, making creating and dropping Rooted<T>s and RootScopes super fast and roughly equivalent to bump allocation.

This type is vaguely similar to V8's [HandleScope].

[HandleScope]: https://v8.github.io/api/head/classv8_1_1HandleScope.html

Note that Rooted<T> can't be statically tied to its context scope via a lifetime parameter, unfortunately, as that would allow the creation and use of only one Rooted<T> at a time, since the Rooted<T> would take a borrow of the whole context.

This supports the common use case for rooting and provides good ergonomics.

ManuallyRooted<T>: This is the fully general rooting API used for holding onto non-LIFO GC roots with arbitrary lifetimes. However, users must manually unroot them. Failure to manually unroot a ManuallyRooted<T> before it is dropped will result in the GC object (and everything it transitively references) leaking for the duration of the Store's lifetime.

This type is roughly similar to SpiderMonkey's [PersistentRooted<T>], although they avoid the manual-unrooting with internal mutation and shared references. (Our constraints mean we can't do those things, as mentioned explained above.)

[PersistentRooted<T>]: http://devdoc.net/web/developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/JSAPI_reference/JS::PersistentRooted.html

At the end of the day, both Rooted<T> and ManuallyRooted<T> are just tagged indices into the store's RootSet. This indirection allows working with Rust's borrowing discipline (we use &mut Store to represent mutable access to the GC heap) while still allowing rooted references to be moved around without tying up the whole store in borrows. Additionally, and crucially, this indirection allows us to update the actual GC pointers in the RootSet and support moving GCs (again, as mentioned above).

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen requested elliottt for a review on PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen requested wasmtime-core-reviewers for a review on PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen requested alexcrichton for a review on PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 13:59):

fitzgen commented on PR #8011:

Part of https://github.com/bytecodealliance/wasmtime/issues/5032

Wasmtime GitHub notifications bot (Feb 28 2024 at 14:01):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 14:15):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 14:30):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 15:03):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 15:04):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 28 2024 at 17:44):

github-actions[bot] commented on PR #8011:

Subscribe to Label Action

cc @fitzgen, @peterhuene

<details>
This issue or pull request has been labeled: "cranelift", "fuzzing", "wasmtime:api", "wasmtime:c-api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

fitzgen: fuzzing, wasmtime:ref-types

peterhuene: wasmtime:api, wasmtime:c-api

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton submitted PR review:

At a high level this all looks good to me. I think there's work we can do to build confidence in the internals, though. While I realize we can't remove all unsafe here I suspect we don't need quite so many just-a-typed-transmute functions. Those are really difficult to reason about the safety.

Additionally I think this is definitely an area where we're going to be leaning on miri pretty heavily. Can you ensure that there's tests that run in miri doing all the various bits and bobs with the API other than actually calling in to wasm?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton submitted PR review:

At a high level this all looks good to me. I think there's work we can do to build confidence in the internals, though. While I realize we can't remove all unsafe here I suspect we don't need quite so many just-a-typed-transmute functions. Those are really difficult to reason about the safety.

Additionally I think this is definitely an area where we're going to be leaning on miri pretty heavily. Can you ensure that there's tests that run in miri doing all the various bits and bobs with the API other than actually calling in to wasm?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Given the clones of VMExternRef this seems like this method should either not exist or should at least be marked unsafe?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

For this style of lifetime management you'll want to use impl Into<StoreContext<'a, T>> rather than AsContext and then 'a is the lifetime that you want

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Should this debug_assert! the return value is Some?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

If possible, these tests should definitely try to run under miri since they're not dealing with wasm compilation

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

I realize that this is probably copied from somewhere else, but reading over this again, this doesn't actually buy us anything right? The return value of this function is an unrooted pointer so for this assertion to be correct it'd actually have to be at a higher scope, right?

Put another way, this function should probably assert it's in a "assert no gc scope", right? (as opposed to creating a temporary assert-no-gc-scope)

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Mind updating the docs just above this to reflect the current state? (they were already outdated for the prior AtomicUsize too)

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

To confirm, while this is fine to be explicit I don't think that this is a different signature than before, just elaborated.

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

I don't think this can be a safe function because there's no guarantee that inner belongs to store, right?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

For a future PR, and with more GC types, I think it'd be reasonable to move these impls to the gc_ref.rs module to avoid the #[cfg] here

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Can this be a method on VMGcRef?

Also, this implementation only seems correct when we only have one gc reference type, so could this perhaps assert somehow that there's only one? Something like assert!(ONLY_EXTERNREF_IMPLEMENTED_YET) and then we delete that const when more than one type is implemented and all callsites are updated?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Technically doesn't this API example leak the externref for the entire lifetime of store? If so can this perhaps show using a RootScope to not do that?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Could this be a method on VMExternRef?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

While I'm not certain that this would work, one thing worth trying to solve some of the lifetime issues I mentioned above is to change this method signature to:
        fn get_gc_ref<'a>(&self, store: &'a StoreOpaque) -> Option<&'a VMGcRef>;

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Personally I'm very wary of these functions in addition to the ref_from_raw functions in the runtime crate. Can these be removed entirely? I would like to ideally rely on natural lifetime stuff in rustc to avoid any need to do this sort of transmute.

For example there's basically no way to locally reason that this transmute is correct, so I need to build a total understanding of the whole GC as a result to know when it's safe and when it's not safe to use this.

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

I think this needs to be unsafe since there's no guarantee gc_ref is valid or belongs to this store?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Since this is on the hot path of calls out of wasm, perhaps #[inline]?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Can this not be a defaulted method? transmute in general is "really scary" and we should avoid it at all times if we can, and if we can't I think that it should ideally fall into the category of "should be trivial to verify this is correct". In that sense this is not trivial as I would have to audit all implementations of GcRefImpl. If, however, this was located on every impl of GcRefImpl I could take a look at the type, see it's #[repr(transparent)], and conclude that it's correct.

also in the local impls I think it would be neat to do something like:

assert!(matches!(self, Self(GcRootIndex { ... }))) to trigger an error if the structure ever changes.

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Could these be omitted for now? Deferred instead to something like ref_eq?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Since this is on the wasm call hot path, could this be #[inline] with a "fast path" of "ok nothing changed" and then have a slow path for "ok gotta drain"

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Similar to above, could these become freestanding methods on Rooted?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

I commented on this elsewhere a bit, but this would be, IMO, a good location for something like:
store.debug_assert_gc_not_allowed();

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Sort of along the lines of my comment above, it's sort of a "bug", not practically but morally, that there's no AutoAssertNoGc marker here?

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

I commented below on this but I want to be sure to leave a comment on this as well before I forget. More-or-less I think we should try to remove this method, I'm not a huge fan of methods whose bodies are just transmute

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Also for an example of this there's Memory::data and data_mut for mutable versions.

Wasmtime GitHub notifications bot (Feb 28 2024 at 21:30):

alexcrichton created PR review comment:

Also, I was expecting to see changes to the codegen for manipulating this reference count, but I don't think that's in this PR. Is that something you'd like to defer to a future PR?

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:07):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:07):

fitzgen created PR review comment:

I think so, yeah. I just wrote this all out to help convince myself of the safety of what was going on here, felt like it would be useful for future readers.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:07):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:07):

fitzgen created PR review comment:

That's a good idea.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:14):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:14):

fitzgen created PR review comment:

This scope is sufficient to catch the particular instance of this larger bug that we actually hit, and which caused the writing of this comment. It was specifically that inserting the reference into the activations table (happens just below this comment) that would trigger a GC and the above bug. So this scope would catch that particular instance of the bug from happening again, but you are correct that we need to prevent GC in a larger scope than what this actually guards against. I don't think this is really so terrible, it really is the nature of debug assertions: we often can't assert the precise invariant we want to hold, but can assert something a little looser that is implied by that invariant. That said, in this case I think we can thread an AutoAssertNoGC through the into_abi method.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:43):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Feb 29 2024 at 13:43):

fitzgen created PR review comment:

Heh, and it turns out we had an AutoAssertNoGc for the larger scope already: didn't have to adjust callers when changing into_abi to take an AutoAssertNoGc instead of a store.

Wasmtime GitHub notifications bot (Feb 29 2024 at 14:02):

fitzgen created PR review comment:

Also, I was expecting to see changes to the codegen for manipulating this reference count, but I don't think that's in this PR. Is that something you'd like to defer to a future PR?

Nah, that was just an oversight.

Wasmtime GitHub notifications bot (Feb 29 2024 at 14:02):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Feb 29 2024 at 16:52):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 16:53):

fitzgen commented on PR #8011:

@alexcrichton I think I've addressed everything in your review.

Going to start rebasing on main, but I think you can take another look now.

Wasmtime GitHub notifications bot (Feb 29 2024 at 18:08):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 18:15):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 18:16):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 18:19):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 18:35):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Feb 29 2024 at 21:17):

fitzgen requested alexcrichton for a review on PR #8011.

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

I like this idea of using this as the type, nice! :+1:

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

The more I think about this the more I'm realizing that this is I think a pretty big footgun. The safety of VMExternRef relies on doing things on the "store thread" to ensure that the non-atomic updates are all ok. Other mutations like clone_from_gc_ref are unsafe but there's no way we can mark drop as unsafe here. That means that it's sort of pretty unsafe just to have a VMExternRef in your hand and there's not much we can do about that (e.g. easy to forget during review too).

This is going to go away in the future though, right? In the future the reference count here will go away and we'll rely on tracing instead to find this, so no Drop will be required?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Sorry I know I'm harping on this a lot, but I think we should avoid std::mem::transmute as much as possible if we can given that it's got so many footguns and I always forget the whole set of footguns. Some suggestions here:

Could this (and the one below) move to the VMExternRef type? That way it'll be easy to see that VMExternRef is a #[repr(transparent)]

Could this perhaps have a debug assert that the size/align are the same?

Instead of std::mem::transmute, could this do &*(self as *const _).cast()

I realize that it can be sort of six-to-one-half-dozen or another but I'm at least personally wary of using transmute unless we absolutely have to (e.g. the only way to go from usize to fn() is transmute, but pointers can be casted/etc)

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Should this perhaps call from_gc_ref below?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

s/from_raw/from_gc_ref/ I think?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Whiel you're here, mind removing as casts where possible and replacing them with .cast()?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

With the lack of type parameter here now mind throwing #[inline] on this and the above methods?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Could this an clone_from_gc_ref assert that vmexternref is the only implemented type?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Could this debug print be self.0.as_ptr() since &self changes enough that its pointer value probably isn't too interesting?

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Oh and now I also realize that these same concerns apply to Clone above, that seems like a footgun that it's safe since there's no easy way to audit callers are all happening in the same thread.

Wasmtime GitHub notifications bot (Mar 01 2024 at 16:44):

alexcrichton created PR review comment:

Is it actually even possible to safely call this method?

This only exist for the C API, and there's no rooting in the C API, so thinking about this it seems like we should remove this because it basically can't be used correctly

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:08):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:08):

fitzgen created PR review comment:

I don't think that would be helpful here since the difference between a pointer to a VMExternData and a VMExternRef is that the latter does refcounting operations, but it is an invariant that if we called this function, then the refcount is zero. So if we did from_gc_ref here, we would have to std::mem::forget the result so that the drop impl didn't try to decrement the refcount when we know it is already zero and we are deallocating the VMExternData it points to.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:09):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:09):

fitzgen created PR review comment:

This is a method on VMExternData, not VMExternRef, so &self is the interesting pointer value.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:43):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:43):

fitzgen created PR review comment:

This only exist for the C API

We use this to convert ValRaws back to ExternRefs, and the untyped function call path internally uses ValRaws, so we can't just remove this without affecting non-C-API things.

there's no rooting in the C API

There isn't explicit control of rooting, no, I haven't added C-API stuff for RootScope and Rooted vs ManuallyRooted. But we do root things in the C-API implementation nonetheless, for example a wasm_ref_t is a ManuallyRooted<Ref> under the hood and you could use this method safely in combination with that type.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:55):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:59):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 13:59):

fitzgen created PR review comment:

Yeah, it is definitely a footgun, and this isn't how I would design this from scratch. It is just the intermediate stepping stone on the way to the next PR, which introduces a proper GcHeap that is passed in to GC hooks/barriers which are proof of safe access to the heap and its objects (similar to what store does at the embedder API level).

I don't really know how to avoid this intermediate state without fusing this PR together with the next.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:23):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:23):

alexcrichton created PR review comment:

Ok if this is just a stepping stone I think that's ok. If you're ok with it though I think it'd be best at this point to land this PR tomorrow at the earliest since the 19.0.0 branch point will happen tonight and I think it would be best to not have this pair of PRs get straddled across a release

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:26):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:26):

alexcrichton created PR review comment:

Oh right yeah, but the untyped/unchecked path basically only exists for the C API. I'm definitely not saying that we should add a bunch of rooting-related things to the C API, but I'm wondering if we should basically forbid externref in the C API as a result of this (or at least in the ValRaw parts bound in the C API).

This seems like it might be too big of a footgun to expose in the C API where you can very easily get an unrooted value and there's no way to go from that unrooted value to a rooted value, so even if you want to "do the right thing" you can't

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:31):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:31):

fitzgen created PR review comment:

That makes sense to me.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:34):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:34):

fitzgen created PR review comment:

I think it does make sense to expose the rooting stuff in the C API eventually, it's just not a priority at the moment. Given that, I think we can forbid it in the C API in the meantime, yeah. How would you like to do this? Propagating errors or asserting? I think the former could be difficult/annoying because of the amount of places that are currently infallible but which would need to become fallible...

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:37):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 04 2024 at 16:37):

alexcrichton created PR review comment:

Perhaps removing the externref field from the header (and the tag for that enum)? And then using unreachable!() everywhere inside of the C bindings since they shouldn't be triggerable?

Wasmtime GitHub notifications bot (Mar 05 2024 at 16:02):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 16:03):

fitzgen requested alexcrichton for a review on PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 16:03):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Mar 05 2024 at 16:03):

fitzgen created PR review comment:

Done.

Wasmtime GitHub notifications bot (Mar 05 2024 at 17:31):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Mar 05 2024 at 17:42):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 17:43):

fitzgen has enabled auto merge for PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 23:00):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 23:04):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 05 2024 at 23:05):

fitzgen has enabled auto merge for PR #8011.

Wasmtime GitHub notifications bot (Mar 06 2024 at 00:19):

fitzgen requested wasmtime-default-reviewers for a review on PR #8011.

Wasmtime GitHub notifications bot (Mar 06 2024 at 00:19):

fitzgen updated PR #8011.

Wasmtime GitHub notifications bot (Mar 06 2024 at 00:19):

fitzgen has enabled auto merge for PR #8011.

Wasmtime GitHub notifications bot (Mar 06 2024 at 01:08):

fitzgen merged PR #8011.

Wasmtime GitHub notifications bot (Mar 08 2024 at 00:52):

alexcrichton commented on PR #8011:

Handling the fallout for this in the wasmtime-cpp repository I think that the requirement to take a wasmtime_context_t as an argument for value operations invalidates the preexisting copy constructor and copy assignment operator. Not an issue per se, but it appears that removing those also invalidates this pattern where arguments can't be constructed as {6, 27} any more. I'll admit I'm not C++ expert and the wall of error messages I'm looking at don't exactly explain why it's related to the copy constructor/assignment.

Anyway that's a long way of asking, do you think we're going to indefinitely want to have a context argument on these functios into the future, even with the idea of an indexed heap?

Wasmtime GitHub notifications bot (Mar 08 2024 at 14:15):

fitzgen commented on PR #8011:

Anyway that's a long way of asking, do you think we're going to indefinitely want to have a context argument on these functios into the future, even with the idea of an indexed heap?

Yes, unfortunately, I do. At least in Rust. I could see a C/C++ specific approach to alleviating this, however...

Indexed GC heaps are orthogonal to the rooting approach here.

In order to support moving GCs, we need to either:

Be able to precisely identify raw (whether pointer or index) GC roots and mutate them after an object moves.

Or support pinning GC references temporarily such that they do not move while rooted. This allows us to avoid the need to update GC roots after an object moves.

Option (2) is a pretty big constraint on GC implementations. We would have to really bend over backwards to support that in our planned copying collector.

So focusing on option (1), there are basically two approaches we can take:

Indirection. Store the raw GC references in some sort of table (aka our GcRoots) and have the user's root type index into that table (aka our Rooted<T>/ManuallyRooted<T>). This is nice, and works well with Rust and its move semantics, because we don't need to track down where all the Rooted<T>s and ManuallyRooted<T>s are in memory and we don't need to pin them or do any sort of updates on move so that we can update their internal GC reference after an object moves. Instead, the GC refs are all in the table and we can just do a pass over the table to update the GC refs, and all the Rooted<T>s index into the table and get the moved GC ref the next time they do some operation on their referent. The downside is that all constructing and dropping GC roots in this scheme requires access to the table, in the general case, as you've found.

Intrusive lists/some other intrusive data structure. This is the approach that SpiderMonkey takes, for example. All the user's roots hold raw GC references and are additionally members of an intrusive list that is associated with their GC heap. This allows the GC to walk the intrusive list and update all rooted GC references after moving objects. But this means that you need to pin the GC root type or have move constructors that keep the intrusive list valid as the GC root is moved in memory. This works well with C++ but very much not so well with Rust, which is why we went with the first option.

So, if we wanted to make the C/C++ API a little more ergonomic, we could support the intrusive list option. Ideally not in the Rust embedder API, but maybe with some doc(hidden) stuff that is feature gated for when we are building the C API crate. FWIW, this is also fairly unsafe not just in terms of the invariants around maintianing the intrusive list, but in that right now with the table/indirect approach we only ever use indices and we bounds check their accesses into the table so "use after free"-style bugs are still memory safe, which limits how bad things can go wrong. That is not the case with the intrusive list approach.

Wasmtime GitHub notifications bot (Mar 08 2024 at 14:16):

fitzgen edited a comment on PR #8011:

Anyway that's a long way of asking, do you think we're going to indefinitely want to have a context argument on these functios into the future, even with the idea of an indexed heap?

Yes, unfortunately, I do. At least in Rust. I could see a C/C++ specific approach to alleviating this, however...

Indexed GC heaps are orthogonal to the rooting approach here.

In order to support moving GCs, we need to either:

Be able to precisely identify raw (whether pointer or index) GC roots and mutate them after an object moves.

Or support pinning GC references temporarily such that they do not move while rooted. This allows us to avoid the need to update GC roots after an object moves.

Option (2) is a pretty big constraint on GC implementations. We would have to really bend over backwards to support that in our planned copying collector.

So focusing on option (1), there are basically two approaches available to us for implementing rooting APIs:

Indirection. Store the raw GC references in some sort of table (aka our GcRoots) and have the user's root type index into that table (aka our Rooted<T>/ManuallyRooted<T>). This is nice, and works well with Rust and its move semantics, because we don't need to track down where all the Rooted<T>s and ManuallyRooted<T>s are in memory and we don't need to pin them or do any sort of updates on move so that we can update their internal GC reference after an object moves. Instead, the GC refs are all in the table and we can just do a pass over the table to update the GC refs, and all the Rooted<T>s index into the table and get the moved GC ref the next time they do some operation on their referent. The downside is that all constructing and dropping GC roots in this scheme requires access to the table, in the general case, as you've found.

Intrusive lists/some other intrusive data structure. This is the approach that SpiderMonkey takes, for example. All the user's roots hold raw GC references and are additionally members of an intrusive list that is associated with their GC heap. This allows the GC to walk the intrusive list and update all rooted GC references after moving objects. But this means that you need to pin the GC root type or have move constructors that keep the intrusive list valid as the GC root is moved in memory. This works well with C++ but very much not so well with Rust, which is why we went with the first option.

So, if we wanted to make the C/C++ API a little more ergonomic, we could support the intrusive list option. Ideally not in the Rust embedder API, but maybe with some doc(hidden) stuff that is feature gated for when we are building the C API crate. FWIW, this is also fairly unsafe not just in terms of the invariants around maintianing the intrusive list, but in that right now with the table/indirect approach we only ever use indices and we bounds check their accesses into the table so "use after free"-style bugs are still memory safe, which limits how bad things can go wrong. That is not the case with the intrusive list approach.

Wasmtime GitHub notifications bot (Mar 08 2024 at 20:46):

alexcrichton commented on PR #8011:

Ok nah that all sounds good, no need to go the intrusive list route just yet, I think it's ok if things are slightly less ergonomic in the C++ bindings API unless a C++ wizard more adept than I can figure out a better solution

Wasmtime GitHub notifications bot (Mar 14 2024 at 18:52):

rockwotj commented on PR #8011:

Not an issue per se, but it appears that removing those also invalidates this pattern where arguments can't be constructed as {6, 27} any more

https://github.com/bytecodealliance/wasmtime-cpp/blob/da579e78f799aca0a472875b7e348f74b3a04145/include/wasmtime.hh#L2457C32-L2458

The function you're calling takes a std::vector and you're passing in an initializer_list, which copies things out of the vector: https://godbolt.org/z/h844YKxYq

You'll need to use a move iterator to force moving out of the container. One of the annoying bits about C++ iterators. You could use ranges/views to make this more seemless.

As for dropping the copy constructor/assignment I think that's fine, but it can be useful to have an explicit copy function that takes a context.

Wasmtime GitHub notifications bot (Mar 14 2024 at 20:52):

alexcrichton commented on PR #8011:

Makes sense! In https://github.com/bytecodealliance/wasmtime-cpp/pull/48 I ended up adding overloaded versions for std::initializer_list and std::vector with a "base" version that takes an iterator. That being said I don't know how to best generically take an iterator in C++

Wasmtime GitHub notifications bot (Mar 14 2024 at 21:09):

rockwotj commented on PR #8011:

Added a suggestion here: https://github.com/bytecodealliance/wasmtime-cpp/pull/48/files#r1525490495

Last updated: Nov 22 2024 at 16:03 UTC