wasmtime / PR #8595 Wasmtime(gc): Add support for superty... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #8595 Wasmtime(gc): Add support for superty...

Wasmtime GitHub notifications bot (May 10 2024 at 23:13):

fitzgen opened PR #8595 from fitzgen:supertypes-and-finality to bytecodealliance:main:

This is the final type system change for Wasm GC: the ability to explicitly declare supertypes and finality. A final type may not be a supertype of another type. A concrete heap type matches another concrete heap type if its concrete type is a subtype (potentially transitively) of the other heap type's concrete type.

Next, I'll begin support for allocating GC structs and arrays at runtime.

I've also implemented O(1) subtype checking in the types registry:

In a type system with single inheritance, the subtyping relationships between all types form a set of trees. The root of each tree is a type that has no supertype; each node's immediate children are the types that directly subtype that node.

For example, consider these types:
class Base {}
class A subtypes Base {}
class B subtypes Base {}
class C subtypes A {}
class D subtypes A {}
class E subtypes C {}
These types produce the following tree:
           Base
          /    \
         A      B
        / \
       C   D
      /
     E
Note the following properties:

If sub is a subtype of sup (either directly or transitively) then sup must be on the path from sub up to the root of sub's tree.

Additionally, sup must be the ith node down from the root in that path, where i is the length of the path from sup to its tree's root.

Therefore, if we maintain a vector containing the path to the root for each type, then we can simply check if sup is at index supertypes(sup).len() within supertypes(sub).

Wasmtime GitHub notifications bot (May 10 2024 at 23:13):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #8595.

Wasmtime GitHub notifications bot (May 10 2024 at 23:13):

fitzgen requested cfallin for a review on PR #8595.

Wasmtime GitHub notifications bot (May 10 2024 at 23:13):

fitzgen requested alexcrichton for a review on PR #8595.

Wasmtime GitHub notifications bot (May 10 2024 at 23:13):

fitzgen requested wasmtime-core-reviewers for a review on PR #8595.

Wasmtime GitHub notifications bot (May 10 2024 at 23:17):

fitzgen updated PR #8595.

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton submitted PR review:

Nice :+1:

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton submitted PR review:

Nice :+1:

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

Aha this clears up some historical confusion on my part! This is the missing link in the "break the recursion" thinking for me.

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

Could the above method delegate to this new method to start off? (looks like they both start with all the same code)

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

IIRC you had a scheme where a lock wasn't required from within wasm, right? (outside of wasm seems totally fine, just wanted to confirm for in-wasm)

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

Also, for the arms modified here, should the matches method be removed? Or is it used elsewhere?

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

Could this always do = None regardless of whether there's an array there or not?

Wasmtime GitHub notifications bot (May 11 2024 at 00:55):

alexcrichton created PR review comment:

Interpreted literally I think these docs are backwards since true means it can't be a supertype of anything else. Mind rewording the other way around?

Wasmtime GitHub notifications bot (May 11 2024 at 02:44):

github-actions[bot] commented on PR #8595:

Subscribe to Label Action

cc @saulecabrera

<details>
This issue or pull request has been labeled: "wasmtime:api", "winch"

Thus the following users have been cc'd because of the following labels:

saulecabrera: winch

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (May 12 2024 at 01:30):

jeffparsons commented on PR #8595:

Therefore, if we maintain a vector containing the path to the root for each type, then we can simply check if sup is at index supertypes(sup).len() within supertypes(sub).

This means that the storage for the vecs in each tree is, in total, in O(N log N), right?

I think there's a cheaper O(N) storage per tree way to do it, that still offers O(1) subtype checking:

Store nodes of the tree in a vec where all descendents of a node occur after that node and before any non-descendent node.

Each node stores the index of its last descendent. If it has none, then the index will be its own index.

To test whether sub is a subtype of sup:
- Look up trees and indexes of both types.
- If they live in different trees, then there is no subtype relationship.
- Otherwise, look up sup's node. If sup_index < sub_index && sub_index <= sup_last_descendent then we have a subtype.

There's a good chance I've misunderstood something (e.g. how the trees are updated?), but if there's nothing else special about these trees then maybe this optimisation is applicable! :pray:

Wasmtime GitHub notifications bot (May 12 2024 at 10:36):

jeffparsons edited a comment on PR #8595:

Therefore, if we maintain a vector containing the path to the root for each type, then we can simply check if sup is at index supertypes(sup).len() within supertypes(sub).

This means that the storage for the vecs in each tree is, in total, in O(N log N), right?

I think there's a cheaper O(N) storage per tree way to do it, that still offers O(1) subtype checking:

Store nodes of the tree in a vec where all descendents of a node occur after that node and before any non-descendent node.

Each node stores the index of its last descendent. If it has none, then the index will be its own index.

To test whether sub is a subtype of sup:
- Look up trees and indexes of both types.
- If they live in different trees, then there is no subtype relationship.
- Otherwise, look up sup's node. If sup_index < sub_index && sub_index <= sup_last_descendent then we have a subtype.

EDIT: And I guess if the types are already stored in some other structure(s), there's actually no need for a separate vector for this information; the primary representation of the type can just have those extra fields for: tree ID, index in tree, max descendent index.

There's a good chance I've misunderstood something (e.g. how the trees are updated?), but if there's nothing else special about these trees then maybe this optimisation is applicable! :pray:

Wasmtime GitHub notifications bot (May 13 2024 at 17:22):

fitzgen commented on PR #8595:

@jeffparsons, thanks for digging in here! Always appreciate a second pair of eyes.

This means that the storage for the vecs in each tree is, in total, in O(N log N), right?

It is indeed a time-space tradeoff, but as-implemented it can actually be even worse than O(n log n): consider a chain where each type is a subtype of its predecessor: a; b <: a; c <: b; d <: c; .... The ith type's supertype array will be of length i - 1, and this forms the classic "triangle" shape that indicates O(n * n) behavior. This particular case could be resolved (although the implementation would be a bit awkward in the presence of type canonicalization across modules) by defining the supertype arrays in sub-to-supertype order and then using a subslice of a subtype's supertype's vector, when one is present, instead of allocating a new vector. Then I think we would be at O(n log n).

However, implementations are allowed to define their own limits on the maximum depth a subtyping chain may have. And for JS embeddings, it is additionally tightened from "some implementation limit" to 63; for maximum cross-engine compatibility, we've always matched the JS embedding limits.

https://webassembly.github.io/gc/js-api/index.html#limits

Anyways, the point is that this limit provides bounds on the total memory overhead associated with these supertype vectors, preventing worst-case scenarios.

Finally, my understanding is that it is pretty much expected that engines implement subtyping checks with this technique, FWIW. For example, here is SpiderMonkey's docs about doing this same thing: https://searchfox.org/mozilla-central/rev/ee2ad260c25310a9fbf96031de05bbc0e94394cc/js/src/wasm/WasmTypeDef.h#480-541

This is a good reminder to double check that we are actually enforcing the subtyping depth limits in validation, though :-p

I think there's a cheaper O(N) storage per tree way to do it, that still offers O(1) subtype checking:

If I understand correctly, this is storing the entry and exit indices of a depth-first traversal of the tree and then subtype checking checks that (a) we are looking at the same tree and (b) that the subtype's index (doesn't matter if we check entry vs exit index) is within the supertype's range, correct?

This would indeed work very nicely in the context of a single Wasm module and a static set of types.

But, as you alluded to, things get problematic when we consider updates. GC objects can be passed between modules, and a non-final type from module A can be subtyped by another type in module B and then an instance of that subtype can be passed from B to A, and A might want to do subtyping checks on this instance even though it never defined the exact type of that instance. This necessitates canonicalizing types across modules. So, even though from a single Wasm module's perspective the type set is static, from an engine perspective we are constantly dynamically adding and removing types from the same deduplicated, canonicalized types registry as modules are loaded or unloaded (or when the host defines/drops types via the public Wasmtime API). And therefore our type hierarchies and their trees aren't static, and must support having subtrees grafted on and pruned away at any moment. That means we would need to recompute potentially all those depth-first traversal indices on every mutation, an O(n) operation for each new type definition, which is already too expensive on its own, but also leads to O(n * n) behavior when defining n new types.

So I don't think this approach is something we can use in this case, unfortunately.

Wasmtime GitHub notifications bot (May 13 2024 at 17:23):

fitzgen edited a comment on PR #8595:

@jeffparsons, thanks for digging in here! Always appreciate a second pair of eyes.

This means that the storage for the vecs in each tree is, in total, in O(N log N), right?

It is indeed a time-space tradeoff, but as-implemented it can actually be even worse than O(n log n): consider a chain where each type is a subtype of its predecessor: a; b <: a; c <: b; d <: c; .... The ith type's supertype array will be of length i - 1, and this forms the classic "triangle" shape that indicates O(n * n) behavior. This particular case could be resolved (although the implementation would be a bit awkward in the presence of type canonicalization across modules) by defining the supertype arrays in sub-to-supertype order and then using a subslice of a subtype's supertype's vector, when one is present, instead of allocating a new vector. Then I think we would be at O(n log n) in general again.

However, implementations are allowed to define their own limits on the maximum depth a subtyping chain may have. And for JS embeddings, it is additionally tightened from "some implementation limit" to 63; for maximum cross-engine compatibility, we've always matched the JS embedding limits.

https://webassembly.github.io/gc/js-api/index.html#limits

Anyways, the point is that this limit provides bounds on the total memory overhead associated with these supertype vectors, preventing worst-case scenarios.

Finally, my understanding is that it is pretty much expected that engines implement subtyping checks with this technique, FWIW. For example, here is SpiderMonkey's docs about doing this same thing: https://searchfox.org/mozilla-central/rev/ee2ad260c25310a9fbf96031de05bbc0e94394cc/js/src/wasm/WasmTypeDef.h#480-541

This is a good reminder to double check that we are actually enforcing the subtyping depth limits in validation, though :-p

I think there's a cheaper O(N) storage per tree way to do it, that still offers O(1) subtype checking:

If I understand correctly, this is storing the entry and exit indices of a depth-first traversal of the tree and then subtype checking checks that (a) we are looking at the same tree and (b) that the subtype's index (doesn't matter if we check entry vs exit index) is within the supertype's range, correct?

This would indeed work very nicely in the context of a single Wasm module and a static set of types.

But, as you alluded to, things get problematic when we consider updates. GC objects can be passed between modules, and a non-final type from module A can be subtyped by another type in module B and then an instance of that subtype can be passed from B to A, and A might want to do subtyping checks on this instance even though it never defined the exact type of that instance. This necessitates canonicalizing types across modules. So, even though from a single Wasm module's perspective the type set is static, from an engine perspective we are constantly dynamically adding and removing types from the same deduplicated, canonicalized types registry as modules are loaded or unloaded (or when the host defines/drops types via the public Wasmtime API). And therefore our type hierarchies and their trees aren't static, and must support having subtrees grafted on and pruned away at any moment. That means we would need to recompute potentially all those depth-first traversal indices on every mutation, an O(n) operation for each new type definition, which is already too expensive on its own, but also leads to O(n * n) behavior when defining n new types.

So I don't think this approach is something we can use in this case, unfortunately.

Wasmtime GitHub notifications bot (May 13 2024 at 20:19):

fitzgen updated PR #8595.

Wasmtime GitHub notifications bot (May 13 2024 at 20:27):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:27):

fitzgen created PR review comment:

This doesn't quite work out because while both arms of the above method start with producing an EngineOrModuleIndex, and that happens in the same way as this method, the arms continue by reusing intermediate steps from how the EngineOrModuleIndex was produced in the code that follows after the index has been constructed. This makes it subtly difficult to pull apart or reuse.

Wasmtime GitHub notifications bot (May 13 2024 at 20:30):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:30):

fitzgen created PR review comment:

Unfortunately always doing self.types_to_supertypes = None would force the secondary map to allocate even when we aren't ever using Wasm GC. I could remove the is_some_and(..) suffix tho; I'll do that.

Wasmtime GitHub notifications bot (May 13 2024 at 20:37):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:37):

alexcrichton created PR review comment:

Ah ok makes sense, mind leaving a comment that the purpose here is to avoid touching the map entirely?

Wasmtime GitHub notifications bot (May 13 2024 at 20:41):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:41):

fitzgen created PR review comment:

For the common case of checking subtyping of instances of types defined in this module, yes.

I don't have a plan for how to avoid locks in the terrible case of new external subtypes created dynamically:

create module A with non-final type T

instantiate module A

create module B

instantiate module B with type U <: T

instantiate module B

instantiate object of type U

pass object to instance A for subtyping check

(Note that module B could also be the host using the public API)

My thinking is that we can just fall back to locking the type registry in the cross-module-subtyping case. I think there are some more tricks we can play with InstancePres to make cross-module-subtyping fast for all types that exist at InstancePre creation time, but for new subtypes created during runtime we will always need to fall back to locking (or equivalent) AFAICT.

Wasmtime GitHub notifications bot (May 13 2024 at 20:43):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:43):

fitzgen created PR review comment:

They are used elsewhere (e.g. inside the with_finality_and_supertype constructors) and are also generally useful as part of the public API.

Wasmtime GitHub notifications bot (May 13 2024 at 20:44):

fitzgen edited PR review comment.

Wasmtime GitHub notifications bot (May 13 2024 at 20:46):

fitzgen updated PR #8595.

Wasmtime GitHub notifications bot (May 13 2024 at 20:48):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (May 13 2024 at 20:48):

alexcrichton created PR review comment:

Ok makes sense, I mostly wanted to confirm there was a means of intra-module checks that didn't require locking.

Wasmtime GitHub notifications bot (May 13 2024 at 21:30):

fitzgen merged PR #8595.

Last updated: Apr 18 2025 at 14:03 UTC