wasmtime / issue #5288 Function references · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #5288 Function references

Wasmtime GitHub notifications bot (Nov 17 2022 at 04:49):

github-actions[bot] commented on issue #5288:

Subscribe to Label Action

cc @peterhuene

<details>
This issue or pull request has been labeled: "cranelift", "cranelift:wasm", "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (Nov 17 2022 at 04:50):

github-actions[bot] commented on issue #5288:

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

[ ] If you added a new Config method, you wrote extensive documentation for
it.

<details>

Our documentation should be of the following form:

```text
Short, simple summary sentence.

More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.

Is this method fallible? If so, when can it return an error?

Can this method panic? If so, when does it panic?

Example

Optional example here.
```

</details>

[ ] If you added a new Config method, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.

<details>

For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.

Often, all that is required of you is to ensure that there is a knob for this
configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one
of its nested structs).

Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

</details>

[ ] If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the
<code>.github/label-messager.json</code> configuration file.

Learn more.

</details>

Wasmtime GitHub notifications bot (Nov 17 2022 at 09:35):

bjorn3 commented on issue #5288:

call_ref is simply translated as a cranelift indirect_call. A faster cranelift indirect call that does not check the callee signature can (and should) eventually be added and used. Further, no null check is elided.

call_indirect doesn't check the function signature, nor does it check for null pointers.

Wasmtime GitHub notifications bot (Feb 23 2023 at 14:24):

dhil commented on issue #5288:

I'm interested in resuming work on this patch and dedicating the necessary time to bring it into an acceptable state. I have a patch which merges this PR with the recent changes to main (c.f. effect-handlers/wasmtime#13). I will merge that patch into this PR later today.

I think the current status is as follows:

Compatible with latest wasm-tools

All new function reference instructions have been implemented (save for return_call_ref) (tested on x86_64).

10 test case failures:
failures:
    wast::Cranelift::spec::function_references::br_table
    wast::Cranelift::spec::function_references::br_table_pooling
    wast::Cranelift::spec::function_references::linking
    wast::Cranelift::spec::function_references::linking_pooling
    wast::Cranelift::spec::function_references::return_call_ref
    wast::Cranelift::spec::function_references::return_call_ref_pooling
    wast::Cranelift::spec::function_references::table
    wast::Cranelift::spec::function_references::table_pooling
    wast::Cranelift::spec::function_references::type_equivalence
    wast::Cranelift::spec::function_references::type_equivalence_pooling

test result: FAILED. 872 passed; 10 failed; 10 ignored; 0 measured; 0 filtered out; finished in 77.91s
Here return_call_ref is expected to fail as we have not implemented the tail calls proposal. I haven't worked out why linking and type_equivalence fail yet. The test br_table fails because of how the wast crate of wasm-tools elaborates single element tables
(module
  (type $t (func))
  (func $tf)
  (table $t (ref null $t) (elem $tf)))
The table test fails due to the incompleteness of the type reconstruction procedure wasmtime::values::Val::ty(). The procedure has to infer the nullability of an ExternRef or FuncRef by looking at their respective shape, which is impossible.

I will probably need some help with this latter issue as it seems to require a slight redesign/refactor to keep source value type information around, at least in the wast crate of wasmtime. I will probably also need some help with diagnosing the remaining failure cases. I am also positive that there are several code quality improvements that need to be implemented in order to make this PR satisfactory.

Wasmtime GitHub notifications bot (Apr 13 2023 at 18:22):

abrown commented on issue #5288:

This review got auto-assigned to me but I think @fitzgen and @alexcrichton have the right context for the review here.

Wasmtime GitHub notifications bot (Apr 19 2023 at 13:34):

dhil commented on issue #5288:

@alexcrichton @fitzgen I have reverted the changes to the wasmtime public API. The end result is not too bad; there are a couple of more test failures now. I haven't looked into the failures in detail yet.

Wasmtime GitHub notifications bot (Apr 19 2023 at 13:38):

dhil edited a comment on issue #5288:

@alexcrichton @fitzgen I have reverted the changes to the wasmtime public API. The end result is not too bad; there are a couple of more test failures now. I haven't looked into the failures in detail yet.

If you want I can reopen this PR from my personal fork such that you can get write permissions. I think GitHub do not grant write permissions on PRs originating from an organisation account.

Wasmtime GitHub notifications bot (Apr 19 2023 at 17:22):

alexcrichton commented on issue #5288:

Some things I'm seeing:

I sent https://github.com/effect-handlers/wasmtime/pull/19 to surface the failures here on this PR

One test failure, wast::Cranelift::spec::function_references::linking, relates to this function which needs to be souped up for typed function references. Note though that this function will need to grow &ModuleTypes arguments to perform the subtyping check between two different modules (this is basically the same code that's in wasmparser, just at link-time instead of validation-time).

wast::Cranelift::spec::function_references::table looks like it may be a bug in table initialization?

wast::Cranelift::spec::function_references::br_table is failing due to https://github.com/bytecodealliance/wasm-tools/pull/952 I think

wast::Cranelift::spec::function_references::type_equivalence looks like it may be some sort of code generator bug? Unsure.

There are additionally a number of tests as you've found which all panic due to the one you inserted to avoid changes to the public API. Those tests will need to be skipped as-is from the upstream testsuite, but they can be copied to to tests/misc_testsuite with updates to run in Wasmtime.

Wasmtime GitHub notifications bot (Apr 20 2023 at 15:27):

dhil commented on issue #5288:

Some things I'm seeing:

* I sent [Get more CI passing effect-handlers/wasmtime#19](https://github.com/effect-handlers/wasmtime/pull/19) to surface the failures here on this PR

* One test failure, `wast::Cranelift::spec::function_references::linking`, relates to [this function](https://github.com/bytecodealliance/wasmtime/blob/ef7af28ef095277f5979dcbc6ff03964c1e0ba12/crates/wasmtime/src/types/matching.rs#L183-L193) which needs to be souped up for typed function references. Note though that this function will need to grow `&ModuleTypes` arguments to perform the subtyping check between two different modules (this is basically the same code that's in `wasmparser`, just at link-time instead of validation-time).

* `wast::Cranelift::spec::function_references::table` looks like it may be a bug in table initialization?

* `wast::Cranelift::spec::function_references::br_table` is failing due to [Support for Wast table initialisation syntactic sugar with typeful references wasm-tools#952](https://github.com/bytecodealliance/wasm-tools/pull/952) I think

* `wast::Cranelift::spec::function_references::type_equivalence` looks like it may be some sort of code generator bug? Unsure.

There are additionally a number of tests as you've found which all panic due to the one you inserted to avoid changes to the public API. Those tests will need to be skipped as-is from the upstream testsuite, but they can be copied to to tests/misc_testsuite with updates to run in Wasmtime.

Excellent, thank you! I am down to two failures now type_equivalence and table.

Wasmtime GitHub notifications bot (Apr 20 2023 at 15:28):

dhil edited a comment on issue #5288:

Excellent, thank you! I am down to two failures now type_equivalence and table.

There are additionally a number of tests as you've found which all panic due to the one you inserted to avoid changes to the public API. Those tests will need to be skipped as-is from the upstream testsuite, but they can be copied to to tests/misc_testsuite with updates to run in Wasmtime.

I will move some of the content of the ignored tests to the misc testsuite too.

Wasmtime GitHub notifications bot (May 03 2023 at 10:01):

dhil commented on issue #5288:

I have been trying to track down the remaining two failures. In the table.wast file, the problem is that tables aren't properly initialised. Looking at the code it indeed seems like table initialisers get ignored altogether (https://github.com/bytecodealliance/wasmtime/blob/main/crates/environ/src/module_environ.rs#L303-L307), which would explain why table entries always seem to be initialised to null (c.f. https://github.com/bytecodealliance/wasmtime/blob/main/crates/runtime/src/instance.rs#L872-L874). From reading the code, I understand that great care is being taken to ensure lazy initialisation of tables, and thus, I wonder what semantics you want for tables with initialising expressions? @alexcrichton @fitzgen

Wasmtime GitHub notifications bot (May 03 2023 at 18:27):

alexcrichton commented on issue #5288:

Looking at the code it indeed seems like table initialisers get ignored altogether

Indeed! Table intializers were added with the function-references proposal so that's why they're "ignored". When we updated wasmparser we probably should have made it an error if a non-null initializer was indicated.

I think what you'll want to do is to update TableInitialization to store the default initializer and then update try_func_table_init to not use FuncTable if there's a non-null initializer. We can probably lazy-initialize with initializers but that's fine to to in a future PR. It's probably best to get the base set of functionality working first.

Wasmtime GitHub notifications bot (May 16 2023 at 09:41):

dhil commented on issue #5288:

Commit 5de9a246fc0c8a0b9ae7a4a60b2ce74cceba86ce implements a fix for the table initialisation problem, though, it is ended up a bit more roundabout than I initially anticipated. If I understand correctly, TableInitialization is global per module, so to avoid penalising pre-function-references code I introduced a new table initialisation strategy EagerFuncTable, which eagerly initialises all table fields. I am not sure whether this aligns with what you had in mind.

As for the final failing test type-equivalence, it fails, seemingly, because the type test for call_indirect is nominal rather than structural (c.f. https://github.com/bytecodealliance/wasmtime/blob/6dc3eb717d7cf9b3539557ea7b393dd51225336d/crates/cranelift/src/func_environ.rs#L1662C1-L1665). Here is a minimal example to reproduce:
(module
  (type $s0 (func (param i32)))
  (type $s1 (func (param i32 (ref $s0))))
  (type $t1 (func (param (ref $s1))))

  (func $s1 (type $s1))
  (table funcref (elem $s1))

  (func (export "run")
    (call_indirect (type $t1) (ref.func $s1) (i32.const 0))
  )
)
(assert_return (invoke "run"))
I am not quite sure what the best course of action is here. I would have expected type/type signature canonicalisation to solve this problem.

Wasmtime GitHub notifications bot (May 16 2023 at 15:39):

alexcrichton commented on issue #5288:

which eagerly initialises all table fields. I am not sure whether this aligns with what you had in mind.

Seems reasonable to me! One adjustment I might recommend though is to reuse the existing TableInitialization::Segments initializer. That's the "eager" mode which is optimized, if possible, to TableInitialization::FuncTable which is lazy initialization. During that optimization you can add a check that if anything is not null-initialized then the optimization can be skipped. I think this'll all have the same effect as what you've implemented but avoids EagerTableInitializer and Segments being sort of duplicates of each other.

I'll look more into the call_indirect issue now.

Wasmtime GitHub notifications bot (May 16 2023 at 15:49):

alexcrichton commented on issue #5288:

Ok so I can definitely imagine that canonicalization has a bug in it and something isn't working as expected. My guess is that something is hashing a module-local u32 when it actually needs to hash something module-independent, or something like that.

That being said I'm not sure that your reduction showcases the issue? You're invoking, indirectly, a function that takes two parameters with only one parameter. In that sense I think a runtime error is expected there?

Wasmtime GitHub notifications bot (May 17 2023 at 13:00):

dhil commented on issue #5288:

Ok so I can definitely imagine that canonicalization has a bug in it and something isn't working as expected. My guess is that something is hashing a module-local u32 when it actually needs to hash something module-independent, or something like that.

That being said I'm not sure that your reduction showcases the issue? You're invoking, indirectly, a function that takes two parameters with only one parameter. In that sense I think a runtime error is expected there?

Ah yes, sorry. I deleted a little bit too much. Here is a (valid) repro:
(module
  (type $s0 (func (param i32)))
  (type $s1 (func (param i32 (ref $s0))))
  (type $s2 (func (param i32 (ref $s0))))
  (type $t1 (func (param (ref $s1))))
  (type $t2 (func (param (ref $s2))))

  (func $s1 (type $s1))
  (func $f1 (type $t1))
  (func $f2 (type $t2))
  (table funcref (elem $f1 $f2 $s1))

  (func (export "run")
    (call_indirect (type $t1) (ref.func $s1) (i32.const 1))
  )
)
It is extracted from type-equivalence.wast, the ;; Indirect types section.

Wasmtime GitHub notifications bot (May 17 2023 at 13:00):

dhil edited a comment on issue #5288:

Ok so I can definitely imagine that canonicalization has a bug in it and something isn't working as expected. My guess is that something is hashing a module-local u32 when it actually needs to hash something module-independent, or something like that.

That being said I'm not sure that your reduction showcases the issue? You're invoking, indirectly, a function that takes two parameters with only one parameter. In that sense I think a runtime error is expected there?

Ah yes, sorry. I deleted a little bit too much. Here is a (valid) repro:
(module
  (type $s0 (func (param i32)))
  (type $s1 (func (param i32 (ref $s0))))
  (type $s2 (func (param i32 (ref $s0))))
  (type $t1 (func (param (ref $s1))))
  (type $t2 (func (param (ref $s2))))

  (func $s1 (type $s1))
  (func $f1 (type $t1))
  (func $f2 (type $t2))
  (table funcref (elem $f1 $f2 $s1))

  (func (export "run")
    (call_indirect (type $t1) (ref.func $s1) (i32.const 1))
  )
)
(assert_return (invoke "run"))
It is extracted from type-equivalence.wast, the ;; Indirect types section.

Wasmtime GitHub notifications bot (May 17 2023 at 13:01):

dhil edited a comment on issue #5288:

Ok so I can definitely imagine that canonicalization has a bug in it and something isn't working as expected. My guess is that something is hashing a module-local u32 when it actually needs to hash something module-independent, or something like that.

That being said I'm not sure that your reduction showcases the issue? You're invoking, indirectly, a function that takes two parameters with only one parameter. In that sense I think a runtime error is expected there?

Ah yes, sorry. I deleted a little bit too much. Here is a (valid) repro:
(module
  (type $s0 (func (param i32)))
  (type $s1 (func (param i32 (ref $s0))))
  (type $s2 (func (param i32 (ref $s0))))
  (type $t1 (func (param (ref $s1))))
  (type $t2 (func (param (ref $s2))))

  (func $s1 (type $s1))
  (func $f2 (type $t2))
  (table funcref (elem $f2 $s1))

  (func (export "run")
    (call_indirect (type $t1) (ref.func $s1) (i32.const 0))
  )
)
(assert_return (invoke "run"))
It is extracted from type-equivalence.wast, the ;; Indirect types section.

Wasmtime GitHub notifications bot (May 17 2023 at 13:02):

dhil commented on issue #5288:

which eagerly initialises all table fields. I am not sure whether this aligns with what you had in mind.

Seems reasonable to me! One adjustment I might recommend though is to reuse the existing TableInitialization::Segments initializer. That's the "eager" mode which is optimized, if possible, to TableInitialization::FuncTable which is lazy initialization. During that optimization you can add a check that if anything is not null-initialized then the optimization can be skipped. I think this'll all have the same effect as what you've implemented but avoids EagerTableInitializer and Segments being sort of duplicates of each other.

I'll look more into the call_indirect issue now.

I have attempted to clean this up in the latest commit. Is this closer to what you had in mind?

Wasmtime GitHub notifications bot (May 17 2023 at 14:18):

alexcrichton commented on issue #5288:

Ok so I think the issue is in two locations:

Here in the engine-level signature registry, which is a hash map of types across modules registered in an engine.

Here where there's a module-level map of types-to-indexes which should be deduplicating the above types but isn't.

This bug is due to derive(Hash) on WasmFuncType which delegates all the way through to WasmHeapType which hashes the raw index. I think the best solution would be to remove the auto-derived trait impls from WasmType about equality and hashing, and then work backwards from there. The raw WasmHeapType::Index should probably take a SignatureIndex to assist in deduplicating within a module. That should solve the second bullet above.

The first bullet above, however, is much more difficult. That'll need to perform a "deep hash" of the entire structural type since it must be module-independent (and SignatureIndex is module-local). If the second bullet is enough to get tests passing for this PR though then that seems ok to defer to an issue and get this landed instead.

(to clarify, at this point I think it's fine to land this PR so long as it passes CI, it's been long enough and I'm sure y'all are tired of rebasing. It's ok to defer work so long as there's issues on file and the work is gated, which it all is)

Wasmtime GitHub notifications bot (May 19 2023 at 10:49):

dhil commented on issue #5288:

Thanks! I see the culprit now. Just to clarify, when you say

The raw WasmHeapType::Index should probably take a SignatureIndex to assist in deduplicating within a module.

Do you mean that WasmHeapType should be parameterised by a SignatureIndex rather than u32? I can see how that would make the interning easier, though, how does this affect conversion to/from wasmparser types?

Wasmtime GitHub notifications bot (May 19 2023 at 15:06):

alexcrichton commented on issue #5288:

No need to worry about conversion into a wasmparser type, that in theory shouldn't happen (or at least not that I know of). For conversion from a wasmparser type it may mean that a From impl needs to be removed in favor of a method on ModuleTypes or something like that where the indexing/hashing context is available.

So thinking more on this, sort of what this wants is WasmType<T> where a Module store WasmType<SignatureIndex> but an Engine stores a WasmType<VMSharedSignatureIndex>. I think then the derive(Hash) should continue to work with PartialEq as well?

Wasmtime GitHub notifications bot (May 22 2023 at 11:14):

dhil commented on issue #5288:

I am OK with trying this approach, but it is a sizeable refactoring I think as WasmType<T> implies a type bound on Table, Global, and EntityType (at least). I suppose for each of those the same logic applies: at the module level we want T = SignatureIndex and at the engine level T = VMSharedSignatureIndex (if relevant).

The From trait will need to be removed, or at least supported by an alternative to support conversion from typed references.

Should I give this refactoring a go?

Wasmtime GitHub notifications bot (May 22 2023 at 12:25):

dhil edited a comment on issue #5288:

I am OK with trying this approach, but it is a sizeable refactoring I think as WasmType<T> implies a type bound on Table, Global, and EntityType (I suppose if they are only module-local then they can instantiate T = SignatureIndex internally).

The From trait will need to be removed, or at least supported by an alternative to support conversion from typed references.

Should I give this refactoring a go?

Wasmtime GitHub notifications bot (May 22 2023 at 15:49):

CosineP commented on issue #5288:

This needs to happen for typed continuations support as well. Currently we assume everything is a function, but if we do this refactor I think we can look up in the types table whether a ref is a function or a continuation, which we need to know in a number of places in Wasmtime (at least in the embedding API; we managed to fudge the table layout stuff). But as a result I can say firsthand it is absolutely a hefty undertaking which is why we resorted to our current hack. I may still have a local branch with part of this refactoring underway.

Wasmtime GitHub notifications bot (May 22 2023 at 18:44):

alexcrichton commented on issue #5288:

If you're ok with it, I think it's ok to set the test to ignore and land this. It's true that this refactoring will require a bit of finesse, but it's one I'm happy to take on myself rather than having y'all do yet-more work on top of what you've already got here.

Otherwise I'd be happy to merge this with a green CI run (which will require ignoring tests) and a list of items to tackle tracked in an issue (such as fixing the failing tests, this possible refactoring, the embedder API, etc)

Wasmtime GitHub notifications bot (May 23 2023 at 12:10):

dhil commented on issue #5288:

I have added the test type-equivalence.wast to the ignore set. How should we go about documenting the issues? Should I open each issue as a separate issue here on GitHub?

Wasmtime GitHub notifications bot (May 23 2023 at 13:33):

alexcrichton commented on issue #5288:

I think one tracking issue should be good to start off with and it can branch from there as necessary. I'll work on giving this a final pass today. Thanks again for y'all's work here!

Wasmtime GitHub notifications bot (May 24 2023 at 14:41):

alexcrichton commented on issue #5288:

@dhil to answer this question (over here for a bit more visibility) my thinking is that at the wasm AST layer there's only "this index" as a type but at the Wasmtime layer we'll be able to be more principled than that by having, for example:
enum WasmHeapType {
    Func,
    Extern,
    TypedFunc(SignatureIndex),
    Array,
    TypedArray(StorageType),
    Struct,
    TypedStruct(StructIndex),
    // ..
}
or something along the lines of that where WasmHeapType stores effectively the discriminant of what it refers to rather than requiring the indexing operation to then verify the type of the result.

One of the things I was worried about was that there was the pervasive assumption that WasmHeapType::Index(_) was a function which won't be true once the GC proposal was implemented. I wanted to head off issues there by having the Wasmtime-level name be called TypedFunc so it's clear that all the cases handling Func and TypedFunc are only there for handling function-related things. In the future when more variants are added it'll require adding more match arms to handle in all these places.

Wasmtime GitHub notifications bot (May 25 2023 at 11:41):

dhil commented on issue #5288:

I've opened a issue to track the unresolved bits of this PR. See #6455. I hopefully I managed to cover them all.

Wasmtime GitHub notifications bot (May 25 2023 at 19:47):

fitzgen commented on issue #5288:

Seems like there are some conflicts and this needs a rebase/merge.

Wasmtime GitHub notifications bot (May 26 2023 at 08:23):

dhil commented on issue #5288:

Seems like there are some conflicts and this needs a rebase/merge.

I have synchronised with upstream again. Hopefully everything ticks green again :-)

Wasmtime GitHub notifications bot (May 26 2023 at 16:07):

alexcrichton commented on issue #5288:

:confetti:

Last updated: Apr 16 2025 at 16:03 UTC