alexcrichton commented on issue #4039:
Note that procedurally this is based on https://github.com/bytecodealliance/wasmtime/pull/4005 so the first few commits aren't relevant to this PR itself. Additionally as with #4005 there are no tests, and this one definitely can't land without tests.
github-actions[bot] commented on issue #4039:
Subscribe to Label Action
cc @peterhuene
<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:
- peterhuene: wasmtime:api
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
github-actions[bot] commented on issue #4039:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:
[ ] If you added a new `Config` method, you wrote extensive documentation for it.
<details>
Our documentation should be of the following form:
```text
Short, simple summary sentence.

More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.

Is this method fallible? If so, when can it return an error?

Can this method panic? If so, when does it panic?

Example

Optional example here.
```
</details>
[ ] If you added a new `Config` method, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.
<details>
For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.

Often, all that is required of you is to ensure that there is a knob for this
configuration option in [`wasmtime_fuzzing::Config`][fuzzing-config] (or one
of its nested `struct`s).

Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See [our docs on fuzzing][fuzzing-docs] for more details.
</details>
[ ] If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html
<details>
To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.
To add new label messages or remove existing label messages, edit the
<code>.github/label-messager.json</code> configuration file.</details>
alexcrichton commented on issue #4039:
> What if there were two ways to go from Value to Cursor?

The reason I haven't gone with something like this is that the idiomatic thing to do is then the slow thing, which I've been trying to avoid. Existence of a `Value<T>` doesn't lock memory to a particular state or prevent mutations, so validation of a `Value<T>` must always be re-done.

It's also less-so validation and more just reading all the memory. Even if we were to somehow do ahead-of-time validation it means that when you read something like `list<my-enum>` that does all the decoding a second time which is also something we should be shooting to avoid for performant cases. Basically I think it's required that we interleave validation and decoding for performance, but to be strictly standards compliant we probably need to figure out how to force those two to happen given any `Value<T>`.
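As a rough illustration of what interleaving validation and decoding could look like for something like `list<my-enum>`, here is a minimal sketch. Everything in it is hypothetical and not taken from Wasmtime: `MyEnum`, `DecodeError`, and `decode_list` are made-up names, and it assumes the list is stored as one discriminant byte per element in a plain byte slice standing in for linear memory.

```rust
// Hypothetical sketch: none of these names come from Wasmtime's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MyEnum {
    A,
    B,
    C,
}

#[derive(Debug, PartialEq)]
enum DecodeError {
    OutOfBounds,
    InvalidDiscriminant(u8),
}

/// Reads `len` one-byte discriminants starting at `ptr` in `memory`.
/// Each byte is validated at the moment it is decoded, so memory is
/// walked once and there is no separate ahead-of-time validation pass.
fn decode_list(memory: &[u8], ptr: usize, len: usize) -> Result<Vec<MyEnum>, DecodeError> {
    // Bounds check up front; the per-element checks happen while decoding.
    let end = ptr.checked_add(len).ok_or(DecodeError::OutOfBounds)?;
    let bytes = memory.get(ptr..end).ok_or(DecodeError::OutOfBounds)?;
    bytes
        .iter()
        .map(|&b| match b {
            0 => Ok(MyEnum::A),
            1 => Ok(MyEnum::B),
            2 => Ok(MyEnum::C),
            other => Err(DecodeError::InvalidDiscriminant(other)),
        })
        .collect()
}
```

Because nothing is cached between calls, a later mutation of `memory` is simply observed by the next `decode_list` call, which is the reason a separate validation step would have to be re-done anyway.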
alexcrichton commented on issue #4039:
Still needs tests, so nothing new here, but I would like to confirm the subtyping story with others because that may or may not invalidate everything in this PR.
alexcrichton commented on issue #4039:
Ok I have written a mess of tests for this which indeed found some issues in the previous code. I've also opened https://github.com/bytecodealliance/wasmtime/issues/4185 to track various items to make sure I don't forget them. Otherwise I think this is probably good to land now.
Currently I have not added support to `*.wast` for actually invoking component functions. I could do that (inventing some syntax for this along the way) along with statically listing "these are the signatures which can be called" or something like that. I opted to instead use custom embedding tests, but eventually I think it will be more useful to get `*.wast` support as well (but it's always limited in the sense that the callable signatures must be statically listed internally). If @fitzgen you or others have thoughts on how to improve the tests here I'm happy to change things up as well.
alexcrichton commented on issue #4039:
Ah let me write my thoughts on subtyping here as well.
I realized relatively late in the design process for this feature that we did not factor in subtyping when crossing between the host and wasm. The major problem here is that the API designed here is where the host statically asserts the signature of a wasm component function and then attempts to call it with that signature. Given subtyping, though, one might suspect that subtyping relationships would be respected in this typecheck. Implementation-wise, this is not supported.
For example if a wasm function export takes zero parameters, then the host could declare that it in fact takes 2 parameters. According to the subtyping rules these type signatures are compatible and this function call should be executed. If the parameters were strings though the host would copy in results to the module when calling it when it shouldn't do that.
A naive fix for this issue would be to always lower values in the context of a type. For example the `ComponentValue` trait would have a type added to the `lower` method where a value is lowered into a particular type (or stored as one). This means though that all layers of lowering are constantly performing typechecks to see if subtyping conversions are necessary. It's roughly predicted that the cost of this will be prohibitive and as such we wouldn't want to do that.

A much more advanced fix would be to generate trampolines. Compilation of a component would now also involve specifying the interface that an embedder would be using (e.g. the signature it expects for exports and the signatures for the imports). Appropriate trampolines would be generated and the host would call the trampolines which would "do the right thing". The problem with this strategy is that the host loses all flexibility of the layout of host data. Instead now everything has to be in a format that the trampoline understands. Additionally throwing in complexity of things like host destructors makes this much more complex as well.
Overall after talking with Luke the current thinking is to do this:
- Do not implement subtyping in host imports/exports. Everything has to have an exact signature match.
- If subtyping is necessary, the component is wrapped in another component which uses the right signatures.
The (hopeful) idea is that hosts can detect mismatches in type relationships. Hosts ideally have `*.wit` files or similar describing imports and exports, and that can be used to preprocess a component from a consumer to either verify that the types all line up or whether a second component is needed to perform the subtyping relationship. This component-to-component communication would then handle subtyping via the Cranelift-generated trampolines because type representation is all fixed with the canonical ABI and it's much easier to generate a trampoline.
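To make the "exact signature match" rule concrete, here is a small sketch of what such a check could look like. It is only an illustration under assumptions: `Type`, `FuncType`, and `typecheck_exact` are hypothetical names, not Wasmtime's actual types or API, and the type grammar is cut down to three cases.

```rust
// Hypothetical sketch: a simplified stand-in for component function types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type {
    U32,
    String,
    List(Box<Type>),
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct FuncType {
    params: Vec<Type>,
    results: Vec<Type>,
}

/// Accepts only structurally identical signatures. No subtyping is
/// applied, so a host signature with extra parameters is rejected even
/// where subtyping rules would consider the two compatible.
fn typecheck_exact(host: &FuncType, component: &FuncType) -> Result<(), String> {
    if host == component {
        Ok(())
    } else {
        Err(format!(
            "signature mismatch: host expects {:?} but component has {:?}",
            host, component
        ))
    }
}
```

With this rule, the earlier example of a zero-parameter export being called by a host that declares two string parameters fails the check up front, rather than having the host copy strings into the component.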
fitzgen commented on issue #4039:
> Overall after talking with Luke the current thinking is to do this:
>
> * Do not implement subtyping in host imports/exports. Everything has to have an exact signature match.
> * If subtyping is necessary, the component is wrapped in another component which uses the right signatures.
>
> The (hopeful) idea is that hosts can detect mismatches in type relationships. Hosts ideally have `*.wit` files or similar describing imports and exports, and that can be used to preprocess a component from a consumer to either verify that the types all line up or whether a second component is needed to perform the subtyping relationship. This component-to-component communication would then handle subtyping via the Cranelift-generated trampolines because type representation is all fixed with the canonical ABI and it's much easier to generate a trampoline.

This seems like a very reasonable path for us to go down.
Last updated: Jan 24 2025 at 00:11 UTC