alexcrichton opened PR #7929 from alexcrichton:remove-in-tree-wit
to bytecodealliance:main
:
This commit replaces the textual
*.wit
files in the Wasmtime tree with the binary form of a WIT package. Currently there's still duplication in thecrates/wasi/wit
andcrates/wasi-http/wit
folder which I think we can remove with some extra work but for now I wanted to at least take the step forward of using a binary distribution instead of textual WIT files.The new WIT structure now looks like:
wit/test.wit
- miscellaneous things Wasmtime uses for tests such as a proxy world and a command that has http. Eventually I'd like to remove this.
wit/deps/wasi-cli@0.2.0.wasm
- serialized version of thewasi-cli
proposal at the v0.2.0 tag. Created withwasm-tools component wit --wasm
wit/deps/wasi-http@0.2.0.wasm
- Same as thewasi-cli
proposal, but for HTTP instead.Supporting this new format for dependencies required updating the
wit-bindgen
dependency as well which ensures that everyone's agreeing onwit-parser
being up-to-date with this new feature support.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
alexcrichton requested pchickey for a review on PR #7929.
alexcrichton requested wasmtime-core-reviewers for a review on PR #7929.
alexcrichton requested wasmtime-default-reviewers for a review on PR #7929.
pchickey submitted PR review:
Awesome!
alexcrichton has enabled auto merge for PR #7929.
What is the advantage of this? It would make it harder to review changes (as git will refuse to show any diff) Also I believe distros generally require the original source for binary blobs to be included anyway and recompiled as part of the build process. https://www.debian.org/doc/debian-policy/ch-source.html#missing-sources-debian-missing-sources and https://docs.fedoraproject.org/en-US/packaging-guidelines/what-can-be-packaged/#prebuilt-binaries-or-libraries
pchickey commented on PR #7929:
Just thought of this - this will eliminate the rust docs generated from the wit comments on the generated code, right? Those are pretty useful and so losing them is a significant regression...
alexcrichton commented on PR #7929:
Good questions! The main benefit of this to me is that it more heavily discourages local modifications to WIT files and reaffirms that the source of truth of the WIT files is upstream in the WASI proposals themeslves, not here in Wasmtime (as has sort of historically ad-hoc-wise been the case but we're trying to shift now). In the limit with https://github.com/bytecodealliance/wit-bindgen/pull/837 I hope to have
crates/wasi/wasi-cli@0.2.0.wasm
andcrates/wasi-http/wasi-http@0.2.0.wasm
as the only artifacts in-tree and nothing else, too, to further reduce the need to have CI checks that the two directories are the same.You're right, however, that reviewing diffs won't be easy and if sources require non-binary artifacts won't work. One possible solution to that would be to check in the wasm text format instead of the wasm binary format as that would still pretty heavily discourage modifications but it would be more easily diff-able and wouldn't be a binary. Still a bit of a "generated wad of text", however, so I'm not sure how distros would feel about that. On the diff side it's also a property that any modifications will almost surely require changes on the Wasmtime side of things, but those are definitely more difficult to review than a change in the WIT itself. I think in general I'm not sure how best we can manage depending on the WASI spec here at this time, but this felt better to me than copying all the WITs in all the places. We're also somewhat "setting the trend" as well where if other projects or embeddings want to reference WASI this is an example of how they don't have to copy WITs but can instead copy artifacts at this time (ideally a package-manager-style-solution would work but that isn't fully mature yet AFAIK).
As for documentation, that should actually be solved where the WASI proposals have a custom section with documentation inside of it so when
wit-parser
parses everything it has enough info to attach the documentation back to the original item.
alexcrichton updated PR #7929.
alexcrichton edited PR #7929:
This commit replaces the textual
*.wit
files in the Wasmtime tree with
the binary form of a WIT package. The new WIT structure now looks like:
crates/wasi/wasi-cli@0.2.0.wasm
- serialized version of thewasi-cli
proposal at the v0.2.0 tag. Created withwasm-tools component wit --wasm
crates/wasi-http/wasi-http@0.2.0.wasm
- same as the above, but for
wasi-http
.
crates/test-programs/wit
- WIT files used for tests such as a custom
reactor.
crates/test-programs/wit/deps/*.wasm
- WIT bindings to generate
tests with. This is a duplicate of thewasi-{cli,http}@0.2.0.wasm
files from above but additionally enables testing multiple versions in
the future.The end state of this PR is that whenever WASI APIs are updated we'll
need a tagged release to pull into Wasmtime to officially support.
alexcrichton commented on PR #7929:
I've pushed up an update which is now what I think the longer-term vision for WITs-in-Wasmtime should look like. CI will fail until I publish https://github.com/bytecodealliance/wit-bindgen/pull/838 but the idea is that theres a single wasm for
wasmtime-wasi
and a single wasm forwasmtime-wasi-http
, and then we have "stuff" for tests.(this doesn't move the needle on the binary-vs-not problem, just wanted to point out I made an orthogonal update to that)
alexcrichton updated PR #7929.
pchickey commented on PR #7929:
OK. I'm ok with the docs regression as long as we work on it soon :)
alexcrichton commented on PR #7929:
Oh to clarify, the docs problem I believe is solved. Docs are encoded in the wasm binaries in a custom section, so the only issue I think with this is the binary-vs-not problem
alexcrichton updated PR #7929.
alexcrichton updated PR #7929.
alexcrichton commented on PR #7929:
I've added this to the next Wasmtime meeting agenda in case a resolution is not met before then.
pchickey submitted PR review:
Sorry, this fell off my queue. This change looks good to me, are we waiting to resolve the concerns @bjorn3 brought up?
alexcrichton commented on PR #7929:
Yeah, @bjorn3 do you have thoughts on
*.wasm
vs*.wat
perhaps? Or thoughts on the problem this is solving of discouraging edits and avoiding duplication?
If you want to discourage edits, one option would be to store hashes of them separately and check that they match in a build script. That way you have to modify the source and then separately recompute the hash (or patch out the hash check, but you can leave a big comment near the hash check about why it exists). Or maybe you could gzip the wat files? That would also still allow extracting the original sources, but would still look like a binary blob if you open it in a text editor without extracting them first.
cfallin commented on PR #7929:
had some thoughts I was saving for the Wasmtime agenda item you had scheduled on Feb 29 (are we still discussing it then?) but to add some here:
My main thought here is discoverability and beginner-friendliness: wits seem sort of like header files in the sense that it can be useful to refer to them as specifications. If they exist locally only in binary form, that benefit is mostly lost; it can still be dumped via a tool, but it's not as immediately apparent how to do that to a newcomer. That seems more important to me than the "control" aspects you mentioned (the latter which can still be enforced other ways), though admittedly that's a value judgment.
Solving the "single source of truth" problem by including only a derived artifact feels a little artificial to me: it's still a copy, data that came from elsewhere. One way I've seen this managed elsewhere (e.g. in large company monorepos) is to have a
third_party/
directory with exact copies of external source, and optionally scripts to do updates, and maybe CI rules or at least social conventions to prevent checkin of anything but updates from upstream.
cfallin edited a comment on PR #7929:
I had some thoughts I was saving for the Wasmtime agenda item you had scheduled on Feb 29 (are we still discussing it then?) but to add some here:
My main thought here is discoverability and beginner-friendliness: wits seem sort of like header files in the sense that it can be useful to refer to them as specifications. If they exist locally only in binary form, that benefit is mostly lost; it can still be dumped via a tool, but it's not as immediately apparent how to do that to a newcomer. That seems more important to me than the "control" aspects you mentioned (the latter which can still be enforced other ways), though admittedly that's a value judgment.
Solving the "single source of truth" problem by including only a derived artifact feels a little artificial to me: it's still a copy, data that came from elsewhere. One way I've seen this managed elsewhere (e.g. in large company monorepos) is to have a
third_party/
directory with exact copies of external source, and optionally scripts to do updates, and maybe CI rules or at least social conventions to prevent checkin of anything but updates from upstream.
alexcrichton commented on PR #7929:
Ah I put this on the agenda for Feb 29 to make sure it's not forgotten about, but I figure that if it's merged sooner I can take it off the agenda.
I'm not sure how best to thread this needle myself. I don't know how to balance the conerns y'all have with how these WIT files are developed. For example I understand distros don't like binary blobs but a packager can always check out a versioned tag of a WASI proposal/repo and generate the exact same wasm binary. We could check in
*.wat
or text files but at least personally as a reviewer I'd always skip the files anyway since any changes are going to have larger changes on the bindings/implementation which is what I'd be interested in. I agree that the contents aren't discoverable here in this repo when there's a binary form but to me that's not necessarily a bad thing because this isn't where these WIT files are developed, that's intended for the upstream repos themselves.To me I feel that the binary distribution of WIT packages is a perfect use case here for integrating with downstream repos using WIT packages. Ideally we wouldn't even check in binary files and we'd simply have
Wit.toml
which would say[dependencies] 'wasi:cli' = "0.2.0"
and nothing else, meaning literally nothing would be checked in to this repository. In that sense the binary files are to me another step towards this future.As I think more about this one other thing I like about the binary solution is that it makes it much easier to write down and execute instructions about updating these files. Currently it's copying a bunch of WIT files but you have to get that from multiple upstream WASI repositories and make sure you copy them all to the right places with the right names. You also need to make sure all the repos are checked out at the right version. As a reviewer I'd have to theoretically redo all that work to confirm it's all matching. With a binary file there's a single file in a single location under GitHub Releases for a single repo to go pick out. As a reviewer I could verify it's the expected file and move on from there.
Well, in any case, I'm happy to discuss further on this PR, but I'm also ok deferring to next week's meeting too.
alexcrichton closed without merge PR #7929.
alexcrichton commented on PR #7929:
Discussion from last week's Wasmtime meeting concluded that this isn't the way we'd like to go, and instead pursing https://github.com/WebAssembly/component-model/issues/313 to have a single
*.wit
file would be better. Additionally we'll want a script to run to update the proposals from a particular tag to ensure no local changes.
Last updated: Nov 22 2024 at 16:03 UTC