Stream: git-wasmtime

Topic: wasmtime / issue #6486 update and reorganize WIT, fix `ca...


view this post on Zulip Wasmtime GitHub notifications bot (May 31 2023 at 19:59):

pchickey commented on issue #6486:

Thanks, I'll look at this in detail this afternoon.

To answer the first design question, yes lets please use the wit-deps binary in CI just like we did in p2-p. Please make a shell script inside ci/ (lets say ./ci/wit-deps-check.sh) which invokes the wit-deps CLI & runs git diff in the correct directories, and call that script from a CI step that is just like the one we added to build the component adapter.

It would also be nice if the shell script had (e.g. ./ci/wit-deps-check.sh update) would modify the working tree to what should be committed for the check to pass in CI, and print that out with the error in CI so folks who don't know this tool don't stumble over it.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2023 at 21:41):

alexcrichton commented on issue #6486:

fix cargo vendor use cases

Theoretically the "verify-publish" step in CI should be checking this by running cargo package on all component crates and then building them from the result. That should verify that all crates build as they'll appear on crates.io, but given how things are probably broken currently to motivate the change in this PR my guess is that this isn't working as expected.

Do you have an example crate/coment with cargo vendor that's not working? If so I'd like to poke around at it and see if the "verify-publish" step can be improved to verify this continues to work into the future.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2023 at 21:59):

pchickey commented on issue #6486:

The diff here looks good. Roman and I had a discussion on zulip for how to add the CI check for this.

The component-adapter and test-programs crates would not have worked with cargo-vendor before this PR. I don't know which of those was a problem for vendoring into wasmcloud, but both of those are excluded from our verify-publish step.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 08:06):

rvolosatovs commented on issue #6486:

The diff here looks good. Roman and I had a discussion on zulip for how to add the CI check for this.

The component-adapter and test-programs crates would not have worked with cargo-vendor before this PR. I don't know which of those was a problem for vendoring into wasmcloud, but both of those are excluded from our verify-publish step.

For the record, we're vendoring the component adapter as a bindep https://github.com/wasmCloud/wasmCloud/tree/8544ef854e1695381def129c68c341a5fd9940e3/tests/wasi-adapter, if you're interested, that's how it's built: https://github.com/wasmCloud/wasmCloud/blob/8544ef854e1695381def129c68c341a5fd9940e3/tests/actors/build.rs#L116-L134 along with other Wasm as part of tests (so just cargo test from root is sufficient to run the whole test suite when using nightly, while the crate itself can be depended upon by stable Rust)

The trick is that our reproducible build process happens in a sandbox with no access to non-deterministic I/O, e.g. no networking without the matching content hash specified. It looks something like this:

  1. All dependencies are fetched from either a package registry (e.g. crates.io) or git repositories using hashes from Cargo.lock (which lets us fetch the actual contents, since we know the content hashes upfront)
  2. With all dependencies fetched a "vendor tree" is build, like cargo vendor would do (e.g. for published packages: https://github.com/ipetkov/crane/blob/35110cccf28823320f4fd697fcafcb5038683982/lib/downloadCargoPackage.nix#L19-L23 and https://github.com/ipetkov/crane/blob/35110cccf28823320f4fd697fcafcb5038683982/lib/vendorCargoRegistries.nix#L45-L103, there's similar logic for git dependencies, which we rely upon to vendor this repository)
  3. Later the actual cargo build step happens "offline", using the vendor directory constructed in (2.), just like it would work using plain cargo vendor

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 18:53):

alexcrichton commented on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?


Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:13):

rvolosatovs commented on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) do not do anything.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:17):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) do not do anything.

Update:

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:17):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) do not do anything.

Update:

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:19):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) do not do anything.

Update:

Update 2:
Here's the "local path" case implementation https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/crates/wit-deps/src/manifest.rs#L193-L212

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:22):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) do not do anything.

Update:

Update 2:
Here's the "local path" case implementation https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/crates/wit-deps/src/manifest.rs#L193-L212

Update 3:
You could try running wasmtime-wit-deps with RUST_LOG=trace to see what happens on disk

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:25):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) does not do anything.

Update:

Update 2:
Here's the "local path" case implementation https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/crates/wit-deps/src/manifest.rs#L193-L212

Update 3:
You could try running wasmtime-wit-deps with RUST_LOG=trace to see what happens on disk

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:27):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit, that's also why order matters in PATHS (crates/wasi/wit is a dependency of the other two crates)

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) does not do anything.

Update:

Update 2:
Here's the "local path" case implementation https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/crates/wit-deps/src/manifest.rs#L193-L212

Update 3:
You could try running wasmtime-wit-deps with RUST_LOG=trace to see what happens on disk

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:31):

rvolosatovs edited a comment on issue #6486:

I think it would simplify things for Wasmtime quite a bit if there could only be one set of WIT files managed here, rather than three. A large risk with multiple sets of WIT files that are all duplicates is that they can get out of sync, and there's no protection against that right now. In theory they should all be the exact same and ideally use something like symlinks but that doesn't all work out for windows/vendoring reasons.

There's only one set of WIT files managed "from external sources" - in crates/wasi/wit, both component adapter and reactor-tests inherit from whatever is used within crates/wasi/wit, that's also why order matters in PATHS (crates/wasi/wit is a dependency of the other two crates)

Would it be possible to update your vendoring process of the component adapter to manually copy around the WIT files or similarly? Or alternatively could the vendored artifact be the Wasmtime repository since that's what the component adapter expects to be built within the context of?

That's exactly how it works in this PR, e.g. https://github.com/rvolosatovs/wasmtime/blob/b22c571b911c1bbdd2d9356d72de18606d3e08ce/crates/test-programs/reactor-tests/wit/deps.toml#L1 this just means copy all WIT definitions from this path along with all transitive dependencies

Also, as an orthogonal point, right now deps.toml points to main.tar.gz archives but I think that main will need to be replaced with a commit or otherwise re-vendoring the WIT files will change over time as the branches get updated.

That's not the case, wit-deps is lazy by default, so for any dependency X, for which Y = hash(X), if deps.lock contains Y as the hash of X, wit-deps lock (and corresponding wit_deps::lock_path) does not do anything.

Update:

Update 2:
Here's the "local path" case implementation https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/crates/wit-deps/src/manifest.rs#L193-L212

Update 3:
You could try running wasmtime-wit-deps with RUST_LOG=trace to see what happens on disk

view this post on Zulip Wasmtime GitHub notifications bot (Jun 01 2023 at 19:57):

alexcrichton commented on issue #6486:

Ah ok this PR has changed radically since I last looked at it, I apologize for not bringing myself back up to speed before leaving a comment. I'll also admit that I have yet to investigate, learn, and understand wit-deps in depth. I was previously naively assuming that it worked the same way as Cargo, or that it was intended to work similarly to Cargo. There's a few things I want to comment about though:

Orthogonally from my above comments, why are crates/wasi/wit/{cli,clocks,filesystem} added in this PR? Are they an aritfact of an intermediate state which weren't removed? I don't believe they're read by wit-parser and otherwise they aren't sharing bits with the copies under deps/{cli,clocks,filesystem}.

Additionally the world command, command-extended, proxy, and reactor I see are added as top-level "local" files in crates/wasi/wit/*.wit. Are these not present in any other upstream WASI repository? If not could they be added to the wasi:cli and wasi:http repositories respectively and referred-to from there?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 02 2023 at 06:10):

rvolosatovs commented on issue #6486:

How is this any different from cargo path dependencies as documented here: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-path-dependencies ? Sure I wish copying WIT files would not be necessary, but the reason it behaves this way is the fact that wit-bindgen requires all dependencies to be located in deps directory, which I would rather call a questionable design decision of wit-bindgen.
Usage of path dependencies (in wit-deps, but similarly in cargo) is generally discouraged and the use case is mostly just development, quickly replacing a dependency by a local, modified, copy.
Ideally, all WIT packages should specify their "real" dependencies with an external source. (e.g. the standard repositories) I'd be happy to do that, but 2 things are blocking that:

  1. Wasmtime does not use upstream WIT interfaces (filesystem, most importantly - clocks and cli are "transitively broken" due to different filesystem package used)
  2. There's no standard repository containing the command, reactor from what I can see

That's not true, for any given gzipped tarball URL X: hash(untar(fetch(X))) will give you the exact same hash as I described in https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1572635483, see also that comment for hash definition. The reasoning here is simple, X could be a URL to a zip file or whatever else, still the hash in the lock would be the same and the integrity/equality of the contents could be verified. The lock file provides enough data precisely to be able to reproduce previous operation.

Note again, that soon it should not to use tar, but rather it will be able to consume dependencies as Wasm https://github.com/bytecodealliance/wit-deps/issues/25, at which point the Wasm is also what'd be hashed - but that again depends on wit-bindgen being able to actually consume dependencies as Wasm and not require raw WIT.

Note, that this tool was built originally as a temporary solution to address unmaintainable mess of copy-pasting WIT files around from external sources. Soon, downstream users of WIT interfaces will not need to actually use WIT directly, so it'll only be WIT interface developers, who will need some kind of wit management tool.

I understand that wit-deps could probably be smart enough to not update things if not requested, but that to me seems like a subtle anti-pattern in the design of wit-deps where it's not as robust as it otherwise could be in terms of vendoring dependencies. This is why I dont' think main.tar.gz is used because in my mind the vendored state should be reproducible at any time by anyone.

It's true that in the absence of an entry in local cache, main branch references and similar cannot be reproduced if they change. The only way around that would be implementing git support directly in wit-deps to record the rev that main pointed to at the time. I went with the quickest and simplest solution for MVP, just so that, again, downstream users of WIT packages, standard or not, could reasonably consume them without resorting to adding multiple submodules/subtrees/copy-pasting. The intended use case is indeed to pin to exact commits like so https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/tests/build/wit/deps.toml#L1-L2, in which case all of this is fully reproducible. I've only added the update functionality and examples with "dynamic" branch references after the first usage here https://github.com/WebAssembly/wasi-http/pull/23, when I realized there's a need downstream to be able to "simply" pull in changes.

Note that wit-bindgen, again, cannot consume WIT standard repositories as-is, because of the wit directory within those repositories, which means that simply adding a submodule to a repository in wit/deps/pkg is not possible. Again, I'd say that's a wit-bindgen design drawback, which wit-deps, again, addresses to eliminate this headache from downstream developers consuming real WIT packages.

The new workspace is only there due to cargo vet auditing, please see OP, I'm happy to drop https://github.com/bytecodealliance/wasmtime/commit/b22c571b911c1bbdd2d9356d72de18606d3e08ce if someone could certify all the dependencies used. The reason for a new tool is, in fact, simplification of the update and consistency checking process, originally @pchickey's request https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1570860614 - rather than having to manually copy-paste WIT around as is done today (and prone to mistakes, note how a bunch of WIT files were missed by the update in https://github.com/bytecodealliance/wasmtime/pull/6390) there are just 1 tool with 2 subcommands with no arguments, which can automate all of this and do "the right thing", as well as check consistency in CI.

It's nice to provide a working dev environment out-of-the-box. Requiring developers of Wasmtime to manually install development dependencies, to me, is an anti-pattern, that's why a Rust crate instead is very useful - it works cross-platform and developers do not need to install yet another binary executable.

Orthogonally from my above comments, why are crates/wasi/wit/{cli,clocks,filesystem} added in this PR? Are they an aritfact of an intermediate state which weren't removed? I don't believe they're read by wit-parser and otherwise they aren't sharing bits with the copies under deps/{cli,clocks,filesystem}.

Assuming that every WIT dependency must have an external source (e.g. a standard repo), deps/{cli,clocks,filesystem} are these sources, which should eventually be removed in favor of using upstream (see the TODO comment). Regarding different contents I cannot reproduce:

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum $dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc0e88e8d630d7fd1bc0d29f7e13  clocks/wall-clock.wit
ce4e225baf21a915435f807a0bb47bc34f43227c0424156cab7c6c536a5d1755  filesystem/filesystem.wit

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum deps/$dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  deps/cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  deps/cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  deps/cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  deps/cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  deps/clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  deps/clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc0e88e8d630d7fd1bc0d29f
[message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (Jun 02 2023 at 06:12):

rvolosatovs edited a comment on issue #6486:

How is this any different from cargo path dependencies as documented here: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-path-dependencies ? Sure I wish copying WIT files would not be necessary, but the reason it behaves this way is the fact that wit-bindgen requires all dependencies to be located in deps directory, which I would rather call a questionable design decision of wit-bindgen.
Usage of path dependencies (in wit-deps, but similarly in cargo) is generally discouraged and the use case is mostly just development, quickly replacing a dependency by a local, modified, copy.
Ideally, all WIT packages should specify their "real" dependencies with an external source. (e.g. the standard repositories) I'd be happy to do that, but 2 things are blocking that:

  1. Wasmtime does not use upstream WIT interfaces (filesystem, most importantly - clocks and cli are "transitively broken" due to different filesystem package used)
  2. There's no standard repository containing the command, reactor from what I can see

That's not true, for any given gzipped tarball URL X: hash(untar(fetch(X))) will give you the exact same hash as I described in https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1572635483, see also that comment for hash definition. The reasoning here is simple, X could be a URL to a zip file or whatever else, still the hash in the lock would be the same and the integrity/equality of the contents could be verified. The lock file provides enough data precisely to be able to reproduce previous operation.

Soon it should not have to use tar, but rather it will be able to consume dependencies as Wasm https://github.com/bytecodealliance/wit-deps/issues/25, at which point the Wasm is also what'd be hashed - but that again depends on wit-bindgen being able to actually consume dependencies as Wasm and not require raw WIT.

Note, that this tool was built originally as a temporary solution to address unmaintainable mess of copy-pasting WIT files around from external sources. Soon, downstream users of WIT interfaces will not need to actually use WIT directly, so it'll only be WIT interface developers, who will need some kind of wit management tool.

I understand that wit-deps could probably be smart enough to not update things if not requested, but that to me seems like a subtle anti-pattern in the design of wit-deps where it's not as robust as it otherwise could be in terms of vendoring dependencies. This is why I dont' think main.tar.gz is used because in my mind the vendored state should be reproducible at any time by anyone.

It's true that in the absence of an entry in local cache, main branch references and similar cannot be reproduced if they change. The only way around that would be implementing git support directly in wit-deps to record the rev that main pointed to at the time. I went with the quickest and simplest solution for MVP, just so that, again, downstream users of WIT packages, standard or not, could reasonably consume them without resorting to adding multiple submodules/subtrees/copy-pasting. The intended use case is indeed to pin to exact commits like so https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/tests/build/wit/deps.toml#L1-L2, in which case all of this is fully reproducible. I've only added the update functionality and examples with "dynamic" branch references after the first usage here https://github.com/WebAssembly/wasi-http/pull/23, when I realized there's a need downstream to be able to "simply" pull in changes.

Note that wit-bindgen, again, cannot consume WIT standard repositories as-is, because of the wit directory within those repositories, which means that simply adding a submodule to a repository in wit/deps/pkg is not possible. Again, I'd say that's a wit-bindgen design drawback, which wit-deps, again, addresses to eliminate this headache from downstream developers consuming real WIT packages.

The new workspace is only there due to cargo vet auditing, please see OP, I'm happy to drop https://github.com/bytecodealliance/wasmtime/commit/b22c571b911c1bbdd2d9356d72de18606d3e08ce if someone could certify all the dependencies used. The reason for a new tool is, in fact, simplification of the update and consistency checking process, originally @pchickey's request https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1570860614 - rather than having to manually copy-paste WIT around as is done today (and prone to mistakes, note how a bunch of WIT files were missed by the update in https://github.com/bytecodealliance/wasmtime/pull/6390) there are just 1 tool with 2 subcommands with no arguments, which can automate all of this and do "the right thing", as well as check consistency in CI.

It's nice to provide a working dev environment out-of-the-box. Requiring developers of Wasmtime to manually install development dependencies, to me, is an anti-pattern, that's why a Rust crate instead is very useful - it works cross-platform and developers do not need to install yet another binary executable.

Orthogonally from my above comments, why are crates/wasi/wit/{cli,clocks,filesystem} added in this PR? Are they an aritfact of an intermediate state which weren't removed? I don't believe they're read by wit-parser and otherwise they aren't sharing bits with the copies under deps/{cli,clocks,filesystem}.

Assuming that every WIT dependency must have an external source (e.g. a standard repo), deps/{cli,clocks,filesystem} are these sources, which should eventually be removed in favor of using upstream (see the TODO comment). Regarding different contents I cannot reproduce:

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum $dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc0e88e8d630d7fd1bc0d29f7e13  clocks/wall-clock.wit
ce4e225baf21a915435f807a0bb47bc34f43227c0424156cab7c6c536a5d1755  filesystem/filesystem.wit

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum deps/$dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  deps/cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  deps/cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  deps/cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  deps/cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  deps/clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  deps/clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc0e88e8d630d7fd1bc0d29f7e13
[message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (Jun 02 2023 at 06:14):

rvolosatovs edited a comment on issue #6486:

How is this any different from cargo path dependencies as documented here: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-path-dependencies ? Sure I wish copying WIT files would not be necessary, but the reason it behaves this way is the fact that wit-bindgen requires all dependencies to be located in deps directory, which I would rather call a questionable design decision of wit-bindgen.
Usage of path dependencies (in wit-deps, but similarly in cargo) is generally discouraged and the use case is mostly just development, quickly replacing a dependency by a local, modified, copy.
Ideally, all WIT packages should specify their "real" dependencies with an external source. (e.g. the standard repositories) I'd be happy to do that, but 2 things are blocking that:

  1. Wasmtime does not use upstream WIT interfaces (filesystem, most importantly - clocks and cli are "transitively broken" due to different filesystem package used)
  2. There's no standard repository containing the command, reactor from what I can see

That's not true, for any given gzipped tarball URL X: hash(untar(fetch(X))) will give you the exact same hash as I described in https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1572635483, see also that comment for hash definition. The reasoning here is simple, X could be a URL to a zip file or whatever else, still the hash in the lock would be the same and the integrity/equality of the contents could be verified. The lock file provides enough data precisely to be able to reproduce previous operation.

Soon it should not have to use tar, but rather it will be able to consume dependencies as Wasm https://github.com/bytecodealliance/wit-deps/issues/25, at which point the Wasm is also what'd be hashed - but that again depends on wit-bindgen being able to actually consume dependencies as Wasm and not require raw WIT.

Note, that this tool was built originally as a temporary solution to address unmaintainable mess of copy-pasting WIT files around from external sources. Soon, downstream users of WIT interfaces will not need to actually use WIT directly, so it'll only be WIT interface developers, who will need some kind of wit management tool.

I understand that wit-deps could probably be smart enough to not update things if not requested, but that to me seems like a subtle anti-pattern in the design of wit-deps where it's not as robust as it otherwise could be in terms of vendoring dependencies. This is why I dont' think main.tar.gz is used because in my mind the vendored state should be reproducible at any time by anyone.

It's true that in the absence of an entry in local cache, main branch references and similar cannot be reproduced if they change. The only way around that would be implementing git support directly in wit-deps to record the rev that main pointed to at the time. I went with the quickest and simplest solution for MVP, just so that, again, downstream users of WIT packages, standard or not, could reasonably consume them without resorting to adding multiple submodules/subtrees/copy-pasting. The intended use case is indeed to pin to exact commits like so https://github.com/bytecodealliance/wit-deps/blob/b4314067818a45413fc6c366983f19a80b2626fa/tests/build/wit/deps.toml#L1-L2, in which case all of this is fully reproducible. I've only added the update functionality and examples with "dynamic" branch references after the first usage here https://github.com/WebAssembly/wasi-http/pull/23, when I realized there's a need downstream to be able to "simply" pull in changes.

Note that wit-bindgen, again, cannot consume WIT standard repositories as-is, because of the wit directory within those repositories, which means that simply adding a submodule to a repository in wit/deps/pkg is not possible. Again, I'd say that's a wit-bindgen design drawback, which wit-deps, again, addresses to eliminate this headache from downstream developers consuming real WIT packages.

The new workspace is only there due to cargo vet auditing, please see OP, I'm happy to drop https://github.com/bytecodealliance/wasmtime/commit/b22c571b911c1bbdd2d9356d72de18606d3e08ce if someone could certify all the dependencies used. The reason for a new tool is, in fact, simplification of the update and consistency checking process, originally @pchickey's request https://github.com/bytecodealliance/wasmtime/pull/6486#issuecomment-1570860614 - rather than having to manually copy-paste WIT around as is done today (and prone to mistakes, note how a bunch of WIT files were missed by the update in https://github.com/bytecodealliance/wasmtime/pull/6390) there is now just one tool with 2 subcommands, which take no arguments and can automate all of this tedious procedure and do "the right thing", as well as check consistency in CI.

It's nice to provide a working dev environment out-of-the-box. Requiring developers of Wasmtime to manually install development dependencies, to me, is an anti-pattern, that's why a Rust crate instead is very useful - it works cross-platform and developers do not need to install yet another binary executable.

Orthogonally from my above comments, why are crates/wasi/wit/{cli,clocks,filesystem} added in this PR? Are they an aritfact of an intermediate state which weren't removed? I don't believe they're read by wit-parser and otherwise they aren't sharing bits with the copies under deps/{cli,clocks,filesystem}.

Assuming that every WIT dependency must have an external source (e.g. a standard repo), deps/{cli,clocks,filesystem} are these sources, which should eventually be removed in favor of using upstream (see the TODO comment). Regarding different contents I cannot reproduce:

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum $dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc0e88e8d630d7fd1bc0d29f7e13  clocks/wall-clock.wit
ce4e225baf21a915435f807a0bb47bc34f43227c0424156cab7c6c536a5d1755  filesystem/filesystem.wit

rvolosatovs@cobalt ..ecodealliance/wasmtime/crates/wasi/wit (git)-[fix/cargo-vendor] % for dep in cli clocks filesystem; do sha256sum deps/$dep/*; done
99f45bfa45638e87bcb8e1c7b8fad65c2ac60fd70500fa447268371a0d87ed10  deps/cli/environment.wit
a917b0e0381f0cb3f86e9c88c0fc84b46388595cf4858f02de2d597caee82e31  deps/cli/exit.wit
2c60a8c95149393ac3698eb766da44c3e570dfca38646256c888ecb34f9717ab  deps/cli/preopens.wit
57f8549af0058bf1bf58eccb9820715a907b2c17825e8eb7213ab647381935be  deps/cli/stdio.wit
f0418770f24f7130d2eaf5f43316edc145ec4a524e77f02ace7db6171dcd8c76  deps/clocks/monotonic-clock.wit
caf183ddda4bb6693f0cc9d3249e42a58a21091ff1764e92f652dfc8a1bbff6c  deps/clocks/timezone.wit
44e062d552753f939b4dec8ce87b499ba46fdc
[message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (Jun 15 2023 at 11:19):

rvolosatovs commented on issue #6486:

I've removed all dependency updates and any reorganization from this PR, it's just fixing the cargo vendor use case for downstream crates. I really hope we can align on getting this fix over the line


Last updated: Nov 22 2024 at 16:03 UTC