Stream: wasi

Topic: multiple worlds and the tool sandwich problem


Randy Reddig (Sep 25 2024 at 16:12):

This is an extraction from a longer thread with @Mossaka (Joe) while working on the Go bindings generator for WIT (wit-bindgen-go). The problem described here applies to Go, but I believe this may apply more broadly, so this is an attempt to surface the issue in a forum where we can collaborate on a solution.

The "tool sandwich" problem:

WIT is needed at two distinct stages of development: (1) to generate bindings called directly or indirectly by a user program, and (2) later, to convert a compiled wasm module into a component via wasm-tools component embed and wasm-tools component new.

The "sandwich" refers to WIT as "bread" and the contents of the sandwich being the normal development activities by an end-user developer writing a program that depends on the underlying WIT interfaces, with or without their knowledge.

Stages 1 and 2 may be performed by two distinct, unrelated parties, and at different times. For example, (1) could be performed by an open-source maintainer who develops a Go package that wraps WIT interfaces with standard Go semantics (e.g. wrapping wasi:http). That Go package is eventually used by an end-user developer (2), who has no knowledge of, and need not understand, the underlying system call layer provided by the WIT / WASI interfaces.

Right now, the "tool sandwich" problem leaks the abstraction of WIT down to the end-user developer, who must synthesize a WASI world that accommodates each of the packages their program uses.

Hypothetical Proposal

  1. Allow wasm-tools component embed|new to accept multiple WIT worlds, and fuse them at build time (implying a synthetic WIT world with a number of include statements).
  2. Standardize a mechanism for language libraries/packages to embed WIT, suitable for downstream tool consumption.
  3. Allow (1) and (2) to work together to mitigate the tool sandwich problem, so end users do not have to know or care about the underlying system interfaces, WIT, or WASI at all.
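
A minimal sketch of the synthetic world that item 1 implies, assuming a build tool collects the world names its dependencies were generated against. The function name, the package name example:app, the world name fused, and example:kv/store are all hypothetical; only the WIT include statement itself is taken from the proposal.

```go
package main

import (
	"fmt"
	"strings"
)

// SynthesizeWorld returns WIT text for a world that includes each of the
// given worlds. This is a hypothetical helper, not part of wasm-tools.
func SynthesizeWorld(pkg, world string, includes []string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "package %s;\n\n", pkg)
	fmt.Fprintf(&b, "world %s {\n", world)
	for _, inc := range includes {
		// One include statement per world a dependency was built against.
		fmt.Fprintf(&b, "  include %s;\n", inc)
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// example:kv/store is a made-up world used only for illustration.
	fmt.Print(SynthesizeWorld("example:app", "fused", []string{
		"wasi:http/proxy@0.2.0",
		"example:kv/store",
	}))
}
```

The end user would never write this world by hand; it would exist only as an intermediate artifact of the build.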

Alex Crichton (Sep 25 2024 at 16:44):

This is a very good problem to discuss, thanks for bringing this up and typing this out!

I can perhaps start by describing the shape of how this was solved in Rust/C and then answer your questions after. The expected workflow for part (1) is:

For part (2) it then looks like:

The important part here is basically these object files smuggling type information from (1) to (2) without anyone in the middle being any the wiser. Bundling wasm-ld and wasm-tools component new is what wasm-component-ld is tasked with doing.


So with all that in mind, what you're proposing I believe basically aligns with how this is done today in Rust and C. For (1) that's already done by wasm-tools component new in the sense that if you run embed multiple times those worlds are all concatenated together for the final component (possibly producing a world no one has ever written down before). For (2) the mechanism is custom sections in the final wasm module where each custom section is the wasm-encoded WIT world that was used to generate bindings. Then (3) works via language-specific mechanisms for automatic inclusion of the object files to go to the linker.
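
To make the custom-section mechanism concrete, here is a rough sketch of how a tool could enumerate the custom sections of a core wasm module. It relies only on the core wasm binary layout (an 8-byte header, then sections of the form id, LEB128 size, body, where id 0 is a custom section whose body starts with a length-prefixed name). The function name is hypothetical, and the section name component-type:example in main is an assumed placeholder, not a name confirmed in this thread.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// CustomSectionNames returns the names of all custom sections (id 0)
// found in a core wasm module. Hypothetical helper for illustration.
func CustomSectionNames(module []byte) ([]string, error) {
	header := []byte{0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00}
	if !bytes.HasPrefix(module, header) {
		return nil, errors.New("not a core wasm module")
	}
	var names []string
	rest := module[len(header):]
	for len(rest) > 0 {
		id := rest[0]
		// Section size is an unsigned LEB128, which Uvarint decodes.
		size, n := binary.Uvarint(rest[1:])
		if n <= 0 || uint64(len(rest)-1-n) < size {
			return nil, errors.New("truncated section")
		}
		body := rest[1+n : 1+n+int(size)]
		rest = rest[1+n+int(size):]
		if id != 0 {
			continue // not a custom section
		}
		nameLen, m := binary.Uvarint(body)
		if m <= 0 || uint64(len(body)-m) < nameLen {
			return nil, errors.New("truncated custom section name")
		}
		names = append(names, string(body[m:m+int(nameLen)]))
	}
	return names, nil
}

func main() {
	name := "component-type:example" // assumed placeholder name
	body := append([]byte{byte(len(name))}, name...)
	body = append(body, 0xAA) // placeholder payload, not a real encoded world
	module := []byte{0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00}
	module = append(module, 0x00, byte(len(body)))
	module = append(module, body...)

	names, err := CustomSectionNames(module)
	fmt.Println(names, err)
}
```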

Joel Dice (Sep 25 2024 at 16:59):

FWIW, wasm-component-ld accepts an arbitrary number of --component-type options, each accepting the path to a WIT file; those WIT files are then merged (along with any binary component types passed via custom sections, as Alex described) together, and the result is used as the component type for componentization via wit-component. The .NET component tooling uses this mechanism via -Wl flags to the linker.

Randy Reddig (Sep 25 2024 at 17:14):

We’re currently prototyping WASI 0.2 on TinyGo, so we have some degree of control over the entire toolchain. In theory, what we do with TinyGo can inform a plan for mainline (e.g. "big") Go.

  1. We can adapt wasm-tools-go and wit-bindgen-go to emit WIT or other metadata in the generated code.
  2. We can adapt TinyGo to embed that metadata in custom sections in the compiled wasm.

If wasm-tools component new can consume that metadata without needing a sidecar of WIT, I think we’ve solved the tool sandwich problem.
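
A sketch of what step 2 could look like at the byte level, assuming the toolchain already has the metadata bytes in hand. It appends a custom section (id 0) to a core wasm module, which the core binary format allows at any position. The function name, the section name wit-metadata, and the payload are hypothetical; a real payload would be the binary-encoded WIT world, whose exact format is not described here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uleb encodes v as an unsigned LEB128 integer.
func uleb(v uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	return buf[:binary.PutUvarint(buf, v)]
}

// AppendCustomSection returns a copy of module with a custom section
// appended. Hypothetical helper; it does not validate the input module.
func AppendCustomSection(module []byte, name string, payload []byte) []byte {
	// Section body: length-prefixed name followed by the raw payload.
	body := uleb(uint64(len(name)))
	body = append(body, name...)
	body = append(body, payload...)

	out := append([]byte{}, module...) // copy so the input is left untouched
	out = append(out, 0x00)            // section id 0 = custom
	out = append(out, uleb(uint64(len(body)))...)
	return append(out, body...)
}

func main() {
	// An empty core module: magic number and version only.
	module := []byte{0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00}
	// "wit-metadata" and the payload are placeholders for illustration.
	out := AppendCustomSection(module, "wit-metadata", []byte("placeholder"))
	fmt.Printf("% x\n", out)
}
```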

Alex Crichton (Sep 25 2024 at 17:35):

Yeah that should work! If you'd like I can detail a bit more about the exact format of the custom section too. It's not something super well documented at this moment

Randy Reddig (Sep 26 2024 at 15:57):

Great!

Next question: can we smuggle WIT text (instead of binary) in a custom section that wasm-tools could interpret?

Alex Crichton (Sep 26 2024 at 16:24):

That's not implemented at this time, and we've tried to avoid it because the binary format is more stable than the text format, but there's also no reason we couldn't support it as an option!


Last updated: Nov 22 2024 at 16:03 UTC