preview2 · wasi · Zulip Chat Archive

Stream: wasi

Topic: preview2

Nathaniel McCallum (Dec 01 2022 at 21:17):

@Dan Gohman @Kyle Brown @Luke Wagner @Brian @Romans Volosatovs @Bailey Hayes @Alex Crichton As several of us have discussed, I have been preparing a theoretical wasi-snapshot-preview2 interface based on the state of wit-bindgen today. The only outstanding feature needed for this is use, which @Brian is working on. In the spirit of release early, release often, you can find my work here: https://github.com/npmccallum/wasi-snapshot-preview2

This effort attempts to do two things:

get a version of wasi based on wit as it exists today
add some of the most painful missing features (particularly, outgoing TCP and DNS)

This work is based heavily on existing proposals and all the credit goes to the original authors. I have mostly tried to adapt to the current situation in wit and consolidate the effort.

I would appreciate your reviews.

Joel Dice (Dec 01 2022 at 21:23):

For comparison, there's also https://github.com/bytecodealliance/preview2-prototyping/blob/main/wit/wasi.wit

Nathaniel McCallum (Dec 01 2022 at 21:59):

@Joel Dice Thanks! I hadn't seen that particular iteration (but I saw others that are very similar).

Nathaniel McCallum (Dec 01 2022 at 21:59):

I'm happy to merge with anyone.

Dan Gohman (Dec 01 2022 at 22:02):

Is your dns, ip, and tcp stuff based on wasi-sockets, or is it different?

Nathaniel McCallum (Dec 01 2022 at 22:02):

@Dan Gohman Based on it, but modified for today's wit.

Dan Gohman (Dec 01 2022 at 22:03):

The preview2-prototyping repo is the same as the old sunfishcode/preview1.wasm repo; it was just recently renamed and moved.

Nathaniel McCallum (Dec 01 2022 at 22:03):

@Dan Gohman I excluded UDP and rewrote the wasi-dns proposal to keep asynchronous DNS with descriptors rather than futures, etc.

Nathaniel McCallum (Dec 01 2022 at 22:05):

@Dan Gohman Most of my proposal borrowed from your old repo which I didn't see any action on. So now that the work has shifted elsewhere, I'm happy to merge.

Joel Dice (Dec 01 2022 at 22:05):

There seems to be a lot of interest from various people in helping move the design and implementation of Preview 2 forward. I've been working on the host implementation in the preview2-prototyping repo referenced above, and Alex, Dan, and Pat (and others?) have been contributing to the polyfill on a daily basis. Perhaps we should all get together and sync up so we're all pushing in the same direction.

Nathaniel McCallum (Dec 01 2022 at 22:05):

:thumbs_up:

Dan Gohman (Dec 01 2022 at 22:05):

Sounds good to me.

Dan Gohman (Dec 01 2022 at 22:05):

I'm around right now if anyone else wants a quick chat

Joel Dice (Dec 01 2022 at 22:05):

There's a Component Model meeting tomorrow; would that be a good venue? I can chat now, too.

Nathaniel McCallum (Dec 01 2022 at 22:06):

I could give you 15 minutes now.

Nathaniel McCallum (Dec 01 2022 at 22:06):

Maybe more if I check with my wife.

Dan Gohman (Dec 01 2022 at 22:06):

https://meet.jit.si/OrangePurpleLittleMountains

Joel Dice (Dec 01 2022 at 22:06):

Alex, would you have time to join us? You seem to have been working on this lately?

Alex Crichton (Dec 01 2022 at 22:06):

sure yeah, I can join

Dan Gohman (Dec 01 2022 at 22:06):

Also, the meeting tomorrow is also good.

Nathaniel McCallum (Dec 01 2022 at 22:12):

@Alex Crichton nathaniel@profian.com

Marcin Kolny (Dec 02 2022 at 09:29):

Hi, what's the conclusion? I'm quite interested in next steps.

Ralph (Dec 02 2022 at 13:22):

yup, here as well. How do things fall out? :-)

Joel Dice (Dec 02 2022 at 15:05):

I was only there for the first part of the meeting, but it was mainly Dan and Nathaniel syncing up their respective Preview 2 WIT files and getting on the same page about how and why Preview 2 points the way to Component Model Values, Resources, Streams, and Futures without actually using any of those things (since they aren't baked yet). Other themes:

Moving away from implicit preopens and untyped ambient authorities towards a typed main entrypoint for commands
Replace "everything is a file descriptor" and "errno for all errors" with more narrowly-scoped handles and error enums, respectively
Nathaniel is especially interested in better socket support, so he's been focusing on that, with particular attention to edge cases, UDP/DTLS transparency, etc.

More concretely, Nathaniel is going to merge his changes into the preview2-prototyping project linked to above, and that's where we'll collaborate going forward.
Further discussion to follow in today's component model meeting.
@Nathaniel McCallum @Dan Gohman feel free to chime in if any of this is inaccurate or incomplete.

Ralph (Dec 02 2022 at 15:09):

thanks, Joel!

Marcin Kolny (Dec 02 2022 at 17:13):

Thanks!

Nathaniel McCallum (Dec 02 2022 at 17:24):

@Joel Dice provided a very good summary. Let me add a few things that I take as even more important conclusions than the particular technical details:

We need to break compatibility with preview1 ASAP. There is significant risk of it becoming a de-facto standard. The longer we wait, the harder this becomes.
The highest priority is getting preview2 to a place where it, and all future versions, can be polyfilled easily. This means we can break compatibility more regularly with a clear upgrade story.
We also need to focus on the APIs that enable standard libraries to work. This is because they take the longest to integrate and have the most painful development cycle and the widest ranging impact. This particularly means networking.
The more we discussed networking, the more we came to agree that the corner cases are both real and substantial and that we should hew closely to the Berkeley sockets API. Attempts to consolidate these APIs into very high-level APIs may cause race conditions in wasi-libc. So, for example, we cannot combine bind() and listen().
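
As an illustration of why bind() and listen() are useful as separate steps, here is a minimal sketch using the ordinary POSIX sockets API. It is only one example of the gap between the two calls (reading back an OS-assigned port before accepting connections) and is not necessarily the race discussed in the meeting. The function name bind_then_listen is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to 127.0.0.1 on an OS-chosen port, read the port back,
 * and only then start listening. Returns the listening fd, or -1 on error.
 * On success, *port_out holds the bound port in host byte order. */
int bind_then_listen(uint16_t *port_out) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* 0 asks the OS to pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) goto fail;

    /* Between bind() and listen() the socket has an address but accepts no
     * connections yet, so the caller can still inspect or configure it. */
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) goto fail;
    *port_out = ntohs(addr.sin_port);

    if (listen(fd, 16) < 0) goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

A single combined call would have to either hide the intermediate state or grow parameters for everything a caller might want to do in between.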

Alex Crichton (Dec 02 2022 at 18:03):

While I don't disagree about focusing on standard library APIs first, I think there's still a long road to actually updating standard libraries to use these APIs. The component model itself is not really stable enough for that use case. This is why wasi_snapshot_preview1.wasm is being developed, though, since that provides an easy means to define preview1 in terms of whatever preview2 currently is, which means there's no immediate pressure to update standard libraries.

Ralph (Dec 02 2022 at 23:03):

I'm grateful for all the hard work going on here.

Joel Dice (Dec 02 2022 at 23:32):

Dan and I met today and discussed Preview 2 host implementation strategy, which helped clarify a few things for me. Here are some notes both to aid my memory and in case anyone else is curious or has comments.

The overall goal is to ship something quickly, so some of the original ambitions for Preview 2 are no longer in scope, and some things are being carried over from Preview 1 (e.g. the WasiFile trait) where there is no mature alternative yet available.
As with the Preview 1 implementation, we're going to keep using cap-std and system-interface where appropriate (mainly for wasi-filesystem), since the former provides valuable directory sandboxing and the latter provides useful, well-tested OS abstractions. If those abstractions cause pain as we implement Preview 2, we'll either fix them so they don't or come up with something better.
We're going to pull a slimmed-down version of wasi-common into the preview2-prototyping repo, removing wiggle, the rights system, the poll_oneoff scheduler, and anything else that's no longer relevant, but keeping the WasiFile and WasiDir traits and impls to ease migration for embedders. We will probably add/remove/modify methods in those traits as appropriate, but the basic shapes will be the same.
Each interface that has functions which can return errors will have its own error enum covering only relevant cases.
We will have separate handle types for each pseudo-resource (e.g. clocks, files, sockets), and each of these will have its own set of monomorphic functions (akin to resource methods). In other words, the function for reading from a file will be different than the function for reading from a socket, etc. In general, there will not be any polymorphic functions (e.g. a single function which can take multiple kinds of handles) unless those handles are wrapped in an enum type with a case for each possible handle type.
As with the Preview 1 implementation, we'll use async everywhere even though the underlying cap-std functions are blocking, using a pseudo-executor for the blocking "futures" and the runtime's real executor (e.g. Tokio) for non-blocking wasi-poll futures. The idea is to use something like Tokio for executing poll-oneoff futures instead of the existing, buggy wasi-common schedulers. I'm handwaving a bit here -- happy to discuss it further.
We're going to aim to make the core of the implementation runtime-agnostic, but the main priority is to ship something quickly, so we might need to compromise on that a bit if it slows things down.
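
The points above about per-interface error enums and separate handle types can be sketched in C roughly as follows. Every name here (file_handle, socket_handle, file_error, socket_error, any_handle, and the stub functions) is hypothetical and chosen for illustration; none of it is taken from the actual Preview 2 WIT files or host implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Distinct handle types: wrapping the index in different structs makes the
 * compiler reject a socket handle where a file handle is expected. */
typedef struct { uint32_t index; } file_handle;
typedef struct { uint32_t index; } socket_handle;

/* Each interface has its own error enum covering only relevant cases. */
typedef enum { FILE_OK, FILE_ERR_NOT_FOUND, FILE_ERR_ACCESS_DENIED } file_error;
typedef enum { SOCKET_OK, SOCKET_ERR_WOULD_BLOCK, SOCKET_ERR_CONNECTION_RESET } socket_error;

/* Monomorphic functions: one per handle type (stubbed for the sketch). */
file_error file_read(file_handle f, uint8_t *buf, size_t len, size_t *nread) {
    (void)f; (void)buf; (void)len;
    *nread = 0; /* stub: pretend the file is empty */
    return FILE_OK;
}

socket_error socket_receive(socket_handle s, uint8_t *buf, size_t len, size_t *nread) {
    (void)s; (void)buf; (void)len;
    *nread = 0; /* stub: pretend no data has arrived yet */
    return SOCKET_ERR_WOULD_BLOCK;
}

/* Polymorphism is opt-in: an enum-tagged union with one case per handle type. */
typedef enum { HANDLE_FILE, HANDLE_SOCKET } handle_kind;
typedef struct {
    handle_kind kind;
    union { file_handle file; socket_handle sock; } as;
} any_handle;

/* Returns 0 on success and -1 on any error; a real API would keep the
 * typed error instead of collapsing it. */
int any_read(any_handle h, uint8_t *buf, size_t len, size_t *nread) {
    switch (h.kind) {
    case HANDLE_FILE:
        return file_read(h.as.file, buf, len, nread) == FILE_OK ? 0 : -1;
    case HANDLE_SOCKET:
        return socket_receive(h.as.sock, buf, len, nread) == SOCKET_OK ? 0 : -1;
    }
    return -1;
}
```

With this shape, calling file_read with a socket_handle fails at compile time, and only code that explicitly accepts any_handle has to handle every case.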

Nathaniel McCallum (Dec 03 2022 at 14:01):

@Joel Dice Two requests:

Let's pick an error return type for the default function implementations and use it everywhere. This is currently inconsistent.
Let's use interior mutability everywhere. This will help with existing threading work. I know of only one blocker for this.

Joel Dice (Dec 05 2022 at 14:55):

@Nathaniel McCallum Can you give an example of the error return type inconsistency? I'm only seeing wasi_common::Error everywhere at the moment, but I may be missing something. Also, note that the plan is for each Preview 2 interface to have its own error enum type, so I expect that will be reflected in the implementations as well. I'd actually like to get away from wrapping everything in an anyhow::Error and then downcasting later, since that's proven to be a source of bugs in the past (e.g. errors turned into traps when they shouldn't have been).

Regarding interior mutability, I imagine we could remove wasi_common::Table::get_mut and let the compiler tell us what else needs to be changed.

YAMAMOTO Takashi (Dec 28 2022 at 07:03):

Joel Dice said:

Replace "everything is a file descriptor" and "errno for all errors" with more narrowly-scoped handles and error enums, respectively

why?

Dan Gohman (Dec 28 2022 at 13:22):

To be sure, we'll continue to support POSIX APIs at the libc level. But at the WASI level, there are advantages to leveraging Wasm's static type system. In POSIX, and in fact in preview1, there are so many error codes, and in a lot of situations it's ambiguous which errno code is the right one to use. There are open issues about this in wasi-testsuite. And as WASI grows more features and more APIs, trying to maintain a single unified error code space will only get harder. And for file descriptors, at the WASI level, using distinct static types will let us use interface-type handles.

Dan Gohman (Dec 28 2022 at 13:32):

Instead of saying "file descriptors are dynamically typed, you can pass anything to read and it'll work if the underlying resource can model itself as a stream of bytes", the idea is to move to a model of "there's a stream type, and any resource which can model itself as a stream can give you a stream view of itself", at the WASI level. This will let us do things like connect distinct trust domains with streams without having to worry about whether one side might take advantage of dynamic typing to do non-stream things too. And it'll give us more flexibility when async and type-parameterized streams arrive.
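
A rough C sketch of the "stream view" idea described above, under the assumption of an in-memory resource. The types stream and mem_file and the functions mem_file_as_stream and drain are hypothetical. C cannot enforce the isolation by itself (the void pointer can be cast back); in WASI the guarantee would come from statically typed handles, so this only shows the shape of the API.

```c
#include <stddef.h>
#include <string.h>

/* The only thing a consumer ever sees: something that yields bytes. */
typedef struct stream {
    void *state;
    /* Read up to len bytes into buf; returns bytes read, 0 at end of stream. */
    size_t (*read)(void *state, char *buf, size_t len);
} stream;

/* A resource with its own identity: an in-memory "file" with a cursor.
 * Any non-stream operations on it are not reachable through `stream`. */
typedef struct {
    const char *data;
    size_t size;
    size_t pos;
} mem_file;

static size_t mem_file_read(void *state, char *buf, size_t len) {
    mem_file *f = state;
    size_t left = f->size - f->pos;
    size_t n = len < left ? len : left;
    memcpy(buf, f->data + f->pos, n);
    f->pos += n;
    return n;
}

/* The resource hands out a stream view of itself. */
stream mem_file_as_stream(mem_file *f) {
    stream s = { f, mem_file_read };
    return s;
}

/* A consumer written purely against the stream type. */
size_t drain(stream s, char *out, size_t cap) {
    size_t total = 0, n;
    while (total < cap && (n = s.read(s.state, out + total, cap - total)) > 0)
        total += n;
    return total;
}
```

The consumer drain never learns what kind of resource produced the bytes, which is the property that makes it safer to connect separate trust domains with a stream.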

Dan Gohman (Dec 28 2022 at 13:33):

Fun fact: a Unix program with stdout and stderr redirected can still print to the console:

$ cat t.c
int main() {
    write(0, "Hello, World!\n", 14);
}
$ gcc -w t.c
$ ./a.out > /dev/null 2> /dev/null
Hello, World!
$

This is possible on Unix because of the dynamic-typing nature of file descriptors.
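
For context on why the example above works: in an interactive shell, fd 0 typically refers to the terminal device opened for both reading and writing, so nothing stops a write to it. A small sketch that checks this with the standard fcntl(F_GETFL) call follows; the helper names fd_allows_write and stdin_allows_write are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd was opened with write access, 0 if it is read-only,
 * and -1 if the descriptor is invalid. */
int fd_allows_write(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) return -1;
    int mode = flags & O_ACCMODE;
    return mode == O_WRONLY || mode == O_RDWR;
}

/* Convenience wrapper for standard input. Whether this returns 1 depends on
 * how the process was started (terminal, pipe, redirected file, ...). */
int stdin_allows_write(void) {
    return fd_allows_write(STDIN_FILENO);
}
```

The type of the descriptor says nothing about this; it can only be discovered at run time, which is the dynamic typing being discussed.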

YAMAMOTO Takashi (Dec 29 2022 at 13:28):

does it mean a rather big switch statement in e.g. libc read(), which can end up importing things which the app doesn't actually use?

YAMAMOTO Takashi (Dec 29 2022 at 13:31):

Dan Gohman said:

Fun fact: a Unix program with stdout and stderr redirected can still print to the console:
$ cat t.c
int main() {
    write(0, "Hello, World!\n", 14);
}
$ gcc -w t.c
$ ./a.out > /dev/null 2> /dev/null
Hello, World!
$
This is possible on Unix because of the dynamic-typing nature of file descriptors.

i even have a hack to pass a listening socket as fd 0 to a wasm instance. https://github.com/yamt/garbage/blob/3618fbf5f5c9a6bdd28e950f4d240eadd52ef7c9/wasm/httpd/test.sh#L3

Dan Gohman (Dec 29 2022 at 16:29):

Switch statement yes, but if we structure it right, with strategic use of weak symbols, we can avoid pulling in imports the app doesn't need.

YAMAMOTO Takashi (Dec 30 2022 at 01:39):

i can't think of "strategic use of weak symbols" which can solve it. can you explain a bit?
my concern is that, with a naive implementation like the following, all wasi_type_?_op_a will be imported if an app uses libc_op_a.

libc_op_a(int fd)
{
    switch (get_type(fd)) {
    case x: wasi_type_x_op_a(...); break;
    case y: wasi_type_y_op_a(...); break;
    case z: wasi_type_z_op_a(...); break;
    }
}

YAMAMOTO Takashi (Dec 30 2022 at 01:47):

do you mean to apply what we do for chdir and __wasilibc_find_relpath_alloc to libc_open_type_x and wasi_type_x_op_????

Dan Gohman (Dec 31 2022 at 11:32):

One part of it is that ops like read and write operate on streams, so there won't be a separate read/write for each resource type. Resources that want to support read and write will just produce a stream rather than having their own read and write functions.

Dan Gohman (Dec 31 2022 at 11:38):

Another part of it is the actual weak symbol trick. E.g. for sockets, any program that needs sockets support (and not just operating on a stream) will have a call to accept, connect, listen, or similar, and then we can have functions defined in the same .o files which override weak symbols in libc to add sockets support.
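
One way the weak symbol trick could look, as a sketch under assumptions rather than the actual wasi-libc design. It uses the GCC/Clang weak attribute, and all names (fd_kind, libc_socket_op_hook, libc_op_a, wasi_socket_op) are hypothetical.

```c
/* How libc might classify a descriptor (hypothetical). */
enum fd_kind { FD_STREAM, FD_SOCKET };

/* Weak default, as it might live in libc: no sockets support and, crucially,
 * no reference to any socket import. */
__attribute__((weak)) int libc_socket_op_hook(int fd) {
    (void)fd;
    return -1; /* "not supported" */
}

/* In a real build, the object file that defines connect()/listen()/accept()
 * would also carry a strong definition of the hook, for example:
 *
 *     int libc_socket_op_hook(int fd) { return wasi_socket_op(fd); }
 *
 * With a static archive, the linker extracts that object only when the
 * program references one of those functions. Its strong definition then
 * replaces the weak default, and only then is the import pulled in. */

/* The dispatching libc function references only the hook. */
int libc_op_a(enum fd_kind kind, int fd) {
    switch (kind) {
    case FD_STREAM:
        return 0; /* stream path, stubbed for the sketch */
    case FD_SOCKET:
        return libc_socket_op_hook(fd);
    }
    return -1;
}
```

Compiled on its own, as here, only the weak default is present, so the socket case reports "not supported" and the binary carries no socket imports.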

Last updated: Apr 08 2025 at 19:03 UTC