Stream: wasi

Topic: Process model?


Tarek Sander (Mar 14 2024 at 07:51):

Would there be interest in a lightweight "process" model in WASI? Things like build tools and compilers often fork or execute other programs, and these essentially "shared nothing threads" are also possible on the web without SharedArrayBuffer. It would include a "process" tree with a way to get the exit code of a child, a way to kill a child, "fork" (create a copy of the WASM instance without shared memory) and "exec" (replace the instance in the process tree with an instance of a wasm file, probably from bytes for flexibility, but it can be loaded from the filesystem with wasi:fs). Only components implementing the cli world would be supported for running. Signals could be supported with the async component model in the future. That would e.g. allow standard build tools to run more easily in a browser. It may also be enough for the Python multiprocessing module to work, though that probably needs signals.

Tarek Sander (Mar 14 2024 at 08:00):

Or should this be a larger proposal with its own posix-cli world? And support for more POSIX APIs? Or should something like that be split into multiple small proposals?

bjorn3 (Mar 14 2024 at 10:17):

Supporting only something like posix_spawn would be a better idea than supporting fork+exec IMO. Supporting fork basically mandates COW memory to avoid terrible performance due to having to copy the entire linear memory on every fork and it requires cloning the wasm stack too, which most wasm runtimes don't support and I don't think will ever be supported on the web. posix_spawn, on the other hand, directly creates a new process with the target executable already loaded, which is indeed easy to do on the web using web workers.
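
As a rough illustration of the spawn-style launch described above, here is what it looks like from the caller's side with Python's `os.posix_spawn` on a POSIX host. This is only a host-side analogy and not a proposed WASI interface; the helper name `spawn_and_wait`, the child command and the exit code 7 are made up for the example.

```python
import os
import sys

def spawn_and_wait(argv):
    """Start argv[0] as a fresh process and return its exit code.

    Unlike fork, nothing of the parent's memory or call stack is copied:
    the child starts directly in the target executable.
    """
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)

# Hypothetical child: a Python one-liner that exits with code 7.
exit_code = spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(7)"])
print(exit_code)
```

The caller never needs a copy of its own memory or stack, which is the property that makes this shape easy to map onto web workers.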

bjorn3 (Mar 14 2024 at 10:23):

I don't think posix signals should be allowed. They can interrupt the process in between any two instructions. I would expect the component model async support not to allow preemption, but only to allow task switching at yield points where the current task is awaiting completion of a future. Furthermore, posix signals leak part of their state across exec, which can cause processes that don't expect this to misbehave.

On the other hand something closer to linux's signalfd may work. That requires you to actively poll for incoming signals and thus doesn't cause arbitrary preemption. At that point however you may as well use a pipe or some other IPC mechanism that doesn't involve signals as you don't have compatibility with programs that expect posix signals anyway.

bjorn3 (Mar 14 2024 at 10:27):

Also your processes don't need to form a process tree. The process spawn method should probably return a pidfd equivalent and allow any process with the pidfd to wait on it. And then not expose any pid's at all. This is more secure and doesn't require zombie processes when you haven't waited on the process exit yet to avoid pid reuse as the pidfd is intrinsically tied to a single process.

Tarek Sander (Mar 14 2024 at 10:29):

> Supporting fork basically mandates COW memory to avoid terrible performance due to having to copy the entire linear memory on every fork and it requires cloning the wasm stack too, which most wasm runtimes don't support and I don't think will ever be supported on the web.

Oh right, the call stack. I suppose asyncify would support replaying the stack until the fork call. And I don't expect forks to happen often so performance wouldn't be that big of an issue.

> I don't think posix signals should be allowed. They can interrupt the process in between any two instructions. I would expect the component model async support not to allow preemption, but only to allow task switching at yield points where the current task is awaiting completion of a future. Furthermore, posix signals leak part of their state across exec, which can cause processes that don't expect this to misbehave.
>
> On the other hand something closer to linux's signalfd may work. That requires you to actively poll for incoming signals and thus doesn't cause arbitrary preemption. At that point however you may as well use a pipe or some other IPC mechanism that doesn't involve signals as you don't have compatibility with programs that expect posix signals anyway.

Traditional signals can be supported with threads then: a thread listens for signals on a signalfd equivalent and runs the registered signal handlers.

> Also your processes don't need to form a process tree. The process spawn method should probably return a pidfd equivalent and allow any process with the pidfd to wait on it. And then not expose any pid's at all. This is more secure and doesn't require zombie processes when you haven't waited on the process exit yet to avoid pid reuse as the pidfd is intrinsically tied to a single process.

This is about compatibility with existing software though: I don't know how much software uses posix_spawn or pidfds.

bjorn3 (Mar 14 2024 at 10:29):

tl;dr: I personally see value in supporting processes, but I don't think we should simply copy what POSIX does, but follow the capability oriented model of WASI and forego signals entirely. I don't have any voting rights for WASI proposals though.

Tarek Sander (Mar 14 2024 at 10:30):

bjorn3 said:

> tl;dr: I personally see value in supporting processes, but I don't think we should simply copy what POSIX does, but follow the capability oriented model of WASI and forego signals entirely. I don't have any voting rights for WASI proposals though.

I'd still like to have the possibility of emulating POSIX APIs though, at the WASI libc level.

bjorn3 (Mar 14 2024 at 10:32):

For compatibility maybe it would be possible for wasi-libc to keep an internal mapping between pid and pidfd and, when you wait on a pid, look up the corresponding pidfd and wait on it instead. This would give every process their own pid namespace though.
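
A minimal sketch of that mapping, assuming a spawn call hands back some pidfd-like handle with a `wait()` method. The names `PidTable` and `FakeChild` are hypothetical and only stand in for whatever wasi-libc and the host would actually provide.

```python
import itertools

class PidTable:
    """Per-process map from small integer pids to opaque child handles.

    Hypothetical sketch: `handle` stands in for a pidfd-like capability
    returned by a spawn call; it only needs a wait() method here.
    """

    def __init__(self):
        self._next_pid = itertools.count(1)
        self._handles = {}

    def register(self, handle):
        """Assign a process-local pid to a freshly spawned child's handle."""
        pid = next(self._next_pid)
        self._handles[pid] = handle
        return pid

    def waitpid(self, pid):
        """Look up the handle behind pid, wait on it, then forget the pid."""
        handle = self._handles.pop(pid)  # KeyError plays the role of ECHILD
        return handle.wait()


class FakeChild:
    """Stand-in for a real child handle, used only to exercise the table."""

    def __init__(self, exit_code):
        self._exit_code = exit_code

    def wait(self):
        return self._exit_code


table = PidTable()
pid_a = table.register(FakeChild(0))
pid_b = table.register(FakeChild(3))
```

Because the table lives inside one process, the pids it hands out mean nothing to any other process, which is the "own pid namespace" consequence mentioned above.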

bjorn3 (Mar 14 2024 at 10:34):

For signals, single threaded programs may expect the signal handler to run on the main thread and to cause all syscalls to return EINTR. For multi threaded programs your compatibility mechanism would work for as long as a process doesn't use pthread_kill (or was it another function) to send a signal to a specific thread.

Tarek Sander (Mar 14 2024 at 10:38):

bjorn3 said:

> For signals, single threaded programs may expect the signal handler to run on the main thread and to cause all syscalls to return EINTR. For multi threaded programs your compatibility mechanism would work for as long as a process doesn't use pthread_kill (or was it another function) to send a signal to a specific thread.

That would also be possible: Host calls would need to check for pending signals on return and execute the signal handler.
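
One way to picture this, as a sketch only: every host call is wrapped so that queued signals are delivered to their handlers once the call returns. The queue, the `host_call` decorator, `fake_read` and the signal number are all hypothetical names for illustration.

```python
import collections
import functools

pending_signals = collections.deque()   # signal numbers queued by the host
handlers = {}                           # signal number -> registered handler

def host_call(func):
    """Wrap a host call so pending signals are handled when it returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Drain signals only here, at a well-defined point in the program.
        while pending_signals:
            signum = pending_signals.popleft()
            handler = handlers.get(signum)
            if handler is not None:
                handler(signum)
        return result
    return wrapper

@host_call
def fake_read(data):
    """Hypothetical host call; a real one would do I/O."""
    return len(data)

seen = []
SIGUSR1 = 10                      # illustrative number, not read from the host
handlers[SIGUSR1] = seen.append
pending_signals.append(SIGUSR1)   # pretend the host queued a signal
n = fake_read(b"abc")
```

Note that the handler only runs after `fake_read` has completed; the call itself is never cut short.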

bjorn3 (Mar 14 2024 at 10:41):

That doesn't actually interrupt the syscall itself, which some processes rely on.

Tarek Sander (Mar 14 2024 at 10:45):

bjorn3 said:

> That doesn't actually interrupt the syscall itself, which some processes rely on.

Some syscalls like sleep, read and write would need special handling to support interruption. I don't think many syscalls would need that though.

Tarek Sander (Mar 14 2024 at 10:49):

Especially since I/O will probably go async in p3, so poll/await would probably be the only thing that needs to be interruptible.

Tarek Sander (Mar 14 2024 at 11:12):

And I think the cost of asyncify for fork is also OK: As stated, the goal is compatibility, not performance. You're free to use the spawn API that would also logically be included and your program wouldn't need to go through asyncify, and there would be no need to copy memory.

Tarek Sander (Mar 14 2024 at 11:17):

But it may make sense to split it into modern and legacy POSIX proposals. Modern would include the process model, spawn, signalfds, pidfds and other easily implementable things, and the legacy proposal would add things like a proper process tree with PIDs, fork on top of asyncify and exec. The modern one would have priority IMO, but supporting older POSIX APIs is a nice long-term goal.

Dan Gohman (Mar 14 2024 at 13:34):

The problem with fork is that it's more of a whole-system design philosophy than a function. We can't ignore it when we don't need it, because just by existing, it creates the possibility that a fork could happen at any time. Everything in the system, for all time, has to be designed with fork in mind.

Tarek Sander (Mar 14 2024 at 13:45):

Many WASI proposals would need no interaction with fork, e.g. the key-value store. Those proposals should not have their resources copied by fork; they only stay in the parent (though the program is intended to be a POSIX app and shouldn't even be aware of WASI). Only resources like open files would need special fork handling, because other than copying the memory, call stack and fd table, there's nothing else to fork. The call stack would be saved via asyncify, and both instances would then rewind to the fork call and get the appropriate return value. The posix-legacy world would need to expose stack unwinding and rewinding functions based on asyncify, and that should trap if there isn't enough memory.

Dan Gohman (Mar 14 2024 at 15:05):

The whole-system design philosophy we are aiming for is: every API can be virtualized. And the filesystem API is a key part of that story. If we start saying that the filesystem API has some special relationship with fork, that would mean that when we virtualize the filesystem by implementing it in Wasm, now arbitrary Wasm code has to have that same special relationship with fork.

Tarek Sander (Mar 14 2024 at 15:20):

Technically the only thing it needs to support is cloning resources into the resource table of the new instance. So the fs resources need a clone method (which shouldn't be that hard), and the legacy-posix world needs a function that accepts a resource and puts it in the specified resource index. That would need a move function that moves a resource to the specified index in the resource table, AFAIK that just means manipulating the table of externrefs inside WASM. Seems virtualizable to me.

Tarek Sander (Mar 14 2024 at 15:25):

And any resources that support cloning would be cloned to the child.
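
To make the cloning idea concrete, here is a small sketch in which the child's resource table is built by cloning whatever supports it and placing the clone at the same index. `FileHandle`, `KeyValueStore` and `fork_resource_table` are hypothetical names, and a plain dict stands in for the component's resource table.

```python
class FileHandle:
    """Hypothetical cloneable resource (think: an open file)."""

    def __init__(self, path, offset=0):
        self.path = path
        self.offset = offset

    def clone(self):
        return FileHandle(self.path, self.offset)


class KeyValueStore:
    """Hypothetical resource with no clone method; it stays with the parent."""


def fork_resource_table(parent_table):
    """Build the child's table: clone what can be cloned, at the same index."""
    child_table = {}
    for index, resource in parent_table.items():
        clone = getattr(resource, "clone", None)
        if callable(clone):
            child_table[index] = clone()
    return child_table


parent = {0: FileHandle("/tmp/out.log", offset=42), 1: KeyValueStore()}
child = fork_resource_table(parent)
```

Two simplifications are worth flagging. A real POSIX fork shares the open file description, and therefore the offset, between parent and child, whereas this sketch copies it. And resources that cannot be cloned are silently dropped here, which leaves open what should happen when the child still holds a handle to one.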

Dan Gohman (Mar 14 2024 at 15:46):

What happens if you fork while something in your "process" is holding a resource which can't be cloned? When does a virtual filesystem instance allocate a new linear-memory stack and thread-local storage to use when the new "process" calls it? How does this interact with stack-switching? How does this interact with GC?


Dave Bakker (badeend) (Mar 14 2024 at 15:48):

> the fs resources need a clone method (which shouldn't be that hard)

Why do you think clone would be contained to just filesystem resources?

Tarek Sander (Mar 14 2024 at 15:50):

Stack switching: interaction if it is needed for threads. GC: GC objects aren't copied. Remember, everything that's not posix-relevant is invisible to fork, because programs using fork aren't using these APIs anyways. fork is for compatibility, not for making a crazy WASM-POSIX hybrid application.

Tarek Sander (Mar 14 2024 at 15:52):

Dave Bakker (badeend) said:

> > the fs resources need a clone method (which shouldn't be that hard)
>
> Why do you think clone would be contained to just filesystem resources?

It doesn't need to be, but any resource that wants to support fork must support cloning. That would probably need another table for mapping resource ids to clone methods. AFAIK resources should be opaque types if you only have a handle, right?

Dan Gohman (Mar 14 2024 at 16:43):

Ultimately, I suspect you're right; if we really sat down and thought about it, we could probably design a whole "legacy POSIX mode", with fork and something approximating signals, and that only tries to support "any language you want as long as it's C", and so on. And there is a perspective from which that would be a cool thing to have.
But there's also a cost. Where do we direct our finite energies? What are the eventual outcomes we're working for? I don't think many people would want us to build two separate ecosystems.

Tarek Sander (Mar 14 2024 at 16:49):

That's the purpose of having multiple worlds though: The CLI world is for running new applications, the proxy world is for writing HTTP proxies, and a modern POSIX interface would be compatible with the CLI world, but legacy-posix would be its own world, without access to many of the other proposals, but with better POSIX compatibility.

Dan Gohman (Mar 14 2024 at 16:51):

The ecosystem we're building doesn't stop at world boundaries. When you build an app for the wasi-cli world, maybe you use wasi-virt to wrap it up with a virtual filesystem and other stuff so that it runs in some other world. Maybe your CLI app pulls in some library offering a non-CLI API, and maybe it's implemented in another component written in another language which happens to use GC.

Tarek Sander (Mar 14 2024 at 17:05):

As long as no GC references are passed to the component that wants to fork, everything should be fine. There's component-level isolation for a reason. The legacy-posix world would include no interface that uses GC types or other fancy WASM-specific features. The intended use case is to spawn a component that needs more POSIX features from e.g. a CLI component. It then writes its result in e.g. the filesystem and exits. E.g. you may want to execute a build tool that forks itself in WASM to build something.

Dan Gohman (Mar 14 2024 at 17:07):

But we'd still need some way to differentiate "I'm using this instance as a library and I expect it to clone all its state along with me" vs "I'm using this instance as a service and I'm expecting to be ok being shared by a cloned instance".

Dan Gohman (Mar 14 2024 at 17:09):

And if it happens to be using GC, then it just can't be used in "library" mode if the application does a fork

Dan Gohman (Mar 14 2024 at 17:10):

And if we do instance graph cloning, that's really inefficient, especially if we're often going to throw away the graph with an exec.

Dan Gohman (Mar 14 2024 at 17:10):

Fork is just a really special snowflake. It's a really really bad API.

Tarek Sander (Mar 14 2024 at 17:10):

Applications using the legacy-posix world would not use GC libraries. The whole purpose is to take vanilla posix C programs with no knowledge of WASI or WASM and be able to run them correctly.

Dan Gohman (Mar 14 2024 at 17:11):

If we did that, we would be creating a parallel ecosystem.

Tarek Sander (Mar 14 2024 at 17:11):

Dan Gohman said:

> Fork is just a really special snowflake. It's a really really bad API.

I hate COBOL as a language, but it's still used and still needs compatibility.

Dan Gohman (Mar 14 2024 at 17:12):

This gets to why I describe fork as a "design philosophy" more than a function. It's possible to implement Cobol. People have done it, and it works fine. It's possible to do a lot of things on Wasm. But fork is something that you can't implement in Wasm. It needs to be magic.

Tarek Sander (Mar 14 2024 at 17:15):

I think we talked enough about fork specifically now.


Last updated: Oct 23 2024 at 20:03 UTC