Isolation and IPC · wasi-gfx · Zulip Chat Archive

Mendy Berger (Mar 26 2024 at 20:34):

Just spoke to the primary maintainer of wgpu (Connor Fitzgerald). He confirmed @Sean Isom's nightmare: wgpu doesn't isolate GPU resources from each other.
Now the question is whether the component model's isolation is enough.
(cc @Dan Gohman)

Sean Isom (Mar 26 2024 at 21:08):

Maybe this becomes the host runtime's problem (in this case, the wrapper around WebGPU). It would be difficult to implement runtime behavior in terms of the standard that is based on WIT, I would say?

Dan Gohman (Mar 26 2024 at 21:41):

Sean Isom (Mar 26 2024 at 21:44):

My (basic) understanding is that it is cross-origin resource isolation. It checks origin ownership before allowing the content process to communicate with the GPU process. Once IPC to the GPU process occurs, everything in the GPU process is untrusted (just runs native WGPU)

Sean Isom (Mar 26 2024 at 21:45):

Would be helpful for someone from Mozilla to validate that and maybe provide more detail

Dan Gohman (Mar 26 2024 at 22:27):

@Mendy Berger The component model itself only isolates Wasm code, so it doesn't help for isolating things happening on the GPU.

Dan Gohman (Mar 26 2024 at 23:09):

So it sounds like, in addition to the question of whether we need a GPU-process architecture, we'll also need to figure out validation.

Sean Isom (Mar 26 2024 at 23:11):

Is there anything in WASI previously that architecturally required IPC that could be used as a starting point? I feel like validation is a cumbersome but easier problem to solve.

Dan Gohman (Mar 26 2024 at 23:13):

Not "architecturally", per se. In theory if someone wanted extra defense-in-depth they might want to run wasi-filesystem isolated in its own process, but I'm not aware of anyone having done that yet.

Dan Gohman (Mar 26 2024 at 23:15):

There have been people investigating mapping WIT interfaces to IPC/RPC protocols, though at this time I don't know of anything out-of-the-box that we could use.

Sean Isom (Mar 27 2024 at 03:34):

I’ll drop Thomas Steiner a note and see if he has thoughts from the Chrome side. I don’t know the right point of contact at Mozilla

Sean Isom (Mar 27 2024 at 03:36):

I’m inclined to think validation is more important for an MVP than process sandboxing, as I think that is as much for host process stability as security, but I’m not an SME in browser design

Tarek Sander (Mar 27 2024 at 05:41):

I asked a similar thing on the wgpu repo, but nobody answered. I did find this though. So it seems process isolation is optional, as long as you stick to the safe wgpu APIs. I don't know about wgpu-core though.

Suitable for use with untrusted code? · gfx-rs wgpu · Discussion #2792

Hi there! 👋 I've been thinking about a use case in which a native host application creates a render target, and then exposes a rendering API to untrusted code running in a Wasm VM. However, I could...

Tarek Sander (Mar 27 2024 at 05:43):

Naga does shader validation already, and inserts bounds checks and things if needed, so that should be reasonably safe on the GPU side, too.

Bailey Hayes (Mar 27 2024 at 19:07):

GitHub - rvolosatovs/wrpc: Wasm component-native RPC framework

Wasm component-native RPC framework. Contribute to rvolosatovs/wrpc development by creating an account on GitHub.

Mendy Berger (Mar 27 2024 at 20:29):

I'm more than a little out of my depth here, and this is about security, so don't trust me more than any self-important rando...

IIUC wgpu does do shader code bounds checking, so we don't have to do any validation.
They're just not doing resource checking, i.e. checking if a resource belongs to a specific caller before accessing it. But I think the runtimes are expected to do that, aren't they?

Tarek Sander (Mar 27 2024 at 20:46):

I think not checking the caller is natural for wgpu: There's only one caller, the program that uses wgpu as a library. I looked and yes, it seems IDs are shared among all instances for a given backend, though the ID is an index into a vector (or something like it) of resources, so at least the type would be right. So the only thing to secure that would be to make the IDs opaque and inaccessible to WASM: Keep them in the host resource for each WebGPU object, and when calling wgpu just extract the IDs. The WASI implementation then assigns its own IDs for the "WASI" backend resources.
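The idea above (keep backend IDs on the host and hand the guest only opaque handles) can be sketched roughly as follows. This is a minimal illustration, not how wasmtime or wgpu actually implement it: `BackendId`, `GuestId`, and `ResourceTable` are hypothetical names, and a real backend ID carries more than a single integer.

```rust
use std::collections::HashMap;

/// Stand-in for a backend resource ID (hypothetical; real wgpu-core IDs are richer).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BackendId(pub u64);

/// Identifies one guest component instance (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct GuestId(pub u32);

/// Host-side table mapping (guest, opaque handle) to the backend ID.
/// The guest only ever sees the `u32` handle, never the backend ID.
#[derive(Default)]
pub struct ResourceTable {
    next_handle: u32,
    entries: HashMap<(GuestId, u32), BackendId>,
}

impl ResourceTable {
    /// Registers a backend ID for `guest` and returns the handle the guest sees.
    pub fn insert(&mut self, guest: GuestId, id: BackendId) -> u32 {
        let handle = self.next_handle;
        self.next_handle += 1;
        self.entries.insert((guest, handle), id);
        handle
    }

    /// Resolves a handle; returns `None` if this guest does not own it.
    pub fn resolve(&self, guest: GuestId, handle: u32) -> Option<BackendId> {
        self.entries.get(&(guest, handle)).copied()
    }
}
```

Because the lookup key includes the guest, a handle guessed or leaked by another guest resolves to `None` rather than to someone else's resource.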

Mendy Berger (Mar 27 2024 at 20:48):

@Tarek Sander exactly! This is why I don't think this is a real problem, the wasm module doesn't have access to the underlying wgpu IDs.

Tarek Sander (Mar 27 2024 at 20:49):

This would mean the Rust implementation of the WASI API would sit at the wgpu-hal level, which works with graphics API types instead of IDs.

Tarek Sander (Mar 27 2024 at 20:54):

Additional process isolation would be good, but I think for most use cases being less paranoid about security than browsers is OK. There could be a vulnerability if the graphics API implementation that wgpu uses in the host has a memory read/write bug: an attacker could use that to run custom assembly or modify arguments to get access to other components' GPU resources. But seeing as these APIs are quite well tested and wgpu is used by Firefox, I don't think that's a big issue.

Mendy Berger (Mar 27 2024 at 20:55):

@Tarek Sander what's the problem with sharing IDs if they're opaque to the wasm guest?

Tarek Sander (Mar 27 2024 at 20:58):

The thing about wgpu-core IDs is that they can be explicitly set on resource creation by the client. I don't know whether an already-used ID is handled; I don't think so. So the easiest option is to go to the wgpu-hal level, where there are no IDs, and let wgpu-core on top assign IDs like it wants to.

Mendy Berger (Mar 27 2024 at 21:00):

Not totally following, are you saying that we should build directly on top of wgpu-hal?

Tarek Sander (Mar 27 2024 at 21:09):

Mendy Berger (Mar 27 2024 at 21:10):

Tarek Sander (Mar 27 2024 at 21:16):

Yeah, that level also has to be the same as the WebGPU bindings in wgpu, since WebGPU has no mention of IDs either.

Mendy Berger (Mar 27 2024 at 21:17):

Right.
So I think that's what we're doing already, we don't expose any IDs to the wasm guest, only resources/types.

Tarek Sander (Mar 27 2024 at 21:20):

Then I think we should be fine isolation-wise. Configuring a sandbox where you can still access the GPU is a nightmare, at least when I tried doing it on Android. Each driver implementation can use different system calls with different ioctls and device file paths, which is hard to manage with something like seccomp filters. And the host API to WASM was supposed to be the replacement for sandboxing anyway: if the API is secure, you don't need an additional sandbox.

Mendy Berger (Apr 05 2024 at 11:06):

Hey y'all, @Deian Stefan - a security researcher - will likely be able to provide feedback here.

Mendy Berger (Apr 18 2024 at 01:51):

Level 1 security:
On a very basic level, the shader bounds check and the wasmtime resource isolation should be enough.

Level 2 security:
However, GPU driver code is often unsafe and buggy, so for any real-world use case we'd wanna isolate the GPU into its own process. Just like browsers do.

Level 3 security:
Once we have process isolation, we should consider even further restrictions with seccomp.

Level 4 security:
Browsers have runtime checks that they do on the GPU code. We should see if we can steal or replicate that.

Deian also brought up the point that GPUs often don't do a good job isolating code, especially the older ones, and there's not much we can do about that. It's a risk browsers have to accept as well.

Tarek Sander (Apr 18 2024 at 06:39):

The APIs wgpu is built upon have extensions for robust buffer access, which ensure the driver handles out-of-bounds shader accesses: Vulkan, GL. The GL extension is also used to secure WebGL in browsers AFAIK. Depending on security paranoia, you could refuse to run WebGPU code without these extensions.

Process sandboxes are tricky if you really want to drop as many privileges as you can. E.g. Chromium on Linux has a setuid chromium-sandbox binary, because you need root to change the process's namespaces, user and filesystem root. I wanted to build a sandbox library for Rust before, but I just couldn't find any information on how to do process sandboxing on Windows. I guess replicating/using the Chromium or Firefox sandbox is the best bet.

Seccomp is only available on Linux and Android, I don't know if other platforms have a way to restrict system calls.
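The suggestion above of refusing to run WebGPU code without robust buffer access could look something like the following policy check. It is a sketch under assumptions: the two extension names are my guess at what the Vulkan and GL links refer to, and `Api`, `Policy`, and `may_run_guest_shaders` are hypothetical names, not part of wgpu or wasmtime.

```rust
/// Extension names assumed for illustration; check them against the linked docs.
const VULKAN_ROBUSTNESS_EXT: &str = "VK_EXT_robustness2";
const GL_ROBUSTNESS_EXT: &str = "GL_KHR_robust_buffer_access_behavior";

/// Graphics API the host backend is running on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Api {
    Vulkan,
    Gl,
}

/// How strict the host wants to be (hypothetical policy knob).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    RequireRobustness,
    AllowWithout,
}

/// Decides whether guest shaders may run, given the extensions the driver reports.
pub fn may_run_guest_shaders(api: Api, extensions: &[&str], policy: Policy) -> bool {
    let needed = match api {
        Api::Vulkan => VULKAN_ROBUSTNESS_EXT,
        Api::Gl => GL_ROBUSTNESS_EXT,
    };
    let has_robustness = extensions.iter().any(|ext| *ext == needed);
    has_robustness || policy == Policy::AllowWithout
}
```

A paranoid host would pass `Policy::RequireRobustness` and fail device creation when the function returns `false`.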

Tarek Sander (Apr 18 2024 at 06:42):

An --unsafe-webgpu flag, like browsers had in the early days of WebGPU implementations, is also an option until all the sandboxing is implemented. If you're only running your own code in WASM and just use it for easy cross-platform deployment, sandboxing is not an issue.

Ralph (Apr 18 2024 at 09:50):

just ask! I'll find the windows person who knows and get you the answers if you want them.

Tarek Sander (Apr 18 2024 at 10:35):

@Ralph I guess my question is about the permission system in Windows in general: How are the permissions of a process determined? How granularly can you change them? The only documentation I can find is about high-level stuff like group policies or the Windows Sandbox feature that just runs Windows inside a VM. The Chromium page gives a good overview, but doesn't have many links to the Microsoft documentation (if any exists).

Ralph (Apr 18 2024 at 10:37):

Process Security and Access Rights - Win32 apps

The Microsoft Windows security model enables you to control access to process objects. For more information about security, see Access-Control Model.

Ralph (Apr 18 2024 at 10:38):

the windows lead I know lives in EDT, so we can chat with him about 3 pm or so CET

Tarek Sander (Apr 18 2024 at 10:43):

That link looks promising; it has links to the security architecture of Windows in general. Detailed information would be great! Last I checked there was no Rust process sandboxing library that supported Windows, so maybe the sandboxing solution we'll need could also benefit other projects as a library.

Ralph (Apr 18 2024 at 10:44):

Tarek Sander (Apr 18 2024 at 15:28):

I browsed the sandbox tag on crates.io again now and yep, none of the crates seem to support Windows.

crates.io: Rust Package Registry

Tarek Sander (Apr 18 2024 at 15:32):

Would y'all have a preference for the data format for IPC between the processes? If not, I'd make a flexbuffers crate with strict bounds checks. Validation sadly exists only in the C++ lib, and even validation doesn't help if we use a shared memory buffer, where the compromised side could change the offsets at any time.
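The shared-memory concern above (offsets changing after they were validated) is usually handled by copying the message into private memory first and only then validating it. A minimal sketch, assuming a made-up wire layout of `[offset: u32 LE][len: u32 LE][payload...]`; `read_payload` and `DecodeError` are hypothetical names, and this is not flexbuffers or flatbuffers.

```rust
/// Errors from decoding an untrusted message.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    OutOfBounds,
}

/// Hypothetical layout: [offset: u32 LE][len: u32 LE][payload bytes...],
/// where `offset` is relative to the start of the message.
pub fn read_payload(shared: &[u8]) -> Result<Vec<u8>, DecodeError> {
    // Snapshot first, so later changes by the other side cannot affect
    // the values we validate. In a real shared-memory mapping this copy
    // would need volatile or atomic reads; a plain slice keeps the sketch simple.
    let local: Vec<u8> = shared.to_vec();

    let header = local.get(..8).ok_or(DecodeError::Truncated)?;
    let offset = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let len = u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;

    // The payload must not overlap the header.
    if offset < 8 {
        return Err(DecodeError::OutOfBounds);
    }
    // Checked arithmetic: a huge offset + len must not wrap around.
    let end = offset.checked_add(len).ok_or(DecodeError::OutOfBounds)?;

    // `get` returns None instead of panicking when the range is out of bounds.
    local
        .get(offset..end)
        .map(|payload| payload.to_vec())
        .ok_or(DecodeError::OutOfBounds)
}
```

All bounds checks run against the private copy, so a compromised peer can at worst make decoding fail.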

Dan Gohman (Apr 18 2024 at 15:40):

Is there a reason for flexbuffers instead of flatbuffers? I imagine you'd be generating the IPC code from the Wit definitions, so you'll know all the types of everything up front.

Tarek Sander (Apr 18 2024 at 16:04):

I'd like the sandbox library to be independent of WASI (so there is finally one that supports Windows in the ecosystem), and I don't think flatbuffers would work with serde, as there's no schema with that. I could make a flatbuffers compiler that uses strict bounds checks, but integrating with serde is a better goal than support for flatbuffer schemas IMO.

Tarek Sander (Apr 18 2024 at 16:07):

Which is why it surprises me that there is no sandbox library with support for Windows. Maybe I'm just bad at searching, but I haven't found one, y'all can look too if you want, maybe you're better at searching than me.

Tarek Sander (Apr 18 2024 at 16:09):

I'm using "y'all" as a non-notifying replacement for something like @all by the way, I don't know of a better term.

Dan Gohman (Apr 18 2024 at 16:10):

Dan Gohman (Apr 18 2024 at 16:25):

I think if you're using flatbuffers, then you just wouldn't use serde. Remoting a WebGPU API doesn't depend on serializing arbitrary user types, or serializing to arbitrary formats, so the main strengths of serde wouldn't apply.
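To illustrate the point that the message types are all known up front, here is a rough sketch of a closed set of remoted calls with a hand-written encoding and no serde. The `GpuCall` enum, its two variants, and the byte layout are all hypothetical; a real implementation would presumably be generated from the WIT definitions and might well use flatbuffers instead of this ad hoc format.

```rust
use std::convert::TryInto;

/// Hypothetical subset of calls remoted to the GPU process.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GpuCall {
    CreateBuffer { size: u64, usage: u32 },
    DestroyBuffer { handle: u32 },
}

impl GpuCall {
    /// Encodes as one tag byte followed by fixed-width little-endian fields.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        match self {
            GpuCall::CreateBuffer { size, usage } => {
                out.push(0);
                out.extend_from_slice(&size.to_le_bytes());
                out.extend_from_slice(&usage.to_le_bytes());
            }
            GpuCall::DestroyBuffer { handle } => {
                out.push(1);
                out.extend_from_slice(&handle.to_le_bytes());
            }
        }
        out
    }

    /// Decodes a message; unknown tags and wrong lengths are rejected.
    pub fn decode(bytes: &[u8]) -> Option<GpuCall> {
        let (&tag, rest) = bytes.split_first()?;
        match (tag, rest.len()) {
            (0, 12) => Some(GpuCall::CreateBuffer {
                size: u64::from_le_bytes(rest[..8].try_into().ok()?),
                usage: u32::from_le_bytes(rest[8..].try_into().ok()?),
            }),
            (1, 4) => Some(GpuCall::DestroyBuffer {
                handle: u32::from_le_bytes(rest.try_into().ok()?),
            }),
            _ => None,
        }
    }
}
```

Since every variant has a fixed size, the decoder can check the exact length before reading any field.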

Dan Gohman (Apr 18 2024 at 16:27):

It sounds like there are multiple possible goals here. If you're interested in building a Windows sandboxing library, you're certainly welcome to build that. That said, I don't think that's a critical path for wasi-gfx per se.

Ralph (Apr 18 2024 at 16:29):

I've asked a Windows distinguished architect to point me to where the logical solutions and information about this are, per this conversation. Let's see what he says regarding Windows.

Sean Isom (Apr 18 2024 at 16:30):

I agree with that Dan. I certainly see the argument and value of process sandboxing like in a web browser, and how the wasi-gfx standard should allow for that use case, but I don’t think that needs to be a requirement or part of the core

Tarek Sander (Apr 18 2024 at 16:32):

I agree that library doesn't belong in this subgroup specifically, but hopefully that could be adopted as an optional layer of security in wasmtime or the wasi implementation, because such an undertaking should not be done by just one person anyways. With more WASI proposals, process sandboxing may be necessary for more proposals than this for adequate security. Should I ask in #general about thoughts for a sandboxing library/layer?

Sean Isom (Apr 18 2024 at 16:32):

There is already a lot of implicit trust in host-side code in the wasi ecosystem (although to Deian’s point, GPU drivers may be buggier than average), so it feels like that decision should be left up to the implementer

Sean Isom (Apr 18 2024 at 16:33):

Yeah, that makes sense to me. I see this being more generic of a problem than just for gpu processes

Tarek Sander (Apr 18 2024 at 16:38):

So for now if this proposal lands in wasmtime, at least the webgpu implementation (the raw framebuffer and window access would still be useful, there are CPU drawing libraries aplenty) should be gated behind a disabled-by-default enable_unsafe_webgpu config field. The host can enable it, but it signifies that the code is trusted, at least to the extent of running it on the GPU.
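A disabled-by-default gate like the one proposed above might be sketched as follows. Only the field name `enable_unsafe_webgpu` comes from the discussion; `GfxConfig`, `Interface`, `LinkError`, and `check_interface` are hypothetical names and not part of wasmtime's actual configuration API.

```rust
/// Hypothetical host configuration for the graphics interfaces.
#[derive(Debug, Clone, Default)]
pub struct GfxConfig {
    /// Off by default. Enabling it signals that guest code is trusted,
    /// at least to the extent of running it on the GPU.
    pub enable_unsafe_webgpu: bool,
}

/// Interfaces a guest might import (hypothetical grouping).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Interface {
    FrameBuffer,
    Window,
    WebGpu,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LinkError {
    WebGpuDisabled,
}

/// Frame buffer and window access are always allowed; WebGPU needs the opt-in.
pub fn check_interface(cfg: &GfxConfig, iface: Interface) -> Result<(), LinkError> {
    match iface {
        Interface::WebGpu if !cfg.enable_unsafe_webgpu => Err(LinkError::WebGpuDisabled),
        _ => Ok(()),
    }
}
```

With `GfxConfig::default()`, a guest importing WebGPU is rejected at link time, while CPU drawing through the frame buffer keeps working.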

Mendy Berger (Apr 18 2024 at 18:30):

Definitely should be up to the implementer and not part of the actual spec.
I think Deian’s point was that most implementations that run untrusted code will want it, so we should have it in mind when designing the spec.

Ralph (Apr 18 2024 at 20:24):

windows-rs/crates/libs/sys at master · microsoft/windows-rs

Rust for Windows. Contribute to microsoft/windows-rs development by creating an account on GitHub.

Ralph (Apr 18 2024 at 20:33):

@Tarek Sander the Windows team is interested in finding you what you need here. They think they've got all Win32 SDK APIs in the windows-sys crate, but if there's something specific you're expecting to find and they can make it easier, let me know.

Ralph (Apr 18 2024 at 20:42):

Windows doesn't exactly do seccomp. You can do things like it by restricting privileges, creating a process with a limited token, etc...

SetProcessMitigationPolicy function (processthreadsapi.h) - Win32 apps

Sets a mitigation policy for the calling process. Mitigation policies enable a process to harden itself against various types of attacks.

Ralph (Apr 18 2024 at 20:43):

again, if you have specific questions, lemme know and I'll find the correct resource

Tarek Sander (Apr 18 2024 at 22:46):

I was asking more about the process tokens themselves, and generally access control of resources, essentially what I'd need to implement what's described for the chromium sandbox.

Stream: wasi-gfx

Topic: Isolation and IPC

Mendy Berger (Mar 26 2024 at 20:34):
