Stream: wasi-gfx

Topic: Isolation and IPC


view this post on Zulip Mendy Berger (Mar 26 2024 at 20:34):

Just spoke to the primary maintainer of wgpu (Connor Fitzgerald). He confirmed @Sean Isom's nightmare: wgpu doesn't isolate between GPU resources.
Now the question is whether the component model's isolation is enough.
(cc @Dan Gohman )

view this post on Zulip Sean Isom (Mar 26 2024 at 21:08):

Maybe this becomes the host runtime's problem (in this case, the wrapper around WebGPU). It would be difficult to implement runtime behavior in terms of the standard that is based on WIT, I would say?

view this post on Zulip Dan Gohman (Mar 26 2024 at 21:41):

When wgpu is used in Firefox, how do they keep GPU resources isolated?

view this post on Zulip Sean Isom (Mar 26 2024 at 21:44):

My (basic) understanding is that it is cross-origin resource isolation. It checks origin ownership before allowing the content process to communicate with the GPU process. Once IPC to the GPU process occurs, everything in the GPU process is untrusted (just runs native WGPU)

view this post on Zulip Sean Isom (Mar 26 2024 at 21:45):

Would be helpful for someone from Mozilla to validate that and maybe provide more detail

view this post on Zulip Dan Gohman (Mar 26 2024 at 22:27):

@Mendy Berger The component model itself only isolates Wasm code, so it doesn't help for isolating things happening on the GPU.

view this post on Zulip Dan Gohman (Mar 26 2024 at 23:09):

So it sounds like, in addition to the question of whether we need a GPU-process architecture, we'll also need to figure out validation.

view this post on Zulip Sean Isom (Mar 26 2024 at 23:11):

Is there anything in WASI previously that architecturally required IPC that could be used as a starting point? I feel like validation is a cumbersome but easier problem to solve.

view this post on Zulip Dan Gohman (Mar 26 2024 at 23:13):

Not "architecturally", per se. In theory if someone wanted extra defense-in-depth they might want to run wasi-filesystem isolated in its own process, but I'm not aware of anyone having done that yet.

view this post on Zulip Dan Gohman (Mar 26 2024 at 23:15):

There have been people investigating mapping WIT interfaces to IPC/RPC protocols, though at this time I don't know of anything out-of-the-box that we could use.

view this post on Zulip Sean Isom (Mar 27 2024 at 03:34):

I’ll drop Thomas Steiner a note and see if he has thoughts from the Chrome side. I don’t know the right point of contact at Mozilla

view this post on Zulip Sean Isom (Mar 27 2024 at 03:36):

I’m inclined to think validation is more important for an MVP than process sandboxing, as I think that is as much for host process stability as security, but I’m not an SME in browser design

view this post on Zulip Tarek Sander (Mar 27 2024 at 05:41):

I asked a similar thing on the wgpu repo, but nobody answered. I did find this though. So it seems process isolation is optional, as long as you'd stick to the safe wgpu APIs. I don't know about wgpu-core though.

Hi there! 👋 I've been thinking about a use case in which a native host application creates a render target, and then exposes a rendering API to untrusted code running in a Wasm VM. However, I could...

view this post on Zulip Tarek Sander (Mar 27 2024 at 05:43):

Naga does shader validation already, and inserts bounds checks and things if needed, so that should be reasonably safe on the GPU side, too.

view this post on Zulip Bailey Hayes (Mar 27 2024 at 19:07):

Dan Gohman said:

There have been people investigating mapping WIT interfaces to IPC/RPC protocols, though at this time I don't know of anything out-of-the-box that we could use.

https://github.com/rvolosatovs/wrpc/

Wasm component-native RPC framework.

view this post on Zulip Mendy Berger (Mar 27 2024 at 20:29):

I'm more than a little out of my depth here, and this is about security, so don't trust me more than any self-important rando...

IIUC wgpu does do shader code bounds checking, so we don't have to do any validation.
They're just not doing resource checking, i.e. checking if a resource belongs to a specific caller before accessing it. But I think the runtimes are expected to do that, aren't they?

Again, don't trust a word I just said!

view this post on Zulip Tarek Sander (Mar 27 2024 at 20:46):

I think not checking the caller is natural for wgpu: There's only one caller, the program that uses wgpu as a library. I looked and yes, it seems IDs are shared among all instances for a given backend, though the ID is an index into a vector (or something like it) of resources, so at least the type would be right. So the only thing to secure that would be to make the IDs opaque and inaccessible to WASM: Keep them in the host resource for each WebGPU object, and when calling wgpu just extract the IDs. The WASI implementation then assigns its own IDs for the "WASI" backend resources.
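
A minimal sketch of the "keep the IDs in the host resource" idea described above, assuming one table per guest component instance. The names `BackendBufferId`, `GuestHandle`, and `GuestResourceTable` are hypothetical and only illustrate the shape; they are not wgpu-core or wasmtime types.

```rust
use std::collections::HashMap;

/// Stand-in for a backend (wgpu-core style) buffer ID. Hypothetical type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BackendBufferId(pub u64);

/// The only value the guest ever sees: an index into its own table.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct GuestHandle(u32);

/// Host-owned table, one per guest component instance. Hypothetical type.
#[derive(Default)]
pub struct GuestResourceTable {
    next: u32,
    entries: HashMap<GuestHandle, BackendBufferId>,
}

impl GuestResourceTable {
    /// Store a backend ID and return an opaque handle for the guest.
    pub fn insert(&mut self, id: BackendBufferId) -> GuestHandle {
        let handle = GuestHandle(self.next);
        self.next += 1; // a real table would handle overflow and slot reuse
        self.entries.insert(handle, id);
        handle
    }

    /// Resolve a handle. Lookups only ever reach entries of this table,
    /// so a guest cannot name a backend ID owned by another component.
    pub fn get(&self, handle: GuestHandle) -> Option<BackendBufferId> {
        self.entries.get(&handle).copied()
    }

    /// Drop a resource; later lookups of the handle return None.
    pub fn remove(&mut self, handle: GuestHandle) -> Option<BackendBufferId> {
        self.entries.remove(&handle)
    }
}
```

With two tables, the same numeric handle resolves to a different backend ID in each, so guessing handle values never reaches another component's resources.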

view this post on Zulip Mendy Berger (Mar 27 2024 at 20:48):

@Tarek Sander exactly! This is why I don't think this is a real problem, the wasm module doesn't have access to the underlying wgpu IDs.

view this post on Zulip Tarek Sander (Mar 27 2024 at 20:49):

This would mean the Rust implementation of the WASI API would lie in wgpu-hal, that works with graphics API types instead of IDs.

view this post on Zulip Tarek Sander (Mar 27 2024 at 20:54):

Additional process isolation would be good, but I think for most use cases being less paranoid about security than browsers is OK. There could be a problem if the graphics API implementation that wgpu uses in the host has a memory read/write vulnerability: an attacker could use that to run custom assembly or modify arguments to get access to other components' GPU resources. But seeing as these APIs are quite well tested and wgpu is used by Firefox, I don't think that's a big issue.

view this post on Zulip Mendy Berger (Mar 27 2024 at 20:55):

@Tarek Sander what's the problem with sharing IDs if they're opaque to the wasm guest?

view this post on Zulip Tarek Sander (Mar 27 2024 at 20:58):

The thing about wgpu-core IDs is that they can be explicitly set on resource creation by the client, I don't know if an ID being already used is handled, I don't think so. So the easiest option is to go to the wgpu-hal level where there are no IDs and let wgpu-core on top assign IDs like it wants to.

view this post on Zulip Mendy Berger (Mar 27 2024 at 21:00):

Not totally following, are you saying that we should build directly on top of wgpu-hal?

view this post on Zulip Tarek Sander (Mar 27 2024 at 21:09):

No, I say we should provide a wgpu HAL.

view this post on Zulip Mendy Berger (Mar 27 2024 at 21:10):

You mean 'hal-like', right?

view this post on Zulip Tarek Sander (Mar 27 2024 at 21:16):

Yeah, that level also has to be the same as the WebGPU bindings in wgpu, since WebGPU has no mention of IDs either.

view this post on Zulip Mendy Berger (Mar 27 2024 at 21:17):

Right.
So I think that's what we're doing already, we don't expose any IDs to the wasm guest, only resources/types.

view this post on Zulip Tarek Sander (Mar 27 2024 at 21:20):

Then I think we should be fine isolation-wise. Configuring a sandbox where you can still access the GPU is a nightmare, at least when I tried doing it on Android. Each driver implementation can use different system calls with different ioctls and device file paths, it's not good to manage with something like seccomp filters. And the host-API to WASM was supposed to be the replacement to sandboxing anyways, if the API is secure, you don't need an additional sandbox.

view this post on Zulip Mendy Berger (Apr 05 2024 at 11:06):

Hey y'all, @Deian Stefan - a security researcher - will likely be able to provide feedback here.

view this post on Zulip Mendy Berger (Apr 18 2024 at 01:51):

Just had a conversation with @Deian Stefan, here's what I learned:

Level 1 security:
On a very basic level, the shader bounds check and the wasmtime resource isolation should be enough.

Level 2 security:
However, gpu driver code is often unsafe and buggy, so for any real world use case we'd wanna isolate the gpu into its own process. Just like browsers do.

Level 3 security:
Once we have process isolation, we should consider even further restrictions with seccomp.

Level 4 security:
Browsers have runtime checks that they do on the gpu code. We should see if we can steal or replicate that.

He also told me to ask Tal Garfinkel for his opinion on the matter.

Helpful link from Chromium team: https://chromium.googlesource.com/chromium/src/+/main/docs/security/research/graphics/webgpu_technical_report.md

He also brought up the point that GPUs often don't do a good job isolating code, especially the older ones and there's not much we can do about that. It's a risk browsers have to accept as well.

@Deian Stefan hope I didn't butcher any of your points.

view this post on Zulip Tarek Sander (Apr 18 2024 at 06:39):

The APIs wgpu is built upon have robust buffer access extensions, which ensure the driver handles out-of-bounds shader accesses: Vulkan, GL. The GL extension is also used to secure WebGL in browsers AFAIK. Depending on security paranoia, you could refuse running webgpu code without these extensions.

Process sandboxes are tricky if you really want to drop as many privileges as you can. E.g. Chromium on Linux has a setuid chromium-sandbox binary, because you need root to change the process's namespaces, user and filesystem root. I wanted to build a sandbox library for Rust before, but I just couldn't find any information on how to do process sandboxing on Windows. I guess replicating/using the Chromium or Firefox sandbox is the best bet.

Seccomp is only available on Linux and Android, I don't know if other platforms have a way to restrict system calls.

view this post on Zulip Tarek Sander (Apr 18 2024 at 06:42):

An --unsafe-webgpu flag like browsers had in the early days of WebGPU implementations is also an option until all the sandboxing is implemented. If you're just running your own code in WASM and just use it for easy cross-platform deployment, sandboxing is not an issue.

view this post on Zulip Ralph (Apr 18 2024 at 09:50):

Tarek Sander said:

The APIs wgpu is built upon have robust buffer access extensions, which ensure the driver handles out-of-bounds shader accesses: Vulkan, GL. The GL extension is also used to secure WebGL in browsers AFAIK. Depending on security paranoia, you could refuse running webgpu code without these extensions.

Process sandboxes are tricky if you really want to drop as many privileges as you can. E.g. Chromium on Linux has a setuid chromium-sandbox binary, because you need root to change the process's namespaces, user and filesystem root. I wanted to build a sandbox library for Rust before, but I just couldn't find any information on how to do process sandboxing on Windows. I guess replicating/using the Chromium or Firefox sandbox is the best bet.

Seccomp is only available on Linux and Android, I don't know if other platforms have a way to restrict system calls.

just ask! I'll find the windows person who knows and get you the answers if you want them.

view this post on Zulip Tarek Sander (Apr 18 2024 at 10:35):

@Ralph I guess my question is about the permission system in Windows in general: How are the permissions of a process determined? How granularly can you change them? The only documentation I can find is about the high level stuff like group policies or the Windows sandbox feature that just runs Windows inside a VM. The Chromium page gives a good overview, but doesn't have many links to the Microsoft documentation (if any exists).

view this post on Zulip Ralph (Apr 18 2024 at 10:37):

https://learn.microsoft.com/en-us/windows/win32/procthread/process-security-and-access-rights is one place

The Microsoft Windows security model enables you to control access to process objects. For more information about security, see Access-Control Model.

view this post on Zulip Ralph (Apr 18 2024 at 10:38):

however, I'm not yet sure that's what you're looking for.

view this post on Zulip Ralph (Apr 18 2024 at 10:38):

the windows lead I know lives in EDT, so we can chat with him about 3 pm or so CET

view this post on Zulip Tarek Sander (Apr 18 2024 at 10:43):

That link looks promising, it has links to the security architecture of Windows in general. Detailed information would be great! Last I checked there was no Rust process sandboxing library that supported Windows, so maybe the sandboxing solution we'll need could also benefit other projects as a library.

view this post on Zulip Ralph (Apr 18 2024 at 10:44):

yeah, the windows peeps would know: they're all about rust these days

view this post on Zulip Ralph (Apr 18 2024 at 10:44):

which, good

view this post on Zulip Tarek Sander (Apr 18 2024 at 15:28):

I browsed the sandbox tag on crates.io again now and yep, none of the crates seem to support Windows.

view this post on Zulip Tarek Sander (Apr 18 2024 at 15:32):

Would y'all have a preference for the data format for IPC between the processes? If not I'd make a flexbuffers crate with strict bounds checks, the validation is sadly only in the C++ lib and even validation doesn't help in case we use a shared memory buffer, where the compromised side could change the offsets anytime.
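
A minimal sketch of the shared-memory concern raised above: if the peer can rewrite offsets after they have been validated, one option is to copy the message into private memory once and run every bounds-checked read against that copy. The message layout here is a hypothetical toy format, not flexbuffers, and `shared` is an ordinary slice standing in for the mapped region; real shared memory would need a raw-pointer copy rather than a Rust reference.

```rust
/// Errors for the hypothetical toy format below.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Truncated,
    OutOfBounds,
}

/// Read a little-endian u32 at `at`, returning an error instead of panicking.
fn read_u32(buf: &[u8], at: usize) -> Result<u32, ParseError> {
    let end = at.checked_add(4).ok_or(ParseError::Truncated)?;
    let bytes = buf.get(at..end).ok_or(ParseError::Truncated)?;
    Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Toy layout (assumption): bytes 0..4 = payload offset, bytes 4..8 = payload
/// length, both u32 little-endian, followed by the payload itself.
pub fn read_payload(shared: &[u8]) -> Result<Vec<u8>, ParseError> {
    // Snapshot first: all checks and all uses see the same private bytes,
    // so the other side cannot change an offset between check and use.
    let snapshot: Vec<u8> = shared.to_vec();
    let offset = read_u32(&snapshot, 0)? as usize;
    let len = read_u32(&snapshot, 4)? as usize;
    let end = offset.checked_add(len).ok_or(ParseError::OutOfBounds)?;
    snapshot
        .get(offset..end)
        .map(|payload| payload.to_vec())
        .ok_or(ParseError::OutOfBounds)
}
```

The copy costs one extra pass over the message, which is the usual trade against validating in place on memory the other process can still write to.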

view this post on Zulip Dan Gohman (Apr 18 2024 at 15:40):

Is there a reason for flexbuffers instead of flatbuffers? I imagine you'd be generating the IPC code from the Wit definitions, so you'll know all the types of everything up front.

view this post on Zulip Tarek Sander (Apr 18 2024 at 16:04):

I'd like the sandbox library to be independent of WASI (so there is finally one that supports Windows in the ecosystem), and I don't think flatbuffers would work with serde, as there's no schema with that. I could make a flatbuffers compiler that uses strict bounds checks, but integrating with serde is a better goal than support for flatbuffer schemas IMO.

view this post on Zulip Tarek Sander (Apr 18 2024 at 16:07):

Ralph said:

yeah, the windows peeps would know: they're all about rust these days

Which is why it surprises me that there is no sandbox library with support for Windows. Maybe I'm just bad at searching, but I haven't found one, y'all can look too if you want, maybe you're better at searching than me.

view this post on Zulip Tarek Sander (Apr 18 2024 at 16:09):

I'm using "y'all" as a non-notifying replacement for something like @all by the way, I don't know of a better term.

view this post on Zulip Dan Gohman (Apr 18 2024 at 16:10):

I have also looked in the past, and not found any.

view this post on Zulip Dan Gohman (Apr 18 2024 at 16:25):

I think if you're using flatbuffers, then you just wouldn't use serde. Remoting a WebGPU API doesn't depend on serializing arbitrary user types, or serializing to arbitrary formats, so the main strengths of serde wouldn't apply.

view this post on Zulip Dan Gohman (Apr 18 2024 at 16:27):

It sounds like there are multiple possible goals here. If you're interested in building a Windows sandboxing library, you're certainly welcome to build that. That said, I don't think that's a critical path for wasi-gfx per se.

view this post on Zulip Ralph (Apr 18 2024 at 16:29):

I've asked a windows distinguished arch to point me to where the logical solutions/information about this are, per this conversation. Let's see what he says relative to Windows.

view this post on Zulip Sean Isom (Apr 18 2024 at 16:30):

I agree with that, Dan. I certainly see the argument and value of process sandboxing like in a web browser, and how the wasi-gfx standard should allow for that use case, but I don’t think that needs to be a requirement or part of the core

view this post on Zulip Tarek Sander (Apr 18 2024 at 16:32):

It'd be long before my library would be finished anyways.

Dan Gohman said:

It sounds like there are multiple possible goals here. If you're interested in building a Windows sandboxing library, you're certainly welcome to build that. That said, I don't think that's a critical path for wasi-gfx per se.

I agree that library doesn't belong in this subgroup specifically, but hopefully that could be adopted as an optional layer of security in wasmtime or the wasi implementation, because such an undertaking should not be done by just one person anyways. With more WASI proposals, process sandboxing may be necessary for more proposals than this for adequate security. Should I ask in #general about thoughts for a sandboxing library/layer?

view this post on Zulip Sean Isom (Apr 18 2024 at 16:32):

There is already a lot of implicit trust in host-side code in the wasi ecosystem (although to Deian’s point, GPU drivers may be buggier than average), so it feels like that decision should be left up to the implementer

view this post on Zulip Sean Isom (Apr 18 2024 at 16:33):

Yeah, that makes sense to me. I see this being more generic of a problem than just for gpu processes

view this post on Zulip Tarek Sander (Apr 18 2024 at 16:38):

So for now if this proposal lands in wasmtime, at least the webgpu implementation (the raw framebuffer and window access would still be useful, there are CPU drawing libraries aplenty) should be gated behind a disabled-by-default enable_unsafe_webgpu config field. The host can enable it, but it signifies that the code is trusted, at least to the extent of running it on the GPU.
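
A minimal sketch of what such a disabled-by-default gate could look like on the host side. The field name `enable_unsafe_webgpu` comes from the message above; `GfxConfig`, `exposed_interfaces`, and the interface names are hypothetical and not part of wasmtime.

```rust
/// Hypothetical host configuration for a wasi-gfx implementation.
/// `Default` yields `enable_unsafe_webgpu == false`.
#[derive(Debug, Clone, Default)]
pub struct GfxConfig {
    /// Off by default. Enabling it signals that the guest is trusted,
    /// at least to the extent of running its code on the GPU.
    pub enable_unsafe_webgpu: bool,
}

/// Decide which interfaces the host would offer to a guest.
/// Interface names are placeholders for illustration only.
pub fn exposed_interfaces(cfg: &GfxConfig) -> Vec<&'static str> {
    // Raw framebuffer and window access stay available for CPU drawing.
    let mut interfaces = vec!["frame-buffer", "surface"];
    if cfg.enable_unsafe_webgpu {
        interfaces.push("webgpu");
    }
    interfaces
}
```

A host that never touches the config gets only the framebuffer and window interfaces; opting in is an explicit `GfxConfig { enable_unsafe_webgpu: true }`.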

view this post on Zulip Mendy Berger (Apr 18 2024 at 18:30):

Definitely should be up to the implementer and not part of the actual spec.
I think Deian’s point was that most implementations that run untrusted code will want it, so we should have it in mind when designing the spec.

view this post on Zulip Ralph (Apr 18 2024 at 20:24):

Ralph said:

I've asked a windows distinguished arch to point me to where the logical solutions/information about this are, per this conversation. Let's see what he says relative to Windows.

https://github.com/microsoft/windows-rs/blob/master/crates/libs/sys is the answer, possibly, says the windows dude.

Rust for Windows.

view this post on Zulip Ralph (Apr 18 2024 at 20:33):

@Tarek Sander the windows team is interested in finding you what you need here. They think they've got all win32 sdk apis in the windows-sys crate, but they wanna know, if there's something specific you're expecting to find, and they can make it easier, let me know.

view this post on Zulip Ralph (Apr 18 2024 at 20:42):

they mention:

view this post on Zulip Ralph (Apr 18 2024 at 20:42):

Windows doesn't exactly do seccomp. You can do things like it by restricting privileges, creating a process with a limited token, etc...

And there is the SetProcessMitigationPolicy function (processthreadsapi.h) - Win32 apps | Microsoft Learn

Sets a mitigation policy for the calling process. Mitigation policies enable a process to harden itself against various types of attacks.

It's just not as granular or clean as Linux.

view this post on Zulip Ralph (Apr 18 2024 at 20:43):

again, if you have specific questions, lemme know and I'll find the correct resource

view this post on Zulip Tarek Sander (Apr 18 2024 at 22:46):

I was asking more about the process tokens themselves, and generally access control of resources, essentially what I'd need to implement what's described for the chromium sandbox.


Last updated: Dec 23 2024 at 12:05 UTC