Just spoke to the primary maintainer of wgpu (Connor Fitzgerald). He confirmed @Sean Isom's nightmare: wgpu doesn't isolate between GPU resources.
Now the question is if the component-model's isolation is enough.
(cc @Dan Gohman )
Maybe this becomes the host runtime's problem (in this case, the wrapper around WebGPU). It would be difficult to implement runtime behavior in terms of the standard that is based on WIT, I would say?
When wgpu is used in Firefox, how do they keep GPU resources isolated?
My (basic) understanding is that it is cross-origin resource isolation. It checks origin ownership before allowing the content process to communicate with the GPU process. Once IPC to the GPU process occurs, everything in the GPU process is untrusted (just runs native WGPU)
Would be helpful for someone from Mozilla to validate that and maybe provide more detail
@Mendy Berger The component model itself only isolates Wasm code, so it doesn't help for isolating things happening on the GPU.
So it sounds like, in addition to the question of whether we need a GPU-process architecture, we'll also need to figure out validation.
Is there anything in WASI previously that architecturally required IPC that could be used as a starting point? I feel like validation is a cumbersome but easier problem to solve.
Not "architecturally", per se. In theory if someone wanted extra defense-in-depth they might want to run wasi-filesystem isolated in its own process, but I'm not aware of anyone having done that yet.
There have been people investigating mapping WIT interfaces to IPC/RPC protocols, though at this time I don't know of anything out-of-the-box that we could use.
I’ll drop Thomas Steiner a note and see if he has thoughts from the Chrome side. I don’t know the right point of contact at Mozilla
I’m inclined to think validation is more important for an MVP than process sandboxing, as I think that is as much for host process stability as security, but I’m not an SME in browser design
I asked a similar thing on the wgpu repo, but nobody answered. I did find this though. So it seems process isolation is optional, as long as you'd stick to the safe wgpu APIs. I don't know about wgpu-core though.
Naga does shader validation already, and inserts bounds checks and things if needed, so that should be reasonably safe on the GPU side, too.
Dan Gohman said:
There have been people investigating mapping WIT interfaces to IPC/RPC protocols, though at this time I don't know of anything out-of-the-box that we could use.
https://github.com/rvolosatovs/wrpc/
I'm more than a little out of my depth here, and this is about security, so don't trust me more than any self-important rando...
IIUC wgpu does do shader code bounds checking, so we don't have to do any validation.
They're just not doing resource checking, i.e. checking if a resource belongs to a specific caller before accessing it. But I think the runtimes are expected to do that, aren't they?
Again, don't trust a word I just said!
I think not checking the caller is natural for wgpu: There's only one caller, the program that uses wgpu as a library. I looked and yes, it seems IDs are shared among all instances for a given backend, though the ID is an index into a vector (or something like it) of resources, so at least the type would be right. So the only thing to secure that would be to make the IDs opaque and inaccessible to WASM: Keep them in the host resource for each WebGPU object, and when calling wgpu just extract the IDs. The WASI implementation then assigns its own IDs for the "WASI" backend resources.
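A rough sketch of that idea, in case it helps make it concrete. Everything here is made up for illustration: `HostBufferId` stands in for whatever backend-wide ID wgpu-core uses, and `GuestTable` is a hypothetical per-guest table kept on the host side (as far as I understand, wasmtime's resource table plays this role in a real implementation). The guest only ever sees `GuestHandle` values, which mean nothing outside its own table.

```rust
use std::collections::HashMap;

/// Stand-in for a backend-wide host ID (hypothetical type, not a wgpu API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostBufferId(pub u64);

/// Opaque handle given to one guest; only meaningful in that guest's table.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct GuestHandle(u32);

/// One table per guest instance, owned by the host and never exposed to Wasm.
#[derive(Default)]
pub struct GuestTable {
    next: u32,
    buffers: HashMap<GuestHandle, HostBufferId>,
}

impl GuestTable {
    /// Record a host ID and return the handle the guest will see.
    pub fn insert(&mut self, id: HostBufferId) -> GuestHandle {
        let handle = GuestHandle(self.next);
        self.next += 1; // a real table would handle exhaustion and reuse
        self.buffers.insert(handle, id);
        handle
    }

    /// Resolve a guest handle; returns None for handles this guest never received.
    pub fn resolve(&self, handle: GuestHandle) -> Option<HostBufferId> {
        self.buffers.get(&handle).copied()
    }

    /// Drop a handle, returning the host ID so the host can free the resource.
    pub fn remove(&mut self, handle: GuestHandle) -> Option<HostBufferId> {
        self.buffers.remove(&handle)
    }
}
```

Two guests can both hold a handle with the same numeric value and still resolve to different host resources, because each lookup goes through that guest's own table.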
@Tarek Sander exactly! This is why I don't think this is a real problem, the wasm module doesn't have access to the underlying wgpu IDs.
This would mean the Rust implementation of the WASI API would lie in wgpu-hal, that works with graphics API types instead of IDs.
Additional process isolation would be good, but I think for most use cases being less paranoid about security than browsers is OK. There could be a vulnerability if the graphics API implementation that wgpu uses in the host has a memory read/write bug, which an attacker could use to run custom assembly or modify arguments to get access to other components' GPU resources, but seeing as these APIs are quite well tested and wgpu is used by Firefox, I don't think that's a big issue.
@Tarek Sander what's the problem with sharing IDs if they're opaque to the wasm guest?
The thing about wgpu-core IDs is that they can be explicitly set on resource creation by the client. I don't know if an ID being already used is handled; I don't think so. So the easiest option is to go to the wgpu-hal level where there are no IDs and let wgpu-core on top assign IDs like it wants to.
Not totally following, are you saying that we should build directly on top of wgpu-hal?
No, I say we should provide a wgpu HAL.
You mean 'hal-like', right?
Yeah, that level also has to be the same as the WebGPU bindings in wgpu, since WebGPU has no mention of IDs either.
Right.
So I think that's what we're doing already, we don't expose any IDs to the wasm guest, only resources/types.
Then I think we should be fine isolation-wise. Configuring a sandbox where you can still access the GPU is a nightmare, at least when I tried doing it on Android. Each driver implementation can use different system calls with different ioctls and device file paths, it's not good to manage with something like seccomp filters. And the host-API to WASM was supposed to be the replacement to sandboxing anyways, if the API is secure, you don't need an additional sandbox.
Hey y'all, @Deian Stefan - a security researcher - will likely be able to provide feedback here.
Just had a conversation with @Deian Stefan, here's what I learned:
Level 1 security:
On a very basic level, the shader bounds check and the wasmtime resource isolation should be enough.
Level 2 security:
However, GPU driver code is often unsafe and buggy, so for any real-world use case we'd want to isolate the GPU into its own process, just like browsers do.
Level 3 security:
Once we have process isolation, we should consider even further restrictions with seccomp.
Level 4 security:
Browsers have runtime checks that they do on the GPU code. We should see if we can steal or replicate that.
He also told me to ask Tal Garfinkel for his opinion on the matter.
Helpful link from Chromium team: https://chromium.googlesource.com/chromium/src/+/main/docs/security/research/graphics/webgpu_technical_report.md
He also brought up the point that GPUs often don't do a good job isolating code, especially the older ones and there's not much we can do about that. It's a risk browsers have to accept as well.
@Deian Stefan hope I didn't butcher any of your points.
The APIs wgpu is built upon have robust buffer access extensions, which ensure the driver handles out-of-bounds shader accesses: Vulkan, GL. The GL extension is also used to secure WebGL in browsers AFAIK. Depending on security paranoia, you could refuse running WebGPU code without these extensions.
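To make the "refuse without these extensions" option concrete, here is a minimal sketch of the policy decision. The names are all hypothetical: `AdapterCaps` is assumed to be filled in by whatever code probes the driver, and none of this is an existing wgpu or wasmtime API.

```rust
/// Hypothetical summary of what the host found on the selected adapter.
#[derive(Debug, Clone, Copy)]
pub struct AdapterCaps {
    /// True if the driver advertises robust buffer access (probed elsewhere).
    pub robust_buffer_access: bool,
}

/// How paranoid the embedder wants to be (hypothetical setting).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RobustnessPolicy {
    /// Refuse to run WebGPU code if the driver lacks the extension.
    Require,
    /// Run anyway and rely on shader-level bounds checks only.
    Prefer,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    AllowWithoutDriverChecks,
    Refuse,
}

/// Decide whether guest WebGPU code may run on this adapter.
pub fn decide(caps: AdapterCaps, policy: RobustnessPolicy) -> Decision {
    match (caps.robust_buffer_access, policy) {
        (true, _) => Decision::Allow,
        (false, RobustnessPolicy::Prefer) => Decision::AllowWithoutDriverChecks,
        (false, RobustnessPolicy::Require) => Decision::Refuse,
    }
}
```

Keeping the decision separate from the probing means the embedder can log or surface the `AllowWithoutDriverChecks` case instead of silently downgrading.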
Process sandboxes are tricky if you really want to drop as many privileges as you can. E.g. Chromium on Linux has a setuid chromium-sandbox binary, because you need root to change the process's namespaces, user and filesystem root. I wanted to build a sandbox library for Rust before, but I just couldn't find any information on how to do process sandboxing on Windows. I guess replicating/using the Chromium or Firefox sandbox is the best bet.
Seccomp is only available on Linux and Android, I don't know if other platforms have a way to restrict system calls.
An --unsafe-webgpu flag like browsers had for the early days of WebGPU implementations is also an option until all the sandboxing is implemented. If you're just running your own code in WASM and just use it for easy cross-platform deployment, sandboxing is not an issue.
Tarek Sander said:
The APIs wgpu is built upon have robust buffer access extensions, which ensure the driver handles out-of-bounds shader accesses: Vulkan, GL. The GL extension is also used to secure WebGL in browsers AFAIK. Depending on security paranoia, you could refuse running WebGPU code without these extensions.
Process sandboxes are tricky if you really want to drop as many privileges as you can. E.g. Chromium on Linux has a setuid chromium-sandbox binary, because you need root to change the process's namespaces, user and filesystem root. I wanted to build a sandbox library for Rust before, but I just couldn't find any information on how to do process sandboxing on Windows. I guess replicating/using the Chromium or Firefox sandbox is the best bet.
Seccomp is only available on Linux and Android, I don't know if other platforms have a way to restrict system calls.
just ask! I'll find the windows person who knows and get you the answers if you want them.
@Ralph I guess my question is about the permission system in Windows in general: How are the permissions of a process determined? How granularly can you change them? The only documentation I can find is about the high-level stuff like group policies or the Windows Sandbox feature that just runs Windows inside a VM. The Chromium page gives a good overview, but doesn't have many links to the Microsoft documentation (if any exists).
https://learn.microsoft.com/en-us/windows/win32/procthread/process-security-and-access-rights is one place
however, I'm not yet sure that's what you're looking for.
the windows lead I know lives in EDT, so we can chat with him about 3 pm or so CET
That link looks promising, it has links to the security architecture of Windows in general. Detailed information would be great! Last I checked there was no Rust process sandboxing library that supported Windows, so maybe the sandboxing solution we'll need could also benefit other projects as a library.
yeah, the windows peeps would know: they're all about rust these days
which, good
I browsed the sandbox tag on crates.io again now and yep, none of the crates seem to support Windows.
Would y'all have a preference for the data format for IPC between the processes? If not, I'd make a flexbuffers crate with strict bounds checks. The validation is sadly only in the C++ lib, and even validation doesn't help if we use a shared memory buffer, where the compromised side could change the offsets at any time.
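For the shared-memory problem, the usual pattern is to copy the message out first and only then validate and read it, so the checks and the reads see the same bytes. A minimal sketch follows, with an invented layout (an 8-byte header holding a little-endian `u32` offset and `u32` length, followed by data); this is not flexbuffers or any real wire format. Note the `&[u8]` parameter is a simplification: with real shared memory the peer can write concurrently, so taking the snapshot would need raw or volatile reads rather than a plain slice.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    TooShort,
    OutOfBounds,
}

/// Size of the hypothetical header: [offset: u32 LE][len: u32 LE].
const HEADER_LEN: usize = 8;

/// Read the payload described by the header, trusting nothing the peer wrote.
pub fn read_payload(shared: &[u8]) -> Result<Vec<u8>, ParseError> {
    // 1. Snapshot: copy everything into memory the peer cannot touch.
    let local: Vec<u8> = shared.to_vec();

    // 2. Validate, using only the private copy from here on.
    if local.len() < HEADER_LEN {
        return Err(ParseError::TooShort);
    }
    let offset = u32::from_le_bytes([local[0], local[1], local[2], local[3]]) as usize;
    let len = u32::from_le_bytes([local[4], local[5], local[6], local[7]]) as usize;

    // checked_add guards against wrap-around on 32-bit targets.
    let end = offset.checked_add(len).ok_or(ParseError::OutOfBounds)?;
    if offset < HEADER_LEN || end > local.len() {
        return Err(ParseError::OutOfBounds);
    }

    // 3. The range was checked against the same bytes we are about to read.
    Ok(local[offset..end].to_vec())
}
```

The copy costs bandwidth, which is why large payloads such as buffer uploads are usually handled separately from the small control messages.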
Is there a reason for flexbuffers instead of flatbuffers? I imagine you'd be generating the IPC code from the Wit definitions, so you'll know all the types of everything up front.
I'd like the sandbox library to be independent of WASI (so there is finally one that supports Windows in the ecosystem), and I don't think flatbuffers would work with serde, as there's no schema with that. I could make a flatbuffers compiler that uses strict bounds checks, but integrating with serde is a better goal than support for flatbuffer schemas IMO.
Ralph said:
yeah, the windows peeps would know: they're all about rust these days
Which is why it surprises me that there is no sandbox library with support for Windows. Maybe I'm just bad at searching, but I haven't found one, y'all can look too if you want, maybe you're better at searching than me.
I'm using "y'all" as a non-notifying replacement for something like @all by the way, I don't know of a better term.
I have also looked in the past, and not found any.
I think if you're using flatbuffers, then you just wouldn't use serde. Remoting a WebGPU API doesn't depend on serializing arbitrary user types, or serializing to arbitrary formats, so the main strengths of serde wouldn't apply.
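To illustrate the "you know all the types up front" point: with a closed set of messages, the encoding can be written (or generated) per message, with no generic serializer involved. A small hand-rolled sketch follows; the `Command` variants and the one-byte tag layout are invented for illustration and are not taken from the WIT definitions or from flatbuffers.

```rust
/// A closed set of remoted calls, known at build time (hypothetical subset).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Command {
    CreateBuffer { size: u64 },
    DestroyBuffer { handle: u32 },
}

impl Command {
    /// Encode as a one-byte tag followed by fixed-width little-endian fields.
    pub fn encode(&self) -> Vec<u8> {
        match self {
            Command::CreateBuffer { size } => {
                let mut out = vec![0u8];
                out.extend_from_slice(&size.to_le_bytes());
                out
            }
            Command::DestroyBuffer { handle } => {
                let mut out = vec![1u8];
                out.extend_from_slice(&handle.to_le_bytes());
                out
            }
        }
    }

    /// Decode strictly: unknown tags and wrong lengths are rejected.
    pub fn decode(bytes: &[u8]) -> Option<Command> {
        let (&tag, rest) = bytes.split_first()?;
        match tag {
            0 if rest.len() == 8 => {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(rest);
                Some(Command::CreateBuffer { size: u64::from_le_bytes(raw) })
            }
            1 if rest.len() == 4 => {
                let mut raw = [0u8; 4];
                raw.copy_from_slice(rest);
                Some(Command::DestroyBuffer { handle: u32::from_le_bytes(raw) })
            }
            _ => None,
        }
    }
}
```

Because every message has a fixed size for its tag, the decoder never follows an offset supplied by the other side.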
It sounds like there are multiple possible goals here. If you're interested in building a Windows sandboxing library, you're certainly welcome to build that. That said, I don't think that's a critical path for wasi-gfx per se.
I've asked a Windows distinguished architect to point me to where the logical solutions/information about this are, per this conversation. Let's see what he says relative to Windows.
I agree with that, Dan. I certainly see the argument and value of process sandboxing like in a web browser, and how the wasi-gfx standard should allow for that use case, but I don’t think that needs to be a requirement or part of the core.
It'd be long before my library would be finished anyways.
Dan Gohman said:
It sounds like there are multiple possible goals here. If you're interested in building a Windows sandboxing library, you're certainly welcome to build that. That said, I don't think that's a critical path for wasi-gfx per se.
I agree that library doesn't belong in this subgroup specifically, but hopefully that could be adopted as an optional layer of security in wasmtime or the wasi implementation, because such an undertaking should not be done by just one person anyways. With more WASI proposals, process sandboxing may be necessary for more proposals than this for adequate security. Should I ask in #general about thoughts for a sandboxing library/layer?
There is already a lot of implicit trust in host-side code in the wasi ecosystem (although to Deian’s point, GPU drivers may be buggier than average), so it feels like that decision should be left up to the implementer
Yeah, that makes sense to me. I see this being more generic of a problem than just for gpu processes
So for now if this proposal lands in wasmtime, at least the webgpu implementation (the raw framebuffer and window access would still be useful, there are CPU drawing libraries aplenty) should be gated behind a disabled-by-default enable_unsafe_webgpu config field. The host can enable it, but it signifies that the code is trusted, at least to the extent of running it on the GPU.
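Something like the following is what I have in mind for the gate. Only the field name `enable_unsafe_webgpu` comes from the suggestion above; `GfxConfig`, `LinkError` and `webgpu_allowed` are hypothetical names, not existing wasmtime API.

```rust
/// Hypothetical host-side configuration for the graphics interfaces.
#[derive(Debug, Clone, Default)]
pub struct GfxConfig {
    /// Off by default. Turning it on means the embedder trusts the guest
    /// at least as far as running its code on the GPU.
    pub enable_unsafe_webgpu: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LinkError {
    WebGpuDisabled,
}

/// Check the gate before exposing the WebGPU interface to a guest.
/// Framebuffer and window access would not go through this check.
pub fn webgpu_allowed(config: &GfxConfig) -> Result<(), LinkError> {
    if config.enable_unsafe_webgpu {
        Ok(())
    } else {
        Err(LinkError::WebGpuDisabled)
    }
}
```

Deriving `Default` on a `bool` field gives `false`, so an embedder who never touches the config gets the safe behavior.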
Definitely should be up to the implementer and not part of the actual spec.
I think Deian’s point was that most implementations that run untrusted code will want it, so we should have it in mind when designing the spec.
Ralph said:
I've asked a Windows distinguished architect to point me to where the logical solutions/information about this are, per this conversation. Let's see what he says relative to Windows.
https://github.com/microsoft/windows-rs/blob/master/crates/libs/sys is the answer, possibly, says the windows dude.
@Tarek Sander the windows team is interested in finding you what you need here. They think they've got all win32 sdk apis in the windows-sys crate, but they wanna know, if there's something specific you're expecting to find, and they can make it easier, let me know.
they mention:
Windows doesn't exactly do seccomp. You can do things like it by restricting privileges, creating a process with a limited token, etc...
And there is the SetProcessMitigationPolicy function (processthreadsapi.h), which sets a mitigation policy for the calling process. Mitigation policies enable a process to harden itself against various types of attacks.
It's just not as granular or clean as Linux.
again, if you have specific questions, lemme know and I'll find the correct resource
I was asking more about the process tokens themselves, and generally access control of resources, essentially what I'd need to implement what's described for the chromium sandbox.
Last updated: Nov 22 2024 at 16:03 UTC