Stream: git-wasmtime

Topic: wasmtime / Issue #2232 Enable intercepting all filesystem...


Wasmtime GitHub notifications bot (Sep 25 2020 at 18:40):

joshuawarner32 opened Issue #2232:

I'm interested in using wasmtime as a VM to run wasi scripts in a sandbox - and by being at the level normally reserved for the OS, being able to get precise information about which files/dirs the sandboxed application reads and writes. I'd also like to be able to dynamically fill in the filesystem tree exposed to the wasi program rather than having to have the tree pre-populated, as in my case the filesystem tree could be prohibitively large.

There are a couple of things missing from the existing VirtualDirEntry:

This design is of course pretty fuzzy at this point, and I've only done a cursory inspection of the interfaces involved. I'd be interested in working on this, if it seems to be in alignment with the project's goals. Feedback is most welcome!

Wasmtime GitHub notifications bot (Sep 25 2020 at 20:31):

pchickey commented on Issue #2232:

Welcome! The use case you describe is very much something we want to enable.

I'm presently working on a bunch of renovations to the wasi-common crate (see #2202, #2205). @sunfishcode and @alexcrichton have some ideas and works-in-progress that will also help us change the architecture of this crate. Your design ideas sound right in line with what we'd like to see, so I'd encourage you to either expand on them here or make a PR where we can all take a look together.

One thing I'm failing at is describing a cohesive vision for what we want wasi-common to become - there are a lot of moving parts right now, and I'm trying to balance the limited time I get to do code gardening against bigger concerns like shipping a new wasi snapshot (long overdue at this point, but none of the folks involved have had much spare bandwidth this summer) and some more urgent aspects of the design which need fixing for the sake of production systems using it. We're eager to get more help with any and all of these parts of the WASI puzzle; if you'd like to be more involved, we can chat on the Bytecode Alliance Zulip.

Wasmtime GitHub notifications bot (Sep 27 2020 at 16:11):

kamyuentse commented on Issue #2232:

I am interested in the new architecture of this crate. @joshuawarner32 describes a use case for accessing the host file system, and I think we also need to consider how to interoperate with remote filesystems or object storage services (HDFS, S3, etc.) on cloud platforms.

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:27):

sunfishcode commented on Issue #2232:

One of the big pieces of this puzzle will be API virtualization. When an application imports e.g. fd_read from wasi_snapshot_preview1, it should be possible at link time to resolve that to wasi-common's native implementation, to a different native implementation, or to a wasm implementation. And, these other implementations should be able to import fd_read from wasi_snapshot_preview1 themselves, allowing them to forward requests on to the next level down when they want to.

Once we have a system which can do that, we won't need traits like VirtualDirEntry, and won't need to worry about ensuring that traits have all the needed hooks for everyone, because people will be able to wrap the WASI APIs themselves. And, this will generalize to all APIs, and not require a trait for each API that people want to customize. And, it'll allow for completely custom implementations, so people can experiment with other backends.
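To make the layering concrete, here's a minimal sketch of that forwarding structure. The names are invented, and the real mechanism would resolve wasm imports at link time rather than go through Rust traits; this just shows how each level both implements and imports the same function:

trait WasiFs {
  fn fd_read(&mut self, fd: u32, buf: &mut [u8]) -> Result<usize, u16>;
}

// A custom layer: it implements fd_read while also "importing" fd_read
// (here, as the inner value), so it can forward to the next level down.
struct Passthrough<Next: WasiFs> {
  next: Next,
}

impl<Next: WasiFs> WasiFs for Passthrough<Next> {
  fn fd_read(&mut self, fd: u32, buf: &mut [u8]) -> Result<usize, u16> {
    // ...custom behavior would go here...
    self.next.fd_read(fd, buf)
  }
}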

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:33):

joshuawarner32 commented on Issue #2232:

I've taken a bit of a deeper look, and I think one complicating factor is that there are actually two levels of access here:

It's not immediately clear why this layering exists, but it appears there are multiple differences:

- config time vs runtime
- data vs rights
- owned vs rc'd

For my purposes, I'd propose that the first distinction (config time vs runtime) doesn't really make any sense; I'd actually like the same structure to live all the way to the end of the execution so that I can inspect it after the fact.

Both the second (data vs rights) and third (owned vs rc'd) probably do make sense to retain, as this is necessary to properly implement multiple handles with separate rights attached, as required by POSIX/WASI.

I do have a bit of experience implementing filesystem-like data structures, and one thing that's worked well in the past is to maintain a first-class concept of inodes (as identifiers for files/dirs on disk). All of the backing data for file contents / dir listings goes in a single "Filesystem" object, and all dir listings indirect through inodes. A Handle is then just a combination of an inode and some set of rights.

Concretely, I'd propose the following:

struct Inode(usize); // abstract identifier; maybe the usize is public, maybe not

enum Contents {
  Directory(Box<dyn DirContents>),
  File(Box<dyn FileContents>),
}

struct VNode {
  // Explicit ref counting, to account for hard-linking files (and maybe dirs) in the tree.
  // I'm actually not sure if this is part of the wasi spec, but it is certainly typical of filesystems.
  ref_count: usize,
  contents: Contents,
}

struct Handle {
  inode: Inode,
  rights: Rights,
}

// There should be one single Filesystem instance per WasiCtx
struct Filesystem {
  // Indexed by inode; VNodeRef would be some shared reference to a VNode, e.g. Rc<VNode>
  nodes: Vec<Option<VNodeRef>>,

  // Maybe this should be separate?
  handles: Vec<Handle>,
}
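As a purely hypothetical illustration of how one lookup step might thread through these types (assuming type VNodeRef = Rc<VNode>;, a Result<T> = std::result::Result<T, &'static str> alias, and #[derive(Clone, Copy)] on Inode, Rights, and Handle, none of which is settled above):

impl Filesystem {
  fn open_child(&mut self, dir: Inode, name: &str, rights: Rights) -> Result<Handle> {
    // Resolve the parent VNode by inode number.
    let node = self.nodes.get(dir.0).and_then(|n| n.as_ref()).ok_or("EBADF")?;
    // Only directories can be traversed.
    let child = match &node.contents {
      Contents::Directory(d) => d.get(name)?,
      Contents::File(_) => return Err("ENOTDIR"),
    };
    // Record the open handle so rights can be checked on later calls.
    let handle = Handle { inode: child, rights };
    self.handles.push(handle);
    Ok(handle)
  }
}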

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:40):

joshuawarner32 commented on Issue #2232:

(sorry @sunfishcode, didn't see your reply until after I hit submit on mine)

When an application imports eg. fd_read from wasi_snapshot_preview1, it should be possible at link time to resolve that to wasi-common's native implementation, to a different native implementation, or to a wasm implementation.

Ooh interesting! I definitely agree this should be possible, and it would certainly be pretty cool to be able to swap things out at this level too - however, I'd argue that many/most users (myself included) will want to customize small parts of the runtime's behavior without inheriting the complexity of building a sane implementation of a posix-like FS API in their application code.

In other words, I'd propose that a simpler VFS-like interface (perhaps inspired by the interfaces pick-your-favorite-os-kernel uses for navigating/mounting different filesystems together) should exist somewhere, whether as part of wasi-common or in some other "helper" crate.

There's substantial value in centralizing the implementation of things like rights-checking on handles, and cycle-detection (to prevent parent dirs from being moved into children).
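As an example of the kind of check that benefits from centralization, a cycle test might walk ancestors with a (hypothetical) parent_of lookup on the Filesystem sketched above, refusing to move dir into dest when dir is an ancestor of dest (assumes Inode: PartialEq + Copy):

fn would_create_cycle(fs: &Filesystem, dir: Inode, mut dest: Inode) -> bool {
  loop {
    if dest == dir {
      return true; // dir is an ancestor of dest; the move would create a cycle
    }
    match fs.parent_of(dest) { // hypothetical parent lookup
      Some(parent) => dest = parent,
      None => return false, // reached the root without meeting dir
    }
  }
}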

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:40):

bjorn3 commented on Issue #2232:

The difference between the two layers exists because libpreopen (part of the wasi libc) gets a static list of path -> fd mappings at startup. The first layer is that list, while the second layer comes into play when reading directories at runtime.

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:44):

joshuawarner32 commented on Issue #2232:

@bjorn3 Ah interesting. I guess in my proposal then the list passed to libpreopen would be a HashMap<PathBuf, Handle> or something similar.

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:48):

joshuawarner32 commented on Issue #2232:

Also, to flesh out the above, here's what DirContents might look like:

trait DirContents {
  fn list(&self) -> Result<Vec<&str>>;                             // names of all entries
  fn get(&self, child_name: &str) -> Result<Inode>;                // look up one entry
  fn set(&mut self, child_name: &str, inode: Inode) -> Result<()>; // add or replace an entry
}
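For illustration only, a trivial in-memory implementation, assuming a string-error Result alias and #[derive(Clone, Copy)] on Inode (neither of which is specified above):

use std::collections::BTreeMap;

struct MemDir {
  entries: BTreeMap<String, Inode>,
}

impl DirContents for MemDir {
  fn list(&self) -> Result<Vec<&str>> {
    Ok(self.entries.keys().map(|k| k.as_str()).collect())
  }

  fn get(&self, child_name: &str) -> Result<Inode> {
    self.entries.get(child_name).copied().ok_or("ENOENT")
  }

  fn set(&mut self, child_name: &str, inode: Inode) -> Result<()> {
    self.entries.insert(child_name.to_string(), inode);
    Ok(())
  }
}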

Wasmtime GitHub notifications bot (Sep 27 2020 at 19:55):

bjorn3 commented on Issue #2232:

Wasmtime currently uses the host inode as wasi inode: https://github.com/bytecodealliance/wasmtime/blob/b37adbbe317787fc1c627a93e36327c154e0fa68/crates/wasi-common/src/old/snapshot_0/sys/unix/linux/host_impl.rs#L11. This doesn't work well with a nodes: Vec<Option<VNodeRef>>. Also, keeping the ref_count of VNode in sync with the host will be impossible. Lastly, a file could turn into a directory without the inode changing if, for example, all inodes are in use and then a single file is removed, followed by a single directory being created.

Wasmtime GitHub notifications bot (Sep 27 2020 at 20:06):

joshuawarner32 commented on Issue #2232:

Wasmtime currently uses the host inode as wasi inode

This might be ideal for implementations that redirect all FS interaction to the host, but not for anything that tries to virtualize part of the filesystem tree. I'd argue that in many cases it may actually be preferable to (by default) virtualize all the inodes that are passed to the wasi binary, both so that different runs can have better guarantees of determinism and for better sandboxing (since observing assigned inodes could give information about what else is running on the host).

Also keeping the ref_count of VNode in sync with the host will be impossible.

I certainly wouldn't suggest keeping these in sync! The ref_count of VNode should only represent references within the virtualized filesystem (i.e. the part accessible to the wasi binary).

Lastly a file could turn into a directory without changing the inode if for example all inodes are used and then a single file is removed followed by a single directory created.

This is a thing that can happen on a real filesystem too. In Linux this is generally handled with the generation number, which is incremented whenever an inode number is reused.
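A virtualized filesystem could adopt the same trick; a minimal sketch:

// Identify a file by (inode number, generation), so a reused inode number
// yields a distinct identity and stale references can be detected.
#[derive(Clone, Copy, PartialEq, Eq)]
struct FileId {
  ino: u64, // inode number (may be reused)
  gen: u64, // bumped each time `ino` is reassigned
}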

Wasmtime GitHub notifications bot (Sep 28 2020 at 13:25):

sunfishcode commented on Issue #2232:

If you just want to customize a small part of the behavior of an API, API virtualization should work well. You wouldn't need to build a whole filesystem yourself; you'd call into the "next level down" as needed. In the use case described at the top of this issue, the implementation of path_open would record the path being accessed, and then import and call path_open to do the actual work of opening the file.
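A hedged sketch of that interception, with invented names: the wrapper records the path, then calls the path_open it imports from the level below (modeled here as a closure):

struct RecordingFs<F: FnMut(&str) -> Result<u32, u16>> {
  lower_path_open: F,    // the "next level down" implementation
  accessed: Vec<String>, // the precise record of file accesses
}

impl<F: FnMut(&str) -> Result<u32, u16>> RecordingFs<F> {
  fn path_open(&mut self, path: &str) -> Result<u32, u16> {
    self.accessed.push(path.to_string());
    (self.lower_path_open)(path) // the actual work of opening the file
  }
}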

A VFS layer makes sense to have when implementing filesystem APIs on top of things that aren't already filesystems, and which you have exclusive access to, such as block devices or in-memory filesystems. But when implementing filesystems in terms of APIs which already are filesystems, and which could be accessed concurrently by other processes, an extra layer of reference counting and an extra inode index space are redundant and potentially tricky to keep in sync. So if we have a VFS mechanism, it seems like we'd provide it as a library that filesystem implementations could use independently, rather than something built into WasiCtx.

Wasmtime GitHub notifications bot (Sep 28 2020 at 15:24):

joshuawarner32 commented on Issue #2232:

So if we have a VFS mechanism, it seems like we'd provide it as a library

I could get behind that.

In that case the more pressing question becomes: what's the right way to let crates use most of the existing wasi infrastructure, but also plug in this hypothetical library (wasi-vfs, maybe)? Would the right interface be to directly (re)implement fd_read etc.? If so, is that possible now with WasiCtx?

Or perhaps there could be an intermediate layer that lets WasiCtx handle things like rights/perms on handles, but defers all other logic to a lower-level interface for data access - perhaps something like FileContents, but "global" to the filesystem? Maybe this could be similar in spirit to the Filesystem struct I discussed above, except it would _not_ take any responsibility for virtualizing inodes/etc.
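One possible shape for that lower-level interface, with hypothetical names (reusing the string-error Result alias from the sketches above): WasiCtx would keep the rights/handle bookkeeping and call into something like this for raw data access, without taking on inode virtualization:

trait FsBackend {
  fn read_at(&self, path: &str, offset: u64, buf: &mut [u8]) -> Result<usize>;
  fn write_at(&mut self, path: &str, offset: u64, data: &[u8]) -> Result<usize>;
  fn list_dir(&self, path: &str) -> Result<Vec<String>>;
}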

Wasmtime GitHub notifications bot (Sep 29 2020 at 03:20):

joshuawarner32 commented on Issue #2232:

I've continued to look through the code, and it looks like Handle is actually not too far away from the abstraction level I'm looking for. What about exposing that trait and adding a WasiCtxBuilder::preopened_handle method that accepts a Box<dyn Handle> (or similar) instead?
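A hypothetical call site for that proposed method (neither preopened_handle nor MyVirtualDir exists today; the exact signature, e.g. whether a guest path accompanies the handle, is exactly what's open here):

let ctx = WasiCtxBuilder::new()
  .preopened_handle("/sandbox", Box::new(MyVirtualDir::default()))
  .build()?;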


Last updated: Oct 23 2024 at 20:03 UTC