wasmtime / PR #3697 memfd/madvise-based CoW pooling alloc... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #3697 memfd/madvise-based CoW pooling alloc...

Wasmtime GitHub notifications bot (Jan 19 2022 at 01:32):

cfallin opened PR #3697 from memfd-cow to main:

Add a pooling allocator mode based on copy-on-write mappings of memfds.

As first suggested by Jan on the Zulip here [1], a cheap and effective
way to obtain copy-on-write semantics of a "backing image" for a Wasm
memory is to mmap a file with MAP_PRIVATE. The memfd mechanism
provided by the Linux kernel allows us to create anonymous,
in-memory-only files that we can use for this mapping, so we can
construct the image contents on-the-fly then effectively create a CoW
overlay. Furthermore, and importantly, madvise(MADV_DONTNEED, ...)
will discard the CoW overlay, returning the mapping to its original
state.

By itself this is almost enough for a very fast
instantiation-termination loop of the same image over and over,
without changing the address space mapping at all (which is
expensive). The only missing bit is how to implement
heap growth. But here memfds can help us again: if we create another
anonymous file and map it where the extended parts of the heap would
go, we can take advantage of the fact that a mmap() mapping can
be larger than the file itself, with accesses beyond the end
generating a SIGBUS, and the fact that we can cheaply resize the
file with ftruncate, even after a mapping exists. So we can map the
"heap extension" file once with the maximum memory-slot size and grow
the memfd itself as memory.grow operations occur.

The above CoW technique and heap-growth technique together allow us a
fastpath of madvise() and ftruncate() only when we re-instantiate
the same module over and over, as long as we can reuse the same
slot. This fastpath avoids all whole-process address-space locks in
the Linux kernel, which should mean it is highly scalable. It also
avoids the cost of copying data on read, as the uffd heap backend
does when servicing pagefaults; the kernel's own optimized CoW
logic (same as used by all file mmaps) is used instead.

There are still a few loose ends in this PR, which I intend to tie up
before merging:

There is no InstanceAllocationStrategy yet that attempts to
actually reuse instance slots; that should be added ASAP. For testing
so far, I have just instantiated the same one module repeatedly (so
reuse naturally occurs).

The guard-page strategy is slightly wrong; I need to implement the
pre-heap guard region as well. This will be done by performing
another mapping once, to reserve the whole address range, then
mmap'ing the image and extension file on top at appropriate
offsets (2GiB, 2GiB plus image size).

Thanks to Jan on Zulip (are you also @koute from #3691?) for the initial
idea/inspiration! This PR is meant to demonstrate my thoughts on how
to build the feature and spawn discussion; now that we see both approaches
hopefully we can work out a way to meet the needs of both of our use-cases.

[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772

Wasmtime GitHub notifications bot (Jan 19 2022 at 01:33):

cfallin requested alexcrichton for a review on PR #3697.

Wasmtime GitHub notifications bot (Jan 19 2022 at 01:33):

cfallin requested peterhuene for a review on PR #3697.

Wasmtime GitHub notifications bot (Jan 19 2022 at 01:39):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 21 2022 at 01:16):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 21 2022 at 01:53):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 22 2022 at 06:05):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 22 2022 at 06:19):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 22 2022 at 06:43):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 00:00):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 00:07):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 00:50):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 00:51):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 01:09):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 01:13):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 07:07):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 08:06):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 19:01):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 19:02):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 19:53):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 22:01):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 22:10):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 22:20):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 25 2022 at 23:20):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 21:23):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 21:32):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 21:36):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 21:38):

cfallin has marked PR #3697 as ready for review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Personally with conditional compilation things my goal would be to minimize the number of #[cfg] necessary, so to that end I think this might be a bit nicer have the body of is_memfd_with_image conditionally defined instead of consumers being conditionally defined.

Otherwise afterwards if we're worried about performance here I think it's best to use if cfg!(...) to handle performance issues. Without #[cfg] we can subvert lints like the drop(module) below and also ensure that this compiles all the time instead of in only one configuration.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

This isn't applicable to this location in particular, but is there any reason to have this feature conditional? Naturally it's Linux-specific no matter what but I'm not sure why we wouldn't want to turn it on by default on Linux. Or otherwise this ideally seems like a Config option which is checked at runtime with compile-time abilities to compile it out if truly necessary. (I'm not sure if it's required we get this to an able-to-be-compiled-out-state though, we don't have really any restrictions on our runtime footprint at this time)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Would you be ok splitting the benchmark bits out to a separate PR?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Mind folding this into the block above with s/Test uffd functionality on Linux/Test Linux-specific functionality/

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think you'll want to reset the info field a line or two above this line as well.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

For this with now multiple-methods of creating memories I think it might be best to have something like:

match &memory.kind {
    #{cfg(feature = "uffd")]
    MemoryKind::Uffd { something } => something.reset_guard_pages(),
    #{cfg(feature = "memfd-allocator")]
    MemoryKind::Memfd { something } => {
        something.clear_and_remain_ready();
        continue;
    }
    MemoryKind::Mmap => /* ... */
}

(or something along those lines)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Could you add some docs to what these fields are?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I'm sure you were planning on already doing this, but to ensure there's a note here as well I think it'd be best to split out the VMCallerCheckedAnyfunc changes to a separate PR

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think Rust-convention-wise this would be Fd rather than FD (e.g. Http instead of HTTP)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

The UnsafeCell here and the unsafe impl Send are a bit worrisome to be.

How would you feel about a scheme where this was something like AtomicPtr<MemFDSlot> and we transferred ownership dynamically between a pool and the Memory that holds it? That way we could have what I think would be pretty cheap runtime asserts that the ownership here all works out and no risk of accidental races on this.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think a better name for this might be unwrap_mmap to signal the panic, but if this is dead code it's probably safe to remove to (or is the #[allow] just for a #[cfg]?)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

FWIW it's methods like this that make me pretty uncomfortable about the UnsafeCell and how I think it would be better to have an AtomicPtr or similar which is always used to transfer ownership to a single owner at a time. Promiting from a &self to &mut MemFDSlot is pretty scary here and easy to get lost when reading other code.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think I've mentioned this before, but personally I feel like this would be better to not have so different between memfd-and-not. It seems like it might be simpler to still have one giant mmap for the entire memory pool and then memfd just does fancy stuff internally with everything as MAP_FIXED.

If instead we think it's better to have an mmap-per-slot then I think that the pooling allocator today to switch to that strategy. Basically I think it might be a bit more consistent to have this be the same across memfd and not.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Personally I've found that it's a pretty annoying thing to keep perfect-dead-code-warnings with lots of platform-specific code. I typically solve it by having "esoteric" configurations have this at the top of the crate:
#![cfg_attr(feature = "memfd-allocator", allow(dead_code))]
to deal with it at the beginning and not worry about it later. Basically I don't think we care much about dead code in configured situations, only in the default case do we largely care about removing dead code.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think this method is used to guard against initialization loops during instantiation, but this feels like they're asking/answering different questions. I believe that image is none if the heap is empty, right? I think this may want a different method name or the callers use a different scheme for calling this or something like that. I suspect that the end-result is the same (if you have an empty heap you probably have 0 initializers), but it feels a bit confusing reading this now seeing what might be a disconnect.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Personally I wouldn't mind the duplication in this method and from_open_file. I feel like seeing the mmap call is a bit more readable without having to have comments for all the parameters.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think this may be able to be removed now?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Looking through this I don't think there's a method for removing from MemFdRegistry? I think it'd be pretty easy to add though and just needs a hook in Drop for ModuleInner or w/e we have in Module

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

This method and logic feels really similar to the to_paged method on MemoryInitialization, could some of this be offloaded to impl MemoryInitialization { .. } to try to share some bits and pieces if possible?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Personally I like to use:
let defined_memory = match ... {
    Some(e) => e,
    None => continue,
};
// ...
to avoid an extra level of indentation.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I'm wary of casts like this but I feel like this sort of validation has already happened elsewhere when we loaded the module. I think though that if this were moved to impl MemoryInitialization it may make it easier to share this with other pieces and comment in one place that the cast is fine.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Given that this set is almost always 1 could this perhaps be PrimaryMap<Index, bool> to be a bit more efficient?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think this ends up not being used nowadays, right? If so I think it'd be fine to remove it.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I've since forgotten what base is so reading this I'm sort of naively assuming that it's a global.get or something like that. Could this loop have some more comments about why paraticular segments are skipped?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I think this might be a good helper method to add on MemoryPlan or similar to avoid casts/multiplies inline here

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

It seems like the purpose of this loop is to calculate what to pass in to set_len, but is that the only purpose of the loop? If so I thought that write_at would automatically resize the file for us? Otherwise could we keep track of the current size and automatically call set_len before a write_at below to avoid having this double loop?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

To confirm, this is cloexec-by-default, right?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Why do we deal with seals on the memfd?

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

Is this required? (I'm not sure of mmap's behavior of non-page-aligned files or if it's like more fast this way)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

This actually brings up an interesting maintainability hazard though. We're sort of manually destroying things here but I think it might actually be best to std::ptr::drop_in_place if we can followed by a std::ptr::write if you're up for trying out such a refactoring.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:31):

alexcrichton created PR review comment:

I realize though that some of my questions are equally applicable to uffd which is a compile time feature as well. I don't think we have the best compile-time story there either, so it's not necessarily required to solve this here and now.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:45):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:45):

cfallin created PR review comment:

The main reason for conditionals at all is that the memfd allocator essentially takes over the pooling allocator, and the pooling allocator also works (IIUC) at least on macOS, maybe on other platforms too.

We could definitely do all of the conditional switching dynamically, of course; that's probably the right answer long-term. I think I was just mainly following uffd's lead here. I'm happy to do this refactor as a followup if you'd like!

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:46):

cfallin created PR review comment:

Done (#3733)

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:50):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 22:50):

cfallin created PR review comment:

My thinking was mainly as an additional security mitigation layer: this image is the starting point for all instantiations, so if we somehow had a bug that e.g. allowed an instance to directly write to it, or if some other security vulnerability allowed an attacker to direct a write to an arbitrary open fd, then all new instances would be affected. As long as the API exists, why not use it to make the thing read-only?

(I'm happy to add a comment noting this reasoning!)

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:00):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:01):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:01):

cfallin created PR review comment:

Done! (#3736)

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:54):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:55):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:55):

cfallin created PR review comment:

Fixed, good point!

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin created PR review comment:

Ah, yes. I went looking for info and realized this was part of the lazy-anyfunc changes that have been split off. But that just proves the point more: it's entirely unobvious in the other PR that this would need to change, and a review of that PR could have easily missed it.

I think the invariant of keeping Instances initialized-but-empty and "hot-patching" the fields was probably intended to keep the PrimaryMaps from being freed/allocated on each instantiation -- at least that's one benefit I see -- but I agree that the risk here is probably greater than the reward, and actually aligns with some of the points above re: instantiation from scratch vs. erasing state to reuse, just on a micro scale.

I went ahead and refactored so that the instances remain uninitialized memory until instantiation, and are drop_in_place()'d on termination. Good catch!

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:58):

cfallin created PR review comment:

Added!

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:59):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 27 2022 at 23:59):

cfallin created PR review comment:

Renamed.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin created PR review comment:

Done.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin created PR review comment:

Yup, that's exactly it (content depending on globals, which could vary based in imported modules). Added a comment.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:10):

cfallin created PR review comment:

Done.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:11):

cfallin created PR review comment:

Yep! Re-enabled test.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:12):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:12):

cfallin created PR review comment:

The other purpose of the loop is to look for any reasons we can't use a memfd (dynamically-placed segments, out-of-bounds segments) before we commit to one. I suppose we could allocate one and then throw it away if we see an error... actually I'll try that, it seems like a sensible optimization.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:16):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:16):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:16):

cfallin created PR review comment:

Added comment saying basically the above.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:17):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 00:17):

cfallin created PR review comment:

Indeed, it appears so.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:17):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:17):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:17):

cfallin created PR review comment:

Just refactored to make this the case again -- much simpler now. Thanks for pushing in this direction!

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:20):

cfallin created PR review comment:

Yup, I agree. FWIW my original thinking was that we already had the same mutable-access-to-slot thing going on implicitly, because the pooling and uffd allocators were making syscalls to mutate individual slots and relying on higher-level instance mutual exclusion for safety. The difference is just that we didn't have an actual Rust object embodying the slot and actions on it previously.

In any case, the "take ownership and return it" pattern is much nicer; I feel better with this additional level of checking/safety. I ended up doing this with a mutex per slot, just for simplicity and to avoid gratuitous unsafe code and manual box<->raw handling; that's probably acceptable overhead but I'm happy to refactor to actual atomic-exchange magic if not.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:24):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:24):

cfallin created PR review comment:

I added a comment here to explain what's going on -- I had actually already forgotten the subtleties and had to debug a wee bit, but it turns out that this is important for handling invalid data initializers (e.g. out-of-bounds) properly. The invariant is that the memfd-construction code may reject creating a memfd for certain reasons, e.g. dynamically-placed segments, or out-of-bounds segments, or whatever else we need to reject. The MemFD mechanism may still handle the memory, and it may be initialized normally by writing into the anon-mmap memory; so really what we want is exactly "are we using an already-prepared image" and if not, we do the initializers. Added a better doc-comment here to explain.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:24):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:24):

cfallin created PR review comment:

Good point, I had forgotten about that! That would be a nice surprise in a long-running process. Added a drop-handler that removes from the registry as suggested.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:25):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:26):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:26):

cfallin created PR review comment:

I think it's probably a good idea; IIRC the Linux semantics are that an mmap of a non-page-aligned-size file sees zeroes for the remainder of the last page, but that may be weird/nonportable behavior.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:26):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 01:26):

cfallin created PR review comment:

Fixed as per comment below!

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:08):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:09):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:09):

cfallin created PR review comment:

I put some of the bounds checks in MemoryInitializer; a lot of the other logic is sort of just-different-enough to make it tricky to share. Maybe with more thought a "visitor" sort of interface to the image implied by memory initializers could be built; that feels slightly out of scope here though, to me at least (happy to push more on it if you'd like).

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin created PR review comment:

Yup, good idea, I moved the bounds checks to helpers on the MemoryInitializer.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin created PR review comment:

OK, so I refactored this and it's not significantly simpler, but it is at least a little more straightforward.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:10):

cfallin created PR review comment:

Yup, good idea!

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:14):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:14):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:14):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:20):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 02:20):

cfallin created PR review comment:

I like that approach, but I think it's a significant refactor of the code, including uffd code, because it's often not just a simple "approach A does this block, approach B does that block", but e.g. uffd or memfd adds just one variation.

I think this might make sense to do in a followup "remove all the static config by feature flags" PR -- I could both put uffd and memfd under dynamic config flags, and clean up the conditional compilation stuff at the same time. Would you mind if I punted until that?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Could this be merged with new_static and it grows various arguments to match the contents of Memory::Static?

IIRC the let base = match maximum { ... } was non-trivial and I forgot it the first time I wrote something awhile back, so having it not-duplicated I think would be best.

If there's issues with MemFdSlot being conditionally defined I think it would be reasonable to have something like:
#[cfg(not(feature = "memfd-allocator"))]
enum MemFdSlot {}
which should make Option<MemFdSlot> a 0-sized struct and statically assert we don't try to actually create one.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Could this use u64::try_from(...).unwrap()?

I realize it's much wordier but when it comes to dealing with memory I personally prefer to use fallible conversions for everything that's not necessarily immediately trivially obvious that it's a successful conversion (ideally having no as conversions at all but that's not always possible)

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

I think we can actually avoid saving this internally since initialization below should I think take into account the specific module being instantiated. For example there's a call to PrimaryMap::with_capacity(self.module_limits.memories as usize) but that's not necessary if the module being instantiated has fewer memories than the limits.

I think this could probably be achieved by extracting this logic to a helpe method and using that instead of the initialize method below perhaps?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Along the same lines as the seals elsewhere, perhaps this could have a seal for writing to guarantee it's always zero? (IIRC the seal for writing doesn't affect shrinking/growth)

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

The comment here about "Try to unregister" feels a bit misleading because this seems like there's less of an attempt here and more of a "this had better be gone" sort of feeling.

Additionally I'm not sure now that I think about it that this is the right place for this destructor. And actually now that I think for a moment I'm a bit confused about this design.

This design currently has a "registry" in an engine of memfds-per-module, but it seems like an alternative (and perhaps simpler?) design would be to store memfd information here in the Module itself. I see now that during the first instantiation of a module we create the memfd image if memfd is enabled, but it seems like that work would be best done at module-creation-time if memfd is enabled.

WIth per-module memfd information I think that would remove the need for a registry entirely?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

While I don't think this is really an issue this feels morally to me like the size here shouldn't include the guard size. I naively expect that the MemFdSlot is only responsible for the linear memory, and the guard pages are concerns of other layers of the stack.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

I commented below about error cases, but I think this is one error we're definitely not going to want to ignore. If the madvise fails for whatever reason I think that would definitely leak contents of memory between module instantiations which we don't want. On error here it seems like we should either panic to assert it's not actually happening or we discard the slot so the mappings are entirely redone on reuse.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

I understand the theoretical design behind this, but out of curiosity have you been able to benchmark the difference between ftruncate and just-map-anonymous-memory?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

FWIW you can avoid the "syntax salt" here with:
         if let Some(existing_image) = &self.image {

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

One possible simplification is that we could remove this field and have the extension file always mapped for the entire size of the wasm heap. That way we wouldn't need to do subtraction or anything in the set_heap_limit method and could instead simply forward to .set_len()?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Reading over this one concern I just had was "what happens if an error is returned and this flag isn't set?"

It looks like the MemFdSlot structure will be discarded and recreated the next time that it's requested (because MemFdSlot creation is lazy). To confirm, is that intended?

One possible idea I had is there could be a Drop for MemFdSlot which checks self.dirty and unmaps the information if it's there (and also setting self.dirty = true at the top of this function instead of at the end). I'm not sure if that really buys much benefit though.

In any case I was curious if you'd thought about the error case here and the impact it might have?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

I had a big comment about how this didn't seem to handle trailing zeros after the image, only leading zeros, which I ended up deleting because I believe that this does handle it and I just missed it in the documentation.

Could this explicitly call out the orderings the mmaps are done to indicate how overlapping regions are handled and the cow image will naturally be flanked by zero'd memory as necessary?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

If possible this is another place that I think would be best to share with the preexisting set_instance_memories method. It seems like this wants to share a fair bit of the logic and only tweaks things by acquiring a memfd slot, and that seems like it could be done in a more surgical #[cfg]'d fashion instead of duplicating the whole method.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Could this use memory_plans.iter() and ... I forget the method name but I'm sure we have one to convert from MemoryIndex to DefinedMemoryIndex ... to avoid using from_u32?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

That all sounds like really solid reasoning to me, thanks for the comment!

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Reading over this again I realize that this feels a bit out-of-sync with uffd because uffd also wants to skip memory initialization but this "skip the init" check is in a different location than the uffd one.

Could the check for uffd get moved over to https://github.com/bytecodealliance/wasmtime/blob/921fb6c73e4443ac534e9d4477d9f8cab318c166/crates/runtime/src/instance/allocator/pooling.rs#L1299-L1326 so "skip memory init" is only in one location?

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

Oh also one thing I forgot from earlier, I feel like this wants to do the same thing as uffd ("just call madvise") but it's done in a different fashion. I think that this location (which currently madvise's manually and then continues to skip further logic) would be a good place to try to better unify what uffd and memfd are doing. Ideally this could fall through to the shared call to decommit_memory_pages or something like that, or otherwise make it more clear that uffd and memfd are doing roughly similar things.

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

I actually think that having this conditionally-defined type would be useful for other things as well, it would remove all the #[cfg] below with something like:
impl MemFdSlot {
    fn has_image(&self) -> bool { match self {} }
    fn set_heap_limit(&mut self) -> Result<()> { match self {} }
}
(etc)

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

One other benefit of moving information out of the Engine is that it would avoid bouncing on a lock every time an instantiation is made which can be somewhat expensive

Wasmtime GitHub notifications bot (Jan 28 2022 at 16:00):

alexcrichton created PR review comment:

(I think this also means the change below to initialize_instance can be reverted as well)

Wasmtime GitHub notifications bot (Jan 28 2022 at 22:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 22:11):

cfallin created PR review comment:

Ah, this refactor is so much nicer. I did this for MemFdSlot and also the ModuleMemFds below (just a zero-sized type if memfd support is not included).

Wasmtime GitHub notifications bot (Jan 28 2022 at 22:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 22:11):

cfallin created PR review comment:

Good point -- there actually was no need for a registry system at all. I've made the wasmtime::Module own the memfd images. Thanks!

Wasmtime GitHub notifications bot (Jan 28 2022 at 22:25):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:17):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin created PR review comment:

Done -- extracted a common core between this and the on-demand allocator to build a raw Instance.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin created PR review comment:

Ah, yes, I'm using .values() here but the default .iter() on a PrimaryMap gives us an iterator that actually just gives the keys (the strong index type) directly. Much nicer.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:18):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:20):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:21):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:21):

cfallin created PR review comment:

That's my old-school (pre-2016 or something?) Rust showing... but then look who I'm talking to :-)

(I actually kinda like the explicitness and the lack of magic in the pattern-matching with precise matching &s and refs, but maybe my brain's internal typechecker is just weird...)

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:31):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:33):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:33):

cfallin created PR review comment:

That's a very good point. I think the safest answer to this is that if some mmap error occurs within the big mmap used by the pooling allocator for all memories, all bets are off and our only safe/secure option is to panic the process. Otherwise, unless we can carefully reason about why the remapping failed, we have a slot that is permanently unusable, so the system is degraded (slots die over time due to errors and we have no way of replacing them). And we have no way to return an error from an individual instantiation that means "I not only failed allocation but blew up the allocator... please throw everything away and start over". So, replaced with .expect(). If you have other ideas here though I'm very curious!

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:36):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:36):

cfallin created PR review comment:

I have not; avoided that due to the use of the global mmap lock by mprotect(), taking the fact that we should avoid that on received wisdom / experience from earlier pooling-allocator folks. It would be interesting to see exactly how much it matters for different workloads, and there is indeed a tradeoff (in that the CoW-of-memfd memory is slower, as we've discovered).

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:40):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:40):

cfallin created PR review comment:

As above, made this panic the process if it fails, since such a failure leaves the whole pooling allocator in an unknown state.

Re: uffd -- this gets to the bigger question of how much we can share, but: given that (i) just enough stuff differs in the details between what memfd does and uffd does, and (ii) uffd may go away if memfd works better, I'm somewhat hesitant to refactor to try to share more logic when it's not an easy/simple change. It starts to get into the territory of "clean up uffd", which feels out-of-scope for this PR. My experience with trying to share too much logic as well is that sometimes it becomes hard to see through all the conditionals, and it can actually be simpler to maintain when the two disjoint approaches are kept separate. (See e.g. Cranelift backends and separate instruction lowering rules, vs. big mess of conditional logic. Not exactly the same, but similar in spirit.)

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:41):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:41):

cfallin created PR review comment:

Yeah, I agree in principle; this is just following how the rest of the pooling allocator works. Could definitely be cleaned up in a followup.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:43):

cfallin created PR review comment:

We could, yeah; the reason I did it this way was to avoid potentially wasting resources. (The part of the file shadowed by the initial heap size will be a "hole", but there may still be a small resource cost in tracking per-page info or things like that.)

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:43):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:44):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:44):

cfallin created PR review comment:

As mentioned above, I'm a little wary of this PR becoming "let's refactor and clean up all of uffd while we're at it". If we have issues with uffd's design, perhaps we could save them for another PR?

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:44):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:44):

cfallin created PR review comment:

I unified this with the above, given the conditional-at-the-lowest-level strategy you outlined -- thanks a bunch for that, it's much cleaner now!

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:47):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:47):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:47):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 28 2022 at 23:50):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 29 2022 at 00:23):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 15:55):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 15:55):

alexcrichton created PR review comment:

Oh I'm happy to panic the process in these cases, I just wasn't sure what the intention was. I agree that panicking is the best option here.

Wasmtime GitHub notifications bot (Jan 31 2022 at 15:59):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 15:59):

alexcrichton created PR review comment:

Nah I think that makes sense to not clean up uffd while you're here, but this feels like a relatively easy thing to do in that there's already a location for uffd where we skip memory initialization, and intentionally not using that location (when I think it works right?) for memfd feels odd

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

I think the .expect can be removed here and the base check above removed in favor of moving the match to here?

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

If this is still necessary then I think this should be done at a lower level where it's more obvious which field causes the automatic inference to not work and a comment can say why this is needed.

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

This organization seems a bit odd where I would expect the "dummy" implementation and the real implementation to live roughly in the same area, similarly they would also theoretically mirror the dummy/real implementation of ModuleMemFds. Could there perhaps be crates/runtime/src/memfd.rs and crates/runtime/src/memfd_disabled.rs which have all the structs and all the dummy structs?

(unsure if this MemFdSlot needs to be defined in this file)

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

I think where possible it's best to replace this with something like match self {} which the compiler will allow since self is an empty enum and is a good way of showing that this is statically not reachable.

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

Well to some extent I don't really think this needs a major refactoring. While ideally we have MemoryKind variants or something like that what we have today in this PR is two Memory variants where one has fields which configures uffd or memfd if they're enabled. I think that this logic in this particular loop can still be structured as "here's an arm for each case" as opposed to "memfd is special and gets the early-continue". I was originally confused reading this loop because the loop always ends with madvise and it sort of seems like memfd doesn't end with madvise when it in fact does.

Wasmtime GitHub notifications bot (Jan 31 2022 at 16:23):

alexcrichton created PR review comment:

I feel like our error handling here is sort of half-measures in the sense that if this initialization fails then an abort happens but if Memory::new_static fails below it's bubbled up and handled. In an error below (or later for a memory afterwards) it seems like it would still result in dropping the MemFdSlot which itself doesn't do any cleanup and leaves the slot in a somewhat indeterminate state?

In some sense this may not be the end of the world really about having an indeterminate state because allocation of a memfd slot here always redoes the entire mapping for the entire memory, so if something is "leaked" it's only temporary in the sense that no wasm will be using that slot?

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:45):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:45):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:45):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:45):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin created PR review comment:

Ah, it no longer is, now that we're taking the more idiomatic pass-ownership-of-MemFdSlot approach. Removed!

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:46):

cfallin created PR review comment:

Ah, right, this wasn't nearly as much a refactor as I thought; sorry! Done.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:47):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:47):

cfallin created PR review comment:

Done! This cleans up things sigificantly; pooling.rs has almost no memfd-conditional stuff left.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:52):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 18:52):

cfallin created PR review comment:

That's a good point. On thinking about this further:

If instantiation bails out after we've initialized the MemFdSlot (which can happen, e.g., if the resource-limiter says "no") then it's most idiomatic to just drop the slot object and let the pool create a new one later.

The most important thing -- the only thing I can think of that would absolutely require us to panic the process -- is that we don't let some other mmap ASLR its way into a gap in the pooling allocator's region if we leave a hole somewhere during instantiation. But we already avoid that, to prevent exactly that race condition; so even if the MemFdSlot fails, say, the second of 3 mmaps, we are still covering/reserving the whole range.

Given that, I think it's fine to drop the MemFdSlot at any point.

For an extra layer of {sanity, avoiding resource leaks e.g. keeping a memfd image alive, defense-in-depth-don't-map-more-than-we-need}, it is still nice to try to remove a stale mapping. I've written a Drop handler for MemFdSlot that does this in a best-effort way: it tries to map anonymous memory with PROT_NONE over the whole slot. It uses NORESERVE to make this more likely to succeed. But if it doesn't, the error is explicitly ignored (with a comment in the source to describe why safe): nothing is supposed to touch the slot anyway until the next MemFdSlot is created for it, and if mapping something into the slot fails again later for the new MemFdSlot, then it has the proper avenues to bubble up the error at that time.

Does that seem reasonable?

Wasmtime GitHub notifications bot (Jan 31 2022 at 19:15):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 19:31):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 19:31):

cfallin created PR review comment:

I tried cleaning this up a bit, but it's really subtle and I'm getting lost in the corner cases of paged-vs-segment initialization mode, bulk memory and the need to trap at just the right point. In any case it's not a simple refactor. I can spend a few more hours on understanding this and unraveling it if we think we need it but it feels sort of pointless if we're possibly going to remove uffd soon :-)

Wasmtime GitHub notifications bot (Jan 31 2022 at 20:54):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 22:00):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton created PR review comment:

as a minor nit the traits in rustix seem like you could probably pass image.fd.as_file() below as the fd argument rather than creating a new BorrowedFd structure (although I haven't used rustix, just a hunch)

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton created PR review comment:

One thing I think we might want to do here is to have an explicit flag (off by default) which disables this destructor behavior and we could set the flag from the pool to skip this since it's all about to get unmapped anyway.

Otherwise though the difference between this destructor and instantiate got me thinking. I think that there's a bug right now that can leak the initial memory from one module to another? When a slot is reused all the memory is accessible but reset back to the previous image, but the new anonymous mapping for the new heap might be smaller than the previous image, and subsequent heap grows simply mprotect the preexisting pages into existence, which I think means that the previous module's memory image may leak into the next.

Otherwise though I'd naively expect that the destructor here shares common code with the instantiate step (when the hot path isn't taken) where there's a step to clear out all vma mappings with this map anonymous call (although I don't entirely know what the the difference is with an mmap that has NORESERVE vs PROT_NONE)

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton created PR review comment:

as a minor nit I was originally thinking that this and runtime/src/memfd.rs would be the same file, in further effort to reduce the number of #[cfg] necessary

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton created PR review comment:

Unlike the Static variant of memory I think we might be able to skip the storage here entirely (and also make the instantiate a litte faster below with fewer syscalls) since with a runtime memory all memfd really is is the initial mapping of the cow image overlaid on top of the underlying anonymous mapping. In that sense I think we can skip most of the logic of MemFdSlot and avoid storing redundant data.

Now that being said I think the implementation here is correct and will work well regardless. I think it'd be fine for a follow-up to clean this up a bit.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:18):

alexcrichton created PR review comment:

Given where this PR now is and the trajectory of this feature, I think this is good to have as an optional dependency but I think it should be enabled by default and probably just called memfd (enabled-by-default probably from the wasmtime crate, not here). Especially with this making such a drastric difference for the on-demand allocator it seems like using memfd is a no-brainer.

To make this tweak as well as improve the portability here (right now if you enable memfd on Windows it'd probably just fail to compile) I think a few small changes are all that's needed. To be clear though this is icing on the cake and it's totally fine to do this as a follow-up, but I think the steps would be:

Remove the memfd-allocator

Add a memfd feature to the wasmtime crate that forwards to wasmtime-runtime/memfd

Add memfd to the default feature set of the wasmtime crate

Add a crates/runtime/build.rs that detects that the target's os is Linux (by looking at CARGO_CFG_TARGET_OS) and that the memfd feature is active (by looking at CARGO_FEATURE_MEMFD), and if both are present then printing println!("cargo:rustc-cfg=memfd")

Change all cfg(feature = "memfd-allocator") in the crate to cfg(memfd)

With that I think we'll have on-by-default memfd for Linux-only, with the opt-in ability to turn it off. Again though it's fine to defer this to a future PR.

As I type all this out though I also would question maybe we don't even need a feature for this. I suppose one thing that could come up is that if you have a million modules in a process that's a million file descriptors which can blow kernel limits, but other than that I'm not sure why anyone would explicitly want to disable memfd.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:43):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 31 2022 at 23:43):

cfallin created PR review comment:

Ah, so I think this bug does not exist, because on instantiate, in the different-module case, we remap all memory from the start of heap up to static_size (in practice 6GiB) with anonymous zeroes (here) -- this will wipe all previous mappings. We then mprotect everything beyond the initial heap size to PROT_NONE.

Re: the other bits (sharing clearing logic) that's a good point though and I'll clean this up a bit!

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:07):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:07):

cfallin created PR review comment:

... and, actually there was a bug, but a bit more subtle: the initial anonymous mmap was surrounded by if initial_size > 0 rather than a hypothetical if static_size > 0. The former may be false if the memory starts with zero size then grows; the latter is always true in any valid config (so we can make it unconditional). Thanks for poking at this!

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin created PR review comment:

I played with this a bit, and it's a good thought! I think though it may be safer to go through the MemFdSlot for heap growth specifically in case we choose to implement some other technique (such as ftruncate on a second memfd) in the future. At least if I'm reading your suggestion right, otherwise, we'd use the MemFdSlot's instantiate logic to set up the actual CoW mapping then throw it away (without its dtor running) and rely on the implicit alignment of the mprotect-growth strategy between memfd and the usual dynamic memory.

Re: syscalls on instantiate, we could indeed avoid one if we give the MemFdSlot a "no fixed mapping" mode, but then it takes responsibility for the pre-guard too, which complicates the pooling implementation somewhat. I'll think a bit more about this though.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin created PR review comment:

Good call, thanks!

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:11):

cfallin created PR review comment:

Fixed!

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:40):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:41):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 00:41):

cfallin created PR review comment:

Yes, good point; I've moved everything into one memfd.rs in runtime/src/ (and memfd_disabled.rs alongside it).

Wasmtime GitHub notifications bot (Feb 01 2022 at 01:05):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 01:07):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 01:07):

cfallin created PR review comment:

I've gone ahead and done all of this; the default build now will use memfd, validated by building the wasmtime binary and strace'ing an execution. Waiting for CI to see if moving memfd out of the target-only deps into the main deps will work on non-Linux, but I am hoping it will (the crate's got a toplevel #![cfg(target_os ...)] that should make it a no-op on other platforms...?).

Wasmtime GitHub notifications bot (Feb 01 2022 at 01:13):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 05:26):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 06:12):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 06:39):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

Since there's a pooling-allocator feature below I think that this and memfd can probably be removed from the feature set here, but adding pooling-allocator to the default set of features seems pretty reasonable to me.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

One of my worries is that there's a fair bit of tricky state with handling linear memories and having that duplicated across a few locations I think is not ideal. The MemFdSlot tries to take over a fair bit of the management of the memory which is also tracked by the runtime Memory. We discussed this a bit in person as well, but to reiterate here I think it might be good to investigate a bit to figure out whether some functions below could grow an extra parameter or two instead of storing the state internally so it could be passed in by the "source of truth".

I don't have a great idea though for whether this is possible, so I'll leave it up to your discretion about whether it ends up working out or not.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

Since memfd can be used without the pooling allocator now is this condition still necessary? (other than disabling memfd when uffd is enabled)

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

Out of curiosity with the current #[cfg] system is this still necessary? With memfd on-by-default now this otherwise runs the risk of suppressing, by default, dead code warnings. (I think though that with the current organization the warnings here should be much less and/or easier to deal with)

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

I think that this path may still perform a few extra syscalls that aren't necesary, and in local benchmarking this isn't super-significant in the profile but it still seems nontrivial (~5% ish).

The MemFdSlot::instantiate method will mmap the whole memory as read/write, overwriting the Mmap::accessible_reserved above followed by mmap.make_accessible(). The mmap for the cow image is then done, finally followed by an mprotect to make everything past the end inaccessible. I think that the only mmap needed in this case, without replacing the above calls to Mmap, is the one for the cow image.

This is probably still faster than what we have today, though, so if you'd prefer this can get deferred to later.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

It's guaranteed that both offset and len are multiples of page sizes, right? (just confirming)

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

Random thought that just occured to me, since we're doing fd equality matching here do we actually still need a compiled module id? I originally thought that was going to be used for this but now that I look at this again I see it not being used. I forget though if the compiled module id is actually used for anything else at this point in time though.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

I think this could actually use .get_mut() to bypass the .lock() since we've got mutable access to the contents at this point.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

I think this is fine to be a safe function since this isn't related to Rust's memory safety and is more of a defense-in-depth mechanism.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:05):

alexcrichton created PR review comment:

I think pooling-allocator can be dropped here?

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:35):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:35):

cfallin created PR review comment:

It's used in the followup affinity PR (#3738) as otherwise there's an ID-reuse problem... strictly speaking that's just a heuristic so we could use fds there too and accept a false affinity sometimes. IMHO they're pretty cheap to maintain so maybe it's ok...?

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:45):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 20:45):

alexcrichton created PR review comment:

Ah right, and nah yeah let's leave them in. Only worthwhile if they could be removed entirely, but using for affinity makes sense!

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:18):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:18):

cfallin created PR review comment:

Ah, these were necessary to actually see commandline wasmtime use memfd I think -- the default-features = false otherwise turns everything off. (If there's a better way to arrange this then let me know -- I'm actually not sure I fully grok why default-features = false is right for the CLI tool)

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:49):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:50):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:50):

cfallin created PR review comment:

Good point, no longer needed! Removed.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin created PR review comment:

Ah, I wasn't aware of that bit of Mutex's API, thanks!

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin created PR review comment:

Seems not to be, removed!

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin created PR review comment:

Yes; added doc-comment notes to make this explicit and asserts where the MemoryMemFd is built. Thanks!

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin created PR review comment:

Indeed!

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:51):

cfallin created PR review comment:

Good point, changed.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:54):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 01 2022 at 23:54):

cfallin created PR review comment:

I've updated this so that the extra mmap no longer occurs. The basic idea is actually a slightly more general optimization: if there is no image (this can happen any time a memory with no data segments uses a MemFdSlot), then the whole slot is just the anon-zeroes mmap, and we don't need to wipe it clear with a new one. This is true when the MemFdSlot is first created as well: it has an image of None and we require the caller to have created a backing mmap already (as both the dynamic linear memory and the pooling allocator do).

Wasmtime GitHub notifications bot (Feb 02 2022 at 00:37):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton created PR review comment:

Now that I think about it, mind adding some debug asserts here that image.offset + image.len fits within initial_size_bytes?

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton created PR review comment:

One thought I had reading over this, is this strictly necessary? For example if we didn't track initial_size and instead only tracked cur_size then we could leave the entire mapping as zero here and on reuse we'd hit the if initial_size_bytes < self.initial_size case (although afterwards it'd be self.cur_size).

I don't think that'd actually result in any fewer syscalls happening in any cases though.

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton created PR review comment:

I think now the prot flag is always ProtFlags::empty() so this could perhaps be renamed to reset_with_anon_memory or something like that? (with no arguments)

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton created PR review comment:

I think this should be self.initial_size - initial_size_bytes, right?

In that this is just making the previously-larger heap inaccessible after the base?

Wasmtime GitHub notifications bot (Feb 02 2022 at 15:54):

alexcrichton created PR review comment:

Oh actually this may not work out, I can see how a grown heap would then, on resue, re-mmap the CoW overlay.

Wasmtime GitHub notifications bot (Feb 02 2022 at 17:39):

acfoltzer requested acfoltzer for a review on PR #3697.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin created PR review comment:

I think it's correct, but it's pretty subtle so I added some more comments with a figure showing the before-and-after mmap state. Basically we have 0..self.initial_size accessible, and we want only 0..initial_size_bytes (the new, smaller size) accessible, so we make inaccessible the range initial_size_bytes..self.initial_size.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin created PR review comment:

Done! (And I made it a release assert rather than debug-only, as this feels important enough and we use release asserts elsewhere in this file)

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:03):

cfallin created PR review comment:

Done!

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:07):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:07):

alexcrichton created PR review comment:

But this parameter is the length parameter, not an end, so isn't the length of what we're changing to PROT_NONE self.initial_size - initial_size_bytes since the new heap is smaller?

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:08):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:08):

alexcrichton created PR review comment:

Something I just realized for this case, I think that self.image should be set to None if we don't take this path through the if, right?

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:25):

cfallin created PR review comment:

Eek, indeed. This "oh wait one more case" on the core code is making me nervous enough that I'm gonna go figure out how to test this more thoroughly (so far it's been just "every other test uses this as a backend").

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:30):

alexcrichton created PR review comment:

Internally it might be a bit more robust to use Range<usize> instead of start: usize, len: usize perhaps, but otherwise I don't think this was a functionality bug since it's more that it could theoretically trip the preexisting assert in setting page protections.

Wasmtime GitHub notifications bot (Feb 02 2022 at 18:30):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:30):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:30):

cfallin created PR review comment:

Ah, it actually can lead to incorrect behavior, or at least I managed to hit it with the tests I just added: if we don't clear image when we instantiate with no image, then the stale state remains, and on the next instantiation we may falsely think that we have this image mapped in when we don't (anymore).

In any case, fixed and tests added; thank you for finding this!

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:33):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:39):

alexcrichton created PR review comment:

oh oops I thought I was responding to https://github.com/bytecodealliance/wasmtime/pull/3697#discussion_r797879911 here, but I clearly am not, so disregard that last comment here!

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:39):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:41):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:42):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Feb 02 2022 at 19:42):

cfallin created PR review comment:

Ah yes, I see now; pushed an update that modifies set_protection to use Range instead and this is indeed much clearer!

Wasmtime GitHub notifications bot (Feb 02 2022 at 20:25):

cfallin updated PR #3697 from memfd-cow to main.

Wasmtime GitHub notifications bot (Feb 02 2022 at 21:04):

cfallin merged PR #3697.

Last updated: Apr 18 2025 at 13:08 UTC