wasmtime / issue #3758 Pooling allocator / memfd: do CoW ... · git-wasmtime

Wasmtime GitHub notifications bot (Feb 02 2022 at 21:21):

Right now, the memfd-based copy-on-write mechanism for Wasm memories builds the CoW base image on the fly, when the module is first loaded/compiled. This "memfd" then lives just within the process as an anonymous file held open by an fd.

However, some applications of wasmtime separate out the compilation and runtime phases, such that the compilation emits a .cwasm file on disk that the runtime phase later loads. This raises the question: a .cwasm file is also a file, which can be held open by an fd and mmap'd. Why not just mmap the images for the memory/ies straight from the file?

This would mean that we would not only lazily populate the mapping on each repeated instantiation, but also lazily page the data in from disk on the first instantiation. Under heavy memory pressure, the virtual-memory subsystem could also simply drop the pages, since they are backed by the file on disk (the original file pages always remain clean in the page cache). In contrast, memfd pages have no disk backing, so they would need to be swapped out to free the space.

Thus, doing this would improve load-to-first-instantiate latency and also reduce memory pressure on systems with many modules loaded.
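
To make the idea concrete, here is a minimal sketch of mapping an image copy-on-write straight out of a file at a page-aligned offset. It uses Python's mmap module purely for illustration; the file layout (a one-page header followed by the image), the file name, and the helper names write_artifact, map_image and demo_private_mapping are all hypothetical and are not the .cwasm format or any wasmtime API.

```python
import mmap
import os
import tempfile

# Mapping offsets must be a multiple of this value.
PAGE = mmap.ALLOCATIONGRANULARITY


def write_artifact(path, image):
    """Write a stand-in artifact: one page of header, then the memory image."""
    with open(path, "wb") as f:
        # Hypothetical header, padded so that the image starts page-aligned.
        f.write(b"HDR".ljust(PAGE, b"\0"))
        f.write(image)
    return PAGE  # file offset at which the image starts


def map_image(path, offset, length):
    """Map the image region copy-on-write from the file."""
    with open(path, "rb") as f:
        # ACCESS_COPY requests a private mapping: writes go to private
        # pages in memory and are never written back to the file.
        return mmap.mmap(f.fileno(), length, access=mmap.ACCESS_COPY, offset=offset)


def demo_private_mapping():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "module.bin")
        image = b"hello wasm heap".ljust(2 * PAGE, b"\0")
        offset = write_artifact(path, image)

        mem = map_image(path, offset, len(image))
        mem[0:5] = b"HELLO"          # dirties a private copy of the first page
        in_memory = bytes(mem[0:5])
        mem.close()

        with open(path, "rb") as f:  # the on-disk image is unchanged
            f.seek(offset)
            on_disk = f.read(5)
    return in_memory, on_disk
```

The untouched second page of the image is never copied; it only needs to be read from the file if something accesses it.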

As a final advantage, though memfd_create is a Linux-specific syscall, opening a file to obtain an fd is something every Unix can do. If we also replace our use of madvise(MADV_DONTNEED) with a fresh mmap on every instantiate, we could use this scheme on macOS too (and maybe Windows, but I know less about the semantics of holding files open there).
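
A rough sketch of the "fresh mmap on every instantiate" variant, again in Python for illustration only: each instantiation gets a new private mapping of the same file, so dirty pages from the previous instance are discarded by dropping the old mapping rather than by madvise. The names instantiate and demo_fresh_mapping are hypothetical. A real pooling allocator would presumably map over a fixed slot in its address space instead of creating an unrelated mapping, which Python's mmap module does not expose.

```python
import mmap
import tempfile


def instantiate(fd, length):
    """Return a fresh copy-on-write view of the image stored behind fd."""
    return mmap.mmap(fd, length, access=mmap.ACCESS_COPY)


def demo_fresh_mapping():
    image = b"pristine".ljust(mmap.PAGESIZE, b"\0")
    with tempfile.TemporaryFile() as f:
        f.write(image)
        f.flush()

        first = instantiate(f.fileno(), len(image))
        first[0:8] = b"scribble"      # the first instance dirties its memory
        dirty = bytes(first[0:8])
        first.close()                 # private dirty pages go away with the mapping

        second = instantiate(f.fileno(), len(image))
        clean = bytes(second[0:8])    # the next instance sees the original image
        second.close()
    return dirty, clean
```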

Wasmtime GitHub notifications bot (Feb 02 2022 at 21:24):

cfallin commented on issue #3758:

The main functional change(s) this would require would be to:

- Create a ready-to-use image of each heap memory, with the initializers already applied, in the .cwasm. This is basically like an ELF binary's image of .data / .rodata. Right now I think we have this in a condensed form (a list of initializers with offsets) instead.

- Possibly tweak the API to pass an open File into a new constructor for Module.
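
The first change amounts to flattening the condensed form into a contiguous image at compile time. A minimal sketch, assuming the initializers are available as (offset, data) pairs and that the image is rounded up to whole 64 KiB Wasm pages; build_image is a hypothetical helper, not wasmtime's actual representation:

```python
def build_image(initializers, page_size=65536):
    """Flatten (offset, data) initializers into one zero-filled image."""
    # The image must cover the end of the furthest-reaching initializer.
    end = max((off + len(data) for off, data in initializers), default=0)
    # Round up to a whole number of pages so the image can be mapped directly.
    size = -(-end // page_size) * page_size
    image = bytearray(size)
    # Apply initializers in order, so later ones overwrite earlier ones.
    for off, data in initializers:
        image[off:off + len(data)] = data
    return bytes(image)
```

For example, build_image([(4, b"abc"), (10, b"xy")], page_size=16) yields a 16-byte image with zeros everywhere except the two initialized ranges.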

Wasmtime GitHub notifications bot (Feb 10 2022 at 21:40):

alexcrichton closed issue #3758.

Last updated: Apr 18 2025 at 19:03 UTC