cfallin opened issue #3758:
Right now, the memfd-based copy-on-write mechanism for Wasm memories builds the CoW base image on the fly, when the module is first loaded/compiled. This "memfd" then lives just within the process as an anonymous file held open by an fd.
However, some applications of wasmtime separate out the compilation and runtime phases, such that the compilation emits a .cwasm file on disk that the runtime phase later loads. This raises the question: a .cwasm file is also a file, which can be held open by an fd and mmap'd. Why not just mmap the images for the memory/ies straight from the file?
This would mean that we not only would lazily populate the mapping after each repeated instantiate, but we would lazily page the data in from disk on the first instantiate. The virtual-memory subsystem could also throw away the pages (the original file pages would always remain clean in the page cache) under heavy memory pressure, since they have a disk backing. In contrast, a memfd would need to be swapped out to vacate the space.
Thus, doing this would both lower load-to-first-instantiate latency and reduce memory pressure on systems with many modules loaded.
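The idea of mapping a memory image straight out of an on-disk artifact can be sketched roughly as follows. This is a minimal illustration in Python rather than wasmtime's actual code: the file layout (a padded header followed by the image), the file name module.bin, and the helpers write_fake_artifact and map_image_cow are all assumptions made for the example. mmap.ACCESS_COPY requests a private, copy-on-write mapping, so writes land in private pages and the file itself stays clean.

```python
import mmap
import os
import tempfile

# mmap file offsets must be a multiple of this granularity.
GRANULE = mmap.ALLOCATIONGRANULARITY


def write_fake_artifact(path, image):
    """Write a stand-in artifact: a padded header, then the memory image.

    The layout is hypothetical; it only ensures the image starts at an
    offset that mmap accepts. Returns the file offset of the image.
    """
    with open(path, "wb") as f:
        f.write(b"HDR".ljust(GRANULE, b"\0"))
        f.write(image)
    return GRANULE


def map_image_cow(path, offset, length):
    """Map the image region privately (copy-on-write) from the file."""
    with open(path, "rb") as f:
        # The mapping keeps its own duplicate of the descriptor, so the
        # file object can be closed right away.
        return mmap.mmap(f.fileno(), length,
                         access=mmap.ACCESS_COPY, offset=offset)


def demo():
    image = b"initial heap".ljust(GRANULE, b"\0")
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "module.bin")
        offset = write_fake_artifact(path, image)
        mem = map_image_cow(path, offset, len(image))
        before = bytes(mem[:12])
        mem[:7] = b"MUTATED"  # private write; the file is not modified
        after = bytes(mem[:12])
        mem.close()
        with open(path, "rb") as f:
            f.seek(offset)
            on_disk = f.read(12)
    return before, after, on_disk
```

Calling demo() returns the bytes seen through the mapping before and after the write, plus the bytes still on disk. The on-disk bytes match the original image, which is the property that lets the kernel drop clean file pages instead of swapping them.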
As a final advantage, though memfd_create is a Linux-specific syscall, opening a file for an fd is something every Unix can do. If we also replace our use of madvise(MADV_DONTNEED) with a fresh-mmap-on-every-instantiate, we could use this scheme on macOS too (and maybe Windows, but I know less about the semantics of holding files open there).
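A rough sketch of the fresh-mmap-on-every-instantiate variant, again in Python and under stated assumptions: the instantiate helper and the temporary image file are hypothetical stand-ins, not wasmtime APIs. Each call creates a new private mapping of the same file, and closing the previous mapping discards its dirty pages, so the reset relies only on mapping and unmapping rather than on madvise.

```python
import mmap
import tempfile


def instantiate(image_file, length):
    """Return a fresh private (copy-on-write) mapping of the image.

    Hypothetical helper: a real runtime would map into a reserved
    address range instead of letting the OS pick one.
    """
    return mmap.mmap(image_file.fileno(), length, access=mmap.ACCESS_COPY)


def run_twice():
    image = b"\x01\x02\x03\x04" * 1024
    with tempfile.TemporaryFile() as f:
        f.write(image)
        f.flush()

        first = instantiate(f, len(image))
        first[0] = 0xFF        # dirty the first instance's memory
        dirty = first[0]
        first.close()          # unmapping drops the private dirty pages

        second = instantiate(f, len(image))
        pristine = second[0]   # the new mapping sees the original image
        second.close()
    return dirty, pristine
```

Here run_twice() returns the modified byte from the first instance and the untouched byte from the second. The cost of this approach is an extra map and unmap per instantiation, which is the trade-off against resetting an existing mapping in place.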
cfallin commented on issue #3758:
The main functional change(s) this would require would be to:
- create a ready-to-use image of each heap memory with initializers in the .cwasm. This is basically like an ELF binary's image of .data/.rodata. Right now I think we have this in a condensed form (list of initializers with offsets) instead;
- possibly tweak the API to pass an open File into a new constructor for Module.
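The first change, turning a condensed list of initializers with offsets into a ready-to-use image, might look something like the sketch below. The build_image function and its (offset, bytes) input shape are assumptions for illustration and do not reflect the actual .cwasm format; the only fixed fact used is the 64 KiB WebAssembly page size.

```python
PAGE_SIZE = 65536  # WebAssembly linear-memory page size (64 KiB)


def build_image(initializers, min_pages):
    """Flatten (offset, data) initializers into one zero-filled image.

    `initializers` is a list of (byte offset, bytes) pairs and
    `min_pages` is the memory's initial size in Wasm pages. Later
    entries overwrite earlier ones where they overlap.
    """
    image = bytearray(min_pages * PAGE_SIZE)
    for offset, data in initializers:
        end = offset + len(data)
        if offset < 0 or end > len(image):
            raise ValueError("initializer does not fit in initial memory")
        image[offset:end] = data
    return bytes(image)
```

For example, build_image([(8, b"abc"), (65536, b"\xff")], 2) yields a 131072-byte image that is zero everywhere except at those two offsets. A real implementation would probably avoid storing long runs of trailing zeros, but that refinement is left out here.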
alexcrichton closed issue #3758:
Last updated: Jan 24 2025 at 00:11 UTC