Hello - my team is evaluating Wasmtime for a project and came across the PoolingAllocation / CoW features. We want to see if this can be helpful for our use case. We have a Wasm module with two exported functions. The first function is used for initial setup and is a very heavy operation which uses a large amount of memory. The second function is an extremely light operation which uses data structures set up by the first function (without modifying any of the data structures). We also want to prevent state accumulation in calls to the second function and plan to use this in a high QPS setting with multiple threads.
We currently have a proof of concept built with WAMR where we run the first function and then fork from that process to run the second function. This effectively achieves our use cases (run the heavy function once and reuse the data structures without paying the copying or memory cost because of CoW). Since it's a new fork each time, we are preventing state accumulation by resetting to a good state. The negative is the overhead of process cloning.
Wasmtime's PoolingAllocation / CoW seem like a great alternative to this. A few things that I am not very clear about:
Can I use Wasmtime to execute a heavy initialization function and then have that be the basis on which each new module instance is created when using Pooling Allocation with CoW? Does that kind of granular control exist?
Thank you so much!
Hello! Wasmtime's pooling allocator and CoW support are about accelerating and optimizing the instantiation of a wasm module. They don't change the lifetime of the instance itself, and they aren't related to, for example, taking a snapshot of a running instance and copying it.
That being said, one architecture which might work for you is to first run your module through Wizer, with the heavy function as the initialization step. This produces a wasm file which is in effect a snapshot. Wasmtime can then run that second "wizened" file efficiently with the pooling allocator and CoW.
That's really cool, I'll take a look at Wizer. Thank you!
This may be a question for another group, but does Wizer allow input args for initialization (e.g. if I need a block of memory used in initialization)? Or do I need to enable WASI and have the initialization method fetch it?
@fitzgen (he/him) might be able to help with the particulars of wizer (it should be possible one way or another though)
On phone, but: while the initialization function’s signature is fixed, you can provide WASI access and read from disk or env vars, or you can even provide a custom linker and define your own import functions that are available at init time.
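A minimal guest-side sketch of the WASI route described above, assuming the module is written in Rust and built for a WASI target. The environment variable `TABLE_SIZE`, the `build_table` helper, and the `lookup` export are hypothetical names chosen for illustration. The export name `wizer.initialize` is the one Wizer looks for by default. Wizer would need WASI enabled and the environment made visible to the guest for the variable to be readable at init time.

```rust
use std::env;
use std::sync::OnceLock;

// Data built once at snapshot time and only read afterwards.
static TABLE: OnceLock<Vec<u64>> = OnceLock::new();

// Hypothetical stand-in for the heavy setup: a table of squares.
fn build_table(size: usize) -> Vec<u64> {
    (0..size as u64).map(|i| i * i).collect()
}

// Wizer's initialization export: no parameters, no results, so any
// input has to arrive through WASI (here: an environment variable).
// Note: under Rust edition 2024 this attribute is written
// `#[unsafe(export_name = "wizer.initialize")]`.
#[export_name = "wizer.initialize"]
pub extern "C" fn init() {
    // Fall back to 1024 entries if TABLE_SIZE is unset or not a number.
    let size = env::var("TABLE_SIZE")
        .ok()
        .and_then(|s| s.parse::<usize>().ok())
        .unwrap_or(1024);
    // Ignore the error if the table was already set.
    let _ = TABLE.set(build_table(size));
}

// The light function: reads the prebuilt table and never mutates it.
// Returns 0 if the table is missing or the index is out of range.
#[export_name = "lookup"]
pub extern "C" fn lookup(index: u32) -> u64 {
    TABLE
        .get()
        .and_then(|table| table.get(index as usize))
        .copied()
        .unwrap_or(0)
}
```

After Wizer runs `init`, the populated table is part of the snapshotted memory, so instances of the resulting wasm file can call `lookup` without repeating the setup.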
Thank you
Happy Monday! I had a couple follow up questions:
Thank you again!
After some experiments, it seems like the answer to #1 is "yes". I am able to see a dramatic speedup when recreating memory-heavy module instances after paying the cost once. Is this something deterministic that we can rely on? Are the CoW semantics cross-thread or per-thread? Thank you
For (1), yes: CoW is enabled by default and is on regardless of the allocation strategy (on-demand or pooling). For (2), I believe that will be achieved by having a single Module which you instantiate in multiple stores.
You can rely on CoW happening, yes (if it's enabled), modulo some limits, but if it happens once it'll always happen. I'm not sure what your question is about threading, but if it compiles in Rust then it's supposed to work.
Last updated: Jan 24 2025 at 00:11 UTC