Stream: general

Topic: CoW for High Initialization Cost Modules


view this post on Zulip Friday More (Dec 17 2024 at 19:29):

Hello - my team is evaluating Wasmtime for a project and came across the PoolingAllocation / CoW features. We want to see if this can be helpful for our use case. We have a Wasm module with two exported functions. The first function is used for initial setup and is a very heavy operation which uses a large amount of memory. The second function is an extremely light operation which uses data structures set up by the first function (without modifying any of the data structures). We also want to prevent state accumulation in calls to the second function and plan to use this in a high QPS setting with multiple threads.

We currently have a proof of concept built with WAMR where we run the first function and then fork from that process to run the second function. This effectively achieves our use cases (run the heavy function once and reuse the data structures without paying the copying or memory cost because of CoW). Since it's a new fork each time, we are preventing state accumulation by resetting to a good state. The negative is the overhead of process cloning.

Wasmtime's PoolingAllocation / CoW seem like a great alternative to this. A few things that I am not very clear about:
Can I use Wasmtime to execute a heavy initialization function and then have that be the basis on which each new module instance is created when using Pooling Allocation with CoW? Does that kind of granular control exist?

Thank you so much!

view this post on Zulip Alex Crichton (Dec 17 2024 at 20:08):

Hello! Wasmtime's pooling allocator and CoW support are about accelerating and optimizing the instantiation of a wasm module, but they don't modify the lifetime of the instance itself, and this support isn't related to, for example, taking a snapshot of a running instance and copying it.

That being said, one architecture which might work for you is to first run your module through wizer, using the heavy function as the initializer. This produces a wasm file which is in effect a snapshot. Wasmtime can then efficiently run that second, "wizened" file with the pooling allocator and CoW.

bytecodealliance/wizer (https://github.com/bytecodealliance/wizer): The WebAssembly Pre-Initializer.
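
A minimal guest-side sketch of that flow, assuming a Rust-to-wasm module and Wizer's default `wizer.initialize` entry point (the `LOOKUP` data structure and `query` export are made-up names for illustration, not taken from the discussion above):

```rust
// Guest crate compiled to a wasm target (e.g. wasm32-wasip1).
// By default, wizer looks for an exported `wizer.initialize` function,
// runs it once, and snapshots the resulting memory into a new wasm file.

use std::sync::OnceLock;

// Hypothetical heavy, read-only data structure built during pre-initialization.
static LOOKUP: OnceLock<Vec<u64>> = OnceLock::new();

#[export_name = "wizer.initialize"]
pub extern "C" fn init() {
    // Expensive, memory-heavy setup runs once, at wizening time.
    let _ = LOOKUP.set((0..10_000_000u64).collect());
}

#[no_mangle]
pub extern "C" fn query(i: u32) -> u64 {
    // Cheap read-only call served from the snapshotted state.
    LOOKUP.get().map(|v| v[i as usize]).unwrap_or(0)
}
```

The pre-initialization step itself would then be a one-time command along the lines of `wizer my_module.wasm -o wizened.wasm` (see the wizer README for the exact flags), and the wizened file is what Wasmtime loads afterwards.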

view this post on Zulip Friday More (Dec 17 2024 at 20:10):

That's really cool, I'll take a look at Wizer. Thank you!

view this post on Zulip Friday More (Dec 17 2024 at 21:27):

This may be a question for another group, but does Wizer allow input args for initialization (e.g. if I need a block of memory used in initialization)? Or do I need to enable Wasi and have the initialization method fetch it?

view this post on Zulip Alex Crichton (Dec 17 2024 at 21:34):

@fitzgen (he/him) might be able to help with the particulars of wizer (it should be possible one way or another though)

view this post on Zulip fitzgen (he/him) (Dec 17 2024 at 21:40):

On phone, but: while the initialization function's signature is fixed, you can provide WASI access and read from disk or env vars, or you can even provide a custom linker and define your own import functions that are available at init time.
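
To illustrate the custom-import route, a guest-side sketch might look like the following; the `host` module and `get_init_config` import are hypothetical names for illustration (the host side would have to supply them through Wizer's custom-linker support, or you could read the data via WASI instead, as noted above):

```rust
// Hypothetical host import, only available while wizer runs the init function.
#[link(wasm_import_module = "host")]
extern "C" {
    // Fills `ptr` with up to `len` bytes of configuration, returns bytes written.
    fn get_init_config(ptr: *mut u8, len: usize) -> usize;
}

#[export_name = "wizer.initialize"]
pub extern "C" fn init() {
    let mut buf = vec![0u8; 64 * 1024];
    let n = unsafe { get_init_config(buf.as_mut_ptr(), buf.len()) };
    buf.truncate(n);
    // ... build the heavy, read-only data structures from `buf` here ...
}
```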

view this post on Zulip Friday More (Dec 17 2024 at 22:06):

Thank you

view this post on Zulip Friday More (Dec 23 2024 at 23:46):

Happy Monday! I had a couple of follow-up questions:

  1. I am a little confused about the CoW feature (https://docs.wasmtime.dev/api/wasmtime/struct.Config.html#method.memory_init_cow) and its relation to pooling allocation (https://docs.wasmtime.dev/api/wasmtime/struct.PoolingAllocationConfig.html). It seems like the CoW feature is enabled by default and we can benefit from it even when using the OnDemand strategy?
  2. Suppose our application only uses a single module with multiple module instances (e.g., for serving concurrent requests), can we have a pooling allocation setup where all instances (in separate "slots") share the same backing memory, which is CoW-ed?

Thank you again!

view this post on Zulip Friday More (Dec 26 2024 at 07:23):

After some experiments, it seems like the answer to #1 is "yes". I am able to see a dramatic speedup when recreating memory-heavy module instances after paying the cost once. Is this something deterministic that we can rely on? Are the CoW semantics cross-thread or per-thread? Thank you

view this post on Zulip Alex Crichton (Dec 26 2024 at 16:48):

For (1), yes, CoW is enabled by default and is on regardless of the allocation strategy (on-demand or pooling). For (2), I believe that will be achieved by having a single Module which you instantiate in multiple stores.
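
A host-side sketch of that setup, assuming the wizened module has no remaining imports and exports a cheap `query` function (both assumptions are for illustration only; the pooling limit names vary across wasmtime versions, so check the docs for yours):

```rust
use wasmtime::*;

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // CoW-backed memory initialization is on by default; set explicitly here for clarity.
    config.memory_init_cow(true);

    // Opt into the pooling allocator; treat these limits as placeholders to tune.
    let mut pooling = PoolingAllocationConfig::default();
    pooling.total_core_instances(1_000);
    pooling.total_memories(1_000);
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pooling));

    let engine = Engine::new(&config)?;

    // Compile the wizened module once; each instantiation below reuses its
    // CoW memory image instead of re-running the heavy initialization.
    let module = Module::from_file(&engine, "wizened.wasm")?;

    // Per request: a fresh Store + Instance, so no state accumulates between calls.
    for i in 0..3u32 {
        let mut store = Store::new(&engine, ());
        let instance = Instance::new(&mut store, &module, &[])?;
        let query = instance.get_typed_func::<u32, u64>(&mut store, "query")?;
        println!("query({i}) = {}", query.call(&mut store, i)?);
    }
    Ok(())
}
```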

view this post on Zulip Alex Crichton (Dec 26 2024 at 16:48):

You can rely on CoW happening, yes (if it's enabled), modulo some limits; if it happens once it'll always happen. I'm not sure what your question about threading is, but if it compiles in Rust then it's supposed to work.
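
On the threading side, an Engine and a Module are cheaply cloneable and Send + Sync, so the usual pattern is one shared Module with a Store per thread or per request; a sketch, reusing the hypothetical `wizened.wasm` / `query` export from above:

```rust
use std::thread;
use wasmtime::*;

// Assumed to be called with an Engine/Module built as in the earlier sketch.
fn serve(engine: Engine, module: Module) -> anyhow::Result<()> {
    let handles: Vec<_> = (0..4u32)
        .map(|t| {
            // Cloning Engine/Module is cheap (reference-counted); every worker
            // instantiates from the same CoW-initialized memory image.
            let engine = engine.clone();
            let module = module.clone();
            thread::spawn(move || -> anyhow::Result<u64> {
                let mut store = Store::new(&engine, ());
                let instance = Instance::new(&mut store, &module, &[])?;
                let query = instance.get_typed_func::<u32, u64>(&mut store, "query")?;
                query.call(&mut store, t)
            })
        })
        .collect();

    for h in handles {
        println!("{}", h.join().expect("worker panicked")?);
    }
    Ok(())
}
```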

