Hello, I am working on a system with the goal of allowing WASM module instances to be suspended, their state saved and persisted to disk, and then resumed at a later stage.
My understanding is that in theory something like this should be possible to support. In theory, a WASM program is simply its memory and execution state (code, globals, imports, exports, stack, heap allocations, registers, program counter, etc.) - so it should be possible to serialize and deserialize all of this data into instances...
How difficult would it be to implement something like this?
However, I also suspect that compiled bytecode, and potentially updates to engine behavior, could throw things out of alignment and create all kinds of bugs if such a thing was implemented naively...
Because ideally, we'd want these types of snapshots to Just Work across different engine runtime versions too (they are just WASM modules... after all)
I guess what I am proposing is a universal serializable wasm instance snapshot interface that runtime engines would know how to both produce and resume from
If you're looking for engine portability then you probably want to serialize to wasm. This is what Wasmtime's Wizer does, for example. The caveat there is that wasm has no way of representing a stack-in-progress or nonzero values of locals. Serializing wasm-on-the-stack has no portable representation currently, and Wasmtime also has no support for that.
Supporting serializing wasm-on-the-stack has come up numerous times in the past with Wasmtime, but it's an extremely difficult feature to support. At this time we don't have plans to support it, but you're welcome to search around on Zulip/GitHub to see the past discussions.
@Alex Crichton This is something that I wish to pursue the development of out of business necessity. Would you be willing to help guide me in the right direction and address certain questions I've had?
One idea I had was something along these lines
Hey @Albert Marashi have you taken a look at some of the issues already about this feature?
https://github.com/bytecodealliance/wasmtime/issues/3017
https://github.com/bytecodealliance/wasmtime/issues/4002
I'd recommend taking in some of that context before going about this -- you'll see a discussion in the most recent issue similar to what just transpired here. Even the older issue has lots of discussion about this problem.
That said -- given that the code for wasmtime and relevant tooling is open source, there is nothing stopping you from attempting a solution that matches the approach you think might work. I'm certainly eager to see a working solution and discuss its merits!
Keep in mind that your business use case might be OK with an asyncify-like binary transformation (assuming that approach works -- there are still many hurdles to clear to make something that makes sense), but others may not want to make the trade-off in terms of performance.
Yes. I think I might have a viable solution
@Victor Adossi Yes, I've reviewed the prior discussions and considered several approaches. I believe I have a viable and generic solution, though it likely wouldn't align well with Wasmtime's goals directly — I'd either be forking Wasmtime or writing a custom runtime from scratch.
The core approach:
At a high level, the mechanism works by interrupting execution, advancing to a deterministic safepoint (between Wasm instructions), and then using liveness analysis (precomputed during compilation) to identify which registers hold stack-relevant values. These are then pushed onto the stack of a universal, portable VM representation of the program state — producing a serializable snapshot.
Resumption is the reverse: stack values are restored into the appropriate registers, and execution continues from the safepoint.
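To make the suspend/resume direction concrete, here's a minimal sketch of what such a portable snapshot might look like. All of the names (`Snapshot`, `suspend`, `resume`) are hypothetical and illustrative — this is not a Wasmtime API, just the shape of the state that would need to be captured and restored:

```rust
// Hypothetical sketch: a portable snapshot of VM state at a safepoint.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    pc: u32,               // wasm-level instruction offset of the safepoint
    value_stack: Vec<u64>, // operand stack, with live registers already spilled here
    locals: Vec<u64>,      // function locals at the safepoint
    memory: Vec<u8>,       // linear memory contents
}

// "Suspend": spill the live registers (identified by the precomputed
// liveness maps) onto the portable value stack.
fn suspend(pc: u32, live_registers: &[u64], locals: &[u64], memory: &[u8]) -> Snapshot {
    Snapshot {
        pc,
        value_stack: live_registers.to_vec(),
        locals: locals.to_vec(),
        memory: memory.to_vec(),
    }
}

// "Resume": hand the spilled values back; a real runtime would move them
// back into the appropriate machine registers and jump to the safepoint.
fn resume(snap: &Snapshot) -> (u32, Vec<u64>, Vec<u64>, Vec<u8>) {
    (
        snap.pc,
        snap.value_stack.clone(),
        snap.locals.clone(),
        snap.memory.clone(),
    )
}

fn main() {
    let snap = suspend(42, &[1, 2, 3], &[7], &[0xAB; 4]);
    let (pc, regs, locals, mem) = resume(&snap);
    assert_eq!((pc, regs, locals, mem), (42, vec![1, 2, 3], vec![7], vec![0xAB; 4]));
    println!("roundtrip ok at pc={pc}");
}
```

Because every field is plain data (no host pointers), a snapshot like this can be serialized to disk with any byte-level encoding.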
Safepoint insertion & interruption:
Execution isn't interrupted at deterministic points, but the resulting snapshot is deterministic — safepoints are inserted at natural boundaries (function calls, loop headers) or between each Wasm-equivalent instruction. To trigger a suspension, I'm considering a guard-page mechanism: a memory page with READ permissions during normal execution, and permissions revoked when we want the module to trap — providing a low-overhead interruption signal without polling.
Through testing of recursive Fibonacci functions as a worst case, I've determined that the cost of step-wise trap checks is on the order of 3-10% of hot-path execution time. (With branch prediction, the per-check cost approaches 0-1 cycles.)
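For reference, here's a sketch of the kind of step-wise check that benchmark measures — this is my illustration, not the author's actual benchmark code. A suspension flag is loaded at each function entry (a natural safepoint boundary); since the branch is almost never taken, the predictor keeps it cheap:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Explicit polling-based safepoint check -- the baseline the guard-page
// trick is meant to beat by removing the check from the hot path entirely.
static SUSPEND_REQUESTED: AtomicBool = AtomicBool::new(false);

fn fib_with_checks(n: u64) -> Result<u64, &'static str> {
    // Safepoint check at function entry; almost always not-taken.
    if SUSPEND_REQUESTED.load(Ordering::Relaxed) {
        return Err("suspended"); // a real runtime would capture a snapshot here
    }
    if n < 2 {
        return Ok(n);
    }
    Ok(fib_with_checks(n - 1)? + fib_with_checks(n - 2)?)
}

fn main() {
    assert_eq!(fib_with_checks(20), Ok(6765));
    SUSPEND_REQUESTED.store(true, Ordering::Relaxed);
    assert_eq!(fib_with_checks(20), Err("suspended"));
    println!("ok");
}
```

The guard-page approach replaces the atomic load with an unconditional memory read that only traps once the host revokes the page's permissions, so the hot path carries no branch at all.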
The cost of thread-based interruption would be 0% on the hot path, since the code contains no checks; the suspend/resume work would be amortized over the millions or billions of instructions executed between suspensions, which wouldn't be frequent enough to have any meaningful performance effect.
The main remaining cost is the code metadata and side tables that provide the information our runtime needs for the suspend/snapshot/resume functionality (register/local/stack mappings), which may be roughly 1-3x the size of the code itself depending on my implementation efficiency.
I am also considering the use of dynamic function compilation based on hot path analysis, as running some of the code in interpreted mode would mean only hot functions would need to be compiled (interpreter mode would naturally support step-wise execution and thus suspend+resume capabilities are baked in)
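A common way to implement that kind of tiering is a simple hotness counter per function: interpret everything at first, and promote a function to the JIT tier once its invocation count crosses a threshold. A minimal sketch (the names and threshold are my assumptions, not the author's design):

```rust
use std::collections::HashMap;

// Illustrative tier-up policy: interpret every function, count invocations,
// and promote a function to the JIT tier once it becomes hot.
const HOT_THRESHOLD: u32 = 1000;

#[derive(Debug, PartialEq)]
enum Tier {
    Interpreted, // step-wise execution; suspend/resume comes for free
    Jitted,      // compiled; needs safepoints + side tables to suspend
}

struct Profiler {
    call_counts: HashMap<u32, u32>, // function index -> invocation count
}

impl Profiler {
    fn new() -> Self {
        Profiler { call_counts: HashMap::new() }
    }

    // Called on every function entry; returns the tier to execute in.
    fn on_call(&mut self, func_index: u32) -> Tier {
        let count = self.call_counts.entry(func_index).or_insert(0);
        *count += 1;
        if *count >= HOT_THRESHOLD { Tier::Jitted } else { Tier::Interpreted }
    }
}

fn main() {
    let mut p = Profiler::new();
    for _ in 0..999 {
        assert_eq!(p.on_call(3), Tier::Interpreted);
    }
    assert_eq!(p.on_call(3), Tier::Jitted);         // 1000th call crosses the threshold
    assert_eq!(p.on_call(7), Tier::Interpreted);    // cold function stays interpreted
    println!("ok");
}
```

The appeal for this use case is that only the jitted tier needs the register/stack mapping side tables; cold code suspends trivially because the interpreter's state is already explicit.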
I worked on a quick proof-of-concept compiler that started off as an interpreted VM - initially 132x slower than wasmtime, but I brought it down to around 3x slower with the help of some JIT code. (It says "interpreter" in there, but it's really somewhere in between "interpreted" and "JIT".)
I think I should be able to approach wasmtime performance
(The native Rust comparison is a bit of an unfair apples-to-oranges one, because the release build compiles the function with tail-call optimization afaik - whereas my WASM module uses recursive function calls.)
I am also considering turning my WASM program's stack frames into a more friendly representation optimized towards serializability
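One serialization-friendly shape for such frames is a flat layout with fixed-width fields and no pointers, so a frame can be written out and read back with a straight byte copy. A sketch under those assumptions (the `PortableFrame` name and encoding are hypothetical, not the author's format):

```rust
// Sketch of a flat, serialization-friendly stack frame: little-endian,
// fixed-width header, no host pointers.
#[derive(Debug, Clone, PartialEq)]
struct PortableFrame {
    func_index: u32,   // which function this frame belongs to
    return_pc: u32,    // wasm-level offset to resume at in the caller
    locals: Vec<u64>,  // locals spilled at the safepoint
}

impl PortableFrame {
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        out.extend_from_slice(&self.func_index.to_le_bytes());
        out.extend_from_slice(&self.return_pc.to_le_bytes());
        out.extend_from_slice(&(self.locals.len() as u32).to_le_bytes());
        for l in &self.locals {
            out.extend_from_slice(&l.to_le_bytes());
        }
        out
    }

    fn from_bytes(b: &[u8]) -> PortableFrame {
        let u32_at = |i: usize| u32::from_le_bytes(b[i..i + 4].try_into().unwrap());
        let n = u32_at(8) as usize; // locals count sits after the 8-byte header
        let locals = (0..n)
            .map(|i| u64::from_le_bytes(b[12 + i * 8..20 + i * 8].try_into().unwrap()))
            .collect();
        PortableFrame { func_index: u32_at(0), return_pc: u32_at(4), locals }
    }
}

fn main() {
    let frame = PortableFrame { func_index: 5, return_pc: 0x40, locals: vec![1, 2, 3] };
    let bytes = frame.to_bytes();
    assert_eq!(PortableFrame::from_bytes(&bytes), frame);
    println!("frame roundtrip ok ({} bytes)", bytes.len());
}
```

Keeping the on-disk layout independent of the machine stack layout is also what would let snapshots survive engine version changes, since only the side tables mapping machine state to this format would need to track the compiler.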
Out of curiosity, how do you plan to handle host references, e.g. externrefs or open file descriptors, etc?
Or is this not relevant for your use case?
business-wise, things like sockets will have a cost associated to them
technically, the host will maintain them for the sleeping modules
Last updated: Feb 24 2026 at 04:36 UTC