I'm just starting to investigate the feasibility of this cooked scheme and thought I'd check if anybody's already done this or thought about it at all.
Let's say the host has asynchronously invoked a component function that runs for a very long time, routinely being suspended and resumed with epoch-based interruption. At some point, the host decides to shut down the instance (e.g. because the machine the host is running on would like to shut down), but persist the running fiber to disk in a way that it could be later re-loaded and resumed using an identical Engine
and instance of the same component (possibly on a different machine) in a way that's completely transparent to the guest.
I'm assuming this would require heavily unsafe modifications to wasmtime, but it doesn't seem like it's even as crazy as some of the other things cranelift does. Anybody know of any factors that would make this truly impossible? Serializing FiberStack
seems potentially straightforward enough. What else would have to be persisted and then re-loaded? Shared memories would also require some careful handling.
Thanks in advance for any insights!
Serializing
FiberStack
seems potentially straightforward enough.
This is fairly tricky: what you need is something like a "relocatable stack" (and relocatable register contents) where, when rehydrating the snapshot in a new process/context with different host memory addresses for stack and heap, we need to fix up the in-flight state appropriately. This boils down to having precise type information for all the pieces of address computations that can exist in the compiler IR, including intermediate VM data structures, that is sufficient to recompute them in a new context; or else not persisting them across suspend-points (which could be any function call).
See https://github.com/bytecodealliance/wasmtime/issues/3017 where we discussed some of this a few years ago. I don't think this is likely to be easily implementable, but someone who is sufficiently motivated with a deep enough understanding of the runtime and of Cranelift could probably do it in a few months' fulltime work
you do not want to touch stack rewriting with even a ten feet pole
i implemented a rudimentary version of this a while back for the physical to virtual memory transition of k23 but it’s uniquely cursed
i don’t think i had a worse time programming
Last updated: Feb 27 2025 at 23:03 UTC