I would like to implement delimited continuation by capturing part of the stack (also dump all register contents so that they can be recovered later) and transfer control to a distant caller frame. It looks like cranelift has instructions for me to get the stack frame pointers. But I have yet figured out how to transfer control and dump all registers. Does cranelift offer such features or do I need to hack into the implementation?
the work being done by the WasmFX proposal team may be relevant here, they are using a fork of wasmtime to prototype an implementation. I have not been tracking the exact progress of their fork, and I don't know how up to date it is with main
, but its at least some place to start https://github.com/effect-handlers/wasmtime
see also https://wasmfx.dev/
oh, it has apparently moved to https://github.com/wasmfx/wasmfxtime
Aha awesome! Thank you!
We do have a fully working implementation of the WasmFX instruction set (modulo resume_throw
which requires exceptions) in our fork of wasmtime (the fork is never more than a week behind upstream). Our first version was implemented on top of Wasmtime Fiber, which provides a basic form of delimited continuations for stack switching. Now, we are in the process of internalising the stack switching logic in codegen. We are also working on toolchain support; currently we are upstreaming support in binaryen under the flag --enable-typed-continuations
. Happy to talk more about the project and approach, feel free to ping me here or send me a DM!
Wonderful! Let me dig into it a bit more!
So IIUC, Cranelift (also the WasmFX fork) does not currently have a supported way to transfer control to a distant caller (or the other way after captured frames gets unpacked on the stack on continuation resumption). And by "we are in the process of internalising the stack switching logic in codegen" do you mean you are adding this capability? If so, I am curious how this would be done. Are you looking to add more instructions to the Cranelift IR to support this?
I am not sure what you mean by a "distant caller". Do you mean a remote process? If so, then we would need some kind of continuation (de)serialisation. I would be interesting in learning more about what you mean and your use-case(s).
By internalising the stack switching logic in codegen, I mean that, our initial prototype would perform a libcall (actually several libcalls) to the Wasmtime Fiber API for each WasmFX instruction. A consequence of this approach is that we repeatedly cross the Wasm-Host boundary even for things that execute only inside the Wasm world. We want to eliminate this boundary crossing to make the stack switching faster. So we are in the process of generating the code that performs the stack switch inside the Wasm world rather than in Rust world.
We have not yet extended or modified the IR of Cranelift. We have some plans to do so, but they are for later. First, we are going to code as much as possible on top of Cranelift's existing IR.
By "distant caller" I meant a non-immediate caller of the current executing function on the stack. For example, one way to implement algebraic effects handler is to manipulate the stack directly at runtime, which requires transferring control from a handler invocation to the matching handler implementation (and transfer control back on continuation resumption).
It's also possible to implement all of this with CPS transform and not touching the native stack at runtime. I am actually considering going down this path instead so I don't need any fancy extension on cranelift itself.
OK. I better understand what you mean now. We do manipulate the stack directly. It just happens in the Rust world rather than Wasm world. Wasmtime Fiber gives you a genuine fresh execution stack. However, in our refined implementation we no longer invoke the Wasmtime Fiber API, instead we perform all the stack manipulation inside Wasm, i.e. we generate the code that manipulates the stack directly rather than going through the indirection of a libcall to the Rust world.
Wasmtime Fiber provides a restriction form of delimited continuations, which with a little bit of glue is sufficient to encode effect handlers. It is not optimal, but it is functionally correct, which is why we did this first. CPS was not an option for us, as our goal is to extend the instruction set with native support for continuation-y things. Though, if you have control of the toolchain, then CPS is a viable option. For example, the JavaScript compiler for OCaml (js_of_ocaml
) uses a CPS transform to compile effect handlers.
Got it! Looking forward to the instruction extension to support first class continuation! I am just working on a hobby project so I will take the cps approach for now and probably switch to stack manipulation when it's ready (I am speculating stack manipulation would be more performant).
Btw I came across your soundly handle linearity paper and it seems the approach I took with my hobby project is pretty similar haha!
Tianyu Geng has marked this topic as resolved.
Tianyu Geng said:
Btw I came across your soundly handle linearity paper and it seems the approach I took with my hobby project is pretty similar haha!
I am glad to hear that. I think it is a quite nice solution! I am curious to know which of the approaches you used; did you use the kinds approach or qualified types? I hope you liked paper.
My project is in a somewhat different setting. I am trying to combine dependent type, modality, and effects handler. Obviously due to that full type inference is not decidable so it's not a goal to eliminate all type annotations (essentially each top level function is expected to have an explicit type, though holes are allowed and unification will fill them in during type checking, similar to Agda, idris, etc). The core type theory is an extension of Levy's call by push value, where values and computations are separated (though total computations can be embedded into the value space by a special value construct. This is critical in order to do type gymnastics ). On top of this setting, a computation type carries an effect annotation, which is just a set of effects and has a derived continuation linearity value. I have used 5-value rng for this (zero, linear, affine, relevant, unrestricted) instead of just two (linear and unrestricted). The zero is especially useful in erasing unnecessary values during type checking, similarly to the usage in quantitative type theory. Computation types admit a sub typing relationship according to the effect annotations. But effects are also qualified and can be constrained by the continuation linearity. On top of this, the reified continuation object passed to handler implementation is always linear and special dispose and replicate actions can be applied to them in addition to resume. This way, multishot continuation is explicitly replicated or disposed inside handler implementations. I also include a mechanism that allow users to hook custom code that should be invoked during disposal and replication inside handlers. (This bit is a little convoluted and it's hard to explain in a few sentences because such hooks can have side effects and the current implantation places some additional constraints for them). You can see I am including many things into this so I have no idea if my core type theory is consistent or whether my type checking algorithm remains decidable. I hope it is but I don't have any proofs. And the whole thing still wip.
I have spent quite some time on the type checker and core theory and so I would like to work on something different, which is why I am looking at how to actually compile this into reasonably efficient code.
Oops I meant "rig" rather than "rng"
Interesting! I think I get the gist of what you are working on. As for the theory part, are you familiar with Danel Ahman's PhD thesis (https://danel.ahman.ee/papers/phd-thesis.pdf)? You may be able to piggyback on some of Danel's developments. What you are saying about explicitly replicating multishot continuations, reminds me of Filinski's POPL'92 paper "linear continuations" (https://dl.acm.org/doi/10.1145/143165.143174). If you are not already familiar with it, then it may be worth a read.
Yep I forgot to mention my work is very much inspired by Daniel's paper! I have seen the latter one though. Thanks for the pointer!
Oops, I meant Danel. Auto correction on my phone is annoying...
Last updated: Jan 24 2025 at 00:11 UTC