I have been thinking about how to handle hot code swapping inside simplejit and how to export it to the user. I think I have something now that would work:
What do you all think about this plan?
extra, can be done later: add unwind info for plt
I got lazy compilation working in cg_clif. Cranelift PR up at #2403.
@bjorn3 what does it mean to be lazy in rustc/cg_clif?
@Joey Gouly cg_clif has a JIT mode. Currently it compiles all functions before running any user code. Lazy compilation would only compile functions as they get called, so the program can start earlier, and when code only gets called in edge cases, compiling it can often be avoided completely.
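To make the idea of lazy compilation concrete, here is a minimal sketch in plain Rust. It is not how cg_clif or SimpleJIT implement it: `LazyJit`, `Compiled`, `define`, and `call` are hypothetical names, and "compiling" is simulated by running a constructor function the first time a name is called.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled function body.
type Compiled = Box<dyn Fn(i64) -> i64>;

/// A table that "compiles" each function the first time it is called.
struct LazyJit {
    /// Defined but not yet compiled: name -> function standing in for codegen.
    pending: HashMap<&'static str, fn() -> Compiled>,
    /// Functions that have already been compiled.
    compiled: HashMap<&'static str, Compiled>,
    /// How many compilations have actually happened.
    compile_count: usize,
}

impl LazyJit {
    fn new() -> Self {
        LazyJit {
            pending: HashMap::new(),
            compiled: HashMap::new(),
            compile_count: 0,
        }
    }

    /// Register a function without compiling it.
    fn define(&mut self, name: &'static str, codegen: fn() -> Compiled) {
        self.pending.insert(name, codegen);
    }

    /// Call a function, compiling it first if this is the first call.
    fn call(&mut self, name: &'static str, arg: i64) -> i64 {
        if !self.compiled.contains_key(name) {
            let codegen = self.pending.remove(name).expect("undefined function");
            self.compiled.insert(name, codegen());
            self.compile_count += 1;
        }
        (self.compiled[name])(arg)
    }
}

/// Simulated codegen for a frequently used function.
fn compile_double() -> Compiled {
    Box::new(|x: i64| x * 2)
}

/// Simulated codegen for a function that is only needed in edge cases.
fn compile_rare() -> Compiled {
    Box::new(|x: i64| x - 1)
}
```

If `rare` is defined but never called, its codegen function never runs, which is the saving described above.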
@bjorn3 When using cargo run? I don't get how this works w.r.t. cargo etc.
The JIT mode is currently rather limited as it requires all dependencies to be directly or indirectly available as dylib. You can use it by running cargo.sh jit. This will use cargo rustc -- --jit to make cargo pass --jit only when compiling the executable. --jit is intercepted by the cg_clif driver to enable JIT mode. In the JIT mode compilation happens as usual until the codegen step. During this step it will use SimpleJIT as backend instead of cranelift_object as usual. When compilation is done (or enough done to run something when using lazy compilation), the main function will be invoked, after which cg_clif exits with the exit code of the jitted program. https://github.com/bjorn3/rustc_codegen_cranelift/blob/4e547b942daf9b6786fabdb8ee52056a448d8c9f/src/driver/jit.rs
Wait does this mean it is possible to do inline caching inside optimizing jit compilers?
@playX Not really, this is about swapping whole functions at a time. The code sections are still read-only. In fact on some modern systems (e.g. iOS and SELinux) it isn't even possible to have writable code sections anymore. You have to call mmap to switch between R+W and R+X.
Self-modifying code also has a high performance cost on modern processors. JavaScript engines switched away from it for a reason.
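As a rough illustration of swapping whole functions without touching code memory, the sketch below routes every call through a data slot that holds the current target, loosely like a PLT/GOT entry. This is an assumption about the general technique, not the actual SimpleJIT design; `Slot`, `swap`, `version_one`, and `version_two` are hypothetical names.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Signature shared by every version of the swappable function.
type Func = fn(i64) -> i64;

/// One writable data slot per function. Callers always load the current
/// target from here, so the code itself never has to be modified.
struct Slot(AtomicPtr<()>);

impl Slot {
    fn new(f: Func) -> Self {
        Slot(AtomicPtr::new(f as *mut ()))
    }

    /// Redirect all future calls to `f`. The old code is left untouched.
    fn swap(&self, f: Func) {
        self.0.store(f as *mut (), Ordering::Release);
    }

    /// Indirect call through the slot.
    fn call(&self, arg: i64) -> i64 {
        let raw = self.0.load(Ordering::Acquire);
        // SAFETY: the slot only ever holds pointers created from `Func` values.
        let f: Func = unsafe { std::mem::transmute::<*mut (), Func>(raw) };
        f(arg)
    }
}

/// First version of the function.
fn version_one(x: i64) -> i64 {
    x + 1
}

/// Replacement version, installed later.
fn version_two(x: i64) -> i64 {
    x * 10
}
```

A caller would create `Slot::new(version_one)`, call through it, and later run `slot.swap(version_two)`. Only the data slot is written, so the code pages can stay R+X the whole time.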
@playX to build on the above, the usual technique for ICs these days is to use updatable function pointers -- e.g. in SpiderMonkey there are "inline cache (IC) chains" which are linked lists of structs, each of which has a pointer to code to branch to. Codegen just needs to emit an indirect call then
In other words all of this should happen at an abstraction level above the compiler IR (with the possible exception of support for tail-calling, which you want for the fallback from one IC to the next)
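A minimal sketch of such a chain, assuming a toy dynamically typed `add` operation: each stub is a struct holding a pointer to code plus a link to the next stub, and the call site only performs indirect calls. This is not SpiderMonkey's actual layout; `Value`, `IcStub`, and `IcSite` are hypothetical names, and the loop stands in for the tail call from one stub to the next.

```rust
/// Hypothetical dynamically typed value.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
}

/// Stub code returns Some(result) on a hit, or None to fall through.
type StubCode = fn(Value, Value) -> Option<Value>;

/// One link in the chain: a pointer to code plus the next stub.
struct IcStub {
    code: StubCode,
    next: Option<Box<IcStub>>,
}

/// A single call site with its chain of specialized stubs.
struct IcSite {
    head: Option<Box<IcStub>>,
    slow_path_hits: usize,
}

/// Stub specialized for Int + Int.
fn add_int(a: Value, b: Value) -> Option<Value> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Some(Value::Int(x + y)),
        _ => None,
    }
}

/// Stub specialized for Float + Float.
fn add_float(a: Value, b: Value) -> Option<Value> {
    match (a, b) {
        (Value::Float(x), Value::Float(y)) => Some(Value::Float(x + y)),
        _ => None,
    }
}

impl IcSite {
    fn new() -> Self {
        IcSite { head: None, slow_path_hits: 0 }
    }

    fn call(&mut self, a: Value, b: Value) -> Value {
        // Walk the chain; each step is an indirect call through `code`.
        let mut stub = self.head.as_deref();
        while let Some(s) = stub {
            if let Some(result) = (s.code)(a, b) {
                return result;
            }
            stub = s.next.as_deref();
        }
        // Slow path: compute generically, then attach a specialized stub.
        self.slow_path_hits += 1;
        let (result, code) = match (a, b) {
            (Value::Int(x), Value::Int(y)) => (Value::Int(x + y), add_int as StubCode),
            (Value::Float(x), Value::Float(y)) => (Value::Float(x + y), add_float as StubCode),
            // Mixed operands are handled generically and not cached here.
            (Value::Int(x), Value::Float(y)) | (Value::Float(y), Value::Int(x)) => {
                return Value::Float(x as f64 + y);
            }
        };
        let old_head = self.head.take();
        self.head = Some(Box::new(IcStub { code, next: old_head }));
        result
    }
}
```

After the first Int + Int call, later Int + Int calls hit the attached stub and never reach the slow path. Updating the site only rewrites data (the chain), not code.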
Yes, I know about this. What I did is similar to these inline cache chains, but I was mostly taking inspiration from WebKit, which replaces jump targets rather than function pointers. What WebKit does is generate a new code block and redirect the jump that used to go to the slow path so that it targets this new code block. I had to write my own macro assembler for this: https://github.com/playxe/masm-rs
Last updated: Nov 22 2024 at 16:03 UTC