Stream: cranelift

Topic: hot code swapping for simplejit


view this post on Zulip bjorn3 (Nov 11 2020 at 21:57):

I have been thinking about how to handle hot code swapping inside simplejit and how to export it to the user. I think I have something now that would work:

  1. Always use GOT/PLT for relocations in functions, even when the hot code swapping is not used by the user. This simplifies the next steps at the cost of a bit of overhead when hot code swapping is not necessary.
  2. Refer to the PLT entry for function relocations within data objects to allow swapping functions without having to perform relocations on data objects again.
  3. Immediately perform all relocations in define_* except for data object -> data object relocations.
  4. Add prepare_*_for_redefine. This will allow a single future redefinition of the specified function or data object by overwriting the GOT entry. It will also mark all relevant data object -> data object relocations as pending again.
  5. Add define_*_with_address on SimpleJITModule as replacement for the current symbol(s) functions on SimpleJITBuilder.

What do you all think about this plan?
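
The indirection behind steps 1 and 4 can be pictured with a small model. This is only a sketch and not SimpleJIT's implementation: `Got`, `call`, and `redefine` are hypothetical names, and a plain table of function pointers stands in for the real GOT/PLT machinery. Callers only ever know a slot index, so overwriting the slot retargets every caller without touching their code.

```rust
use std::cell::Cell;

type Func = fn(i32) -> i32;

/// Hypothetical stand-in for a GOT: one slot per function, holding its current address.
struct Got {
    slots: Vec<Cell<Func>>,
}

impl Got {
    fn new(initial: &[Func]) -> Self {
        Got { slots: initial.iter().map(|&f| Cell::new(f)).collect() }
    }

    /// Roughly what a PLT stub does: load the address from the slot, then call it.
    fn call(&self, index: usize, arg: i32) -> i32 {
        (self.slots[index].get())(arg)
    }

    /// Hot swap: overwrite the slot. No caller has to be relocated again.
    fn redefine(&self, index: usize, new_body: Func) {
        self.slots[index].set(new_body);
    }
}

fn double(x: i32) -> i32 { x * 2 }
fn triple(x: i32) -> i32 { x * 3 }

/// A "caller" that was compiled once and only knows the slot index.
fn caller(got: &Got, x: i32) -> i32 {
    got.call(0, x) + 1
}

fn main() {
    let got = Got::new(&[double as Func]);
    println!("before swap: {}", caller(&got, 10));
    got.redefine(0, triple);
    println!("after swap:  {}", caller(&got, 10));
}
```

The model is single-threaded (`Cell`); a real JIT that swaps code while other threads are running would need an atomic store to the slot.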

view this post on Zulip bjorn3 (Nov 12 2020 at 18:39):

  1. done (eaa2c5b and 5458473)
  2. done (8a4749a)
  3. done (bf9e5d9)
  4. partially done (functions only: bbc2afb)
  5. can be done later

extra, can be done later: add unwind info for plt

I got lazy compilation working in cg_clif. Cranelift PR up at #2403.

view this post on Zulip Joey Gouly (Nov 13 2020 at 10:20):

@bjorn3 what does it mean to be lazy in rustc/cg_clif?

view this post on Zulip bjorn3 (Nov 13 2020 at 11:04):

@Joey Gouly cg_clif has a JIT mode. Currently it compiles all functions before running any user code. Lazy compilation would only compile functions as they get called, so the program can start earlier, and for code that only runs in edge cases, compilation can often be avoided completely.
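
A minimal sketch of the compile-on-first-call idea, under heavy assumptions: `LazyJit`, `Slot`, and `declare` are hypothetical names, and "compiling" is faked by building a closure from a multiplier. It is not how cg_clif or SimpleJIT do it; it only shows that a function which is never called is never compiled.

```rust
use std::collections::HashMap;

type Compiled = Box<dyn Fn(i64) -> i64>;

enum Slot {
    /// Not compiled yet; holds what the "compiler" needs (here just a multiplier).
    Pending(i64),
    /// Already compiled; later calls go straight to the code.
    Ready(Compiled),
}

struct LazyJit {
    slots: HashMap<&'static str, Slot>,
    compiled_count: usize,
}

impl LazyJit {
    fn new() -> Self {
        LazyJit { slots: HashMap::new(), compiled_count: 0 }
    }

    /// Register a function without compiling it.
    fn declare(&mut self, name: &'static str, factor: i64) {
        self.slots.insert(name, Slot::Pending(factor));
    }

    /// Call a function, compiling it first if this is the first call.
    fn call(&mut self, name: &str, arg: i64) -> Option<i64> {
        let slot = self.slots.get_mut(name)?;
        if let Slot::Pending(factor) = *slot {
            // Stand-in for the real codegen step.
            *slot = Slot::Ready(Box::new(move |x| x * factor));
            self.compiled_count += 1;
        }
        match &*slot {
            Slot::Ready(f) => Some(f(arg)),
            Slot::Pending(_) => None, // not reachable: the slot was just filled in
        }
    }
}

fn main() {
    let mut jit = LazyJit::new();
    jit.declare("hot", 2);
    jit.declare("rare", 100);
    println!("{:?}", jit.call("hot", 21));
    println!("compiled so far: {}", jit.compiled_count);
}
```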

view this post on Zulip Joey Gouly (Nov 13 2020 at 11:46):

@bjorn3 When using cargo run? I don't get how this works w.r.t. cargo etc.

view this post on Zulip bjorn3 (Nov 13 2020 at 12:09):

The JIT mode is currently rather limited, as it requires all dependencies to be directly or indirectly available as a dylib. You can use it by running cargo.sh jit. This uses cargo rustc -- --jit so that cargo passes --jit only when compiling the executable. --jit is intercepted by the cg_clif driver to enable JIT mode. In JIT mode, compilation happens as usual until the codegen step. During this step it uses SimpleJIT as the backend instead of the usual cranelift_object. When compilation is done (or far enough along to run something, when using lazy compilation), the main function is invoked, after which cg_clif exits with the exit code of the jitted program. https://github.com/bjorn3/rustc_codegen_cranelift/blob/4e547b942daf9b6786fabdb8ee52056a448d8c9f/src/driver/jit.rs

view this post on Zulip playX (Jan 12 2021 at 16:50):

Wait does this mean it is possible to do inline caching inside optimizing jit compilers?

view this post on Zulip bjorn3 (Jan 12 2021 at 17:16):

@playX Not really, this is about swapping whole functions at a time. The code sections are still read-only. In fact, on some modern systems (e.g. iOS, or Linux with SELinux) it isn't even possible to have code sections that are writable and executable at the same time anymore. You have to call mprotect to switch between R+W and R+X.

view this post on Zulip bjorn3 (Jan 12 2021 at 17:17):

Self-modifying code also has a high performance cost on modern processors. JavaScript engines switched away from it for a reason.

view this post on Zulip Chris Fallin (Jan 12 2021 at 17:29):

@playX to build on the above, the usual technique for ICs these days is to use updatable function pointers, e.g. in SpiderMonkey there are "inline cache (IC) chains", which are linked lists of structs, each of which has a pointer to code to branch to. Codegen then just needs to emit an indirect call.
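
A rough sketch of such a chain, not SpiderMonkey's actual data structures: `IcEntry`, `CallSite`, the `shape` guard, and the `lookup` slow path are all hypothetical. Each entry pairs a guard with a function pointer; dispatch walks the list, and a miss falls back to the slow path and prepends a new entry. No machine code is patched, only pointers in data.

```rust
type Handler = fn(i64) -> i64;

/// One entry in the chain: a guard (a hypothetical "shape id") and the code to run on a hit.
struct IcEntry {
    shape: u32,
    handler: Handler,
    next: Option<Box<IcEntry>>,
}

/// Per-call-site state: the head of the chain plus a counter for illustration.
struct CallSite {
    head: Option<Box<IcEntry>>,
    slow_path_hits: usize,
}

fn add_one(x: i64) -> i64 { x + 1 }
fn negate(x: i64) -> i64 { -x }

/// Hypothetical slow path: a full lookup of the handler for a shape.
fn lookup(shape: u32) -> Handler {
    if shape == 0 { add_one } else { negate }
}

impl CallSite {
    fn new() -> Self {
        CallSite { head: None, slow_path_hits: 0 }
    }

    fn dispatch(&mut self, shape: u32, arg: i64) -> i64 {
        // Fast path: walk the chain and make an indirect call on the first guard hit.
        let mut cur = self.head.as_deref();
        while let Some(entry) = cur {
            if entry.shape == shape {
                return (entry.handler)(arg);
            }
            cur = entry.next.as_deref();
        }
        // Miss: take the slow path once, then cache the result at the front of the chain.
        self.slow_path_hits += 1;
        let handler = lookup(shape);
        let old_head = self.head.take();
        self.head = Some(Box::new(IcEntry { shape, handler, next: old_head }));
        handler(arg)
    }
}

fn main() {
    let mut site = CallSite::new();
    println!("{}", site.dispatch(0, 5)); // miss, then cached
    println!("{}", site.dispatch(0, 7)); // hit
    println!("slow path taken {} time(s)", site.slow_path_hits);
}
```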

view this post on Zulip Chris Fallin (Jan 12 2021 at 17:30):

In other words all of this should happen at an abstraction level above the compiler IR (with the possible exception of support for tail-calling, which you want for the fallback from one IC to the next)

view this post on Zulip playX (Jan 13 2021 at 12:57):

Yes, I know about this. What I did is similar to these inline cache chains, but I was mostly taking inspiration from WebKit, which replaces jump targets rather than function pointers. WebKit generates a new code block and repoints the jump that used to go to the slow path at this new block. I had to write my own macro assembler for this: https://github.com/playxe/masm-rs

Last updated: Oct 23 2024 at 20:03 UTC