I'm considering cranelift as a JIT from a rust host program, and I want the jitted code to handle rust data structures. The data structures are not repr(C), though, so writing the access code in cranelift itself (for enums and structs) is not available. I can write simple accessors that would almost certainly be inlined in rust code (the code of which is roughly *(x+n) for some constant n), but my guess is that if I register these as external functions to cranelift and call them then they will not be inlined into the jitted code. What are my options here?
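For concreteness, the "external accessor" setup described above might look like the sketch below. The `Point` type and every function name are hypothetical, and the indirect call only simulates what a jitted call site would do; no Cranelift API is used here.

```rust
/// Hypothetical host-side type with the default Rust representation,
/// so the compiler is free to choose field order and offsets.
pub struct Point {
    pub tag: u8,
    pub x: u64,
    pub y: u32,
}

/// Accessor with a C ABI so generated code can call it through a raw address.
pub extern "C" fn point_x(p: *const Point) -> u64 {
    // SAFETY: the caller must pass a valid, aligned pointer to a live `Point`.
    unsafe { (*p).x }
}

/// The address a host would hand to the JIT as an imported symbol.
pub fn point_x_addr() -> usize {
    point_x as *const () as usize
}

/// Simulates the jitted call site: an indirect call through the address.
/// Nothing on this path can be inlined, which is the cost being asked about.
pub fn call_via_addr(addr: usize, p: *const Point) -> u64 {
    // SAFETY: `addr` must come from `point_x_addr`, so it has this signature.
    let f: extern "C" fn(*const Point) -> u64 = unsafe { std::mem::transmute(addr) };
    f(p)
}
```

Each field read then costs a full call through the C ABI, whereas the same accessor in Rust code would normally compile down to a single load.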
@Mario Carneiro that's an excellent question! You're correct that these accessor call-sites would not inline anything; to Cranelift, the calls to your external functions are completely opaque. (This is similar to how things would work with most other compiler frameworks, fwiw, and isn't unique to Cranelift, though you likely are already aware of this :-) ). The simplest and most straightforward option would be to generate your accessor code at runtime -- today this would have to be inlined manually as we don't have an inlining pass, but in the future when we do, you could in theory generate the CLIF just once and take advantage of inlining.
Unfortunately, the most "magical" approach -- just writing native code in your runtime, and somehow getting this to be inlined where you generate VM calls -- is nearly intractable without some cooperation from your toolchain, because simply lifting already-compiled machine code and dropping it in will break for a number of reasons -- we would want to lift it back to an IR of some sort to allow optimizations and the regalloc pass to properly join it with surrounding code.
There is an interesting and rich design space of toolchain support for writing something that looks like native runtime code and getting this to become IR for the JIT, though. E.g., the V8 JS engine has "Torque", a DSL that lets one write VM functions easily, and there was a proposal "HolyJit" to do a similar thing in SpiderMonkey. Lots of interesting reading here if one is curious -- happy to find some links if so!
That's a good point about not having the relevant information from the surrounding code, since cranelift runs at runtime. This requires rustc support, but conceivably there could be an attribute that would, say, output the MIR for a function as a constant in the binary (or even just presented as a const in rust), so that it can be accessed by tools like cranelift.
I think the most promising near-term approach would just be to use some unsafe code to do the discriminant checking and memoffset for field access, although I think that depends on a couple unstable aspects of rust data layout
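A minimal sketch of the field-offset half of that idea, assuming a toolchain where `std::mem::offset_of!` is available (stable since Rust 1.77; the memoffset crate offers a macro of the same name for older compilers). The `Point` type and the helper names are hypothetical. The offset is only meaningful for the build that computed it, since the default representation makes no layout promises across compiler versions.

```rust
use std::mem::offset_of;

/// Hypothetical host-side type with the default Rust representation.
pub struct Point {
    pub tag: u8,
    pub x: u64,
    pub y: u32,
}

/// Byte offset of `x` as chosen by the compiler for this build.
/// The host would pass this constant to the code generator.
pub const X_OFFSET: usize = offset_of!(Point, x);

/// What the jitted code would do with that constant: a plain load
/// from `base + offset`, i.e. the `*(x+n)` from the question.
pub fn load_x_raw(p: *const Point) -> u64 {
    // SAFETY: `p` must point to a live `Point`; the offset comes from the
    // same build, so the address is in bounds and aligned for `u64`.
    unsafe { (p as *const u8).add(X_OFFSET).cast::<u64>().read() }
}
```

Because the constant is computed by the same compiler that lays out `Point`, the host and the generated load should agree, but only as long as both come from one build.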
output the MIR for a function as a constant in the binary
Indeed! That's kind of what HolyJit is doing (though in its own framework, not for Cranelift) -- it's a rustc compiler plugin that translates the function body into JIT IR. It might be an interesting starting point for something like this.
I agree it's a little bit tricky to depend on the data layout, especially if one can't control it (and put a repr(C) on it); others may have more ideas how to introspect offsets or otherwise make this more stable (@Alex Crichton maybe?)
I am not personally too fussed about having to write functions as cranelift API calls instead of something that looks more "natural"; but rust likes to keep a lot of representation details close to the chest so I think it's actually impossible to write some functions on rust data structures if the only primitives are things that a from-scratch VM architecture would know about like adding numbers and dereferencing a pointer
oh, that holyjit project does seem pretty relevant, and I guess it extends the compiler to get access to these details
hmm, there is std::mem::discriminant
but yeah, maybe a limited rustc plugin just to give you details (discriminant offset, width, and values; and field offsets) of Rust layout would be enough to codegen the accessors?
std::mem::discriminant is still a function though, you would have to turn it into cranelift IR for it to be usable
right, it doesn't look like there's a way to get the integer out of it
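To illustrate the limitation: `std::mem::discriminant` returns an opaque `Discriminant<T>` that can be compared and hashed but not converted to an integer. The sketch below uses a hypothetical `Value` enum and, as an explicit assumption that departs from the situation in this thread, gives it `#[repr(u8)]`, since the standard library documents that such an enum stores its tag as a leading `u8`. With the default representation there is no such guarantee.

```rust
use std::mem::discriminant;

/// Hypothetical enum. `#[repr(u8)]` is an assumption made for this sketch;
/// it fixes the tag as a `u8` stored at offset 0.
#[repr(u8)]
pub enum Value {
    Int(i64),     // tag 0
    Text(String), // tag 1
    Nil,          // tag 2
}

/// All the opaque `Discriminant` supports: comparing variants.
pub fn same_variant(a: &Value, b: &Value) -> bool {
    discriminant(a) == discriminant(b)
}

/// The load a JIT could emit itself, valid only because of `#[repr(u8)]`.
pub fn tag_byte(v: &Value) -> u8 {
    // SAFETY: a `#[repr(u8)]` enum starts with its `u8` discriminant.
    unsafe { *(v as *const Value).cast::<u8>() }
}
```

For enums that cannot take a primitive representation, neither the tag's location nor its encoding is exposed by a stable API, which is the gap being discussed here.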
so, rustc plugin seems like the most straightforward way here
I wonder if Layout can be used as a sort of introspection API here?
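If the Layout in question is `std::alloc::Layout` (an assumption; rustc's internal layout types are a different matter), the sketch below shows what it reports. The `Point` type and `size_and_align` helper are hypothetical.

```rust
use std::alloc::Layout;

/// Hypothetical host-side type with the default Rust representation.
pub struct Point {
    pub tag: u8,
    pub x: u64,
    pub y: u32,
}

/// Returns the only two facts `std::alloc::Layout` carries about a type:
/// its total size and its alignment, both in bytes.
pub fn size_and_align<T>() -> (usize, usize) {
    let layout = Layout::new::<T>();
    (layout.size(), layout.align())
}
```

That covers allocation but says nothing about where individual fields or the enum tag live, so on its own it would not be enough to generate accessors.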
these are questions best asked of our resident rustc-internals gurus :-) I'm bowing out now (end of day) but hopefully someone else knows more -- best of luck!
thanks for the pointers
Last updated: Nov 22 2024 at 16:03 UTC