Stream: cranelift

Topic: Inlining external rust functions


view this post on Zulip Mario Carneiro (May 05 2021 at 04:24):

I'm considering cranelift as a JIT from a rust host program, and I want the jitted code to handle rust data structures. The data structures are not repr(C), though, so writing the access code in cranelift itself (for enums and structs) is not available. I can write simple accessors that would almost certainly be inlined in rust code (the code of which is roughly *(x+n) for some constant n), but my guess is that if I register these as external functions to cranelift and call them then they will not be inlined into the jitted code. What are my options here?
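
To make the question concrete, here is a minimal sketch of the kind of accessor being described. The `Value` type and the `value_payload` function are hypothetical names chosen for illustration, not part of the original discussion. The accessor uses a C calling convention so that JIT-generated code could call it through a raw function pointer; in a Rust-to-Rust call the compiler would very likely inline it down to a single load.

```rust
// Hypothetical host-side type with the default (unspecified) Rust layout.
pub struct Value {
    pub tag: u8,
    pub payload: u64,
}

// Accessor with a C ABI. A JIT would see only the address of this function,
// so every use from jitted code is a real call rather than an inlined load.
pub extern "C" fn value_payload(v: *const Value) -> u64 {
    // SAFETY: the caller must pass a valid, aligned pointer to a live `Value`.
    unsafe { (*v).payload }
}

// The raw address that would be handed to the JIT as an external symbol.
pub fn value_payload_addr() -> usize {
    value_payload as extern "C" fn(*const Value) -> u64 as usize
}
```

How that address gets registered with the JIT is left out here, since it depends on the Cranelift front end in use.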

view this post on Zulip Chris Fallin (May 05 2021 at 06:29):

@Mario Carneiro that's an excellent question! You're correct that these accessor call-sites would not inline anything; to Cranelift, the calls to your external functions are completely opaque. (This is similar to how things would work with most other compiler frameworks, fwiw, and isn't unique to Cranelift, though you likely are already aware of this :-) ). The simplest and most straightforward option would be to generate your accessor code at runtime -- today this would have to be inlined manually as we don't have an inlining pass, but in the future when we do, you could in theory generate the CLIF just once and take advantage of inlining.

Unfortunately, the most "magical" approach -- just writing native code in your runtime, and somehow getting this to be inlined where you generate VM calls -- is nearly intractable without some cooperation from your toolchain, because simply lifting already-compiled machine code and dropping it in will break for a number of reasons -- we would want to lift it back to an IR of some sort to allow optimizations and the regalloc pass to properly join it with surrounding code.

There is an interesting and rich design space of toolchain support for writing something that looks like native runtime code and getting this to become IR for the JIT, though. E.g., the V8 JS engine has "Torque", a DSL that lets one write VM functions easily, and there was a proposal "HolyJit" to do a similar thing in SpiderMonkey. Lots of interesting reading here if one is curious -- happy to find some links if so!

view this post on Zulip Mario Carneiro (May 05 2021 at 06:34):

That's a good point about not having the relevant information from the surrounding code, since cranelift runs at runtime. This requires rustc support, but conceivably there could be an attribute that would, say, output the MIR for a function as a constant in the binary (or even just presented as a const in rust), so that it can be accessed by tools like cranelift.

view this post on Zulip Mario Carneiro (May 05 2021 at 06:37):

I think the most promising near-term approach would just be to use some unsafe code to do the discriminant checking and memoffset for field access, although I think that depends on a couple unstable aspects of rust data layout
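
A rough sketch of that near-term approach, assuming a hypothetical `Pair` struct: the host computes a field offset once with `std::ptr::addr_of!` (the same trick the memoffset crate is built around), and the jitted code would then perform the equivalent of `*(x + n)` with that constant. The offset is whatever the compiler chose for this particular build, so it has to be recomputed by the same binary that owns the type and should not be hard-coded.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

// Hypothetical type with default Rust layout; field order in memory is unspecified.
pub struct Pair {
    pub a: u8,
    pub b: u64,
}

// Byte offset of `Pair::b` as laid out by the compiler in this build.
pub fn offset_of_b() -> usize {
    let uninit = MaybeUninit::<Pair>::uninit();
    let base = uninit.as_ptr();
    // SAFETY: `base` points into a live allocation of the right size, and
    // `addr_of!` computes the field address without reading the memory.
    let field = unsafe { addr_of!((*base).b) };
    field as usize - base as usize
}

// The load the JIT would emit: read a u64 at `p + offset`.
//
// SAFETY (caller): `p` must point to a live `Pair` and `offset` must be the
// value returned by `offset_of_b` in the same build.
pub unsafe fn load_b_at(p: *const Pair, offset: usize) -> u64 {
    unsafe { (p as *const u8).add(offset).cast::<u64>().read() }
}
```

In a JIT setting, `offset_of_b()` would be evaluated on the host and baked into the generated load as an immediate. Enum discriminants are harder, since this trick only covers named fields.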

view this post on Zulip Chris Fallin (May 05 2021 at 06:40):

output the MIR for a function as a constant in the binary

Indeed! That's kind of what HolyJit is doing (though in its own framework, not for Cranelift) -- it's a rustc compiler plugin that translates the function body into JIT IR. It might be an interesting starting point for something like this.

nbp/holyjit on GitHub: Generic purpose Just-In-time compiler for Rust.

view this post on Zulip Chris Fallin (May 05 2021 at 06:41):

I agree it's a little bit tricky to depend on the data layout, especially if one can't control it (and put a repr(C) on it); others may have more ideas how to introspect offsets or otherwise make this more stable (@Alex Crichton maybe?)

view this post on Zulip Mario Carneiro (May 05 2021 at 06:41):

I am not personally too fussed about having to write functions as cranelift API calls instead of something that looks more "natural"; but rust likes to keep a lot of representation details close to the chest so I think it's actually impossible to write some functions on rust data structures if the only primitives are things that a from-scratch VM architecture would know about like adding numbers and dereferencing a pointer

view this post on Zulip Mario Carneiro (May 05 2021 at 06:43):

oh, that holyjit project does seem pretty relevant, and I guess it extends the compiler to get access to these details

view this post on Zulip Chris Fallin (May 05 2021 at 06:44):

hmm, there is std::mem::discriminant but yeah, maybe a limited rustc plugin just to give you details (discriminant offset, width, and values; and field offsets) of Rust layout would be enough to codegen the accessors?

view this post on Zulip Mario Carneiro (May 05 2021 at 06:45):

std::mem::discriminant is still a function though, you would have to turn it into cranelift IR for it to be usable

view this post on Zulip Chris Fallin (May 05 2021 at 06:45):

right, it doesn't look like there's a way to get the integer out of it
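
For illustration, a small sketch of what `std::mem::discriminant` does offer, using a hypothetical `Shape` enum. The returned `Discriminant<T>` is an opaque handle: it can be compared, hashed, and debug-printed, but it has no stable method that yields the underlying integer, nor any indication of where the tag lives in memory.

```rust
use std::mem::{discriminant, Discriminant};

// Hypothetical enum with data-carrying variants and default Rust layout.
pub enum Shape {
    Circle(f64),
    Rect(f64, f64),
    Empty,
}

// Opaque handle identifying the variant; usable for equality and hashing only.
pub fn tag_of(s: &Shape) -> Discriminant<Shape> {
    discriminant(s)
}

// Variant comparison that ignores the payload.
pub fn same_variant(a: &Shape, b: &Shape) -> bool {
    tag_of(a) == tag_of(b)
}
```

This works on the host side, but jitted code would still need the tag's offset, width, and values to perform the same check without a call.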

view this post on Zulip Chris Fallin (May 05 2021 at 06:45):

so, rustc plugin seems like the most straightforward way here

view this post on Zulip Mario Carneiro (May 05 2021 at 06:46):

I wonder if Layout can be used as a sort of introspection API here?
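
As a quick check of what `std::alloc::Layout` exposes, here is a sketch with a hypothetical `Record` struct. `Layout` describes a memory block by its size and alignment only, so it reports the same numbers as `size_of` and `align_of` and carries no per-field or discriminant information.

```rust
use std::alloc::Layout;

// Hypothetical type with default Rust layout.
pub struct Record {
    pub flag: bool,
    pub count: u32,
    pub total: u64,
}

// Everything `Layout` can say about `Record`: total size and alignment in bytes.
pub fn size_and_align() -> (usize, usize) {
    let layout = Layout::new::<Record>();
    (layout.size(), layout.align())
}
```

That makes it useful for allocating space for a value from jitted code, but not for locating fields inside it.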

view this post on Zulip Chris Fallin (May 05 2021 at 06:47):

these are questions best asked of our resident rustc-internals gurus :-) I'm bowing out now (end of day) but hopefully someone else knows more -- best of luck!

view this post on Zulip Mario Carneiro (May 05 2021 at 06:47):

thanks for the pointers


Last updated: Oct 23 2024 at 20:03 UTC