Where can I find a simple "hello world" Cranelift example that just builds a simple (adder?) function, JIT-compiles it, and calls it? The simplest example I could find is the "toy language", which has an entire programming language, parser and extra stuff. Really hard to get started on Cranelift right now. I just need to be able to produce the IR (already learned how, by using FunctionBuilder), and then compile it to a callable Rust function, which mutates a Rust &[u8] buffer. A simple "hello world" adder would be immensely helpful to get started.
these slides seem to contain such a small example: https://www.slideshare.net/RReverser/building-fast-interpreters-in-rust
but I can't find the code for it
@Victor Maia I'm not aware of any literal "hello world" example, but I agree, we should have something like that. https://github.com/bytecodealliance/cranelift-jit-demo is the closest to a "getting started" example that we have (probably what you are referring to). We're pretty resource-constrained and haven't been able to build up a lot of tutorial documentation unfortunately (docs are a priority for work in 2022, though, assuming other priorities don't get in the way)
If you do work out a very simple minimal example we'd be happy to take a PR to our docs directory though, to at least help future folks!
I see, that is fine. Have been in your shoes.
Maybe it would be helpful to link more prominently to the JIT example from the documentation?
Problem is, people that code to Cranelift probably already know how parsers, ASTs etc. work, so all the extra stuff gets in the way of finding the Cranelift functions I'm interested in. But that is fine, I'll try to dig through the JIT example and make a small "hello world" that just compiles an adder, and then I'll post it in case anyone else is interested.
The https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/trait.InstBuilder.html trait shows what instructions can be built.
I guess I was almost there: https://gist.github.com/MaiaVictor/3a1b0a07517574c348c6131fbb7ab6d3
But I got a sad error: thread 'main' panicked at 'PLT is currently only supported on x86_64'
I guess that means Cranelift doesn't support ARM / Apple M1 yet? :(
Cranelift does support aarch64 and M1! But there may be an issue with the particular kind of relocation you're using
I unfortunately don't have M1 hardware so I'm not an expert; cc @Benjamin Bouvier to help (he is in UTC+1 so probably not online at the moment)
What's "relocation"? The problematic code is:
let builder = JITBuilder::new(cranelift_module::default_libcall_names());
let module = JITModule::new(builder);
After googling, I realized there is a way to change flags of JITBuilder, so I'm messing with that to see if I get lucky
Relocations are the way in which the runtime edits the code to refer to particular functions that are being called. The PLT (procedure linkage table) is a particular mechanism used by some relocations. Depending on the environment in which Cranelift is used, it will use different relocation types. It seems here that the particular configuration is causing Cranelift to try to use the PLT, which we apparently don't support on aarch64.
Wasmtime (which uses Cranelift) definitely works on M1, so a good starting point might be to see how it is generating calls and configuring things. IIRC we end up with Abs8 references (absolute 64-bit addresses) to libcalls, so there is definitely a way to make that work.
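To make the idea of an absolute 8-byte relocation a bit more concrete, here is a toy sketch in plain Rust. It is not Cranelift's actual implementation; the names `apply_abs8` and `read_abs8` are made up for illustration. It only shows the general idea: the code buffer contains an 8-byte placeholder, and the runtime overwrites it with the real address of the callee once that address is known.

```rust
/// Toy stand-in for an absolute 8-byte relocation (hypothetical helper,
/// not a Cranelift API): overwrite the 8 placeholder bytes at `offset`
/// with the target address, in the machine's native byte order.
fn apply_abs8(code: &mut [u8], offset: usize, target: u64) {
    code[offset..offset + 8].copy_from_slice(&target.to_ne_bytes());
}

/// Reads back the 8-byte address stored at `offset`, as the generated
/// code would when it loads the callee address.
fn read_abs8(code: &[u8], offset: usize) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&code[offset..offset + 8]);
    u64::from_ne_bytes(bytes)
}
```

Only the eight bytes at the relocation site change; everything around them in the buffer is left untouched.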
I'm not an expert in the cranelift-jit frontend to Cranelift, and mostly work in the backends, so you'll probably need help from @Benjamin Bouvier or possibly @bjorn3 (who also uses Cranelift outside of Wasmtime). Sorry I can't offer more than that!
I see, that's already immensely helpful input, thank you. I don't think I have the knowledge to go further than that either, sadly.
yeah, there is an open issue for making cranelift-jit work on aarch64.
yep I'm reading it now
this is an issue specific to cranelift-jit. wasmtime and afaik aot compilation using cranelift-object are fully supported on aarch64.
I'm avoiding wasmtime and going directly to Cranelift because I need each % of performance I can get. I noticed wasmtime has some memory safety features that would likely make it slightly slower than a direct compilation. Also in wasmtime I can't apply a wasm function directly to a &mut [u8], I need to use wasm's linear memory, which in turn would mean I need to update the rest of my functions to also use wasm's linear memory (even though they're not related to JIT). Does that make sense, or am I going in the wrong direction?
i am responsible for the cranelift-jit change that caused this issue. it was part of a change i did to allow replacing function definitions.
That's quite cool actually
seems like benmk's solution on that thread solved the issue here, too. but it disables "PIC", and I don't know what that is.
is PIC what allows replacing function definitions?
yes. this is not the normal purpose of pic, but I repurposed it in cranelift-jit for replacing function definitions. disabling it is likely slightly faster.
Any idea what thread 'main' panicked at 'remove_constant_phis: entry block unknown' could mean? I thought the first block was the entry already?
where I'm at: https://gist.github.com/MaiaVictor/d41ae7af9e9ceaebdd778f27870a4ac4
hmm never mind, seems like codegen already creates a func inside it, which is what I need to use, rather than creating a new one (?)
the good news is: IT WORKS! yaaaay
the bad news is that it is probably mostly wrong
here is the code: https://gist.github.com/MaiaVictor/6f62947839c485c01751655d156ef35c
but I'm confused, it seems like add42 can only be called once, since it uses the func object in codegen_ctx
should add42 instead return a codegen_ctx?
// Builds a `f(x) = add(42,x)` function in Cranelift IR
fn add42(jit: &mut JIT) -> codegen::Context {
like this?
basically I'm confused about the internal states, what should be kept and what should be created dynamically every time I compile a function
ideally add42 should return CraneliftIR, but since FunctionBuilder::new receives a function, I can't just build the IR separate from an existing/declared function, I guess?
The Context is meant to be reused between functions, but after every function you define you need to call .clear() to clear out all cached state. Basically it is meant to allow reusing allocations between compilation of different functions.
As for the panic, can you show the printed Function?
I am going to sleep. I will look at any reply tomorrow.
Thanks, for now I think all is fine. If you could just have a quick look at my code and let me know if you spot any obvious mistake (such as using the static objects incorrectly):
https://gist.github.com/MaiaVictor/682eeca48da1f2d6db77875b331cd88a
Specifically, I'm worried that make_adder starts with jit.codegen_ctx.func.signature.params.push, wouldn't that cause the params to accumulate through different functions?
Well I have just 2 quick questions if I may:
1. iconst.u64 with ins()? builder.ins().iconst(I64, amount) is signed, if I understand correctly
For 1. it is fine to just cast a u64 to an i64. It will fill the registers with the provided bits without interpreting them in any way. For integers smaller than 64 bits the most significant bits in the immediate should be ignored, so it is fine to either sign- or zero-extend the immediate to 64 bits.
For 2. you need to multiply the array index by the element size and then add the resulting offset to the array pointer. You can then pass the resulting element pointer to the load instruction.
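Both answers can be sketched in plain Rust, without any Cranelift calls. The helper names below (`as_signed_imm`, `element_offset`, `load_element`) are made up for illustration. The first shows that casting a u64 to an i64 keeps every bit, which is why something like `iconst(I64, amount as i64)` should be fine. The other two spell out the address arithmetic that the generated code has to perform itself: multiply the index by the element size, add the result to the base pointer, then load from the resulting element pointer.

```rust
use std::mem::size_of;

/// Reinterprets an unsigned value as a signed 64-bit immediate.
/// The cast changes only the type, not the 64 bits themselves.
fn as_signed_imm(value: u64) -> i64 {
    value as i64
}

/// Byte offset of element `index` in an array whose elements are `T`.
fn element_offset<T>(index: usize) -> usize {
    index * size_of::<T>()
}

/// Loads element `index` the way generated code would: base pointer plus
/// byte offset, then a load from the resulting element pointer.
fn load_element(data: &[u64], index: usize) -> u64 {
    assert!(index < data.len(), "index out of bounds");
    let base = data.as_ptr() as *const u8;
    let offset = element_offset::<u64>(index);
    // SAFETY: the bounds check above keeps `base + offset` inside `data`,
    // and the offset is a multiple of 8, so the pointer stays aligned.
    unsafe { base.add(offset).cast::<u64>().read() }
}
```

Note that the JIT-compiled version has no bounds check unless you emit one yourself, so the caller has to guarantee the index is valid.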
To answer https://gist.github.com/MaiaVictor/682eeca48da1f2d6db77875b331cd88a#file-it_works-rs-L73: FunctionBuilderContext is used to reuse some allocations between two FunctionBuilders. There is only a perf difference between using a single FunctionBuilderContext and using a new one every time you make a FunctionBuilder. It is cleared automatically by FunctionBuilder::new().
To answer https://gist.github.com/MaiaVictor/682eeca48da1f2d6db77875b331cd88a#file-it_works-rs-L118: it checks that some Cranelift IR invariants are not violated. Violating these could result in crashes during compilation or even miscompilation. It mainly exists as a debugging tool. Once you are certain that you are always producing correct Cranelift IR it is fine to disable it for release builds. (keeping it for debug builds can still be useful)
@Victor Maia The it_works.rs code looks fine to me.
I get it. And got the array code to work now. That is immensely helpful. Thank you so much for taking your time to answer my questions!
You're welcome!
Hi! Cool that you found another way to do it, I unfortunately don't have much insight into the PLT issue on aarch64 :thinking:
Hey, last week I started working on my own programming language for the sake of learning how to do it, meaning I'm a complete novice in this world.
I started by outputting assembly (I'm on an M1 Mac) and then calling as & ld to create the final binary. While trying to make it run on Linux as well, I went down the rabbit hole of compiler backends and found Cranelift.
Scouring for an up-to-date native binary example led me to nothing, but I did find the JIT examples, which are good enough for me to get started.
At this point I have a repo that has a minimal version of what @Victor Maia had, but I wanted to extend it to also allow for printing (so think printf/puts). Can anyone give me any pointers on how I would go about achieving that?
My plan is to also extend the example to have a native binary version of the same thing.
Here is the repo https://github.com/Mike-Neto/cranelift-simple-jit/tree/main .
Usually this is accomplished by linking to the target platform's system library (such as glibc on Linux) and then calling those functions with the appropriate calling convention (such as SystemV).
For that route you'll also need to make sure your language supports the necessary parts to construct the C-compatible data for those function calls. (a raw pointer to bytes in the case of puts)
An alternative is to write a basic standard library in Rust to piggyback on Rust's platform-agnostic standard library, and then expose those Rust functions to your language by declaring them with extern and compiling the Rust side to a static library.
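A rough sketch of what the Rust side of that second approach could look like. The names `rt_print` and `call_through_address` are hypothetical, and the second function only stands in for generated code that calls the helper through its raw address. For an actual static library the helper would additionally need to be public and exported under an unmangled symbol name so the linker can find it.

```rust
use std::io::{self, Write};

/// Hypothetical runtime helper: writes `len` bytes starting at `ptr` to
/// stdout using the C calling convention, and returns the number of bytes
/// written (0 on failure).
extern "C" fn rt_print(ptr: *const u8, len: usize) -> usize {
    // SAFETY: the caller promises that `ptr..ptr + len` is readable memory.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    match io::stdout().write_all(bytes) {
        Ok(()) => len,
        Err(_) => 0,
    }
}

/// Stand-in for generated code: it only knows the helper's address and
/// signature, and calls it with a raw pointer plus a length.
fn call_through_address(addr: *const (), msg: &[u8]) -> usize {
    // SAFETY: `addr` must point to a function with exactly this signature.
    let f: extern "C" fn(*const u8, usize) -> usize =
        unsafe { std::mem::transmute(addr) };
    f(msg.as_ptr(), msg.len())
}
```

Passing a pointer and a length (instead of a NUL-terminated string as puts expects) keeps the language side simpler, since it does not have to build C strings; that is a design choice of this sketch, not something the approach requires. A call would look like `call_through_address(rt_print as *const (), b"hi\n")`.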
Last updated: Nov 22 2024 at 17:03 UTC