I have a working function that is generating IR. The next step is global data: how does one handle that, since it is not inside a function?
How are structs created and passed? And finally, how do I run the optimizations and lowering on functions?
Any documentation on the above topics?
There is the cranelift-object crate, which is designed to help with this kind of thing, as I understand it. However, Wasmtime doesn't use that crate and does its own thing, so I don't have any experience with it. @bjorn3 might be able to give some tips, since cg_clif uses it (as I understand).
https://docs.rs/cranelift-object/latest/cranelift_object/
For optimization and lowering you want to call compile_and_emit on a Context. For a fairly minimal example which also demonstrates using cranelift-object, see the implementation of the clif-util compile command in cranelift/src/compile.rs.
I don't know specifically how to handle data sections beyond that Cranelift itself doesn't provide any of that; it just has load and store instructions and it's up to you to provide valid addresses to those. The same goes for heap addresses: you need to determine how to allocate heap space, but once you have a heap address Cranelift can take care of those memory accesses.
Similarly, Cranelift doesn't know anything about aggregate data structures. It's up to your frontend to pick a representation, whether that's a memory layout or passing individual fields in registers. If you need to call to or be called from functions produced by another compiler, you need to match the ABI details for how that compiler passes aggregates, although Cranelift will help you with using the right registers or stack layout for individual function arguments.
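To make the "pick a memory layout" option concrete, here is a minimal sketch of what a frontend might compute for a struct before emitting any IR. It assumes a C-style layout (fields in declaration order, each at its natural alignment, tail padding at the end). The names FieldType, StructLayout and layout_struct are hypothetical and not part of any Cranelift crate.

```rust
/// Size and alignment of one field, in bytes. `align` must be a power of two.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FieldType {
    pub size: u32,
    pub align: u32,
}

/// Result of laying out a struct: one byte offset per field,
/// plus the overall size and alignment of the struct.
#[derive(Debug, PartialEq, Eq)]
pub struct StructLayout {
    pub offsets: Vec<u32>,
    pub size: u32,
    pub align: u32,
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: u32, align: u32) -> u32 {
    (offset + align - 1) & !(align - 1)
}

/// Lay out fields in declaration order, C-style (an assumption of this sketch):
/// each field goes at the next offset that satisfies its alignment, and the
/// total size is padded up to the largest field alignment.
pub fn layout_struct(fields: &[FieldType]) -> StructLayout {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut offset = 0;
    let mut align = 1;
    for field in fields {
        offset = align_up(offset, field.align);
        offsets.push(offset);
        offset += field.size;
        align = align.max(field.align);
    }
    StructLayout {
        offsets,
        size: align_up(offset, align),
        align,
    }
}
```

With offsets like these in hand, accessing a field is a load or store at the struct's base address plus that field's offset, and the frontend decides separately whether a struct is passed by pointer or split into individual arguments.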
In short, Cranelift by itself doesn't know anything about interacting with an operating system or managing memory layouts. There's room for more helpers like cranelift-object but I don't know of anyone currently planning on building anything else that's generic like that.
For accessing globals you would use the global_value instruction to get the address of a global, and cranelift_module::Module::define_data() to define the value of the global.
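As a mental model only (this is not the cranelift-module API), global data can be pictured as a blob of bytes plus a table of named offsets. The sketch below uses hypothetical names (DataSection, define, address_of) and plain std types. In a real build, the object file and the linker carry this bookkeeping and resolve the final addresses.

```rust
use std::collections::HashMap;

/// A toy data section: raw bytes plus a symbol table mapping names to offsets.
#[derive(Default)]
pub struct DataSection {
    pub bytes: Vec<u8>,
    pub symbols: HashMap<String, usize>,
}

impl DataSection {
    /// Append `init` as a new global, zero-padded so that it starts at a
    /// multiple of `align` (a power of two). Returns its offset in the section.
    pub fn define(&mut self, name: &str, init: &[u8], align: usize) -> usize {
        let offset = (self.bytes.len() + align - 1) & !(align - 1);
        self.bytes.resize(offset, 0); // padding up to the aligned offset
        self.bytes.extend_from_slice(init);
        self.symbols.insert(name.to_string(), offset);
        offset
    }

    /// Address of a global once the section is placed at `base`.
    /// Returns `None` for a name that was never defined.
    pub fn address_of(&self, name: &str, base: u64) -> Option<u64> {
        self.symbols.get(name).map(|&off| base + off as u64)
    }
}
```

A function never contains the global itself. It only needs the global's address, which is then used with ordinary load and store instructions.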
I'm having a hard time visualizing this. The function compilation seems straightforward, and I imagine there is a way to parse the IR if I stored it to disk. But I'm having a hard time understanding how I can make a program without the global data, and I also don't see how struct access is possible.
I was also under the impression from a previous conversation that cranelift https://docs.rs/cranelift/0.96.0/cranelift/index.html was all I would need, and that the object/module crates were for LTO.
Cranelift doesn't support LTO. cranelift-module is how you combine multiple functions and data objects into a single entity like an object file, but this doesn't do any optimizations between its parts. It just links them together.
Sorry, I just meant linking. @bjorn3
If I need cranelift-module and cranelift-object for creating object files, does that mean I still need LLVM or a linker? I'm confused about what the whole process should look like.
You need a linker but not LLVM. Cranelift takes the place of LLVM.
Jamey Sharp said:
In short, Cranelift by itself doesn't know anything about interacting with an operating system or managing memory layouts. There's room for more helpers like cranelift-object but I don't know of anyone currently planning on building anything else that's generic like that.
I guess I'm pretty glad I went with Zig then; they have their own linker and layout builder for such things, accessible in their std library. It might be pretty easy to tap into. I'm just curious, then, how others can use Cranelift without needing to understand how to manage memory layouts or build the output for operating systems. Do you have any recommendations for learning how to do this? And where would I start if I wanted to build one of these additional helper crates for operating-system- and memory-specific building, something like cranelift-crafter?
I'm not sure what a good beginner resource would be to learn this. I picked up most of my knowledge about linkers and object file formats while working on rustc_codegen_cranelift and adding features it needs to cranelift. https://maskray.me/ is the blog of one of the devs of lld and https://www.airs.com/blog/ is the blog of the developer of gold. Both have been quite useful for figuring out specific details.
Do note, however, that while getting a basic linker working isn't all that hard once you know how, implementing all the features necessary to handle all the libraries you may want to use is a lot harder. Glibc, for example, needs symbol versioning, C needs common linkage and weak linkage, you need COMDAT, linker relaxations for thread-local storage, ... And x86 and x86_64 are the easy architectures. Many other architectures have limitations, e.g. not supporting absolute or relative addressing, or range limits for relative addressing, for which the psABI for the architecture has workarounds you need to implement.
As for how cranelift is normally used to produce executables, that is by using cranelift-object and then passing the object files to the linker the same way you would when compiling a C project.
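As a rough sketch of that last step, the link can be driven from Rust the same way a build script for a C project would do it. This assumes a C compiler driver named cc is on the PATH (an assumption about the build machine, not something Cranelift provides). The function name link_command and the file names in the usage note are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a command that links object files into an
/// executable by delegating to the system C compiler driver.
/// Assumption: a driver called `cc` exists on this machine.
pub fn link_command(objects: &[&Path], output: &Path) -> Command {
    let mut cmd = Command::new("cc");
    for object in objects {
        cmd.arg(object); // each object file written by the compiler
    }
    cmd.arg("-o").arg(output); // where the linked executable should go
    cmd
}
```

Calling status() on the returned Command would then run the link. For object files named main.o and data.o and an output named app, this is the equivalent of running cc main.o data.o -o app in a shell.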
By the way, last time I checked, Zig still used lld rather than its own built-in linker by default. Zld seems to support only Mach-O (macOS), WebAssembly, and non-PIE x86_64 ELF object files, and not PIE ELF, non-x86_64 ELF, or COFF/PE (Windows): https://github.com/kubkon/zld#supported-backends
bjorn3 said:
I'm not sure what a good beginner resource would be to learn this. I picked up most of my knowledge about linkers and object file formats while working on rustc_codegen_cranelift and adding features it needs to cranelift. https://maskray.me/ is the blog of one of the devs of lld and https://www.airs.com/blog/ is the blog of the developer of gold. Both have been quite useful for figuring out specific details.
Do note, however, that while getting a basic linker working isn't all that hard once you know how, implementing all the features necessary to handle all the libraries you may want to use is a lot harder. Glibc, for example, needs symbol versioning, C needs common linkage and weak linkage, you need COMDAT, linker relaxations for thread-local storage, ... And x86 and x86_64 are the easy architectures. Many other architectures have limitations, e.g. not supporting absolute or relative addressing, or range limits for relative addressing, for which the psABI for the architecture has workarounds you need to implement.
As for how cranelift is normally used to produce executables, that is by using cranelift-object and then passing the object files to the linker the same way you would when compiling a C project.
By the way, last time I checked, Zig still used lld rather than its own built-in linker by default. Zld seems to support only Mach-O (macOS), WebAssembly, and non-PIE x86_64 ELF object files, and not PIE ELF, non-x86_64 ELF, or COFF/PE (Windows): https://github.com/kubkon/zld#supported-backends
I bought the book Linkers and Loaders and skimmed through it. It's an old book, but probably still mostly relevant. I also ran through your object crate, and I'm starting to get a clear picture of how all this comes together now. Thank you, everyone. I'm not too keen on building a linker or a loader, but I have some interesting ideas that make me think about doing it.
As a follow-up, I still ended up doing all this in Rust and decided to use cranelift-object.
However, I have no time, as we just had our second baby. Even before that I had limited time, and I basically paused my language for 6 months. I am making this a bucket list item, so making a programming language is on my long-term goals list.
I'm trying to re-familiarize myself with all the discussions I had around this so I can start toying around again in my free time.
And I just wanted to shout out again how awesome this community has been. Thank you everyone involved in this great place. You all have been incredibly thorough helping a nobody like me!
Last updated: Jan 24 2025 at 00:11 UTC