I made a few C bindings around the cranelift and cranelift-codegen crates, and linked them with a Zig project. Here are my findings.
This crate is cranelift:
-rw-rw-r-- 2 christopher christopher 29038K May 23 13:39 libcraneliftc.a
Its total size is 29M.
I built this without the panic handler, as I couldn't figure out what I needed to link for _Unwind_Resume (or whatever it was; I didn't write it down in my notes).
Cargo.toml
[package]
name = "cranelift-export"
version = "0.1.0"
edition = "2021"
[profile.release]
panic="abort"
[profile.dev]
panic="abort"
[lib]
name = "craneliftc"
crate-type = ["staticlib", "cdylib"]
[dependencies]
cranelift = "0.96.1"
This allowed me to link the rust std library with my application.
-rw-r--r-- 1 christopher christopher 10M May 14 13:49 libstd-89bc084783fdc439.so
So the resulting size is 39M with cranelift and the rust std library.
The size ended up being larger for cranelift-codegen! 28762K to be exact. I really can't explain that, unless there is some tree shaking in cranelift, whereas there could be none in cranelift-codegen.
I haven't decided if I'm going to use the better interface of cranelift or go all out on cranelift-codegen. I recently saw a conversation on GitHub, and here, about no_std. I am early enough in my project that I'm not worried about it, but having no_std in codegen could be a huge win: it would save 10M and I would not have to distribute Rust's horribly named std lib.
Are you linking with --gc-sections when linking against libcraneliftc.a? That will prevent unused functions (like most of the standard library) from ending up in the linked artifact.
By the way, if you pass --print native-static-libs when compiling the staticlib, rustc will tell you all the linker flags necessary for successfully linking. This does not include --gc-sections, however, as it is not strictly necessary, just recommended.
You don't need to ship libstd.so when compiling as either staticlib or cdylib. Rustc will link it into the compiled staticlib or cdylib automatically.
Note that the cdylib is linked by rustc with --gc-sections by default, but for staticlibs rustc doesn't have control over the linker invocation.
bjorn3 said:
You don't need to ship libstd.so when compiling as either staticlib or cdylib. Rustc will link it into the compiled staticlib or cdylib automatically.
Hmm, my linker complains. At least 100 or so of these:
error: ld.lld: undefined symbol: _Unwind_SetGR
note: referenced by gcc.rs:208 (library/std/src/personality/gcc.rs:208)
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(rust_eh_personality) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: _Unwind_SetIP
note: referenced by gcc.rs:214 (library/std/src/personality/gcc.rs:214)
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(rust_eh_personality) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: fstat64
note: referenced by fs.rs:1055 (library/std/src/sys/unix/fs.rs:1055)
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::mmap::h1f7010bebead819b) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: mmap
note: referenced by mmap_unix.rs:14 (library/std/src/../../backtrace/src/symbolize/gimli/mmap_unix.rs:14)
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::mmap::h1f7010bebead819b) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: dl_iterate_phdr
note: referenced by libs_dl_iterate_phdr.rs:15 (library/std/src/../../backtrace/src/symbolize/gimli/libs_dl_iterate_phdr.rs:15)
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::resolve::haecb1fefe4a8bc82) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
You are likely missing -lunwind as linker argument. If you compile with --print native-static-libs you get the full list of libraries necessary.
bjorn3 said:
You are likely missing -lunwind as linker argument. If you compile with --print native-static-libs you get the full list of libraries necessary.
Running cargo with --print native-static-libs failed: unexpected argument --print found.
Weird, I thought it was std because of this line:
note: std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o
and libstd-89bc084783fdc439.so is the version I had in my Rust location.
const full_path = b.fmt("{s}/lib/libstd-89bc084783fdc439.so", .{path_unpadded});
_ = full_path;
//s_lib.addObjectFile(full_path);
However, your suggestion appears to have worked. I wouldn't consider linking with unwind unreasonable, so that's perfect.
--print native-static-libs is a rustc argument, so you either have to put it in RUSTFLAGS (which triggers a rebuild) or use cargo rustc -- --print native-static-libs.
The linker error comes from the version of libstd bundled into the staticlib. I can understand the confusion though.
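For anyone driving the link step from another build system, here is a minimal sketch of pulling the flags out of rustc's output. It assumes the flags arrive on a single line containing `native-static-libs:` followed by space-separated arguments; the function name `parse_native_static_libs` is made up for illustration.

```rust
/// Extract linker flags from captured rustc output.
/// Assumption: the relevant line looks like
/// `note: native-static-libs: -lfoo -lbar` (flags separated by whitespace).
fn parse_native_static_libs(rustc_output: &str) -> Vec<String> {
    const MARKER: &str = "native-static-libs:";
    rustc_output
        .lines()
        // Keep only lines that contain the marker, and take the text after it.
        .filter_map(|line| line.split_once(MARKER))
        // Split the remainder into individual linker arguments.
        .flat_map(|(_, flags)| flags.split_whitespace())
        .map(str::to_owned)
        .collect()
}
```

The resulting list could then be forwarded to whatever invokes the linker, together with an unwind library if the bundled libstd needs one, as in the errors above.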
Alright, after some more research and trial and error, I am going the redhat approach here.
I will make macros like below for exposing pretty much every part of cranelift I need.
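macro_rules! dispose {
    ($namespace:ident) => {
        paste::paste! {
            /// Frees a value previously handed out via `Box::into_raw`.
            #[no_mangle]
            pub extern "C" fn [<CL_ $namespace _dispose>](val: *mut $namespace) {
                // Ignore null so that disposing an empty handle is harmless.
                if val.is_null() {
                    return;
                }
                // Rebuild the Box around the pointee so its destructor runs.
                // Taking `*mut *mut T` here would only free the outer pointer.
                unsafe {
                    drop(Box::from_raw(val));
                }
            }
        }
    };
}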
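As a rough sketch of how a matching create/dispose pair could look without the paste crate, the function names can be passed to the macro explicitly. Everything named here (`opaque_handle!`, `Context`, `example_context_create`, `example_context_dispose`) is hypothetical and only stands in for whichever Cranelift types end up being exposed.

```rust
/// Generates an `extern "C"` constructor and destructor for an opaque type.
/// Assumption: the type implements `Default` and is only ever handed to C
/// as a pointer obtained from `Box::into_raw`.
macro_rules! opaque_handle {
    ($ty:ty, $create:ident, $dispose:ident) => {
        #[no_mangle]
        pub extern "C" fn $create() -> *mut $ty {
            // Move the value to the heap and give ownership to the caller.
            Box::into_raw(Box::new(<$ty>::default()))
        }

        #[no_mangle]
        pub unsafe extern "C" fn $dispose(val: *mut $ty) {
            // Null is accepted and ignored, like `free(NULL)` in C.
            if !val.is_null() {
                // Take ownership back so the destructor runs exactly once.
                drop(unsafe { Box::from_raw(val) });
            }
        }
    };
}

/// Hypothetical stand-in for a type that would be exposed to C.
#[derive(Default)]
pub struct Context {
    pub functions_compiled: u32,
}

opaque_handle!(Context, example_context_create, example_context_dispose);
```

The C side only ever sees `Context*`, so the Rust layout can change without breaking callers.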
I know this question has come up a few times. Is this something you guys are interested in as a crate? Right now, I just have it in my project, but I could probably republish these bindings.
See also https://github.com/bytecodealliance/wasmtime/issues/1164
By the way, I think I would personally prefer cranelift_ or maybe clif_ as prefix rather than CL_, but others may have a different opinion.
Ideally the C bindings for the InstBuilder would be automatically generated from the instruction definitions in cranelift-codegen-meta.
bjorn3 said:
See also https://github.com/bytecodealliance/wasmtime/issues/1164
Yes, you must re-export all functionality in an extern "C" wrapper. If you wanted to return the data directly, you would have to define a repr(C) wrapper struct and then take care of the translation. Doing this seemed tedious, and also very difficult in some scenarios, for example when the members are enums with values in their variants. So the pointer style is the shorter path, and I would also use cbindgen.
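To make the translation cost concrete, here is a small sketch of the repr(C) wrapper approach for an enum that carries data. All types here (`Operand`, `OperandTag`, `COperand`) are invented for illustration and are not Cranelift types.

```rust
/// Hypothetical Rust-side enum with values in its variants.
pub enum Operand {
    Register(u8),
    Immediate(i64),
    Absent,
}

/// C-compatible discriminant for the enum above.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OperandTag {
    Register = 0,
    Immediate = 1,
    Absent = 2,
}

/// C-compatible flattened form: a tag plus one payload slot.
/// `payload` is only meaningful for `Register` and `Immediate`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct COperand {
    pub tag: OperandTag,
    pub payload: i64,
}

impl From<&Operand> for COperand {
    fn from(op: &Operand) -> Self {
        // Every variant needs a hand-written mapping; this is the tedious part.
        match *op {
            Operand::Register(r) => COperand { tag: OperandTag::Register, payload: i64::from(r) },
            Operand::Immediate(v) => COperand { tag: OperandTag::Immediate, payload: v },
            Operand::Absent => COperand { tag: OperandTag::Absent, payload: 0 },
        }
    }
}
```

With variants that hold different payload types this grows into a tagged union per enum, which is why handing out opaque pointers is less work.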
bjorn3 said:
Ideally the C bindings for the InstBuilder would be automatically generated from the instruction definitions in cranelift-codegen-meta.
I haven't gotten to this part yet. Can you explain what you mean? Are they already exported for a C API?
Oh wow, I just saw the link to the gist. They did all that work by hand!
cranelift-codegen-meta has a list of all instructions using which the InstBuilder trait is automatically generated among other things. Ideally the C bindings for these methods would be generated from the same list of instructions rather than written by hand.
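A very rough sketch of what such a generator could look like is below. It does not use the real cranelift-codegen-meta data structures: `InstDesc` and `gen_c_wrappers` are invented for illustration, and every operand is assumed to be a `Value` passed as a `u32` handle, which is not true of real instructions (immediates, blocks, and so on would need their own handling). The emitted text follows the shape of the hand-written wrappers in the gist.

```rust
/// Hypothetical description of one instruction: its name and operand names.
pub struct InstDesc {
    pub name: &'static str,
    pub operands: &'static [&'static str],
}

/// Emit Rust source text with one `extern "C"` wrapper per instruction.
pub fn gen_c_wrappers(insts: &[InstDesc]) -> String {
    let mut out = String::new();
    for inst in insts {
        // C-side parameters: each operand is passed as a raw u32 handle.
        let params: String = inst
            .operands
            .iter()
            .map(|op| format!(", {}: u32", op))
            .collect();
        // Rust-side arguments: convert each handle back into a Value.
        let args = inst
            .operands
            .iter()
            .map(|op| format!("Value::from_u32({})", op))
            .collect::<Vec<_>>()
            .join(", ");
        out.push_str(&format!(
            concat!(
                "#[no_mangle]\n",
                "pub extern \"C\" fn cranelift_{name}(ptr: *mut FunctionData{params}) -> u32 {{\n",
                "    let data = unsafe {{ &mut *ptr }};\n",
                "    data.builder.ins().{name}({args}).as_u32()\n",
                "}}\n\n",
            ),
            name = inst.name,
            params = params,
            args = args,
        ));
    }
    out
}
```

Because the wrappers come from the same list as the instructions, an instruction that is added or removed would show up in the bindings on the next build instead of as a compile error in hand-written code.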
bjorn3 said:
cranelift-codegen-meta has a list of all instructions using which the InstBuilder trait is automatically generated among other things. Ideally the C bindings for these methods would be generated from the same list of instructions rather than written by hand.
I copied their code, and found about 200 errors on the .ins() method on the FunctionBuilder. I am guessing things like x86_pminu(val, val) are not supported. So I think what you are saying is the InstBuilder (and many other things) are generated dynamically. so functions like this wrapper.
#[no_mangle]
pub extern "C" fn cranelift_x86_macho_tls_get_addr(
    ptr: *mut FunctionData,
    gv: GlobalValueCode,
) -> ValueCode {
    let inst = unsafe {
        assert!(!ptr.is_null());
        &mut *ptr
    };
    return inst
        .builder
        .ins()
        .x86_macho_tls_get_addr(GlobalValue::from_u32(gv))
        .as_u32();
}
the x86_macho_tls_get_addr function and the other 200+ are being generated?
I think the API they have made, which works on this FunctionData struct, is fine for now. I would definitely agree that those should be hoisted and generated directly from the InstBuilder, since it is very dynamic in nature.
What is the purpose of this instbuilder losing these 200+ functions, is it safe for me to delete them from this gist, or are there replacements somewhere?
Since that gist was created there have been a decent number of changes in which instructions exist. Hence why I would prefer the bindings to be autogenerated from the canonical list of instructions.
So I think what you are saying is the InstBuilder (and many other things) are generated dynamically.
Exactly!
so functions like this wrapper. [...] the x86_macho_tls_get_addr function and the other 200+ are being generated?
I don't think those were autogenerated in the gist, but I would like them to be.
Chris Clark said:
What is the purpose of this instbuilder losing these 200+ functions, is it safe for me to delete them from this gist, or are there replacements somewhere?
Those were probably only used by the old x86 backend which has since been deleted. It is fine to delete the bindings from the gist.
I was able to pull in the meta crate, make a little binary that just calls generate, and generate the .isle files. What do they do? The comments inside those files didn't help.
And then I can see everything else in out, which has the inst_builder, settings, types, etc.
Are you suggesting I modify meta to have a "generate C API" function which makes those C compatible?
ISLE is a pattern matching DSL. It is used for optimizations and for lowering Cranelift instructions to machine instructions. It isn't relevant for generating C bindings.
https://github.com/bytecodealliance/wasmtime/blob/41417d9e0feff2aa3ae0ca66a7b8777bfaa88dc2/cranelift/codegen/meta/src/gen_inst.rs#L1014 is the function generating the InstBuilder trait.
bjorn3 said:
https://github.com/bytecodealliance/wasmtime/blob/41417d9e0feff2aa3ae0ca66a7b8777bfaa88dc2/cranelift/codegen/meta/src/gen_inst.rs#L1014 is the function generating the InstBuilder trait.
Can we outline the requirements for an acceptable PR?
You will have to ask one of the maintainers about that. I'm merely a contributor, so I can't approve the PR.
@Chris Fallin @Jamey Sharp are you maintainers?
@Chris Clark it's best if you create an issue on the GitHub repo to discuss rather than tag individual people here (also mindful of the weekend you may not get immediate replies) -- it's hard for me to tell exactly what the need at hand is, so outlining the issue and the proposed solution with as much context as reasonable would be a good start, then we can go from there
The advice that bjorn3 has already given is all a good start, but yeah, we should discuss this further in an issue if you want the result to be something we can merge. We're interested in doing things that make Cranelift usable in more projects! On the other hand, maintaining a C API is a pain, so we want to be a little careful about it.
Last updated: Jan 24 2025 at 00:11 UTC