Stream: cranelift

Topic: Initial Findings


view this post on Zulip Chris Clark (May 23 2023 at 19:16):

I made a few C bindings around the cranelift and cranelift-codegen crates, and linked them with a Zig project. Here are my findings.

This crate is cranelift:

-rw-rw-r--  2 christopher christopher 29038K May 23 13:39 libcraneliftc.a

Its total size is 29M.

I built this without the panic handler, as I couldn't figure out what I needed to link for _Unwind_Resume (or whatever it was; I didn't write it down in my notes).

Cargo.toml

[package]
name = "cranelift-export"
version = "0.1.0"
edition = "2021"

[profile.release]
panic="abort"

[profile.dev]
panic="abort"

[lib]
name = "craneliftc"
crate-type = ["staticlib", "cdylib"]

[dependencies]
cranelift = "0.96.1"

This allowed me to link the rust std library with my application.

-rw-r--r-- 1 christopher christopher  10M May 14 13:49 libstd-89bc084783fdc439.so

So the resulting size is 39M with cranelift and the rust std library.

The size ended up being larger for cranelift-codegen! 28762K to be exact. I really can't explain that, unless there is some treeshaking in cranelift, whereas there could be none in cranelift-codegen.

I haven't decided if I'm going to use the better interface of cranelift or go all out on cranelift-codegen. I recently saw a conversation on GitHub, and here, about no_std. I am early enough in my project that I'm not worried about it, but having no_std in codegen could be a huge win: it would save 10M and avoid having to distribute Rust's horribly named std lib.

view this post on Zulip bjorn3 (May 23 2023 at 19:38):

Are you linking with --gc-sections when linking against libcraneliftc.a? That will prevent unused functions (like most of the standard library) from ending up in the linked artifact.

view this post on Zulip bjorn3 (May 23 2023 at 19:39):

By the way, if you pass --print native-static-libs when compiling the staticlib, rustc will tell you all the linker flags necessary for successfully linking. This does not include --gc-sections, however, as it is not strictly necessary, just recommended.

view this post on Zulip bjorn3 (May 23 2023 at 19:41):

You don't need to ship libstd.so when compiling as either staticlib or cdylib. Rustc will link it into the compiled staticlib or cdylib automatically.

view this post on Zulip bjorn3 (May 23 2023 at 19:42):

Note that the cdylib is linked by rustc with --gc-sections by default, but for staticlibs rustc doesn't have control over the linker invocation.

view this post on Zulip Chris Clark (May 23 2023 at 20:23):

bjorn3 said:

You don't need to ship libstd.so when compiling as either staticlib or cdylib. Rustc will link it into the compiled staticlib or cdylib automatically.

Hmm, my linker complains. At least 100 or so of these.

error: ld.lld: undefined symbol: _Unwind_SetGR
    note: referenced by gcc.rs:208 (library/std/src/personality/gcc.rs:208)
    note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(rust_eh_personality) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: _Unwind_SetIP
    note: referenced by gcc.rs:214 (library/std/src/personality/gcc.rs:214)
    note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(rust_eh_personality) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: fstat64
    note: referenced by fs.rs:1055 (library/std/src/sys/unix/fs.rs:1055)
    note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::mmap::h1f7010bebead819b) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: mmap
    note: referenced by mmap_unix.rs:14 (library/std/src/../../backtrace/src/symbolize/gimli/mmap_unix.rs:14)
    note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::mmap::h1f7010bebead819b) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a
error: ld.lld: undefined symbol: dl_iterate_phdr
    note: referenced by libs_dl_iterate_phdr.rs:15 (library/std/src/../../backtrace/src/symbolize/gimli/libs_dl_iterate_phdr.rs:15)
    note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o:(std::backtrace_rs::symbolize::gimli::resolve::haecb1fefe4a8bc82) in archive /home/christopher/source/type-lang/cranelift/target/release/libcraneliftc.a

view this post on Zulip bjorn3 (May 23 2023 at 20:50):

You are likely missing -lunwind as linker argument. If you compile with --print native-static-libs you get the full list of libraries necessary.

view this post on Zulip Chris Clark (May 23 2023 at 22:31):

bjorn3 said:

You are likely missing -lunwind as linker argument. If you compile with --print native-static-libs you get the full list of libraries necessary.

cargo with --print native-static-libs produced unexpected argument --print found.
Weird, I thought it was std because of this line

note:               std-89bc084783fdc439.std.5f6d52e5-cgu.0.rcgu.o

and libstd-89bc084783fdc439.so is the version I had in my rust location.

        const full_path = b.fmt("{s}/lib/libstd-89bc084783fdc439.so", .{path_unpadded});
        _ = full_path;
        //s_lib.addObjectFile(full_path);

however your suggestion appears to have worked. I wouldn't consider linking with unwind unreasonable. So that's perfect.

view this post on Zulip bjorn3 (May 24 2023 at 07:09):

--print native-static-libs is a rustc argument, so you either have to put it in RUSTFLAGS (which triggers a rebuild) or use cargo rustc -- --print native-static-libs.

The linker error comes from the version of libstd bundled into the staticlib. I can understand the confusion though.
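For feeding that output into another build system, the flag list can be pulled out of rustc's diagnostics with a few lines of code. This is a minimal sketch, assuming the flags appear on a single line after a `native-static-libs:` marker; the function name `native_static_libs` is made up for illustration, and the actual list of libraries depends on the target and toolchain.

```rust
/// Extracts linker arguments from the `native-static-libs` note that rustc
/// prints when building a staticlib with `--print native-static-libs`.
/// Returns `None` when no such note is present in `output`.
pub fn native_static_libs(output: &str) -> Option<Vec<String>> {
    const MARKER: &str = "native-static-libs:";
    output.lines().find_map(|line| {
        // Find the marker anywhere in the line, so a leading "note: " is fine.
        let idx = line.find(MARKER)?;
        let flags = &line[idx + MARKER.len()..];
        // Each whitespace-separated token is one linker argument, e.g. "-lm".
        Some(flags.split_whitespace().map(str::to_owned).collect())
    })
}
```

The returned strings can then be passed one by one to whatever links the final binary (a Zig build script in this thread).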

view this post on Zulip Chris Clark (May 26 2023 at 15:00):

Alright, after some more research and trial and error, I am going with the Red Hat approach here.

I will make macros like below for exposing pretty much every part of cranelift I need.

macro_rules! dispose {
    ($namespace:ident) => {
        paste::paste! {
            #[no_mangle]
            pub extern "C" fn [<CL_ $namespace _dispose >](val: *mut *mut [< $namespace >]) -> () {
                unsafe {
                    drop(Box::from_raw(val));
                }
            }
        }
    };
}
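For comparison, here is a dependency-free sketch of the same create/dispose pairing. It assumes handles are produced by `Box::into_raw` and handed back as a single `*mut T`, because `Box::from_raw` must receive exactly the pointer that `Box::into_raw` returned. `Context`, `DROPS`, and the `opaque_handle!` macro are hypothetical stand-ins, and the exported names are spelled out by hand instead of being concatenated with `paste`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `Context` values were dropped, so disposal is observable.
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a Cranelift type exposed as an opaque handle.
pub struct Context {
    _data: Vec<u8>,
}

impl Drop for Context {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Generates a matching create/dispose pair for one opaque type.
macro_rules! opaque_handle {
    ($ty:ty, $create:ident, $dispose:ident, $ctor:expr) => {
        /// Moves a new value to the heap and leaks it to the caller.
        #[no_mangle]
        #[allow(non_snake_case)]
        pub extern "C" fn $create() -> *mut $ty {
            Box::into_raw(Box::new($ctor))
        }

        /// Reclaims ownership of the value and drops it; null is ignored.
        #[no_mangle]
        #[allow(non_snake_case)]
        pub unsafe extern "C" fn $dispose(val: *mut $ty) {
            if !val.is_null() {
                drop(unsafe { Box::from_raw(val) });
            }
        }
    };
}

opaque_handle!(
    Context,
    CL_Context_create,
    CL_Context_dispose,
    Context { _data: vec![0; 16] }
);
```

A C caller would pair every `CL_Context_create()` with exactly one `CL_Context_dispose(ptr)`; disposing the same pointer twice is undefined behavior in this sketch.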

I know this question has come up a few times. Is this something you guys are interested in as a crate? Right now, I just have it in my project, but I could probably republish these bindings.

Learn techniques to create a binding to make a library written in Rust accessible to C programmers. (Part 2 of 4)

view this post on Zulip bjorn3 (May 26 2023 at 16:05):

See also https://github.com/bytecodealliance/wasmtime/issues/1164

What I'd like to see is a way to use Cranelift apis from another (non rust) language. Currently the cranelift apis are rust only, but having a layer on top of that like "llvm-c" is, would make the ...

view this post on Zulip bjorn3 (May 26 2023 at 16:06):

By the way I think I would personally prefer cranelift_ or maybe clif_ as prefix rather than CL_, but others may have a different opinion.

view this post on Zulip bjorn3 (May 26 2023 at 16:08):

Ideally the C bindings for the InstBuilder would be automatically generated from the instruction definitions in cranelift-codegen-meta.

view this post on Zulip Chris Clark (May 26 2023 at 16:19):

bjorn3 said:

See also https://github.com/bytecodealliance/wasmtime/issues/1164

Yes, you must re-export all functionality in an extern "C" wrapper. If you wanted to return the data directly, you would have to write a repr(C) wrapper struct and then take care of the translation. Doing this seemed tedious, and also very difficult in some scenarios, such as when the members are enums with values in their variants. So the pointer style is the shorter path, and I would also use cbindgen.
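The two styles being weighed can be put side by side in a small sketch. Everything here is hypothetical and unrelated to real Cranelift types: `Operand` is an enum with data in its variants, `COperand` is a hand-written `repr(C)` mirror that needs a translation step, and the `demo_operand_*` functions show the pointer style where C only ever holds an opaque handle.

```rust
/// Hypothetical Rust-side enum with data in its variants.
pub enum Operand {
    Reg(u8),
    Imm(i64),
}

/// C-visible discriminant for the flattened mirror below.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OperandTag {
    Reg = 0,
    Imm = 1,
}

/// Flattened, C-compatible copy of `Operand`.
/// Only the field named by `tag` is meaningful; the other is zeroed.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct COperand {
    pub tag: OperandTag,
    pub reg: u8,
    pub imm: i64,
}

/// The translation step that has to be written for every such type.
impl From<&Operand> for COperand {
    fn from(op: &Operand) -> Self {
        match *op {
            Operand::Reg(r) => COperand { tag: OperandTag::Reg, reg: r, imm: 0 },
            Operand::Imm(i) => COperand { tag: OperandTag::Imm, reg: 0, imm: i },
        }
    }
}

/// Pointer style: allocate on the Rust side, hand C an opaque pointer.
#[no_mangle]
pub extern "C" fn demo_operand_new_imm(value: i64) -> *mut Operand {
    Box::into_raw(Box::new(Operand::Imm(value)))
}

/// Value style: copy the data out through the `repr(C)` mirror.
#[no_mangle]
pub unsafe extern "C" fn demo_operand_read(op: *const Operand) -> COperand {
    assert!(!op.is_null());
    COperand::from(unsafe { &*op })
}

/// Releases a handle created by `demo_operand_new_imm`; null is ignored.
#[no_mangle]
pub unsafe extern "C" fn demo_operand_free(op: *mut Operand) {
    if !op.is_null() {
        drop(unsafe { Box::from_raw(op) });
    }
}
```

The pointer style needs no mirror type at all unless C has to inspect the value, which is why it tends to be less work for types with many variants.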

view this post on Zulip Chris Clark (May 26 2023 at 16:20):

bjorn3 said:

Ideally the C bindings for the InstBuilder would be automatically generated from the instruction definitions in cranelift-codegen-meta.

I haven't gotten to this part yet. Can you explain what you mean? Are they already exported for a C API?

view this post on Zulip Chris Clark (May 26 2023 at 16:24):

Oh wow, I just saw the link to the gist. They did all that work by hand!

view this post on Zulip bjorn3 (May 26 2023 at 18:30):

cranelift-codegen-meta has a list of all instructions, from which the InstBuilder trait is automatically generated, among other things. Ideally the C bindings for these methods would be generated from the same list of instructions rather than written by hand.
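As a rough illustration of that idea, the sketch below emits the source text of one `extern "C"` wrapper per entry in an instruction list. It is not based on the real data model in cranelift-codegen-meta: `InstDef` and `gen_c_wrapper` are hypothetical, and the emitted code assumes every operand and the single result are `Value`s passed as `u32`, which does not hold for all instructions.

```rust
/// Hypothetical description of one instruction. The definitions in
/// cranelift-codegen-meta carry far more information than this.
pub struct InstDef {
    pub name: &'static str,
    pub operands: &'static [&'static str],
}

/// Emits Rust source text for one `extern "C"` wrapper around an
/// `InstBuilder` method, in the style of the hand-written gist.
pub fn gen_c_wrapper(prefix: &str, inst: &InstDef) -> String {
    // C-side parameters: one `u32` handle per operand.
    let params: String = inst
        .operands
        .iter()
        .map(|op| format!(", {op}: u32"))
        .collect();
    // Rust-side arguments: convert each handle back into a `Value`.
    let args = inst
        .operands
        .iter()
        .map(|op| format!("Value::from_u32({op})"))
        .collect::<Vec<_>>()
        .join(", ");
    format!(
        "#[no_mangle]\n\
         pub extern \"C\" fn {prefix}{name}(ptr: *mut FunctionData{params}) -> u32 {{\n    \
         let data = unsafe {{ &mut *ptr }};\n    \
         data.builder.ins().{name}({args}).as_u32()\n\
         }}\n",
        name = inst.name,
    )
}
```

Calling `gen_c_wrapper("cranelift_", &inst)` for each entry and writing the results to a file from a build script would keep the bindings in step with the instruction list, which is the property asked for above.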

view this post on Zulip Chris Clark (May 26 2023 at 18:58):

bjorn3 said:

cranelift-codegen-meta has a list of all instructions, from which the InstBuilder trait is automatically generated, among other things. Ideally the C bindings for these methods would be generated from the same list of instructions rather than written by hand.

I copied their code, and found about 200 errors on the .ins() method on the FunctionBuilder. I am guessing things like x86_pminu(val, val) are not supported. So I think what you are saying is the InstBuilder (and many other things) are generated dynamically. So functions like this wrapper:

#[no_mangle]
pub extern "C" fn cranelift_x86_macho_tls_get_addr(
    ptr: *mut FunctionData,
    gv: GlobalValueCode,
) -> ValueCode {
    let inst = unsafe {
        assert!(!ptr.is_null());
        &mut *ptr
    };
    return inst
        .builder
        .ins()
        .x86_macho_tls_get_addr(GlobalValue::from_u32(gv))
        .as_u32();
}

the x86_macho_tls_get_addr function and the other 200+ are being generated?

I think the API they have made, which works on this FunctionData struct, is fine for now, and I would definitely agree that those functions should be hoisted and worked on directly from the InstBuilder, since it is very dynamic in nature.

What is the purpose of this InstBuilder losing these 200+ functions? Is it safe for me to delete them from this gist, or are there replacements somewhere?
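The wrapper style quoted above, where entity references cross the C boundary as plain `u32` codes and are rebuilt with `from_u32` / `as_u32`, can be sketched without Cranelift. The `Value` and `FunctionData` types below are simplified stand-ins written for this example, not the gist's or Cranelift's definitions, and the `demo_*` names are invented.

```rust
/// Stand-in for an entity reference: a typed index that is cheap to copy.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(u32);

impl Value {
    pub fn from_u32(x: u32) -> Self {
        Value(x)
    }
    pub fn as_u32(self) -> u32 {
        self.0
    }
}

/// Stand-in builder state: records (opcode, lhs, rhs) for each instruction.
pub struct FunctionData {
    insts: Vec<(&'static str, Value, Value)>,
    next_value: u32,
}

impl FunctionData {
    /// Appends an instruction and returns a fresh result value.
    fn push(&mut self, opcode: &'static str, a: Value, b: Value) -> Value {
        self.insts.push((opcode, a, b));
        let result = Value::from_u32(self.next_value);
        self.next_value += 1;
        result
    }
}

/// Creates builder state whose first result value has index `first_value`.
#[no_mangle]
pub extern "C" fn demo_function_data_new(first_value: u32) -> *mut FunctionData {
    Box::into_raw(Box::new(FunctionData { insts: Vec::new(), next_value: first_value }))
}

/// Operands arrive as `u32` codes and the result leaves as one.
#[no_mangle]
pub unsafe extern "C" fn demo_iadd(ptr: *mut FunctionData, a: u32, b: u32) -> u32 {
    assert!(!ptr.is_null());
    let data = unsafe { &mut *ptr };
    data.push("iadd", Value::from_u32(a), Value::from_u32(b)).as_u32()
}

/// Number of instructions recorded so far.
#[no_mangle]
pub unsafe extern "C" fn demo_inst_count(ptr: *const FunctionData) -> usize {
    assert!(!ptr.is_null());
    unsafe { (*ptr).insts.len() }
}

/// Releases state created by `demo_function_data_new`; null is ignored.
#[no_mangle]
pub unsafe extern "C" fn demo_function_data_free(ptr: *mut FunctionData) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Because the C side only sees integers and one opaque pointer, no `repr(C)` layout has to be maintained for the builder or the entity types.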

view this post on Zulip bjorn3 (May 26 2023 at 19:31):

Since that gist was created, there have been a decent number of changes in which instructions exist. Hence why I'd prefer that the bindings be autogenerated from the canonical list of instructions.

view this post on Zulip bjorn3 (May 26 2023 at 19:34):

So I think what you are saying is the InstBuilder (and many other things) are generated dynamically.

Exactly!

so functions like this wrapper. [...] the x86_macho_tls_get_addr function and the other 200+ are being generated?

I don't think those were autogenerated in the gist, but I would like them to be.

view this post on Zulip bjorn3 (May 26 2023 at 19:35):

Chris Clark said:

What is the purpose of this InstBuilder losing these 200+ functions? Is it safe for me to delete them from this gist, or are there replacements somewhere?

Those were probably only used by the old x86 backend which has since been deleted. It is fine to delete the bindings from the gist.

view this post on Zulip Chris Clark (May 26 2023 at 19:42):

I was able to pull in the meta crate, make a little binary that just calls generate, and generate the .isle files. What do they do? The comments inside those files didn't help.

And then, I can see everything else in out which made the inst_builder, settings, types, etc.

Are you suggesting I modify meta to have a "generate c api" function which makes those C compatible?

view this post on Zulip bjorn3 (May 26 2023 at 20:08):

ISLE is a pattern matching DSL. It is used for optimizations and for lowering cranelift instructions to machine instructions. It isn't relevant for generating C bindings.

view this post on Zulip bjorn3 (May 26 2023 at 20:09):

https://github.com/bytecodealliance/wasmtime/blob/41417d9e0feff2aa3ae0ca66a7b8777bfaa88dc2/cranelift/codegen/meta/src/gen_inst.rs#L1014 is the function generating the InstBuilder trait.


view this post on Zulip Chris Clark (May 26 2023 at 20:19):

bjorn3 said:

https://github.com/bytecodealliance/wasmtime/blob/41417d9e0feff2aa3ae0ca66a7b8777bfaa88dc2/cranelift/codegen/meta/src/gen_inst.rs#L1014 is the function generating the InstBuilder trait.

Can we outline the requirements for an acceptable PR?

view this post on Zulip bjorn3 (May 26 2023 at 21:22):

You will have to ask one of the maintainers about that. I'm merely a contributor, so I can't approve the PR.

view this post on Zulip Chris Clark (May 28 2023 at 01:09):

@Chris Fallin @Jamey Sharp are you maintainers?

view this post on Zulip Chris Fallin (May 28 2023 at 02:33):

@Chris Clark it's best if you create an issue on the GitHub repo to discuss rather than tag individual people here (also mindful of the weekend you may not get immediate replies) -- it's hard for me to tell exactly what the need at hand is, so outlining the issue and the proposed solution with as much context as reasonable would be a good start, then we can go from there

view this post on Zulip Jamey Sharp (May 30 2023 at 16:33):

The advice that bjorn3 has already given is all a good start, but yeah, we should discuss this further in an issue if you want the result to be something we can merge. We're interested in doing things that make Cranelift usable in more projects! On the other hand, maintaining a C API is a pain, so we want to be a little careful about it.


Last updated: Oct 23 2024 at 20:03 UTC