Stream: general

Topic: cranelift code


view this post on Zulip Minyeong Jeong (Jan 19 2021 at 12:15):

Hi, I just asked this community about linear memory, and thanks to @bjorn3, I finally found the code that I have to work on. After looking into the code for hours, I have some questions. I have a poor understanding of JITs and interpreters, so the following snippets were very confusing, such as:

let adj_bound = pos.ins().iadd_imm(bound, -(access_size as i64));
or
pos.func.dfg.replace(inst).iconst(addr_ty, 0);
or
let final_addr = pos.ins().iadd(base, offset);

My guess is that they are generating Cranelift IR, which will then be optimized and turned into platform-specific machine code. Am I right? But the fact that some of these calls are "returning" a value is confusing. What are they returning? If they are returning the result of the instruction, why bother turning them into machine code and executing it again when we already know the result? Sorry for my poor English and understanding of the topic. Any reply will help.

view this post on Zulip bjorn3 (Jan 19 2021 at 13:07):

You could think of the "Value"s accepted and returned by these functions as register names or stack locations. They don't define the actual value, only the location at which it will appear at runtime.

view this post on Zulip Chris Fallin (Jan 19 2021 at 17:07):

@Minyeong Jeong you can get a pretty good overview of this sort of code generation by understanding the IR (intermediate representation) formats such as CLIF (Cranelift's IR) or LLVM's IR. Typically these IRs write instructions like "result = add(arg1, arg2)". That's an add instruction, but we just write the destination "register" on the left; it's just different notation :-) The code that generates the IR will look like that. The operations are not running at compile time; all of the args and results are symbolic (registers or other sorts of values, as bjorn3 says).
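To make the "symbolic value" idea concrete, here is a minimal toy sketch of an IR builder. Everything in it (`Value`, `Inst`, `Builder`, and the `display` text format) is hypothetical and much simpler than Cranelift's real types; it only illustrates that a call like `iadd` records an instruction and hands back a name for its result, without adding anything at compile time.

```rust
// Toy model only: these types are hypothetical and far simpler than Cranelift's.

/// A handle naming the result of an instruction, like `v2` in IR text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(pub u32);

/// The only instructions this toy IR knows about.
#[derive(Debug, PartialEq, Eq)]
pub enum Inst {
    Iconst(i64),
    Iadd(Value, Value),
}

/// Records instructions in order; instruction number `i` defines `Value(i)`.
#[derive(Default)]
pub struct Builder {
    pub insts: Vec<Inst>,
}

impl Builder {
    /// Appends an instruction and returns the name of its result.
    fn push(&mut self, inst: Inst) -> Value {
        self.insts.push(inst);
        Value((self.insts.len() - 1) as u32)
    }

    /// Emits `vN = iconst imm` and returns the name `vN`, not the number.
    pub fn iconst(&mut self, imm: i64) -> Value {
        self.push(Inst::Iconst(imm))
    }

    /// Emits `vN = iadd a, b`. Nothing is added here; the add runs later.
    pub fn iadd(&mut self, a: Value, b: Value) -> Value {
        self.push(Inst::Iadd(a, b))
    }

    /// Renders the recorded instructions in a simple IR-like text form.
    pub fn display(&self) -> String {
        let mut out = String::new();
        for (i, inst) in self.insts.iter().enumerate() {
            let line = match inst {
                Inst::Iconst(imm) => format!("v{} = iconst {}\n", i, imm),
                Inst::Iadd(a, b) => format!("v{} = iadd v{}, v{}\n", i, a.0, b.0),
            };
            out.push_str(&line);
        }
        out
    }
}
```

Calling `iconst` twice and then `iadd` on the two returned handles yields a third handle, and the builder holds three recorded instructions. The sum itself is never computed by the builder.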

view this post on Zulip Minyeong Jeong (Jan 20 2021 at 06:27):

@bjorn3 @Chris Fallin thank you and thank you again, for your kind explanation to a beginner. Appreciate it!

view this post on Zulip Minyeong Jeong (Jan 20 2021 at 06:31):

I have trouble understanding the code from dynamic_heap.

let bound = pos.ins().global_value(offset_ty, bound_gv);
let (cc, lhs, bound) = if access_size == 1 {
    // `offset > bound - 1` is the same as `offset >= bound`.
    (IntCC::UnsignedGreaterThanOrEqual, offset, bound)
} else if access_size <= min_size {
    // We know that bound >= min_size, so here we can compare `offset > bound - access_size`
    // without wrapping.
    let adj_bound = pos.ins().iadd_imm(bound, -(access_size as i64));
    (IntCC::UnsignedGreaterThan, offset, adj_bound)
} else {
    // We need an overflow check for the adjusted offset.
    let access_size_val = pos.ins().iconst(offset_ty, access_size as i64);
    let (adj_offset, overflow) = pos.ins().iadd_ifcout(offset, access_size_val);
    pos.ins().trapif(
        isa.unsigned_add_overflow_condition(),
        overflow,
        ir::TrapCode::HeapOutOfBounds,
    );
    (IntCC::UnsignedGreaterThan, adj_offset, bound)
};

Can someone explain how they can be sure that there is no overflow in the second condition of the if statement? I also don't understand why the last two conditions have to be split, and why one adjusts the offset and the other adjusts the bound.

view this post on Zulip Chris Fallin (Jan 20 2021 at 06:57):

@Minyeong Jeong can you expand on what you mean by "second condition of if statement"? Do you mean the "else if" body?

One key invariant in that code that might be useful is that we know that bound is at least min_size; in other words, the heap is always at least min_size. Given access_size <= min_size in this branch, and given min_size <= bound by that invariant, we have access_size <= bound, so the bound - access_size expression is always nonnegative.
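The reasoning can be checked with a small plain-integer model of the three branches. This is only a sketch: `check_access` and `Check` are hypothetical names, it runs on ordinary `u64` numbers instead of emitting IR, and the overflow trap is modeled as an out-of-bounds result. It also suggests why the last two cases are split. When `access_size <= min_size`, subtracting from the bound cannot wrap, so no overflow check is needed. When `access_size > min_size`, the bound might be smaller than `access_size`, so the offset is adjusted instead and the addition has to be checked for overflow.

```rust
/// Outcome of the modeled bounds check.
#[derive(Debug, PartialEq, Eq)]
pub enum Check {
    InBounds,
    OutOfBounds,
}

/// Plain-integer model of the three branches. `bound` is the current heap
/// size in bytes and must satisfy `bound >= min_size` (the invariant above).
pub fn check_access(offset: u64, access_size: u64, bound: u64, min_size: u64) -> Check {
    assert!(access_size >= 1, "an access touches at least one byte");
    assert!(bound >= min_size, "invariant: the heap is at least min_size");

    let out_of_bounds = if access_size == 1 {
        // `offset > bound - 1` is the same as `offset >= bound`.
        offset >= bound
    } else if access_size <= min_size {
        // access_size <= min_size <= bound, so this subtraction cannot wrap.
        offset > bound - access_size
    } else {
        // bound may be smaller than access_size here, so adjust the offset
        // instead and treat a wrapping addition as out of bounds (the trap).
        match offset.checked_add(access_size) {
            None => true,
            Some(end) => end > bound,
        }
    };

    if out_of_bounds {
        Check::OutOfBounds
    } else {
        Check::InBounds
    }
}
```

In all three branches the answer agrees with the mathematical condition `offset + access_size > bound`; the branches differ only in how they avoid wrapping while evaluating it.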

view this post on Zulip Minyeong Jeong (Jan 21 2021 at 02:36):

@Chris Fallin Thank you for your kind explanation and for sparing your valuable time. Your answer addressed exactly what I was curious about.

view this post on Zulip Ron Shavit (Sep 11 2021 at 01:30):

Hi, this may be a very nooby question to ask, but as there is no representation of arrays in Cranelift, how would I go about representing an array of, for example, int32? Is it possible?

view this post on Zulip Mario Carneiro (Sep 11 2021 at 03:20):

Arrays go in memory, and your local variable is instead the pointer to the array
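As a rough illustration of what "the array lives in memory and you hold a pointer" means, the sketch below models memory as a flat byte buffer and computes element addresses by hand. The `Memory` type and its methods are hypothetical helpers, not Cranelift API; in generated IR the same idea would appear as address arithmetic followed by load and store instructions.

```rust
/// Toy linear memory: a flat byte buffer addressed by byte offset.
pub struct Memory {
    pub bytes: Vec<u8>,
}

impl Memory {
    /// Creates a zero-filled memory of `size` bytes.
    pub fn new(size: usize) -> Self {
        Memory { bytes: vec![0; size] }
    }

    /// Address of element `index` of an i32 array that starts at `base`.
    pub fn elem_addr(base: usize, index: usize) -> usize {
        base + index * std::mem::size_of::<i32>()
    }

    /// Stores a little-endian i32 at byte address `addr`.
    pub fn store_i32(&mut self, addr: usize, value: i32) {
        self.bytes[addr..addr + 4].copy_from_slice(&value.to_le_bytes());
    }

    /// Loads a little-endian i32 from byte address `addr`.
    pub fn load_i32(&self, addr: usize) -> i32 {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[addr..addr + 4]);
        i32::from_le_bytes(buf)
    }
}
```

Here the "array variable" is just the `base` address; element `i` is reached by adding `i * 4` to it before loading or storing.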

view this post on Zulip Ron Shavit (Sep 11 2021 at 12:17):

Mario Carneiro said:

Arrays go in memory, and your local variable is instead the pointer to the array

Oh I'm dumb :)
I haven't touched C in so long that I forgot, thank you!

view this post on Zulip dammi-i (Dec 24 2021 at 08:33):

Hello, I'm getting this error whenever I try to call a function from the _start function:
thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', home/.cargo/registry/src/github.com-1ecc6299db9ec823/cranelift-codegen-0.79.0/src/ir/dfg.rs:707:44
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

view this post on Zulip bjorn3 (Dec 24 2021 at 15:58):

That panic indicates that you are trying to call a function that isn't declared inside the current Function. Make sure that you call declare_func_in_func to get a FuncRef inside every function that makes the call, instead of reusing one FuncRef across codegen of different functions.
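A toy model may help explain why a reused reference produces "the len is 0 but the index is 0". The types and the `declare_callee` method below are hypothetical stand-ins, not Cranelift's actual data structures: the assumption is only that a function reference behaves like an index into a table owned by one particular function.

```rust
/// Module-wide identity of a function (plays the role of a function ID).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FuncId(pub usize);

/// Index into ONE function's callee table (plays the role of a FuncRef).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FuncRef(pub usize);

/// A function under construction, with its own table of declared callees.
#[derive(Default)]
pub struct Function {
    pub callees: Vec<FuncId>,
}

impl Function {
    /// Declares `id` inside this function and returns a reference that is
    /// only meaningful for this function.
    pub fn declare_callee(&mut self, id: FuncId) -> FuncRef {
        self.callees.push(id);
        FuncRef(self.callees.len() - 1)
    }

    /// Resolves a call target. `None` models the out-of-bounds index that
    /// shows up as a panic when a reference from another function is used.
    pub fn callee(&self, r: FuncRef) -> Option<FuncId> {
        self.callees.get(r.0).copied()
    }
}
```

In this model, a reference obtained while building one function points at nothing in a second function whose table is still empty, which matches the shape of the reported panic. Declaring the callee again in the second function gives a valid reference.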

view this post on Zulip bjorn3 (Dec 24 2021 at 15:58):

@dammi-i

view this post on Zulip dammi-i (Dec 25 2021 at 00:01):

Thanks! One question, by the way: how would I call libc functions, e.g. printf?

view this post on Zulip bjorn3 (Dec 25 2021 at 18:45):

@dammi-i You can use module.declare_function("printf", Linkage::Import, &printf_signature).

view this post on Zulip bjorn3 (Dec 25 2021 at 18:45):

And similar for other functions.

view this post on Zulip dammi-i (Dec 27 2021 at 06:09):

For some reason it seems to segfault when I call a libc function. Here is the code: https://paste.rs/vYL.rs Let me know what is wrong. I used cc to link the object file with libc (arch: aarch64, os: linux)
and thanks!

view this post on Zulip bjorn3 (Dec 27 2021 at 12:48):

@dammi-i Why is the module.declare_func_in_func commented out? Your code works fine on x86_64 with cc out.o -o out. It also works fine when cross compiling to aarch64-unknown-linux-gnu and then using qemu. Can you do objdump -dr ./out? (assuming you name the linked executable ./out)

view this post on Zulip dammi-i (Dec 27 2021 at 13:49):

here is the output of objdump -dr: https://paste.rs/zWh.s

view this post on Zulip bjorn3 (Dec 27 2021 at 18:52):

Could you also run readelf -r out on the same executable? For me the relevant sections of objdump -dr and readelf -r look like:

0000000000000784 <main>:
 784:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
 788:   910003fd        mov     x29, sp
 78c:   d2800180        mov     x0, #0xc                        // #12
 790:   58000041        ldr     x1, 798 <main+0x14>
 794:   14000003        b       7a0 <main+0x1c>
        ...
 7a0:   d63f0020        blr     x1
 7a4:   d2800000        mov     x0, #0x0                        // #0
 7a8:   a8c17bfd        ldp     x29, x30, [sp], #16
 7ac:   d65f03c0        ret

Relocation section '.rela.dyn' at offset 0x468 contains 11 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
[...]
000000000798  000300000101 R_AARCH64_ABS64   0000000000000000 exit@GLIBC_2.17 + 0

Basically the address to the exit function that is loaded is relocated at runtime.
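As a simplified sketch of what "relocated at runtime" means for the entry at offset 0x798 above: an absolute 64-bit relocation asks for the symbol's address plus the addend to be written as an 8-byte word at the given offset, and the `ldr x1, 798` instruction then loads that word. The `Abs64Reloc` type, both helper functions, and the symbol address in the usage note are made up for illustration; a real dynamic loader does considerably more.

```rust
/// A simplified absolute 64-bit relocation: the 8 bytes at `offset` should
/// end up holding `symbol address + addend`.
pub struct Abs64Reloc {
    pub offset: usize,
    pub addend: i64,
}

/// Patches `image` in place, storing the resolved address little-endian.
pub fn apply_abs64(image: &mut [u8], reloc: &Abs64Reloc, symbol_addr: u64) {
    let value = symbol_addr.wrapping_add(reloc.addend as u64);
    image[reloc.offset..reloc.offset + 8].copy_from_slice(&value.to_le_bytes());
}

/// Reads the 8-byte little-endian word at `offset`, as a literal load would.
pub fn read_u64(image: &[u8], offset: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&image[offset..offset + 8]);
    u64::from_le_bytes(buf)
}
```

Before the relocation is applied the slot holds zero; afterwards it holds the resolved address (for example an invented value such as `0x7f00_1234_5678`), which is what the indirect `blr x1` would jump to.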

view this post on Zulip bjorn3 (Dec 27 2021 at 18:53):

In your case it seems like the address is already relocated at link time.

view this post on Zulip bjorn3 (Dec 27 2021 at 18:53):

@dammi-i

view this post on Zulip dammi-i (Dec 27 2021 at 19:48):

Relocation section '.rela.dyn' at offset 0x490 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000002930  000000000403 R_AARCH64_RELATIV                    2730
000000002938  000000000403 R_AARCH64_RELATIV                    2740
000000002940  000000000403 R_AARCH64_RELATIV                    2750
000000002948  000000000403 R_AARCH64_RELATIV                    16a0

Relocation section '.rela.plt' at offset 0x4f0 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000002968  000200000402 R_AARCH64_JUMP_SL 0000000000000000 __libc_init@LIBC + 0
000000002970  000100000402 R_AARCH64_JUMP_SL 0000000000000000 __cxa_atexit@LIBC + 0
000000002978  000300000402 R_AARCH64_JUMP_SL 0000000000000000 __register_atfork@LIBC + 0
000000002980  000400000402 R_AARCH64_JUMP_SL 0000000000001720 exit@LIBC + 0

view this post on Zulip bjorn3 (Dec 27 2021 at 20:18):

Yours uses relative instead of absolute relocations for some reason, despite the code expecting absolute addresses.
Edit: I misinterpreted this. It is actually missing the relocation entirely.

view this post on Zulip bjorn3 (Dec 27 2021 at 20:20):

What does cc -v -Wl,-v out.o -o out show? Maybe the linker is configured differently?

view this post on Zulip bjorn3 (Dec 27 2021 at 20:37):

When trying it on my android phone, it gives a linker warning about a relocation in the .text section and refuses to run it for this reason. Cranelift doesn't yet support position independent code (which should make it run on Android) for AArch64 it seems.

view this post on Zulip dammi-i (Dec 27 2021 at 20:46):

clang version 13.0.0
Target: aarch64-unknown-linux-android24
Thread model: posix
InstalledDir: /data/data/com.termux/files/usr/bin
"/data/data/com.termux/files/usr/bin/ld.lld" --sysroot=/data/data/com.termux/files -pie -z noexecstack -EL --fix-cortex-a53-843419 --warn-shared-textrel -z now -z relro -z max-page-size=4096 --hash-style=gnu --enable-new-dtags -rpath=/data/data/com.termux/files/usr/lib --eh-frame-hdr -m aarch64linux -dynamic-linker /system/bin/linker64 -o out /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o -L/data/data/com.termux/files/usr/lib -L/system/lib64 -v out.o /data/data/com.termux/files/usr/lib/clang/13.0.0/lib/android/libclang_rt.builtins-aarch64-android.a -l:libunwind.a -ldl -lc /data/data/com.termux/files/usr/lib/clang/13.0.0/lib/android/libclang_rt.builtins-aarch64-android.a -l:libunwind.a -ldl /data/data/com.termux/files/usr/lib/crtend_android.o
LLD 13.0.0 (compatible with GNU linkers)

view this post on Zulip bjorn3 (Dec 27 2021 at 20:54):

Oh, you are on android. I don't know why the linker mislinks it in the first place, but android requires position independent code. Cranelift doesn't support it yet on AArch64, so unfortunately I can't help you for now. https://github.com/bytecodealliance/wasmtime/issues/2907 is the issue for adding pic support.


Last updated: Dec 23 2024 at 13:07 UTC