Hi! I am creating my own language, QuickScript, and I am running into a problem during AOT compilation.
This code:
fn do_math(a: i32, b: i32) -> i32 {
let v: i32 = a - b;
return v;
}
fn main() -> i32 {
let val: i32 = do_math(4, 2);
puts("Hello, world!");
puts("Another test!");
printf("Math: %i\n", val);
return 0;
}
Produces a segfault when it hits the line printf("Math: %i\n", val); It seems like it may be having trouble with either varargs or variable access, but I'm not sure. I'm almost certain that this is a bug in Cranelift, as not only does it seem somewhat similar to other issues, but I've also tried everything I can think of to get this to work. Some of the issues I looked at suggest stack misalignment and weird asm commands, but I don't know. Can anyone help me? The full code is at https://github.com/RedstoneWizard08/QuickScript.
Also if you want to check out the error for yourself, you can grab a nightly binary here: https://nightly.link/RedstoneWizard08/QuickScript/workflows/build/main/binaries
Some references I found:
Also, side note, how do I correctly extract the VCode that's generated, from the frontend? I can't figure out how to do that.
If need be I can also provide the CLIF (IR) output when dumped.
@Jacob Sapoznikow Cranelift doesn't support varargs, and at least on x86-64 there are some special calling convention details (number of args passed somehow?) that will cause a segfault if not explicitly supported
On x86_64, varargs work fine when treated as regular functions, as long as you only pass integers and extend them to the full width of a register before passing them, that is, zero- or sign-extend them to i64. (C does implicit promotion, but Rust rejects any other integer types.) On aarch64 this works too, except on Apple platforms, where a slightly different calling convention is used for variadic functions (making cg_clif incompatible with arm macOS).
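To make the "zero or sign extending to i64" step concrete, here is a minimal Rust sketch of what each kind of extension does to the 64-bit register image. The helper names are made up for illustration; in CLIF the analogous instructions are sextend and uextend.

```rust
/// Sign-extend a signed 32-bit value to 64 bits.
/// The upper 32 bits are filled with copies of the sign bit.
pub fn sign_extend(v: i32) -> i64 {
    v as i64
}

/// Zero-extend an unsigned 32-bit value to 64 bits.
/// The upper 32 bits are always zero.
pub fn zero_extend(v: u32) -> u64 {
    v as u64
}

/// The raw 64-bit pattern a register would hold after sign extension.
pub fn sign_extended_bits(v: i32) -> u64 {
    sign_extend(v) as u64
}
```

For a negative value such as -2 the two extensions give different upper halves, which is why the callee and caller need to agree on the width of what is in the register.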
Okay. The way I've done varargs is that I just define the function to have the number of arguments passed to it. I do realize that that's a bad way to do it and there has to be a better way, as it causes conflicts. My issue is that I'm just trying to get something printing, and this single printf is causing trouble. I guess maybe it's time to implement my own version of the format!() macro? Not sure though. Is there a way to rename the function so I can define a reference to it multiple times with different argument types? I'd love to be able to do a puts(string) and puts(int) in the same program.
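One common way to get puts(string) and puts(int) side by side is to key each declaration by name plus argument types. The following is only a sketch under assumptions: the Ty enum, the mangle function, and the `$` separator are all hypothetical and not taken from QuickScript or Cranelift. A frontend would still have to map each mangled key to whatever symbol it actually links against.

```rust
/// Hypothetical argument types a frontend might track.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Ty {
    I32,
    I64,
    Str,
}

impl Ty {
    /// Short tag used when building the mangled name.
    fn tag(self) -> &'static str {
        match self {
            Ty::I32 => "i32",
            Ty::I64 => "i64",
            Ty::Str => "str",
        }
    }
}

/// Build a unique key from the source-level name and its argument types,
/// so two overloads of the same name never collide in a declaration table.
pub fn mangle(name: &str, args: &[Ty]) -> String {
    let mut out = String::from(name);
    for ty in args {
        out.push('$');
        out.push_str(ty.tag());
    }
    out
}
```

With this scheme, mangle("puts", &[Ty::Str]) and mangle("puts", &[Ty::I32]) produce distinct keys, so each overload can carry its own signature.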
Also, I'll try using an i64 instead of an i32 and see how it goes.
Very strange that x86_64 segfaults at that though.
I finally see why format and println have to be macros xD
Still segfaulting with an i64.
@Jacob Sapoznikow if I might hazard a friendly suggestion: issues like this ("my generated code crashes") are almost always because of some bug in the input CLIF; to really narrow that down one would have to go through your whole codebase and understand its design; and it's unlikely that most folks here have the time to do that. Maybe the best advice we can give is on debugging techniques: the place I would start is by trying to reduce the testcase to its minimal version, then single-stepping (have you tried capturing the crash under gdb or lldb, or even better, rr?). Usually that will either make the problem immediately apparent ("ah, that pointer is null") or at least raise further suspicion ("the crash happens in the call to this runtime function, did I resolve the relocation correctly?"). If you're able, looking at the disassembly at the point that you get the segfault, and examining register values, can give a lot of information as well.
I've tried using gdb and it was unhelpful. I'll take a look at rr.
did you look at the disassembly and register values? disass (or disass $pc,+64 if you're in JIT-code without function info to tell gdb its boundaries), and info regs respectively
I'm also not good enough in x86 assembly to know what the register values mean xD
Yeah, I did try that.
OK, can you tell us what you saw? What is the crashing instruction?
I'll look.
@RedstoneWizard08 ➜ /workspaces/QuickScript (main) $ cargo run -- c dev/main.qs
Blocking waiting for file lock on package cache
Blocking waiting for file lock on package cache
Blocking waiting for file lock on package cache
Compiling qsc-codegen v0.6.0 (/workspaces/QuickScript/crates/qsc-codegen)
Compiling qsc-cli v0.6.0 (/workspaces/QuickScript/crates/qsc-cli)
Finished dev [unoptimized + debuginfo] target(s) in 32.72s
Running `target/debug/qsc c dev/main.qs`
@RedstoneWizard08 ➜ /workspaces/QuickScript (main) $ gdb dev/main
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
# ...
(No debugging symbols found in dev/main)
(gdb) run
Starting program: /workspaces/QuickScript/dev/main
Hello, world!
Another test!
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e2ccbb in __printf (format=0x5555555575e4 <literal_string_6iQXR70woh> "Math: %i - %i = %i\n") at printf.c:28
28 printf.c: No such file or directory.
(gdb) disass
Dump of assembler code for function __printf:
0x00007ffff7e2cc90 <+0>: endbr64
0x00007ffff7e2cc94 <+4>: sub $0xd8,%rsp
0x00007ffff7e2cc9b <+11>: mov %rdi,%r10
0x00007ffff7e2cc9e <+14>: mov %rsi,0x28(%rsp)
0x00007ffff7e2cca3 <+19>: mov %rdx,0x30(%rsp)
0x00007ffff7e2cca8 <+24>: mov %rcx,0x38(%rsp)
0x00007ffff7e2ccad <+29>: mov %r8,0x40(%rsp)
0x00007ffff7e2ccb2 <+34>: mov %r9,0x48(%rsp)
0x00007ffff7e2ccb7 <+39>: test %al,%al
0x00007ffff7e2ccb9 <+41>: je 0x7ffff7e2ccf2 <__printf+98>
=> 0x00007ffff7e2ccbb <+43>: movaps %xmm0,0x50(%rsp)
0x00007ffff7e2ccc0 <+48>: movaps %xmm1,0x60(%rsp)
0x00007ffff7e2ccc5 <+53>: movaps %xmm2,0x70(%rsp)
0x00007ffff7e2ccca <+58>: movaps %xmm3,0x80(%rsp)
0x00007ffff7e2ccd2 <+66>: movaps %xmm4,0x90(%rsp)
--Type <RET> for more, q to quit, c to continue without paging--
0x00007ffff7e2ccda <+74>: movaps %xmm5,0xa0(%rsp)
0x00007ffff7e2cce2 <+82>: movaps %xmm6,0xb0(%rsp)
0x00007ffff7e2ccea <+90>: movaps %xmm7,0xc0(%rsp)
0x00007ffff7e2ccf2 <+98>: mov %fs:0x28,%rax
0x00007ffff7e2ccfb <+107>: mov %rax,0x18(%rsp)
0x00007ffff7e2cd00 <+112>: xor %eax,%eax
0x00007ffff7e2cd02 <+114>: lea 0xe0(%rsp),%rax
0x00007ffff7e2cd0a <+122>: xor %ecx,%ecx
0x00007ffff7e2cd0c <+124>: mov %rsp,%rdx
0x00007ffff7e2cd0f <+127>: mov %rax,0x8(%rsp)
0x00007ffff7e2cd14 <+132>: lea 0x20(%rsp),%rax
0x00007ffff7e2cd19 <+137>: mov %r10,%rsi
0x00007ffff7e2cd1c <+140>: mov %rax,0x10(%rsp)
0x00007ffff7e2cd21 <+145>: mov 0x18a220(%rip),%rax # 0x7ffff7fb6f48
0x00007ffff7e2cd28 <+152>: movl $0x8,(%rsp)
0x00007ffff7e2cd2f <+159>: mov (%rax),%rdi
--Type <RET> for more, q to quit, c to continue without paging--
0x00007ffff7e2cd32 <+162>: movl $0x30,0x4(%rsp)
0x00007ffff7e2cd3a <+170>: callq 0x7ffff7e41860 <__vfprintf_internal>
0x00007ffff7e2cd3f <+175>: mov 0x18(%rsp),%rcx
0x00007ffff7e2cd44 <+180>: xor %fs:0x28,%rcx
0x00007ffff7e2cd4d <+189>: jne 0x7ffff7e2cd57 <__printf+199>
0x00007ffff7e2cd4f <+191>: add $0xd8,%rsp
0x00007ffff7e2cd56 <+198>: retq
0x00007ffff7e2cd57 <+199>: callq 0x7ffff7efac90 <__stack_chk_fail>
End of assembler dump.
(gdb) info regs
Undefined info command: "regs". Try "help info".
(gdb) disass $pc,+64
Dump of assembler code from 0x7ffff7e2ccbb to 0x7ffff7e2ccfb:
=> 0x00007ffff7e2ccbb <__printf+43>: movaps %xmm0,0x50(%rsp)
0x00007ffff7e2ccc0 <__printf+48>: movaps %xmm1,0x60(%rsp)
0x00007ffff7e2ccc5 <__printf+53>: movaps %xmm2,0x70(%rsp)
0x00007ffff7e2ccca <__printf+58>: movaps %xmm3,0x80(%rsp)
0x00007ffff7e2ccd2 <__printf+66>: movaps %xmm4,0x90(%rsp)
0x00007ffff7e2ccda <__printf+74>: movaps %xmm5,0xa0(%rsp)
0x00007ffff7e2cce2 <__printf+82>: movaps %xmm6,0xb0(%rsp)
0x00007ffff7e2ccea <__printf+90>: movaps %xmm7,0xc0(%rsp)
0x00007ffff7e2ccf2 <__printf+98>: mov %fs:0x28,%rax
End of assembler dump.
(gdb)
It seems to be a part of printf.
Segfault on a store to %rsp is surprising. That's an aligned move (movaps). What is rsp at that point (p/x %rsp)? Is it 16-aligned or only 8-aligned? Or maybe is the stack overflowed (can you examine (x) memory at the address)?
(gdb) p/x %rsp
A syntax error in expression, near `%rsp'.
(gdb)
I don't know how to use gdb well :p
If I use a $ instead of a % it gives me this output:
(gdb) p/x $rsp
$3 = 0x7fffffffd0c8
(gdb)
sorry, my mistake, $rsp is right in that context. So it's unaligned; question is why, as ABI specifies it should be aligned on function entry
either an incorrect signature somewhere, or perhaps you're calling into JIT code with an unaligned (only 8-aligned) stack?
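The alignment reasoning can be checked with a little arithmetic. The sketch below plugs in the values from the gdb session above (rsp = 0x7fffffffd0c8 at the faulting movaps, a 0xd8-byte frame from the sub in printf's prologue, and a 0x50 offset for the store). The function names are made up for illustration, and it assumes the x86-64 System V rule that rsp is 16-byte aligned at the call instruction, so rsp + 8 is a multiple of 16 on function entry.

```rust
/// Stack alignment required at a `call` on x86-64 System V.
const STACK_ALIGN: u64 = 16;
/// Size of the return address that `call` pushes.
const RETURN_ADDR_SIZE: u64 = 8;

/// True if `addr` is a multiple of `align`.
pub fn is_aligned(addr: u64, align: u64) -> bool {
    addr % align == 0
}

/// Recover rsp at function entry from an rsp observed after the
/// prologue's `sub $frame_size, %rsp`.
pub fn rsp_at_entry(rsp_in_body: u64, frame_size: u64) -> u64 {
    rsp_in_body + frame_size
}

/// True if the caller kept the stack aligned: on entry the return address
/// has just been pushed, so rsp + 8 must be a multiple of 16.
pub fn entry_rsp_is_abi_conformant(rsp_entry: u64) -> bool {
    is_aligned(rsp_entry + RETURN_ADDR_SIZE, STACK_ALIGN)
}
```

With the observed values, the movaps target 0x7fffffffd0c8 + 0x50 is only 8-aligned, and the entry rsp works out to 0x7fffffffd1a0, which fails the entry check. That points at the caller having made the call with an rsp that was only 8-aligned.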
This bug only occurs in AOT compiled code, so I don't think it's JIT's fault.
As for the signature, I think it's correct. Here's the dumped IR:
function u0:0(i32, i32) -> i32 system_v {
block0(v0: i32, v1: i32):
v2 = isub v0, v1
return v2
}
function u0:0() -> i32 system_v {
gv0 = symbol colocated userextname2
gv1 = symbol colocated userextname3
gv2 = symbol colocated userextname5
sig0 = (i32, i32) -> i32 system_v
sig1 = (i64) -> i32 system_v
sig2 = (i64) -> i32 system_v
sig3 = (i64, i32, i32, i32) -> i32 system_v
sig4 = (i32) -> i32 system_v
fn0 = colocated u0:0 sig0
fn1 = u0:1 sig1
fn2 = u0:1 sig2
fn3 = u0:2 sig3
fn4 = u0:3 sig4
block0:
v0 = iconst.i32 4
v1 = iconst.i32 2
v2 = call fn0(v0, v1) ; v0 = 4, v1 = 2
v3 = symbol_value.i64 gv0
v4 = call fn1(v3)
v5 = symbol_value.i64 gv1
v6 = call fn2(v5)
v7 = symbol_value.i64 gv2
v8 = call fn3(v7, v0, v1, v2) ; v0 = 4, v1 = 2
v9 = iconst.i32 0
v10 = call fn4(v9) ; v9 = 0
return v9 ; v9 = 0
}
function u0:0() system_v {
}
set opt_level=speed
set tls_model=none
set libcall_call_conv=isa_default
set probestack_size_log2=12
set probestack_strategy=outline
set bb_padding_log2_minus_one=0
set regalloc_checker=1
set regalloc_verbose_logs=0
set enable_alias_analysis=1
set enable_verifier=1
set enable_pcc=0
set is_pic=1
set use_colocated_libcalls=0
set enable_float=1
set enable_nan_canonicalization=0
set enable_pinned_reg=0
set enable_atomics=1
set enable_safepoints=0
set enable_llvm_abi_extensions=0
set unwind_info=1
set preserve_frame_pointers=0
set machine_code_cfg_info=0
set enable_probestack=0
set probestack_func_adjusts_sp=0
set enable_jump_tables=1
set enable_heap_access_spectre_mitigation=1
set enable_table_access_spectre_mitigation=1
set enable_incremental_compilation_cache_checks=0
target x86_64 has_sse3=0 has_ssse3=0 has_sse41=0 has_sse42=0 has_avx=0 has_avx2=0 has_fma=0 has_avx512bitalg=0 has_avx512dq=0 has_avx512vl=0 has_avx512vbmi=0 has_avx512f=0 has_popcnt=0 has_bmi1=0 has_bmi2=0 has_lzcnt=0
It looks correct to me: the string is an i64, as it's a pointer, and all the numbers are i32s.
This was generated from:
fn do_math(a: i32, b: i32) -> i32 {
let v: i32 = a - b;
return v;
}
fn main() -> i32 {
let a: i32 = 4;
let b: i32 = 2;
let val: i32 = do_math(a, b);
puts("Hello, world!");
puts("Another test!");
printf("Math: %i - %i = %i\n", a, b, val);
return 0;
}
Well, then s/perhaps you're calling into JIT with misaligned/perhaps you're calling into AOT with misaligned/ :-)
The way I would bisect the failure is: determine whether rsp is misaligned on entry to printf; if so, determine whether rsp is misaligned on entry to the function that calls printf (your main presumably); if so, look at your startup code that invokes it, or if not, let's take a look at the disassembly of main
you can see "rsp in the past" by using a time-traveling debugger (this is why rr is so useful!) and reverse-continuing to a breakpoint, or reverse-finishing out of a function, or reverse-instruction-stepping
I'll try using rr and looking at rsp tomorrow. For now I gotta go to bed. Good night.
Okay, so I was going to rewrite my AST at some point anyway, and I think its lack of type inference and debuggability is contributing to the trouble I'm having tracking down the issue, so that gives me the perfect excuse! I just spent a few hours doing that, now it's just down to codegen that needs to be fixed to work with the new one.
Always good to take an excuse to pay down tech-debt!
thank you, cranelift. i love the borrow checker and how nothing can be copied or cloned.
i hate lifetimes
Yay, I've resorted to using an Arc<RwLock<...>> for my codegen state but now I need to pull data out of it and consume it, but the arc doesn't like that so yay unsafe code, and then when I use the RwLock::into_inner method it returns a result which I can unwrap, but that causes the thing to throw a PoisonError which I can't debug because whoops no backtrace because I can't convert it to a diagnostic type because then stuff goes out of scope and that's illegal, so now I'm stuck and trying literally everything. I have no words for this. I love rust but sometimes it's just painful.
Hooray! I fixed AOT compilation! Now JIT is broken though, and segfaults instantly. Does anyone know of a way to debug a JIT backend?
Set the PERF_BUILDID_DIR env var to any value and cranelift-jit will write a map with the addresses of all compiled functions to /tmp/perf-<pid>.map: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/jit/src/backend.rs#L436
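Once such a map exists, a crashing program counter can be matched to a function name. The sketch below assumes the file follows the usual Linux perf map layout, one `START SIZE symbolname` entry per line with START and SIZE in hex without a 0x prefix. The type and function names are made up for illustration, and the sample addresses in the usage note are invented.

```rust
/// One perf map entry: a code range and the symbol that owns it.
#[derive(Debug, PartialEq)]
pub struct MapEntry {
    pub start: u64,
    pub size: u64,
    pub name: String,
}

/// Parse perf map text. Lines that do not match the expected
/// `START SIZE symbolname` shape are skipped.
pub fn parse_perf_map(text: &str) -> Vec<MapEntry> {
    text.lines()
        .filter_map(|line| {
            // The symbol name is the rest of the line and may contain spaces.
            let mut parts = line.splitn(3, ' ');
            let start = u64::from_str_radix(parts.next()?, 16).ok()?;
            let size = u64::from_str_radix(parts.next()?, 16).ok()?;
            let name = parts.next()?.to_string();
            Some(MapEntry { start, size, name })
        })
        .collect()
}

/// Find the symbol whose half-open range [start, start + size) contains `pc`.
pub fn symbol_for_pc(entries: &[MapEntry], pc: u64) -> Option<&str> {
    entries
        .iter()
        .find(|e| pc >= e.start && pc - e.start < e.size)
        .map(|e| e.name.as_str())
}
```

For example, given the text "7f0000001000 40 main\n7f0000001040 20 do_math\n", a pc of 0x7f0000001044 resolves to do_math, which is the kind of lookup you would do with the $pc value gdb reports at the crash.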
Okay. That didn't help xD
I'm going to see if it's just an issue with functions except main not being registered for JIT right now.
I found the issue, and it's really dumb. It was literally that my transmute was using a &*const u8 instead of a *const u8 because HashMaps are weird. I feel so dumb.
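For anyone hitting the same thing, here is a minimal sketch of that pitfall. It is not the QuickScript code; the table layout, the BinOp alias, and the function names are assumptions for illustration. HashMap::get returns a reference to the stored value, so with *const u8 values you get a &*const u8. A reference and a function pointer are the same size, so transmuting the reference compiles, but the call then jumps into the map's storage instead of the code.

```rust
use std::collections::HashMap;
use std::mem;

/// Signature of the functions stored in the table (assumed for this sketch).
type BinOp = extern "C" fn(i32, i32) -> i32;

/// Stand-in for a compiled function.
extern "C" fn do_math(a: i32, b: i32) -> i32 {
    a - b
}

/// Build a name -> code address table, similar in spirit to what a JIT keeps.
pub fn build_table() -> HashMap<String, *const u8> {
    let mut table = HashMap::new();
    table.insert("do_math".to_string(), do_math as *const u8);
    table
}

/// Look up a function by name and call it.
pub fn call_by_name(
    table: &HashMap<String, *const u8>,
    name: &str,
    a: i32,
    b: i32,
) -> Option<i32> {
    // `get` yields Option<&*const u8>; dereference to copy out the raw
    // pointer itself before transmuting. Transmuting the `&*const u8`
    // would reinterpret the address of the map slot as code.
    let code_ptr: *const u8 = *table.get(name)?;
    // SAFETY: every pointer in this table was created from a function
    // with exactly the `BinOp` signature.
    let f: BinOp = unsafe { mem::transmute::<*const u8, BinOp>(code_ptr) };
    Some(f(a, b))
}
```

Spelling out both type parameters on transmute, as above, turns the &*const u8 mistake into a compile error instead of a segfault.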
Well, it's done! Thank you so much for your help @Chris Fallin and @bjorn3! It works now!
Jacob Sapoznikow has marked this topic as resolved.
Last updated: Nov 22 2024 at 16:03 UTC