Hello,
I am working on a garbage-collected language like Java, and I am running into some issues when trying to run my GC on generated code. For the most part, the generated stack maps worked really well until recently. For some reason, one of the function calls in my language doesn't seem to have a stack map, and I am not sure why.
I am trying to figure out if my issue is just user error on my part.
The way that function calls work in my language is that they are fetched based on a few values that tell the runtime where to find the required function. If the function hasn't been seen before, then it gets JIT-ed where we store the IP locations as keys to lists of SP offsets. This process allows me to run a check for GC before calling functions.
Here is the code in my language that I am trying to run that is giving me trouble:
class Main {
    fn main(args: [String]) {
        let printer: Printer = new Printer();
        let n: u64 = 5
        let result: u64 = Main::fib(n); // stack spills correctly here
        printer.println-int(result);
        Main::thrash() // Create a bunch of garbage to collect
        let arr: [u64] = [1, 2, 3, 4];
        while True {
            Main::from-array(arr); // For some reason we have stack spilling around this call but not at this call
        }
    }
    fn fib(n: u64) -> u64 {
        if n < 2 {
            return n;
        }
        return Main::fib(n - 1) + Main::fib(n - 2);
    }
    fn from-array(arr: [u64]) {
        let x: u64 = 0;
    }
    fn thrash() {
        let i: u64 = 0;
        while i < 10 {
            let x: [u64] = new [u64; 1];
            i = i + 1;
        }
    }
}
I know that some of the stack maps are working because I ran the part of my code that collects references and it found a few of them but not all of them.
Any help is appreciated. If you need any other information like a disassembly, please let me know how to do such a thing, and I will try to provide it as fast as I can.
What is the generated clif ir?
How can I dump the generated clif ir?
You can do println!("{func}"); right after you finalize the FunctionBuilder if your Function is stored in a variable called func.
what would be even better would be the complete trace logging output, so we could see the CLIF before and after safepoint spilling. that would be something like
RUST_LOG=trace cargo run ...
if you are using env_logger
Sorry for taking so long. But here is the resulting clif ir. The logs are quite long and too big for pastebin. Should I just put the logs in a separate reply or is there a particular section you want to see?
clif ir:
function u0:0(i64, i64) system_v {
ss0 = explicit_slot 8, align = 8
ss1 = explicit_slot 8, align = 8
sig0 = (i64) -> i64 system_v
sig1 = (i64, i64, i64) -> i64 system_v
sig2 = (i64) -> i8 system_v
sig3 = (i64, i64) -> i64 system_v
sig4 = (i64) -> i8 system_v
sig5 = (i64, i64, i64, i64, i64) -> i64 system_v
sig6 = (i64) -> i8 system_v
sig7 = (i64, i64, i64) system_v
sig8 = (i64) -> i8 system_v
sig9 = (i64, i64, i64) -> i64 system_v
sig10 = (i64) -> i8 system_v
sig11 = (i64) system_v
sig12 = (i64) -> i8 system_v
sig13 = (i64) -> i64 system_v
sig14 = (i64, i64, i64) system_v
sig15 = (i64, i64, i64, i64) system_v
sig16 = (i64) -> i8 system_v
sig17 = (i64, i64, i64, i64) system_v
sig18 = (i64) -> i8 system_v
sig19 = (i64, i64, i64, i64) system_v
sig20 = (i64) -> i8 system_v
sig21 = (i64, i64, i64, i64) system_v
sig22 = (i64) -> i8 system_v
sig23 = (i64, i64, i64) -> i64 system_v
sig24 = (i64) -> i8 system_v
sig25 = (i64, i64) system_v
sig26 = (i64) -> i8 system_v
sig27 = (i64, i64, i64, i64, i64) -> i64 system_v
sig28 = (i64) -> i8 system_v
sig29 = (i64, i64, i64) system_v
sig30 = (i64) -> i8 system_v
sig31 = (i64, i64, i64, i64, i64) -> i64 system_v
sig32 = (i64) -> i8 system_v
sig33 = (i64, i64, i64) system_v
sig34 = (i64) -> i8 system_v
sig35 = (i64) system_v
fn0 = u0:8 sig0
fn1 = u0:9 sig1
fn2 = u0:6 sig2
fn3 = u0:6 sig4
fn4 = u0:10 sig5
fn5 = u0:6 sig6
fn6 = u0:6 sig8
fn7 = u0:9 sig9
fn8 = u0:6 sig10
fn9 = u0:6 sig12
fn10 = u0:8 sig13
fn11 = u0:11 sig14
fn12 = u0:12 sig15
fn13 = u0:6 sig16
fn14 = u0:12 sig17
fn15 = u0:6 sig18
fn16 = u0:12 sig19
fn17 = u0:6 sig20
fn18 = u0:12 sig21
fn19 = u0:6 sig22
fn20 = u0:9 sig23
fn21 = u0:6 sig24
fn22 = u0:6 sig26
fn23 = u0:10 sig27
fn24 = u0:6 sig28
fn25 = u0:6 sig30
fn26 = u0:10 sig31
fn27 = u0:6 sig32
fn28 = u0:6 sig34
fn29 = u0:7 sig35
block0(v0: i64, v1: i64):
jump block1
block1:
v2 = iconst.i64 35
v3 = call fn0(v2) ; v2 = 35
v99 = stack_addr.i64 ss1
store notrap v3, v99
v4 = iconst.i64 5
v5 = iconst.i64 66
v6 = iconst.i64 68
v8 = call fn1(v0, v5, v6), stack_map=[i64 @ ss1+0] ; v5 = 66, v6 = 68
v9 = call fn2(v0), stack_map=[i64 @ ss1+0]
brif v9, block2, block3
block2:
return
block3:
v11 = call_indirect.i64 sig3, v8(v0, v4), stack_map=[i64 @ ss1+0] ; v4 = 5
v13 = call fn3(v0), stack_map=[i64 @ ss1+0]
brif v13, block4, block5
block4:
return
block5:
v16 = iconst.i64 35
v17 = iconst.i64 -1
v18 = iconst.i64 36
v20 = call fn4(v0, v3, v16, v17, v18), stack_map=[i64 @ ss1+0] ; v16 = 35, v17 = -1, v18 = 36
v21 = call fn5(v0), stack_map=[i64 @ ss1+0]
brif v21, block6, block7
block6:
return
block7:
call_indirect.i64 sig7, v20(v0, v3, v11), stack_map=[i64 @ ss1+0]
v24 = call fn6(v0), stack_map=[i64 @ ss1+0]
brif v24, block8, block9
block8:
return
block9:
v25 = iconst.i64 66
v26 = iconst.i64 70
v28 = call fn7(v0, v25, v26), stack_map=[i64 @ ss1+0] ; v25 = 66, v26 = 70
v29 = call fn8(v0), stack_map=[i64 @ ss1+0]
brif v29, block10, block11
block10:
return
block11:
call_indirect.i64 sig11, v28(v0), stack_map=[i64 @ ss1+0]
v32 = call fn9(v0), stack_map=[i64 @ ss1+0]
brif v32, block12, block13
block12:
return
block13:
v33 = iconst.i64 4
v34 = iconst.i64 19
v35 = call fn10(v34), stack_map=[i64 @ ss1+0] ; v34 = 19
v100 = stack_addr.i64 ss0
store notrap v35, v100
v101 = stack_addr.i64 ss0
v98 = load.i64 notrap v101
call fn11(v0, v98, v33), stack_map=[i64 @ ss1+0, i64 @ ss0+0] ; v33 = 4
v37 = iconst.i64 0
v38 = iconst.i64 1
v102 = stack_addr.i64 ss0
v97 = load.i64 notrap v102
call fn12(v0, v97, v37, v38), stack_map=[i64 @ ss1+0, i64 @ ss0+0] ; v37 = 0, v38 = 1
v39 = call fn13(v0), stack_map=[i64 @ ss1+0, i64 @ ss0+0]
brif v39, block14, block15
block14:
return
block15:
v40 = iconst.i64 1
v41 = iconst.i64 2
v103 = stack_addr.i64 ss0
v96 = load.i64 notrap v103
call fn14(v0, v96, v40, v41), stack_map=[i64 @ ss1+0, i64 @ ss0+0] ; v40 = 1, v41 = 2
v43 = call fn15(v0), stack_map=[i64 @ ss1+0, i64 @ ss0+0]
brif v43, block16, block17
block16:
return
block17:
v44 = iconst.i64 2
v45 = iconst.i64 3
v104 = stack_addr.i64 ss0
v95 = load.i64 notrap v104
call fn16(v0, v95, v44, v45), stack_map=[i64 @ ss1+0, i64 @ ss0+0] ; v44 = 2, v45 = 3
v47 = call fn17(v0), stack_map=[i64 @ ss1+0, i64 @ ss0+0]
brif v47, block18, block19
block18:
return
block19:
v48 = iconst.i64 3
v49 = iconst.i64 4
v105 = stack_addr.i64 ss0
v94 = load.i64 notrap v105
call fn18(v0, v94, v48, v49), stack_map=[i64 @ ss1+0, i64 @ ss0+0] ; v48 = 3, v49 = 4
v51 = call fn19(v0), stack_map=[i64 @ ss1+0, i64 @ ss0+0]
brif v51, block20, block21
block20:
return
block21:
v106 = stack_addr.i64 ss0
v93 = load.i64 notrap v106
jump block22
block22:
v52 = iconst.i8 1
brif v52, block23, block24 ; v52 = 1
block23:
v54 = iconst.i64 66
v55 = iconst.i64 69
v57 = call fn20(v0, v54, v55) ; v54 = 66, v55 = 69
v58 = call fn21(v0)
brif v58, block26, block27
block26:
return
block27:
call_indirect.i64 sig25, v57(v0, v93)
v61 = call fn22(v0)
brif v61, block28, block29
block28:
return
block29:
v64 = iconst.i64 35
v65 = iconst.i64 -1
v66 = iconst.i64 36
v68 = call fn23(v0, v3, v64, v65, v66) ; v64 = 35, v65 = -1, v66 = 36
v69 = call fn24(v0)
brif v69, block30, block31
block30:
return
block31:
call_indirect.i64 sig29, v68(v0, v3, v11)
v72 = call fn25(v0)
brif v72, block32, block33
block32:
return
block33:
jump block22
block24:
v75 = iconst.i64 35
v76 = iconst.i64 -1
v77 = iconst.i64 36
v79 = call fn26(v0, v3, v75, v76, v77) ; v75 = 35, v76 = -1, v77 = 36
v80 = call fn27(v0)
brif v80, block34, block35
block34:
return
block35:
call_indirect.i64 sig33, v79(v0, v3, v11)
v83 = call fn28(v0)
brif v83, block36, block37
block36:
return
block37:
call fn29(v0)
return
}
It seems like the clif ir you generate contains a couple more calls than the source you are compiling. Can you get some debug information about which fn* corresponds to a call to which function? You could do a println!() at the point where you are emitting the call, I would guess.
Several of the calls don't have stack maps at least.
Also is the source available somewhere?
Yeah, I should specify a few details. I'll put the log in a separate reply due to its length.
Basically, to call a function, you first fetch it and then call it. After calling a function, we test whether the context parameter holds an exception (that check is another function call). There are also a few operations that are done through function calls, like creating an object or creating an array (one for each size + object + floats).
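To make the extra calls in the clif ir easier to match up with the source, here is a rough Rust sketch of that calling protocol. Every name in it (Context, fetch_method, has_exception, call_method) is made up for illustration and is not taken from the actual runtime; it only mirrors the fetch, check, indirect call, check pattern described above.

```rust
/// Hypothetical stand-in for the runtime context passed as the first argument.
struct Context {
    exception: Option<String>,
    /// Log of runtime entry points that were hit, for illustration only.
    calls: Vec<&'static str>,
}

/// Hypothetical signature of a fetched method.
type Method = fn(&mut Context, u64);

/// Hypothetical fetch step: resolves a method from a few lookup keys.
fn fetch_method(ctx: &mut Context, class: u64, name: u64) -> Method {
    ctx.calls.push("fetch");
    let _ = (class, name); // a real runtime would use these to find (or JIT) the method
    println_int
}

/// Hypothetical method body.
fn println_int(ctx: &mut Context, value: u64) {
    ctx.calls.push("call");
    println!("{value}");
}

/// Hypothetical exception check, which is itself a call into the runtime.
fn has_exception(ctx: &mut Context) -> bool {
    ctx.calls.push("check");
    ctx.exception.is_some()
}

/// One source-level call expands to four runtime-visible steps.
/// Returns false if the caller should bail out because of an exception.
fn call_method(ctx: &mut Context, class: u64, name: u64, arg: u64) -> bool {
    let method = fetch_method(ctx, class, name);
    if has_exception(ctx) {
        return false;
    }
    method(ctx, arg);
    !has_exception(ctx)
}
```

Under this sketch, each source-level call shows up as two direct calls, one indirect call, and two branches, which is roughly the shape of each call/brif/call_indirect/call/brif group in the dump.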
You can find the source code here, it is not the greatest thing in the world but it could help you. Right now many design ideas have changed and I haven't documented them yet.
To compile the test file, run cargo run --bin rowanc -- -s rowan-test-files rowan-test-files. Then, to run the runtime, you will need a machine that supports libunwind (this will likely change); run cargo run --bin rowan -- output/Main.class
And here is the log output from before and after stack spilling.
https://pastebin.com/V01TyyDf
@Alec Davis I feel like I'm not understanding exactly what your issue is, and what you expect to be happening but is not happening or whatever, so I think what would be really helpful would be:
[...] cranelift_frontend::FunctionBuilder APIs rather than your whole compiler, but it is understandable if that is too difficult
[...] cargo test? cargo run? something else?
[...] "v12, since it is marked as needing inclusion in stack maps, is spilled to the stack before the call instruction in block4 and then reloaded from the stack after the call."
[...] "v12 is not spilled to the stack and reloaded after the call and is instead reused across the call, which could lead to stale GC values."
[...] cranelift-frontend crate
[...] env_logger that would be RUST_LOG=cranelift_frontend=trace ...
I think providing this level of detail will help clarify things and help us help you -- thanks!
I'll be sure to do that, thanks for the feedback. I am busy at work for most of the day, so it is hard to respond quickly while people are awake/available. I'll be sure to put the logs in a gist.
I believe that this is the smallest example in my language (I compile to my own class file definition so it is a little hard to share). I'll annotate what I think is supposed to happen.
class Main {
    static thing: u64 = 66666666666;
    fn main(args: [String]) {
        let printer: Printer = new Printer();
        let n: u64 = 40;
        Main::thrash() // We create some garbage to collect
        while True {
            // Here, to call println-int, we first fetch the method, but before trying that we first
            // try to acquire a lock. If we are able to acquire it, then that means the runtime isn't
            // collecting garbage. If we can't, then we run the routine to collect all live rooted
            // memory on the stack, which should only include the printer object.
            // However, for some reason, the call to fetch this particular method doesn't get a
            // stack spill but the calls around it do. If we did a print before this loop, it would also find it there.
            printer.println-int(n);
        }
    }
    fn thrash() {
        let i: u64 = 0;
        while i < 10 {
            let x: [u64] = new [u64; 1];
            i = i + 1;
        }
    }
}
I haven't set up a decent test system yet so to test this example you can overwrite rowan-test-files/main.rowan with the contents above.
Then to run the compiler to generate the associated class file you provide 2 arguments, the path to the stdlib directory and a path to the directory with your project. In this case, we want them to be the same because we don't need the standard library.
So the command to be run will be this:
cargo run --bin rowanc -- -s rowan-test-files rowan-test-files
This will put the newly generated class file in a directory called output.
To then run the runtime with the class file (It should be noted that the runtime only supports x86_64 SystemV systems. I am working on changing this) you run this command:
cargo run --bin rowan -- output/Main.class
What will happen is that the runtime will link the class with itself and call the main method in the Main class.
In the main method above, what should happen is that:
[...] thrash method,
Garbage collection should run every 5 seconds. This is to test that it actually works.
When we call the print method, 4 things happen in this order.
However, what ends up happening is that, in the loop, the call to fetch the print method doesn't get a stack spill, and no stack map is generated for it. This causes the printer to get collected while it should still be in scope, which in turn causes a Rust panic because the object's pointer is null.
Then, for completeness, here is the clif ir, annotated where I think a stack map should exist but doesn't:
function u0:0(i64, i64) system_v {
sig0 = (i64) -> i64 system_v
sig1 = (i64, i64, i64) -> i64 system_v
sig2 = (i64) -> i8 system_v
sig3 = (i64) system_v
sig4 = (i64) -> i8 system_v
sig5 = (i64, i64, i64, i64, i64) -> i64 system_v
sig6 = (i64) -> i8 system_v
sig7 = (i64, i64, i64) system_v
sig8 = (i64) -> i8 system_v
sig9 = (i64) system_v
fn0 = u0:7 sig0
fn1 = u0:8 sig1
fn2 = u0:5 sig2
fn3 = u0:5 sig4
fn4 = u0:9 sig5
fn5 = u0:5 sig6
fn6 = u0:5 sig8
fn7 = u0:6 sig9
block0(v0: i64, v1: i64):
v7 -> v0
v11 -> v0
jump block1
block1:
v2 = iconst.i64 35
v3 = call fn0(v2) ; v2 = 35
v4 = iconst.i64 40
v5 = iconst.i64 66
v6 = iconst.i64 69
v8 = call fn1(v7, v5, v6) ; v5 = 66, v6 = 69
v9 = call fn2(v7)
brif v9, block2, block3(v8)
block2:
return
block3(v10: i64):
call_indirect.i64 sig3, v8(v7)
v12 = call fn3(v11)
brif v12, block4, block5
block4:
return
block5:
jump block6(v4, v3, v11) ; v4 = 40
block6(v26: i64, v28: i64, v30: i64):
v14 -> v26
v27 -> v26
v15 -> v28
v29 -> v28
v19 -> v30
v23 -> v30
v25 -> v30
v31 -> v30
v13 = iconst.i8 1
brif v13, block7, block8 ; v13 = 1
block7:
v16 = iconst.i64 35
v17 = iconst.i64 -1
v18 = iconst.i64 36
v20 = call fn4(v19, v15, v16, v17, v18) ; v16 = 35, v17 = -1, v18 = 36 // Right here we should have a stack map since we are fetching a method.
v21 = call fn5(v19)
brif v21, block10, block11(v20)
block10:
return
block11(v22: i64):
call_indirect.i64 sig7, v20(v19, v15, v14)
v24 = call fn6(v23)
brif v24, block12, block13
block12:
return
block13:
jump block6(v27, v29, v31)
block8:
call fn7(v25)
return
}
However, when looking at block7 in the clif ir, we don't have a stack map. In fact, the last stack map was in the block where we call thrash.
block7:
v16 = iconst.i64 35
v17 = iconst.i64 -1
v18 = iconst.i64 36
v20 = call fn4(v19, v15, v16, v17, v18) ; v16 = 35, v17 = -1, v18 = 36 // There should be a stack map here but there isn't one.
v21 = call fn5(v19)
brif v21, block10, block11(v20)
You can see the more detailed logs in this gist
Hopefully, this was both enough and not too much information. I look forward to hearing your response.
I wonder if, when using vars, I need to mark them again as requiring a stack map. I tried marking the vars at one point as needing a stack map and that didn't change anything.
Alec Davis said:
I wonder if, when using vars, I need to mark them again as requiring a stack map. I tried marking the vars at one point as needing a stack map and that didn't change anything.
if you expect the values that are associated with those variables to be in stack maps, then yes, you need to mark the variables as needing stack maps
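For anyone reading along, here is a toy Rust model of the rule being discussed. It is not Cranelift's implementation, just a sketch under simplifying assumptions (one straight-line block, values identified by plain integers, and a hypothetical Inst type): a value ends up in a call's stack map only if it was declared as needing one and it is still used after that call. In cranelift-frontend the declarations are made with FunctionBuilder::declare_var_needs_stack_map and FunctionBuilder::declare_value_needs_stack_map; the needs_stack_map set below stands in for those.

```rust
use std::collections::HashSet;

/// Hypothetical toy instruction: defines at most one value and uses some values.
struct Inst {
    is_call: bool,
    result: Option<u32>,
    args: Vec<u32>,
}

/// For each call in a straight-line block, list the values that belong in its
/// stack map: values declared as needing a stack map that are live across the call.
fn stack_maps(insts: &[Inst], needs_stack_map: &HashSet<u32>) -> Vec<Vec<u32>> {
    // Values used by some instruction after the current point.
    let mut live: HashSet<u32> = HashSet::new();
    let mut maps = Vec::new();
    // Walk backwards so `live` always describes what is needed later.
    for inst in insts.iter().rev() {
        // A value is not live before the instruction that defines it.
        if let Some(result) = inst.result {
            live.remove(&result);
        }
        if inst.is_call {
            let mut map: Vec<u32> = live
                .iter()
                .copied()
                .filter(|v| needs_stack_map.contains(v))
                .collect();
            map.sort_unstable();
            maps.push(map);
        }
        live.extend(inst.args.iter().copied());
    }
    maps.reverse();
    maps
}
```

Modelling "v3 = call; v8 = call(v0); call v8(v0, v3)" with only v3 declared gives an empty map for the first call (v3 does not exist yet), [3] for the second, and an empty map for the last (nothing is used afterwards). With nothing declared, every map is empty, so a missing declaration on any path that carries the reference shows up as a missing entry.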
The thing is, I tried that, and it didn't change anything, the same problem occurred. Would I also have to mark the value I get from using the variable as needing a stack map? That is something I haven't tried yet.
So far, I have given every function call that should get a stack map a stack map but I am still missing some live memory.
I have set variables, and values as needing stack maps and I am still having issues with constructing a stack map. I wonder if when I convert the stack map into a data structure that my runtime can use I am doing something wrong.
I construct my own map that I can look up during the gc phase.
This is how I construct a hashmap where the key is the instruction pointer and the value is a list of offsets from the stack pointer.
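// Needs `use std::collections::HashMap;` at the top of the file.
let compiled_code = self.context.compiled_code().unwrap();
// Each entry pairs a code offset within this function with the stack map for that safepoint.
let stack_maps = compiled_code.buffer.user_stack_maps();
let mut object_locations = Vec::new();
for (location, _, map) in stack_maps {
    // Keep only the stack offset of each live GC reference; the value's type is dropped.
    let objects = map.entries()
        .map(|(_, offset)| offset)
        .collect::<Vec<_>>();
    object_locations.push((*location, objects));
}
let locations = object_locations;
// Base address of the finalized machine code for this function.
let code = module.get_finalized_function(*id) as *const ();
// Key the final map by absolute address: function base address + code offset.
let mut object_locations = HashMap::new();
locations.into_iter()
    .for_each(|(offset, objects)| {
        object_locations.insert(offset as usize + code as usize, objects);
    });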
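For completeness, here is a minimal sketch of how a map like this might be consumed during the GC phase. The function name root_slots and its parameters are hypothetical, and it assumes that the key is the address the frame will return to and that the offsets are relative to the stack pointer at that call; both assumptions are worth checking against the Cranelift version in use.

```rust
use std::collections::HashMap;

/// Given a frame's return address and its stack pointer at that safepoint,
/// compute the addresses of the stack slots that hold live GC references.
/// A return address with no entry yields no roots at all.
fn root_slots(
    object_locations: &HashMap<usize, Vec<u32>>,
    return_address: usize,
    stack_pointer: usize,
) -> Vec<usize> {
    match object_locations.get(&return_address) {
        Some(offsets) => offsets
            .iter()
            .map(|&offset| stack_pointer + offset as usize)
            .collect(),
        None => Vec::new(),
    }
}
```

Note that a lookup miss is silent in this sketch: if the address used as the key during the stack walk differs from the one stored at JIT time, the frame simply contributes no roots, which looks the same as a missing stack map.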
I think it was just user error on my part. Thank you, everyone for help.
Last updated: Dec 06 2025 at 07:03 UTC