Stream: cranelift

Topic: Find GC Roots in non-stack locations


view this post on Zulip Zeke Medley (Jun 13 2021 at 19:14):

Hello :) apologies for all the questions this weekend. I'm currently working on locating live references in the call stack for garbage collection in my Cranelift-compiled Lisp. I'm successfully locating stack variables, but I'm realizing this is not sufficient because Cranelift will often keep values in registers, making them invisible when generating stack maps. Is there a known way people have worked around this in the past for cranelift-jit?

A lisp compiler and interpreter I've been working on - ZekeMedley/lust

view this post on Zulip bjorn3 (Jun 13 2021 at 19:54):

Shouldn't these values be spilled to the stack on calls? Or does it keep r32/r64 values in callee saved registers?

view this post on Zulip bjorn3 (Jun 13 2021 at 19:55):

Wasmtime only looks at the stack when performing gc of wasm externrefs: https://github.com/bytecodealliance/wasmtime/blob/b506bc30b1c8692995e3297fcf1a5d498d4d36d6/crates/runtime/src/externref.rs#L869-L914

Standalone JIT-style runtime for WebAssembly, using Cranelift - bytecodealliance/wasmtime

view this post on Zulip bjorn3 (Jun 13 2021 at 19:56):

@Zeke Medley

view this post on Zulip Zeke Medley (Jun 13 2021 at 20:15):

Hmm that's interesting. Let me look into that. Part of what suggests to me that this is the issue is that when generating a stackmap from a list of values only values that have locations on the stack are added.

view this post on Zulip Zeke Medley (Jun 13 2021 at 20:19):

The way that I generate my stackmaps is that I collect all of the values that are used in local variables and then generate a stackmap from that list. Some debugging shows that the values that get added here aren't deterministic, which suggests to me that sometimes Cranelift is choosing to keep a value in a register instead of a stack location. Is it possible that there is a different point in the compilation process where I should be doing this generation so that the "spill" information is available?


view this post on Zulip bjorn3 (Jun 13 2021 at 20:52):

Oh, you are making your own stackmaps? Using Cranelift generated ones should work better. All live values with a reference type (r32/r64) are included in those stackmaps. I don't think the cranelift-module interface allows you to pass in a StackMapSink though, so you may need to manually call the compile_and_emit function on Context and then use define_function_bytes instead of define_function.

view this post on Zulip Zeke Medley (Jun 13 2021 at 21:02):

I have been - but happy to change it up :) I'll likely need to make some changes to how I compile things for that, because currently I'm just representing everything as a tagged pointer, so everything has type i64. I'm unfamiliar with how sinks work. If I make a StackMapSink, where will the resulting maps get emitted?

view this post on Zulip bjorn3 (Jun 13 2021 at 21:13):

StackMapSink is a trait with a method that gets called for every stack map. You could collect the stack maps directly into a vec (or map) you store inside the type implementing StackMapSink, you could store a reference to a global vec/map of stack maps or you could do something completely different.
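
As a rough illustration of the first option (collecting the maps into a container owned by the sink), here is a minimal sketch. Note that `StackMap`, `StackMapSink`, `CollectingSink`, and `emit_example_maps` below are hypothetical stand-ins defined locally so the example is self-contained; they are not the actual Cranelift types, and the real trait's method signature may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a compiler-produced stack map: the byte offsets
/// (from the stack pointer) of the slots holding live references at one safepoint.
#[derive(Debug, Clone, PartialEq)]
pub struct StackMap {
    pub slot_offsets: Vec<u32>,
}

/// Hypothetical stand-in for the sink trait: the compiler calls the method
/// once for every stack map it produces.
pub trait StackMapSink {
    fn add_stack_map(&mut self, code_offset: u32, map: StackMap);
}

/// A sink that stores every stack map it receives, keyed by the code offset
/// of the safepoint the map belongs to.
#[derive(Default)]
pub struct CollectingSink {
    pub maps: BTreeMap<u32, StackMap>,
}

impl StackMapSink for CollectingSink {
    fn add_stack_map(&mut self, code_offset: u32, map: StackMap) {
        self.maps.insert(code_offset, map);
    }
}

/// Simulates a compiler reporting two safepoints to whatever sink it is given.
/// The offsets are made up for illustration.
pub fn emit_example_maps(sink: &mut dyn StackMapSink) {
    sink.add_stack_map(0x10, StackMap { slot_offsets: vec![8, 24] });
    sink.add_stack_map(0x2c, StackMap { slot_offsets: vec![8] });
}
```

After compilation, the runtime could look up the map for a given return address in `maps` when the GC walks the stack. Storing a reference to a shared global table instead would only change where `add_stack_map` writes.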

view this post on Zulip Zeke Medley (Jun 13 2021 at 21:18):

Ah got it. Thanks so much again for the help.

view this post on Zulip bjorn3 (Jun 13 2021 at 21:32):

no problem

view this post on Zulip Chris Fallin (Jun 14 2021 at 06:20):

@Zeke Medley -- to add one additional detail (thanks @bjorn3 for covering answers above!), re: "does Cranelift only generate GC root info for values in stack locations": yes, this is an invariant that comes from our definition of "stackmaps" (which ultimately derives from the history of Cranelift as an intended JIT backend for Firefox and so SpiderMonkey's GC design, but that's tangential): we assume that the GC can only trace values on the stack, and not values that happened to be in registers at the time the GC was invoked. For a synchronous invocation that'd mean a special prologue in the GC entry-point that saves all registers, at least, and for an async GC (stop a thread and examine its state) it would be even more complex. So we ensure that we spill all reference-type (pointer) values to the stack at every safepoint, and reload from the same slot before using again (i.e. the old value if still in a register is considered stale).

To use this functionality it should be enough to (i) use the reference types (r32 or r64 -- the type has to match the target machine's pointer width) for your GC pointers, (ii) make sure the places where GC could be invoked are safepoints -- this is currently hardcoded to all calls, but could be changed if needed; and (iii) consume the stack-maps, which are lists of offsets from the stack pointer at the safepoint.
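
Step (iii) can be sketched roughly as follows. This is a simulation under stated assumptions, not Cranelift's API: the frame is modeled as a slice of 8-byte words starting at the stack pointer (so a 64-bit target with r64 references is assumed), the offsets are assumed to be word-aligned byte offsets, and `collect_roots` and `relocate_roots` are hypothetical helper names.

```rust
/// Reads the live reference slots out of a simulated stack frame.
///
/// `frame` models memory starting at the stack pointer at the safepoint,
/// one 8-byte word per element. `offsets` are byte offsets from the stack
/// pointer, as listed in the stack map for that safepoint.
pub fn collect_roots(frame: &[u64], offsets: &[u32]) -> Vec<u64> {
    offsets
        .iter()
        .map(|&off| {
            assert!(off % 8 == 0, "slot offsets are assumed to be word-aligned");
            frame[(off / 8) as usize]
        })
        .collect()
}

/// Rewrites each live reference slot in place, as a moving collector might.
/// Because the compiled code reloads references from their slots after the
/// safepoint, updating the slot is enough for the new value to be seen.
pub fn relocate_roots(frame: &mut [u64], offsets: &[u32], relocate: impl Fn(u64) -> u64) {
    for &off in offsets {
        assert!(off % 8 == 0, "slot offsets are assumed to be word-aligned");
        let idx = (off / 8) as usize;
        frame[idx] = relocate(frame[idx]);
    }
}
```

For example, with a frame of `[0xdead, 0x1000, 0xbeef, 0x2000]` and offsets `[8, 24]`, only the second and fourth words are treated as roots; the other words are never touched.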

view this post on Zulip Chris Fallin (Jun 14 2021 at 06:21):

Anyway, happy to answer more! And if we need to make this a little more flexible to fit your use-case, we can definitely work something out as well. Right now reference-types in Wasm are the main forcing function for this support but it would be nice to prove it to be more general!

view this post on Zulip Zeke Medley (Jun 14 2021 at 17:33):

Hi @Chris Fallin :) thanks for the additional info here. From some harebrained initial thoughts it seems like this ought to be manageable from my end. I think there will be some tricks in integrating reference types with my current tagged values / pointers but I suspect this will be doable. This is currently just a side project so it might not be until this weekend when I can think more about it but I'll follow up with more questions as they come along!

view this post on Zulip Chris Fallin (Jun 14 2021 at 17:56):

Great! One specific note re: tagging that might be helpful: Cranelift doesn't actually care what the bits are in an r32/r64; it never actually tries to dereference a pointer or anything like that. So if e.g. your tagging is "biased" toward non-pointers (pointers have some bits set that need to be masked out first), that's fine; CL will just give you a list of locations of live refs, nothing more.
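
To make the tagging point concrete, here is a small sketch of how a collector might interpret the slot values it is handed. The tagging scheme is purely an assumption for illustration (low three bits are the tag, `0b001` marks a heap pointer, anything else is an immediate); it is not taken from lust or Cranelift, and `as_heap_address` and `heap_roots` are hypothetical names.

```rust
/// Assumed tagging scheme: the low three bits of every value are a tag.
const TAG_MASK: u64 = 0b111;
/// Assumed tag for heap pointers; any other tag is treated as an immediate.
const POINTER_TAG: u64 = 0b001;

/// Returns the untagged heap address if `value` carries the pointer tag,
/// or `None` if it is an immediate such as a small integer.
pub fn as_heap_address(value: u64) -> Option<u64> {
    if (value & TAG_MASK) == POINTER_TAG {
        Some(value & !TAG_MASK)
    } else {
        None
    }
}

/// Filters the raw values read from live reference slots down to the heap
/// addresses the collector actually needs to trace.
pub fn heap_roots(slots: &[u64]) -> Vec<u64> {
    slots.iter().filter_map(|&v| as_heap_address(v)).collect()
}
```

The compiler only reports where the live reference-typed values sit; deciding which of those values are really pointers, and masking out the tag bits, stays entirely on the runtime's side.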

view this post on Zulip bjorn3 (Jun 14 2021 at 21:50):

That has the potential problem that the ref value isn't live between the bitcast to an int and the use of the pointer. If a gc happens in between (due to say optimizations), that can cause problems with unrooted objects potentially being gced.

view this post on Zulip Chris Fallin (Jun 14 2021 at 22:20):

Right, one needs to be careful not to have a safepoint when there is a value live outside of a root that has to be accessed. This is usually covered by the fact that safepoints (in ordinary VM-with-GC designs) are calls or inline expansions of primitives; in the former case one only casts ref to int to pass an arg (and after the arg is passed, the callee is responsible for it), and in the latter case, one knows what one is doing.


Last updated: Oct 23 2024 at 20:03 UTC