Stream: cranelift

Topic: Implementing dynamically-typed language frontend


juh9870 (Feb 22 2024 at 13:07):

Hello, I am trying to port the MiniScript scripting language to Rust and use Cranelift to run it. The issue is that MiniScript is dynamically typed and functions are first-class citizens, so I was curious: are there any articles or examples of implementing such things? I'm mostly interested in functions-as-values and garbage collection, as well as supporting multiple data types (lists, dictionaries, strings). All the Cranelift examples I could find are overly simplified languages that only support numeric values.

fitzgen (he/him) (Feb 22 2024 at 18:38):

hi @juh9870,

Cranelift only supports scalars and SIMD; it is up to the frontend to lower structs and things into those operations (e.g. pass a struct via pointer and access fields by reading from offsets)
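
As a rough illustration of the "pointer plus offsets" lowering described above, here is a minimal sketch in plain Rust. It does not use the Cranelift API: a byte slice stands in for memory, and `load_i64`/`store_i64` stand in for the loads and stores with an immediate offset that a frontend would emit. The `Point` layout and all names here are hypothetical.

```rust
/// Hypothetical layout for a source-level struct `Point { x: i64, y: i64 }`.
/// The backend never sees the struct, only a base address and these offsets.
const POINT_X_OFFSET: usize = 0;
const POINT_Y_OFFSET: usize = 8;
const POINT_SIZE: usize = 16;

/// Stand-in for a 64-bit load at `base + offset` (little-endian).
fn load_i64(memory: &[u8], base: usize, offset: usize) -> i64 {
    let start = base + offset;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&memory[start..start + 8]);
    i64::from_le_bytes(bytes)
}

/// Stand-in for a 64-bit store at `base + offset` (little-endian).
fn store_i64(memory: &mut [u8], base: usize, offset: usize, value: i64) {
    let start = base + offset;
    memory[start..start + 8].copy_from_slice(&value.to_le_bytes());
}

/// What a frontend might emit for `p.x + p.y`, given only a pointer to `p`.
fn point_sum(memory: &[u8], point_ptr: usize) -> i64 {
    load_i64(memory, point_ptr, POINT_X_OFFSET) + load_i64(memory, point_ptr, POINT_Y_OFFSET)
}
```

The frontend owns the layout: it decides the offsets and the total size (`POINT_SIZE` here), and the generated code only ever does address arithmetic and scalar loads.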

as far as GC goes: there are the r32/r64 types which are the same as i32/i64 but for which Cranelift will spill to the stack at GC safepoints and emit stack maps that your collector can use to find all on-stack roots
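
To make the role of stack maps concrete, here is a simplified model in plain Rust of how a collector might use one to find on-stack roots. This is only a sketch: the `StackMap` struct and its `live_ref_slots` field are hypothetical, and Cranelift's real stack map encoding is different.

```rust
/// Hypothetical stack map for one safepoint: the indices of the frame slots
/// that hold live GC references at that point in the code.
struct StackMap {
    live_ref_slots: Vec<usize>,
}

/// Collect the roots in one stack frame.
/// Slots not listed in the map are ordinary integers and are ignored,
/// even if their bits happen to look like a pointer.
fn roots_in_frame(frame: &[u64], map: &StackMap) -> Vec<u64> {
    map.live_ref_slots
        .iter()
        .map(|&slot| frame[slot])
        .filter(|&reference| reference != 0) // skip null references
        .collect()
}
```

The point of spilling references at safepoints is exactly this: the collector gets a precise list of where references live, so it never has to guess whether a stack word is a pointer.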

fitzgen (he/him) (Feb 22 2024 at 18:40):

so again with GC types, Cranelift only ever sees/understands the pointer, and it is up to the frontend to emit code to e.g. read the ith element of a GC-managed list from the r64 that is the GC pointer to that list
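
A sketch of what "read the ith element" might amount to, again in plain Rust with a byte slice standing in for the GC heap. The list layout (a length word followed by 8-byte elements) and every name below are assumptions made for illustration, not anything Cranelift or MiniScript prescribes.

```rust
// Hypothetical list object layout: [ len: u64 | elem 0 | elem 1 | ... ]
const LIST_LEN_OFFSET: usize = 0;
const LIST_ELEMS_OFFSET: usize = 8;
const ELEM_SIZE: usize = 8;

/// Stand-in for a 64-bit load from the heap (little-endian).
fn load_u64(heap: &[u8], addr: usize) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&heap[addr..addr + 8]);
    u64::from_le_bytes(bytes)
}

/// The sequence a frontend would emit for `list[index]`:
/// load the length, bounds-check, compute the element address, load it.
fn list_get(heap: &[u8], list_ref: usize, index: u64) -> Option<u64> {
    let len = load_u64(heap, list_ref + LIST_LEN_OFFSET);
    if index >= len {
        return None; // generated code would branch to an error path here
    }
    let addr = list_ref + LIST_ELEMS_OFFSET + (index as usize) * ELEM_SIZE;
    Some(load_u64(heap, addr))
}
```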

fitzgen (he/him) (Feb 22 2024 at 18:41):

also, it is up to you to emit any GC barrier code that your collector might need, Cranelift won't do that for you automatically (it doesn't know what kind of barriers are needed)
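
Which barrier is needed depends entirely on the collector. As one possible example, here is a sketch of a card-marking write barrier, a technique often used by generational collectors. It is not something Cranelift provides; the `Heap` type, the 512-byte card size, and the method names are all assumptions for illustration.

```rust
/// log2 of the assumed card size (512 bytes per card).
const CARD_SHIFT: usize = 9;

/// Toy heap: 8-byte words plus one dirty flag per card.
struct Heap {
    words: Vec<u64>,
    dirty_cards: Vec<bool>,
}

impl Heap {
    fn new(num_words: usize) -> Self {
        let bytes = num_words * 8;
        let cards = (bytes + (1 << CARD_SHIFT) - 1) >> CARD_SHIFT;
        Heap { words: vec![0; num_words], dirty_cards: vec![false; cards] }
    }

    /// Store a reference into the heap *with* the barrier:
    /// the store itself, then marking the card that contains the slot.
    /// The frontend would emit both steps at every reference store.
    fn store_ref(&mut self, byte_addr: usize, new_ref: u64) {
        self.words[byte_addr / 8] = new_ref;
        self.dirty_cards[byte_addr >> CARD_SHIFT] = true;
    }

    /// True if the card covering `byte_addr` has been written since the last scan.
    fn card_is_dirty(&self, byte_addr: usize) -> bool {
        self.dirty_cards[byte_addr >> CARD_SHIFT]
    }
}
```

A collector using this scheme would rescan only the dirty cards to find references that may have been updated since the last collection.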

juh9870 (Feb 23 2024 at 17:23):

I see. Are there any examples of some of these being implemented in some dynamically typed language?

fitzgen (he/him) (Feb 23 2024 at 18:34):

other than Wasmtime's reference types support, I'm not aware of any

here is CLIF construction for doing table.gets on tables of externrefs which are lowered to r64s: https://github.com/bytecodealliance/wasmtime/blob/dd0364d367c579abc8f572d2a056aca6cd286887/crates/cranelift/src/func_environ.rs#L1389

Chris Fallin (Feb 23 2024 at 19:00):

@juh9870 your questions I think might benefit from Cranelift-orthogonal resources -- you're more or less asking how to compile a higher-level language (with lambdas/closures, an object system, etc) into something at the machine-code level. Cranelift accepts a machine-independent IR, but from the perspective of your questions, it's not much different than machine code: it has pointers and memory as a linear array of bytes, and basic arithmetic and branching, and everything else has to be built on that.
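
Since everything has to be built from integers and pointers, a dynamically typed value needs some encoding in a machine word. One common choice, sketched below in plain Rust, is low-bit tagging: small integers carry a 1 in the lowest bit, and aligned heap pointers keep a 0 there. The `Value` type and its methods are hypothetical, and this scheme leaves only 63 bits for integers.

```rust
/// A dynamically typed value packed into one 64-bit word.
/// Lowest bit 1 => 63-bit integer; lowest bit 0 => aligned heap pointer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Value(u64);

impl Value {
    fn from_int(n: i64) -> Value {
        Value(((n as u64) << 1) | 1)
    }

    fn from_ptr(addr: u64) -> Value {
        assert!(addr & 1 == 0, "heap pointers must be at least 2-byte aligned");
        Value(addr)
    }

    fn is_int(self) -> bool {
        self.0 & 1 == 1
    }

    fn as_int(self) -> Option<i64> {
        // The arithmetic shift restores the sign of negative integers.
        if self.is_int() { Some((self.0 as i64) >> 1) } else { None }
    }

    fn as_ptr(self) -> Option<u64> {
        if self.is_int() { None } else { Some(self.0) }
    }
}

/// Dynamic `a + b`: check both tags, then do plain integer arithmetic.
/// `None` models the runtime type error path.
fn add(a: Value, b: Value) -> Option<Value> {
    Some(Value::from_int(a.as_int()?.wrapping_add(b.as_int()?)))
}
```

In generated code the tag checks become a bitwise `and` plus a branch, which is the kind of basic arithmetic and branching the message above refers to.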

There are some good books and papers on how to compile languages like Scheme (see e.g. Lisp in Small Pieces, or I think the nanopass framework is interesting to study, or Andy Wingo's blog posts on Guile) that will cover the closure bit; and for objects you can read up on how e.g. SpiderMonkey or V8 keep around objects with slots and "shapes". Garbage collection is also a huge topic and there are good textbooks on this (the "Garbage Collection Handbook" is the most common recommendation I see). There's a whole universe of literature on efficiently compiling ML (as in OCaml and Standard ML, not the more recent "machine learning" meaning!) as well. "Lambda-lifting", "closure conversion", "monomorphization" vs. "uniform type representation" are some good search terms to get you started.
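
For a taste of what "closure conversion" means, here is a minimal sketch in plain Rust. A source-level function such as `make_adder(n)` returning `x => x + n` is turned into a top-level code function that takes its captured variables as an explicit environment, plus a record pairing that code pointer with the environment. The `Closure` struct and function names are hypothetical, and the environment is simplified to a list of integers.

```rust
/// A closure-converted function value: a plain code pointer plus the
/// captured variables. This pair is what "function as a value" becomes.
struct Closure {
    code: fn(&[i64], i64) -> i64,
    env: Vec<i64>,
}

/// Body of the inner function after conversion: the free variable `n`
/// is no longer captured implicitly, it is read from `env[0]`.
fn adder_code(env: &[i64], x: i64) -> i64 {
    x + env[0]
}

/// `make_adder(n)` after conversion: allocate the environment and pair it
/// with the code pointer.
fn make_adder(n: i64) -> Closure {
    Closure { code: adder_code, env: vec![n] }
}

/// Calling any function value: an indirect call that passes the environment
/// along with the actual argument.
fn call_closure(closure: &Closure, arg: i64) -> i64 {
    (closure.code)(&closure.env, arg)
}
```

In a compiled setting the `Closure` record would be a heap object managed by the GC, and `call_closure` would lower to loading the code pointer from it and making an indirect call.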


Last updated: Nov 22 2024 at 16:03 UTC