wasmtime / issue #10766 Cranelift impressions from a firs... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #10766 Cranelift impressions from a firs...

Wasmtime GitHub notifications bot (May 11 2025 at 16:08):

Hello,

I tried using Cranelift for a new project JIT-compiling scripts for realtime audio processing. I've done similar work using Inkwell, which wraps LLVM. I remain very interested in having a modern and competitive Rust-first JIT compiler, and I really want to see the Cranelift project succeed. While I was able to get started quickly with Cranelift, there have been multiple separate and persistent issues which, taken together, have made me reconsider using Cranelift over Inkwell. I want to share my experiences and specific feedback because I think such first-time impressions are vital to the health of open source software and, again, I love this project and want it to keep growing.

Apologies for putting so many things into a single issue. If it suits you, feel free to chop this up and quote verbatim or paraphrase into separate issues. For now, I'm moving on from Cranelift and returning to Inkwell and LLVM for my own productivity.

Multiple distracting TODO's in the JIT demo

My early rapid success interacting with Cranelift was from following and adapting the jit demo. Reading through example code though, there are several distracting TODO comments questioning Cranelift's own design and pointing out known pain points. As a first-time user, this gives me the impression that the API hasn't matured and is probably unstable, which makes me hesitant to want to commit to it. What I would love to see here instead is just a self-contained demo where each part of the library plays an obvious role. Possibly some of this commentary is out of date and the demo can be refined already.

Scattered documentation due to crate separation

I rely heavily on documentation hosted on docs.rs where it is easily navigable and rapidly searchable. Oftentimes I search for individual types, names, and keywords to quickly and serendipitously discover related functionality. The separation of Cranelift into multiple crates however means that oftentimes, the thing I'm looking for isn't found because it's in a separate crate. For example, FunctionBuilder is defined in cranelift-frontend, but all of the instruction-inserting methods I use it with are defined in the InstBuilder trait in the cranelift-codegen crate, where I would have to know to look first. This may be more of a problem for Rust's own documentation system than with Cranelift per se.

Difficulties with the first block of a function

I was trying to JIT-compile a function consisting of an entry block, multiple user-defined sequences which can't be known ahead of time, and an exit block. The first thing I tried was to create the 'entry' block first, then populate user-defined blocks, and finally switch to the 'entry' block and populate it with a jump instruction to the first user-defined block. This turned out to be very frustrating, and Cranelift would invariably error with "invalid reference to entry block". It turns out that Cranelift considers the entry block not to be the first created block (as I had assumed), but rather the first block that I pass to FunctionBuilder::switch_to_block, as explained in the docs for FunctionBuilder. This was confusing for me, since I wasn't ready to put anything in my entry block yet and so wouldn't have thought to switch to it first. Additionally, I'm not really convinced the documentation is correct here, and Cranelift actively prevents me from partially filling and later switching back to partly-filled blocks.

An explicit method such as FunctionBuilder::set_entry_block or having the first _created_ block become the function entry point would make this process a lot more straightforward. After all, the blocks themselves are merely nodes in a control flow graph, and everywhere else with jump and return instructions, their dependencies are explicitly stated, so it would make sense to explicitly state which is the entry block, rather than implying it through non-obvious workflows.

A few minimal motivating examples. This works fine and as I expect:

let entry_block = builder.create_block();
let exit_block = builder.create_block();

builder.switch_to_block(entry_block);
builder.ins().jump(exit_block, &[]);

builder.switch_to_block(exit_block);
builder.ins().return_(&[]);

Filling the exit block first errors:

let entry_block = builder.create_block();
let exit_block = builder.create_block();

builder.switch_to_block(exit_block);
builder.ins().return_(&[]);

builder.switch_to_block(entry_block);
builder.ins().jump(exit_block, &[]);

The error being:

Compilation(Verifier(VerifierErrors([VerifierError { location: inst1, context: None, message: "invalid reference to entry block block1" }])))

Contrary to the documentation, adding a call to switch_to_block with the first block does not fix this:

let entry_block = builder.create_block();
let exit_block = builder.create_block();

builder.switch_to_block(entry_block);

builder.switch_to_block(exit_block);
builder.ins().return_(&[]);

builder.switch_to_block(entry_block);
builder.ins().jump(exit_block, &[]);

This again errors with:

value: Compilation(Verifier(VerifierErrors([VerifierError { location: inst1, context: None, message: "invalid reference to entry block block1" }])))

If I hypothesize that perhaps the first block that I switch to _and insert an instruction into_ becomes the first block, attempting to insert a nop to return to later just outright panics:

let entry_block = builder.create_block();
let exit_block = builder.create_block();

builder.switch_to_block(entry_block);
builder.ins().nop();

builder.switch_to_block(exit_block);
builder.ins().return_(&[]);

builder.switch_to_block(entry_block);
builder.ins().jump(exit_block, &[]);

This panics with:

thread panicked at cranelift-frontend-0.117.2/src/frontend.rs:361:9:
you have to fill your block before switching

Inflexible model of program synthesis

The above troubles point to a broader issue I have with Cranelift. In Inkwell and LLVM, generating JIT instructions and manipulating basic blocks feels very much like writing real source code: you can insert some instructions here, move the cursor and finish one idea, then append or prepend instructions where it makes sense for user-defined workloads as they're needed, and verify everything when you are ready. This enables writing JIT instructions just like how one would, for example, type out the loop variable and the opening and closing brackets of a for-loop in one's editor before thinking about how to fill in the body. In Cranelift, the simple panic message "you have to fill your block before switching" tells me that this fine-grained but intuitive level of incremental edits is currently not possible.

I view this as a major shortcoming for a JIT API, since if I already had a fully-ordered instruction-by-instruction understanding of what my user-defined program will do, I would necessarily be repeating a huge amount of work that other JIT libraries like Inkwell and LLVM help me with. The restriction to fill a basic block all in one go may be appropriate for straightforward AST traversals with simple semantics, but it hugely complicates other workflows where single parts of user-defined code map to multiple instructions in multiple blocks. These troubles are compounded by the confusion around how to specify and populate the entry block as explained above.

I encourage removing these restrictions and supporting more dynamic workflows where blocks and instructions are created and revised on the fly, with the ability to insert, prepend and append instructions within blocks similarly to how a human would write them in an incremental, out-of-order manner. The mental model of JIT compilation that has served me well using Inkwell is one where JIT instructions and basic blocks are a fluid work-in-progress authoring format, and validation and rule enforcement come during finalization but no sooner.
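One way to approximate the out-of-order authoring workflow described above, without changing the library, is to keep a user-side draft of each block and replay it in program order once it is complete. The following is only a minimal sketch under stated assumptions: `BlockId`, `Op`, and `Draft` are hypothetical names invented for illustration and are not Cranelift types, and the replay step merely returns the ordered draft where a real project would issue its switch_to_block and ins() calls.

```rust
use std::collections::BTreeMap;

/// Hypothetical block handle for this sketch; not a Cranelift type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct BlockId(pub u32);

/// Hypothetical stand-in for an instruction.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Op {
    Nop,
    Jump(BlockId),
    Return,
}

/// User-side draft: blocks can be edited in any order before replay.
#[derive(Default)]
pub struct Draft {
    blocks: BTreeMap<BlockId, Vec<Op>>,
    next_id: u32,
}

impl Draft {
    /// Creates an empty block; ids increase in creation order.
    pub fn create_block(&mut self) -> BlockId {
        let id = BlockId(self.next_id);
        self.next_id += 1;
        self.blocks.insert(id, Vec::new());
        id
    }

    /// Adds an instruction at the end of a block.
    pub fn append(&mut self, block: BlockId, op: Op) {
        self.blocks.get_mut(&block).expect("unknown block").push(op);
    }

    /// Adds an instruction at the start of a block.
    pub fn prepend(&mut self, block: BlockId, op: Op) {
        self.blocks.get_mut(&block).expect("unknown block").insert(0, op);
    }

    /// Yields blocks in creation order, so the first created block comes first.
    pub fn replay(&self) -> Vec<(BlockId, Vec<Op>)> {
        self.blocks.iter().map(|(id, ops)| (*id, ops.clone())).collect()
    }
}

/// Fills the exit block first, then the entry block, then revisits the entry.
pub fn demo_draft() -> Vec<(BlockId, Vec<Op>)> {
    let mut draft = Draft::default();
    let entry = draft.create_block();
    let exit = draft.create_block();
    draft.append(exit, Op::Return);
    draft.append(entry, Op::Jump(exit));
    draft.prepend(entry, Op::Nop);
    draft.replay()
}
```

The trade-off in this sketch is an extra buffering pass: nothing is validated until replay, which matches the "validate at finalization" model described above at the cost of holding the draft in memory.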

Lack of familiar names in debug output

A key part of my JIT development workflow is printing out the IR when anything goes wrong and inspecting it. Inkwell's JIT API can do this quite nicely since everything ultimately models LLVM IR and most things need a name up front in the API. Here's what some sample LLVM IR might look like printed out as part of my development process:

define void @my_function(ptr %ptr_ptr_dst, i64 %dst_len, ...) {
entry:
  %arg_ptr_retv = call ptr @argument_pointer_wrapper(ptr %context_ptr, i64 527665334732589126)
  %ptr1 = getelementptr i64, ptr %arg_ptr_retv, i64 1
  --- snip ---
  %len_is_zero = icmp eq i64 %dst_len, 0

check_startover:                                  ; No predecessors!
  %init_flag = load i8, ptr %ptr_init_flag, align 1
  %was_init = icmp eq i8 %init_flag, 1
  br i1 %was_init, label %resume, label %startover

startover:                                        ; preds = %check_startover
  store i8 1, ptr %ptr_init_flag, align 1
  br label %pre_loop

resume:                                           ; preds = %check_startover
  br label %pre_loop

pre_loop:                                         ; preds = %resume, %startover
  br label %loop_begin

loop_begin:                                       ; preds = %loop_end, %pre_loop
  %loop_counter = phi i64 [ 0, %pre_loop ], [ %loop_counter_inc, %loop_end ]
  %loop_counter_inc = add i64 %loop_counter, 1
  %ptr_local_val = alloca float, align 4
  --- snip ---

In this snippet, I've forgotten a jump instruction from the entry block to check_startover and the IR printout even points this out with a helpful "No predecessors!" comment. The immediate usefulness of this stems largely from having given blocks and instructions human-readab
[message truncated]

Wasmtime GitHub notifications bot (May 11 2025 at 19:45):

cfallin closed issue #10766:

Wasmtime GitHub notifications bot (May 11 2025 at 19:45):

cfallin commented on issue #10766:

Hi @timstr,

Thanks for your comment. First, some points for context:

Cranelift does not really have a full-time person whose job it is to "listen to feedback", clarify misconceptions, improve documentation, and the like. There are a number of us here who have written large parts of the compiler backend and are able to jump in and maintain as needed, but we all have day-job tasks pushing forward other efforts right now. As such, while I'm sure you meant well with this issue, it comes across as fairly out-of-place: we are not really in a place to say "ah yes, we'll improve that right away, sorry about that!" Rather, if you want improvements, "PRs welcome", as is always the case in open-source. The project works best when folks who need things help to build them.

You'll find this in contrast to LLVM, which has probably O(100) full-time contributors. If you need a universal compiler that is going to work everywhere, address ~every use case, and have fairly good polish, that's your best bet. We won't be offended if you "move on from Cranelift" and use LLVM :-)

Cranelift is a mature compiler, and "distracting TODOs" (as you say) in the JIT demo are not really a good indicator of whether the project is ready for use or whether APIs will change unpredictably. The compiler backend is in use as Wasmtime's backend in a bunch of load-bearing corporate use-cases (see the ADOPTERS.md file), and is shipped as an alternative backend for the Rust compiler as well, so it is in our interest to keep it stable.

The JIT demo hasn't gotten a lot of love, and is not really maintained as well as we would like, but again, see above -- unfortunately no one is paid full-time to improve documentation and examples. And, it's not really a "first tier" part of Cranelift -- those are the pieces in the critical path of its major use-cases, namely cranelift-codegen and cranelift-frontend. If you'd like to contribute documentation or code updates, PRs are definitely welcome.

Cranelift's crate separation is an intentional factoring for separation of concerns:

cranelift-codegen is the core compiler. It is kept agnostic to the surrounding environment as much as possible: it is a per-function compiler and doesn't have a concept of symbol names or of modules (only opaque references to other functions/data that are passed straight through into emitted relocations); it doesn't have a concept of any kind of code consumer or object format or JIT API (it just returns a vector of bytes plus metadata); it doesn't provide any "convenience" APIs (the user must provide valid SSA).

cranelift-frontend's main purpose is to lower from a few higher-level convenience abstractions: in particular, it implements a standard SSA construction algorithm so you can operate in terms of multiply-updated "variables".

cranelift-module, cranelift-object, cranelift-jit, ... are all "utility crates" that you can put together as needed to build the rest of a full-featured compiler. In Wasmtime we don't use most of these -- we manage our own object file format, module metadata, relocation handling, code loading, etc., so these get a little less maintenance attention and are effectively "community-maintained".

I wouldn't really describe this as a downside: the factoring is what allows Cranelift to be extremely flexible. For example, keeping compilation as a function-level concern in the core parts is what allows it to parallelize across function compilations, which is something LLVM cannot do.
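To illustrate why a per-function compilation unit parallelizes easily, here is a minimal std-only sketch. Everything in it is an assumption made for illustration: `FuncIr`, `compile_one`, and `compile_all` are hypothetical names, and the "compile" step is a placeholder byte transform rather than anything Cranelift does. The only point is that a step which takes one function and touches no shared state can be fanned out across threads directly.

```rust
use std::thread;

/// Hypothetical per-function input; stands in for one function's IR.
pub struct FuncIr {
    pub name: String,
    pub body: Vec<u8>,
}

/// Placeholder for a per-function compile step with no shared mutable state.
fn compile_one(func: &FuncIr) -> Vec<u8> {
    func.body.iter().map(|b| b.wrapping_add(1)).collect()
}

/// Compiles every function on its own scoped thread, keeping input order.
pub fn compile_all(funcs: &[FuncIr]) -> Vec<(String, Vec<u8>)> {
    thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = funcs
            .iter()
            .map(|f| s.spawn(move || (f.name.clone(), compile_one(f))))
            .collect();
        // Join in spawn order so results line up with the input slice.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

A real embedding would more likely use a thread pool than one thread per function; the scoped-thread form is used here only because it needs nothing beyond the standard library.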

Your issues with entry block manipulation and creating code in arbitrary order are limitations of the SSA construction algorithm and not of cranelift-codegen proper. Cranelift contains APIs that let you insert instructions anywhere -- internally we represent instruction and block layout with linked lists so these insertions are O(1) -- you only need to use the right APIs. (See the Cursor abstraction and Layout API for more.)
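The linked-list idea mentioned above can be sketched in a few lines. This is a toy model, not Cranelift's actual Layout or Cursor implementation: `InstId` and `ToyLayout` are hypothetical names, and a HashMap is assumed as the backing store purely to keep the example short. It shows why inserting before an arbitrary instruction only touches the neighbouring links.

```rust
use std::collections::HashMap;

/// Hypothetical instruction handle; not Cranelift's instruction type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct InstId(pub u32);

/// Previous and next neighbours of one instruction.
#[derive(Clone, Copy)]
struct Links {
    prev: Option<InstId>,
    next: Option<InstId>,
}

/// Toy doubly linked layout keyed by instruction id.
#[derive(Default)]
pub struct ToyLayout {
    links: HashMap<InstId, Links>,
    first: Option<InstId>,
    last: Option<InstId>,
}

impl ToyLayout {
    /// Appends `inst` at the end of the layout.
    pub fn append(&mut self, inst: InstId) {
        let node = Links { prev: self.last, next: None };
        match self.last {
            Some(tail) => self.links.get_mut(&tail).unwrap().next = Some(inst),
            None => self.first = Some(inst),
        }
        self.last = Some(inst);
        self.links.insert(inst, node);
    }

    /// Inserts `inst` immediately before `before`; only neighbour links change.
    pub fn insert_before(&mut self, inst: InstId, before: InstId) {
        let prev = self.links[&before].prev;
        self.links.insert(inst, Links { prev, next: Some(before) });
        self.links.get_mut(&before).unwrap().prev = Some(inst);
        match prev {
            Some(p) => self.links.get_mut(&p).unwrap().next = Some(inst),
            None => self.first = Some(inst),
        }
    }

    /// Walks the list from the front and returns the current order.
    pub fn order(&self) -> Vec<InstId> {
        let mut out = Vec::new();
        let mut cur = self.first;
        while let Some(id) = cur {
            out.push(id);
            cur = self.links[&id].next;
        }
        out
    }
}

/// Builds 0, 2, then inserts 1 before 2 and 9 before 0.
pub fn demo_layout() -> Vec<u32> {
    let mut layout = ToyLayout::default();
    layout.append(InstId(0));
    layout.append(InstId(2));
    layout.insert_before(InstId(1), InstId(2));
    layout.insert_before(InstId(9), InstId(0));
    layout.order().into_iter().map(|i| i.0).collect()
}
```

In this sketch `demo_layout()` yields the order 9, 0, 1, 2, regardless of the order in which the instructions were created.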

To say it another way: cranelift-frontend gives you SSA construction, and to do so, it needs to see the program in close to program order (imagine the state it would need to track otherwise). But if you're willing to produce valid SSA and use the cranelift-codegen API, you can just spray instructions anywhere. This is exactly the same as LLVM. (Well, if you use mem2reg and stack slots for all your variables, then you get both, but that's a pretty expensive approach versus in-place SSA construction.)
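A toy model may help show why SSA construction wants to see code in roughly program order. The sketch below is an assumption-laden illustration, not the algorithm cranelift-frontend implements: it handles straight-line code only, and `Var`, `Value`, `ToySsa`, `def_const`, and `def_add` are hypothetical names. Each redefinition of a variable creates a fresh value, and each use resolves to whichever definition was emitted most recently, so emitting code out of order would change what a use refers to.

```rust
use std::collections::HashMap;

/// Hypothetical mutable "variable" as seen by the front end.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Var(pub u32);

/// Hypothetical SSA value: defined exactly once.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(pub u32);

/// Straight-line SSA construction: tracks the current definition per variable.
#[derive(Default)]
pub struct ToySsa {
    current: HashMap<Var, Value>,
    next_value: u32,
    pub trace: Vec<String>,
}

impl ToySsa {
    /// Allocates a fresh SSA value number.
    fn fresh(&mut self) -> Value {
        let v = Value(self.next_value);
        self.next_value += 1;
        v
    }

    /// Defines `var` as a constant, producing a new SSA value.
    pub fn def_const(&mut self, var: Var, c: i64) -> Value {
        let v = self.fresh();
        self.trace.push(format!("v{} = iconst {}", v.0, c));
        self.current.insert(var, v);
        v
    }

    /// Resolves `var` to its most recent definition, if it has one yet.
    pub fn use_var(&self, var: Var) -> Option<Value> {
        self.current.get(&var).copied()
    }

    /// Defines `dst` as `a + b`; returns None if an operand is not yet defined.
    pub fn def_add(&mut self, dst: Var, a: Var, b: Var) -> Option<Value> {
        let (x, y) = (self.use_var(a)?, self.use_var(b)?);
        let v = self.fresh();
        self.trace.push(format!("v{} = iadd v{}, v{}", v.0, x.0, y.0));
        self.current.insert(dst, v);
        Some(v)
    }
}

/// Updates `x` twice; each update becomes a distinct SSA value.
pub fn demo_ssa() -> Vec<String> {
    let mut ssa = ToySsa::default();
    let (x, y) = (Var(0), Var(1));
    ssa.def_const(x, 1);
    ssa.def_const(y, 2);
    ssa.def_add(x, x, y);
    ssa.def_add(x, x, y);
    ssa.trace
}
```

With branches and loops, the "current definition" also depends on which predecessors have been seen, which is the extra state a front end would have to track if blocks could be revisited freely.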

To be blunt: I wish you had come to us with these concerns as questions rather than posting a wall of text and saying essentially "I'm giving up, see ya". We would have been happy to clarify the roles of the different levels of API, and whether the restrictions you perceive are fundamental or not. (And where they are, offer pointers to the theory, algorithms, and further reading that show why the restrictions exist.)

Names for entities: Yeah, definitely, it would be useful to have names attached to values and to blocks. Willing to contribute a PR?
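Until something like this exists upstream, names can be kept in a side table on the user's side and applied when printing. The sketch below is a hypothetical helper, not an existing Cranelift API: `describe_blocks`, its parameters, and the choice to treat block 0 as the entry are all assumptions of this example. It labels each block with a user-supplied name when one exists and flags non-entry blocks that no edge targets, similar in spirit to LLVM's "No predecessors!" comment.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Renders one header line per block, using the user-supplied name when there
/// is one and flagging non-entry blocks that nothing jumps to.
pub fn describe_blocks(
    block_count: u32,
    names: &BTreeMap<u32, &str>,
    edges: &[(u32, u32)], // (from, to) pairs of block indices
) -> Vec<String> {
    // Every block that is the target of at least one edge.
    let targets: BTreeSet<u32> = edges.iter().map(|&(_, to)| to).collect();
    (0..block_count)
        .map(|id| {
            let label = match names.get(&id) {
                Some(name) => format!("block{} ({})", id, name),
                None => format!("block{}", id),
            };
            // Block 0 is treated as the entry, an assumption of this sketch.
            if id != 0 && !targets.contains(&id) {
                format!("{}: ; no predecessors", label)
            } else {
                format!("{}:", label)
            }
        })
        .collect()
}
```

For example, with names for blocks 0 to 2 and a single edge from block 1 to block 2, blocks 1 and 3 would be flagged as having no predecessors.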

Since this whole issue as a unit is not really actionable, I'll go ahead and close it, but do feel free to file issues (or post PRs directly) for the individual parts, such as names for values/blocks or any doc improvements. Thanks!

Wasmtime GitHub notifications bot (May 11 2025 at 19:45):

cfallin edited a comment on issue #10766:

Last updated: Feb 24 2026 at 05:28 UTC