I'm wondering what the Cranelift IR would look like for this source code:
let result = if condition1 {
Some(do_something2())
} else {
None
};
do_something3();
if let Some(inner_value) = result {
do_something4(inner_value);
} else {
do_something5();
}
Conceptually, Option
is a struct with two fields. The first field is the the discriminant, which is 0 for Some
and 1 for None
. The second field is the inner value, which can be left uninitialized if the discriminant is None
.
Implementing this using loads and stores, it's obvious how this would work. The Option
would have a layout somewhere in memory. The branch creating the Some
value would store 0 at the memory location of the discriminant and the return value of do_something2
at the memory location of the inner value. The branch creating the None
value would store 1 at the memory location of the discriminant, and do nothing to the memory location of the inner value, leaving it uninitialized.
I've got the feeling that using loads and stores wouldn't be a good approach. Better would be to use SSA values that can be optimized. I guess the Some
value could be created as (iconst(ty, 0), iconst(ty, inner_value))
. But how do I then create the None
value, where the inner value is supposed to be uninitialized? Is this the right approach at all?
Here's another example.
I enter the following code on godbolt.org:
pub fn example() -> Option<u32> {
Some(42)
}
The following comes out:
example::example:
mov eax, 1
mov edx, 42
ret
The discriminant is assigned to eax and the inner value to edx.
Then I enter the following:
pub fn example() -> Option<u32> {
None
}
And this comes out:
example::example:
xor eax, eax
ret
The discriminant is assigned to eax and edx is left uninitialized.
How can I do that with Cranelift?
clif doesn't have undef
values or anything like that, so probably the best thing to do would be to use zeros
you could also look at what cg_clif
, the cranelift backend for rustc, does
@Veverak adding onto what @fitzgen (he/him) says above, the reason we don't have undef is because there is a focus on determinism in our IR -- in practice there have been a lot of bugs in e.g. LLVM related to its handling of undef values
Thanks for the answers. I had a brief look at cg_clif and got the impression that it implements enums through pointer writes, which I guess leads to memory accesses that won't be optimized away. The LLVM backend seems to implement enums through stack stores and loads, as evidenced by running the compiler without optimizations, and then LLVM optimizes the stack slots away. Cranelift doesn't seems to do the same optimization, and I guess it never will. I would have liked if I could do this optimization myself by constructing enums as SSA values when I know that it doesn't need an address on the stack.
I guess I'll make my frontend optimize the case when a block that constructs an enum is immediately followed by a block that matches on that enum, by skipping the match block and jumping directly into the match arm that takes this variant, passing the inner value in SSA form. For any other case, I'll store the enum in a stack slot.
@Veverak it sounds like what you're looking for is something like LLVM's mem2reg pass? In other words, something that recognizes a memory location is never aliased and does not escape, so can lift loads and stores into SSA? I think that is unlikely to happen in the short term, but it's conceivable that CL could have something like that someday
@Chris Fallin That sounds like the pass I was thinking of. But I wasn't expecting CL to have this pass. Since my frontend already knows that the value should go into registers, not memory, it could directly generate code without memory accesses instead of relying on this pass. But I can't generate the SSA-equivalent of the relevant memory accesses without a way to express uninitialized values.
Zero-initializing values that will never be read may be costly depend on what function I'm compiling. Maybe there's a loop that constructs many values of one variant and will have to zero-initialize all the fields of all the other variants that it doesn't create.
Last updated: Nov 22 2024 at 17:03 UTC