Stream: cranelift

Topic: Representing uninitialized memory


view this post on Zulip Veverak (Oct 26 2021 at 13:43):

I'm wondering what the Cranelift IR would look like for this source code:

let result = if condition1 {
    Some(do_something2())
} else {
    None
};
do_something3();
if let Some(inner_value) = result {
    do_something4(inner_value);
} else {
    do_something5();
}

Conceptually, Option is a struct with two fields. The first field is the the discriminant, which is 0 for Some and 1 for None. The second field is the inner value, which can be left uninitialized if the discriminant is None.

Implementing this using loads and stores, it's obvious how this would work. The Option would have a layout somewhere in memory. The branch creating the Some value would store 0 at the memory location of the discriminant and the return value of do_something2 at the memory location of the inner value. The branch creating the None value would store 1 at the memory location of the discriminant, and do nothing to the memory location of the inner value, leaving it uninitialized.

I've got the feeling that using loads and stores wouldn't be a good approach. Better would be to use SSA values that can be optimized. I guess the Some value could be created as (iconst(ty, 0), iconst(ty, inner_value)). But how do I then create the None value, where the inner value is supposed to be uninitialized? Is this the right approach at all?

view this post on Zulip Veverak (Oct 26 2021 at 15:41):

Here's another example.

I enter the following code on godbolt.org:

pub fn example() -> Option<u32> {
    Some(42)
}

The following comes out:

example::example:
        mov     eax, 1
        mov     edx, 42
        ret

The discriminant is assigned to eax and the inner value to edx.

Then I enter the following:

pub fn example() -> Option<u32> {
    None
}

And this comes out:

example::example:
        xor     eax, eax
        ret

The discriminant is assigned to eax and edx is left uninitialized.

How can I do that with Cranelift?

Compiler Explorer is an interactive online compiler which shows the assembly output of compiled C++, Rust, Go (and many more) code.

view this post on Zulip fitzgen (he/him) (Oct 26 2021 at 16:01):

clif doesn't have undef values or anything like that, so probably the best thing to do would be to use zeros

you could also look at what cg_clif, the cranelift backend for rustc, does

view this post on Zulip Chris Fallin (Oct 26 2021 at 16:02):

@Veverak adding onto what @fitzgen (he/him) says above, the reason we don't have undef is because there is a focus on determinism in our IR -- in practice there have been a lot of bugs in e.g. LLVM related to its handling of undef values

view this post on Zulip Veverak (Oct 26 2021 at 16:23):

Thanks for the answers. I had a brief look at cg_clif and got the impression that it implements enums through pointer writes, which I guess leads to memory accesses that won't be optimized away. The LLVM backend seems to implement enums through stack stores and loads, as evidenced by running the compiler without optimizations, and then LLVM optimizes the stack slots away. Cranelift doesn't seems to do the same optimization, and I guess it never will. I would have liked if I could do this optimization myself by constructing enums as SSA values when I know that it doesn't need an address on the stack.

I guess I'll make my frontend optimize the case when a block that constructs an enum is immediately followed by a block that matches on that enum, by skipping the match block and jumping directly into the match arm that takes this variant, passing the inner value in SSA form. For any other case, I'll store the enum in a stack slot.

view this post on Zulip Chris Fallin (Oct 26 2021 at 16:26):

@Veverak it sounds like what you're looking for is something like LLVM's mem2reg pass? In other words, something that recognizes a memory location is never aliased and does not escape, so can lift loads and stores into SSA? I think that is unlikely to happen in the short term, but it's conceivable that CL could have something like that someday

view this post on Zulip Veverak (Oct 26 2021 at 16:44):

@Chris Fallin That sounds like the pass I was thinking of. But I wasn't expecting CL to have this pass. Since my frontend already knows that the value should go into registers, not memory, it could directly generate code without memory accesses instead of relying on this pass. But I can't generate the SSA-equivalent of the relevant memory accesses without a way to express uninitialized values.

Zero-initializing values that will never be read may be costly depend on what function I'm compiling. Maybe there's a loop that constructs many values of one variant and will have to zero-initialize all the fields of all the other variants that it doesn't create.


Last updated: Oct 23 2024 at 20:03 UTC