Stream: cranelift

Topic: RegAlloc2: DisallowedBranchArg


view this post on Zulip Leaves (Jul 01 2024 at 02:21):

Hello, I am using RegAlloc2 and have a question. I have produced the following MachineCode instructions. Their Mnemonics and Operands Vec can be seen here:

   cmp_r_m (Use: v23i reg),(Use: v25i reg),(Def: v28v fixed(p6v)),
   jne (Use: v0i any),(Use: v28v fixed(p6v)),

However, the second operand(v28v) in the JNE is an x86 flag. The zero flag. This needs to be used by the JNE as it needs to be materialized in the actual ZF bit in the EFLAGS register however none of the successor blocks actually use this and so it isn't a branch parameter. The only real block parameter here is v0i, passed to both successors of the JNE.

When running RegAlloc2 over this, I get the DisallowedBranchArg(inst) specifying that JNE.

The docs say to create an "edge block" to solve this issue. I assume that means I need to insert blocks in between the current block and each of its predecessors that expects an additional parameter(which would be the ZF value used by the JNE). Then this could be discarded in the edge blocks before unconditionally branching to the real successors.

Alternatively, could I just add this extra operand to each of the successor blocks parameter lists, even though they don't actually use it? Seems like that might be the best option.

Thank you.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:11):

More basic question (X-Y issue at play here): why track eflags via register allocation? It has special constraints (only one register, can't easily be spilled), is implicitly written by many instructions, and is usually handled specially during isel

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:12):

What we do in Cranelift is to implicitly get the eflags dataflow right by generating the cmp right before the conditional jump; RA2 doesn't otherwise know anything about it

view this post on Zulip Leaves (Jul 01 2024 at 03:21):

If I could avoid having to model the flags I would, but as it stands I cannot. I have them modeled each with their own individual PREG and instructions that use them can require that they come in in the right place. Then define MachineInstructions like this. This one is for x86 setne.

    setne(
        Data(),
        RegIn(),
        FlagIn(
            (zf, OperandConstraint::FixedReg(PREG_ZF))
        ),
        RegOut(
            (output, OperandConstraint::Any)
        ),
        FlagOut()
    ),

view this post on Zulip Leaves (Jul 01 2024 at 03:23):

Ofc modeling them this way is really annoying when regalloc2 DOES create an edit which moves from one flag to another. But this is kinda rare.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:23):

Can you say more why you need to model flags in this way? regalloc2 unfortunately is not designed to handle this (as you're seeing with the constraints around conditional-jump inputs).

view this post on Zulip Leaves (Jul 01 2024 at 03:26):

The input is the result of an x86 lifter. The flags have to be modeled and ideally I'd like to keep them in their original registers whenever possible. Trust me I know regalloc2 isn't setup to handle this :rofl:. If there were more(unlimited) RegClasses I could prevent flag-to-flag moves. I considered forking and trying to implement this(side question: do you think that would be difficult to do?)

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:29):

It's an interesting question! The constraint around jump inputs comes mainly from the move-insertion (edge cases around where to place moves); I have vague recollections of running into some fuzzbugs and inserting this constraint, but no solid summary of what the problems will be, so it's worth removing the check, fuzzing, and seeing what comes up.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:29):

(And as it goes, there is one free regclass: we have two bits now and only have three classes defined, Int, Float and Vector)

view this post on Zulip Leaves (Jul 01 2024 at 03:29):

Need at least 6 more unfortunately :pensive:

view this post on Zulip Leaves (Jul 01 2024 at 03:30):

How does regalloc 2 handle MIPs like branching, where there are uses of registers in the branch?

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:30):

Ah, if you know your function bodies won't be large, you could steal more bits from the vreg index

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:31):

(I'm not sure what a "MIP" is but branch instructions are handled mostly like any other in terms of uses and defs; they're only special w.r.t. move insertion for edge-moves)

view this post on Zulip Leaves (Jul 01 2024 at 03:31):

Might try doing that^ shouldn't be very large.

MIPS as in the architecture:

BNE $t0,$t1,Target

It has uses of registers, so they would be in the operand slice, but it conceivably might not be passing these values on to the successor blocks.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:33):

ah, MIPS, sorry, parse error! We do generate cbz/cbnz in Cranelift on aarch64 which takes one register arg, and branch insts on risc-v which take two; I'm honestly blanking on how this works but that would be the place to start

view this post on Zulip Leaves (Jul 01 2024 at 03:34):

:thumbs_up: thank you. I think however that's solved should be similar to the solution I'm looking for. Similar case at least, ignoring the fact that I'm doing dumb things with the flags.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:34):

Ah, OK, that's how: as per here we ensure the CFG is such that moves can be placed in successor blocks instead

A new register allocator. Contribute to bytecodealliance/regalloc2 development by creating an account on GitHub.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:34):

so, back to your original message, something like "split the edge blocks" sounds like the right answer :-)

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:35):

(in Cranelift's BlockLoweringOrder we always ensure we split critical edges)

view this post on Zulip Leaves (Jul 01 2024 at 03:36):

If I just tacked on the flag's needed by the branch to the params of each of the successors, would that work too?

So that now the flag(s) are used by the branch, and used in the params of the successor blocks.

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:39):

not necessarily, as the edge moves may (should) happen in the successor blocks' starts if there are multiple successors

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:39):

in other words the flags will be alive, somewhere, but RA2 isn't set up to tell you where

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:40):

better to split the block edges and use a true arg, then all the machinery will naturally work to give you the location

view this post on Zulip Leaves (Jul 01 2024 at 03:41):

are the operands of the branch instruction itself ignored then? I would put the flag in an use operand to the branch with a FixedReg constraint(which would force the move to happen before?)

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:42):

no, the normal args to the branch (not block args) should work properly

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:43):

(and yes, moves necessary to put things in place would happen before the branch)

view this post on Zulip Leaves (Jul 01 2024 at 03:43):

Ok I'll give it a shot and report back :salute:

view this post on Zulip Chris Fallin (Jul 01 2024 at 03:43):

best of luck!

view this post on Zulip Leaves (Jul 01 2024 at 03:43):

Going to look into increasing VReg size as well :eyes:

view this post on Zulip Leaves (Jul 01 2024 at 03:55):

Ok so here is what is the problem now:

view this post on Zulip Leaves (Jul 01 2024 at 03:55):

And I will solve it by doing this.
https://i.imgur.com/BP6XmjB.png

view this post on Zulip Leaves (Jul 01 2024 at 03:55):

Dummy edge blocks that consume the ZF, and pass only the expected args to the true successors.

view this post on Zulip Chris Fallin (Jul 01 2024 at 04:28):

Sounds good except I don't think it should be necessary to consume ZF in the successor blocks: the branch itself using ZF should be enough (i.e., encode only the actual dependence)

view this post on Zulip Chris Fallin (Jul 01 2024 at 04:29):

the purpose of the edge-split blocks is to give the regalloc a place to put moves that have to happen "on the edge"; without that, there'd be no place to put such moves for the top-center-to-bottom-left edge (because prior to the cond branch, we could go either way, we don't know we're taking that edge; and in teh bottom-left block, we could have come from top-center or top-left)

view this post on Zulip Chris Fallin (Jul 01 2024 at 04:30):

once that block is there, regalloc is happy to let the cond br consume things, because it knows it doesn't have to put moves before the branch

view this post on Zulip Leaves (Jul 01 2024 at 05:12):

yep noticed that too as implementing, adding the block alone will remove the multiple predecessors on successor violation.


Last updated: Nov 22 2024 at 17:03 UTC