Hello, I am using RegAlloc2 and have a question. I have produced the following MachineCode instructions. Their Mnemonics and Operands Vec can be seen here:
cmp_r_m (Use: v23i reg),(Use: v25i reg),(Def: v28v fixed(p6v)),
jne (Use: v0i any),(Use: v28v fixed(p6v)),
However, the second operand(v28v) in the JNE is an x86 flag. The zero flag. This needs to be used by the JNE as it needs to be materialized in the actual ZF bit in the EFLAGS register however none of the successor blocks actually use this and so it isn't a branch parameter. The only real block parameter here is v0i, passed to both successors of the JNE.
When running RegAlloc2 over this, I get the DisallowedBranchArg(inst) specifying that JNE.
The docs say to create an "edge block" to solve this issue. I assume that means I need to insert blocks in between the current block and each of its predecessors that expects an additional parameter(which would be the ZF value used by the JNE). Then this could be discarded in the edge blocks before unconditionally branching to the real successors.
Alternatively, could I just add this extra operand to each of the successor blocks parameter lists, even though they don't actually use it? Seems like that might be the best option.
Thank you.
More basic question (X-Y issue at play here): why track eflags via register allocation? It has special constraints (only one register, can't easily be spilled), is implicitly written by many instructions, and is usually handled specially during isel
What we do in Cranelift is to implicitly get the eflags dataflow right by generating the cmp right before the conditional jump; RA2 doesn't otherwise know anything about it
If I could avoid having to model the flags I would, but as it stands I cannot. I have them modeled each with their own individual PREG and instructions that use them can require that they come in in the right place. Then define MachineInstructions like this. This one is for x86 setne.
setne(
Data(),
RegIn(),
FlagIn(
(zf, OperandConstraint::FixedReg(PREG_ZF))
),
RegOut(
(output, OperandConstraint::Any)
),
FlagOut()
),
Ofc modeling them this way is really annoying when regalloc2 DOES create an edit which moves from one flag to another. But this is kinda rare.
Can you say more why you need to model flags in this way? regalloc2 unfortunately is not designed to handle this (as you're seeing with the constraints around conditional-jump inputs).
The input is the result of an x86 lifter. The flags have to be modeled and ideally I'd like to keep them in their original registers whenever possible. Trust me I know regalloc2 isn't setup to handle this :rofl:. If there were more(unlimited) RegClasses I could prevent flag-to-flag moves. I considered forking and trying to implement this(side question: do you think that would be difficult to do?)
It's an interesting question! The constraint around jump inputs comes mainly from the move-insertion (edge cases around where to place moves); I have vague recollections of running into some fuzzbugs and inserting this constraint, but no solid summary of what the problems will be, so it's worth removing the check, fuzzing, and seeing what comes up.
(And as it goes, there is one free regclass: we have two bits now and only have three classes defined, Int, Float and Vector)
Need at least 6 more unfortunately :pensive:
How does regalloc 2 handle MIPs like branching, where there are uses of registers in the branch?
Ah, if you know your function bodies won't be large, you could steal more bits from the vreg index
(I'm not sure what a "MIP" is but branch instructions are handled mostly like any other in terms of uses and defs; they're only special w.r.t. move insertion for edge-moves)
Might try doing that^ shouldn't be very large.
MIPS as in the architecture:
BNE $t0,$t1,Target
It has uses of registers, so they would be in the operand slice, but it conceivably might not be passing these values on to the successor blocks.
ah, MIPS, sorry, parse error! We do generate cbz
/cbnz
in Cranelift on aarch64 which takes one register arg, and branch insts on risc-v which take two; I'm honestly blanking on how this works but that would be the place to start
:thumbs_up: thank you. I think however that's solved should be similar to the solution I'm looking for. Similar case at least, ignoring the fact that I'm doing dumb things with the flags.
Ah, OK, that's how: as per here we ensure the CFG is such that moves can be placed in successor blocks instead
so, back to your original message, something like "split the edge blocks" sounds like the right answer :-)
(in Cranelift's BlockLoweringOrder we always ensure we split critical edges)
If I just tacked on the flag's needed by the branch to the params of each of the successors, would that work too?
So that now the flag(s) are used by the branch, and used in the params of the successor blocks.
not necessarily, as the edge moves may (should) happen in the successor blocks' starts if there are multiple successors
in other words the flags will be alive, somewhere, but RA2 isn't set up to tell you where
better to split the block edges and use a true arg, then all the machinery will naturally work to give you the location
are the operands of the branch instruction itself ignored then? I would put the flag in an use operand to the branch with a FixedReg constraint(which would force the move to happen before?)
no, the normal args to the branch (not block args) should work properly
(and yes, moves necessary to put things in place would happen before the branch)
Ok I'll give it a shot and report back :salute:
best of luck!
Going to look into increasing VReg size as well :eyes:
Ok so here is what is the problem now:
And I will solve it by doing this.
https://i.imgur.com/BP6XmjB.png
Dummy edge blocks that consume the ZF, and pass only the expected args to the true successors.
Sounds good except I don't think it should be necessary to consume ZF in the successor blocks: the branch itself using ZF should be enough (i.e., encode only the actual dependence)
the purpose of the edge-split blocks is to give the regalloc a place to put moves that have to happen "on the edge"; without that, there'd be no place to put such moves for the top-center-to-bottom-left edge (because prior to the cond branch, we could go either way, we don't know we're taking that edge; and in teh bottom-left block, we could have come from top-center or top-left)
once that block is there, regalloc is happy to let the cond br consume things, because it knows it doesn't have to put moves before the branch
yep noticed that too as implementing, adding the block alone will remove the multiple predecessors on successor violation.
Last updated: Nov 22 2024 at 17:03 UTC