cfallin commented on issue #4856:
cc @uweigand to review?
I also spent the past day trying to go further in this branch, and managed to clean up a lot more, but there are some panics from undefined regs there so I don't have it quite right. @uweigand if you have time to poke at this more I'd very much appreciate it (but it's not urgent at all!). My long-term goal is to push the regalloc input toward fully-SSA (no mods, no multiple defs) and fully constraint-based (no pinned vregs) input, which allows for more flexibility in managing copies and spills, and makes for a more efficient solver generally.
uweigand commented on issue #4856:
I also spent the past day trying to go further in this branch, and managed to clean up a lot more, but there are some panics from undefined regs there so I don't have it quite right. @uweigand if you have time to poke at this more I'd very much appreciate it (but it's not urgent at all!). My long-term goal is to push the regalloc input toward fully-SSA (no mods, no multiple defs) and fully constraint-based (no pinned vregs) input, which allows for more flexibility in managing copies and spills, and makes for a more efficient solver generally.
The main problem here seems to be the tricks I had been playing with
uninitialized_regpair
. This was intended to solve the problem of how to initialize a register pair for those instructions that use one as input (basically, divides). My model has been that I need to allocate a register pair (uninitialized at this point), and then load up low and high parts of it. That used to work with the old regpair method, but with the new method it now exposes those uninitialized registers to regalloc, which it doesn't like.But fortunately, with the new model we can instead just load up the two halves into independent vregs and just construct a regpair from those two vregs. That fixes the "udiv" case. For the "sdiv" case, the instruction actually does not read the high half of the input regpair, so it actually should be uninitialized. But here we can simply change the
sdivmod
pattern to just only take aReg
instead of aRegPair
as input, which is closer to the true semantics anyway.Overall, this change simplifies the logic around regpairs anyway, so I like it. I've attached a patch to implement those changes.
regpair-patch.txtIn addition, I noticed that you've consistently swapped register numbers: the high half of the pair goes into
%r0
, and the low half goes into%r1
(we're bigendian, after all ...). Also the two inputs toumul_wide
were swapped (I guess the operation is commutative, but it still was a surprise). I've added those changes to the patch as well.Now, I'm running into a new error:
FAIL filetests/filetests/isa/s390x/vec-arithmetic.clif: panicked in worker #10: Could not allocate minimal bundle, but the allocation problem should be possible to solve
This looks like a regalloc problem (at first glance, it occurs when using multiple wide multiplications in a row, so maybe regalloc runs into conflicts since they're all forced into the same physical register pair?) ... could you have a look here?
cfallin commented on issue #4856:
Thanks @uweigand! I've updated based on feedback (and, importantly, reverted the 2-to-3-arg change in assembly printing). Thanks for looking further at the followup patch as well; I will pick that up and try to finish it next week, most likely.
cfallin commented on issue #4856:
Ah, I think this needs an r+ from someone with write access to the repo -- anyone want to give a rubberstamp on top of Ulrich's review above?
Last updated: Jan 24 2025 at 00:11 UTC