uweigand commented on issue #5172:
Additionally, this PR marks
r14
on s390x as non-allocatable, as it's used as the link register in the return instruction.This is really unfortunate, in general we'll get better code by allowing
r14
to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address fromr14
into a vreg at theargs
instruction, and then passing that vreg back as argument into the return instruction?
cfallin commented on issue #5172:
Additionally, this PR marks
r14
on s390x as non-allocatable, as it's used as the link register in the return instruction.This is really unfortunate, in general we'll get better code by allowing
r14
to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address fromr14
into a vreg at theargs
instruction, and then passing that vreg back as argument into the return instruction?Right, and we can certainly do that. The short-term issue is we want to rework all uses of pinned vregs, so the current approach of naming r14 as such (via the corresponding pinned vreg) has to go, but there's nothing stopping us from doing the same thing in an SSA-friendly way. I might suggest we can do that in a followup PR though, unless @elliottt is willing to do it right away. (And to be clear, I spoke with Trevor yesterday and suggested the approach currently in the PR as a way to make progress for now, so I think waiting is fine.)
uweigand commented on issue #5172:
Additionally, this PR marks
r14
on s390x as non-allocatable, as it's used as the link register in the return instruction.This is really unfortunate, in general we'll get better code by allowing
r14
to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address fromr14
into a vreg at theargs
instruction, and then passing that vreg back as argument into the return instruction?Right, and we can certainly do that. The short-term issue is we want to rework all uses of pinned vregs, so the current approach of naming r14 as such (via the corresponding pinned vreg) has to go, but there's nothing stopping us from doing the same thing in an SSA-friendly way. I might suggest we can do that in a followup PR though, unless @elliottt is willing to do it right away. (And to be clear, I spoke with Trevor yesterday and suggested the approach currently in the PR as a way to make progress for now, so I think waiting is fine.)
Sure, I agree, this can be fixed afterwards.
Actually, there seems to be a much simpler fix: just remove the
use
of the link register by the return instruction as far as regalloc is concerned. The return instruction is emitted after the epilog, where by definition all callee-saved registers must have the contents they had on function entry. As this includes the link register, we know it has the correct content and can just use it without involving the register allocator at all.There is another place that might cause issues for regalloc: the call instructions currently have a
def
of the link register. But as that value is never actually used by the caller, it doesn't really need to be adef
- it would suffice to have the link register as part of the clobbers of the call.
cfallin commented on issue #5172:
Ah, that is actually a far simpler approach! @elliottt would you be willing to implement this instead?
elliottt commented on issue #5172:
Ah, that is actually a far simpler approach! @elliottt would you be willing to implement this instead?
Absolutely! Thanks for the suggestion, @uweigand!
elliottt commented on issue #5172:
@uweigand I've implemented your suggestion in 57f0a6b.
uweigand commented on issue #5172:
@uweigand I've implemented your suggestion in 57f0a6b.
LGTM, thanks!
uweigand commented on issue #5172:
@uweigand, would you mind taking one last look over the s390x changes? I fixed the struct-return behavior, and would appreciate some additional eyes.
Still looks good to me, code quality seems unchanged compares to the last time I looked. Note that s390x doesn't actually require the struct-return behavior, that's an Intel issue. But from what I can see there, this also looks good.
Last updated: Dec 23 2024 at 12:05 UTC