Stream: git-wasmtime

Topic: wasmtime / issue #5172 Generate SSA code from returns


view this post on Zulip Wasmtime GitHub notifications bot (Nov 02 2022 at 20:30):

uweigand commented on issue #5172:

Additionally, this PR marks r14 on s390x as non-allocatable, as it's used as the link register in the return instruction.

This is really unfortunate, in general we'll get better code by allowing r14 to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address from r14 into a vreg at the args instruction, and then passing that vreg back as argument into the return instruction?

view this post on Zulip Wasmtime GitHub notifications bot (Nov 02 2022 at 20:41):

cfallin commented on issue #5172:

Additionally, this PR marks r14 on s390x as non-allocatable, as it's used as the link register in the return instruction.

This is really unfortunate, in general we'll get better code by allowing r14 to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address from r14 into a vreg at the args instruction, and then passing that vreg back as argument into the return instruction?

Right, and we can certainly do that. The short-term issue is we want to rework all uses of pinned vregs, so the current approach of naming r14 as such (via the corresponding pinned vreg) has to go, but there's nothing stopping us from doing the same thing in an SSA-friendly way. I might suggest we can do that in a followup PR though, unless @elliottt is willing to do it right away. (And to be clear, I spoke with Trevor yesterday and suggested the approach currently in the PR as a way to make progress for now, so I think waiting is fine.)

view this post on Zulip Wasmtime GitHub notifications bot (Nov 03 2022 at 09:23):

uweigand commented on issue #5172:

Additionally, this PR marks r14 on s390x as non-allocatable, as it's used as the link register in the return instruction.

This is really unfortunate, in general we'll get better code by allowing r14 to be allocatable inside the function (and other compilers do this). I guess modelling this fully correctly would require something like extracting the incoming return address from r14 into a vreg at the args instruction, and then passing that vreg back as argument into the return instruction?

Right, and we can certainly do that. The short-term issue is we want to rework all uses of pinned vregs, so the current approach of naming r14 as such (via the corresponding pinned vreg) has to go, but there's nothing stopping us from doing the same thing in an SSA-friendly way. I might suggest we can do that in a followup PR though, unless @elliottt is willing to do it right away. (And to be clear, I spoke with Trevor yesterday and suggested the approach currently in the PR as a way to make progress for now, so I think waiting is fine.)

Sure, I agree, this can be fixed afterwards.

Actually, there seems to be a much simpler fix: just remove the use of the link register by the return instruction as far as regalloc is concerned. The return instruction is emitted after the epilog, where by definition all callee-saved registers must have the contents they had on function entry. As this includes the link register, we know it has the correct content and can just use it without involving the register allocator at all.

There is another place that might cause issues for regalloc: the call instructions currently have a def of the link register. But as that value is never actually used by the caller, it doesn't really need to be a def - it would suffice to have the link register as part of the clobbers of the call.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 03 2022 at 16:56):

cfallin commented on issue #5172:

Ah, that is actually a far simpler approach! @elliottt would you be willing to implement this instead?

view this post on Zulip Wasmtime GitHub notifications bot (Nov 03 2022 at 17:41):

elliottt commented on issue #5172:

Ah, that is actually a far simpler approach! @elliottt would you be willing to implement this instead?

Absolutely! Thanks for the suggestion, @uweigand!

view this post on Zulip Wasmtime GitHub notifications bot (Nov 03 2022 at 18:18):

elliottt commented on issue #5172:

@uweigand I've implemented your suggestion in 57f0a6b.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 03 2022 at 19:18):

uweigand commented on issue #5172:

@uweigand I've implemented your suggestion in 57f0a6b.

LGTM, thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Nov 08 2022 at 19:08):

uweigand commented on issue #5172:

@uweigand, would you mind taking one last look over the s390x changes? I fixed the struct-return behavior, and would appreciate some additional eyes.

Still looks good to me, code quality seems unchanged compares to the last time I looked. Note that s390x doesn't actually require the struct-return behavior, that's an Intel issue. But from what I can see there, this also looks good.


Last updated: Jan 24 2025 at 00:11 UTC