elliottt opened PR #8407 from elliottt:trevor/x64-fixed-return-call-ind-dest
to bytecodealliance:main
:
Use
r10
for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register.For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination, clobbering it when we restore callee-saves in the call to
emit_return_call_common_sequence
.For winch calls, using
r10
instead ofr15
means that it's still possible to use the pinned register in combination with winch.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt requested wasmtime-compiler-reviewers for a review on PR #8407.
elliottt requested abrown for a review on PR #8407.
elliottt requested fitzgen for a review on PR #8407.
elliottt requested jameysharp for a review on PR #8407.
elliottt created PR review comment:
I think that we could switch back to always using
r11
for the temp, as this change means we'll no longer accidentally pickr11
for the destination.
elliottt submitted PR review.
jameysharp submitted PR review:
I think there are a couple of x86-specific challenges we still need to work through here.
jameysharp submitted PR review:
I think there are a couple of x86-specific challenges we still need to work through here.
jameysharp created PR review comment:
I don't think the
RegMem::Mem
case is safe here, because any registers in the address mode could themselves be callee-saved. We might need to require thatReturnCallUnknown
is used only with a register, never a memory operand, unlikeCallUnknown
.As for using
r11
for the temporary register, I think we actually must do that. In general I'd rather let RA2 choose the temporary, but it's used after callee-saved registers are restored, so as it stands now we could clobber something we're supposed to have preserved.
elliottt submitted PR review.
elliottt created PR review comment:
I was wondering about this, especially since it's not clear to me why this is a
RegMem
and not a plain oldReg
.
elliottt edited PR review comment.
elliottt updated PR #8407.
elliottt updated PR #8407.
elliottt submitted PR review.
elliottt created PR review comment:
Actually, switching it to accepting a
Reg
was really easy, and the only place we construct this instruction we were always providing a register that we turned into aRegMem
:+1:
elliottt edited PR #8407:
Use
r10
for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register. Additionally, always user11
for the temporary that's used for moving the return address inemit_return_call_common_sequence
, as it's also a caller-saved register.For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination, clobbering it when we restore callee-saves in the call to
emit_return_call_common_sequence
.For winch calls, using
r10
instead ofr15
means that it's still possible to use the pinned register in combination with winch.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt edited PR #8407:
Use
r10
for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register. Additionally, always user11
for the temporary that's used for moving the return address inemit_return_call_common_sequence
, as it's also a caller-saved register.For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination or temporary, clobbering it when we restore callee-saves in the call to
emit_return_call_common_sequence
.For winch calls, using
r10
instead ofr15
means that it's still possible to use the pinned register in combination with winch.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
jameysharp submitted PR review.
jameysharp created PR review comment:
In general it makes sense to me to allow an indirect call on x86 to sink a load, if that's where the callee came from.
We could even hit this case from Wasmtime, where an indirect call through a table gets the callee function pointer by loading it out of a struct in memory. (I think load sinking currently fails there because there's another load in between; we might be able to micro-optimize this for x86 by swapping the two loads.)
However, I think that's a code-size optimization we just can't safely support for tail calls.
jameysharp submitted PR review:
I believe this is correct, though again I hope @cfallin will look at it later.
I'm also noticing that our tail-call sequence, when we move the return address, is not a valid epilogue according to the Windows unwinding rules. That is probably not an issue since nothing should trigger unwinding there, but it's worth keeping in mind.
elliottt merged PR #8407.
cfallin commented on PR #8407:
Likewise to the aarch64 PR, this looks right to me, and for the same reasons -- as you note in the comments, r10 is caller-saved (volatile) and it's also not used for an arg, so we're good to go!
Last updated: Dec 23 2024 at 12:05 UTC