elliottt opened PR #8383 from elliottt:trevor/tail-calls-riscv64
to bytecodealliance:main
:
To mirror the implementation in the x64 backend, switch to eagerly reserving enough space in the incoming argument area for any tail call present in the function being compiled.
NOTE: this doesn't make the change to enable callee-save registers with the tail calling convention on aarch64, but that will be a relatively small change following this PR.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
github-actions[bot] commented on PR #8383:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "isle"Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
elliottt updated PR #8383.
elliottt updated PR #8383.
elliottt edited PR #8383:
To mirror the implementation in the x64 backend, switch to eagerly reserving enough space in the incoming argument area for any tail call present in the function being compiled.
The diff here is large due to the change of representation for incoming arguments in the riscv64 specific
AMode
type. Now that we reference incoming arguments with a negative offset from the beginning of the incoming argument area, all references to incoming arguments changed in the riscv64 filetests.NOTE: this doesn't make the change to enable callee-save registers with the tail calling convention on aarch64, but that will be a relatively small change following this PR.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt edited PR #8383:
To mirror the implementation in the x64 backend, switch to eagerly reserving enough space in the incoming argument area for any tail call present in the function being compiled.
The diff here is large due to the change of representation for incoming arguments in the riscv64 specific
AMode
type. Now that we reference incoming arguments with a negative offset from the beginning of the incoming argument area, all references to incoming arguments changed in the riscv64 filetests. These would have ended up being no-op changes in the disassembly, though I took this opportunity to make them SP relative now, to move closer to frame pointers not being strictly required for tail calls.NOTE: this doesn't make the change to enable callee-save registers with the tail calling convention on aarch64, but that will be a relatively small change following this PR.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt has marked PR #8383 as ready for review.
elliottt requested fitzgen for a review on PR #8383.
elliottt requested wasmtime-compiler-reviewers for a review on PR #8383.
elliottt requested jameysharp for a review on PR #8383.
elliottt requested afonso360 for a review on PR #8383.
elliottt edited PR #8383:
To mirror the implementation in the x64 backend, switch to eagerly reserving enough space in the incoming argument area for any tail call present in the function being compiled.
The diff here is large due to the change of representation for incoming arguments in the riscv64 specific
AMode
type. Now that we reference incoming arguments with a negative offset from the beginning of the incoming argument area, all references to incoming arguments changed in the riscv64 filetests. These would have ended up being no-op changes in the disassembly, though I took this opportunity to make them SP relative now, to move closer to frame pointers not being strictly required for tail calls.As this is the last backend that was waiting to be migrated to the approach of resizing the incoming argument area in the function prologue, I've also deleted the shared code in
machinst/abi.rs
that's no longer needed.NOTE: this doesn't make the change to enable callee-save registers with the tail calling convention on aarch64, but that will be a relatively small change following this PR.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
elliottt submitted PR review.
elliottt created PR review comment:
Maybe this would be better written as
fp(incoming_arg_area - {offset})
?
elliottt edited PR review comment.
afonso360 submitted PR review:
:wave: Hey,
I don't think I can do a very good review here, I don't remember a lot about how our frames are set up and haven't been following the rest of the tail calls work very closely. So I'd prefer if someone else could review this.
afonso360 submitted PR review:
:wave: Hey,
I don't think I can do a very good review here, I don't remember a lot about how our frames are set up and haven't been following the rest of the tail calls work very closely. So I'd prefer if someone else could review this.
afonso360 created PR review comment:
Is this island check now done in some common code, or do we no longer need it? I remember having some issues with missing islands when we enabled compressed jumps since with the compressed extensions we have a very short jump range.
elliottt submitted PR review.
elliottt created PR review comment:
Thanks for catching this, that check isn't being done in any generic code. I'll add this back in!
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
I'd also take
"-{}(incoming_arg)"
to signal that the offset is subtracted from the base. I don't think it's worth bikeshedding this very much but I think clarifying that it's a subtraction would help with readability.
jameysharp submitted PR review.
jameysharp created PR review comment:
On further inspection, there is generic code in
VCode::emit
that emits islands based on the backend'sInst::worst_case_size()
estimate. Bothgen_epilogue
and theseReturnCall
/ReturnCallInd
pseudo-instructions need to fit within that worst-case size. These pseudo-instructions might be slightly longer thangen_epilogue
right now but they're almost the same, so I think they don't need an explicit island check.
elliottt created PR review comment:
I did some more refactoring, and the resulting code generated for a
return_call
orreturn_call_indirect
should now be pretty close to the same size asgen_epilogue
. Lots of chatting with @jameysharp has me convinced that we shouldn't need to check for island emission at this point anymore, and I've removed the manual resets at the call-sites foremit_return_call_common_sequence
to ensure that we trigger an assert if this isn't true.
elliottt submitted PR review.
elliottt updated PR #8383.
elliottt updated PR #8383.
elliottt updated PR #8383.
elliottt updated PR #8383.
elliottt updated PR #8383.
jameysharp submitted PR review:
I'm convinced this is all correct. Ship it!
elliottt merged PR #8383.
Last updated: Jan 24 2025 at 00:11 UTC