Stream: git-wasmtime

Topic: wasmtime / PR #8407 x64: Use r10 for the indirect tail ca...


view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 23:58):

elliottt opened PR #8407 from elliottt:trevor/x64-fixed-return-call-ind-dest to bytecodealliance:main:

Use r10 for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register.

For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination, clobbering it when we restore callee-saves in the call to emit_return_call_common_sequence.

For winch calls, using r10 instead of r15 means that it's still possible to use the pinned register in combination with winch.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 23:58):

elliottt requested wasmtime-compiler-reviewers for a review on PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 23:58):

elliottt requested abrown for a review on PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 23:59):

elliottt requested fitzgen for a review on PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 23:59):

elliottt requested jameysharp for a review on PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 00:00):

elliottt created PR review comment:

I think that we could switch back to always using r11 for the temp, as this change means we'll no longer accidentally pick r11 for the destination.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 00:00):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 00:59):

jameysharp submitted PR review:

I think there are a couple of x86-specific challenges we still need to work through here.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 00:59):

jameysharp submitted PR review:

I think there are a couple of x86-specific challenges we still need to work through here.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 00:59):

jameysharp created PR review comment:

I don't think the RegMem::Mem case is safe here, because any registers in the address mode could themselves be callee-saved. We might need to require that ReturnCallUnknown is used only with a register, never a memory operand, unlike CallUnknown.

As for using r11 for the temporary register, I think we actually must do that. In general I'd rather let RA2 choose the temporary, but it's used after callee-saved registers are restored, so as it stands now we could clobber something we're supposed to have preserved.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:25):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:25):

elliottt created PR review comment:

I was wondering about this, especially since it's not clear to me why this is a RegMem and not a plain old Reg.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:29):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:29):

elliottt updated PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:35):

elliottt updated PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:36):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:36):

elliottt created PR review comment:

Actually, switching it to accepting a Reg was really easy, and the only place we construct this instruction we were always providing a register that we turned into a RegMem :+1:

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:38):

elliottt edited PR #8407:

Use r10 for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register. Additionally, always use r11 for the temporary that's used for moving the return address in emit_return_call_common_sequence, as it's also a caller-saved register.

For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination, clobbering it when we restore callee-saves in the call to emit_return_call_common_sequence.

For winch calls, using r10 instead of r15 means that it's still possible to use the pinned register in combination with winch.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:39):

elliottt edited PR #8407:

Use r10 for the destination of indirect return calls, and indirect calls using the winch calling convention, as it is a caller-saved register. Additionally, always use r11 for the temporary that's used for moving the return address in emit_return_call_common_sequence, as it's also a caller-saved register.

For tail calls, this ensures that we won't accidentally pick a callee-saved register for the destination or temporary, clobbering it when we restore callee-saves in the call to emit_return_call_common_sequence.

For winch calls, using r10 instead of r15 means that it's still possible to use the pinned register in combination with winch.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:44):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:44):

jameysharp created PR review comment:

In general it makes sense to me to allow an indirect call on x86 to sink a load, if that's where the callee came from.

We could even hit this case from Wasmtime, where an indirect call through a table gets the callee function pointer by loading it out of a struct in memory. (I think load sinking currently fails there because there's another load in between; we might be able to micro-optimize this for x86 by swapping the two loads.)

However, I think that's a code-size optimization we just can't safely support for tail calls.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 01:58):

jameysharp submitted PR review:

I believe this is correct, though again I hope @cfallin will look at it later.

I'm also noticing that our tail-call sequence, when we move the return address, is not a valid epilogue according to the Windows unwinding rules. That is probably not an issue since nothing should trigger unwinding there, but it's worth keeping in mind.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2024 at 16:03):

elliottt merged PR #8407.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 22 2024 at 19:36):

cfallin commented on PR #8407:

Likewise to the aarch64 PR, this looks right to me, and for the same reasons -- as you note in the comments, r10 is caller-saved (volatile) and it's also not used for an arg, so we're good to go!


Last updated: Dec 23 2024 at 12:05 UTC