Stream: git-wasmtime

Topic: wasmtime / PR #2267 Fix AArch64 ABI to respect half-calle...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 01:47):

cfallin opened PR #2267 from fix-aarch64-abi to main:

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Requires bytecodealliance/regalloc.rs#100; this PR refers to my fork of
regalloc.rs until that PR is reviewed, merged and included in a new
release.

Fixes #2254.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 01:47):

cfallin requested julian-seward1 for a review on PR #2267.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:04):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:04):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:04):

bnjbvr created PR Review Comment:

uber-nit: can you join the two cases, please? &Inst::Call { ref info } | &Inst::CallInd { ref info }

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:04):

bnjbvr created PR Review Comment:

This rule about caller- vs callee- saved is actually generic over the targets, so we could also apply it for x64, right?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:04):

bnjbvr created PR Review Comment:

Maybe worth adding a back reference to the other comment (that references this one) around callee-saved in call sites?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:31):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:31):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:31):

julian-seward1 created PR Review Comment:

Similar naming concern here. Also, for good performance, this needs to be inlined, no?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 10:31):

julian-seward1 created PR Review Comment:

This all is reasonable (and as we discussed and agreed), but it has the side effect that I am not entirely happy with the fact that the name of this function doesn't make clear that it now only pertains to call sites. Would you be prepared to consider the following changes (or something modulo them, so as to increase clarity for the reader) ?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 18:17):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 18:17):

cfallin created PR Review Comment:

Unfortunately they're different types (CallInfo vs. CallIndInfo) so we can't unify the patterns here :-/

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:39):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:39):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:43):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:43):

cfallin created PR Review Comment:

Indeed it is, although we get less benefit (only the compile-time improvement, no generated-code change) because x64 doesn't have half-caller/half-callee saves. Since it adds overhead to Inst::CallKnown/Inst::CallUnknown (we don't out-of-line the fields yet) and there are issues with the size of Inst, I'm inclined not to go down this path, at least at first. Can certainly look at this in a subsequent PR though!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:46):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:46):

cfallin created PR Review Comment:

Verified that inlining occurs; rustc seems to be doing the right thing here. (It's statically monomorphized so I expect it always would as long as someone does not inflate the body to some huge size...).

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:53):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 19:53):

cfallin created PR Review Comment:

Re: naming: were you thinking something more specific to calls, as above, or something else?

I put a bit of thought into this and I'm not sure that I like including "call instruction", as the concept is more general. (In principle the MachInst layer doesn't even know about the concept of a call instruction.) The concept really is just "should we aggregate this instruction's effects into the summary of clobbered registers". Any other suggestions?

Or if we do decide that this is specific to calls, I think I'd prefer is_call_instruction_with_ignored_clobbers(). I could go with this; it just seems a bit non-orthogonal, is all.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 20:21):

cfallin updated PR #2267 from fix-aarch64-abi to main:

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Requires bytecodealliance/regalloc.rs#100; this PR refers to my fork of
regalloc.rs until that PR is reviewed, merged and included in a new
release.

Fixes #2254.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 20:23):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 20:23):

cfallin created PR Review Comment:

Ah, I like this much better! The ABIMachineSpec now has get_regs_clobbered_by_call(), and aarch64 has is_reg_saved_in_prologue(), get_regs_saved_in_prologue(), and is_reg_clobbered_by_call().

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 21:16):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 21:16):

julian-seward1 created PR Review Comment:

Ah, leave it as it is then. That the other names have been improved is good enough for me. Thanks for doing that.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 21:18):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 21:44):

cfallin updated PR #2267 from fix-aarch64-abi to main:

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Requires bytecodealliance/regalloc.rs#100; this PR refers to my fork of
regalloc.rs until that PR is reviewed, merged and included in a new
release.

Fixes #2254.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 22:37):

cfallin edited PR #2267 from fix-aarch64-abi to main:

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes #2254.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Oct 06 2020 at 22:42):

cfallin merged PR #2267.


Last updated: Dec 23 2024 at 12:05 UTC