jameysharp opened PR #8292 from jameysharp:specialize-fp-offset
to bytecodealliance:main
:
The StackAMode::FPOffset address mode was always used together with fp_to_arg_offset, to compute addresses within the current stack frame's argument area.
Instead, introduce a new StackAMode::ArgOffset variant specifically for stack addresses within the current frame's argument area. The details of how to find the argument area are folded into the conversion from the target-independent StackAMode into target-dependent address-mode types.
Currently, fp_to_arg_offset returns a target-specific constant, so I've preserved that constant in each backend's address-mode conversion.
However, in general the location of the argument area may depend on calling convention, flags, or other concerns. Also, it may not always be desirable to use a frame pointer register as the base to find the argument area. I expect some backends will eventually need to introduce new synthetic addressing modes to resolve argument-area offsets after register allocation, when the full frame layout is known.
I also cleaned up a couple minor things while I was in the area:
- Determining argument extension type was written in a confusing way and also had a typo in the comment describing it.
- riscv64's AMode::offset was only used in one place and is clearer when inlined.
jameysharp requested abrown for a review on PR #8292.
jameysharp requested wasmtime-compiler-reviewers for a review on PR #8292.
abrown submitted PR review:
Nice clean up!
abrown submitted PR review:
Nice clean up!
abrown created PR review comment:
.expect("Offset in ArgOffset is greater than 2GB; should hit implementation limit first");
bjorn3 submitted PR review.
bjorn3 created PR review comment:
The
+ 16
to compute the final frame pointer offset can now overflow too, right?
jameysharp submitted PR review.
jameysharp created PR review comment:
Good catches! With regard to overflow, I'm moving this
+16
inside thei32::try_from
so the add happens ati64
instead. That makes its overflow behavior the same as current, which already had uncheckedi64
addition. Similarly, the aarch64 and riscv64 targets are still doing the addition ati64
after this patch, and s390x doesn't add anything so can't overflow.
jameysharp updated PR #8292.
jameysharp has enabled auto merge for PR #8292.
jameysharp merged PR #8292.
Last updated: Nov 22 2024 at 16:03 UTC