evacchi opened PR #8365 from evacchi:winch-fpu
to bytecodealliance:main
:
- Add some of the instructions in #8321
- Introduces preliminary support to FPU registers to regalloc
- Introduces tests for the instructions above (ported over from x64)
- Fixes a few typos.
This is a draft with a handful of tests for
float_add{32,64}
(LGTM but another pair of eyes would help), I will add the remaining later.
github-actions[bot] commented on PR #8365:
Subscribe to Label Action
cc @saulecabrera
<details>
This issue or pull request has been labeled: "winch"Thus the following users have been cc'd because of the following labels:
- saulecabrera: winch
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
evacchi updated PR #8365.
evacchi edited PR #8365:
- Add some of the instructions in #8321
- Introduces preliminary support to FPU registers to regalloc
- Introduces tests for the instructions above (ported over from x64)
- Fixes a few typos.
This is a draft ~with a handful of tests for
float_add{32,64}
(LGTM but another pair of eyes would help), I will add the remaining later~. Added the remaining tests.
saulecabrera submitted PR review:
Thanks for the PR; the general direction looks good to me. I left a couple of comments that I think need to be resolved before landing this work.
saulecabrera submitted PR review:
Thanks for the PR; the general direction looks good to me. I left a couple of comments that I think need to be resolved before landing this work.
saulecabrera created PR review comment:
I believe here we need to properly exclude the designated float scratch register?
pub(crate) const NON_ALLOCATABLE_FPR: u32 = 1 << float_scratch().hw_enc(); /// Bitmask to represent the available general purpose registers pub(crate) const ALL_FPR: u32 = u32::MAX & !NON_ALLOCATABLE_FPR;
saulecabrera created PR review comment:
I was under the impression that we could emit
fmov sd, sn
(the non-64-bit case) according to https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/FMOV--register---Floating-point-Move-register-without-conversion-?lang=en, but maybe we'd need to implement this in Cranelift to make it possible. Could you a TODO comment here? I'll follow-up on this one.
saulecabrera created PR review comment:
/// Bitmask to represent the available floating point registers.
evacchi submitted PR review.
evacchi created PR review comment:
whoops!
evacchi submitted PR review.
evacchi created PR review comment:
oh good point!
evacchi submitted PR review.
evacchi created PR review comment:
sure I'll add a TODO!
If I understand correctly, the optimizing backend always uses F64 unless it needs V128:
ISLE:
(decl fpu_move (Type Reg) Reg) (rule (fpu_move _ src) (let ((dst WritableReg (temp_writable_reg $I8X16)) (_ Unit (emit (MInst.FpuMove128 dst src)))) dst)) (rule 1 (fpu_move (fits_in_64 _) src) (let ((dst WritableReg (temp_writable_reg $F64)) (_ Unit (emit (MInst.FpuMove64 dst src)))) dst))
generated code:
// Generated as internal constructor for term fpu_move. pub fn constructor_fpu_move<C: Context>(ctx: &mut C, arg0: Type, arg1: Reg) -> Reg { let v7 = C::fits_in_64(ctx, arg0); if let Some(v8) = v7 { let v10 = C::temp_writable_reg(ctx, F64); let v11 = MInst::FpuMove64 { rd: v10, rn: arg1 }; let v12 = C::emit(ctx, &v11); let v13 = C::writable_reg_to_reg(ctx, v10); // Rule at src/isa/aarch64/inst.isle line 2251. return v13; } ... // v128
evacchi updated PR #8365.
evacchi updated PR #8365.
evacchi updated PR #8365.
evacchi has marked PR #8365 as ready for review.
evacchi requested wasmtime-core-reviewers for a review on PR #8365.
evacchi requested fitzgen for a review on PR #8365.
evacchi requested wasmtime-compiler-reviewers for a review on PR #8365.
saulecabrera submitted PR review.
saulecabrera merged PR #8365.
Last updated: Jan 24 2025 at 00:11 UTC