bnjbvr opened PR #1995 from x64
to main
:
I've implemented basic support for Baldrdash so as to be able to run the extensive Spidermonkey test suite, and then could get back to fix a few integer operators implemented before. Also implemented most of the integer opcodes required for wasm; I could start to run successfully a few large wasm integer-only programs with wasmtime too.
I plan on enabling limited testing of wast files on CI soon, to catch future regressions in codegen.
Sorry about the large PR!
bnjbvr requested julian-seward1 for a review on PR #1995.
bnjbvr updated PR #1995 from x64
to main
:
I've implemented basic support for Baldrdash so as to be able to run the extensive Spidermonkey test suite, and then could get back to fix a few integer operators implemented before. Also implemented most of the integer opcodes required for wasm; I could start to run successfully a few large wasm integer-only programs with wasmtime too.
I plan on enabling limited testing of wast files on CI soon, to catch future regressions in codegen.
Sorry about the large PR!
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Rotate in shift enum ... making the naming a little inconsistent from the first 3? Perhaps a more generic name (and comment update) for the Enum to show that Rotate is also housed. Also for consistency could prepend shift to Left, RightZ, and RightS, i.e. ShiftLeft, ShiftRightZ, ShiftRightS.
jlb6740 deleted PR Review Comment.
jlb6740 submitted PR Review.
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Rotate in shift enum ... making the naming a little inconsistent from the first 3? Perhaps a more generic name (and comment update) for the Enum to show that Rotate is also housed. Also for consistency could prepend shift to Left, RightZ, and RightS, i.e. ShiftLeft, ShiftRightZ, ShiftRightS.
bnjbvr requested cfallin and julian-seward1 for a review on PR #1995.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
+1 for that suggestion.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Could you clarify the register usage in this comment a bit? Something like:
RDX:RAX <- RAX * rhs
.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Hmm,
gen_move
has its destination on the left? That feels inconsistent with the rest of the x64 stuff, which I thought had destinations as the rightmost argument. But maybe that's becausegen_move
is from the applies-to-all-targetstrait Inst
, right? I guess the use ofWritable
guarantees we're safe here.
julian-seward1 created PR Review Comment:
ReadOnlyGprRmR*
feels a bit cumbersome to me (and did when I looked at it some days ago). Would you consider insteadUnaryRmR
(and consistent renaming for the derived-from-it names?)
julian-seward1 submitted PR Review.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
nit-level, at best, but .. (eg)
*Extend32
doesn't make it obvious to the casual reader whether the 32 is the source or dest width. Would you consider addingTo
in front of the number? eg*ExtendTo32
?
cfallin submitted PR Review.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Add a comment giving the instruction this is emitting?
movq $imm64, %dst
or similar...
cfallin created PR Review Comment:
Perhaps for a future refactor, but IMHO we should move from
is_64
bools to a purpose-built enum.
cfallin created PR Review Comment:
Just a stylistic preference, but I tend to prefer panics for things like this, e.g.
_ => panic!("Int args only supported for SysV calling conventions")
or somesuch...(similar comment below)
cfallin created PR Review Comment:
Drive-by / pre-existing, but perhaps we could turn
is_div
andis_signed
into their own (combined) enum -- SDiv, UDiv, SRem, URem?
cfallin created PR Review Comment:
I know this is just following the aarch64 backend (so it's OK for now!), but I'm wondering if we should make the pinned register a flags option, and then set it explicitly in Baldrdash, to avoid the hardcoded reference to SpiderMonkey here. Perhaps for a future refactor?
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
I'm sure this works correctly, but it concerns me in two ways:
The text here is long and it's (almost) exactly a 6-way unrolled loop in the 64-bit case. Would it be possible to do this by having a loop here (in lower.rs, not the generated code) which reads a table of constants? In the style of https://sourceware.org/git/?p=valgrind.git;a=blob;f=VEX/priv/guest_amd64_toIR.c;h=fadf47d41d6ba6f63a5ca3800409b11da94c5f7e;hb=HEAD#l4948
I see an integer multiply at the end. In the sequences I'm familiar with, that doesn't appear. It's probably a 3-cycle latency, compared with the 1-cycle latency and parallelisable over #-available-basic-ALUs for a sequence using only and/or/add/sub/shifts.
Given that the current goal is to get correct x64 code generation landed, I'd be happy for these to be looked at in a followup. They don't need to block landing.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
+1 for that; please add comments always, indicating what insns are being emitted. Inferring from the opcodes is nearly impossible.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Also .. is there a typo in the commit message? "movaps" isn't the insn you mean AFAIK; that's an aligned SSE load (move-aligned-packed-singleprecision). Maybe you mean "movabsq" ?
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
If it's an outright bug that should never happen, then +1 for panicing rather than continuing (returning
false
or whatever). I'm not convinced that's the case here, though (can't infer)
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
I just want to check here that these changes to the register numbering don't change the order at which they appear in the final
RealRegUniverse
. (To be specific: you are not reordering registers within each reg-class, in the universe). Is that true? I ask because the ordering of regs within each reg class is relevant -- at least for RA/BT -- to whether caller- or callee-saved registers are used in preference, and that has an effect on the quality of the generated allocation.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
+1; losing
r15
in configurations where that isn't necessary would be unfortunate, given that there are only circa 12 int regs available for allocation anyway. And in the 32-bit x86 case (if we ever do that) it would be quite serious.
julian-seward1 submitted PR Review.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Please can this be upgraded to a release assert, or at least somehow be checked even in release builds? If for whatever reason
ty
ofI16
orI8
ever gets here, we will silently generate wrong code and it could be hard to track down.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Also .. for "Make sure to zero %rdx right before this instruction!" please add a line explaining why this is necessary and what will (or might) happen if you don't.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
CallConv::Fast
andCallConv::Cold
can also reach this I think.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
If you implement
Display
instead, you getToString
for free.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
It is already conditionally used as pinned register: https://github.com/bytecodealliance/wasmtime/pull/1995/files#diff-d4cd987e797b464cfda540d1f41fcfb3R200-R210
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Indeed,
gen_move
is on the target-independent trait. Maybe we can change this at some point, if the aarch64 backend agrees with the x64 convention?
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Yes! Recent PRs with
OperandSize
enum have inspired me... Added to the backlog!
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
+1 to what @bjorn3 said, its usage is opt-in, disabled by default.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
I've mostly ported what the old backend did, and I'm happy to revisit it later when this has landed. Thanks for the comments!
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
It's a long story, but... yes, this does put R15 at the end of the allocatable list.
The reason is the following: regalloc has the concept of overall allocatable registers. So if in certain modes (i.e. when the use_pinned_reg flag is true), we want to mark one register as allocatable or not, then it must be at the edge of the "allocatable" list, so at the end of its register class (otherwise its register index changes when the pinned reg is in use).
To keep BT happy with the order of caller or callee-saved registers, I guess we could have a different register index for r15, according to whether it's pinned or not. I think it might cause more churn for other registers (namely, those which will be placed after r15). So maybe this justifies a completely different
RegUniverse
of its own? If that's ok, i'd like to defer this. We may need having different RegUniverse for respecting the win64 calling convention.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
If that's ok, i'd like to defer this.
Oh, definitely defer. Let's get this landed and running first, then we can come back and optimise later. (Sorry .. I should have made this clearer in the comment)
bnjbvr updated PR #1995 from x64
to main
:
I've implemented basic support for Baldrdash so as to be able to run the extensive Spidermonkey test suite, and then could get back to fix a few integer operators implemented before. Also implemented most of the integer opcodes required for wasm; I could start to run successfully a few large wasm integer-only programs with wasmtime too.
I plan on enabling limited testing of wast files on CI soon, to catch future regressions in codegen.
Sorry about the large PR!
bnjbvr updated PR #1995 from x64
to main
:
I've implemented basic support for Baldrdash so as to be able to run the extensive Spidermonkey test suite, and then could get back to fix a few integer operators implemented before. Also implemented most of the integer opcodes required for wasm; I could start to run successfully a few large wasm integer-only programs with wasmtime too.
I plan on enabling limited testing of wast files on CI soon, to catch future regressions in codegen.
Sorry about the large PR!
bnjbvr merged PR #1995.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Well, aarch64 has destinations on the left. So it's never going to agree with x64 (the AT&T conventions, at least.).
Last updated: Dec 23 2024 at 12:05 UTC