bnjbvr edited PR #2004 from x64-fp
to main
:
Opening this PR as a WIP, to coordinate better with @jlb6740; this lacks proper testing right now. Only the last commit is relevant, the rest comes from #1995.
This implements the following:
- load/store for floating point values
- more binary and unary opcodes
- constant generation
bnjbvr updated PR #2004 from x64-fp
to main
:
Opening this PR as a WIP, to coordinate better with @jlb6740; this lacks proper testing right now. Only the last commit is relevant, the rest comes from #1995.
This implements the following:
- load/store for floating point values
- more binary and unary opcodes
- constant generation
bnjbvr updated PR #2004 from x64-fp
to main
:
Opening this PR as a WIP, to coordinate better with @jlb6740; this lacks proper testing right now. Only the last commit is relevant, the rest comes from #1995.
This implements the following:
- load/store for floating point values
- more binary and unary opcodes
- constant generation
julian-seward1 submitted PR Review.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
What does this function do / how does it help? I can't guess from the name, and reading the code doesn't help much. Can you add a comment? Maybe rename it?
julian-seward1 created PR Review Comment:
I look at this (the introduction of
is_int
) and imagine the day when this needs to be extended again to handle 128-bit vector registers. That day can't be far off. In that case evenis_int
will be inadequate, I guess. Would it be more future-proof to return aRegClass
instead of thebool
here?
julian-seward1 created PR Review Comment:
Hmm, this comment isn't really right. Is
XMM_RM_R
only used for scalar floating point? I would have thought it could be (and, eventually, will be) used for any SSE operation of the formR := OP(R, RM)
.
julian-seward1 created PR Review Comment:
nit:
XmmUnary
-->XmmUnaryRmR
julian-seward1 created PR Review Comment:
nit: "from float to reg/mem" isn't right. These are stores (they can't be anything else), and AFAIK they could also be stores of the entire vector register, not just the lowest F32 or F64 lane.
julian-seward1 created PR Review Comment:
As a suggestion, would you consider renaming
xmm_to_int
toxmm_to_gpr
(and the associated data structures?) The reason is that "xmm" is unambiguously an XMM register. But "int" isn't so clear; it could mean "integer register" (as you mean here) but it could also be interpreted as "integer value stored in some lane(s) of an XMM register"._to_gpr
removes that ambiguity.
julian-seward1 created PR Review Comment:
What happens to the upper 96/64 bits of the destination? I don't think it is important for code generation, but it would be nice if what happens was stated here.
julian-seward1 created PR Review Comment:
Just as a comment .. these assertions are good (I like assertions), but .. won't the
emit
function eventually assert/panic if the wrong class register is in an instruction field? I think that's true but I don't remember exactly. So I think you're safe even without the assert. (and if theemit
fns don't assert/panic in such circumstances, we should fix them so they do; that's the last line of defense against mistakes in register classes).
julian-seward1 created PR Review Comment:
On reading further down, I see this appears to be used to assert the register class. Would you consider renaming this to
assert_regclass_is
? I'd prefer to avoid the term "register type" since that's not really a concept we have; what we have are the concepts "register class" and "type of the value stored in the register", which may not be the same. Eg a F32 value in a V128-class register.
julian-seward1 created PR Review Comment:
isn't it the inverse-of-cc that needs to be printed here?
julian-seward1 created PR Review Comment:
We can probably special-case
0.0
usingxorpd
or whatever the official dependency-breaking generate-vector-zero insn is. But not now; we can come back to this.
julian-seward1 created PR Review Comment:
Whatever is chosen, make sure it's an officially-recognised dependency-breaking insn.
julian-seward1 created PR Review Comment:
Can this ever fail for valid input?
julian-seward1 created PR Review Comment:
Oh, you comment that just below :-)
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Oh, maybe that's a stupid idea, because F64 and F32s live in V128 class registers, and we need a way to distinguish between these three cases. So ignore this.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Indeed, done! I've also removed the use of a match to just use
if let ...
direct pattern matching, which reduces code indent by one tab.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
I renamed it XmmCmove for consistency, and precised that it applied only to scalars at this point; or we can just generalize it right now and use
movapd
consistently here, not sure...
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
The emit functions don't protect much against this, at this point. You're right it'd be better if they did, i'll make this in a subsequent PR!
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
good catch, thx
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
In theory, yes: the complex variants can take as many inputs as the user wants, and the documentation says it'll add all the inputs together to form the effective address. In practice,
postopt
I think is the only creator of these instruction variants, and will only use up to 3 inputs together, to map to what the x86 architecture can do.
bnjbvr updated PR #2004 from x64-fp
to main
:
Opening this PR as a WIP, to coordinate better with @jlb6740; this lacks proper testing right now. Only the last commit is relevant, the rest comes from #1995.
This implements the following:
- load/store for floating point values
- more binary and unary opcodes
- constant generation
bnjbvr merged PR #2004.
Last updated: Nov 22 2024 at 17:03 UTC