jlb6740 edited PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
bnjbvr submitted PR Review.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Can you add a comment that states for which kind of instructions we'll use this (no need to enumerate precisely which they are, just which kind), and how it differs from XMM_RM_R, please?
bnjbvr created PR Review Comment:
Why does this sse opcode need to be present for this
RM/R
mode, as well as in an above variant forR/R
moves? I'm afraid this would mean that there are two possible Inst representations formovd xmm0, rax
, for instance, which is something to avoid in general (it makes pattern-matching more complicated, for instance).
bnjbvr created PR Review Comment:
Even if it is a bit redundant, how about deduplicating the bodies for these two variants? Then we could more effectively separate these by opcodes, and catch invalid (Inst kind, opcode) combinations.
bnjbvr created PR Review Comment:
Can you add a comment at the top of this sequence explaining how this is lowered, please? like:
mov 0x80000000, tregI64 movd tregI64, tregV128
etc
bnjbvr created PR Review Comment:
nit: since this is used only once, can you inline it at its use, instead, please?
bnjbvr created PR Review Comment:
It is totally fine to use another temp, because at this level we're operating on virtual registers; it is the register allocator's duty to try to coalesce those virtual registers into real registers. So feel free to use as many temporaries as you'll need.
bnjbvr created PR Review Comment:
nit: can this message be a bit more specific: "lowering for fcopysign f64"
bnjbvr created PR Review Comment:
A few questions:
- should this be F32 and not F64, since they only use the lower 32 bits?
- what do you think about naming these
scratch_gpr
,scratch_fp1
and so on, or simplytmp_gpr
,tmp_1
etc? Having the register class's name in the variable's name feels a bit misleading: one might think that a given register is a full SIMD one, while it's not quite true.- could one of these V128 temporaries be avoided if we'd use the destination operand as a temporary?
bnjbvr created PR Review Comment:
Can you teach the register allocator that Movaps is a move instruction, by adding it to the list of instructions matched in
is_move
in inst/mod.rs? (This is what enables register coalescing)
bnjbvr created PR Review Comment:
Can we reuse T1 as the destination here? It's dead after this use.
bnjbvr created PR Review Comment:
This function seems to be unused, can we remove it?
bnjbvr edited PR Review Comment.
bnjbvr edited PR Review Comment.
bnjbvr edited PR Review Comment.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Moves between GPRs and XMM regs (bit-preserving or with FP/Int conversions) don't fit cleanly in either the GPR-instruction or SSE*-instruction categories. Maybe cleanest to create a new top level category, or maybe two (one that does conversions, one that just copies?) with their own opcode enumerations, and the usual RM->R structure.
bnjbvr edited PR Review Comment.
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Why does this sse opcode need to be present for this
RM/R
mode, as well as in an above variant forR/R
moves? I'm afraid this would mean that there are two possible Inst representations formovd xmm0, rax
, for instance, which is something to avoid in general (it makes pattern-matching more complicated, for instance).Hi @bnjbvr ... It doesn't and #1838 removes the R_R. Perhaps I should just combine these two patches? The R_R was following the initial convention here but I thought these move instructions really should fit under a structure with "RM" in the name which would also by doing so would characterize these move instructions more consistently with how they are described in the arch guides (specifically the Op/En columns).
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Hi .. yes, this is to satisfy dependencies w.r.t. the regalloc (maybe regusagecollector?) which needs to understand that we are writing a destination registers and not modifying it (meaning a valid value already existed there). It was when I was running into runtime errors that I realized the purpose of the _r_r structure was to be able to make this distinction which should apply to moves in general.
The difference is in the code below in `fn x64_map_regs"
Inst::XMM_MOV_RM_R {
op: _,
ref mut src,
ref mut dst,
} => {
src.map_uses(mapper);
map_def(mapper, dst);
}
"
Where in XMM_RM_R instead of map_def we have a call to map_mod.Johnnie
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Ok :+1: . Did you think we should fold in rhs as well? None of the other implementations such as Opcode::Iconst that don't fold things in so there would be some inconsistency with the other matches?
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
This is used in mod.rs and the impact comes during testing. We have no way to specify a 32-bit gpr register so for example instead of RegMem::reg(eax) what is being done in other gpr instruction tests is to use RegMem::reg(rax) but then create a parameter that sets the is_64 bool to false. So to print out ether movl vs movq the reg used is first passed in rbx but the bool determines whether rax or eax is printed.
// Mov_R_R insns.push(( Inst::mov_r_r(false, rbx, w_rsi), "89DE", "movl %ebx, %esi", )); ... insns.push(( Inst::mov_r_r(true, rbx, w_rsi), "4889DE", "movq %rbx, %rsi", ));
I too started with this parameter for xmm_mov_rm_r when I realized I needed to have the distinction for proper testing, but it was somewhat redundant and heavyweight to add this parameter everywhere as the opcode itself already specifies the operand size. That's the point of this function but right now it is only called in one place in mod.rs:
![image](https://user-images.githubusercontent.com/45402135/85072498-12e8ff80-b16e-11ea-842a-4b5fc129b7b5.png)
jlb6740 edited PR Review Comment.
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Moves between GPRs and XMM regs (bit-preserving or with FP/Int conversions) don't fit cleanly in either the GPR-instruction or SSE*-instruction categories. Maybe cleanest to create a new top level category, or maybe two (one that does conversions, one that just copies?) with their own opcode enumerations, and the usual RM->R structure.
Ok, thanks. Will think about this a little before the next push.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
I've refactored the lowering a bit in consideration of avoiding one of the temporaries as mentioned above. I didn't see more opportunity here after those changes, but perhaps there are still some things I could with dst that I didn't realize?
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Ok, thanks for the explanation! Didn't see this use there.
bnjbvr submitted PR Review.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Thanks for the explanation; can you add a small code comment that sums it up, above the definition of this enum variant, please?
bnjbvr created PR Review Comment:
I think it's reasonable to keep the lhs/rhs variables, since they give slightly more context to what they represent, while
w64
's value was less clear, in my opinion. I am not sure what you mean by "264", though, feel free to do it too if it makes sense to you!
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
nit: this can be removed now :)
bnjbvr created PR Review Comment:
Looks great, thank you!
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Meant w64 not 264.
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
@julian-seward1 I haven't addressed your comment in code yet and was thinking about it for a separate patch from this one if possible. In that patch we could create another enum called something like SseDtMixedOpcode (standing for sse data transfer mixed register opcode or something distinguishing like that) to show that the operand's registers aren't all the same category. However doing that seems to not do much more than organize the instructions at a finer granularity of SSE instruction which then opens the door for many more categories at that same granularity (for example: xmm only data transfer instructions, sse scalar arithmetic, packed arithmetic, sse comparison, sse conversion, etc) which maybe has the downfall of organizing a lot of new data structures that instead of simplifying, just creates more logic that is duplicated across lowering. I'd be happy to do that though as it does give more organization which in general feels like a good thing, but we also could just keep the one SSE instruction category which maybe simplifies things (distinguishing only from GPR, AVX, etc):
![image](https://user-images.githubusercontent.com/45402135/85326562-58fdd600-b482-11ea-93b3-e3bc9cd130c6.png)
Any more thoughts here?
jlb6740 edited PR Review Comment.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
I agree with Johnnie here. In general, I even think we could move away from the Vcode Inst variant names being low-level descriptors of the kind of instructions, and could generally name them around what they do; the categories mentioned by @jlb6740 in the previous comment make a lot of sense to that effect:
xmm_data_transfer
instead ofXMM_Mov_RM_R
,alu_arithmetic
, etc.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
My view is, the further you move the insn names away from the underlying structure of the insn set (eg
xmm_data_transfer
, etc) the harder it is to follow, since readers then have to guess or infer which set of insnsxmm_data_transfer
maps to.maybe has the downfall of organizing a lot of new data structures that instead of simplifying,
I'm not sure I understand what is meant here. Do you mean, it creates a lot of new top level
Inst
cases? To be concrete, all I was suggesting, for the cross-register-set insns, was adding two newInst
cases, somewhat like this:new Inst case // XMM <-> GPR moves Inst::Mov_XMM_GPR { toXMM: bool, // true => gpr->xmm, false => xmm->gpr gpr: Reg, // src or dst xmm: Reg, // dst or src is64: bool, // 32 or 64 bit transfer? We don't care about anything else laneNo: u8 // in the XMM; it may well be that this isn't even necessary } new Inst case // Floating point conversions Inst::Cvt_XMM_GPR { gpr: Reg, // src or dst xmm: Reg, // dst or src how: SseFICvtOp, // what the conversion is rmode: RMode // rounding mode to use, if relevant and // if the insn supports direct specification of rounding modes } enum SSeFICvtOp { 32UtoF32, // needs rounding 32StoF32, // needs rounding 32UtoF64, // no loss of precision 32StoF64, // no loss of precision 64StoF64, // needs rounding 64UtoF64, // needs rounding // All the following involve rounding (RM) F32toU32, F32toS32, F64toU32, F64toU32, F64toS64, F64toU64, } enum RMode { Zero, // round to zero, PlusInf, // round to +Inf, MinusInf, // round to -Inf, Nearest // round to nearest, MXCSR // round per mxcsr.RM }
I'm not claiming that all of those insns exist or make sense. But I do think these two would cover all xmm-gpr moves that we care about generating. For example,
CVTSD2SI is Cvt_XMM_GPR { gpr:_, xmm:_, how: F64toS64, rmode: MXCSR } CVTTSD2SI is the same, but with fixed rmode: is Cvt_XMM_GPR { gpr:_, xmm:_, how: F64toS64, rmode: Zero }
Plain XMM-to-XMM moves might want to stay in their own Inst:: case, since they have special significance to the register allocator. I wasn't suggesting any further subdivision of
Inst
.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
My view is, the further you move the insn names away from the underlying structure of the insn set (eg xmm_data_transfer, etc) the harder it is to follow, since readers then have to guess or infer which set of insns xmm_data_transfer maps to.
I kind of disagree with this. The useful information in
xmm_mov_rm_r
is really the part aboutxmm
(this is for float registers) and aboutmov
(this is for moves); the rest of it requires to be aware of the x86 jargon, and doesn't bring new information compared to what's is in theInst
variant struct fields. Also, in my opinion trying to enforce a 1:1 mapping on Inst name to the list of instructions that use it is the best way to make sure that comments and code stay out of sync over a long period of time. It also makes the code generating these Inst variants very hard to read, because this naming hides what's really interesting at the lowering level (= the semantics) by giving a preference to the implementation detail (= the encoding).
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
@julian-seward1 I thought you were suggesting too to change the "enum SseOpcode" in x64/inst/args.rs but I guess that wasn't the case and you were only referring to the instruction enums?
maybe has the downfall of organizing a lot of new data structures that instead of simplifying,
I'm not sure I understand what is meant here. Do you mean, it creates a lot of new top level
Inst
cases?Sorry, I wasn't clear here. I was wondering if having whichever extra structures needed ... including the ones you quickly put together above, adds more logic to consider or patch and to eventually get "out of sync" as @bnjbvr points out. The advantage that see is that it feels like it organizes things so that they are more orderly, but not sure if it truly does do that or if it just hurts readability by giving you that many more puzzle pieces to reason and figure out how the lowering is taking place. The more code to break, the more puzzle pieces (structs, match cases, etc) to consider, is what I meant by complexity. So while I agree with @bnjbvr who I think summed my thoughts better, I do wonder if there is more benefit to having this organization than I initially see and if it the long run we'd regret not having that finer granularity of lowering detail. Frankly, I'll be happy to implement either way ... and not sure who will make that call since there is a difference of opinion here, but I am thinking any change could be saved for a separate pr?
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
jlb6740 edited PR Review Comment.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
I agree with @jlb6740, and I think that this is something we can easily revisit afterwards if it appears to be more convenient to split it up, so let's keep it as is in the meanwhile.
bnjbvr submitted PR Review.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
This comment seems to be incorrect, can you revisit it, please?
bnjbvr created PR Review Comment:
ditto for
regE
toreg_e
bnjbvr created PR Review Comment:
nit: can you rename
regE
toreg_e
please?
bnjbvr created PR Review Comment:
typo:
computation
bnjbvr created PR Review Comment:
pre-existing: can you rename
srcE
tosrc_e
please?
bnjbvr created PR Review Comment:
(I've got a large patch that makes this more uniform, so let's do it here too) Can you use this syntax to omit the unneeded fields:
XMM_MOV_RM_R { src, dst, .. } =>
bnjbvr created PR Review Comment:
nit: can you rename
srcE
tosrc_e
please?
bnjbvr created PR Review Comment:
ditto here
bnjbvr created PR Review Comment:
Unfortunately, we can't put
movd
here because it goes from one register class to another, and this may largely confuse register allocation. Can you remove it, please?
jlb6740 submitted PR Review.
jlb6740 created PR Review Comment:
Note ... there will be an inconsistency here w.r.t. to some other cases in this match. Will probably want to revisit those and make sure it is all consistent. For this patch will not touch those other cases as I think initially there may have been a subtlety behind choosing regE vs reg_e for naming that we should make sure is consistent everywhere.
jlb6740 updated PR #1837 from vcode-add-fcopy-sign
to master
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 merged PR #1837.
Last updated: Nov 22 2024 at 16:03 UTC