cfallin opened PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested bnjbvr and julian-seward1 for a review on PR #2538.
cfallin requested bnjbvr and julian-seward1 for a review on PR #2538.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
These and a few other methods are the same in all cases. Please move them to a single unconditional impl block.
cfallin updated PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin updated PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done!
bjorn3 submitted PR Review.
bnjbvr submitted PR Review.
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
nit: and
VALUE_REGS_PARTShere too?
bnjbvr created PR Review Comment:
This unconditionally does four tests, but the invariant is that if part
nis invalid, then partkfork>=nare invalid too. So using explicit branches with short-circuiting might be better (and penalize the "general" case of having a 1:1 mapping less), at the cost of slightly more verbosity. Of course if rustc can vectorize this, please ignore me :-)
bnjbvr created PR Review Comment:
Does it require a different R type?
bnjbvr created PR Review Comment:
nit:
VALUE_REGS_PARTShere instead of 4?
bnjbvr created PR Review Comment:
Is the above clone actually necessary, if we just
iter(and notinto_iter) here?
bnjbvr created PR Review Comment:
This comment needs pluralization too.
bnjbvr created PR Review Comment:
Are these adds guaranteed to not overflow? (and in
offsettoo?)
bnjbvr created PR Review Comment:
Maybe rename
into_regtointo_regs? Below it's a bit weird thatregsgoes intointo_reg(singular).
bnjbvr created PR Review Comment:
nit: this comment needs an update
bnjbvr created PR Review Comment:
(Second time I see an instance of this, so asking, but it might be false.) Could it be that the
reg_classarg is now always ignored, since the type is used instead? If so, we could change the signature ofgen_constantand all its callers to reflect this.
bnjbvr created PR Review Comment:
/me wears the premature optimization hat
Could we use
SmallVec<[M::I; VALUE_REGS_PARTS]>here, and for the other helpers (and some of their callers too)? It documents the 1:1 mapping between move/load/store Inst and real reg present in the value-regs, and may even reduce the number of dynamic allocations.
bnjbvr created PR Review Comment:
I imagine this is implemented in a subsequent PR?
bnjbvr created PR Review Comment:
Maybe add an assert that the value doesn't overflow the i64 range, in case we'd support i128 in the future? (unless this is already correctly replaced in a subsequent PR)
bnjbvr created PR Review Comment:
Design thought: would it make sense to have a
alloc_singlereg_tmphelper that avoids the constructing of anOptionand unwrapping it, and returns only a single reg without using theValueRegsconstruct? This line being repeated so many times makes me wary of the penalty cost of this multi-regs-per-value machinery for the general case where it's unused...
bnjbvr created PR Review Comment:
This change might not be needed. (Can't wait for removing the above
allow(dead_code)once and for all!)
bnjbvr created PR Review Comment:
preexisting: There should be only one valid reference type per arch, can you cause the
R32arm to panic?
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
For the 4 element case:
pub struct ValueRegs { parts: [u32; 4], } impl ValueRegs { /// Return the number of registers used. pub fn len(self) -> usize { // If rustc/LLVM is smart enough, this might even be vectorized... (self.parts[0] != !0) as usize + (self.parts[1] != !0) as usize + (self.parts[2] != !0) as usize + (self.parts[3] != !0) as usize } }.LCPI0_0: .quad 1 .quad 1 playground::ValueRegs::len: movdqu (%rdi), %xmm0 pcmpeqd %xmm1, %xmm1 pcmpeqd %xmm1, %xmm0 pxor %xmm1, %xmm0 pshufd $246, %xmm0, %xmm1 movdqa .LCPI0_0(%rip), %xmm2 pand %xmm2, %xmm1 pshufd $212, %xmm0, %xmm0 pand %xmm2, %xmm0 paddq %xmm1, %xmm0 pshufd $78, %xmm0, %xmm1 paddq %xmm0, %xmm1 movq %xmm1, %rax retqFor the 2 element case:
pub struct ValueRegs { parts: [u32; 2], } impl ValueRegs { /// Return the number of registers used. pub fn len(self) -> usize { // If rustc/LLVM is smart enough, this might even be vectorized... (self.parts[0] != !0) as usize + (self.parts[1] != !0) as usize } }playground::ValueRegs::len: xorl %eax, %eax cmpl $-1, %edi setne %al movabsq $-4294967296, %rcx cmpq %rcx, %rdi adcq $0, %rax retq
bnjbvr submitted PR Review.
bnjbvr created PR Review Comment:
Even better, for n=2 everything is in registers; scalar replacement of aggregates FTW \o/
cfallin submitted PR Review.
cfallin created PR Review Comment:
Thanks @bjorn3 for showing the assembly here. I think that, yes, we want to avoid control flow and let the compiler use its cleverness here.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Yes, in some cases -- for example going from
ValueRegs<Reg>toValueRegs<Writable<Reg>>or vice-versa.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Yes, unfortunately, as
self.emit()needs&mut selfandself.retval_regs.iter()holds a&selffor the duration of the loop.
cfallin submitted PR Review.
cfallin created PR Review Comment:
I believe so; the largest type we support a load/store for is 16 bytes, so to overflow the
i64we would need to generate 2^60 instructions; we would OOM the address space before we overflowed.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Indeed!
cfallin submitted PR Review.
cfallin created PR Review Comment:
There were occasional merge conflicts on these type lists before and I think it's cleaner just to import
types::*here as we do in some other files; updated.
cfallin updated PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR Review.
cfallin created PR Review Comment:
Yes, good point; I went ahead and carried this refactor through all the backends.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done!
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done, sort of; this type is exposed up to the ABI trait and I'm not sure I want to expose a detail like "max number of value registers per value" there, but it does seem reasonable to pick a small-ish SmallVec size and standardize it; so I went through the ABI code with a
SmallVec<[Inst; 4]>-shaped paintbrush using a newSmallInstVec<Inst>typedef.
cfallin updated PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done!
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done (here and x64)!
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done!
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done.
cfallin submitted PR Review.
cfallin created PR Review Comment:
Done!
cfallin updated PR #2538 from multi-reg-framework to main:
This will allow for support for
I128values everywhere, andI64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about aValueresiding in more than
one register.This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin merged PR #2538.
Last updated: Jan 09 2026 at 13:15 UTC