Stream: git-wasmtime

Topic: wasmtime / PR #2538 Multi-register value support: framewo...


view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 06:06):

cfallin opened PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 06:06):

cfallin requested bnjbvr and julian-seward1 for a review on PR #2538.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 06:06):

cfallin requested bnjbvr and julian-seward1 for a review on PR #2538.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 19:02):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 19:02):

bjorn3 created PR Review Comment:

These and a few other methods are the same in all cases. Please move them to a single unconditional impl block.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 20:53):

cfallin updated PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 20:59):

cfallin updated PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 21:00):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 21:00):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 04 2021 at 21:48):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

nit: and VALUE_REGS_PARTS here too?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

This unconditionally does four tests, but the invariant is that if part n is invalid, then part k for k>=n are invalid too. So using explicit branches with short-circuiting might be better (and penalize the "general" case of having a 1:1 mapping less), at the cost of slightly more verbosity. Of course if rustc can vectorize this, please ignore me :-)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Does it require a different R type?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

nit: VALUE_REGS_PARTS here instead of 4?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Is the above clone actually necessary, if we just iter (and not into_iter) here?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

This comment needs pluralization too.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Are these adds guaranteed to not overflow? (and in offset too?)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Maybe rename into_reg to into_regs? Below it's a bit weird that regs goes into into_reg (singular).

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

nit: this comment needs an update

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

(Second time I see an instance of this, so asking, but it might be false.) Could it be that the reg_class arg is now always ignored, since the type is used instead? If so, we could change the signature of gen_constant and all its callers to reflect this.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

/me wears the premature optimization hat

Could we use SmallVec<[M::I; VALUE_REGS_PARTS]> here, and for the other helpers (and some of their callers too)? It documents the 1:1 mapping between move/load/store Inst and real reg present in the value-regs, and may even reduce the number of dynamic allocations.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

I imagine this is implemented in a subsequent PR?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Maybe add an assert that the value doesn't overflow the i64 range, in case we'd support i128 in the future? (unless this is already correctly replaced in a subsequent PR)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

Design thought: would it make sense to have a alloc_singlereg_tmp helper that avoids the constructing of an Option and unwrapping it, and returns only a single reg without using the ValueRegs construct? This line being repeated so many times makes me wary of the penalty cost of this multi-regs-per-value machinery for the general case where it's unused...

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

This change might not be needed. (Can't wait for removing the above allow(dead_code) once and for all!)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:40):

bnjbvr created PR Review Comment:

preexisting: There should be only one valid reference type per arch, can you cause the R32 arm to panic?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:53):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 15:53):

bjorn3 created PR Review Comment:

For the 4 element case:

pub struct ValueRegs {
    parts: [u32; 4],
}

impl ValueRegs {
    /// Return the number of registers used.
    pub fn len(self) -> usize {
        // If rustc/LLVM is smart enough, this might even be vectorized...
        (self.parts[0] != !0) as usize
            + (self.parts[1] != !0) as usize
            + (self.parts[2] != !0) as usize
            + (self.parts[3] != !0) as usize
    }
}
.LCPI0_0:
    .quad   1
    .quad   1

playground::ValueRegs::len:
    movdqu  (%rdi), %xmm0
    pcmpeqd %xmm1, %xmm1
    pcmpeqd %xmm1, %xmm0
    pxor    %xmm1, %xmm0
    pshufd  $246, %xmm0, %xmm1
    movdqa  .LCPI0_0(%rip), %xmm2
    pand    %xmm2, %xmm1
    pshufd  $212, %xmm0, %xmm0
    pand    %xmm2, %xmm0
    paddq   %xmm1, %xmm0
    pshufd  $78, %xmm0, %xmm1
    paddq   %xmm0, %xmm1
    movq    %xmm1, %rax
    retq

For the 2 element case:

pub struct ValueRegs {
    parts: [u32; 2],
}

impl ValueRegs {
    /// Return the number of registers used.
    pub fn len(self) -> usize {
        // If rustc/LLVM is smart enough, this might even be vectorized...
        (self.parts[0] != !0) as usize
            + (self.parts[1] != !0) as usize
    }
}

playground::ValueRegs::len:
    xorl    %eax, %eax
    cmpl    $-1, %edi
    setne   %al
    movabsq $-4294967296, %rcx
    cmpq    %rcx, %rdi
    adcq    $0, %rax
    retq

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 16:26):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 16:26):

bnjbvr created PR Review Comment:

Even better, for n=2 everything is in registers; scalar replacement of aggregates FTW \o/

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 23:04):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 23:04):

cfallin created PR Review Comment:

Thanks @bjorn3 for showing the assembly here. I think that, yes, we want to avoid control flow and let the compiler use its cleverness here.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 23:05):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2021 at 23:05):

cfallin created PR Review Comment:

Yes, in some cases -- for example going from ValueRegs<Reg> to ValueRegs<Writable<Reg>> or vice-versa.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:07):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:07):

cfallin created PR Review Comment:

Yes, unfortunately, as self.emit() needs &mut self and self.retval_regs.iter() holds a &self for the duration of the loop.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:10):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:10):

cfallin created PR Review Comment:

I believe so; the largest type we support a load/store for is 16 bytes, so to overflow the i64 we would need to generate 2^60 instructions; we would OOM the address space before we overflowed.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:12):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 00:12):

cfallin created PR Review Comment:

Indeed!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:19):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:19):

cfallin created PR Review Comment:

There were occasional merge conflicts on these type lists before and I think it's cleaner just to import types::* here as we do in some other files; updated.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:21):

cfallin updated PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:21):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:21):

cfallin created PR Review Comment:

Yes, good point; I went ahead and carried this refactor through all the backends.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:21):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:21):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:30):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:30):

cfallin created PR Review Comment:

Done, sort of; this type is exposed up to the ABI trait and I'm not sure I want to expose a detail like "max number of value registers per value" there, but it does seem reasonable to pick a small-ish SmallVec size and standardize it; so I went through the ABI code with a SmallVec<[Inst; 4]>-shaped paintbrush using a new SmallInstVec<Inst> typedef.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:30):

cfallin updated PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done (here and x64)!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:31):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 01:45):

cfallin updated PR #2538 from multi-reg-framework to main:

This will allow for support for I128 values everywhere, and I64
values on 32-bit targets (e.g., ARM32 and x86-32). It does not alter the
machine backends to build such support; it just adds the framework for
the MachInst backends to reason about a Value residing in more than
one register.

This is a finalized version of the framework part of the draft PR (#2504); I128
operator implementations will come in a followup PR.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2021 at 17:58):

cfallin merged PR #2538.


Last updated: Jan 09 2026 at 13:15 UTC