Stream: git-wasmtime

Topic: wasmtime / PR #1605 Stub of a new isel backend for x64


view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 16:52):

bnjbvr opened PR #1605 from rebase-x64 to master:

This is a draft-ish PR containing a rebase of Julian's x64 work on the new isel backend, including quite a few refactorings of mine. During today's Cranelift meeting, we agreed that it might make sense to add the incomplete backend here, allowing incremental development by more contributors.

I've tried to split the files in a way that was similar to the aarch64 backend, but some things might still be wobbly; I might check this a bit further more tomorrow. Of course, that's also something we could do as follow-up PRs.

The new x64 backend is controlled by a new Cargo feature x64 (which also requires x86). Then, users of clif-util can use the --target x86_64 argument, and:

Second (small) commit is what allows passing ISA-specific flags to --set, which wasn't implemented yet.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:02):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:02):

bjorn3 created PR Review Comment:

fn xmm(num: u8) -> Reg {
    fpr(num, 14 + num)
}

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:03):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:03):

bjorn3 created PR Review Comment:

Please return the original triple as passed to isa_builder.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:06):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:06):

bjorn3 created PR Review Comment:

    if isa_flags.use_new_backend() {
        #[cfg(feature = "x64")]
        {
            use super::x64;
            return x64::isa_builder(triple).finish(shared_flags)
        }
        #[cfg(not(feature = "x64"))]
        {
            panic!("new backend x86 support not included by cargo features!");
        }
    } else {
        Box::new(Isa {
            triple,
            isa_flags,
            shared_flags,
            cpumode: level1,
        })
    }

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:08):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:08):

bjorn3 created PR Review Comment:

There is already a way to have target specific settings: target riscv32 supports_m=1

https://github.com/bytecodealliance/wasmtime/blob/d6b158992697546dfda982495bfa7572668a9888/cranelift/filetests/filetests/isa/riscv/expand-i32.clif#L4

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:11):

bjorn3 created PR Review Comment:

/// `simm32` is its sign-extension out to 64 bits.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:11):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:12):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:12):

bjorn3 created PR Review Comment:

    /// the signless, non-extending (N x N -> N, for N in {32,64}) variant
    Mul,

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:12):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:12):

bjorn3 created PR Review Comment:

/// These indicate ways of extending (widening) a value, using the Intel
/// naming: B(yte) = u8, W(ord) = u16, L(ong)word = u32, Q(uad)word = u64

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:15):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:15):

bjorn3 created PR Review Comment:

    /// byte -> long word
    BL,
    /// byte -> quad word
    BQ,
    /// word -> long word
    WL,
    /// word -> quad word
    WQ,
    /// long -> quad word
    LQ,

Also is there no byte -> word variant?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:18):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:18):

bjorn3 created PR Review Comment:

    /// overflow
    O = 0,
    /// no overflow
    NO = 1,
    /// <u
    B = 2,
    /// >= u
    NB = 3,
    /// zero
    Z = 4,
    /// not zero
    NZ = 5,
    /// <=u
    BE = 6,
    /// >u
    NBE = 7,
    /// negative
    S = 8,
    /// not negative
    NS = 9,
    /// <s
    L = 12,
    /// >=s
    NL= 13,
    /// <=s
    LE = 14,
    /// >s
    NLE = 15,

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:18):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:18):

bjorn3 created PR Review Comment:

        self as u8

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:19):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:19):

bjorn3 created PR Review Comment:

#[derive(Copy, Clone)]
#[repr(u8)]

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:19):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:19):

bjorn3 created PR Review Comment:

            CC::O => CC::NO,
            CC::NO => CC::O,

            CC::B => CC::NB,
            CC::NB => CC::B,

            CC::Z => CC::NZ,
            CC::NZ => CC::Z,

            CC::BE => CC::NBE,
            CC::NBE => CC::BE,

            CC::S => CC::NS,
            CC::NS => CC::S,

            CC::L => CC::NL,
            CC::NL => CC::L,

            CC::LE => CC::NLE,
            CC::NLE => CC::LE,

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:21):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:21):

bjorn3 created PR Review Comment:

// Don't build these directly.  Instead use the `Inst::*` functions to create them.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:22):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:22):

bjorn3 created PR Review Comment:

    Ret,

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:22):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:22):

bjorn3 created PR Review Comment:

Why? You can just emit multiple nop instructions, right?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:26):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:26):

bjorn3 created PR Review Comment:

                s + &format!("{nil: <width$}", nil = "", width = need)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:26):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:26):

bjorn3 created PR Review Comment:

                _ => panic!("Inst(x64).show.suffixBWLQ (size={})", size),

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 created PR Review Comment:

    // For registers that appear in a read role, apply `pre_map`.  For those
    // that appear in a write role, apply `post_map`.  For registers that

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 created PR Review Comment:

    // Note that this function must be closely coordinated with `fn regs`, in
    // the sense that each arm "agrees" with the one in `fn regs` about which

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 created PR Review Comment:

            apply_map(dst, pre_map); // yes, really `pre_map`

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:28):

bjorn3 created PR Review Comment:

        // conceivably use `movl %reg, %reg` to zero out the top 32 bits of

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:29):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:29):

bjorn3 created PR Review Comment:

            collector.add_use(*dst); // yes, really `add_use`

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:29):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:29):

bjorn3 created PR Review Comment:

    // serious error in the allocator or in our `get_regs` function.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:30):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:30):

bjorn3 created PR Review Comment:

// For an instruction that has as operands a register `encG` and a memory
// address `memE`, create and emit, first the REX prefix, then caller-supplied
// opcode byte(s) (`opcodes` and `numOpcodes`), then the MOD/RM byte, then
// optionally, a SIB byte, and finally optionally an immediate that will be
// derived from the `memE` operand.  For most instructions up to and including
// SSE4.2, that will be the whole instruction.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 created PR Review Comment:

/// Get the encoding number from something which we sincerely hope is a real
/// register of class I64.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 created PR Review Comment:

/// Emit the REX prefix byte even if it appears to be redundant (== 0x40).

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 created PR Review Comment:

/// Set the W bit in the REX prefix to zero.  By default it will be set to 1,
/// indicating a 64-bit operation.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:31):

bjorn3 created PR Review Comment:

/// Add an 0x66 (operand-size override) prefix.  This is necessary to indicate
/// a 16-bit operation.  Normally this will be used together with F_CLEAR_REX_W.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 created PR Review Comment:

// then the caller should pass `opcodes` == 0xF3_0F_27 and `numOpcodes` == 3.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 created PR Review Comment:

// The register operand is represented here not as a `Reg` but as its hardware
// encoding, `encG`.  `flags` can specify special handling for the REX prefix.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 created PR Review Comment:

This should all be a doc comment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 created PR Review Comment:

    // General comment for this function: the registers in `memE` must be

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 created PR Review Comment:

    // expression.  But `encG` can be derived from a register of any class.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:32):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 created PR Review Comment:

/// This is the core 'emit' function for instructions that do not reference
/// memory.
///
/// This is conceptually the same as
/// emit_REX_OPCODES_MODRM_SIB_IMM_encG_memE, except it is for the case
/// where the E operand is a register rather than memory.  Hence it is much
/// simpler.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 created PR Review Comment:

/// These are merely wrappers for the above two functions that facilitate passing
/// actual `Reg`s rather than their encodings.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:33):

bjorn3 created PR Review Comment:

    // JRS FIXME 2020Feb07: this should really just be `regEnc` not `iregEnc`

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:34):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:34):

bjorn3 created PR Review Comment:

    // JRS FIXME 2020Feb07: these should really just be `regEnc` not `iregEnc`

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:34):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:34):

bjorn3 created PR Review Comment:

/// Write a suitable number of bits from an imm64 to the sink.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:36):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:36):

bjorn3 created PR Review Comment:

/// The top-level emit function.
///
/// Important!  Do not add improved (shortened) encoding cases to existing
/// instructions without also adding tests for those improved encodings.  That
/// is a dangerous game that leads to hard-to-track-down errors in the emitted
/// code.
///
/// For all instructions, make sure to have test coverage for all of the
/// following situations.  Do this by creating the cross product resulting from
/// applying the following rules to each operand:
///
/// (1) for any insn that mentions a register: one test using a register from
///     the group [rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi] and a second one
///     using a register from the group [r8, r9, r10, r11, r12, r13, r14, r15].
///     This helps detect incorrect REX prefix construction.
///
/// (2) for any insn that mentions a byte register: one test for each of the
///     four encoding groups [al, cl, dl, bl], [spl, bpl, sil, dil],
///     [r8b .. r11b] and [r12b .. r15b].  This checks that
///     apparently-redundant REX prefixes are retained when required.
///
/// (3) for any insn that contains an immediate field, check the following
///     cases: field is zero, field is in simm8 range (-128 .. 127), field is
///     in simm32 range (-0x8000_0000 .. 0x7FFF_FFFF).  This is because some
///     instructions that require a 32-bit immediate have a short-form encoding
///     when the imm is in simm8 range.
///
/// Rules (1), (2) and (3) don't apply for registers within address expressions
/// (`Addr`s).  Those are already pretty well tested, and the registers in them
/// don't have any effect on the containing instruction (apart from possibly
/// require REX prefix bits).
///
/// When choosing registers for a test, avoid using registers with the same
/// offset within a given group.  For example, don't use rax and r8, since they
/// both have the lowest 3 bits as 000, and so the test won't detect errors
/// where those 3-bit register sub-fields are confused by the emitter.  Instead
/// use (eg) rax (lo3 = 000) and r9 (lo3 = 001).  Similarly, don't use (eg) cl
/// and bpl since they have the same offset in their group; use instead (eg) cl
/// and sil.
///
/// For all instructions, also add a test that uses only low-half registers
/// (rax .. rdi, xmm0 .. xmm7) etc, so as to check that any redundant REX
/// prefixes are correctly omitted.  This low-half restriction must apply to
/// _all_ registers in the insn, even those in address expressions.
///
/// Following these rules creates large numbers of test cases, but it's the
/// only way to make the emitter reliable.
///
/// Known possible improvements:
///
/// * there's a shorter encoding for shl/shr/sar by a 1-bit immediate.  (Do we
///   care?)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 created PR Review Comment:

// See comments at the top of `fn x64_emit` for advice on how to create

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 created PR Review Comment:

// (cd cranelift/codegen && \

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 17:38):

bjorn3 created PR Review Comment:

            // This insn is 6 bytes long.  Currently `offset` is relative to

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 18:21):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 18:21):

iximeow created PR Review Comment:

encoding-wise, movzx and movsx have such an encoding with 660f{b6,be}. For example, movzx ax, cl // 66 0f b6 c1. I dunno if we're just not using them though.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 19:49):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 19:49):

iximeow created PR Review Comment:

            // Baldrdash expects the stack to take at least the number of words set in

Or maybe I'm misreading :)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 20:16):

iximeow created PR Review Comment:

just making sure I'm following the intended direction: this will eventually have an if call_conv.extends_windows_fastcall() arm like the existing x86 abi code, correct?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 20:16):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 23:29):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2020 at 23:29):

iximeow created PR Review Comment:

                sink.put1(mkModRegRM(1, encG & 7, 5)); // could/?should? be encE & 7 for consistency
                sink.put1(0x25);
                sink.put1((simm32 & 0xFF) as u8);

and similar for !low8WillSXto32() would encode the RBP/R13 variants that are currently missing here right? is there a wider reason this is panic!, or just "not gotten to it yet"?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 00:17):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 00:17):

iximeow created PR Review Comment:

This means we can't encode SHR 0, %rax (sure fine) but more importantly that attempting to emit a SHR 0, %rax gets you SHR %cl, %rax instead! My understanding is that this is a level below anything a user would see from building a function in Cranelift IR, so it's still unlikely to trip across this, correct?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:17):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:17):

julian-seward1 created PR Review Comment:

I propose not to generate 16 bit operations (addw, etc) to the extent that it's possible to avoid them. They put us at risk of partial register stalls. Instead I plan to do as much 16-bit work as possible at 32 bit width, hence avoiding the stalls and also the need for the 66 prefix.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:23):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:23):

julian-seward1 created PR Review Comment:

Is this really a win? IMO it wastes a lot of vertical screen space for minimal gain. Sure, we can rustdoc CC itself, but I think the benefit of documenting the meaning of each individual item is not worth the loss of screen space.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:25):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:25):

julian-seward1 created PR Review Comment:

I'd prefer to retain the original. That makes it easy to figure out the encoding for each condition. With self as u8 you have to count through the definition to find the encoding.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:30):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:30):

iximeow created PR Review Comment:

that sounds reasonable to me (pending if/when we might also optimize for size at least)

Do you know if decode stalls from 66-prefixed instructions with immediates are still a concern? At the least it might be helpful to note what is _intentionally_ not implemented, for future readers.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:32):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:32):

julian-seward1 created PR Review Comment:

Hmm, I thought I had exhaustively tested this in the encoding test cases (there are 100+ amode encoding tests). So I'm not convinced there's a missing case (am not convinced there isn't a missing case, either). It's simple enough to shake out, by checking the encoding test cases.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:41):

iximeow submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:41):

iximeow created PR Review Comment:

Ah, you're right - the initial checks against R{BP,13} had me looking for an else mentioning them, but they're encoded through the if low8willSXto32(*simm32) && encE != regs::ENC_RSP && encE != regs::ENC_R12 case. There are a few tests for these such as for 127(%rbp).

How do you feel about upgrading this to an unreachable!?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:48):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:48):

julian-seward1 created PR Review Comment:

Yes, it's the instruction selector's responsibility not to try to produce shr $0, %reg, but that's fine -- it can just emit no-instruction if it needs to do that -- which would in itself reflect incomplete cleanup at the CLIR level. I guess it would be conceptually cleaner and safer change this so that the shift amount is an Option<u8> though; maybe we should do that.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 06:48):

julian-seward1 edited PR Review Comment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:03):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:03):

julian-seward1 created PR Review Comment:

Multiple single-byte nop insns, you mean? Sure, you can do that, but that wastes front end resources. Intel defines "official" no-ops with lengths at least 1 to 14 and maybe more (I forget).

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:04):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:04):

julian-seward1 created PR Review Comment:

I don't know the details (Benjamin surely will), but no doubt the intent exists to make it possible to support windows-specific ABIs as necessary.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:05):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:05):

julian-seward1 created PR Review Comment:

I would be the first to admit that the logic is pretty hard to follow there. Er, if unreachable! is the right thing there, then use that. Just so long as it really is unreachable.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:06):

julian-seward1 edited PR Review Comment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:13):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:13):

julian-seward1 created PR Review Comment:

I don't know about decode stalls offhand; we'd have to consult the latest incarnation of the Intel "Optimisation Reference Manual". Note though that I was speaking above of dispatch stalls, not decode stalls -- I forgot there might also be decode limitations till you mentioned it.

Yes, documenting the general intention to avoid generating 16-bit operations as much as possible would be a good thing to do.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:15):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 08:15):

bjorn3 created PR Review Comment:

The previous suggestion explicitly assigns a discriminant to every variant. Otherwise this wouldn't even be correct, as 10 and 11 are skipped.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:31):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:31):

bnjbvr created PR Review Comment:

The function I'm modifying is called by the CLI flags parser, though; is there a way to pass target-specific flags through the CLI that I missed?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:33):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:33):

bnjbvr created PR Review Comment:

Agreed with Julian, there's an official selection of nop instructions for different sizes.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:35):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 28 2020 at 10:35):

bnjbvr created PR Review Comment:

Yes, that's the idea! We'd pass the CallConv, or we'd select different impl in the caller according to the given CallConv here, as done in x86/abi.rs.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:02):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:02):

bnjbvr created PR Review Comment:

I'll punt on this one, it does break the symmetry with GPR, and it may require a bunch of changes all over the place.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:05):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:05):

bnjbvr created PR Review Comment:

I'll assume no, then :)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:18):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:18):

bnjbvr created PR Review Comment:

I heard screen space is free, so I'll take it!

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:23):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:23):

bnjbvr created PR Review Comment:

nice

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 15:37):

bnjbvr updated PR #1605 from rebase-x64 to master:

This is a draft-ish PR containing a rebase of Julian's x64 work on the new isel backend, including quite a few refactorings of mine. During today's Cranelift meeting, we agreed that it might make sense to add the incomplete backend here, allowing incremental development by more contributors.

I've tried to split the files in a way that was similar to the aarch64 backend, but some things might still be wobbly; I might check this a bit further more tomorrow. Of course, that's also something we could do as follow-up PRs.

The new x64 backend is controlled by a new Cargo feature x64 (which also requires x86). Then, users of clif-util can use the --target x86_64 argument, and:

Second (small) commit is what allows passing ISA-specific flags to --set, which wasn't implemented yet.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:08):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:08):

bjorn3 created PR Review Comment:

it may require a bunch of changes all over the place.

These functions are not exported from this module.

I'll punt on this one, it does break the symmetry with GPR

Fair enough

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:10):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:10):

bjorn3 created PR Review Comment:

/// Actually codegens an instruction's results into registers.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:10):

bjorn3 deleted PR Review Comment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:16):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:16):

bjorn3 created PR Review Comment:

What about a return in the middle of a function? The current code would just fall through to the next instruction without ever inserting a return, right?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:17):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:17):

bjorn3 created PR Review Comment:

                    // `VMContext` is `r14` in Baldrdash.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:29):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 16:29):

bjorn3 created PR Review Comment:

    pub(crate) fn get_enc(self) -> u8 {
        self as u8

CC is copy

view this post on Zulip Wasmtime GitHub notifications bot (Apr 30 2020 at 18:37):

abrown submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 09:55):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 09:55):

julian-seward1 created PR Review Comment:

I'm sure this is handled correctly, although I am not exactly sure how. The new backend and register allocators are block-order invariant; they can be presented with the blocks in any order; they don't care. After regalloc is done, a final block order is chosen, branches which would be fallthroughs are removed, empty blocks are removed, and the prologue/epilogue machinery identify the entry/exit blocks and add the relevant extra insns.

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 09:57):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 09:57):

bjorn3 created PR Review Comment:

I mean how does it know that this is an exit block if there is no ret emitted?

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 10:21):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 01 2020 at 10:21):

julian-seward1 created PR Review Comment:

You'd have to ask Chris or Benjamin for a definite answer, but .. I'm pretty sure it's because the higher level steering machinery in the new BE knows which CLIR-level blocks are exit blocks, so it knows which machine code blocks are exit blocks.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 13:32):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 13:32):

bnjbvr created PR Review Comment:

Ha, interesting. Indeed it lacks the emission of Ret. This bug wouldn't be observable in spidermonkey since it uses the FallthroughRet, which means wasm returns in the middle of a function are translated into a jump to a single return block at the end of the function.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 13:58):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 13:58):

bnjbvr created PR Review Comment:

Taking this back: the actual Ret opcode for this is generated in machinst/lower.rs, in gen_retval_setup.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 14:00):

bnjbvr updated PR #1605 from rebase-x64 to master:

This is a draft-ish PR containing a rebase of Julian's x64 work on the new isel backend, including quite a few refactorings of mine. During today's Cranelift meeting, we agreed that it might make sense to add the incomplete backend here, allowing incremental development by more contributors.

I've tried to split the files in a way that was similar to the aarch64 backend, but some things might still be wobbly; I might check this a bit further more tomorrow. Of course, that's also something we could do as follow-up PRs.

The new x64 backend is controlled by a new Cargo feature x64 (which also requires x86). Then, users of clif-util can use the --target x86_64 argument, and:

Second (small) commit is what allows passing ISA-specific flags to --set, which wasn't implemented yet.

view this post on Zulip Wasmtime GitHub notifications bot (May 05 2020 at 14:35):

bnjbvr merged PR #1605.


Last updated: Nov 22 2024 at 16:03 UTC