Stream: git-wasmtime

Topic: wasmtime / issue #7147 riscv64: Suboptimal register alloc...


Wasmtime GitHub notifications bot (Oct 04 2023 at 10:47):

afonso360 opened issue #7147:

:wave: Hey,

This is something that I noticed while working on #7123.

.clif Test Case

test compile precise-output
set unwind_info=false
target riscv64

function %c_lh(i64) -> i16, i16 {
block0(v0: i64):
  v1 = load.i16 v0+0
  v2 = load.i16 v0+2
  return v1, v2
}

; VCode:
; block0:
;   lh a3,0(a0)
;   mv a4,a3
;   lh a1,2(a0)
;   mv a0,a4
;   ret
;
; Disassembled:
; block0: ; offset 0x0
;   lh a3, 0(a0)
;   mv a4, a3
;   lh a1, 2(a0)
;   mv a0, a4
;   ret

Steps to Reproduce

Expected Results

We could eliminate the first move by generating something along these lines:

; VCode:
; block0:
;   lh a3,0(a0)
;   lh a1,2(a0)
;   mv a0,a3
;   ret
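As a sanity check that dropping the intermediate move does not change the results, here is a minimal sketch that interprets both sequences on a toy register file. Everything in it (the `Inst` enum, `run`, `actual`, `expected`, and the sample memory bytes) is made up for illustration and is not Cranelift code; it only models `lh` and `mv`.

```rust
// Register numbers follow the RISC-V ABI names: a0 = x10, a1 = x11, and so on.
const A0: usize = 10;
const A1: usize = 11;
const A3: usize = 13;
const A4: usize = 14;

/// The two instruction kinds that appear in the listings (toy model).
#[derive(Clone, Copy)]
enum Inst {
    /// `lh rd, offset(base)`: load a sign-extended 16-bit halfword.
    Lh { rd: usize, offset: usize, base: usize },
    /// `mv rd, rs`: copy one register into another.
    Mv { rd: usize, rs: usize },
}

/// Run a straight-line sequence against little-endian memory, with a0
/// initially holding `addr`, and return the final `(a0, a1)` pair.
fn run(insts: &[Inst], mem: &[u8], addr: i64) -> (i64, i64) {
    let mut regs = [0i64; 32];
    regs[A0] = addr;
    for inst in insts {
        match *inst {
            Inst::Lh { rd, offset, base } => {
                let at = regs[base] as usize + offset;
                let half = i16::from_le_bytes([mem[at], mem[at + 1]]);
                regs[rd] = i64::from(half);
            }
            Inst::Mv { rd, rs } => regs[rd] = regs[rs],
        }
    }
    (regs[A0], regs[A1])
}

/// The sequence currently generated (two moves).
fn actual() -> Vec<Inst> {
    vec![
        Inst::Lh { rd: A3, offset: 0, base: A0 },
        Inst::Mv { rd: A4, rs: A3 },
        Inst::Lh { rd: A1, offset: 2, base: A0 },
        Inst::Mv { rd: A0, rs: A4 },
    ]
}

/// The sequence suggested above (one move).
fn expected() -> Vec<Inst> {
    vec![
        Inst::Lh { rd: A3, offset: 0, base: A0 },
        Inst::Lh { rd: A1, offset: 2, base: A0 },
        Inst::Mv { rd: A0, rs: A3 },
    ]
}

fn main() {
    // Arbitrary sample memory: halfwords 0x1234 and 0xfffe (which is -2).
    let mem = [0x34u8, 0x12, 0xfe, 0xff];
    let a = run(&actual(), &mem, 0);
    let e = run(&expected(), &mem, 0);
    assert_eq!(a, e);
    println!("both sequences return {:?}", a);
}
```

The only difference between the two is the number of instructions executed: the final move into a0 is still needed because a0 holds the base address until the second load has run.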

Actual Results

We get an extra move between loads.

Versions and Environment

Cranelift version or commit: main

Operating system: Linux

Architecture: RISC-V

Extra Info

VCode:

VCode {
  Entry block: 0
  v193 := v196
  v194 := v195
Block 0:
    (original IR block: block0)
    (instruction range: 0 .. 4)
  Inst 0: args v192=a0
  Inst 1: lh v196,0(v192)
  Inst 2: lh v195,2(v192)
  Inst 3: rets v193=a0 v194=a1
}

Wasmtime GitHub notifications bot (Oct 04 2023 at 10:47):

afonso360 added the cranelift:area:riscv64 label to Issue #7147.

Wasmtime GitHub notifications bot (Oct 04 2023 at 10:47):

afonso360 added the cranelift label to Issue #7147.

Wasmtime GitHub notifications bot (Oct 04 2023 at 10:47):

afonso360 commented on issue #7147:

I couldn't fit this in the initial issue due to a character limit.

<details>

<summary> Here's the full trace log during compilation: </summary>

afonso@DESKTOP-1AHKMV2:~/git/wasmtime/cranelift$ RUST_LOG=trace cargo run -- test ./lmao.clif
    Blocking waiting for file lock on build directory
    Finished dev [unoptimized + debuginfo] target(s) in 45.38s
     Running `/home/afonso/git/wasmtime/target/debug/clif-util test ./lmao.clif`
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 DEBUG cranelift_codegen::timing > timing: Starting Processing test file, (during <no pass>)
 INFO  cranelift_filetests::runone > ---
File: ./lmao.clif
 DEBUG cranelift_codegen::timing   > timing: Starting Parsing textual Cranelift IR, (during Processing test file)
 DEBUG cranelift_codegen::timing   > timing: Ending Parsing textual Cranelift IR: 0ms
 DEBUG cranelift_codegen::timing   > timing: Starting Verify Cranelift IR, (during Processing test file)
 DEBUG cranelift_codegen::timing   > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing   > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing   > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing   > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing   > timing: Ending Verify Cranelift IR: 0ms
 INFO  cranelift_filetests::subtest > Test: compile(%c_lh) riscv64
 DEBUG cranelift_codegen::timing    > timing: Starting Compilation passes, (during Processing test file)
 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::context   > Number of CLIF instructions to optimize: 3
 DEBUG cranelift_codegen::context   > Number of CLIF blocks to optimize: 1
 TRACE cranelift_codegen::context   > Optimizing (opt level None):
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 TRACE cranelift_codegen::legalizer > Pre-legalization function:
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 TRACE cranelift_codegen::legalizer > Post-legalization function:
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Remove unreachable blocks, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Remove unreachable blocks: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Remove constant phi-nodes, (during Compilation passes)
 DEBUG cranelift_codegen::remove_constant_phis > do_remove_constant_phis: done, 1 iters.   0 formals, of which 0 const.
 DEBUG cranelift_codegen::timing               > timing: Ending Remove constant phi-nodes: 0ms
 DEBUG cranelift_codegen::timing               > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing               > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing               > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing               > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing               > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing               > timing: Ending Verify Cranelift IR: 0ms
 TRACE cranelift_codegen::machinst::abi        > ABISig: sig Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } => args end = 3 rets end = 2
             arg stack = 0 ret stack = 0 stack_ret_arg = false
 TRACE cranelift_codegen::machinst::abi        > ABI: func signature Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast }
 TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: function body function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: BlockLoweringOrder {
    lowered_order: [
        Orig {
            block: block0,
        },
    ],
    lowered_succ_indices: [],
    lowered_succ_ranges: [
        (
            None,
            0..0,
        ),
    ],
    cold_blocks: {},
    indirect_branch_targets: {},
}
 TRACE cranelift_codegen::machinst::lower      > bb block0 param v0: regs ValueRegs { parts: [v192, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst0 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(0) }): result v1 regs ValueRegs { parts: [v193, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst1 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(2) }): result v2 regs ValueRegs { parts: [v194, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst0 has color 1
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst1 has color 2
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst2 has color 3
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > arg v0 used, old state Unused, new Once
 TRACE cranelift_codegen::machinst::lower      > arg v0 used, old state Once, new Multiple
 TRACE cranelift_codegen::machinst::lower      >  -> pushing args for v0 onto stack
 TRACE cranelift_codegen::machinst::lower      > arg v1 used, old state Unused, new Once
 TRACE cranelift_codegen::machinst::lower      > arg v2 used, old state Unused, new Once
 DEBUG cranelift_codegen::machinst::compile    > Number of CLIF instructions to lower: 3
 DEBUG cranelift_codegen::machinst::compile
[message truncated]

Wasmtime GitHub notifications bot (Oct 04 2023 at 10:52):

afonso360 added the cranelift:area:regalloc label to Issue #7147.

Wasmtime GitHub notifications bot (Oct 04 2023 at 16:20):

afonso360 commented on issue #7147:

Updated locally to the latest regalloc2 (7d90ca9) and it still produces the same output.

<details>

<summary> Trace log </summary>

     Running `/home/afonso/git/wasmtime/target/debug/clif-util test ./lmao.clif`
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg.
 DEBUG cranelift_codegen::timing > timing: Starting Processing test file, (during <no pass>)
 INFO  cranelift_filetests::runone > ---
File: ./lmao.clif
 DEBUG cranelift_codegen::timing   > timing: Starting Parsing textual Cranelift IR, (during Processing test file)
 INFO  file_per_thread_logger      > Set up logging; filename prefix is cranelift.dbg.
 INFO  file_per_thread_logger      > Set up logging; filename prefix is cranelift.dbg.
 DEBUG cranelift_codegen::timing   > timing: Ending Parsing textual Cranelift IR: 2ms
 DEBUG cranelift_codegen::timing   > timing: Starting Verify Cranelift IR, (during Processing test file)
 DEBUG cranelift_codegen::timing   > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing   > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing   > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing   > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing   > timing: Ending Verify Cranelift IR: 0ms
 INFO  cranelift_filetests::subtest > Test: compile(%c_lh) riscv64
 DEBUG cranelift_codegen::timing    > timing: Starting Compilation passes, (during Processing test file)
 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::context   > Number of CLIF instructions to optimize: 3
 DEBUG cranelift_codegen::context   > Number of CLIF blocks to optimize: 1
 TRACE cranelift_codegen::context   > Optimizing (opt level None):
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 TRACE cranelift_codegen::legalizer > Pre-legalization function:
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 TRACE cranelift_codegen::legalizer > Post-legalization function:
function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Remove unreachable blocks, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Ending Remove unreachable blocks: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing    > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing    > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing    > timing: Ending Verify Cranelift IR: 0ms
 DEBUG cranelift_codegen::timing    > timing: Starting Remove constant phi-nodes, (during Compilation passes)
 DEBUG cranelift_codegen::remove_constant_phis > do_remove_constant_phis: done, 1 iters.   0 formals, of which 0 const.
 DEBUG cranelift_codegen::timing               > timing: Ending Remove constant phi-nodes: 0ms
 DEBUG cranelift_codegen::timing               > timing: Starting Verify Cranelift IR, (during Compilation passes)
 DEBUG cranelift_codegen::timing               > timing: Starting Control flow graph, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing               > timing: Ending Control flow graph: 0ms
 DEBUG cranelift_codegen::timing               > timing: Starting Dominator tree, (during Verify Cranelift IR)
 DEBUG cranelift_codegen::timing               > timing: Ending Dominator tree: 0ms
 DEBUG cranelift_codegen::timing               > timing: Ending Verify Cranelift IR: 0ms
 TRACE cranelift_codegen::machinst::abi        > ABISig: sig Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } => args end = 3 rets end = 2
             arg stack = 0 ret stack = 0 stack_ret_arg = false
 TRACE cranelift_codegen::machinst::abi        > ABI: func signature Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast }
 TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: function body function %c_lh(i64) -> i16, i16 fast {
block0(v0: i64):
    v1 = load.i16 v0
    v2 = load.i16 v0+2
    return v1, v2
}

 TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: BlockLoweringOrder {
    lowered_order: [
        Orig {
            block: block0,
        },
    ],
    lowered_succ_indices: [],
    lowered_succ_ranges: [
        (
            None,
            0..0,
        ),
    ],
    cold_blocks: {},
    indirect_branch_targets: {},
}
 TRACE cranelift_codegen::machinst::lower      > bb block0 param v0: regs ValueRegs { parts: [v192, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst0 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(0) }): result v1 regs ValueRegs { parts: [v193, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst1 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(2) }): result v2 regs ValueRegs { parts: [v194, v2097151] }
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst0 has color 1
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst1 has color 2
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > bb block0 inst inst2 has color 3
 TRACE cranelift_codegen::machinst::lower      >  -> side-effecting; incrementing color for next inst
 TRACE cranelift_codegen::machinst::lower      > arg v0 used, old state Unused, new Once
 TRACE cranelift_codegen::machinst::lower      > arg v0 used, old state Once, new Multiple
 TRACE cranelift_codegen::machinst::lower      >  -> pushing args for v0 onto stack
 TRACE cranelift_codegen::machinst::lower      > arg v1 used, old state Unused, new Once
 TRACE cranelift_codegen::machinst::lower      > arg v2 used, old state Unused, new Once
 DEBUG cranelift_codegen::machinst::compile    > Number of CLIF instructions to lower: 3
 DEBUG cranelift_codegen::machinst::compile    > Number of CLIF blocks to lower: 1
 DEBUG cranelift_codegen::timing               > timing: Starting VCode lowering, (
[message truncated]

Wasmtime GitHub notifications bot (Oct 04 2023 at 16:20):

afonso360 edited a comment on issue #7147:

Tested locally with the latest regalloc2 (7d90ca9) and it still produces the same output.

Wasmtime GitHub notifications bot (Oct 04 2023 at 21:42):

elliottt commented on issue #7147:

@cfallin and I looked through this, and the problem appears to be the current splitting heuristic in RA2: it splits at the location of the first conflict, which happens to be the first load, rather than at the point where a fixed constraint exists (the return instruction). Splitting early means that v1 is allocated to a3 somewhat arbitrarily. However, the other half of the split, which contains the second load and the return, still conflicts with the ultimate requirement for v1 to be in a0, so a second split is introduced after the second load. That second split allocates v1 to a4, which is why we see the move from a3 to a4 between the two loads, and ultimately the move from a4 to a0 to satisfy the ABI.
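To make the cost of the extra split concrete, here is a toy sketch (not regalloc2 code) that counts the moves needed to connect the pieces of one split live range. The function name `moves_needed` and the representation of pieces as a list of register names are assumptions made purely for illustration.

```rust
/// Count the register-to-register moves needed to stitch together the
/// consecutive pieces of one split live range: one move per boundary
/// where the assigned register changes. (Hypothetical helper.)
fn moves_needed(piece_regs: &[&str]) -> usize {
    piece_regs
        .windows(2)
        .filter(|pair| pair[0] != pair[1])
        .count()
}

fn main() {
    // v1 as described: a3 after the first split, a4 after the second,
    // and finally a0 for the return.
    let early_split = ["a3", "a4", "a0"];
    // v1 with a single split right before the return.
    let late_split = ["a3", "a0"];

    assert_eq!(moves_needed(&early_split), 2);
    assert_eq!(moves_needed(&late_split), 1);
    println!(
        "early split: {} moves, late split: {} move",
        moves_needed(&early_split),
        moves_needed(&late_split)
    );
}
```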

One possible solution here would be to augment the split-point heuristic to look at the region after the conflict and move the split point to right before the first fixed use. In cases like this, that would ensure we aren't setting ourselves up for further splits that require additional moves.
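A rough sketch of what such a heuristic could look like is below. It is a simplified model, not regalloc2's actual data structures: the `Use` struct, the integer instruction positions, and the function `split_before_fixed_use` are all hypothetical, and a real implementation would also have to weigh other factors this sketch ignores.

```rust
/// A use of the virtual register at an instruction index. `fixed` is true
/// when the use demands a specific physical register (for example a return
/// value that must be in a0). (Hypothetical type for illustration.)
struct Use {
    pos: u32,
    fixed: bool,
}

/// Pick a split point: right before the first fixed use at or after the
/// conflict, falling back to the conflict point when there is none.
fn split_before_fixed_use(conflict: u32, uses: &[Use]) -> u32 {
    uses.iter()
        .filter(|u| u.fixed && u.pos >= conflict)
        .map(|u| u.pos)
        .min()
        .unwrap_or(conflict)
}

fn main() {
    // v1 from the VCode in the issue: defined by the load at Inst 1 (where
    // the conflict is found) and used with a fixed a0 constraint at Inst 3.
    let uses = [Use { pos: 3, fixed: true }];
    let conflict = 1;

    let split = split_before_fixed_use(conflict, &uses);
    assert_eq!(split, 3);
    println!("split before Inst {split} instead of at Inst {conflict}");
}
```

With the split placed before Inst 3, the first piece covers both loads and the second piece covers only the return, so a single move into a0 is enough in this model.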


Last updated: Jan 10 2026 at 02:36 UTC