afonso360 opened issue #7147:
:wave: Hey,
This is something that I noticed while working on #7123.
.clif Test Case

    test compile precise-output
    set unwind_info=false
    target riscv64

    function %c_lh(i64) -> i16, i16 {
    block0(v0: i64):
        v1 = load.i16 v0+0
        v2 = load.i16 v0+2
        return v1, v2
    }

    ; VCode:
    ; block0:
    ;   lh a3,0(a0)
    ;   mv a4,a3
    ;   lh a1,2(a0)
    ;   mv a0,a4
    ;   ret
    ;
    ; Disassembled:
    ; block0: ; offset 0x0
    ;   lh a3, 0(a0)
    ;   mv a4, a3
    ;   lh a1, 2(a0)
    ;   mv a0, a4
    ;   ret

Steps to Reproduce

    clif-util test ./the-above

Expected Results
We could eliminate the first move by generating something along these lines:

    ; VCode:
    ; block0:
    ;   lh a3,0(a0)
    ;   lh a1,2(a0)
    ;   mv a0,a3
    ;   ret

Actual Results
We get an extra move between loads.
Versions and Environment
Cranelift version or commit: main
Operating system: Linux
Architecture: RISC-V
Extra Info
VCode:
    VCode {
      Entry block: 0
      v193 := v196
      v194 := v195
    Block 0:
        (original IR block: block0)
        (instruction range: 0 .. 4)
      Inst 0: args v192=a0
      Inst 1: lh v196,0(v192)
      Inst 2: lh v195,2(v192)
      Inst 3: rets v193=a0 v194=a1
    }
afonso360 added the cranelift:area:riscv64 label to Issue #7147.
afonso360 added the cranelift label to Issue #7147.
afonso360 commented on issue #7147:
I couldn't fit this in the initial issue due to a character limit.
<details>
<summary> Here's the full trace log during compilation: </summary>
afonso@DESKTOP-1AHKMV2:~/git/wasmtime/cranelift$ RUST_LOG=trace cargo run -- test ./lmao.clif Blocking waiting for file lock on build directory Finished dev [unoptimized + debuginfo] target(s) in 45.38s Running `/home/afonso/git/wasmtime/target/debug/clif-util test ./lmao.clif` INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. 
DEBUG cranelift_codegen::timing > timing: Starting Processing test file, (during <no pass>) INFO cranelift_filetests::runone > --- File: ./lmao.clif DEBUG cranelift_codegen::timing > timing: Starting Parsing textual Cranelift IR, (during Processing test file) DEBUG cranelift_codegen::timing > timing: Ending Parsing textual Cranelift IR: 0ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Processing test file) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms INFO cranelift_filetests::subtest > Test: compile(%c_lh) riscv64 DEBUG cranelift_codegen::timing > timing: Starting Compilation passes, (during Processing test file) DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::context > Number of CLIF instructions to optimize: 3 DEBUG cranelift_codegen::context > Number of CLIF blocks to optimize: 1 TRACE cranelift_codegen::context > Optimizing (opt level None): function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: 
Ending Control flow graph: 0ms TRACE cranelift_codegen::legalizer > Pre-legalization function: function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } TRACE cranelift_codegen::legalizer > Post-legalization function: function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Starting Remove unreachable blocks, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Ending Remove unreachable blocks: 0ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::timing > timing: Starting Remove constant phi-nodes, (during Compilation passes) DEBUG cranelift_codegen::remove_constant_phis > do_remove_constant_phis: done, 1 iters. 0 formals, of which 0 const. 
DEBUG cranelift_codegen::timing > timing: Ending Remove constant phi-nodes: 0ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms TRACE cranelift_codegen::machinst::abi > ABISig: sig Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } => args end = 3 rets end = 2 arg stack = 0 ret stack = 0 stack_ret_arg = false TRACE cranelift_codegen::machinst::abi > ABI: func signature Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: function body function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: BlockLoweringOrder { lowered_order: [ Orig { block: block0, }, ], lowered_succ_indices: [], lowered_succ_ranges: [ ( None, 0..0, ), ], cold_blocks: {}, indirect_branch_targets: {}, } TRACE cranelift_codegen::machinst::lower > bb block0 param v0: regs ValueRegs { parts: [v192, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst0 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: 
Offset32(0) }): result v1 regs ValueRegs { parts: [v193, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst1 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(2) }): result v2 regs ValueRegs { parts: [v194, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst0 has color 1 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > bb block0 inst inst1 has color 2 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > bb block0 inst inst2 has color 3 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > arg v0 used, old state Unused, new Once TRACE cranelift_codegen::machinst::lower > arg v0 used, old state Once, new Multiple TRACE cranelift_codegen::machinst::lower > -> pushing args for v0 onto stack TRACE cranelift_codegen::machinst::lower > arg v1 used, old state Unused, new Once TRACE cranelift_codegen::machinst::lower > arg v2 used, old state Unused, new Once DEBUG cranelift_codegen::machinst::compile > Number of CLIF instructions to lower: 3 DEBUG cranelift_codegen::machinst::compile [message truncated]
afonso360 added the cranelift:area:regalloc label to Issue #7147.
afonso360 commented on issue #7147:
Updated locally to the latest regalloc2 (7d90ca9) and it still produces the same output.
<details>
<summary> Trace log </summary>
Running `/home/afonso/git/wasmtime/target/debug/clif-util test ./lmao.clif` INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. DEBUG cranelift_codegen::timing > timing: Starting Processing test file, (during <no pass>) INFO cranelift_filetests::runone > --- File: ./lmao.clif DEBUG cranelift_codegen::timing > timing: Starting Parsing textual Cranelift IR, (during Processing test file) INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. INFO file_per_thread_logger > Set up logging; filename prefix is cranelift.dbg. 
DEBUG cranelift_codegen::timing > timing: Ending Parsing textual Cranelift IR: 2ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Processing test file) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms INFO cranelift_filetests::subtest > Test: compile(%c_lh) riscv64 DEBUG cranelift_codegen::timing > timing: Starting Compilation passes, (during Processing test file) DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::context > Number of CLIF instructions to optimize: 3 DEBUG cranelift_codegen::context > Number of CLIF blocks to optimize: 1 TRACE cranelift_codegen::context > Optimizing (opt level None): function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms TRACE cranelift_codegen::legalizer > Pre-legalization function: function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } TRACE cranelift_codegen::legalizer > Post-legalization 
function: function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Starting Remove unreachable blocks, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Ending Remove unreachable blocks: 0ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms DEBUG cranelift_codegen::timing > timing: Starting Remove constant phi-nodes, (during Compilation passes) DEBUG cranelift_codegen::remove_constant_phis > do_remove_constant_phis: done, 1 iters. 0 formals, of which 0 const. 
DEBUG cranelift_codegen::timing > timing: Ending Remove constant phi-nodes: 0ms DEBUG cranelift_codegen::timing > timing: Starting Verify Cranelift IR, (during Compilation passes) DEBUG cranelift_codegen::timing > timing: Starting Control flow graph, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Control flow graph: 0ms DEBUG cranelift_codegen::timing > timing: Starting Dominator tree, (during Verify Cranelift IR) DEBUG cranelift_codegen::timing > timing: Ending Dominator tree: 0ms DEBUG cranelift_codegen::timing > timing: Ending Verify Cranelift IR: 0ms TRACE cranelift_codegen::machinst::abi > ABISig: sig Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } => args end = 3 rets end = 2 arg stack = 0 ret stack = 0 stack_ret_arg = false TRACE cranelift_codegen::machinst::abi > ABI: func signature Signature { params: [AbiParam { value_type: types::I64, purpose: Normal, extension: None }], returns: [AbiParam { value_type: types::I16, purpose: Normal, extension: None }, AbiParam { value_type: types::I16, purpose: Normal, extension: None }], call_conv: Fast } TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: function body function %c_lh(i64) -> i16, i16 fast { block0(v0: i64): v1 = load.i16 v0 v2 = load.i16 v0+2 return v1, v2 } TRACE cranelift_codegen::machinst::blockorder > BlockLoweringOrder: BlockLoweringOrder { lowered_order: [ Orig { block: block0, }, ], lowered_succ_indices: [], lowered_succ_ranges: [ ( None, 0..0, ), ], cold_blocks: {}, indirect_branch_targets: {}, } TRACE cranelift_codegen::machinst::lower > bb block0 param v0: regs ValueRegs { parts: [v192, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst0 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: 
Offset32(0) }): result v1 regs ValueRegs { parts: [v193, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst1 (Load { opcode: Load, arg: v0, flags: MemFlags { bits: 0 }, offset: Offset32(2) }): result v2 regs ValueRegs { parts: [v194, v2097151] } TRACE cranelift_codegen::machinst::lower > bb block0 inst inst0 has color 1 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > bb block0 inst inst1 has color 2 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > bb block0 inst inst2 has color 3 TRACE cranelift_codegen::machinst::lower > -> side-effecting; incrementing color for next inst TRACE cranelift_codegen::machinst::lower > arg v0 used, old state Unused, new Once TRACE cranelift_codegen::machinst::lower > arg v0 used, old state Once, new Multiple TRACE cranelift_codegen::machinst::lower > -> pushing args for v0 onto stack TRACE cranelift_codegen::machinst::lower > arg v1 used, old state Unused, new Once TRACE cranelift_codegen::machinst::lower > arg v2 used, old state Unused, new Once DEBUG cranelift_codegen::machinst::compile > Number of CLIF instructions to lower: 3 DEBUG cranelift_codegen::machinst::compile > Number of CLIF blocks to lower: 1 DEBUG cranelift_codegen::timing > timing: Starting VCode lowering, ( [message truncated]
afonso360 edited a comment on issue #7147:
Tested locally with the latest regalloc2 (7d90ca9) and it still produces the same output.
elliottt commented on issue #7147:
@cfallin and I looked through this, and the problem appears to be the current splitting heuristic in RA2: it splits at the location of the first conflict, which happens to be that first load, rather than at the point where a fixed constraint exists (the return instruction). Splitting early means that v1 gets a3 somewhat arbitrarily. However, the other half of the split, which contains the second load and the return, still conflicts with the ultimate requirement for v1 to be in a0, so a second split is introduced after the second load. That second split allocates v1 to a4, which is why we see the move from a3 to a4 between the two loads, and ultimately the move from a4 to a0 to satisfy the ABI.
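To make the move counts concrete, here is a minimal sketch in Rust. It is a toy model, not regalloc2 code: `moves_needed` is a hypothetical helper that assumes each piece of a split live range gets one register and that a move is needed wherever two adjacent pieces disagree.

```rust
/// Count the register-to-register moves needed to stitch together the pieces
/// of a split live range, given the register assigned to each piece in order.
/// (Hypothetical helper for illustration; not part of regalloc2.)
fn moves_needed(piece_regs: &[&str]) -> usize {
    piece_regs
        .windows(2)
        .filter(|pair| pair[0] != pair[1])
        .count()
}

fn main() {
    // Observed output: v1 lives in a3, then a4, then a0 (two splits).
    let observed = ["a3", "a4", "a0"];
    // Expected output from the issue: v1 lives in a3, then a0 (one split).
    let expected = ["a3", "a0"];

    println!("observed: {} moves", moves_needed(&observed));
    println!("expected: {} moves", moves_needed(&expected));
}
```

Under this model the observed allocation costs two moves for v1 and the expected one costs a single move, matching the extra `mv a4,a3` in the test output.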
One possible solution here would be to augment the split-point heuristic to look at the region after the conflict and move the split point to right before the first fixed use. In cases like this, that would avoid setting ourselves up for further splits that require additional moves.
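The two split-point choices can be sketched roughly as follows. This is only an illustration of the idea under simplified assumptions: the `Use` struct and both functions are hypothetical names and do not correspond to regalloc2's actual data structures or API.

```rust
/// A use (or def) of a virtual register at an instruction index.
/// Hypothetical model for illustration; not regalloc2's types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Use {
    /// Instruction index where the vreg is mentioned.
    inst: usize,
    /// True when the operand must be in one specific physical register,
    /// e.g. a return value constrained to a0.
    fixed: bool,
}

/// Behaviour described in the thread: split where the first conflict is found.
fn split_at_first_conflict(conflict_inst: usize, _uses: &[Use]) -> usize {
    conflict_inst
}

/// Suggested tweak: scan the region at or after the conflict and split right
/// before the first fixed use; fall back to the conflict point if there is none.
fn split_before_first_fixed_use(conflict_inst: usize, uses: &[Use]) -> usize {
    uses.iter()
        .filter(|u| u.fixed && u.inst >= conflict_inst)
        .map(|u| u.inst)
        .min()
        .unwrap_or(conflict_inst)
}

fn main() {
    // v1 from the VCode in the issue: defined by the load at Inst 1,
    // returned in a0 (a fixed constraint) at Inst 3.
    let v1_uses = [
        Use { inst: 1, fixed: false },
        Use { inst: 3, fixed: true },
    ];
    let conflict = 1; // the first load, per the analysis above

    println!("first-conflict split at Inst {}", split_at_first_conflict(conflict, &v1_uses));
    println!("fixed-use split at Inst {}", split_before_first_fixed_use(conflict, &v1_uses));
}
```

With the instruction numbering from the VCode dump, the first heuristic splits at Inst 1 while the second defers the split to Inst 3, so the piece covering both loads stays in a single register.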
Last updated: Jan 10 2026 at 02:36 UTC