Stream: git-wasmtime

Topic: wasmtime / issue #5425 Cranelift: s390x: regalloc checker...


Wasmtime GitHub notifications bot (Dec 13 2022 at 18:21):

bjorn3 labeled issue #5425:

Steps to Reproduce

I can create a minimal reproducing example later if requested.

Expected Results

It compiles fine.

Actual Results

thread '<unnamed>' panicked at 'register allocation checker: CheckerErrors { errors: [UnknownValueInAllocation { inst: Inst(58), op: Use: v0i fixed(p0i), alloc: p0i }, UnknownValueInAllocation { inst: Inst(58), op: Use: v0i fixed(p0i), alloc: p0i }, IncorrectValuesInAllocation { inst: Inst(60), op: Use: v187i reg, alloc: p2i, actual: {} }] }', /home/runner/.cargo/git/checkouts/wasmtime-41807828cb3a7a7e/63a3951/cranelift/codegen/src/machinst/compile.rs:88:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/37d7de337903a558dbeb1e82c844fe915ab8ff25/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/37d7de337903a558dbeb1e82c844fe915ab8ff25/library/core/src/panicking.rs:64:14
   2: core::result::unwrap_failed
             at /rustc/37d7de337903a558dbeb1e82c844fe915ab8ff25/library/core/src/result.rs:1791:5
   3: cranelift_codegen::machinst::compile::compile
   4: <cranelift_codegen::isa::s390x::S390xBackend as cranelift_codegen::isa::TargetIsa>::compile_function
   5: cranelift_codegen::context::Context::compile_stencil
   6: cranelift_codegen::context::Context::compile
   7: cranelift_codegen::context::Context::compile_and_emit
   8: <cranelift_object::backend::ObjectModule as cranelift_module::module::Module>::define_function
   9: rustc_codegen_cranelift::base::compile_fn
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Versions and Environment

Cranelift version or commit: https://github.com/bytecodealliance/wasmtime/commit/63a39511b65a2c8ad3d77b580236532dffc98b62 (upcoming 0.91 Cranelift release / 4.0 Wasmtime release)

Operating system: Linux

Architecture: cross-compiling to s390x

Extra Info

cc @uweigand

Wasmtime GitHub notifications bot (Dec 13 2022 at 18:40):

cfallin commented on issue #5425:

cc @elliottt, was this possibly a result of some of the SSA rework?

Wasmtime GitHub notifications bot (Dec 13 2022 at 18:52):

elliottt commented on issue #5425:

It could be, but it's hard to say without seeing the clif that's causing the error.

Wasmtime GitHub notifications bot (Dec 13 2022 at 19:13):

bjorn3 commented on issue #5425:

This is one of the failing functions. Not verified locally and needs to be reduced:

<details>

set opt_level=speed_and_size
set tls_model=elf_gd
set libcall_call_conv=isa_default
set probestack_size_log2=12
set probestack_strategy=outline
set regalloc_checker=1
set regalloc_verbose_logs=0
set enable_alias_analysis=1
set use_egraphs=0
set enable_verifier=1
set is_pic=1
set use_colocated_libcalls=0
set avoid_div_traps=0
set enable_float=1
set enable_nan_canonicalization=0
set enable_pinned_reg=0
set use_pinned_reg_as_heap_base=0
set enable_simd=1
set enable_atomics=1
set enable_safepoints=0
set enable_llvm_abi_extensions=1
set unwind_info=1
set preserve_frame_pointers=0
set machine_code_cfg_info=0
set enable_probestack=0
set probestack_func_adjusts_sp=0
set enable_jump_tables=1
set enable_heap_access_spectre_mitigation=1
set enable_table_access_spectre_mitigation=1
set enable_incremental_compilation_cache_checks=0
target s390x has_mie2=0 has_vxrs_ext2=0

function u0:495(i64) system_v {
    ss0 = explicit_slot 16
    ss1 = explicit_slot 16
    ss2 = explicit_slot 16
    ss3 = explicit_slot 16
    ss4 = explicit_slot 16
    ss5 = explicit_slot 16
    ss6 = explicit_slot 16
    ss7 = explicit_slot 16
    ss8 = explicit_slot 16
    ss9 = explicit_slot 16
    ss10 = explicit_slot 16
    ss11 = explicit_slot 16
    sig0 = (i64) -> i32 system_v
    sig1 = (i64, i32, i32, i8, i8) -> i32, i32 system_v
    sig2 = (i64, i32, i64, i32) -> i8 system_v
    sig3 = (i64) -> i32 system_v
    fn0 = colocated u0:496 sig0
    fn1 = colocated u0:117 sig1
    fn2 = u0:727 sig2
    fn3 = colocated u0:496 sig3
    jt0 = jump_table [block6, block4]

                                block0(v0: i64):
                                    stack_store v0, ss0
                                    jump block1

                                block1:
@0000                               v1 = stack_load.i64 ss0
@0000                               v2 = call fn0(v1)
                                    v3 -> v2
@0000                               jump block2

                                block2:
@0002                               brz.i32 v3, block3
@0002                               jump block7

                                block3:
@0003                               v4 = stack_load.i64 ss0
@0003                               stack_store v4, ss2
@0003                               v5 = stack_load.i64 ss2
@0003                               stack_store v5, ss8
@0003                               v6 = stack_load.i64 ss8
@0003                               v7 = iconst.i32 0
@0003                               v8 = iconst.i32 1
@0003                               v9 = iconst.i8 2
@0003                               stack_store v9, ss10  ; v9 = 2
@0003                               v10 = iconst.i8 0
@0003                               stack_store v10, ss11  ; v10 = 0
@0003                               v11 = stack_load.i8 ss10
@0003                               v12 = stack_load.i8 ss11
@0003                               v13, v14 = call fn1(v6, v7, v8, v11, v12)  ; v7 = 0, v8 = 1
@0003                               stack_store v13, ss1
@0003                               stack_store v14, ss1+4
@0003                               jump block16

                                block4:
@0006                               v15 = stack_load.i32 ss1+4
@0008                               jump block8(v15)

                                block5:
@0003                               trap unreachable

                                block6:
@0009                               return

                                block7:
@0008                               jump block8(v3)

                                block8(v16: i32):
@000a                               v17 = icmp_imm eq v16, 2
@000a                               brnz v17, block9
@000a                               jump block10

                                block9:
@000a                               v18 = iconst.i8 0
@000a                               jump block11(v18)  ; v18 = 0

                                block10:
@000b                               v19 = stack_load.i64 ss0
@000b                               stack_store v19, ss5
@000b                               v20 = stack_load.i64 ss5
@000b                               stack_store v20, ss9
@000b                               v21 = stack_load.i64 ss9
@000b                               v22 = iconst.i32 2
@000b                               v23 = atomic_rmw.i32 xchg v21, v22  ; v22 = 2
                                    v37 -> v23
@000b                               jump block17

                                block11(v24: i8):
@000a                               brz v24, block12
@000a                               jump block15

                                block12:
@000e                               v25 = stack_load.i64 ss0
@000f                               v26 = iconst.i32 0x3b9a_ca00
@000f                               stack_store v26, ss7+8  ; v26 = 0x3b9a_ca00
@000d                               v27 = iconst.i32 2
@000d                               v28 = stack_load.i64 ss7
@000d                               v29 = stack_load.i32 ss7+8
@000d                               v30 = call fn2(v25, v27, v28, v29)  ; v27 = 2
@000d                               jump block13

                                block13:
@0012                               v31 = stack_load.i64 ss0
@0012                               v32 = call fn3(v31)
                                    v40 -> v32
@0012                               jump block14

                                block14:
@0014                               jump block8(v40)

                                block15:
@0015                               return

                                block16:
@0003                               v33 = stack_load.i32 ss1
@0003                               v34 = uextend.i64 v33
@0017                               jump block18

                                block18:
@0017                               v35 = icmp_imm.i64 ugt v34, 0xffff_ffff
@0017                               brnz v35, block5
@0017                               jump block19

                                block19:
@0017                               v36 = ireduce.i32 v34
@0017                               br_table v36, block5, jt0

                                block17:
@000b                               v38 = iconst.i32 0
@000b                               v39 = icmp.i32 eq v37, v38  ; v38 = 0
@000a                               jump block11(v39)
}

</details>

Wasmtime GitHub notifications bot (Dec 13 2022 at 19:34):

afonso360 commented on issue #5425:

Minimized test case:

test compile
set regalloc_checker=1
target s390x

function %a() system_v {
    sig0 = (i64) -> i32 system_v
    fn0 = colocated u0:496 sig0

block0:
    v1 = iconst.i64 0
    v2 = call fn0(v1)  ; v1 = 0

    v21 = iconst.i64 0
    v22 = iconst.i32 2
    v23 = atomic_rmw.i32 xchg v21, v22  ; v21 = 0, v22 = 2
    trap user0
}

Removing either the call or the atomic_rmw makes the test pass.

Wasmtime GitHub notifications bot (Dec 13 2022 at 20:18):

uweigand commented on issue #5425:

Thanks for providing the reduced test case, @afonso360! This does look related to the SSA changes, most likely due to the use of a fixed register in the CAS loop. I'll have a look.
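For readers unfamiliar with the term, the "CAS loop" mentioned above is the usual way to build an atomic exchange out of a compare-and-swap primitive. The following is only a minimal Rust sketch of that general technique using the standard library atomics. It is not the code the s390x backend emits, the function name `xchg_via_cas` is made up for illustration, and the fixed-register detail that matters for this issue is not modeled here.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: atomically store `new` into `cell` and return the
/// previous value, using only compare-and-swap.
pub fn xchg_via_cas(cell: &AtomicU32, new: u32) -> u32 {
    // Start from a plain load; the loop corrects it if it is stale.
    let mut old = cell.load(Ordering::Relaxed);
    loop {
        // Try to replace `old` with `new`. On failure we get the value
        // that was actually in memory and retry with it.
        match cell.compare_exchange_weak(old, new, Ordering::SeqCst, Ordering::Relaxed) {
            Ok(prev) => return prev,
            Err(actual) => old = actual,
        }
    }
}
```

Calling `xchg_via_cas(&cell, 2)` on a cell holding 0 returns 0 and leaves 2 in the cell, which mirrors the `atomic_rmw.i32 xchg` in the reduced test case.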

Wasmtime GitHub notifications bot (Dec 13 2022 at 22:26):

elliottt commented on issue #5425:

I think this is a bug in the RA2 checker: we're handling moves with fixed constraints by turning them unconditionally into fixed_use constraints. This isn't correct in the case of a fixed_nonallocatable constraint, which is why we're seeing an error with a register whose vreg is v0. I'll work on a fix in RA2.
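The distinction described above can be illustrated with a toy model. This is a hedged sketch only: the types `PReg`, `VReg`, `Operand`, `CheckerError`, and `State` and the functions `check_buggy` and `check_fixed` are simplified stand-ins invented for this example, not the regalloc2 API or the actual fix. It assumes, following the comment above, that a fixed-nonallocatable operand shows up to the checker as a use of `v0` in a physical register the allocator never assigned.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for allocator concepts; not the regalloc2 types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PReg(pub u8);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct VReg(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Operand {
    /// A virtual register that must be read from one specific physical register.
    FixedUse { vreg: VReg, preg: PReg },
    /// A physical register outside the allocator's control; no vreg lives in it.
    FixedNonAllocatable { preg: PReg },
}

#[derive(Debug, PartialEq, Eq)]
pub enum CheckerError {
    UnknownValueInAllocation { vreg: VReg, preg: PReg },
}

/// What the checker believes each physical register currently holds.
pub type State = HashMap<PReg, VReg>;

fn expect_value(state: &State, vreg: VReg, preg: PReg) -> Result<(), CheckerError> {
    match state.get(&preg) {
        Some(held) if *held == vreg => Ok(()),
        _ => Err(CheckerError::UnknownValueInAllocation { vreg, preg }),
    }
}

/// Models the reported behavior: every fixed operand is checked as a fixed use,
/// so a nonallocatable register is checked as if it had to hold v0.
pub fn check_buggy(op: Operand, state: &State) -> Result<(), CheckerError> {
    let (vreg, preg) = match op {
        Operand::FixedUse { vreg, preg } => (vreg, preg),
        Operand::FixedNonAllocatable { preg } => (VReg(0), preg),
    };
    expect_value(state, vreg, preg)
}

/// Models the intended behavior: nonallocatable registers are not checked.
pub fn check_fixed(op: Operand, state: &State) -> Result<(), CheckerError> {
    match op {
        Operand::FixedUse { vreg, preg } => expect_value(state, vreg, preg),
        Operand::FixedNonAllocatable { .. } => Ok(()),
    }
}
```

With an empty or unrelated checker state, `check_buggy` on `Operand::FixedNonAllocatable { preg: PReg(0) }` reports an unknown value for `v0` in `p0`, which has the same shape as the `UnknownValueInAllocation` errors in the panic message, while `check_fixed` accepts it. Ordinary fixed uses behave identically in both versions.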

Wasmtime GitHub notifications bot (Dec 14 2022 at 21:06):

elliottt closed issue #5425.


Last updated: Jan 24 2025 at 00:11 UTC