Stream: git-wasmtime

Topic: wasmtime / issue #4487 CI for `main` is broken


view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 18:59):

alexcrichton opened issue #4487:

With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the main branch (and the release-0.{38,39.0} branches) CI now appears to be failing across the board. Failure logs include:

I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.

I have attempted to reproduce the filetest failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug the filetest issue?

I have not investigate the s390x all failures yet.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:25):

uweigand commented on issue #4487:

I've had a quick look, and the filetest failures are due to a newly added test (udiv_i16_const). However, this test passes natively for me. Not sure if this is a qemu issues ... I'll have a look.

As to the all tests, those are failed assertions:

multiple uses of vreg with a Stack constraint are not supported

Not sure what exactly this means, but I can reproduce natively.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:30):

alexcrichton commented on issue #4487:

Ok in that case I think I'm realizing that the ISLE transition for s390x may have fixed the filetest issue. Investigating...

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:41):

alexcrichton edited issue #4487:

With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the main branch (and the release-0.{38,39.0} branches) CI now appears to be failing across the board. Failure logs include:

I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.

I have attempted to reproduce the filetest failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug the filetest issue?

I have not investigate the s390x all failures yet.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:42):

alexcrichton commented on issue #4487:

Ok the filetest failures I can confirm were fixed in #4427. The test fails on the release-0.38.0 and release-0.39.0 branches. The tests pass, however, on the #4427 commit (and therefore main).

When the other s390x issue is sorted I think we can backport temporary disabling of s390x running the test on the release-0.{38,39}.0 branches and leave main as-is. Now it's just the all test failures you mentioned.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:44):

alexcrichton edited issue #4487:

With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the main branch (and the release-0.{38,39.0} branches) CI now appears to be failing across the board. Failure logs include:

I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.

I have attempted to reproduce the filetest failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug the filetest issue?

I have not investigate the s390x all failures yet.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 19:55):

alexcrichton commented on issue #4487:

Initial reduction of the other issue from one failing wasm test case is:

function u0:0(i64 vmctx, i64) fast {
       gv0 = vmctx
       gv1 = load.i64 notrap aligned readonly gv0+8
       gv2 = load.i64 notrap aligned gv1
       gv3 = vmctx
       sig0 = (i64 vmctx, i32 uext) -> r64 wasmtime_system_v
       sig1 = (i64 vmctx, i32 uext, r64) wasmtime_system_v
       stack_limit = gv2

       block0(v0: i64, v1: i64):
           v3 -> v0
           v8 -> v0
           v12 -> v0
           v17 -> v0
           v2 = null.r64
           v16 -> v2
           v4 = load.i64 notrap aligned readonly v0+48
           v9 -> v4
           v13 -> v4
           v18 -> v4
           v5 = load.i64 notrap aligned readonly v4+128
           v6 = iconst.i32 0
           v11 -> v6
           v15 -> v6
           v20 -> v6
           v7 = call_indirect sig0, v5(v0, v6)
           v10 = load.i64 notrap aligned readonly v4+136
           v14 -> v10
           v19 -> v10
           call_indirect sig1, v10(v0, v6, v7)
           call_indirect sig1, v10(v0, v6, v7)
           call_indirect sig1, v10(v0, v6, v2)
           jump block1

       block1:
           return
}

which yields:

$ cargo run -p cranelift-tools -- compile out.clif --target s390x
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/clif-util compile out.clif --target s390x`
thread 'main' panicked at 'multiple uses of vreg with a Stack constraint are not supported', /home/alex/.cargo/registry/src/github.com-1ecc6299db9ec823/regalloc2-0.3.0/src/ion/liveranges.rs:1339:57
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:02):

alexcrichton commented on issue #4487:

Reduced a bit further:

function u0:0(i64) fast {
  sig0 = () -> r64 wasmtime_system_v
  sig1 = (r64) wasmtime_system_v

  block0(v0: i64):
    v1 = call_indirect sig0, v0()
    call_indirect sig1, v0(v1)
    call_indirect sig1, v0(v1)
    return
}

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:05):

uweigand commented on issue #4487:

Right, from what I can see it seems the problem is related to the fact that v1 is used twice, but also the value is forced to the stack (maybe because it's a reference value live across a call?).

Not sure what this has to do with s390x, though.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:09):

uweigand commented on issue #4487:

Ok the filetest failures I can confirm were fixed in #4427. The test fails on the release-0.38.0 and release-0.39.0 branches. The tests pass, however, on the #4427 commit (and therefore main).

Ah, that makes sense. As part of #4427 I fixed the following bug found by fuzzing:

a problem with random high bits in sub-word immediates that were not properly zero-extended

This is exactly the problem here (when using the full-word divide instruction to perform a sub-word operation, the inputs must be properly extended or we get wrong results).

The attached minimal backport of just this fix resolves the problem on the release-0.39.0 branch:
diff-s390x-zeroext-fix.txt

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:18):

alexcrichton commented on issue #4487:

@cfallin has found the issue and this fixes the panic:

diff --git a/cranelift/codegen/src/machinst/vcode.rs b/cranelift/codegen/src/machinst/vcode.rs
index f9ef6670f..79959dfa3 100644
--- a/cranelift/codegen/src/machinst/vcode.rs
+++ b/cranelift/codegen/src/machinst/vcode.rs
@@ -606,6 +606,8 @@ impl<I: VCodeInst> VCodeBuilder<I> {
         for reg in self.vcode.reftyped_vregs.iter_mut() {
             *reg = Self::resolve_vreg_alias_impl(&self.vcode.vreg_aliases, *reg);
         }
+        self.vcode.reftyped_vregs.sort();
+        self.vcode.reftyped_vregs.dedup();

         self.compute_preds_from_succs();
         self.vcode.debug_value_labels.sort_unstable();

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:19):

uweigand commented on issue #4487:

As to the regalloc2 assertion, I see the following code just before the failure:

TRACE - built vcode: VCode {
  Entry block: 0
  v129 := v130
Block 0:
    (original IR block: block0)
    (instruction range: 0 .. 8)
  Inst 0: lgr %v128, %r2
  Inst 1: basr %r14, %v128
  Inst 2: lgr %v130, %r2
  Inst 3: lgr %r2, %v129
  Inst 4: basr %r14, %v128
  Inst 5: lgr %r2, %v129
  Inst 6: basr %r14, %v128
  Inst 7: br %r14
}

which looks very similar to the code I get on x86_64:

TRACE - built vcode: VCode {
  Entry block: 0
Block 0:
    (original IR block: block0)
    (instruction range: 0 .. 8)
  Inst 0: movq    %rdi, %v128
  Inst 1: call    *%v128
  Inst 2: movq    %rax, %v129
  Inst 3: movq    %v129, %rdi
  Inst 4: call    *%v128
  Inst 5: movq    %v129, %rdi
  Inst 6: call    *%v128
  Inst 7: ret
}

The only difference I can see is that on s390x, we have two vregs that are aliased (v129, v130), while on x86_64 we only have one (v129). This is pretty sure because s390x now lowers calls via ISLE, where we get the extra aliased copy.

But that shouldn't change the behavior of regalloc2 otherwise. @cfallin maybe you can help here?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:19):

uweigand commented on issue #4487:

Ah, nevermind ... already fixed.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:28):

alexcrichton commented on issue #4487:

Ok fixes are applied in https://github.com/bytecodealliance/wasmtime/pull/4486 and I will work on backporting this to the release-0.3{8,9}.0 branches for new point releases as well. Thanks for your help @uweigand!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 20 2022 at 20:31):

uweigand commented on issue #4487:

Ok fixes are applied in #4486 and I will work on backporting this to the release-0.3{8,9}.0 branches for new point releases as well. Thanks for your help @uweigand!

Thanks! For the release branches, maybe you can also add the patch from https://github.com/bytecodealliance/wasmtime/issues/4487#issuecomment-1190707048 to fix the s390x filetest regression as well.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 21 2022 at 05:48):

alexcrichton closed issue #4487:

With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the main branch (and the release-0.{38,39.0} branches) CI now appears to be failing across the board. Failure logs include:

I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.

I have attempted to reproduce the filetest failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug the filetest issue?

I have not investigate the s390x all failures yet.


Last updated: Jan 24 2025 at 00:11 UTC