alexcrichton opened issue #4487:
With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the
main
branch (and therelease-0.{38,39.0}
branches) CI now appears to be failing across the board. Failure logs include:
- [ ] s390x-related
all
test failures- [ ] aarch64-related SIGKILL failures
- [ ] s390x
filetest
failuresI think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.
I have attempted to reproduce the
filetest
failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug thefiletest
issue?I have not investigate the s390x
all
failures yet.
uweigand commented on issue #4487:
I've had a quick look, and the filetest failures are due to a newly added test (udiv_i16_const). However, this test passes natively for me. Not sure if this is a qemu issues ... I'll have a look.
As to the
all
tests, those are failed assertions:multiple uses of vreg with a Stack constraint are not supported
Not sure what exactly this means, but I can reproduce natively.
alexcrichton commented on issue #4487:
Ok in that case I think I'm realizing that the ISLE transition for s390x may have fixed the
filetest
issue. Investigating...
alexcrichton edited issue #4487:
With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the
main
branch (and therelease-0.{38,39.0}
branches) CI now appears to be failing across the board. Failure logs include:
- [ ] s390x-related
all
test failures- [ ] aarch64-related SIGKILL failures
- [x] s390x
filetest
failuresI think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.
I have attempted to reproduce the
filetest
failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug thefiletest
issue?I have not investigate the s390x
all
failures yet.
alexcrichton commented on issue #4487:
Ok the
filetest
failures I can confirm were fixed in #4427. The test fails on therelease-0.38.0
andrelease-0.39.0
branches. The tests pass, however, on the #4427 commit (and thereforemain
).When the other s390x issue is sorted I think we can backport temporary disabling of s390x running the test on the
release-0.{38,39}.0
branches and leavemain
as-is. Now it's just theall
test failures you mentioned.
alexcrichton edited issue #4487:
With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the
main
branch (and therelease-0.{38,39.0}
branches) CI now appears to be failing across the board. Failure logs include:
- [ ] s390x-related
all
test failures- [ ] aarch64-related SIGKILL failures
- [x] s390x
filetest
failures -- only happens onrelease-0.{38,39}.0
, fixed onmain
by #4427I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.
I have attempted to reproduce the
filetest
failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug thefiletest
issue?I have not investigate the s390x
all
failures yet.
alexcrichton commented on issue #4487:
Initial reduction of the other issue from one failing wasm test case is:
function u0:0(i64 vmctx, i64) fast { gv0 = vmctx gv1 = load.i64 notrap aligned readonly gv0+8 gv2 = load.i64 notrap aligned gv1 gv3 = vmctx sig0 = (i64 vmctx, i32 uext) -> r64 wasmtime_system_v sig1 = (i64 vmctx, i32 uext, r64) wasmtime_system_v stack_limit = gv2 block0(v0: i64, v1: i64): v3 -> v0 v8 -> v0 v12 -> v0 v17 -> v0 v2 = null.r64 v16 -> v2 v4 = load.i64 notrap aligned readonly v0+48 v9 -> v4 v13 -> v4 v18 -> v4 v5 = load.i64 notrap aligned readonly v4+128 v6 = iconst.i32 0 v11 -> v6 v15 -> v6 v20 -> v6 v7 = call_indirect sig0, v5(v0, v6) v10 = load.i64 notrap aligned readonly v4+136 v14 -> v10 v19 -> v10 call_indirect sig1, v10(v0, v6, v7) call_indirect sig1, v10(v0, v6, v7) call_indirect sig1, v10(v0, v6, v2) jump block1 block1: return }
which yields:
$ cargo run -p cranelift-tools -- compile out.clif --target s390x Finished dev [unoptimized + debuginfo] target(s) in 0.08s Running `target/debug/clif-util compile out.clif --target s390x` thread 'main' panicked at 'multiple uses of vreg with a Stack constraint are not supported', /home/alex/.cargo/registry/src/github.com-1ecc6299db9ec823/regalloc2-0.3.0/src/ion/liveranges.rs:1339:57 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
alexcrichton commented on issue #4487:
Reduced a bit further:
function u0:0(i64) fast { sig0 = () -> r64 wasmtime_system_v sig1 = (r64) wasmtime_system_v block0(v0: i64): v1 = call_indirect sig0, v0() call_indirect sig1, v0(v1) call_indirect sig1, v0(v1) return }
uweigand commented on issue #4487:
Right, from what I can see it seems the problem is related to the fact that
v1
is used twice, but also the value is forced to the stack (maybe because it's a reference value live across a call?).Not sure what this has to do with s390x, though.
uweigand commented on issue #4487:
Ok the
filetest
failures I can confirm were fixed in #4427. The test fails on therelease-0.38.0
andrelease-0.39.0
branches. The tests pass, however, on the #4427 commit (and thereforemain
).Ah, that makes sense. As part of #4427 I fixed the following bug found by fuzzing:
a problem with random high bits in sub-word immediates that were not properly zero-extended
This is exactly the problem here (when using the full-word divide instruction to perform a sub-word operation, the inputs must be properly extended or we get wrong results).
The attached minimal backport of just this fix resolves the problem on the release-0.39.0 branch:
diff-s390x-zeroext-fix.txt
alexcrichton commented on issue #4487:
@cfallin has found the issue and this fixes the panic:
diff --git a/cranelift/codegen/src/machinst/vcode.rs b/cranelift/codegen/src/machinst/vcode.rs index f9ef6670f..79959dfa3 100644 --- a/cranelift/codegen/src/machinst/vcode.rs +++ b/cranelift/codegen/src/machinst/vcode.rs @@ -606,6 +606,8 @@ impl<I: VCodeInst> VCodeBuilder<I> { for reg in self.vcode.reftyped_vregs.iter_mut() { *reg = Self::resolve_vreg_alias_impl(&self.vcode.vreg_aliases, *reg); } + self.vcode.reftyped_vregs.sort(); + self.vcode.reftyped_vregs.dedup(); self.compute_preds_from_succs(); self.vcode.debug_value_labels.sort_unstable();
uweigand commented on issue #4487:
As to the regalloc2 assertion, I see the following code just before the failure:
TRACE - built vcode: VCode { Entry block: 0 v129 := v130 Block 0: (original IR block: block0) (instruction range: 0 .. 8) Inst 0: lgr %v128, %r2 Inst 1: basr %r14, %v128 Inst 2: lgr %v130, %r2 Inst 3: lgr %r2, %v129 Inst 4: basr %r14, %v128 Inst 5: lgr %r2, %v129 Inst 6: basr %r14, %v128 Inst 7: br %r14 }
which looks very similar to the code I get on x86_64:
TRACE - built vcode: VCode { Entry block: 0 Block 0: (original IR block: block0) (instruction range: 0 .. 8) Inst 0: movq %rdi, %v128 Inst 1: call *%v128 Inst 2: movq %rax, %v129 Inst 3: movq %v129, %rdi Inst 4: call *%v128 Inst 5: movq %v129, %rdi Inst 6: call *%v128 Inst 7: ret }
The only difference I can see is that on s390x, we have two vregs that are aliased (v129, v130), while on x86_64 we only have one (v129). This is pretty sure because s390x now lowers calls via ISLE, where we get the extra aliased copy.
But that shouldn't change the behavior of regalloc2 otherwise. @cfallin maybe you can help here?
uweigand commented on issue #4487:
Ah, nevermind ... already fixed.
alexcrichton commented on issue #4487:
Ok fixes are applied in https://github.com/bytecodealliance/wasmtime/pull/4486 and I will work on backporting this to the release-0.3{8,9}.0 branches for new point releases as well. Thanks for your help @uweigand!
uweigand commented on issue #4487:
Ok fixes are applied in #4486 and I will work on backporting this to the release-0.3{8,9}.0 branches for new point releases as well. Thanks for your help @uweigand!
Thanks! For the release branches, maybe you can also add the patch from https://github.com/bytecodealliance/wasmtime/issues/4487#issuecomment-1190707048 to fix the s390x filetest regression as well.
alexcrichton closed issue #4487:
With the merge of https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4 and https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-5fhj-g3p3-pq9g to the
main
branch (and therelease-0.{38,39.0}
branches) CI now appears to be failing across the board. Failure logs include:
- [ ] s390x-related
all
test failures- [ ] aarch64-related SIGKILL failures
- [x] s390x
filetest
failures -- only happens onrelease-0.{38,39}.0
, fixed onmain
by #4427I think https://github.com/bytecodealliance/wasmtime/pull/4486 will fix the SIGKILL issues.
I have attempted to reproduce the
filetest
failures locally and cannot (where I think I'm using the same version of QEMU). @uweigand are you perhaps around to help debug thefiletest
issue?I have not investigate the s390x
all
failures yet.
Last updated: Nov 22 2024 at 16:03 UTC