Stream: git-wasmtime

Topic: wasmtime / issue #5126 arm64: Segfault when enabling `use...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 21:33):

alexcrichton opened issue #5126:

Given this input wasm module:

(module
  (type (;0;) (func))
  (func (;0;) (type 0)
    global.get 0
    i32.eqz
    if  ;; label = @1
      unreachable
    end
    global.get 0
    i32.const 1
    i32.sub
    global.set 0
    i32.const -320017172
    v128.load32_splat
    i8x16.bitmask
    v128.load32_zero align=1
    i64x2.extend_high_i32x4_u
    i64x2.extend_high_i32x4_u
    v128.const i32x4 0x9bececec 0x00007171 0x00000000 0x00000000
    drop
    drop
  )
  (memory (;0;) 0 3)
  (global (;0;) (mut i32) i32.const 1000)
  (export "" (func 0))
  (export "1" (memory 0))
  (data (;0;) (i32.const 0) "")
)

I get these results on a local arm64 marchine:

$ cargo run -q testcase0.wat --invoke ''
Error: failed to run main module `testcase0.wat`

Caused by:
    0: failed to invoke ``
    1: wasm trap: out of bounds memory access
       wasm backtrace:
           0:   0x44 - <unknown>!<wasm function 0>

vs

$ cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true
zsh: segmentation fault  cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true

The faulting instruction appears to be:

(gdb) disas $pc,$pc+20
Dump of assembler code from 0xfffff7b25038 to 0xfffff7b2504c:
=> 0x0000fffff7b25038:  ld1r    {v1.4s}, [x5]
   0x0000fffff7b2503c:  sshr    v25.16b, v1.16b, #7
   0x0000fffff7b25040:  mov     x4, #0x201                      // #513
   0x0000fffff7b25044:  movk    x4, #0x804, lsl #16
   0x0000fffff7b25048:  movk    x4, #0x2010, lsl #32
End of assembler dump.
(gdb) print/x $x5
$1 = 0xffff5cececec

so I think this may be an alignment issue? I'm not 100% familiar with vector instructions on arm myself.

cc @cfallin

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 21:37):

alexcrichton commented on issue #5126:

Oh actually reading over the code for more than half a second I realize that this is expected to be an out-of-bounds segfault, so the actual issue here may be that the metadata for "this instruction is allowed to trap" may be getting accidentally lost during the egraphs translation

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 21:37):

alexcrichton labeled issue #5126:

Given this input wasm module:

(module
  (type (;0;) (func))
  (func (;0;) (type 0)
    global.get 0
    i32.eqz
    if  ;; label = @1
      unreachable
    end
    global.get 0
    i32.const 1
    i32.sub
    global.set 0
    i32.const -320017172
    v128.load32_splat
    i8x16.bitmask
    v128.load32_zero align=1
    i64x2.extend_high_i32x4_u
    i64x2.extend_high_i32x4_u
    v128.const i32x4 0x9bececec 0x00007171 0x00000000 0x00000000
    drop
    drop
  )
  (memory (;0;) 0 3)
  (global (;0;) (mut i32) i32.const 1000)
  (export "" (func 0))
  (export "1" (memory 0))
  (data (;0;) (i32.const 0) "")
)

I get these results on a local arm64 marchine:

$ cargo run -q testcase0.wat --invoke ''
Error: failed to run main module `testcase0.wat`

Caused by:
    0: failed to invoke ``
    1: wasm trap: out of bounds memory access
       wasm backtrace:
           0:   0x44 - <unknown>!<wasm function 0>

vs

$ cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true
zsh: segmentation fault  cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true

The faulting instruction appears to be:

(gdb) disas $pc,$pc+20
Dump of assembler code from 0xfffff7b25038 to 0xfffff7b2504c:
=> 0x0000fffff7b25038:  ld1r    {v1.4s}, [x5]
   0x0000fffff7b2503c:  sshr    v25.16b, v1.16b, #7
   0x0000fffff7b25040:  mov     x4, #0x201                      // #513
   0x0000fffff7b25044:  movk    x4, #0x804, lsl #16
   0x0000fffff7b25048:  movk    x4, #0x2010, lsl #32
End of assembler dump.
(gdb) print/x $x5
$1 = 0xffff5cececec

so I think this may be an alignment issue? I'm not 100% familiar with vector instructions on arm myself.

cc @cfallin

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 21:42):

alexcrichton commented on issue #5126:

Perhaps related, but https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52721 is an issue on x86_64 which also may be related to egraphs since the fuzz configuration there has egraphs enabled. The problem in the fuzz bug is that it's failing to execute this function where it should be successful but a trap is showing up. The fuzz test case is using a CustomUnaligned memory which means that the base address of linear memory is forcibly unaligned. That may or may not be a valid thing to do at runtime, but it seems like it may be related to this egraphs issue?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 22:51):

cfallin commented on issue #5126:

Thanks!

I have a meta-thought here actually: we've had some discussions in the last few days about a way to restructure the egraph storage to be more directly integrated into the DataFlowGraph (core CLIF structure) which would alter the specific details leading to some of these recent fuzzbugs (storing types appropriately, storing trap metadata). We're pretty sure we want to do it, we just don't have the time quite yet (perhaps in the next few weeks). Rather than spending time chasing down these fuzzbugs and then refactoring again in a few weeks, I think it might make more sense to just turn off egraph mode in fuzzing for now; to be re-enabled later. I'll go ahead and create a PR for this and we can discuss further there...

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 23:54):

alexcrichton commented on issue #5126:

Handled by https://github.com/bytecodealliance/wasmtime/pull/5128 for now

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 23:54):

alexcrichton closed issue #5126:

Given this input wasm module:

(module
  (type (;0;) (func))
  (func (;0;) (type 0)
    global.get 0
    i32.eqz
    if  ;; label = @1
      unreachable
    end
    global.get 0
    i32.const 1
    i32.sub
    global.set 0
    i32.const -320017172
    v128.load32_splat
    i8x16.bitmask
    v128.load32_zero align=1
    i64x2.extend_high_i32x4_u
    i64x2.extend_high_i32x4_u
    v128.const i32x4 0x9bececec 0x00007171 0x00000000 0x00000000
    drop
    drop
  )
  (memory (;0;) 0 3)
  (global (;0;) (mut i32) i32.const 1000)
  (export "" (func 0))
  (export "1" (memory 0))
  (data (;0;) (i32.const 0) "")
)

I get these results on a local arm64 marchine:

$ cargo run -q testcase0.wat --invoke ''
Error: failed to run main module `testcase0.wat`

Caused by:
    0: failed to invoke ``
    1: wasm trap: out of bounds memory access
       wasm backtrace:
           0:   0x44 - <unknown>!<wasm function 0>

vs

$ cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true
zsh: segmentation fault  cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true

The faulting instruction appears to be:

(gdb) disas $pc,$pc+20
Dump of assembler code from 0xfffff7b25038 to 0xfffff7b2504c:
=> 0x0000fffff7b25038:  ld1r    {v1.4s}, [x5]
   0x0000fffff7b2503c:  sshr    v25.16b, v1.16b, #7
   0x0000fffff7b25040:  mov     x4, #0x201                      // #513
   0x0000fffff7b25044:  movk    x4, #0x804, lsl #16
   0x0000fffff7b25048:  movk    x4, #0x2010, lsl #32
End of assembler dump.
(gdb) print/x $x5
$1 = 0xffff5cececec

so I think this may be an alignment issue? I'm not 100% familiar with vector instructions on arm myself.

cc @cfallin


Last updated: Jan 24 2025 at 00:11 UTC