Stream: git-wasmtime

Topic: wasmtime / issue #9980 Divergent behavior between regallo...


Wasmtime GitHub notifications bot (Jan 10 2025 at 17:31):

alexcrichton added the fuzz-bug label to Issue #9980.

Wasmtime GitHub notifications bot (Jan 10 2025 at 17:31):

alexcrichton opened issue #9980:

Found via oss-fuzz in https://issues.oss-fuzz.com/issues/387110342. I've minimized this to:

(module
  (func (export "")
    call 1
    f64.const 0
    f64.const 0
    f64.ne
    if
    end
    call 0
  )

  (func
    f64.const nan
    f64.const 0
    f64.eq
    if
      loop
      end
    end
  )
)

which behaves differently depending on the register allocation algorithm:

$ wasmtime run --invoke '' -W fuel=$((1<<62)) -C cranelift-regalloc-algorithm=backtracking foo.wat
Error: failed to run main module `foo.wat`

Caused by:
    0: failed to invoke ``
    1: error while executing at wasm backtrace:
           0: <unknown>!<wasm function 1>
           1:   0x1e - <unknown>!<wasm function 0>
           2:   0x36 - <unknown>!<wasm function 0>
...
         16365:   0x36 - <unknown>!<wasm function 0>
         16366:   0x36 - <unknown>!<wasm function 0>
    2: wasm trap: call stack exhausted

That's expected, since this recurses infinitely. With single_pass, though:

$ wasmtime run --invoke '' -W fuel=$((1<<62)) -C cranelift-regalloc-algorithm=single_pass foo.wat
Error: failed to run main module `foo.wat`

Caused by:
    0: failed to invoke ``
    1: error while executing at wasm backtrace:
           0:   0x1d - <unknown>!<wasm function 0>
           1:   0x36 - <unknown>!<wasm function 0>
    2: wasm trap: all fuel consumed by WebAssembly
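
To make the expected outcome concrete, here is a minimal Rust model of the minimized module. It is only a sketch under stated assumptions: `MAX_DEPTH` and `COST_PER_CALL` are made-up illustration values (not Wasmtime's real stack limit or fuel costs), and the recursion is modeled as a loop so the model itself cannot overflow the native stack. Under IEEE 754 both comparisons are false, so neither `if` body runs and function 0 always reaches `call 0`.

```rust
/// Possible outcomes of the simplified model.
#[derive(Debug, PartialEq)]
enum Trap {
    StackExhausted,
    OutOfFuel,
}

/// Hypothetical frame limit, loosely inspired by the backtrace length above.
const MAX_DEPTH: u64 = 16_367;
/// Hypothetical fuel cost of one invocation of function 0 (including `call 1`).
const COST_PER_CALL: u64 = 8;

/// Wasm `f64.eq`: false whenever either operand is NaN.
fn f64_eq(a: f64, b: f64) -> bool {
    a == b
}

/// Wasm `f64.ne`: true whenever either operand is NaN.
fn f64_ne(a: f64, b: f64) -> bool {
    a != b
}

/// Subtract `cost` from the remaining fuel, trapping if there is not enough.
fn charge(fuel: &mut u64, cost: u64) -> Result<(), Trap> {
    match fuel.checked_sub(cost) {
        Some(rest) => {
            *fuel = rest;
            Ok(())
        }
        None => Err(Trap::OutOfFuel),
    }
}

/// Model of invoking the exported function with `fuel` units available.
fn run(mut fuel: u64) -> Trap {
    let mut depth = 0u64;
    loop {
        if depth == MAX_DEPTH {
            return Trap::StackExhausted;
        }
        if let Err(trap) = charge(&mut fuel, COST_PER_CALL) {
            return trap;
        }
        // `call 1`: `nan == 0` is false, so the empty loop is skipped.
        let _func1_if_taken = f64_eq(f64::NAN, 0.0);
        // Back in function 0: `0 != 0` is false, so the empty `if` is skipped.
        let _func0_if_taken = f64_ne(0.0, 0.0);
        // `call 0`: unconditional recursion.
        depth += 1;
    }
}
```

In this model `run(1 << 62)` returns `Trap::StackExhausted`, which corresponds to the backtracking result; running out of fuel only happens when the starting fuel is tiny, for example `run(10)`.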

Wasmtime GitHub notifications bot (Jan 10 2025 at 17:57):

cfallin commented on issue #9980:

For context: Alex and I briefly discussed this offline and concluded that this is not a security issue: backtracking (the production algorithm) behaves correctly, and single-pass is off by default and not part of any support tier (or perhaps tier 3 by default).

Wasmtime GitHub notifications bot (Jan 10 2025 at 22:58):

cfallin commented on issue #9980:

Additional context: it seems this only reproduces on x86-64; I cannot reproduce it on aarch64 (where single-pass also correctly recurses infinitely). It does reproduce on macOS/x86-64 (via Rosetta 2), though, in addition to Linux.

Wasmtime GitHub notifications bot (Jan 12 2025 at 16:14):

primoly commented on issue #9980:

The strange thing is that it seems to be caused by the NaN. If you replace that NaN with any other value (for example, -12.345), single-pass recurses infinitely as well. If you look at the disassembly of both variants, nothing differs except the NaN constant in r15d. It doesn’t even change the address of any instruction. I have no idea why this would make a difference in fuel consumption. Mysterious.

<details><summary>single-pass and fuel with NaN</summary>

Disassembly of function <function[0]>:

00000000    55                                push rbp
00000001    48 89 e5                          mov rbp, rsp
00000004    4c 8b 57 08                       mov r10, qword ptr [rdi + 8]
00000008    4d 8b 12                          mov r10, qword ptr [r10]
0000000b    49 81 c2 90 00 00 00              add r10, 0x90
00000012    49 39 e2                          cmp r10, rsp
00000015    0f 87 4a 01 00 00                 ja 0x165
0000001b    48 81 ec 80 00 00 00              sub rsp, 0x80
00000022    48 89 5c 24 50                    mov qword ptr [rsp + 0x50], rbx
00000027    4c 89 64 24 58                    mov qword ptr [rsp + 0x58], r12
0000002c    4c 89 6c 24 60                    mov qword ptr [rsp + 0x60], r13
00000031    4c 89 74 24 68                    mov qword ptr [rsp + 0x68], r14
00000036    4c 89 7c 24 70                    mov qword ptr [rsp + 0x70], r15
0000003b    48 89 7c 24 18                    mov qword ptr [rsp + 0x18], rdi
00000040    48 8b 44 24 18                    mov rax, qword ptr [rsp + 0x18]
00000045    48 8b 50 08                       mov rdx, qword ptr [rax + 8]
00000049    48 89 14 24                       mov qword ptr [rsp], rdx
0000004d    48 8b 0c 24                       mov rcx, qword ptr [rsp]
00000051    48 8b 71 08                       mov rsi, qword ptr [rcx + 8]
00000055    48 89 74 24 30                    mov qword ptr [rsp + 0x30], rsi
0000005a    4c 8b 44 24 30                    mov r8, qword ptr [rsp + 0x30]
0000005f    4d 8d 48 01                       lea r9, [r8 + 1]
00000063    4c 89 4c 24 40                    mov qword ptr [rsp + 0x40], r9
00000068    4c 8b 54 24 40                    mov r10, qword ptr [rsp + 0x40]
0000006d    48 8b 5c 24 40                    mov rbx, qword ptr [rsp + 0x40]
00000072    4c 85 d3                          test rbx, r10
00000075    48 8b 5c 24 40                    mov rbx, qword ptr [rsp + 0x40]
0000007a    48 89 5c 24 38                    mov qword ptr [rsp + 0x38], rbx
0000007f    0f 8d 0f 00 00 00                 jge 0x94
00000085    48 8b 5c 24 38                    mov rbx, qword ptr [rsp + 0x38]
0000008a    48 89 5c 24 20                    mov qword ptr [rsp + 0x20], rbx
0000008f    e9 33 00 00 00                    jmp 0xc7
00000094    4c 8b 5c 24 30                    mov r11, qword ptr [rsp + 0x30]
00000099    4d 8d 6b 01                       lea r13, [r11 + 1]
0000009d    4c 8b 24 24                       mov r12, qword ptr [rsp]
000000a1    4d 89 6c 24 08                    mov qword ptr [r12 + 8], r13
000000a6    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
000000ab    e8 7e 03 00 00                    call 0x42e
000000b0    4c 8b 34 24                       mov r14, qword ptr [rsp]
000000b4    4d 8b 7e 08                       mov r15, qword ptr [r14 + 8]
000000b8    4c 89 7c 24 28                    mov qword ptr [rsp + 0x28], r15
000000bd    4c 8b 7c 24 28                    mov r15, qword ptr [rsp + 0x28]
000000c2    4c 89 7c 24 20                    mov qword ptr [rsp + 0x20], r15
000000c7    48 8b 44 24 20                    mov rax, qword ptr [rsp + 0x20]
000000cc    48 8d 50 01                       lea rdx, [rax + 1]
000000d0    48 8b 0c 24                       mov rcx, qword ptr [rsp]
000000d4    48 89 51 08                       mov qword ptr [rcx + 8], rdx
000000d8    48 8b 74 24 18                    mov rsi, qword ptr [rsp + 0x18]
000000dd    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
000000e2    e8 99 00 00 00                    call 0x180
000000e7    4c 8b 04 24                       mov r8, qword ptr [rsp]
000000eb    4d 8b 48 08                       mov r9, qword ptr [r8 + 8]
000000ef    4c 89 4c 24 10                    mov qword ptr [rsp + 0x10], r9
000000f4    c4 41 09 57 fe                    vxorpd xmm15, xmm14, xmm14
000000f9    c5 79 2e 3d 6f 00 00 00           vucomisd xmm15, qword ptr [rip + 0x6f]
00000101    0f 8a 00 00 00 00                 jp 0x107
00000107    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
0000010c    4c 8b 54 24 10                    mov r10, qword ptr [rsp + 0x10]
00000111    49 8d 5a 05                       lea rbx, [r10 + 5]
00000115    4c 8b 1c 24                       mov r11, qword ptr [rsp]
00000119    49 89 5b 08                       mov qword ptr [r11 + 8], rbx
0000011d    48 89 fe                          mov rsi, rdi
00000120    e8 db fe ff ff                    call 0
00000125    4c 8b 24 24                       mov r12, qword ptr [rsp]
00000129    4d 8b 6c 24 08                    mov r13, qword ptr [r12 + 8]
0000012e    4c 89 6c 24 08                    mov qword ptr [rsp + 8], r13
00000133    4c 8b 7c 24 08                    mov r15, qword ptr [rsp + 8]
00000138    4c 8b 34 24                       mov r14, qword ptr [rsp]
0000013c    4d 89 7e 08                       mov qword ptr [r14 + 8], r15
00000140    48 8b 5c 24 50                    mov rbx, qword ptr [rsp + 0x50]
00000145    4c 8b 64 24 58                    mov r12, qword ptr [rsp + 0x58]
0000014a    4c 8b 6c 24 60                    mov r13, qword ptr [rsp + 0x60]
0000014f    4c 8b 74 24 68                    mov r14, qword ptr [rsp + 0x68]
00000154    4c 8b 7c 24 70                    mov r15, qword ptr [rsp + 0x70]
00000159    48 81 c4 80 00 00 00              add rsp, 0x80
00000160    48 89 ec                          mov rsp, rbp
00000163    5d                                pop rbp
00000164    c3                                ret
00000165    0f 0b                             ud2
00000167    00 00                             add byte ptr [rax], al
00000169    00 00                             add byte ptr [rax], al
0000016b    00 00                             add byte ptr [rax], al
0000016d    00 00                             add byte ptr [rax], al
0000016f    00 00                             add byte ptr [rax], al
00000171    00 00                             add byte ptr [rax], al
00000173    00 00                             add byte ptr [rax], al
00000175    00 00                             add byte ptr [rax], al
00000177    00 00                             add byte ptr [rax], al
00000179    00 00                             add byte ptr [rax], al
0000017b    00 00                             add byte ptr [rax], al
0000017d    00 00                             add byte ptr [rax], al
0000017f    00                                .byte 0x00

Disassembly of function <function[1]>:

00000180    55                                push rbp
00000181    48 89 e5                          mov rbp, rsp
00000184    4c 8b 57 08                       mov r10, qword ptr [rdi + 8]
00000188    4d 8b 12                          mov r10, qword ptr [r10]
0000018b    49 81 c2 b0 00 00 00              add r10, 0xb0
00000192    49 39 e2                          cmp r10, rsp
00000195    0f 87 ac 01 00 00                 ja 0x347
0000019b    48 81 ec a0 00 00 00              sub rsp, 0xa0
000001a2    48 89 5c 24 70                    mov qword ptr [rsp + 0x70], rbx
000001a7    4c 89 64 24 78                    mov qword ptr [rsp + 0x78], r12
000001ac    4c 89 ac 24 80 00 00 00           mov qword ptr [rsp + 0x80], r13
000001b4    4c 89 b4 24 88 00 00 00           mov qword ptr [rsp + 0x88], r14
000001bc    4c 89 bc 24 90 00 00 00           mov qword ptr [rsp + 0x90], r15
000001c4    48 89 7c 24 28                    mov qword ptr [rsp + 0x28], rdi
000001c9    4c 8b 7c 24 28                    mov r15, qword ptr [rsp + 0x28]
000001ce    49 8b 77 08                       mov rsi, qword ptr [r15 + 8]
000001d2    48 89 34 24                       mov qword ptr [rsp], rsi
000001d6    48 8b 04 24                       mov rax, qword ptr [rsp]
000001da    48 8b 48 08                       mov rcx, qword ptr [rax + 8]
000001de    48 89 4c 24 58                    mov qword ptr [rsp + 0x58], rcx
000001e3    48 8b 54 24 58                    mov rdx, qword ptr [rsp + 0x58]
000001e8    4c 8d 42 01                       lea r8, [rdx + 1]
000001ec    4c 89 44 24 68                    mov qword ptr [rsp + 0x68], r8
000001f1    4c 8b 4c 24 68                    mov r9, qword ptr [rsp + 0x68]
000001f6    4c 8b 54 24 68                    mov r10, qword ptr [rsp + 0x68]
000001fb    4d 85 ca                          test r10, r9
000001fe    4c 8b 54 24 68                    mov r10, qword ptr [rsp + 0x68]
00000203    4c 89 54 24 60                    mov qword ptr [rsp + 0x60], r10
00000208    0f 8d 0f 00 00 00                 jge 0x21d
0000020e    4c 8b 54 24 60                    mov r10, qword ptr [rsp + 0x60]
00000213    4c 89 54 24 20                    mov qword ptr [rsp + 0x20], r10
00000218    e9 32 00 00 00                    jmp 0x24f
0000021d    48 8b 5c 24 58                    mov rbx, qword ptr [rsp + 0x58]
00000222    4c 8d 63 01                       lea r12, [rbx + 1]
00000226    4c 8b 1c 24                       mov r11, qword ptr [rsp]
0000022a    4d 89 63 08                       mov qword ptr [r11 + 8], r12
0000022e    48 8b 7c 24 28                    mov rdi, qword ptr [rsp
[message truncated]
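
For readers following the disassembly: the vucomisd at 0xf9 followed by jp at 0x101 in function 0 appears to be the compare for f64.ne (the jp target is simply the next instruction, presumably because the if body is empty). The Rust sketch below models the flag results documented for ucomisd/vucomisd and how eq/ne can be derived from them. The `Flags` struct and the helper names are hypothetical, and nothing here is a claim about the cause of the single-pass divergence.

```rust
/// The three status flags written by `ucomisd` / `vucomisd`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Flags {
    zf: bool, // zero flag
    pf: bool, // parity flag, set only for an unordered (NaN) compare
    cf: bool, // carry flag
}

/// Flag results of comparing `a` with `b`, following the documented table:
/// unordered sets all three flags, equal sets only ZF, less-than sets only CF.
fn ucomisd(a: f64, b: f64) -> Flags {
    if a.is_nan() || b.is_nan() {
        Flags { zf: true, pf: true, cf: true }
    } else if a > b {
        Flags { zf: false, pf: false, cf: false }
    } else if a < b {
        Flags { zf: false, pf: false, cf: true }
    } else {
        Flags { zf: true, pf: false, cf: false }
    }
}

/// `f64.ne` as a branch pair: taken if `jp` (unordered) or `jne` (ZF clear).
fn ne_via_flags(a: f64, b: f64) -> bool {
    let f = ucomisd(a, b);
    f.pf || !f.zf
}

/// `f64.eq`: the operands must be ordered (PF clear) and equal (ZF set).
fn eq_via_flags(a: f64, b: f64) -> bool {
    let f = ucomisd(a, b);
    !f.pf && f.zf
}
```

A NaN operand makes ZF look like "equal", which is why a single je or jne is not enough and the parity flag has to be checked as well. For the two comparisons in the minimized module, `ne_via_flags(0.0, 0.0)` and `eq_via_flags(f64::NAN, 0.0)` are both false.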

Wasmtime GitHub notifications bot (Jan 12 2025 at 16:25):

primoly edited a comment on issue #9980:

The strange thing is that it seems to be caused by the NaN. If you replace that NaN with any other value (for example, -12.345), single-pass recurses infinitely as well. If you look at the disassembly of both variants, nothing differs except the NaN constant in r15. It doesn’t even change the address of any instruction. I have no idea why this would make a difference in fuel consumption. Mysterious.

<details><summary>single-pass and fuel with NaN</summary>

Disassembly of function <function[0]>:

00000000    55                                push rbp
00000001    48 89 e5                          mov rbp, rsp
00000004    4c 8b 57 08                       mov r10, qword ptr [rdi + 8]
00000008    4d 8b 12                          mov r10, qword ptr [r10]
0000000b    49 81 c2 90 00 00 00              add r10, 0x90
00000012    49 39 e2                          cmp r10, rsp
00000015    0f 87 4a 01 00 00                 ja 0x165
0000001b    48 81 ec 80 00 00 00              sub rsp, 0x80
00000022    48 89 5c 24 50                    mov qword ptr [rsp + 0x50], rbx
00000027    4c 89 64 24 58                    mov qword ptr [rsp + 0x58], r12
0000002c    4c 89 6c 24 60                    mov qword ptr [rsp + 0x60], r13
00000031    4c 89 74 24 68                    mov qword ptr [rsp + 0x68], r14
00000036    4c 89 7c 24 70                    mov qword ptr [rsp + 0x70], r15
0000003b    48 89 7c 24 18                    mov qword ptr [rsp + 0x18], rdi
00000040    48 8b 44 24 18                    mov rax, qword ptr [rsp + 0x18]
00000045    48 8b 50 08                       mov rdx, qword ptr [rax + 8]
00000049    48 89 14 24                       mov qword ptr [rsp], rdx
0000004d    48 8b 0c 24                       mov rcx, qword ptr [rsp]
00000051    48 8b 71 08                       mov rsi, qword ptr [rcx + 8]
00000055    48 89 74 24 30                    mov qword ptr [rsp + 0x30], rsi
0000005a    4c 8b 44 24 30                    mov r8, qword ptr [rsp + 0x30]
0000005f    4d 8d 48 01                       lea r9, [r8 + 1]
00000063    4c 89 4c 24 40                    mov qword ptr [rsp + 0x40], r9
00000068    4c 8b 54 24 40                    mov r10, qword ptr [rsp + 0x40]
0000006d    48 8b 5c 24 40                    mov rbx, qword ptr [rsp + 0x40]
00000072    4c 85 d3                          test rbx, r10
00000075    48 8b 5c 24 40                    mov rbx, qword ptr [rsp + 0x40]
0000007a    48 89 5c 24 38                    mov qword ptr [rsp + 0x38], rbx
0000007f    0f 8d 0f 00 00 00                 jge 0x94
00000085    48 8b 5c 24 38                    mov rbx, qword ptr [rsp + 0x38]
0000008a    48 89 5c 24 20                    mov qword ptr [rsp + 0x20], rbx
0000008f    e9 33 00 00 00                    jmp 0xc7
00000094    4c 8b 5c 24 30                    mov r11, qword ptr [rsp + 0x30]
00000099    4d 8d 6b 01                       lea r13, [r11 + 1]
0000009d    4c 8b 24 24                       mov r12, qword ptr [rsp]
000000a1    4d 89 6c 24 08                    mov qword ptr [r12 + 8], r13
000000a6    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
000000ab    e8 7e 03 00 00                    call 0x42e
000000b0    4c 8b 34 24                       mov r14, qword ptr [rsp]
000000b4    4d 8b 7e 08                       mov r15, qword ptr [r14 + 8]
000000b8    4c 89 7c 24 28                    mov qword ptr [rsp + 0x28], r15
000000bd    4c 8b 7c 24 28                    mov r15, qword ptr [rsp + 0x28]
000000c2    4c 89 7c 24 20                    mov qword ptr [rsp + 0x20], r15
000000c7    48 8b 44 24 20                    mov rax, qword ptr [rsp + 0x20]
000000cc    48 8d 50 01                       lea rdx, [rax + 1]
000000d0    48 8b 0c 24                       mov rcx, qword ptr [rsp]
000000d4    48 89 51 08                       mov qword ptr [rcx + 8], rdx
000000d8    48 8b 74 24 18                    mov rsi, qword ptr [rsp + 0x18]
000000dd    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
000000e2    e8 99 00 00 00                    call 0x180
000000e7    4c 8b 04 24                       mov r8, qword ptr [rsp]
000000eb    4d 8b 48 08                       mov r9, qword ptr [r8 + 8]
000000ef    4c 89 4c 24 10                    mov qword ptr [rsp + 0x10], r9
000000f4    c4 41 09 57 fe                    vxorpd xmm15, xmm14, xmm14
000000f9    c5 79 2e 3d 6f 00 00 00           vucomisd xmm15, qword ptr [rip + 0x6f]
00000101    0f 8a 00 00 00 00                 jp 0x107
00000107    48 8b 7c 24 18                    mov rdi, qword ptr [rsp + 0x18]
0000010c    4c 8b 54 24 10                    mov r10, qword ptr [rsp + 0x10]
00000111    49 8d 5a 05                       lea rbx, [r10 + 5]
00000115    4c 8b 1c 24                       mov r11, qword ptr [rsp]
00000119    49 89 5b 08                       mov qword ptr [r11 + 8], rbx
0000011d    48 89 fe                          mov rsi, rdi
00000120    e8 db fe ff ff                    call 0
00000125    4c 8b 24 24                       mov r12, qword ptr [rsp]
00000129    4d 8b 6c 24 08                    mov r13, qword ptr [r12 + 8]
0000012e    4c 89 6c 24 08                    mov qword ptr [rsp + 8], r13
00000133    4c 8b 7c 24 08                    mov r15, qword ptr [rsp + 8]
00000138    4c 8b 34 24                       mov r14, qword ptr [rsp]
0000013c    4d 89 7e 08                       mov qword ptr [r14 + 8], r15
00000140    48 8b 5c 24 50                    mov rbx, qword ptr [rsp + 0x50]
00000145    4c 8b 64 24 58                    mov r12, qword ptr [rsp + 0x58]
0000014a    4c 8b 6c 24 60                    mov r13, qword ptr [rsp + 0x60]
0000014f    4c 8b 74 24 68                    mov r14, qword ptr [rsp + 0x68]
00000154    4c 8b 7c 24 70                    mov r15, qword ptr [rsp + 0x70]
00000159    48 81 c4 80 00 00 00              add rsp, 0x80
00000160    48 89 ec                          mov rsp, rbp
00000163    5d                                pop rbp
00000164    c3                                ret
00000165    0f 0b                             ud2
00000167    00 00                             add byte ptr [rax], al
00000169    00 00                             add byte ptr [rax], al
0000016b    00 00                             add byte ptr [rax], al
0000016d    00 00                             add byte ptr [rax], al
0000016f    00 00                             add byte ptr [rax], al
00000171    00 00                             add byte ptr [rax], al
00000173    00 00                             add byte ptr [rax], al
00000175    00 00                             add byte ptr [rax], al
00000177    00 00                             add byte ptr [rax], al
00000179    00 00                             add byte ptr [rax], al
0000017b    00 00                             add byte ptr [rax], al
0000017d    00 00                             add byte ptr [rax], al
0000017f    00                                .byte 0x00

Disassembly of function <function[1]>:

00000180    55                                push rbp
00000181    48 89 e5                          mov rbp, rsp
00000184    4c 8b 57 08                       mov r10, qword ptr [rdi + 8]
00000188    4d 8b 12                          mov r10, qword ptr [r10]
0000018b    49 81 c2 b0 00 00 00              add r10, 0xb0
00000192    49 39 e2                          cmp r10, rsp
00000195    0f 87 b0 01 00 00                 ja 0x34b
0000019b    48 81 ec a0 00 00 00              sub rsp, 0xa0
000001a2    48 89 5c 24 70                    mov qword ptr [rsp + 0x70], rbx
000001a7    4c 89 64 24 78                    mov qword ptr [rsp + 0x78], r12
000001ac    4c 89 ac 24 80 00 00 00           mov qword ptr [rsp + 0x80], r13
000001b4    4c 89 b4 24 88 00 00 00           mov qword ptr [rsp + 0x88], r14
000001bc    4c 89 bc 24 90 00 00 00           mov qword ptr [rsp + 0x90], r15
000001c4    48 89 7c 24 28                    mov qword ptr [rsp + 0x28], rdi
000001c9    4c 8b 7c 24 28                    mov r15, qword ptr [rsp + 0x28]
000001ce    49 8b 77 08                       mov rsi, qword ptr [r15 + 8]
000001d2    48 89 34 24                       mov qword ptr [rsp], rsi
000001d6    48 8b 04 24                       mov rax, qword ptr [rsp]
000001da    48 8b 48 08                       mov rcx, qword ptr [rax + 8]
000001de    48 89 4c 24 58                    mov qword ptr [rsp + 0x58], rcx
000001e3    48 8b 54 24 58                    mov rdx, qword ptr [rsp + 0x58]
000001e8    4c 8d 42 01                       lea r8, [rdx + 1]
000001ec    4c 89 44 24 68                    mov qword ptr [rsp + 0x68], r8
000001f1    4c 8b 4c 24 68                    mov r9, qword ptr [rsp + 0x68]
000001f6    4c 8b 54 24 68                    mov r10, qword ptr [rsp + 0x68]
000001fb    4d 85 ca                          test r10, r9
000001fe    4c 8b 54 24 68                    mov r10, qword ptr [rsp + 0x68]
00000203    4c 89 54 24 60                    mov qword ptr [rsp + 0x60], r10
00000208    0f 8d 0f 00 00 00                 jge 0x21d
0000020e    4c 8b 54 24 60                    mov r10, qword ptr [rsp + 0x60]
00000213    4c 89 54 24 20                    mov qword ptr [rsp + 0x20], r10
00000218    e9 32 00 00 00                    jmp 0x24f
0000021d    48 8b 5c 24 58                    mov rbx, qword ptr [rsp + 0x58]
00000222    4c 8d 63 01                       lea r12, [rbx + 1]
00000226    4c 8b 1c 24                       mov r11, qword ptr [rsp]
0000022a    4d 89 63 08                       mov qword ptr [r11 + 8], r12
0000022e    48 8b 7c 24 28                    mov rdi, qword pt
[message truncated]

Wasmtime GitHub notifications bot (Jan 12 2025 at 20:52):

primoly edited a comment on issue #9980:

The strange thing is that it seems to be caused by the NaN. If you replace that NaN with any other value (for example, -12.345), single-pass recurses infinitely as well. If you look at the disassembly of both variants, nothing differs except the NaN constant in r15 (at address 0000024f). It doesn’t even change the address of any instruction. I have no idea why this would make a difference in fuel consumption. Mysterious.


Wasmtime GitHub notifications bot (Jan 21 2025 at 18:40):

alexcrichton commented on issue #9980:

Another test that came up in @Robbepop's differential fuzzing of wasmi vs wasmtime:

(module
  (type (;0;) (func (result i32)))
  (global (;0;) (mut i32) i32.const 0)
  (export "xxx" (func 0))
  (func (;0;) (type 0) (result i32)
    block (result i32) ;; label = @1
      block (result i32) ;; label = @2
        block (result i32) ;; label = @3
          block (result i32) ;; label = @4
            block (result i32) ;; label = @5
              block (result i32) ;; label = @6
                i32.const 1
                i32.const 1
                f32.convert_i32_s
                f64.const -nan:0xffffffff80000 (;=NaN;)
                f32.demote_f64
                f32.ne
                br_if 3 (;@3;)
                drop
                i32.const 1
              end
              global.get 0
              i32.xor
              global.set 0
              i32.const 1
            end
            global.get 0
            i32.xor
            global.set 0
            i32.const 0
          end
          global.get 0
          i32.xor
          global.set 0
          i32.const 0
        end
        i32.const 1
        i32.xor
        global.set 0
        i32.const 0
      end
      global.get 0
      i32.xor
      global.set 0
      i32.const 1
    end
    global.get 0
    i32.xor
  )
)

Wasmtime GitHub notifications bot (Jan 21 2025 at 20:23):

alexcrichton commented on issue #9980:

For the test case above I used rr to record both the "good" backtracking algorithm and "bad" single_pass algorithm at runtime.

The f32.ne above compiles to jp + jne and the jp is a taken branch. The backtracking algorithm (the "good" execution) looks like this:

(rr) stepi
0x00007f02fad63022 in ?? ()
2: x/5i $pc
=> 0x7f02fad63022:      jp     0x7f02fad63048
   0x7f02fad63028:      jne    0x7f02fad63048
   0x7f02fad6302e:      mov    0x60(%rdi),%r11d
   0x7f02fad63032:      mov    %r11,%rsi
   0x7f02fad63035:      xor    $0x1,%esi
(rr) stepi
0x00007f02fad63048 in ?? ()
2: x/5i $pc
=> 0x7f02fad63048:      mov    %rax,%r8
   0x7f02fad6304b:      xor    $0x1,%r8d
   0x7f02fad6304f:      mov    %r8d,0x60(%rdi)
   0x7f02fad63053:      mov    %r8d,0x60(%rdi)
   0x7f02fad63057:      mov    %rbp,%rsp
(rr) print/x $rax
$2 = 0x1

That is, the jp branch is taken, and we're about to start the xor sequence of the wasm code itself, operating on the 0x1 held in the %rax register.
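
For reference, the two-branch shape follows from how x86 `ucomiss` reports an unordered comparison: when either operand is NaN it sets ZF, PF and CF together, so `jne` alone would miss the NaN case and a `jp` has to come first. A minimal Rust sketch of that flag logic; the `Flags` struct and the function names are illustrative assumptions, not Cranelift APIs:

```rust
/// Flags written by x86 `ucomiss` (an illustrative model, not a Cranelift type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Flags {
    zf: bool,
    pf: bool,
    cf: bool,
}

/// Models `ucomiss a, b`: an unordered result (either operand NaN) sets ZF, PF and CF.
fn ucomiss(a: f32, b: f32) -> Flags {
    if a.is_nan() || b.is_nan() {
        Flags { zf: true, pf: true, cf: true }
    } else if a == b {
        Flags { zf: true, pf: false, cf: false }
    } else if a < b {
        Flags { zf: false, pf: false, cf: true }
    } else {
        Flags { zf: false, pf: false, cf: false }
    }
}

/// `f32.ne` lowered as two branches to the same target: `jp taken; jne taken`.
fn f32_ne_taken(a: f32, b: f32) -> bool {
    let flags = ucomiss(a, b);
    if flags.pf {
        return true; // jp: unordered, so "not equal" holds
    }
    !flags.zf // jne: ordered and different
}
```

With the operands from the test case (1.0 and a NaN), `f32_ne_taken` takes the `jp` arm, which matches the taken branch in the trace.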

The single_pass algorithm (the "bad" execution) looks like this:

0x00007fc8cc983062 in ?? ()
2: x/5i $pc
=> 0x7fc8cc983062:      jp     0x7fc8cc983078
   0x7fc8cc983068:      mov    0x30(%rsp),%rax
   0x7fc8cc98306d:      mov    %rax,0x28(%rsp)
   0x7fc8cc983072:      je     0x7fc8cc983086
   0x7fc8cc983078:      mov    0x28(%rsp),%rax
(rr) stepi
0x00007fc8cc983078 in ?? ()
2: x/5i $pc
=> 0x7fc8cc983078:      mov    0x28(%rsp),%rax
   0x7fc8cc98307d:      mov    %rax,(%rsp)
   0x7fc8cc983081:      jmp    0x7fc8cc9830cf
   0x7fc8cc983086:      mov    0x8(%rsp),%rsi
   0x7fc8cc98308b:      mov    0x60(%rsi),%edi
(rr) stepi
0x00007fc8cc98307d in ?? ()
2: x/5i $pc
=> 0x7fc8cc98307d:      mov    %rax,(%rsp)
   0x7fc8cc983081:      jmp    0x7fc8cc9830cf
   0x7fc8cc983086:      mov    0x8(%rsp),%rsi
   0x7fc8cc98308b:      mov    0x60(%rsi),%edi
   0x7fc8cc98308e:      mov    %rdi,0x20(%rsp)
(rr) print/x $eax
$4 = 0x86d64000
(rr) stepi
0x00007fc8cc983081 in ?? ()
2: x/5i $pc
=> 0x7fc8cc983081:      jmp    0x7fc8cc9830cf
   0x7fc8cc983086:      mov    0x8(%rsp),%rsi
   0x7fc8cc98308b:      mov    0x60(%rsi),%edi
   0x7fc8cc98308e:      mov    %rdi,0x20(%rsp)
   0x7fc8cc983093:      mov    0x20(%rsp),%rdx
(rr) stepi
0x00007fc8cc9830cf in ?? ()
2: x/5i $pc
=> 0x7fc8cc9830cf:      mov    (%rsp),%rbx
   0x7fc8cc9830d3:      xor    $0x1,%ebx
   0x7fc8cc9830d6:      mov    %rbx,0x10(%rsp)
   0x7fc8cc9830db:      mov    0x8(%rsp),%r12
   0x7fc8cc9830e0:      mov    0x10(%rsp),%r13
(rr)
0x00007fc8cc9830d3 in ?? ()
2: x/5i $pc
=> 0x7fc8cc9830d3:      xor    $0x1,%ebx
   0x7fc8cc9830d6:      mov    %rbx,0x10(%rsp)
   0x7fc8cc9830db:      mov    0x8(%rsp),%r12
   0x7fc8cc9830e0:      mov    0x10(%rsp),%r13
   0x7fc8cc9830e5:      mov    %r13d,0x60(%r12)
(rr) print/x $ebx
$6 = 0x86d64000

It again looks like the jp branch was (correctly) taken, but the jump destination moves a stack-based variable through the %rax register (presumably this was elided with backtracking). The value being moved is 0x86d64000 rather than the 0x1 seen in the "good" execution, so when we reach the first xor instruction we're xor-ing into the wrong value.

So something about the phi nodes may be off? @cfallin does this look familiar at all? (or anything I can do to help dig in?)

Wasmtime GitHub notifications bot (Jan 21 2025 at 21:59):

alexcrichton commented on issue #9980:

Ok, I've done a bit of further digging here using the above module (which notably doesn't need fuel, which cleans up the IR slightly).

First the two results I get are:

$ wasmtime compile ./foo.wat -C cranelift-regalloc-algorithm=backtracking && wasmtime run --allow-precompiled --invoke xxx foo.cwasm
warning: using `--invoke` with a function that returns values is experimental and may break in the future
1
$ wasmtime compile ./foo.wat -C cranelift-regalloc-algorithm=single_pass && wasmtime run --allow-precompiled --invoke xxx foo.cwasm
warning: using `--invoke` with a function that returns values is experimental and may break in the future
0

aka 1 is correct and 0 is wrong.
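
To see why 1 is the expected answer, the module can be traced by hand. The sketch below is a straight-line Rust transcription of the wasm function, not an interpreter; the function name `trace_xxx` and the `rhs` parameter (which stands in for the demoted NaN so the not-taken path can be tried too) are assumptions made for illustration:

```rust
/// Hand trace of the exported function `xxx` from the module above.
/// `rhs` stands in for the demoted NaN so that both `br_if` outcomes can be tried.
#[allow(unused_assignments)] // early writes to `g` are overwritten, just as in the wasm
fn trace_xxx(rhs: f32) -> i32 {
    let mut g: i32 = 0; // (global 0), initially 0
    let lhs = 1_i32 as f32; // i32.const 1; f32.convert_i32_s

    // f32.ne; br_if 3 (;@3;): on a taken branch the first `i32.const 1`
    // becomes the result of block @3.
    let block3 = if lhs != rhs {
        1
    } else {
        g ^= 1; // drop; i32.const 1; end @6; global.get 0; i32.xor; global.set 0
        g ^= 1; // i32.const 1; end @5; global.get 0; i32.xor; global.set 0
        g ^= 0; // i32.const 0; end @4; global.get 0; i32.xor; global.set 0
        0 // i32.const 0; end @3
    };

    g = block3 ^ 1; // i32.const 1; i32.xor; global.set 0
    g ^= 0; // i32.const 0; end @2; global.get 0; i32.xor; global.set 0
    1 ^ g // i32.const 1; end @1; global.get 0; i32.xor
}
```

Calling `trace_xxx(f64::NAN as f32)` follows the taken `br_if` and yields 1. More generally the result equals whatever value block @3 produces, so a corrupted block result shows up directly in the return value.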

Looking at the objdump of the single_pass version I see:

      62:       0f 8a 10 00 00 00       jp     78 <wasm[0]::function[0]+0x78>
      68:       48 8b 44 24 30          mov    0x30(%rsp),%rax
      6d:       48 89 44 24 28          mov    %rax,0x28(%rsp)
      72:       0f 84 0e 00 00 00       je     86 <wasm[0]::function[0]+0x86>
      78:       48 8b 44 24 28          mov    0x28(%rsp),%rax
      7d:       48 89 04 24             mov    %rax,(%rsp)
      81:       e9 49 00 00 00          jmp    cf <wasm[0]::function[0]+0xcf>
...
      cf:       48 8b 1c 24             mov    (%rsp),%rbx
      d3:       83 f3 01                xor    $0x1,%ebx

Using the rr trace from above I know that the jp is being taken, which means that the problem lies in 0x28(%rsp): it isn't initialized correctly on this path. This stack slot is only written by the instructions at 68 and 6d, just before the je at 72, and the jp skips over them. Those two instructions look like regalloc-inserted moves.
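
The control flow of that listing can be replayed in a few lines of Rust. This is a toy model under stated assumptions: the `Slot` type and `replay` function are hypothetical, the slot at 0x28(%rsp) is treated as `Uninit` on entry (in a real run it holds whatever bytes were there before, such as the 0x86d64000 seen under rr), and the path through 0x86 is not modelled:

```rust
/// Contents of a spill slot in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    Uninit,
    Val(u64),
}

/// Replays offsets 0x62..0x81 and 0xcf of the single_pass listing.
/// Returns what `mov (%rsp),%rbx` at 0xcf would load, or `None` if control
/// goes to 0x86 instead (that path is not modelled here).
fn replay(pf: bool, zf: bool, slot_0x30: u64) -> Option<Slot> {
    let mut slot_0x28 = Slot::Uninit;

    // 62: jp 78
    if !pf {
        // 68, 6d: mov 0x30(%rsp),%rax ; mov %rax,0x28(%rsp)
        slot_0x28 = Slot::Val(slot_0x30);
        // 72: je 86
        if zf {
            return None;
        }
    }

    // 78, 7d: mov 0x28(%rsp),%rax ; mov %rax,(%rsp)
    let slot_0x00 = slot_0x28;
    // 81: jmp cf ; cf: mov (%rsp),%rbx
    Some(slot_0x00)
}
```

With `pf` set (the NaN case) the function returns `Some(Slot::Uninit)`, while the ordered not-equal case (`pf` and `zf` both clear) returns the value copied from 0x30(%rsp).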

The VCode for this function I generated with:

$ wasmtime compile ./foo.wat -C cranelift-regalloc-algorithm=single_pass --emit-clif clif
$ RUST_LOG=trace cargo run compile -D ../clif/wasm_func_0.clif --target x86_64 --set regalloc_algorithm=single_pass
...

The precise regalloc results are slightly different because wasmtime-the-CLI is 29.0.0 while clif-util is what's in-tree, but the general shape is the same. Notably the VCode looks like this:

VCode {
  Entry block: 0
Block 0([]):
    (original IR block: block0)
    (successor: Block 1([VReg(vreg = 232, class = Int)]))
    (successor: Block 2([]))
  Inst 0: args %v192=%rdi
...
  Inst 9: jp      label1
  Inst 10: jnz     label1; j label2
Block 1([VReg(vreg = 225, class = Int)]):
    (successor: Block 6([VReg(vreg = 225, class = Int)]))
  Inst 11: jmp     label6

I think the problem here is that VCode is lying to regalloc2 in that we no longer have "extended basic blocks". Notably branch-on-float comparisons (and I think only float comparisons) are represented with two MInst jumps (9 and 10 above). This means that a non-terminating instruction is allowed to leave the block, notably Inst 9: jp label1.

In this case I think regalloc2 is inserting code before Inst 10 which is not being executed on the jp label1, hence the "reading undefined stack slot" problem.
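
The property being violated can be stated as a small check: only the last instruction of a block may transfer control out of it. A hedged Rust sketch over a toy instruction type follows; the `Inst` enum and `early_exits` are illustrative assumptions and do not correspond to Cranelift's `MInst` or to any regalloc2 API:

```rust
/// A toy instruction stream; these variants are illustrative, not Cranelift's `MInst`.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    Other(&'static str),
    /// Conditional branch with one target that otherwise falls through
    /// to the next instruction (like `jp label1`).
    OneTargetBranch { taken: usize },
    /// Conditional branch naming both successors (like `jnz label1; j label2`).
    TwoTargetBranch { taken: usize, not_taken: usize },
    Jump { target: usize },
}

impl Inst {
    fn can_leave_block(&self) -> bool {
        !matches!(self, Inst::Other(_))
    }
}

/// Returns the indices of instructions that can leave the block
/// but are not its last instruction.
fn early_exits(block: &[Inst]) -> Vec<usize> {
    let last = block.len().saturating_sub(1);
    block
        .iter()
        .enumerate()
        .filter(|(i, inst)| *i != last && inst.can_leave_block())
        .map(|(i, _)| i)
        .collect()
}
```

For a block ending in `jp label1` followed by `jnz label1; j label2`, `early_exits` reports the `jp`: any move an allocator places between the two branches only runs when the `jp` falls through.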

In talking with @cfallin it looks like this problem is scoped to the single_pass algorithm because that can insert code before a basic block terminator while backtracking will never insert code before a terminator. That would explain why (a) we've never seen this with backtracking and (b) why single_pass is fuzzing well with regalloc's checker but we're seeing issues here.

So I think the long and short of it is that we need to refactor how float comparisons work, notably the OrCondition of float comparisons: we can't have separate MInst instructions representing two jumps, because then we're lying to regalloc2 about being in pure basic-block form.
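
One possible shape for such a refactor, sketched in Rust: a single terminator that carries both conditions and only expands into two machine jumps at emission time, so the allocator sees exactly one block-ending instruction. The names `Cc` and `JumpIfEither` are hypothetical, and this is not necessarily how the eventual fix was implemented:

```rust
/// Condition codes used by this sketch (a tiny, illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cc {
    P,
    Nz,
}

impl Cc {
    fn mnemonic(self) -> &'static str {
        match self {
            Cc::P => "jp",
            Cc::Nz => "jnz",
        }
    }
}

/// A single block terminator meaning "go to `taken` if `cc1` OR `cc2` holds,
/// otherwise go to `not_taken`".
#[derive(Debug, Clone, Copy, PartialEq)]
struct JumpIfEither {
    cc1: Cc,
    cc2: Cc,
    taken: usize,
    not_taken: usize,
}

impl JumpIfEither {
    /// Expands into machine jumps only at emission time, i.e. after register
    /// allocation, so nothing can be inserted between the two conditional jumps.
    fn emit(&self) -> Vec<String> {
        vec![
            format!("{} label{}", self.cc1.mnemonic(), self.taken),
            format!("{} label{}", self.cc2.mnemonic(), self.taken),
            format!("jmp label{}", self.not_taken),
        ]
    }
}
```

The emitted sequence is the same `jp`/`jnz`/`jmp` triple as before; the difference is only in what the register allocator is told about block boundaries.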

Wasmtime GitHub notifications bot (Jan 21 2025 at 22:08):

cfallin commented on issue #9980:

while backtracking will never insert code before a terminator

Slightly more precisely (but we're still safe): it will never insert code before a terminator with multiple targets; this is such a case, because the branch sequence is "one-target branch ; two-target branch" (jmp_if + jmp_cond).

Wasmtime GitHub notifications bot (Jan 23 2025 at 05:41):

cfallin commented on issue #9980:

I've confirmed that post-#10086 the above test now shows the same behavior (i.e., "wasm trap: call stack exhausted") under the single-pass allocator as under backtracking. Strictly speaking the underlying bug is still present on s390x until #10087 also merges (so I guess I'll leave this issue open for now) but hopefully we'll see the fuzzbug close as fixed on the next build.

Wasmtime GitHub notifications bot (Jan 23 2025 at 17:45):

cfallin closed issue #9980.


Last updated: Jan 24 2025 at 00:11 UTC