Stream: git-wasmtime

Topic: wasmtime / issue #5470 Cranelift: `srem.i8` giving wrong ...


view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2022 at 16:10):

afonso360 opened issue #5470:

:wave: Hey,

The fuzzer pointed this out when I added srem. This essentially does srem -1, -1, but we get the lhs with a bmask.

Doing a srem with both sides coming from an argument gives the correct result. I think we might be missing some masking on srem. The output of bmask seems correct. (i.e. 255).

This test passes on aarch64, s390x and riscv64.

.clif Test Case

test interpret
test run
target aarch64
target s390x
target riscv64
target x86_64

function %a(i64, i8) -> i8 {
block0(v0: i64, v1: i8):
    v2 = bmask.i8 v0
    v3 = srem v2, v1
    return v3
}
; run: %a(4352, -1) == 0

Steps to Reproduce

Expected Results

The test to pass.

Actual Results

 ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run

Caused by:
    Failed test: run: %a(4352, -1) == 0, actual: -1
1 tests
Error: 1 failure

Versions and Environment

Cranelift version or commit: main
Operating system: Linux
Architecture: x86_64

Extra Info

<details>
<summary> Disassembly: cargo run -- compile --target x86_64 -D ./test.clif</summary>

Disassembly of 63 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 89 fb                mov     r11, rdi
   7:   49 f7 db                neg     r11
   a:   48 89 f8                mov     rax, rdi
   d:   19 f8                   sbb     eax, edi
   f:   31 d2                   xor     edx, edx
  11:   40 80 fe 00             cmp     sil, 0
  15:   0f 85 02 00 00 00       jne     0x1d
  1b:   0f 0b                   ud2
  1d:   40 80 fe ff             cmp     sil, 0xff
  21:   0f 85 0a 00 00 00       jne     0x31
  27:   ba 00 00 00 00          mov     edx, 0
  2c:   e9 05 00 00 00          jmp     0x36
  31:   66 98                   cbw
  33:   40 f6 fe                idiv    sil
  36:   48 c1 e8 08             shr     rax, 8
  3a:   48 89 ec                mov     rsp, rbp
  3d:   5d                      pop     rbp
  3e:   c3                      ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2022 at 16:10):

afonso360 labeled issue #5470:

:wave: Hey,

The fuzzer pointed this out when I added srem. This essentially does srem -1, -1, but we get the lhs with a bmask.

Doing a srem with both sides coming from an argument gives the correct result. I think we might be missing some masking on srem. The output of bmask seems correct. (i.e. 255).

This test passes on aarch64, s390x and riscv64.

.clif Test Case

test interpret
test run
target aarch64
target s390x
target riscv64
target x86_64

function %a(i64, i8) -> i8 {
block0(v0: i64, v1: i8):
    v2 = bmask.i8 v0
    v3 = srem v2, v1
    return v3
}
; run: %a(4352, -1) == 0

Steps to Reproduce

Expected Results

The test to pass.

Actual Results

 ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run

Caused by:
    Failed test: run: %a(4352, -1) == 0, actual: -1
1 tests
Error: 1 failure

Versions and Environment

Cranelift version or commit: main
Operating system: Linux
Architecture: x86_64

Extra Info

<details>
<summary> Disassembly: cargo run -- compile --target x86_64 -D ./test.clif</summary>

Disassembly of 63 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 89 fb                mov     r11, rdi
   7:   49 f7 db                neg     r11
   a:   48 89 f8                mov     rax, rdi
   d:   19 f8                   sbb     eax, edi
   f:   31 d2                   xor     edx, edx
  11:   40 80 fe 00             cmp     sil, 0
  15:   0f 85 02 00 00 00       jne     0x1d
  1b:   0f 0b                   ud2
  1d:   40 80 fe ff             cmp     sil, 0xff
  21:   0f 85 0a 00 00 00       jne     0x31
  27:   ba 00 00 00 00          mov     edx, 0
  2c:   e9 05 00 00 00          jmp     0x36
  31:   66 98                   cbw
  33:   40 f6 fe                idiv    sil
  36:   48 c1 e8 08             shr     rax, 8
  3a:   48 89 ec                mov     rsp, rbp
  3d:   5d                      pop     rbp
  3e:   c3                      ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2022 at 16:10):

afonso360 labeled issue #5470:

:wave: Hey,

The fuzzer pointed this out when I added srem. This essentially does srem -1, -1, but we get the lhs with a bmask.

Doing a srem with both sides coming from an argument gives the correct result. I think we might be missing some masking on srem. The output of bmask seems correct. (i.e. 255).

This test passes on aarch64, s390x and riscv64.

.clif Test Case

test interpret
test run
target aarch64
target s390x
target riscv64
target x86_64

function %a(i64, i8) -> i8 {
block0(v0: i64, v1: i8):
    v2 = bmask.i8 v0
    v3 = srem v2, v1
    return v3
}
; run: %a(4352, -1) == 0

Steps to Reproduce

Expected Results

The test to pass.

Actual Results

 ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run

Caused by:
    Failed test: run: %a(4352, -1) == 0, actual: -1
1 tests
Error: 1 failure

Versions and Environment

Cranelift version or commit: main
Operating system: Linux
Architecture: x86_64

Extra Info

<details>
<summary> Disassembly: cargo run -- compile --target x86_64 -D ./test.clif</summary>

Disassembly of 63 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 89 fb                mov     r11, rdi
   7:   49 f7 db                neg     r11
   a:   48 89 f8                mov     rax, rdi
   d:   19 f8                   sbb     eax, edi
   f:   31 d2                   xor     edx, edx
  11:   40 80 fe 00             cmp     sil, 0
  15:   0f 85 02 00 00 00       jne     0x1d
  1b:   0f 0b                   ud2
  1d:   40 80 fe ff             cmp     sil, 0xff
  21:   0f 85 0a 00 00 00       jne     0x31
  27:   ba 00 00 00 00          mov     edx, 0
  2c:   e9 05 00 00 00          jmp     0x36
  31:   66 98                   cbw
  33:   40 f6 fe                idiv    sil
  36:   48 c1 e8 08             shr     rax, 8
  3a:   48 89 ec                mov     rsp, rbp
  3d:   5d                      pop     rbp
  3e:   c3                      ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Dec 19 2022 at 20:46):

jameysharp commented on issue #5470:

@cfallin has a guess about this bug: Maybe our x86 lowering of srem.i8 is relying on the unspecified high bits in the register. It might need to sign-extend somewhere, or use an 8-bit version of the instruction, if there is one.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 23 2022 at 17:03):

avanhatt commented on issue #5470:

@mpardesh and I were looking to see if we could tackle this from the verification side, but we think the relevant logic is all within external constructors' Rust code (rather than the ISLE itself).

From staring at the disassembly for a bit, though, we are wondering if this line is the problem:

27:   ba 00 00 00 00          mov     edx, 0

Which is being generated by [this line][link] in emit.rs in the specific case that the divisor is -1 and we are doing a remainder:

let inst = Inst::imm(OperandSize::Size64, 0, Writable::from_reg(regs::rdx()));

Should 27 instead write to eax instead of edx in this case?

In particular, next instruction after 27 is an unconditional jump to the following, which uses rax and seems to leave the results of edx unused (granted, it's been a while since I started at x86 disassembly, so we could totally be missing an implicit register use).

  36:   48 c1 e8 08             shr     rax, 8
  3a:   48 89 ec                mov     rsp, rbp
  3d:   5d                      pop     rbp
  3e:   c3                      ret

[link]: https://github.com/bytecodealliance/wasmtime/blob/ff995d910b64ea7937ccfd982dd431b1487a1ec8/cranelift/codegen/src/isa/x64/inst/emit.rs#L549

view this post on Zulip Wasmtime GitHub notifications bot (Dec 30 2022 at 01:10):

mpardesh commented on issue #5470:

While we were staring at idiv, we were wondering if division by 8 bit values is handled correctly. If the divisor is 8 bits long, idiv should not modify rdx at all and should only place results in ax. We dug into the code a bit but not deeply enough to really understand how 8 bit division works.

(this is probably unrelated to the bug, but we thought it might be worth checking anyway)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2023 at 22:18):

jameysharp closed issue #5470:

:wave: Hey,

The fuzzer pointed this out when I added srem. This essentially does srem -1, -1, but we get the lhs with a bmask.

Doing a srem with both sides coming from an argument gives the correct result. I think we might be missing some masking on srem. The output of bmask seems correct. (i.e. 255).

This test passes on aarch64, s390x and riscv64.

.clif Test Case

test interpret
test run
target aarch64
target s390x
target riscv64
target x86_64

function %a(i64, i8) -> i8 {
block0(v0: i64, v1: i8):
    v2 = bmask.i8 v0
    v3 = srem v2, v1
    return v3
}
; run: %a(4352, -1) == 0

Steps to Reproduce

Expected Results

The test to pass.

Actual Results

 ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run

Caused by:
    Failed test: run: %a(4352, -1) == 0, actual: -1
1 tests
Error: 1 failure

Versions and Environment

Cranelift version or commit: main
Operating system: Linux
Architecture: x86_64

Extra Info

<details>
<summary> Disassembly: cargo run -- compile --target x86_64 -D ./test.clif</summary>

Disassembly of 63 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 89 fb                mov     r11, rdi
   7:   49 f7 db                neg     r11
   a:   48 89 f8                mov     rax, rdi
   d:   19 f8                   sbb     eax, edi
   f:   31 d2                   xor     edx, edx
  11:   40 80 fe 00             cmp     sil, 0
  15:   0f 85 02 00 00 00       jne     0x1d
  1b:   0f 0b                   ud2
  1d:   40 80 fe ff             cmp     sil, 0xff
  21:   0f 85 0a 00 00 00       jne     0x31
  27:   ba 00 00 00 00          mov     edx, 0
  2c:   e9 05 00 00 00          jmp     0x36
  31:   66 98                   cbw
  33:   40 f6 fe                idiv    sil
  36:   48 c1 e8 08             shr     rax, 8
  3a:   48 89 ec                mov     rsp, rbp
  3d:   5d                      pop     rbp
  3e:   c3                      ret

</details>


Last updated: Dec 23 2024 at 12:05 UTC