Stream: git-wasmtime

Topic: wasmtime / issue #5147 cranelift: Add Bswap instruction (...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 27 2022 at 19:52):

uweigand commented on issue #5147:

s390x: Bswap not implemented

We do have bswap instructions, and they're already implemented in the backend: for $I32 and $I64, bswap_reg can be used; for $I16, a 32-bit bswap followed by a 16-bit right shift is generally the most efficient solution.

See e.g. the use as part of the bitrev implementation:

(rule (bitrev_bytes $I16 x) (lshr_imm $I32 (bswap_reg $I32 x) 16))
(rule (bitrev_bytes $I32 x) (bswap_reg $I32 x))
(rule (bitrev_bytes $I64 x) (bswap_reg $I64 x))

view this post on Zulip Wasmtime GitHub notifications bot (Oct 27 2022 at 20:36):

afonso360 commented on issue #5147:

This is awesome! Thanks! :tada:

I added this to our fuzzer (Feel free to add that to this PR if you want!) and It's been running for over an hour on aarch64 without any issues, however on x86 it reported an interesting failure.

I've tried to minimize this as much as I could but it sometimes throws weird results.

test interpret
test run
target x86_64

function %a() -> i8, i32, i64 {
block0:
    v5 = iconst.i64 0x9903_5204_d05f_abab
    v6 = bswap v5
    v7 = iconst.i8 0
    v8 = iconst.i32 0
    return v7, v8, v6
}

; run: %a() == [0, 0, -6076657925176032359]

You can run this testcase from the cranelift directory with cargo run -- test ./the-above.clif

Removing the v7 or v8 makes the test pass, which is interesting... but otherwise the test segfaults on my machine, most of the time but not always.

Here is the disassembly, which you can get with cargo run -- compile -D --target x86_64 ./the-above.clif:

Disassembly of 29 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 bb ab ab 5f d0 04 52 03 99
                                movabs  r11, 0x99035204d05fabab
   e:   4c 0f cb                bswap   rbx
  11:   31 c0                   xor     eax, eax
  13:   31 d2                   xor     edx, edx
  15:   4c 89 1f                mov     qword ptr [rdi], r11
  18:   48 89 ec                mov     rsp, rbp
  1b:   5d                      pop     rbp
  1c:   c3                      ret

<details>

<summary>Here's a more reliable example that always fails with a wrong result, but never segfaults.</summary>

test interpret
test run
target x86_64

function %a(f32, f64, i32, i32, f64) -> i8, i32, i64 {
block0(v0: f32, v1: f64, v2: i32, v3: i32, v4: f64):
    v5 = iconst.i64 0x9903_5204_d05f_abab
    v6 = bswap v5
    v7 = iconst.i8 0
    v8 = iconst.i32 0
    return v7, v8, v6
}

; run: %a(0.0, 0.0, 0, 0, 0.0) == [0, 0, -6076657925176032359]

Here's the disassembly of the above function

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   49 89 d1                mov     r9, rdx
   7:   49 b8 ab ab 5f d0 04 52 03 99
                                movabs  r8, 0x99035204d05fabab
  11:   4c 0f c8                bswap   rax
  14:   31 c0                   xor     eax, eax
  16:   31 d2                   xor     edx, edx
  18:   4d 89 01                mov     qword ptr [r9], r8
  1b:   48 89 ec                mov     rsp, rbp
  1e:   5d                      pop     rbp
  1f:   c3                      ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Oct 27 2022 at 20:50):

11evan commented on issue #5147:

Thanks for the feedback everyone! I'll work on addressing everything, including adding runtests and investigating the fuzzer failure

view this post on Zulip Wasmtime GitHub notifications bot (Oct 28 2022 at 02:20):

11evan commented on issue #5147:

The x86_64 issue the fuzzer found was an incorrect REX encoding - I misunderstood the manual and was using REX.R instead of REX.B when encoding bswap access to r8-r15. Fixed locally and added those cases to the runtests, they pass now.

I'll have an update out sometime in the next couple days once I address the other feedback and add s380x support

view this post on Zulip Wasmtime GitHub notifications bot (Oct 28 2022 at 02:21):

11evan edited a comment on issue #5147:

The x86_64 issue the fuzzer found was an incorrect REX encoding - I misunderstood the manual and was using REX.R instead of REX.B when encoding bswap access to r8-r15. Fixed locally and added those cases to the runtests, they pass now.

I'll have an update out sometime in the next couple days once I address the other feedback and add s390x support

view this post on Zulip Wasmtime GitHub notifications bot (Oct 28 2022 at 23:36):

11evan commented on issue #5147:

PR updated:

view this post on Zulip Wasmtime GitHub notifications bot (Oct 29 2022 at 14:30):

uweigand commented on issue #5147:

s390x parts LGTM, thanks for adding this. (Seeing the discussion above, if consensus is to add I128 support, you can likewise copy the 128-bit byteswap implementation via vector permute from bitrev.)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2022 at 16:37):

11evan commented on issue #5147:

PR Updated - legalized 128-bit bswap, added runtest for interpreter. Didn't actually implement 128-bit swap in any of the backends yet.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2022 at 18:24):

11evan commented on issue #5147:

Looks like the CI run failed at verify-publish with:

Run cd /opt/hostedtoolcache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
[...]
100   212  100   212    0     0     40      0  0:00:05  0:00:05 --:--:--    54

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Error: Process completed with exit code 2.

It looks like an issue with the tooling rather than an issue with my commit - what should I do here, does it just need a retry?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2022 at 19:30):

jameysharp commented on issue #5147:

Apparently it did just need a retry. Merged now! :tada:


Last updated: Jan 24 2025 at 00:11 UTC