Stream: git-wasmtime

Topic: wasmtime / issue #3248 Cranelift: cls instruction is not ...


Wasmtime GitHub notifications bot (Aug 26 2021 at 12:09):

afonso360 opened issue #3248:

Hey, @dheaton-arm reported an interesting implementation detail from the aarch64 backend on the cls instruction. However, further testing shows that this instruction is not correctly implemented in any backend for i16 and i8 types.

.clif Test Case

test run
target aarch64
target x86_64 machinst
target s390x

function %cls_i16(i16) -> i16 {
block0(v0: i16):
    v1 = cls.i16 v0
    return v1
}
; run: %cls_i16(0x0000) == 15
; run: %cls_i16(0xFFFF) == 15
; run: %cls_i16(0x8000) == 1
; run: %cls_i16(0xC000) == 2
; run: %cls_i16(0x2000) == 1
; run: %cls_i16(0x1000) == 2


function %cls_i8(i8) -> i8 {
block0(v0: i8):
    v1 = cls.i8 v0
    return v1
}
; run: %cls_i8(0x00) == 7
; run: %cls_i8(0xFF) == 7
; run: %cls_i8(0x80) == 1
; run: %cls_i8(0xC0) == 2
; run: %cls_i8(0x20) == 1
; run: %cls_i8(0x10) == 2

Steps to Reproduce

clif-util test ./the-above.clif

Expected Results

All three backends should pass the run tests.

Actual Results

x86_64: cls is not implemented for either type.
aarch64: Returns the wrong results (e.g. run: %cls_i16(0x0000) == 15 returns 31)
s390x: Returns the wrong results (e.g. run: %cls_i16(0x0000) == 15 returns 4096)

Versions and Environment

Cranelift version or commit: main
Operating system: Linux / qemu emulator for aarch64 and s390x
Architecture: x86_64 / aarch64 / s390x

Wasmtime GitHub notifications bot (Aug 26 2021 at 12:09):

afonso360 labeled issue #3248.

Wasmtime GitHub notifications bot (Aug 26 2021 at 12:09):

afonso360 labeled issue #3248.

Wasmtime GitHub notifications bot (Aug 26 2021 at 12:17):

afonso360 commented on issue #3248:

In fact, we also have issues with this instruction for the i32 and i64 types.

Testcase:

function %cls_i64(i64) -> i64 {
block0(v0: i64):
    v1 = cls.i64 v0
    return v1
}
; run: %cls_i64(0) == 63

function %cls_i32(i32) -> i32 {
block0(v0: i32):
    v1 = cls.i32 v0
    return v1
}
; run: %cls_i32(0) == 31

aarch64: Passes these test cases
x86_64: Does not implement these
s390x: Fails with a wrong value (Failed test: run: %cls_i64(0) == 63, actual: 4611686018427387904)
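
For reference, the expected values in these two test cases can be reproduced with a small Rust model of "count leading sign bits". This is only a sketch of the semantics the tests assume, not code taken from any Cranelift backend; the function names `cls_i64` and `cls_i32` are hypothetical.

```rust
/// Model of `cls` for 64-bit values: the number of bits directly below the
/// sign bit that are equal to it (the sign bit itself is not counted).
/// Hypothetical helper, not part of Cranelift.
fn cls_i64(x: i64) -> u32 {
    // `>>` on a signed integer is an arithmetic shift, so bit i of
    // `x ^ (x >> 1)` is set exactly when bit i differs from bit i + 1.
    // The top bit is compared with itself and is therefore always 0,
    // which is why one leading zero is subtracted.
    (x ^ (x >> 1)).leading_zeros() - 1
}

/// The same model for 32-bit values.
fn cls_i32(x: i32) -> u32 {
    (x ^ (x >> 1)).leading_zeros() - 1
}
```

With this model, an all-zero input gives 63 for i64 and 31 for i32, matching the run lines above.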

Wasmtime GitHub notifications bot (Aug 26 2021 at 13:22):

afonso360 edited issue #3248:

Hey, @dheaton-arm reported an interesting implementation detail from the aarch64 backend on the cls instruction. However, further testing shows that this instruction is not correctly implemented in any backend for i16 and i8 types.

.clif Test Case

test run
target aarch64
target x86_64 machinst
target s390x

function %cls_i16(i16) -> i16 {
block0(v0: i16):
    v1 = cls.i16 v0
    return v1
}
; run: %cls_i16(0x0000) == 15
; run: %cls_i16(0xFFFF) == 15
; run: %cls_i16(0x8000) == 0
; run: %cls_i16(0xC000) == 1
; run: %cls_i16(0x4000) == 0
; run: %cls_i16(0x2000) == 1


function %cls_i8(i8) -> i8 {
block0(v0: i8):
    v1 = cls.i8 v0
    return v1
}
; run: %cls_i8(0x00) == 7
; run: %cls_i8(0xFF) == 7
; run: %cls_i8(0x80) == 0
; run: %cls_i8(0xC0) == 1
; run: %cls_i8(0x40) == 0
; run: %cls_i8(0x20) == 1

Steps to Reproduce

clif-util test ./the-above.clif

Expected Results

All three backends should pass the run tests.

Actual Results

x86_64: cls is not implemented for either type.
aarch64: Returns the wrong results (e.g. run: %cls_i16(0x0000) == 15 returns 31)
s390x: Returns the wrong results (e.g. run: %cls_i16(0x0000) == 15 returns 4096)

Versions and Environment

Cranelift version or commit: main
Operating system: Linux / qemu emulator for aarch64 and s390x
Architecture: x86_64 / aarch64 / s390x
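
The edited expectations for i16 and i8 can be cross-checked against a plain Rust model that works on the raw bit pattern. This is a minimal sketch under the assumption that `cls` counts the bits below the sign bit that match it; `cls16` and `cls8` are hypothetical names used only for illustration.

```rust
/// Model of `cls.i16` taking the raw 16-bit pattern, as written in the
/// run lines. Hypothetical helper, not part of Cranelift.
fn cls16(bits: u16) -> u32 {
    let x = bits as i16;
    // For negative values, invert so the run of leading sign bits becomes a
    // run of leading zeros; then drop the sign bit itself from the count.
    let folded = if x < 0 { !x } else { x };
    folded.leading_zeros() - 1
}

/// The same model for `cls.i8`.
fn cls8(bits: u8) -> u32 {
    let x = bits as i8;
    let folded = if x < 0 { !x } else { x };
    folded.leading_zeros() - 1
}
```

Evaluating `cls16` and `cls8` on the inputs listed in the test case gives the values in the edited run lines, for example `cls16(0x8000) == 0` and `cls8(0xC0) == 1`.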

Wasmtime GitHub notifications bot (Sep 01 2021 at 12:44):

sparker-arm commented on issue #3248:

At least for AArch64, we can't use a 1-1 mapping between CLIF and machine instructions when the type is narrower than a natively supported register width (32 or 64 bits). I think we need to expand this instruction when the target doesn't directly support the type, something like:
cls (x) -> sub (cls (swiden (x)), bitwidth(x))
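
The expansion suggested above can be sketched in Rust as follows. This is an illustration of the idea only, not the actual backend lowering: the helper names are hypothetical, the widening is modelled with an `as i32` cast (sign extension), and the constant subtracted here is assumed to be the difference between the widened width and the original width (16 for i16, 24 for i8).

```rust
/// 32-bit `cls` model standing in for the native instruction.
/// Hypothetical helper, not part of Cranelift.
fn cls32_model(x: i32) -> u32 {
    let folded = if x < 0 { !x } else { x };
    folded.leading_zeros() - 1
}

/// `cls.i16` expanded as: sign-extend to 32 bits, count, then subtract the
/// 16 extra sign bits that the extension introduced.
fn cls_i16_widened(x: i16) -> u32 {
    cls32_model(x as i32) - (32 - 16)
}

/// `cls.i8` expanded the same way; the extension adds 24 sign bits.
fn cls_i8_widened(x: i8) -> u32 {
    cls32_model(x as i32) - (32 - 8)
}
```

Without the subtraction, a zero i16 input counted in a 32-bit register yields 31, which is consistent with the aarch64 result reported in the issue; with it, the result is 15. The subtraction cannot underflow, because a sign-extended value always has at least as many leading sign bits as the extension added.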

Wasmtime GitHub notifications bot (Sep 10 2021 at 16:39):

akirilov-arm labeled issue #3248.

Wasmtime GitHub notifications bot (Sep 10 2021 at 16:39):

akirilov-arm labeled issue #3248.

Wasmtime GitHub notifications bot (Jun 27 2022 at 22:55):

fitzgen closed issue #3248.


Last updated: Jan 24 2025 at 00:11 UTC