wasmtime / PR #9214 riscv64/x390: add *_overflow · git-wasmtime

Hi @ghostway0, a few comments on the s390x part:

All the new instruction rules you added seem to provide only a single return, the overflow bit. However, my understanding is that smul_overflow and all the other overflow instructions are defined to have two returns, the low-part result and the overflow bit. I think you'll need to use some form of with_flags to construct the pair of results (like x86 and aarch64 already do).

You're simply re-using the instructions for the "normal" operations (add/sub/mul) for the overflow operations as well. That is correct for 32-bit and 64-bit operations, but not for 8-bit and 16-bit ones. The reason is that the s390x ISA does not actually have any 8-bit or 16-bit arithmetic instructions, so we simply use the 32-bit version for 8-bit and 16-bit operations too. That provides the correct (low-part) result, but any overflow indication would be incorrect.
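To illustrate the 8-bit case: a 32-bit add of two sign-extended 8-bit values never overflows in 32 bits, so its overflow indication says nothing about the 8-bit result. The Rust sketch below is only a model of the arithmetic, not the s390x lowering; the helper name `sadd_overflow_i8_via_i32` is hypothetical. It derives the 8-bit overflow bit by checking whether the truncated result sign-extends back to the wide result.

```rust
/// Hypothetical model: signed 8-bit add with overflow, computed with 32-bit arithmetic.
fn sadd_overflow_i8_via_i32(a: i8, b: i8) -> (i8, bool) {
    // Sign-extend both operands; the 32-bit add itself cannot overflow for these ranges.
    let wide = (a as i32) + (b as i32);
    // Truncating to 8 bits gives the correct low-part result.
    let low = wide as i8;
    // The 8-bit operation overflowed iff the low part does not sign-extend back to `wide`.
    let overflow = (low as i32) != wide;
    (low, overflow)
}
```

For example, `sadd_overflow_i8_via_i32(100, 100)` yields a wide sum of 200, which does not fit in an `i8`, so the overflow bit is set even though the 32-bit add did not overflow.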

There is no unsigned-multiply instruction with overflow indication on our platform. What other compilers do is to use the 32x32->64 or 64x64->128 bit wide multiply instruction, and check whether the high-part of the output is zero.
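The widening-multiply approach described above can be sketched in Rust for the 32-bit case. This models the arithmetic only and is not actual s390x code; the function name `umul_overflow_u32` is an assumption made for illustration.

```rust
/// Hypothetical model: unsigned 32-bit multiply with overflow,
/// using a 32x32->64 widening multiply and a check of the high part.
fn umul_overflow_u32(a: u32, b: u32) -> (u32, bool) {
    // Zero-extend both operands so the full 64-bit product is available.
    let wide = (a as u64) * (b as u64);
    let low = wide as u32; // low part: the wrapped result
    let high = (wide >> 32) as u32; // high part: nonzero exactly when the product overflowed
    (low, high != 0)
}
```

The same idea applies to 64-bit operands with a 64x64->128 product.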

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

Ditto here

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

It might be better to use rv_sltu here instead of a select between one and zero. The RISC-V comparison instructions already return a zero or one, and they are a lot shorter than our current implementation of select_xreg.
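As a rough illustration of why a compare can stand in for the select, here is a Rust model of an unsigned add with overflow. It assumes the rule in question is an unsigned add (the log does not show which rule the comment is attached to), and the function names are hypothetical stand-ins for the instruction and the lowering.

```rust
/// Models RISC-V `sltu rd, rs1, rs2`: rd is 1 if rs1 < rs2 (unsigned), else 0.
fn sltu(rs1: u64, rs2: u64) -> u64 {
    (rs1 < rs2) as u64
}

/// Hypothetical model of an unsigned 64-bit add with overflow.
/// The wrapped sum is smaller than an operand exactly when the add carried out,
/// so the comparison result is already the 0/1 overflow bit; no select is needed.
fn uadd_overflow_u64(a: u64, b: u64) -> (u64, u64) {
    let sum = a.wrapping_add(b);
    (sum, sltu(sum, a))
}
```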

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

It might be better to use rv_snez here instead of a select between one and zero.
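A similar sketch for snez, which sets its destination to 1 when the source register is nonzero and to 0 otherwise. The example assumes the flag comes from the high half of an unsigned 64-bit multiply (as RISC-V `mulhu` produces); that pairing is an assumption for illustration, since the log does not show the rule being reviewed.

```rust
/// Models RISC-V `snez rd, rs`: rd is 1 if rs != 0, else 0.
fn snez(rs: u64) -> u64 {
    (rs != 0) as u64
}

/// Models RISC-V `mulhu`: the upper 64 bits of the unsigned 128-bit product.
fn mulhu(a: u64, b: u64) -> u64 {
    (((a as u128) * (b as u128)) >> 64) as u64
}

/// Hypothetical model of an unsigned 64-bit multiply with overflow:
/// the 0/1 flag is simply "high half is nonzero".
fn umul_overflow_u64(a: u64, b: u64) -> (u64, u64) {
    (a.wrapping_mul(b), snez(mulhu(a, b)))
}
```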

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

Same here

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 submitted PR review:

:wave: Hey,

I don't know if this is ready for review yet, but it's a great start!

A few comments for the RISC-V part. I didn't check the lowerings in a lot of detail, mostly just spotting a few things that could be shorter.

Thanks for working on this!

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

Ditto here for sltu.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

This one doesn't seem to be used anywhere. Similarly, in the rules below there are a few unused instructions.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

This one doesn't seem to be used anywhere.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

This could also be a snez

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

This could also be replaced with a sltu instruction which is a shorter sequence than a full select.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

Instead of doing madd here, we can just multiply x_lo and y_lo and save one instruction.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

We could replace this with a snez instruction

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:58):

afonso360 created PR review comment:

Here we could use mul instead of madd and save one instruction.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:59):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Sep 21 2024 at 19:59):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Sep 22 2024 at 13:15):

afonso360 submitted PR review:

:wave: Hey,

I don't know if this is ready for review yet, but it's a great start!

A few comments for the RISC-V part. I didn't check the lowerings in a lot of detail, mostly just spotting a few things that could be shorter.

Thanks for working on this!

Edit: I also ran the fuzzer (with these changes) and it pointed out this testcase.

<details>

<summary>Fuzzer testcase</summary>

Testcase:
test interpret
test run
target riscv64gc
target x86_64

function %a(i8) -> i8 {
block0(v0: i8):
    v1, v2 = smul_overflow v0, v0
    return v2
}
; run: %a(-15) == 1
Result:
 ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run

Caused by:
    Failed test: run: %a(-15) == 1, actual: 0
1 tests
Error: 1 failure
</details>
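For context on the expected value in the run line: -15 * -15 = 225, which is outside the i8 range of -128 to 127, so the overflow bit should be 1. A quick way to cross-check this outside Cranelift is Rust's standard `i8::overflowing_mul`; the wrapper name below is hypothetical.

```rust
/// Hypothetical reference for the testcase: the overflow bit that
/// `smul_overflow v0, v0` should produce for an i8 input.
fn expected_smul_overflow_flag(x: i8) -> u8 {
    // `overflowing_mul` returns the wrapped product and whether it overflowed.
    let (_low, overflowed) = x.overflowing_mul(x);
    overflowed as u8
}
```

With this model, an input of -15 gives 1, matching the expectation in the run line rather than the 0 reported in the failure.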

Last updated: Apr 17 2025 at 05:03 UTC