theotherjimmy opened PR #12523 from theotherjimmy:s390x-arith-overflow to bytecodealliance:main:
Implement the following isle instructions for S390x:
uadd_overflowsadd_overflowusub_overflowssub_overflowumul_overflowsmul_overflow
theotherjimmy requested wasmtime-compiler-reviewers for a review on PR #12523.
theotherjimmy requested alexcrichton for a review on PR #12523.
alexcrichton commented on PR #12523:
r? @uweigand
github-actions[bot] added the label cranelift on PR #12523.
uweigand created PR review comment:
This is again unnecessary.
uweigand created PR review comment:
Since you're doing a 64-bit multiply here, you still need to zero-extend the inputs even here. Or else, you could use a 32x32->64 unsigned multiply.
uweigand submitted PR review.
uweigand created PR review comment:
The inputs are not guaranteed to be zero-extended, so you have to do that here.
uweigand created PR review comment:
This still needs to do the != 0 conversion to boolean on the high part.
uweigand created PR review comment:
This should do a != 0 test to set a boolean.
uweigand created PR review comment:
This is more of a cosmetic issue, but I think I'd prefer to keep these types of helpers in inst.isle - see e.g. what we're doing for the ..._sat_reg helpers. Ideally, we'd have a interface between inst.isle and lower.isle that looks more symmetrical across the different types ...
uweigand created PR review comment:
This seems just wrong. The second return should be a boolean indicating whether overflow happened, not simply the high part of the widened multiplication.
uweigand created PR review comment:
For the 32-bit and 64-bit cases of smul_overflow, why don't you simply use MS(G)RKC, which sets CC indicating overflow - that would work exactly the same as for sadd/ssub.
For the 8-bit and 16-bit cases, you should be able to use a similar shift trick as for sadd/ssub, only you have to make sure to shift just one of the inputs (the other must still be sign-extended).
uweigand created PR review comment:
This is unnecessary - the high bytes are considered don't care.
theotherjimmy submitted PR review.
theotherjimmy created PR review comment:
Multiply integers unsigned with overflow out.
ofis set when the multiplication overflowed.
text a &= x * y \pmod 2^B \\ of &= x * y > 2^BMy read is that the overflow out is the whole upper word size of the multiply.
theotherjimmy submitted PR review.
theotherjimmy created PR review comment:
Multiply integers unsigned with overflow out.
ofis set when the multiplication overflowed.
text a &= x * y \pmod 2^B \\ of &= x * y > 2^BMy read is that the overflow out is the whole upper word size of the multiply.
theotherjimmy deleted PR review comment.
theotherjimmy updated PR #12523.
theotherjimmy updated PR #12523.
uweigand submitted PR review.
uweigand created PR review comment:
This needs to sign-extend from byte, not word.
uweigand created PR review comment:
This is wrong, the CC checks for signed overflow, not unsigned overflow. For
umul_overflow, you'll have to either multiply in a wider mode, or use a widening multiply, and then check the high part is nonzero to detect unsigned overflow.
uweigand created PR review comment:
And this needs to sign-extend from halfword.
uweigand created PR review comment:
This is unfortunately wrong as msr does not set CC. You'll have to use msrkc.
uweigand created PR review comment:
And msgkrc here.
theotherjimmy edited PR review comment.
uweigand commented on PR #12523:
You should enable the run-time filetests on s390x as well, these would probably have caught some of those errors:
filetests/filetests/runtests/overflow.clif
uweigand edited a comment on PR #12523:
You should enable the run-time filetests on s390x as well, these would probably have caught some of those errors:
filetests/filetests/runtests/*overflow*.clif
theotherjimmy submitted PR review.
theotherjimmy created PR review comment:
Ah, oops.
theotherjimmy updated PR #12523.
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
@uweigand I ran all these tests on my lpar. The current version passes all enabled runtests on s390x, and I enabled the runtests for these instructions.
theotherjimmy commented on PR #12523:
It looks like I need to rebase onto main. I'll do that now then.
theotherjimmy updated PR #12523.
theotherjimmy edited a comment on PR #12523:
It looks like I need to rebase onto main. I'll do that now then. Tests still pass with the rebase onto main.
theotherjimmy edited a comment on PR #12523:
@uweigand I ran all these tests on my lpar. The current version passes all enabled runtests on s390x, and I enabled the runtests for these instructions.
I missed the
cargo testtests. My bad. Please hold off on reviewing while I fix them.
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
Fixed
cargo testlocally.
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
Rebased for the regalloc2 version bump. Re-blessed all new tests. ran all runtests on my s390x LPAR, they passed.
alexcrichton requested uweigand for a review on PR #12523.
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
Rebased for the new types-in-lower-instructions update
uweigand created PR review comment:
So with this change, we can never generate MSR/MSGR at all anymore. It seems confusing to leave the code emitting them in. I think we should either:
- remove Mul32/Mul64 from AluRR; or
- add new opcodes Mul32CC/Mul64CC or so in AluRRR, which can be used whenever we require the CC-setting variant of multiply.
uweigand created PR review comment:
Typo "conditons"
uweigand created PR review comment:
This comment still applies. But this is not an issue that would prevent merging the patch in my opinion, it's just a possible clean-up.
uweigand submitted PR review:
Sorry for the delay in reviewing. This all looks functionally correct to me now, see a few additional cosmetic comments inline.
Also, I think you should also enable the uadd_overflow_128.clif runtest test case.
uweigand created PR review comment:
Typo "itegers". Also, I think this comment and the comment above
type_shift_upcan be combined into a single comment - they're really about the same thing.
uweigand created PR review comment:
Or -final option- just don't use helpers and inline the code, as already done for smul_overflow.
uweigand created PR review comment:
But even if we keep the helper functions in here, please at least group all the helpers in a separate section ahead of the actual lowering implementations.
theotherjimmy created PR review comment:
I'll add the Mul*CC to handle that.
theotherjimmy submitted PR review.
theotherjimmy submitted PR review.
theotherjimmy created PR review comment:
smul_overflow is a special case here, as only the lhs is shifted and the umul_overflow does not use condition code conversion, so it's the only lowering using that specific code pattern. For add/sub, rhs is also shifted, so I think the abstraction might be helpful, So I'll move it to inst.isle
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
I've handled most of the review locally. I'll be adding the uadd overflow i128 and verifying locally before pushing that up.
uweigand submitted PR review:
This now looks good to me, assuming the uadd-overflow-i128 test case can be added successfully. Thanks!
uweigand created PR review comment:
Missing newline here.
theotherjimmy requested wasmtime-compiler-s390x-reviewers for a review on PR #12523.
theotherjimmy updated PR #12523.
theotherjimmy commented on PR #12523:
uadd_overflow i128 test is passing.
uweigand submitted PR review:
LGTM now, thanks!
uweigand added PR #12523 S390x: implement arith overflow to the merge queue.
uweigand merged PR #12523.
uweigand removed PR #12523 S390x: implement arith overflow from the merge queue.
Last updated: Apr 12 2026 at 23:10 UTC