afonso360 opened PR #2959 from aarch64-i128-bit-ops
to main
:
Hey, some more ops implemented for i128 support.
I think the shifts can be reduced by a few instructions (especially the last csel), but I'm not seeing how right now.
I also didn't implement support for immlogic, changing it to support multiple registers didn't seem like a simple change.
This adds support for:
- bnot
- band
- bor
- bxor
- band_not
- bor_not
- bxor_not
- ishl
- ushr
- sshr
afonso360 submitted PR review.
afonso360 created PR review comment:
@cfallin I pretty much copied the testing code from x64/shift-i128-run.clif. And added a few more test cases where my implementation had issues.
Do you think it would be a good idea to merge run tests so that they could be multi arch? I've been thinking about something along the lines of:
testfiles/runtests/i128-bitops.clif
:test run target x86_64 target aarch64 target s390x function %rest_of_the_tests() {}
This way we could reuse all of these test cases for all arches.
The down side of this is that we may have to be more granular with the testfiles. For example this file right here fails for x86_64 because some of the bit ops fail. So we would have to split this into shift_run and bitops_run or something along those lines.
What do you think?
afonso360 edited PR review comment.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
nit: missing trailing newline
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, sharing run-tests makes a lot of sense! In general I actually want to try to shift tests from golden-code (compile tests) to golden-output (run tests) as this makes our suite more robust against cross-cutting backend changes, such as regalloc optimizations; so there are additional benefits, aside from sharing across architectures.
Could you be more specific about the failures you're seeing on x86_64, though? That's somewhat concerning -- unimplemented opcode, incorrect result, or something else?
afonso360 submitted PR review.
afonso360 created PR review comment:
Sure, band_not is failling on this assert.
BorNot / BxorNot / Cls (not on this PR yet) are not implemented at all.
afonso360 edited PR review comment.
afonso360 edited PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, OK, for tests like those for
BandNot
where we don't have an implementation on x86_64, we can just omit thetarget
line in the test file, perhaps with a comment saying "not yet implemented on $PLATFORM".
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
You are right that this instruction sequence can be improved - see what Clang is generating in the same situation here.
A similar improvement is possible for the right shifts.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
afonso360 created PR review comment:
Oops, i should have checked that.
I've fixed the code so that it now emits a similar sequence to what clang is doing.
afonso360 submitted PR review.
afonso360 requested akirilov-arm for a review on PR #2959.
cfallin submitted PR review.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
afonso360 updated PR #2959 from aarch64-i128-bit-ops
to main
.
cfallin merged PR #2959.
Last updated: Dec 23 2024 at 12:05 UTC