uweigand opened PR #4360 from s390x-vr to main:
This defines the full set of 32 128-bit vector registers on s390x.
(Note that the VRs overlap the existing FPRs.) In addition, this
adds support to use all 32 vector registers to implement floating-point
operations, by using vector floating-point instructions with
the 'W' bit set to operate only on the first element.

This part of the vector instruction set mostly matches the old FP
instruction set, with two exceptions:

- There is no vector version of the COPY SIGN instruction. Instead,
  now use a VECTOR SELECT with an appropriate bit mask to implement
  the fcopysign operation.

- There are no vector versions of the float <-> int conversion
  instructions where source and target differ in bit size. Use
  appropriate multiple conversion steps instead. This also requires
  use of explicit checking to implement correct overflow handling.
  As a side effect, this version now also implements the i8 / i16
  variants of all conversions, which had been missing so far.

For all operations except those two above, we continue to use the
old FP instruction if applicable (i.e. if all operands happen to
have been allocated to the original FP register set), and use the
vector instruction otherwise.

CC @cfallin
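The two exceptions and the register-based choice described above can be modelled in scalar Rust roughly as follows. This is only an illustrative sketch, not the Cranelift lowering from this PR: all function names are hypothetical, the conversion returns `None` where a real lowering would trap or saturate, and registers are modelled as plain numbers 0..=31 on the assumption that VRs 0 to 15 are the ones overlapping the FPRs.

```rust
/// Bitwise select: bits of `a` where `mask` is 1, bits of `b` where it is 0.
/// (Scalar stand-in for a vector select; hypothetical helper.)
fn bit_select(a: u64, b: u64, mask: u64) -> u64 {
    (a & mask) | (b & !mask)
}

/// copysign built from a select: magnitude bits from `x`, sign bit from `y`.
fn copysign_via_select(x: f64, y: f64) -> f64 {
    // Every bit except the sign bit.
    const MAGNITUDE_MASK: u64 = 0x7fff_ffff_ffff_ffff;
    f64::from_bits(bit_select(x.to_bits(), y.to_bits(), MAGNITUDE_MASK))
}

/// f64 -> i32 in two steps (f64 -> i64, then narrow), with an explicit
/// range check because the same-width conversion cannot report overflow
/// of the narrower target type on its own.
fn f64_to_i32_checked(x: f64) -> Option<i32> {
    if x.is_nan() {
        return None;
    }
    let t = x.trunc(); // round toward zero
    if t < i32::MIN as f64 || t > i32::MAX as f64 {
        return None; // would overflow the 32-bit target
    }
    let wide = t as i64; // step 1: same-width conversion
    Some(wide as i32) // step 2: narrowing, known to be in range
}

/// Choose the old FP encoding only if every operand register is one of
/// the 16 registers that overlap the FPRs (assumed here to be 0..=15).
fn can_use_fp_encoding(regs: &[u8]) -> bool {
    regs.iter().all(|&r| r < 16)
}
```

For example, `copysign_via_select(3.0, -1.0)` yields `-3.0`, and `f64_to_i32_checked(1e10)` yields `None` because the value does not fit in 32 bits.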
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Likewise here with rule priorities.
cfallin created PR review comment:
(I do want to have a mode eventually where we randomize rule ordering just to see how many things this breaks; such a test mode would certainly catch this so we don't have to remember the principle manually!)
cfallin created PR review comment:
We need either rule priorities or an anything-but-these-types extractor here (probably the former is simplest) so that this rule is correct on its own, rather than just as a fallback from the below cases, I think?
uweigand submitted PR review.
uweigand created PR review comment:
I don't see the problem here - the first rule only handles the case where source and destination type are equal; all other rules only accept cases where source and destination type are different. Each of the rules should be correct on its own?
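The "correct on its own" argument can be illustrated with a small Rust model. This is a hypothetical sketch, not the ISLE rules under review: the type set, the rule functions, and the returned strings are all invented for illustration. The point is only that when each rule's match condition excludes the others, exactly one rule fires for any input, so evaluation order and priorities do not matter.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    I8,
    I16,
    I32,
    I64,
}

fn bits(ty: Ty) -> u32 {
    match ty {
        Ty::I8 => 8,
        Ty::I16 => 16,
        Ty::I32 => 32,
        Ty::I64 => 64,
    }
}

/// Hypothetical rule 1: fires only when source and destination types are equal.
fn rule_same(src: Ty, dst: Ty) -> Option<&'static str> {
    (src == dst).then_some("same")
}

/// Hypothetical rule 2: fires only when the destination is wider.
fn rule_widen(src: Ty, dst: Ty) -> Option<&'static str> {
    (bits(src) < bits(dst)).then_some("widen")
}

/// Hypothetical rule 3: fires only when the destination is narrower.
fn rule_narrow(src: Ty, dst: Ty) -> Option<&'static str> {
    (bits(src) > bits(dst)).then_some("narrow")
}

/// Collect every rule that matches; with disjoint conditions this
/// always contains exactly one entry.
fn matching_rules(src: Ty, dst: Ty) -> Vec<&'static str> {
    vec![rule_same(src, dst), rule_widen(src, dst), rule_narrow(src, dst)]
        .into_iter()
        .flatten()
        .collect()
}
```

Calling `matching_rules` on any pair of types returns a single-element vector, which is the property that makes a fallback ordering unnecessary in this model.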
uweigand updated PR #4360 from s390x-vr to main.
uweigand updated PR #4360 from s390x-vr to main.
cfallin created PR review comment:
Ah! Yes, I misread it, sorry; ty ty, not ty. All's well as-is here.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin merged PR #4360.
Last updated: Oct 23 2024 at 20:03 UTC