ArtBlnd edited PR #5036 from `x86_64-float-bitops1` to `main`:
This patch implements float bitops on x86_64 using SSE instructions.
@afonso360
- [x] Check whether better single slot bitops are available on x86_64 (which has better latency or throughput?)
- [ ] Make a single slot mask for `f32`, `f64` instead of `vector_all_ones`
afonso360 submitted PR review.
ArtBlnd has marked PR #5036 as ready for review.
ArtBlnd edited PR #5036 from `x86_64-float-bitops1` to `main`:
This patch implements float bitops on x86_64 using SSE instructions.
@afonso360
- [x] Check whether better single slot bitops are available on x86_64 (which has better latency or throughput?)
- [x] Make a single slot mask for `f32`, `f64` instead of `vector_all_ones`
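
A minimal sketch of the difference between the two masks in the checklist above, assuming that "single slot mask" means a mask covering only the scalar's lane of the XMM register. The register is modelled as a `u128`; the constant and function names are hypothetical and are not Cranelift's actual lowering helpers.

```rust
// Hypothetical model: an XMM register is a u128 and a scalar float lives in
// its low 32 (f32) or 64 (f64) bits. None of these names come from Cranelift.

/// Every bit of the 128-bit register set (what `vector_all_ones` produces).
const VECTOR_ALL_ONES: u128 = u128::MAX;
/// Only the low 32-bit slot set (assumed shape of a single slot f32 mask).
const F32_SLOT_MASK: u128 = 0xFFFF_FFFF;
/// Only the low 64-bit slot set (assumed shape of a single slot f64 mask).
const F64_SLOT_MASK: u128 = 0xFFFF_FFFF_FFFF_FFFF;

/// Models a full-width XOR of the register with a mask (as XORPS/XORPD do).
fn xor_with_mask(reg: u128, mask: u128) -> u128 {
    reg ^ mask
}

/// Reads the scalar f32 stored in the low slot of the modelled register.
fn low_f32(reg: u128) -> f32 {
    f32::from_bits(reg as u32)
}

/// Reads the scalar f64 stored in the low slot of the modelled register.
fn low_f64(reg: u128) -> f64 {
    f64::from_bits(reg as u64)
}
```

In this model both masks produce the same value in the low slot; the single slot mask simply leaves the upper bits of the register untouched.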
ArtBlnd updated PR #5036 from `x86_64-float-bitops1` to `main`.
abrown submitted PR review.
abrown created PR review comment:
Same comment here and below.
abrown created PR review comment:
We should probably use the same `if` syntax here for consistency...
abrown created PR review comment:
I think it might be preferable to use `ANDNPS` here: `AND`-ing the register to itself keeps the same bits in place and the `NOT` part will do what we want it to do. That way we avoid an extra `PCMPEQ*` due to the `vector_all_ones`.
abrown submitted PR review.
ArtBlnd updated PR #5036 from `x86_64-float-bitops1` to `main`.
ArtBlnd submitted PR review.
ArtBlnd created PR review comment:
I am not sure whether we have to make a mask for the xor, or do an and after the xor. Also, why then do our eq tests not fail because of side effects?
ArtBlnd edited PR review comment.
ArtBlnd updated PR #5036 from `x86_64-float-bitops1` to `main`.
ArtBlnd created PR review comment:
```
src\isa\x64\lower.isle:275:3: type error: Used non-pure constructor 'ty_scalar_float' in pure expression context
src\isa\x64\lower.isle:337:3: type error: Used non-pure constructor 'ty_scalar_float' in pure expression context
```
```
(decl ty_scalar_float (Type) Type)
(extern extractor ty_scalar_float ty_scalar_float)
```
I don't think this is possible right now. It should be split out into a separate patch.
ArtBlnd submitted PR review.
ArtBlnd updated PR #5036 from `x86_64-float-bitops1` to `main`.
abrown submitted PR review.
abrown created PR review comment:
Can you re-phrase your question? I'm not sure exactly what you mean...
ArtBlnd created PR review comment:
Oh... I misunderstood. Never mind.
ArtBlnd submitted PR review.
ArtBlnd edited PR review comment.
ArtBlnd submitted PR review.
ArtBlnd created PR review comment:
I don't think this works. The and `not` will negate the float value itself. Here is the result:
Caused by: Failed test: run: %bnot_f32(0.0) == -NaN:0x3fffff, actual: 0.0
ArtBlnd submitted PR review.
ArtBlnd created PR review comment:
You can check over here https://www.felixcloutier.com/x86/andnps#andnps--128-bit-legacy-sse-version-
abrown submitted PR review.
abrown created PR review comment:
Oh, you're right! The NOT happens before the AND. Sorry for the distraction, let me take one more look at everything else...
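
The exchange above can be sketched on a single 32-bit lane. This is only a model of the documented `ANDNPS` semantics (destination inverted first, then ANDed with the source), not the PR's lowering code, and the function names are hypothetical.

```rust
/// Models one 32-bit lane of ANDNPS: dst = (NOT dst) AND src.
fn andnps_lane(dst: u32, src: u32) -> u32 {
    !dst & src
}

/// Models one 32-bit lane of XORPS: dst = dst XOR src.
fn xorps_lane(dst: u32, src: u32) -> u32 {
    dst ^ src
}

/// Bitwise NOT of an f32 via XOR with an all-ones mask.
fn bnot_f32_via_xor(x: f32) -> f32 {
    f32::from_bits(xorps_lane(x.to_bits(), u32::MAX))
}

/// The suggested shortcut: ANDNPS of a register with itself.
/// Because the NOT is applied before the AND, this is (!x & x), which is
/// zero for every input rather than the bitwise complement.
fn bnot_f32_via_andn_self(x: f32) -> f32 {
    let bits = x.to_bits();
    f32::from_bits(andnps_lane(bits, bits))
}
```

For an input of `0.0`, the XOR form yields the bit pattern `0xffffffff`, which is the `-NaN:0x3fffff` the test expects, while the self-`ANDNPS` form yields all zero bits, matching the reported `actual: 0.0`.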
abrown submitted PR review.
ArtBlnd updated PR #5036 from `x86_64-float-bitops1` to `main`.
abrown merged PR #5036.
Last updated: Dec 23 2024 at 12:05 UTC