MaxGraey edited PR #2749 from fix-small-memset
to main
.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
Could you keep using shift and bitor?
MaxGraey submitted PR Review.
MaxGraey created PR Review Comment:
Yes, sure!
MaxGraey updated PR #2749 from fix-small-memset
to main
.
MaxGraey submitted PR Review.
MaxGraey created PR Review Comment:
But keep in mind that LLVM can't optimize such a series of shifts:
https://godbolt.org/z/nadxYq
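For context, a minimal sketch of what a "series of shifts" could look like. The thread does not show the code under review, so this assumes the goal is to replicate one byte across a 64-bit word, as a small memset would need. The name `splat_shift_or` is hypothetical.

```rust
/// Hypothetical helper (not from the PR): replicate `byte` into all
/// eight bytes of a u64 using only shifts and bitwise OR.
fn splat_shift_or(byte: u8) -> u64 {
    let b = byte as u64;
    let h = b | (b << 8); // two copies of the byte
    let w = h | (h << 16); // four copies
    w | (w << 32) // eight copies
}
```

Each step doubles the number of copies, so three shift/OR pairs cover 64 bits.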
MaxGraey updated PR #2749 from fix-small-memset
to main
.
MaxGraey edited PR Review Comment.
MaxGraey updated PR #2749 from fix-small-memset
to main
.
MaxGraey edited PR Review Comment.
MaxGraey edited PR Review Comment.
MaxGraey updated PR #2749 from fix-small-memset
to main
.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
I assumed that shift and bitor would be faster than multiplying, but according to llvm-mca, multiplying is faster for 32bit and 64bit ints.
MaxGraey submitted PR Review.
MaxGraey created PR Review Comment:
> I assumed that shift and bitor would be faster than multiplying, but according to llvm-mca, multiplying is faster for 32bit and 64bit ints.
No. A 64-bit multiply will always be faster in this case, even on 32-bit platforms.
Btw, the trunk branch of LLVM already performs this optimization, but the result is still suboptimal. See:
https://godbolt.org/z/o3jfWv
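A sketch of the multiply-based alternative being discussed, again assuming the task is to broadcast one byte across a word. The function names are hypothetical and the PR's actual code is not shown in this thread.

```rust
/// Hypothetical helper (not from the PR): replicate `byte` into all
/// eight bytes of a u64 with a single multiply.
fn splat_mul_u64(byte: u8) -> u64 {
    // The constant has a 1 in every byte position, so the product places
    // one copy of `byte` in each byte. The largest product,
    // 0xFF * 0x0101_0101_0101_0101, is exactly u64::MAX, so it cannot overflow.
    (byte as u64) * 0x0101_0101_0101_0101
}

/// The same idea for a 32-bit word.
fn splat_mul_u32(byte: u8) -> u32 {
    (byte as u32) * 0x0101_0101
}
```

The result is the same as a shift/OR chain would produce. Which form is faster depends on the target, so the llvm-mca result mentioned above is the thing to check for a given CPU.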
MaxGraey edited PR Review Comment.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
Ok. Could you please revert back to multiplying?
MaxGraey updated PR #2749 from fix-small-memset
to main
.
MaxGraey updated PR #2749 from fix-small-memset
to main
.
pchickey requested cfallin for a review on PR #2749.
cfallin submitted PR review.
Last updated: Dec 23 2024 at 12:05 UTC