bjorn3 commented on Issue #2682:
[legion/src/internals/insert.rs:54] component_index = 0
[legion/src/internals/insert.rs:54] 1u128 << dbg!(component_index) = 18446744073709551617
It didn't help unfortunately.
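For reference, the logged value is exactly 2^64 + 1, so the low bit shows up in both 64-bit halves of the result, whereas `1u128 << 0` should be 1. The following is a minimal sketch that makes this visible; the helper names `one_shl` and `halves` are hypothetical and appear in neither legion nor Cranelift.

```rust
/// Reference semantics for the logged expression: a 128-bit `1 << n`.
/// (Hypothetical helper, for illustration only.)
fn one_shl(component_index: u32) -> u128 {
    1u128 << component_index
}

/// Split a 128-bit value into its (high, low) 64-bit halves.
fn halves(v: u128) -> (u64, u64) {
    ((v >> 64) as u64, v as u64)
}

fn main() {
    // What a correct backend should compute for component_index = 0.
    let expected = one_shl(0);
    // The value copied from the dbg! output above.
    let observed: u128 = 18446744073709551617;

    println!("expected (hi, lo) = {:?}", halves(expected));
    println!("observed (hi, lo) = {:?}", halves(observed));
}
```

Seen as halves, the observed result has a stray 1 in the upper half, which is consistent with a bug in how the two halves are combined rather than in the shift amount.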
cfallin commented on Issue #2682:
Ah, yes, you were right; I confused tmp2 for tmp3 (a classic mistake!) and failed to actually use any test vectors with differing lower and upper halves. Should be fixed now (and added a few more test cases). Thanks!
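As an illustration of why test vectors with differing halves matter, here is a rough sketch of a 128-bit left shift built from two 64-bit halves and checked against Rust's native `u128`. This is not the actual Cranelift lowering; `shl128` and `reference` are hypothetical names, and the temporaries tmp2/tmp3 mentioned above are not modeled.

```rust
/// Shift a 128-bit value, given as (lo, hi) 64-bit halves, left by `amt` bits.
/// The amount is masked to 0..=127. Hypothetical sketch, not Cranelift's code.
fn shl128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let amt = amt & 127;
    if amt == 0 {
        (lo, hi)
    } else if amt < 64 {
        // Bits shifted out of the top of `lo` enter the bottom of `hi`.
        (lo << amt, (hi << amt) | (lo >> (64 - amt)))
    } else {
        // The low half is entirely shifted into the high half.
        (0, lo << (amt - 64))
    }
}

/// Reference result computed with the native 128-bit integer type.
fn reference(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let v = ((hi as u128) << 64) | (lo as u128);
    let r = v.wrapping_shl(amt);
    (r as u64, (r >> 64) as u64)
}

fn main() {
    // The halves differ on purpose: with lo == hi, swapping the two
    // temporaries in a lowering could go unnoticed.
    let vectors = [
        (0x0123_4567_89ab_cdef_u64, 0xfedc_ba98_7654_3210_u64),
        (1, 0),
        (0, 1),
        (u64::MAX, 0),
    ];
    for &(lo, hi) in &vectors {
        for amt in [0, 1, 4, 63, 64, 65, 127] {
            assert_eq!(shl128(lo, hi, amt), reference(lo, hi, amt));
        }
    }
    println!("all shift vectors match");
}
```

With identical halves, several wrong combinations of the intermediate values would still produce the right answer, which is presumably how the original mix-up slipped past the first round of tests.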
cfallin commented on Issue #2682:
> I think this looks fine. (As an aside, why aren't we using SSE2's PSLLDQ/PSRLDQ instructions instead of these long sequences? I haven't looked at much of the i128 code but it would seem that moving upper and lower halves to XMMs and back might still be faster for one of these cases?)

Ah, the simple answer is that I don't know SSE well enough to reach for such instructions, though it looks like they should work much more efficiently than these sequences! I'll go ahead and merge with your +1 for now so that we have correct results; but we can definitely improve this later. Thanks!
Last updated: Nov 22 2024 at 17:03 UTC