jlb6740 opened PR #3084 from implement_ext_mul_x64
to main
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
jlb6740 updated PR #3084 from implement_ext_mul_x64
to main
.
jlb6740 updated PR #3084 from implement_ext_mul_x64
to main
.
jlb6740 updated PR #3084 from implement_ext_mul_x64
to main
.
jlb6740 has marked PR #3084 as ready for review.
jlb6740 requested abrown for a review on PR #3084.
jlb6740 requested akirilov-arm for a review on PR #3084.
abrown submitted PR review.
abrown created PR review comment:
Initially I thought that this matching was too greedy--that it would match sequences that it shouldn't be and then crash because the types are wrong. I think now that it is OK because
swiden_high
only really allows the following types:I8X16
,I16X8
,I32X4
. I think it might be worthwhile to mention this assumption in some comments at the top so that future readers aren't confused.v0 = swiden.i64
abrown created PR review comment:
Not sure why
swiden_input
is needed: couldn't theInsnInput
s be created inline?
abrown created PR review comment:
I think the order of the lowerings should be reversed: scalar lowerings first, then vector lowerings, then pattern matching lowerings. As it stands in this PR, every
imul.i64
is going to require a bunch of matching calls and type comparisons before it ever gets lowered. I think scalar multiplication is going to be the most common, then plain vector multiplication, so I would think we would want to make those common paths shorter. (Not sure exactly how much this affects compile time but hopefully you see what I mean).
abrown submitted PR review.
jlb6740 created PR review comment:
Sure .. I agree. That was my original thought but I never reverted the initial implementation.
jlb6740 submitted PR review.
jlb6740 edited PR review comment.
jlb6740 updated PR #3084 from implement_ext_mul_x64
to main
.
jlb6740 submitted PR review.
jlb6740 created PR review comment:
Also note, the code must remain at the top as the fall through target since I am using "if let Some(_) .." to check for op sources and there is no if not let Some(_) ..". Still there is just one branch to get to the other lowerings.
jlb6740 requested abrown for a review on PR #3084.
jlb6740 merged PR #3084.
Last updated: Dec 23 2024 at 13:07 UTC