dheaton-arm opened PR #3188 from implement-satarith
to main
:
Implemented the
Extractlane
,UaddSat
, andUsubSat
opcodes for the interpreter,
and added helper functions for working with SIMD vectors (extractlanes
,vectorizelanes
,
andbinary_arith
).Copyright (c) 2021, Arm Limited
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
afonso360 submitted PR review.
afonso360 created PR review comment:
Would it be possible to add a
test run
with thex86_64
andaarch64
backend to these tests as well? That way we get some more SIMD coverage on those targets. Thes390x
doesn't support SIMD yet, so no point in adding it.Also could we move the test files to the
runtests
dir since we are testing run targets and the interpreter?
afonso360 submitted PR review.
afonso360 created PR review comment:
I could be wrong, but I think this annotation is only needed on
x86_64
targets.
afonso360 edited PR review comment.
abrown submitted PR review.
abrown created PR review comment:
This might need to be more strict:
assert!(ty.is_vector() && elem_ty.bytes() == 16);
abrown created PR review comment:
What do you think about implementing these instructions as: convert to larger type, use already existing add/sub, convert back to the right type with saturation (maybe with a new saturating
ValueConversionKind
). I can't remember how many instructions end up being saturating ones but if there are more it could reduce the size ofDataValue
.
abrown submitted PR review.
dheaton-arm submitted PR review.
dheaton-arm created PR review comment:
Looking at the list of opcodes, there's a few but given that a lot of them are signed/unsigned variants of each other (which I don't think will need re-implementing each time, as in this case for example), I think there'd be ~5 saturating ops in DataValue. We could do that suggestion though if preferred, which might make implementing opcodes like the
narrow
set easier later (the only question is if[I|U]128
is supposed to be supported; I see a few of the other functions in DataValue seem to ignore it and being the typically largest integer type would require different implementation).
dheaton-arm submitted PR review.
dheaton-arm created PR review comment:
Trying this I'm getting errors that
i32x4
is an unsupported type foruadd_sat
andusub_sat
on thetest run
side -- should that be the case?
afonso360 submitted PR review.
afonso360 created PR review comment:
Would it be easier to do a blanket impl of the
Value
trait like:impl<T: Value, const N: usize> Value for [T; N]
? And have a function that would convert the SIMD values into/from the array?I don't know how many of these functions we are going to need, but it might be worth considering in the future.
Although I think this would require rust 1.51 with
min_const_generics
. Which might be higher than our minimum requirements.
afonso360 submitted PR review.
afonso360 created PR review comment:
for (lhs, rhs) in arg0.iter().zip(arg1.iter()) { // The initial Value::int needs to be on a separate line so the // compiler can determine concrete types. let mut lhs: V = Value::int(*lhs, vector_type.lane_type())?; let mut rhs: V = Value::int(*rhs, vector_type.lane_type())?;
afonso360 edited PR review comment.
afonso360 edited PR review comment.
afonso360 edited PR review comment.
afonso360 edited PR review comment.
afonso360 submitted PR review.
afonso360 created PR review comment:
Hmm, according to the docs, it should work.
However, the reason that it probably doesn't work is that its not included in the WASM SIMD Spec.
What we usually do in these cases is separate the testcases in two files.
8x16
/16x8
in one file with the run targets, andi32x4
/i64x2
(could you add testcases fori64x2
as well?) into another with just the interpreter.I also usually file an issue stating that these are not implemented in the x86_64 and aarch64 backends. We've been finding a lot of code that should compile but doesn't since we started adding these tests instead of relying on the wasm test suite.
afonso360 edited PR review comment.
afonso360 edited PR review comment.
dheaton-arm updated PR #3188 from implement-satarith
to main
.
afonso360 submitted PR review.
dheaton-arm updated PR #3188 from implement-satarith
to main
.
dheaton-arm requested abrown for a review on PR #3188.
cfallin submitted PR review.
cfallin created PR review comment:
I think it might be slightly preferable here to use
SmallVec
, perhapsSmallVec<[i128; 1]>
or similar (and typedef it so we can easily change it) -- while we don't expect the interpreter to be fast, it's still good to avoid an unnecessary heap allocation per execution of an instruction.
abrown submitted PR review.
abrown created PR review comment:
Yeah, it sounds like the current approach will be fine.
abrown submitted PR review.
abrown created PR review comment:
Thanks for adding these instructions!
dheaton-arm submitted PR review.
dheaton-arm created PR review comment:
Is this specifically for
extractlanes
or would this also apply to the input tovectorizelanes
? Just want to check in case you wanted basically all theVec
s in this section of code changing, whilst I'm thinking that parameters aren't an issue (which I may be wrong about) andVec
parameters seem more general / flexible (accepting bothSmallVec
andVec
).
dheaton-arm updated PR #3188 from implement-satarith
to main
.
cfallin created PR review comment:
I guess the advice applies generally. Specifically, this:
Vec
parameters seem more general / flexible (accepting bothSmallVec
andVec
)is not quite right;
Vec
is a specific type that requires its contents to be heap-allocated. The more idiomatic Rust type for "any array of values in memory" would be a slice, i.e.&[128]
, which one can get with&vec[..]
or&smallvec[..]
; but that only works for inbound args because it has no ownership. Returning aSmallVec
is a technique we use elsewhere pretty successfully so I think it's' reasonable here too :-)
cfallin submitted PR review.
cfallin edited PR review comment.
dheaton-arm submitted PR review.
dheaton-arm created PR review comment:
Ah, I was basing that statement off testing where using a
Vec
parameter but passing aSmallVec
was accepted by the compiler and built successfully, but the inverse (using aSmallVec
parameter but passing aVec
) was blocked by the compiler. I wasn't sure about if that made theSmallVec
behave like aVec
with regards to heap allocations, however, hence my question.Returning a SmallVec is a technique we use elsewhere pretty successfully
I understood that, and changed it to behave like that in 0089e15; I was more wondering about the input parameters to
vectorizelanes
. I'm gathering from your response though that the preference is for those to be a slice?
dheaton-arm updated PR #3188 from implement-satarith
to main
.
dheaton-arm edited PR review comment.
dheaton-arm requested cfallin for a review on PR #3188.
cfallin submitted PR review.
dheaton-arm updated PR #3188 from implement-satarith
to main
.
abrown merged PR #3188.
Last updated: Dec 23 2024 at 12:05 UTC