jianjunz opened PR #8964 from jianjunz:f16-i64
to bytecodealliance:main
:
Some devices may not support FP32.
jianjunz updated PR #8964.
jianjunz updated PR #8964.
jianjunz updated PR #8964.
jianjunz updated PR #8964.
jianjunz has marked PR #8964 as ready for review.
jianjunz requested wasmtime-core-reviewers for a review on PR #8964.
jianjunz requested alexcrichton for a review on PR #8964.
jianjunz requested wasmtime-default-reviewers for a review on PR #8964.
alexcrichton requested abrown for a review on PR #8964.
abrown submitted PR review.
abrown created PR review comment:
When is this needed?
abrown created PR review comment:
Are we sure
cast
is right here? Poking around inwindows-rs
, I see that it has aCanInto
implementation forTensorFeatureDescriptor
here — doesn't that mean we should be usinginto()
instead?
abrown created PR review comment:
let inspectable =
abrown created PR review comment:
You probably only need
implement
for this PR, right?
abrown created PR review comment:
Is this correct? ONNX is going to represent this type with 4 bytes instead of 2 bytes? If so, maybe we should add a comment describing why this is the case and linking to some documentation; otherwise, it's a bit surprising.
abrown created PR review comment:
I don't think this makes sense. The
cargo test
above runs both the unit tests within the library as well as the integration tests in thetests
directory.
abrown created PR review comment:
Not sure if it's any faster, but this would avoid the size calculation:
let data = view.into_iter().map(f32::to_le_bytes).collect();
We might want to note somewhere that the LE ordering is here to match WebAssembly's ordering, not the platform we're running on (in which case always doing LE could be incorrect).
abrown created PR review comment:
We can do this once at the top of the function.
abrown created PR review comment:
When doing this
unsafe
"cast" to a slice, we also need to make sure that theu8
pointer totensor.data
is aligned in such a way that thef32
s indata
will also be aligned to their 4-byte alignment. Here's an example of how to do the check fromopenvino-rs
. And the docs forstd::slice::from_raw_parts
have all the details. I think it would be fine here to fail if things aren't aligned correctly (instead of trying to shift everything!), at least for the time being.
abrown created PR review comment:
_ => unimplemented!("the winml backend only supports tensors, found: {}", kind),
abrown created PR review comment:
tensor.data.as_ptr().cast<i64>,
These days clippy tells me this is the preferred way to do this.
abrown created PR review comment:
This kind of comment can be moved from here to the
unimplemented!()
at the end ofto_tensor
so that both users and code readers can take advantage of this information.
abrown created PR review comment:
abrown created PR review comment:
TensorType::Fp32 => unsafe {
Isn't
TensorType
already imported?
abrown created PR review comment:
It took me a second to understand that what we're trying to do is "roundtrip" a tensor through
to_inspectable
and back throughto_tensor
. To make this more clear, can you look at factoring out some of the duplication here?
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
You're right. The
cargo test
above also runs unit tests. https://github.com/bytecodealliance/wasmtime/actions/runs/9969920948/job/27547820669#step:8:409
jianjunz submitted PR review.
jianjunz created PR review comment:
Removed. They're not needed for this one.
jianjunz submitted PR review.
jianjunz created PR review comment:
output_tensor
's type is unknown at the top of the function.
jianjunz created PR review comment:
WinML may report an error
self.binding.Bind("input", tensor of shape [10]); self.binding.Bind("input", tensor of shape [11]); <-- error
But it works
self.binding.Bind("input", tensor of shape [10]); self.binding.Clear(); self.binding.Bind("input", tensor of shape [11]);
jianjunz created PR review comment:
WinML may use 2 bytes internally because it's fp16, but f16 is not officially supported by Rust stable, so we use f32 here.
jianjunz submitted PR review.
jianjunz created PR review comment:
Removed
crate::wit::types::
.
jianjunz created PR review comment:
Done.
jianjunz created PR review comment:
Removed.
jianjunz created PR review comment:
Done.
jianjunz created PR review comment:
view
's type isIVectorView
, not a rust collection, so the short version doesn't work here.
jianjunz submitted PR review.
jianjunz created PR review comment:
My understanding is
cast
here is for COM interfaces, similar as QueryInterface, whileinto
is for converting derived class into base class?
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
Replaced with
cast
.
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
Thanks for the info. Added assertion when getting length.
jianjunz submitted PR review.
jianjunz created PR review comment:
Implemented
PartialEq
forTensor
, so we don't need a lot ofassert_eq!
here.
abrown submitted PR review.
abrown created PR review comment:
Ok, so this needs to be fixed then in this PR?
abrown created PR review comment:
No, I don't think that is not correct. I checked out the code and
let data = view.into_iter().flat_map(f32::to_le_bytes).collect();
(changingmap
toflat_map
) seems to compile ok.
abrown created PR review comment:
We should probably be using
shape
here to verify thattensor.dimensions
matches.
abrown created PR review comment:
Not sure that answers the question...
abrown created PR review comment:
If you'd like to do this in a more standard way, I just learned of
align_to
which can be used to do the same thing.
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
I mean
cast
here queries if the COM object can be convert into a derived class, whileinto
is used to convert into a base class. That's my understanding.
jianjunz created PR review comment:
No, not in this one. It cannot be easily fixed by adding a
self.binding.Clear()
here because a model may have multiple input features. In this case, application callsset_input
more than once.
jianjunz created PR review comment:
Thanks for pointing out the duplication here. I think we don't need to verify shape here, WinML will check it later, so
shape
is removed.
jianjunz submitted PR review.
jianjunz created PR review comment:
flat_map
works. Thanks.
jianjunz updated PR #8964.
abrown submitted PR review.
abrown created PR review comment:
I think I understand what you're saying about
Clear
: it erases _all_ the bindings, even for other model inputs. We can't have that. But what happens in this conceivable sequence?
- Wasm guest calls
set_input
on input N with shape A- Wasm guest again calls
set_input
on input N with shape BThis is valid, though silly. The user should not have to face an error from wasi-nn in this case, right? But, if WinML is going to raise an error, then should we protect this some other way, e.g., by checking that the tensor shape is what the model expects (either A or B)?
abrown created PR review comment:
How about
let itensor = inspectable.cast<ITensor>()?; let dimensions = dimensions_as_u32(&itensor.Shape()?)?;
or something like that?
abrown created PR review comment:
Ok, please add a comment explaining that and link to the docs I mentioned.
abrown created PR review comment:
I'm not familiar enough with COM so I'll take your word for it that this is the right approach.
abrown submitted PR review.
abrown created PR review comment:
Probably should just
#[derive(PartialEq)]
since it's equivalent to this I think.
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
That's a variable input, accepts both A and B (like a string with different length). I feel like this is a bug of WinML, or the calling flow is incorrect.
Clear
fixes the issue but I'm not sure if that's the only solution, so I'm not addingClear
at this time.
jianjunz created PR review comment:
It works, but
GetAsVectorView
is a method ofTensorFloat16Bit
. Then we'll need to castitensor
again to specific tensor type. The change above makes code clean but will it be a performance issue to cast twice?
jianjunz updated PR #8964.
jianjunz submitted PR review.
jianjunz created PR review comment:
It works. Thanks.
jianjunz updated PR #8964.
abrown updated PR #8964.
abrown submitted PR review.
abrown merged PR #8964.
Last updated: Jan 24 2025 at 00:11 UTC