devigned opened PR #7691 from `devigned:wasi-nn-ort` to `bytecodealliance:main`:
This change adds an ONNXruntime backend for WASI-NN. Since there is only one backend implemented, this will help to move the standardization of this proposal forward.
Also, the example usage of the ONNXruntime backend shows how to build a component using WASI-NN and ONNXruntime.
devigned requested alexcrichton for a review on PR #7691.
devigned requested wasmtime-core-reviewers for a review on PR #7691.
devigned requested wasmtime-default-reviewers for a review on PR #7691.
devigned commented on PR #7691:
@abrown please take a look when you have a minute. Thank you!!
devigned updated PR #7691.
alexcrichton requested abrown for a review on PR #7691.
@devigned, I'll take a closer look next week. I will say I've been changing the ground under your feet here (sorry :grinning_face_with_smiling_eyes:) with https://github.com/bytecodealliance/wasmtime/pull/7679; that PR will mean we have to move some of your tests over to `test-programs` and write Rust tests instead of bash scripts. An overall improvement, just inconvenient timing.
As for the CI failures here: I think we can add a line to `skip-tree` in `deny.toml` to ignore the version mismatch of the `half` dependency: `{ name = "half", depth = 1 }`. As for the `cargo vet` failures, I think we'll have to audit those, maybe even as a separate PR... anything you can do to reduce the number of dependencies coming in will make that easier (drop features?).
abrown submitted PR review:
Thanks for all the work to make this happen! This is going to be a great addition to Wasmtime. Let me know on Zulip if you need help working through the test refactoring, CI issues, etc.
abrown created PR review comment:
I don't think we'll need this script anymore now that #7679 is merged.
abrown created PR review comment:
Maybe something like the following would be less verbose?
_ => { unimplemented!("{:?} not supported by ONNX", input.tensor_type); }
abrown created PR review comment:
Good to see the auto-generated bindings work!
abrown created PR review comment:
I opened https://github.com/abrown/install-openvino-action/issues/29 to track this.
abrown created PR review comment:
I don't know how easy it will be to add these dependencies to `test-programs`. If it is difficult for some reason, you could always dump the decoded bytes to a file like I did with the OpenVINO test fixtures. If you do keep this, though, we probably only need the `jpeg` feature.
abrown created PR review comment:
You could probably `self.0.lock()` once at the top of this function?
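A minimal sketch of the suggestion above, assuming the backend keeps its state behind a `Mutex` as the `self.0.lock()` call implies. `Session`, `Wrapper`, and the field names are hypothetical stand-ins and not the PR's actual types.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the state guarded by the mutex.
struct Session {
    inputs: Vec<f32>,
    outputs: Vec<f32>,
}

// Newtype around the mutex, so that `self.0.lock()` mirrors the review comment.
struct Wrapper(Mutex<Session>);

impl Wrapper {
    // Acquire the guard once at the top and reuse it, instead of calling
    // `self.0.lock()` again for every field access.
    fn compute(&self) -> usize {
        let mut session = self.0.lock().unwrap();
        let doubled: Vec<f32> = session.inputs.iter().map(|x| x * 2.0).collect();
        session.outputs = doubled;
        session.outputs.len()
    }
}
```

Holding a single guard for the whole function also means the state cannot change between two accesses, which repeated locking would not guarantee.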
abrown created PR review comment:
Maybe we should explain somewhere why we are forced to copy the tensor twice — at input and output.
mtobin-tdab commented on PR #7691:
This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:
1: error while executing at wasm backtrace:
    0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute
    1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77
        at ....................bindings.rs:11137:15
2: Failed while accessing backend
3: Data type mismatch: was Int64, tried to convert to Float32
As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.
Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.
For some extra context, both models which have produced this error have been classification models (specifically an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?
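One plausible reading of the error text is that the model produced an Int64 tensor while the host side tried to read it as Float32, rather than anything being wrong with the `ctx` handle. The sketch below illustrates that failure mode and a type-dispatching alternative. The `Tensor` enum and both functions are hypothetical and are not the `ort` crate's API or the PR's code.

```rust
// Hypothetical tensor representation; this is not the `ort` crate's API.
#[derive(Debug, PartialEq)]
enum Tensor {
    Float32(Vec<f32>),
    Int64(Vec<i64>),
}

// Name of the element type, used only for the error message.
fn kind(t: &Tensor) -> &'static str {
    match t {
        Tensor::Float32(_) => "Float32",
        Tensor::Int64(_) => "Int64",
    }
}

// Strict extraction: fails when the element type is not Float32,
// which is the kind of failure the error message describes.
fn extract_f32(t: &Tensor) -> Result<&[f32], String> {
    match t {
        Tensor::Float32(v) => Ok(v.as_slice()),
        other => Err(format!(
            "Data type mismatch: was {}, tried to convert to Float32",
            kind(other)
        )),
    }
}

// Dispatching on the element type instead: every supported type is
// serialized to little-endian bytes without assuming Float32.
fn to_le_bytes(t: &Tensor) -> Vec<u8> {
    match t {
        Tensor::Float32(v) => v.iter().flat_map(|x| x.to_le_bytes()).collect(),
        Tensor::Int64(v) => v.iter().flat_map(|x| x.to_le_bytes()).collect(),
    }
}
```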
devigned updated PR #7691.
devigned submitted PR review.
devigned created PR review comment:
I fixed this and added a test program.
devigned submitted PR review.
devigned created PR review comment:
added to test-programs and reduced to jpeg.
devigned submitted PR review.
devigned created PR review comment:
I updated this to reflect the latest changes in `cargo component` and preview 2.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned submitted PR review.
devigned created PR review comment:
I was verbose to be super clear about what is not implemented. I've since changed it to your suggestion.
devigned commented on PR #7691:
This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:
1: error while executing at wasm backtrace:
    0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute
    1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77
        at ....................bindings.rs:11137:15
2: Failed while accessing backend
3: Data type mismatch: was Int64, tried to convert to Float32
As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.
Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.
For some extra context, both models which have produced this error have been classification models (specifically an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?
Is there any chance you could share code that reproduces this behavior, or create a branch with a failing test reproducing this behavior?
devigned commented on PR #7691:
@sunfishcode, I believe I need someone that is a trusted contributor to Wasmtime to approve the cargo vet dependencies added in this PR (per https://docs.wasmtime.dev/contributing-coding-guidelines.html#dependencies-of-wasmtime). Would you please consider allow listing the failing vet dependencies?
I'm not sure if the `cargo deny` result for `ittapi-sys` is spurious or if the license is going to be a problem. What would you advise?
devigned edited a comment on PR #7691:
@sunfishcode, I believe I need someone that is a trusted contributor to Wasmtime to approve the cargo vet dependencies added in this PR (per https://docs.wasmtime.dev/contributing-coding-guidelines.html#dependencies-of-wasmtime). Would you please consider allow listing the failing vet dependencies?
I'm not sure if the `cargo deny` result for `ittapi-sys` is spurious or if the license is going to be a problem. What would you advise?
Also, if this is not how this stuff should work, please let me know :).
devigned updated PR #7691.
devigned edited a comment on PR #7691:
@sunfishcode, I believe I need someone that is a trusted contributor to Wasmtime to approve the `cargo vet` dependencies added in this PR (per https://docs.wasmtime.dev/contributing-coding-guidelines.html#dependencies-of-wasmtime). Would you please consider allow listing the failing vet dependencies?
I'm not sure if the `cargo deny` result for `ittapi-sys` is spurious or if the license is going to be a problem. What would you advise?
Also, if this is not how this stuff should work, please let me know :).
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
@devigned. I see a bunch of dependencies for the failing vet check:
bytemuck:1.14.2 missing ["safe-to-deploy"]
bytes:1.5.0 missing ["safe-to-deploy"]
castaway:0.2.2 missing ["safe-to-deploy"]
color_quant:1.1.0 missing ["safe-to-deploy"]
compact_str:0.7.1 missing ["safe-to-deploy"]
crunchy:0.2.2 missing ["safe-to-deploy"]
fdeflate:0.3.4 missing ["safe-to-deploy"]
filetime:0.2.16 missing ["safe-to-deploy"]
flate2:1.0.28 missing ["safe-to-deploy"]
half:2.3.1 missing ["safe-to-deploy"]
image:0.24.8 missing ["safe-to-deploy"]
matrixmultiply:0.3.8 missing ["safe-to-deploy"]
ndarray:0.15.6 missing ["safe-to-deploy"]
num-complex:0.4.4 missing ["safe-to-deploy"]
num-integer:0.1.45 missing ["safe-to-deploy"]
ort:2.0.0-rc.0 missing ["safe-to-deploy"]
ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
png:0.17.11 missing ["safe-to-deploy"]
rawpointer:0.2.1 missing ["safe-to-deploy"]
rustversion:1.0.14 missing ["safe-to-deploy"]
simd-adler32:0.3.7 missing ["safe-to-deploy"]
static_assertions:1.1.0 missing ["safe-to-deploy"]
tar:0.4.40 missing ["safe-to-deploy"]
ureq:2.9.1 missing ["safe-to-deploy"]
xattr:1.1.1 missing ["safe-to-deploy"]
I would guess that at least some of these are related to testing? If that is the case, one thing that I suggested in #7807 is to create a test fixture file with all the tensor bytes in the right format. This could remove some of these dependencies, reducing the number of crates that we will have to audit.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned commented on PR #7691:
@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on `half` and `ndarray`, which cut the list of dependencies by over half.
12 unvetted dependencies:
bytes:1.5.0 missing ["safe-to-deploy"]
castaway:0.2.2 missing ["safe-to-deploy"]
compact_str:0.7.1 missing ["safe-to-deploy"]
filetime:0.2.16 missing ["safe-to-deploy"]
flate2:1.0.28 missing ["safe-to-deploy"]
ort:2.0.0-rc.0 missing ["safe-to-deploy"]
ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
rustversion:1.0.14 missing ["safe-to-deploy"]
static_assertions:1.1.0 missing ["safe-to-deploy"]
tar:0.4.40 missing ["safe-to-deploy"]
ureq:2.9.1 missing ["safe-to-deploy"]
xattr:1.2.0 missing ["safe-to-deploy"]
WDYT?
@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on `half` and `ndarray`, which cut the list of dependencies by over half.
WDYT?
Thanks! Let's talk about this in the Wasmtime meeting tomorrow.
abrown updated PR #7691.
I audited the dependencies this PR would bring in and the only crate I didn't audit was `compact_str`: I do not feel comfortable putting a stamp of approval on this crate without some further discussion. The crate is well-designed and has many kinds of tests, including proptests, but the fundamental `unsafe` issue is that the crate manually implements what would otherwise be an `enum`: if a string is less than 24 bytes, it lives on the stack; otherwise, it lives on the heap. This causes all kinds of `unsafe` pointer copies for its special integer representation, `unsafe` ways to pass in unchecked UTF-8 strings, several `unsafe` implementations of `Sync`, `Send`, `LifetimeFree`, etc. I would note that all of these unsafe instances are carefully documented with "SAFETY:" comments which seemed reasonable, but the large amount of unsafety and how inherent it is to the crate gave me pause. Perhaps someone else can more confidently vouch for this?
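For readers unfamiliar with the trade-off described above, here is a rough sketch of the safe `enum` formulation the comment alludes to. It is not `compact_str`'s implementation; `SmallStr`, `INLINE_CAP`, and the method names are hypothetical, and the 23-byte inline capacity is chosen only to match the "less than 24 bytes" wording.

```rust
// Strings shorter than 24 bytes are kept inline; longer ones go to the heap.
const INLINE_CAP: usize = 23;

// Hypothetical safe equivalent: the compiler tracks which variant is live,
// at the cost of a discriminant that a hand-packed layout can avoid.
enum SmallStr {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(String),
}

impl SmallStr {
    fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(s.to_owned())
        }
    }

    fn as_str(&self) -> &str {
        match self {
            // Checked conversion: no `unsafe` needed, but it re-validates UTF-8.
            SmallStr::Inline { len, buf } => {
                std::str::from_utf8(&buf[..*len as usize]).expect("copied from a valid &str")
            }
            SmallStr::Heap(s) => s.as_str(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }
}
```

The safe version pays for an extra discriminant and a UTF-8 re-check on every read, which is presumably the overhead a manual, `unsafe` representation is meant to remove.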
alexcrichton commented on PR #7691:
Given your review and the fact that I'm seeing extensive fuzzing and miri testing in the repo I think it's safe to put all that into the vet entry. Vetting isn't so much guaranteeing that the crate is correct but moreso that it doesn't intentionally do bad things. You've done an initial review and the author is clearly covering all the bases they can, so I think it's reasonable to say it's well vetted.
devigned commented on PR #7691:
@abrown would you like me to add the vet audit entry for `compact_str`? Is that all we need to get this moving forward?
@devigned: yeah, go for it!
mingqiusun submitted PR review.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
I don't think we need to add this top-level feature because we have workspaces. We should be able to reach down and control sub-crate features by doing, e.g., `cargo build --features wasmtime-wasi-nn/onnx`.
abrown created PR review comment:
Not sure this should be making a comeback — why is it back?
abrown created PR review comment:
Yeah, I think all these additions of `target_os = "windows"` are necessary; thanks for catching that.
abrown created PR review comment:
Might want to tweak these comments: "MobileNet" -> "SqueezeNet". BTW, are the SqueezeNet expectations here the same?
abrown created PR review comment:
No `"macos"`?
abrown created PR review comment:
Let's follow the precedent like with the other backends and just retrieve any models with a test helper function, e.g.:
abrown created PR review comment:
--features wasmtime-wasi-nn/onnx \
See my previous comment about workspace features, plus I don't think we really want to enable the `wasmtime-wasi-nn/winml` feature unconditionally since (1) it is Windows-specific and (2) @jianjunz found that the WinML APIs aren't even available in the GitHub CI images.
abrown created PR review comment:
We can refactor all this stuff later, but for now it probably makes sense to retrieve any ONNX artifacts separately, e.g., with a `check_onnx_artifacts_are_available` function.
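As a rough illustration of what such a helper could look like: the function name comes from the comment above, but the signature, the directory argument, and the file names are assumptions for this sketch, and it only checks for files rather than downloading them.

```rust
use std::path::Path;

// Hypothetical helper: the name follows the review comment, but the
// signature and the behavior shown here are assumptions for illustration.
fn check_onnx_artifacts_are_available(dir: &Path, required: &[&str]) -> Result<(), String> {
    for name in required {
        let path = dir.join(name);
        // Fail early with the offending path so a test can skip or report clearly.
        if !path.is_file() {
            return Err(format!("missing ONNX test artifact: {}", path.display()));
        }
    }
    Ok(())
}
```

Keeping the ONNX check separate from the checks for other backends lets a test that needs only ONNX artifacts run without the OpenVINO fixtures being present.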
abrown created PR review comment:
Friendly reminder that this Git submodule change is probably from a missing `git submodule update --init` at some point.
abrown created PR review comment:
Let's just leave this unimplemented; we can follow up with better error handling in a separate PR.
abrown created PR review comment:
In case some Git mischief happened here: `image` looks to be pulling in all the features again, not just `jpeg`.
devigned updated PR #7691.
devigned submitted PR review.
devigned created PR review comment:
No, I don't think we needed to skip 1 there.
devigned created PR review comment:
Yep. Updated. Thank you.
devigned submitted PR review.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned submitted PR review.
devigned created PR review comment:
@abrown this won't work with multiple features enabled. For example, if onnx and openvino are enabled, then each backend will try to be run for a given model. Unless the backends support the same model input, the caller is going to have a bad time.
Rather than running for every backend in a test, I have each test pass in their backend to run.
devigned created PR review comment:
I'd like to find a better way of doing this rather than chunking from u8 to the type and size of the model input.
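For context, the chunking being described is roughly the following, shown here for `f32` only. This is a sketch under the assumption that the bytes are little-endian (as they are in Wasm linear memory); `bytes_to_f32` is a hypothetical name, not a function from the PR.

```rust
// Reinterpret a raw byte buffer as f32 values, four bytes per element.
// Assumes little-endian layout; returns None if the length is not a
// multiple of four, since a partial element cannot be decoded.
fn bytes_to_f32(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Each supported element type needs its own variant of this conversion with a different chunk size, which is part of what makes the approach feel clumsy.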
devigned submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
Yeah, that makes more sense! :+1:
abrown submitted PR review.
devigned updated PR #7691.
devigned submitted PR review.
devigned created PR review comment:
@abrown I didn't realize that I needed to mark this as `publish = false`. Should be good now.
devigned updated PR #7691.
devigned updated PR #7691.
squillace commented on PR #7691:
outstanding. Thanks for your work, @devigned and to the reviewers!
squillace commented on PR #7691:
go figure:
thread 'main' panicked at /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ort-sys-2.0.0-rc.0/build.rs:308:22:
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
you may have to compile ONNX Runtime from source
stack backtrace:
0: rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
2: build_script_build::prepare_libort_dir
3: build_script_build::real_main
4: build_script_build::main
5: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
warning: build failed, waiting for other jobs to finish...
Error: Process completed with exit code 101.
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
That's fine; this just needs a few more target checks.
devigned commented on PR #7691:
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
That's fine; this just needs a few more target checks.
I'm on it.
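A sketch of the kind of target check being discussed, assuming the test harness decides at run time whether to exercise ONNX. The function names are hypothetical; the two excluded architectures are the ones this thread mentions as lacking prebuilt ONNX Runtime binaries, and treating every other architecture as supported is an assumption made only for illustration.

```rust
// Architectures this thread mentions as having no prebuilt ONNX Runtime
// binaries. The strings match the values of `std::env::consts::ARCH`.
fn lacks_prebuilt_onnxruntime(arch: &str) -> bool {
    matches!(arch, "riscv64" | "s390x")
}

// Hypothetical gate a test could call before touching the ONNX backend.
// Treating all other architectures as supported is an assumption.
fn onnx_tests_enabled_here() -> bool {
    !lacks_prebuilt_onnxruntime(std::env::consts::ARCH)
}
```

A compile-time `#[cfg(target_arch = "...")]` attribute would serve the same purpose where the dependency itself must not be built on the unsupported targets.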
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
squillace commented on PR #7691:
let's go baby
devigned updated PR #7691.
squillace commented on PR #7691:
GACK. MinGW!!! crazy amazing
squillace commented on PR #7691:
you having fun yet, @devigned ?
alexcrichton commented on PR #7691:
Since this isn't intended to work everywhere just yet, it might be best to add a new builder on CI specifically dedicated to testing ONNX rather than trying to get it to work across all the platforms.
If that works for you I'd recommend copying this to start, updating all the bits there (e.g. run on ubuntu, not on windows), and then be sure to add an entry down here to ensure we gate on it in CI.
Also, what I might also recommend: if you add `prtest:full` as a string in a commit message it'll run full CI on this PR before going to the merge queue, and that may help avoid the bouncing back and forth between the merge queue and back here.
squillace commented on PR #7691:
drinking booze all night now over here, best soap opera I've watched in years
devigned commented on PR #7691:
@alexcrichton, in https://github.com/bytecodealliance/wasmtime/pull/7691/commits/22d14d6e768d8bbc346b7ade3a99bed7b8571092 I'm reducing the set of triplets in which ONNX will run. I do not plan on pursuing riscv or s390 precompiled onnxruntime bins at this point. Are you good with the matrix-test approach or would you still prefer I break it out into another builder?
devigned updated PR #7691.
devigned edited a comment on PR #7691:
@alexcrichton, in https://github.com/bytecodealliance/wasmtime/pull/7691/commits/7f7fca55bf40fcde8083345c9bbe4b70f85b33ff I'm reducing the set of triplets in which ONNX will run. I do not plan on pursuing riscv or s390 precompiled onnxruntime bins at this point. Are you good with the matrix-test approach or would you still prefer I break it out into another builder?
devigned updated PR #7691.
devigned edited a comment on PR #7691:
@alexcrichton, in https://github.com/bytecodealliance/wasmtime/pull/7691/commits/2b7a104684f0c45e5969f5ef25b858f8069d69bd I'm reducing the set of triplets in which ONNX will run. I do not plan on pursuing riscv or s390 precompiled onnxruntime bins at this point. Are you good with the matrix-test approach or would you still prefer I break it out into another builder?
alexcrichton commented on PR #7691:
Nah if that works for you seems fine, just trying to save some CI headache.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
devigned updated PR #7691.
abrown merged PR #7691.
Last updated: Jan 24 2025 at 00:11 UTC