zhen9910 opened PR #12044 from zhen9910:zkong/update-ort-and-onnx-gpu to bytecodealliance:main:
As discussed in https://github.com/bytecodealliance/wasmtime/issues/8547, the existing wasi-nn ONNX backend only uses the default CPU execution provider. This PR updates the ONNX Runtime crate `ort` to 2.0.0-rc.10, which has improved CUDA support, and adds an ONNX-CUDA based wasi-nn GPU execution target in the `wasmtime-wasi-nn` ONNX backend.
zhen9910 requested wasmtime-wasi-reviewers for a review on PR #12044.
zhen9910 requested fitzgen for a review on PR #12044.
zhen9910 requested wasmtime-default-reviewers for a review on PR #12044.
zhen9910 updated PR #12044.
devigned commented on PR #12044:
Looks like the bump to `ort` caused `cargo vet` to be angry, which is to be expected.

+1 to the addition of GPU support for the ONNX backend \o/
@abrown, you might be interested in this one.
fitzgen requested abrown for a review on PR #12044.
zhen9910 commented on PR #12044:
@abrown could you please review the PR and add the cargo vet entries?
alexcrichton commented on PR #12044:
@abrown is in the process of handing off wasi-nn maintenance/work to @jlb6740 and @rahulchaphalkar, so as a heads up @zhen9910, it might take a moment for them to allocate time to work on this.
zhen9910 commented on PR #12044:
> @abrown is in the process of handing off wasi-nn maintenance/work to @jlb6740 and @rahulchaphalkar, so as a heads up @zhen9910, it might take a moment for them to allocate time to work on this.

I see, thanks for the update.
zhen9910 updated PR #12044.
zhen9910 edited PR #12044:
As discussed in https://github.com/bytecodealliance/wasmtime/issues/8547, the existing wasi-nn ONNX backend only uses the default CPU execution provider. This PR adds an ONNX-CUDA based wasi-nn GPU execution target in the `wasmtime-wasi-nn` ONNX backend.
zhen9910 commented on PR #12044:
@alexcrichton could you take a look at this PR? It has been rebased against main with the cargo vet and ort changes.
alexcrichton commented on PR #12044:
I unfortunately am not equipped myself to review/maintain wasi-nn. Historically that was Andrew, who's now handed off to @jlb6740 and @rahulchaphalkar, but they (I suspect) have other priorities to balance too. @zhen9910 if you'd like to reach out to them directly I'm sure they'd be happy to help work on a path forward
rahulchaphalkar commented on PR #12044:
Thanks @zhen9910 for the contribution and also for waiting for someone to take a look at it, and @alexcrichton for summarizing some of the changes happening here. I'm reviewing this.
rahulchaphalkar submitted PR review:
Looks good, a couple of comments/question about the included Example.
rahulchaphalkar created PR review comment:
This should've read `wasm32-wasip2` instead of `wasm32-wasip1` in the original Readme, as it fails with the error below:

```
Error: failed to run main module `./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm`

Caused by:
    0: failed to instantiate "./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm"
    1: unknown import: `wasi:nn/tensor@0.2.0-rc-2024-10-28::[resource-drop]tensor` has not been defined
```

So this needs to be `p2` for both CPU and GPU. (Build with `cargo build --target wasm32-wasip2`.)
rahulchaphalkar created PR review comment:
What is the expected behavior of this example when run on a system w/o GPU/Cuda? I ran on my system without an Nvidia GPU, and it seemed to run without complaining, or without explicitly falling back to CPU or failing.
```
./target/debug/wasmtime run \
    -Snn \
    --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
    ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip2/debug/classification-component-onnx.wasm \
    gpu
Read ONNX model, size in bytes: 4956208
Using GPU (CUDA) execution target from argument
Loaded graph into wasi-nn with ExecutionTarget::Gpu target
Created wasi-nn execution context.
Read ONNX Labels, # of labels: 1000
Executed graph inference
Retrieved output data with length: 4000
Index: n02099601 golden retriever - Probability: 0.9948673
Index: n02088094 Afghan hound, Afghan - Probability: 0.002528982
Index: n02102318 cocker spaniel, English cocker spaniel, cocker - Probability: 0.001098644
```
rahulchaphalkar created PR review comment:
NIT: the debug messages for CPU and GPU could be of similar form, e.g. `Using CPU/Nvidia GPU/CUDA execution provider` or similar, or more verbose if you want.
zhen9910 commented on PR #12044:
> Thanks @zhen9910 for the contribution and also for waiting for someone to take a look at it, and @alexcrichton for summarizing some of the changes happening here. I'm reviewing this.

Thanks @rahulchaphalkar @alexcrichton for the update! I will take a look and address the comments.
zhen9910 submitted PR review.
zhen9910 created PR review comment:
updated
zhen9910 submitted PR review.
zhen9910 created PR review comment:
The original Readme used `cargo component build`, which builds the wasm component for `wasm32-wasip1` by default. So the Readme will be updated to use `cargo component build --target wasm32-wasip2` for `wasm32-wasip2`.
zhen9910 submitted PR review.
zhen9910 created PR review comment:
From ort: if the GPU execution provider is requested but the device does not have a GPU or the necessary CUDA drivers are missing, ONNX Runtime will silently fall back to the CPU execution provider. The application will continue to run, but inference will happen on the CPU. When `ort` logging is enabled, we can see a warning like: `No execution providers from session options registered successfully; may fall back to CPU.` But this `ort` log is not propagated to `wasi_nn`, which causes confusion. I have therefore added a feature, `ort-tracing`, to enable ort logging for wasi_nn; users can turn it on to verify this fallback behavior. Please see if this is fine.
zhen9910 updated PR #12044.
zhen9910 edited PR review comment.
zhen9910 edited PR review comment.
zhen9910 updated PR #12044.
zhen9910 requested rahulchaphalkar for a review on PR #12044.
rahulchaphalkar submitted PR review:
Thanks for addressing the feedback, this looks good.
rahulchaphalkar commented on PR #12044:
@alexcrichton I've finished the review. Do you want to take a look and merge this? Also not sure if this needs a `prtest:full` or whether to let the full CI run later.
alexcrichton submitted PR review:
Thanks @rahulchaphalkar!
alexcrichton commented on PR #12044:
And thanks @zhen9910 of course too!
alexcrichton added PR #12044 Support Nvidia-Cuda execution provider for wasi-nn onnx backend to the merge queue.
alexcrichton removed PR #12044 Support Nvidia-Cuda execution provider for wasi-nn onnx backend from the merge queue.
alexcrichton merged PR #12044.
Last updated: Feb 24 2026 at 05:28 UTC