zhen9910 opened PR #12044 from zhen9910:zkong/update-ort-and-onnx-gpu to bytecodealliance:main:
As discussed in https://github.com/bytecodealliance/wasmtime/issues/8547, the existing wasi-nn ONNX backend only uses the default CPU execution provider. This PR updates the ONNX Runtime crate `ort` to 2.0.0-rc.10, which has improved CUDA support, and adds an ONNX-CUDA based wasi-nn GPU execution target in the `wasmtime-wasi-nn` ONNX backend.
zhen9910 requested wasmtime-wasi-reviewers for a review on PR #12044.
zhen9910 requested fitzgen for a review on PR #12044.
zhen9910 requested wasmtime-default-reviewers for a review on PR #12044.
zhen9910 updated PR #12044.
devigned commented on PR #12044:
Looks like the bump to `ort` caused `cargo vet` to be angry, which is to be expected.

+1 to the addition of GPU support for the ONNX backend \o/
@abrown, you might be interested in this one.
fitzgen requested abrown for a review on PR #12044.
zhen9910 commented on PR #12044:
@abrown could you please review the PR and add the cargo vet entries?
alexcrichton commented on PR #12044:
@abrown is in the process of handing off wasi-nn maintenance/work to @jlb6740 and @rahulchaphalkar, so as a heads up @zhen9910, it might take a moment for them to allocate time to work on this.
zhen9910 commented on PR #12044:
> @abrown is in the process of handing off wasi-nn maintenance/work to @jlb6740 and @rahulchaphalkar, so as a heads up @zhen9910, it might take a moment for them to allocate time to work on this.

I see, thanks for the update.
zhen9910 updated PR #12044.
zhen9910 edited PR #12044:
As discussed in https://github.com/bytecodealliance/wasmtime/issues/8547, the existing wasi-nn ONNX backend only uses the default CPU execution provider. This PR adds an ONNX-CUDA based wasi-nn GPU execution target in the `wasmtime-wasi-nn` ONNX backend.
zhen9910 commented on PR #12044:
@alexcrichton could you take a look at this PR? It has been rebased against main with the cargo vet and ort changes.
alexcrichton commented on PR #12044:
I unfortunately am not equipped myself to review/maintain wasi-nn. Historically that was Andrew, who's now handed off to @jlb6740 and @rahulchaphalkar, but they (I suspect) have other priorities to balance too. @zhen9910 if you'd like to reach out to them directly I'm sure they'd be happy to help work on a path forward
rahulchaphalkar commented on PR #12044:
Thanks @zhen9910 for the contribution and also for waiting for someone to take a look at it, and @alexcrichton for summarizing some of the changes happening here. I'm reviewing this.
rahulchaphalkar submitted PR review:
Looks good, a couple of comments/question about the included Example.
rahulchaphalkar created PR review comment:
This should've read `wasm32-wasip2` instead of `wasm32-wasip1` in the original Readme, as it fails with the error below:

```
Error: failed to run main module `./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm`

Caused by:
    0: failed to instantiate "./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm"
    1: unknown import: `wasi:nn/tensor@0.2.0-rc-2024-10-28::[resource-drop]tensor` has not been defined
```

So this needs to be `p2` for both CPU and GPU. (Build with `cargo build --target wasm32-wasip2`.)
rahulchaphalkar created PR review comment:
What is the expected behavior of this example when run on a system w/o GPU/Cuda? I ran on my system without an Nvidia GPU, and it seemed to run without complaining, or without explicitly falling back to CPU or failing.
```
./target/debug/wasmtime run \
    -Snn \
    --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
    ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip2/debug/classification-component-onnx.wasm \
    gpu
Read ONNX model, size in bytes: 4956208
Using GPU (CUDA) execution target from argument
Loaded graph into wasi-nn with ExecutionTarget::Gpu target
Created wasi-nn execution context.
Read ONNX Labels, # of labels: 1000
Executed graph inference
Retrieved output data with length: 4000
Index: n02099601 golden retriever - Probability: 0.9948673
Index: n02088094 Afghan hound, Afghan - Probability: 0.002528982
Index: n02102318 cocker spaniel, English cocker spaniel, cocker - Probability: 0.001098644
```
rahulchaphalkar created PR review comment:
NIT: the debug messages for CPU and GPU could be of similar form, e.g. `Using CPU/Nvidia GPU/CUDA execution provider` or similar, or more verbose if you want.
zhen9910 commented on PR #12044:
> Thanks @zhen9910 for the contribution and also for waiting for someone to take a look at it, and @alexcrichton for summarizing some of the changes happening here. I'm reviewing this.

Thanks @rahulchaphalkar @alexcrichton for the update! I will take a look and address the comments.
zhen9910 submitted PR review.
zhen9910 created PR review comment:
updated
zhen9910 submitted PR review.
zhen9910 created PR review comment:
The original Readme used `cargo component build`, which builds the wasm component for `wasm32-wasip1` by default. So the Readme will be updated to use `cargo component build --target wasm32-wasip2` for `wasm32-wasip2`.
zhen9910 submitted PR review.
zhen9910 created PR review comment:
From ort: if the GPU execution provider is requested but the device does not have a GPU or the necessary CUDA drivers are missing, ONNX Runtime will silently fall back to the CPU execution provider. The application will continue to run, but inference will happen on the CPU. When `ort` logging is enabled, we can see a warning like: `No execution providers from session options registered successfully; may fall back to CPU.` But this `ort` log is not propagated to `wasi_nn`, which causes confusion. I have therefore added a feature, `ort-tracing`, to enable ort logging for wasi_nn; users can turn it on to verify this fallback behavior. Please see if this is fine.
zhen9910 updated PR #12044.
zhen9910 edited PR review comment.
zhen9910 edited PR review comment.
zhen9910 updated PR #12044.
zhen9910 requested rahulchaphalkar for a review on PR #12044.
rahulchaphalkar submitted PR review:
Thanks for addressing the feedback, this looks good.
rahulchaphalkar commented on PR #12044:
@alexcrichton I've finished the review. Do you want to take a look and merge this? Also not sure if this needs a `prtest:full` or whether to let the full CI run later.
alexcrichton submitted PR review:
Thanks @rahulchaphalkar!
alexcrichton commented on PR #12044:
And thanks @zhen9910 of course too!
alexcrichton added PR #12044 Support Nvidia-Cuda execution provider for wasi-nn onnx backend to the merge queue.
alexcrichton removed PR #12044 Support Nvidia-Cuda execution provider for wasi-nn onnx backend from the merge queue.
alexcrichton merged PR #12044.
Last updated: Feb 24 2026 at 05:28 UTC