rahulchaphalkar opened PR #9234 from rahulchaphalkar:pytorch-dyn-link-pr to bytecodealliance:main:
This change adds a PyTorch backend for wasi-nn.
The `tch` crate is used for libtorch bindings. I have added an image classification example to demonstrate its usage, which uses a TorchScript model.
This backend is currently gated behind a wasi-nn feature flag (`--features pytorch`). Due to dynamic linking, a libtorch v2.4.0 installation on the system (specified by `LIBTORCH=/path/to/libtorch`) is needed for building.
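A minimal sketch of the build setup described above; the libtorch path is a placeholder, and the exact feature path may differ depending on which crate in the workspace you build:

```shell
# Point the build at an existing libtorch v2.4.0 installation
# (the path below is a placeholder for wherever libtorch is unpacked).
export LIBTORCH=/path/to/libtorch

# Because the backend links dynamically, the libtorch shared libraries
# typically also need to be discoverable at run time.
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"

# Build with the PyTorch wasi-nn backend feature enabled.
cargo build --features pytorch
```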
rahulchaphalkar requested alexcrichton for a review on PR #9234.
rahulchaphalkar requested wasmtime-core-reviewers for a review on PR #9234.
rahulchaphalkar requested wasmtime-default-reviewers for a review on PR #9234.
abrown assigned abrown to PR #9234.
alexcrichton requested abrown for a review on PR #9234.
abrown submitted PR review:
This is a good start. The main thing to fix is the handling of the input and output tensors.
abrown created PR review comment:
nit: merge these
abrown created PR review comment:
This function needs to handle the passed index: a model can have multiple inputs and the job this function needs to do is map the incoming tensor to the right one.
abrown created PR review comment:
This seems incorrect: `load_from_dir` is going to pass the path to a directory and this code will try to use it as a file.
abrown created PR review comment:
Storing the output tensor in the same location as the input tensor means that `set_input` followed immediately by `get_output` would return the input tensor... probably not what you want here. It looks like `forward_ts` only returns a single tensor, so perhaps just create an `output` field for that and another `input: Vec<Tensor>` for the inputs.
abrown created PR review comment:
`unimplemented!("the pytorch backend does not yet support GPU execution targets")`
abrown created PR review comment:
`unimplemented!("the pytorch backend does not yet support TPU execution targets")`
abrown created PR review comment:
...since the PyTorch backend does indeed support GPU execution.
abrown created PR review comment:
This is incorrect: we need to retrieve the data regardless of the type, so we need to first figure out how many bytes each `Kind` is before constructing the receiving buffer, like: `let data = vec![0u8; size_of(ty) * numel]; self.1.copy_data_u8(&mut data, numel);`
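The buffer-sizing idea above can be sketched as follows. The `TensorType` variants and the `kind_to_size` helper here are illustrative stand-ins, not the actual wasmtime-wasi-nn or tch types:

```rust
// Hypothetical tensor element types, loosely mirroring wasi-nn's tensor-type
// enum; the real backend would map tch::Kind instead.
#[derive(Clone, Copy, Debug)]
enum TensorType {
    Fp16,
    Bf16,
    Fp32,
    Fp64,
    U8,
    I32,
    I64,
}

/// Bytes per element for each tensor type.
fn kind_to_size(ty: TensorType) -> usize {
    match ty {
        TensorType::U8 => 1,
        TensorType::Fp16 | TensorType::Bf16 => 2,
        TensorType::Fp32 | TensorType::I32 => 4,
        TensorType::Fp64 | TensorType::I64 => 8,
    }
}

fn main() {
    // A tensor of 1000 f32 elements needs a 4000-byte receiving buffer,
    // which copy_data_u8 could then fill.
    let numel = 1000;
    let data = vec![0u8; kind_to_size(TensorType::Fp32) * numel];
    println!("buffer size: {}", data.len());
}
```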
The `cargo vet` situation is a bit much:

```
cargo vet diff zstd-safe 5.0.1+zstd.1.5.2 5.0.2+zstd.1.5.2  gyscos  zstd  2 files changed, 4 insertions(+), 4 deletions(-)
cargo vet diff zstd 0.11.1+zstd.1.5.2 0.11.2+zstd.1.5.2  gyscos  zip  3 files changed, 5 insertions(+), 5 deletions(-)
cargo vet diff num-complex 0.4.2 0.4.6  cuviper  ndarray  6 files changed, 188 insertions(+), 48 deletions(-)
  NOTE: this project trusts Josh Stone (cuviper) - consider cargo vet trust num-complex or cargo vet trust --all cuviper
cargo vet inspect constant_time_eq 0.1.5  cesarb  zip  311 lines
cargo vet diff sha1 0.10.5 0.10.6  newpavlov  zip  7 files changed, 302 insertions(+), 20 deletions(-)
cargo vet inspect rawpointer 0.2.1  bluss  ndarray and matrixmultiply  559 lines
cargo vet diff zip 0.6.4 0.6.6  Plecra  tch and torch-sys  14 files changed, 604 insertions(+), 109 deletions(-)
cargo vet inspect inout 0.1.3  newpavlov  cipher  1112 lines
  NOTE: cargo vet import zcash would eliminate this
cargo vet inspect pbkdf2 0.9.0  tarcieri  zip  1120 lines
cargo vet inspect bzip2 0.4.4  alexcrichton  zip  2094 lines
  NOTE: this project trusts Alex Crichton (alexcrichton) - consider cargo vet trust bzip2 or cargo vet trust --all alexcrichton
cargo vet inspect safetensors 0.3.3  Narsil  tch  2200 lines
cargo vet inspect cipher 0.4.4  newpavlov  aes  2635 lines
  NOTE: cargo vet import zcash would reduce this to a 1300-line diff
cargo vet inspect password-hash 0.3.2  tarcieri  pbkdf2  3139 lines
cargo vet inspect base64ct 1.6.0  tarcieri  password-hash  3381 lines
cargo vet diff half 1.8.2 2.4.1  starkat99  tch  19 files changed, 2546 insertions(+), 958 deletions(-)
cargo vet inspect time 0.1.44  jhpratt  zip  3915 lines
cargo vet inspect aes 0.7.5  tarcieri  zip  6822 lines
cargo vet inspect matrixmultiply 0.3.8  bluss  ndarray  7934 lines
cargo vet inspect ndarray 0.15.6  jturner314  tch  41996 lines
cargo vet inspect torch-sys 0.17.0  LaurentMazare  tch  52119 lines
cargo vet inspect bzip2-sys 0.1.11+1.0.8  alexcrichton  bzip2  264133 lines
  NOTE: this project trusts Alex Crichton (alexcrichton) - consider cargo vet trust bzip2-sys or cargo vet trust --all alexcrichton
cargo vet inspect tch 0.17.0  LaurentMazare  wasmtime-wasi-nn  2287297 lines
```
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Changed to `path.join("model.pt")`
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Good point. I added `inputs` as a vector of tensors. I also changed the structs in general to use named fields, to make the code more readable.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done, added `kind_to_size()` for the conversions.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
This is one of the differences for this backend. The module's `forward` method should handle multiple inputs appropriately if it does support multiple inputs. The vector of input tensors being passed to `forward` should be sufficient; no index or name is needed.
rahulchaphalkar commented on PR #9234:
> This is a good start. The main thing to fix is the handling of the input and output tensors.
Thanks for the review, Andrew. I've marked smaller Nits as resolved, and I've addressed other comments as well, but kept them 'unresolved' as of now until you take a look.
abrown submitted PR review.
abrown created PR review comment:
This is almost there: we're still ignoring the index, though. What might need to happen is that we set up inputs as a `Vec<Option<TchTensor>>` filled with `None` and then set the right item based on the index. We are able to retrieve the number of inputs, right?
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
The index is being ignored because jit-compiled PyTorch models expect tensors in order, similar to what is achieved by the vector of tensors in `set_input()`. Do we need to use the index here to keep things consistent with wasi-nn? I don't think I saw a direct way to retrieve the number of inputs to a model; I can look into that further. However, someone using PyTorch would probably not intend to use an index.
This was my previous comment on ignoring the index:
> This is one of the differences for this backend. The module's forward method should handle multiple inputs appropriately if it does support multiple inputs. The vector of input tensors being passed to forward should be sufficient, no index or name is needed.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Did you mean use the index as the position/index for the vector of tensors? That sounds good. I'll see if there's a way to determine the max number of inputs to the given model.
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Alright, so here's what I've changed:
- `set_input()` now uses the `id` passed to it. If it is a `u32`, it assigns the tensor to `inputs[id]`, assigning `None` to any empty spots below the given `id`. This gives the user the flexibility to set input tensors non-sequentially.
- `compute()` will check the vector for any `None` values, and give an error if any are present, before calling `forward_ts()`.
- There is no reliable way at this time to get the maximum number of inputs for a model. However, assigning more inputs than available returns an error message at run time detailing the maximum number of inputs the model expects, which is helpful for the user. The expectation, similar to other backends, is for the user to be aware of input size/shape etc.
- If `id` is a string, I've currently permitted only a single input to be set, effectively ignoring the index. If more inputs are assigned, or if `u32` and `String` indexes are used together, we give out an error.

Let me know if this looks fine. Thanks.
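The index-handling scheme described above can be sketched roughly as follows; the `Tensor` struct and method signatures here are illustrative placeholders standing in for `tch::Tensor` and the real backend traits:

```rust
// Placeholder tensor type standing in for tch::Tensor.
#[derive(Clone, Debug, PartialEq)]
struct Tensor(Vec<f32>);

#[derive(Default)]
struct ExecutionContext {
    inputs: Vec<Option<Tensor>>,
}

impl ExecutionContext {
    /// Place `tensor` at position `id`, padding any gaps with `None`
    /// so inputs can be set non-sequentially.
    fn set_input(&mut self, id: u32, tensor: Tensor) {
        let id = id as usize;
        if self.inputs.len() <= id {
            self.inputs.resize(id + 1, None);
        }
        self.inputs[id] = Some(tensor);
    }

    /// Before forwarding, verify that every input slot was filled.
    fn compute(&self) -> Result<(), String> {
        if self.inputs.is_empty() || self.inputs.iter().any(|t| t.is_none()) {
            return Err("not all inputs were set".to_string());
        }
        // ...the real backend would call forward_ts(&ordered_inputs) here...
        Ok(())
    }
}

fn main() {
    let mut ctx = ExecutionContext::default();
    // Setting index 1 first leaves slot 0 as None, so compute() errors.
    ctx.set_input(1, Tensor(vec![1.0]));
    assert!(ctx.compute().is_err());
    // Once every slot is filled, compute() can proceed.
    ctx.set_input(0, Tensor(vec![2.0]));
    assert!(ctx.compute().is_ok());
    println!("all inputs set");
}
```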
rahulchaphalkar updated PR #9234 (assigned to abrown).
abrown submitted PR review.
abrown created PR review comment:
Yeah, looking at 727a4aa, I think that makes more sense. I was kind of hoping that `CModule::named_parameters` would return the list of inputs by name but, if that's not the case, then let's just return an error for the `Id::Name` side of things. In any case, we probably don't need `id_type`: if we get an `Id::Index` we know where to put it in the vector, and if we get an `Id::Name` then we either (a) look up its index in `named_parameters` or (b) if that's not possible, return an error.
abrown edited PR review comment.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
I had added the `Id::Name` case because the wit test cases (`nn_wit_image_classification_pytorch` in this case) need a Name instead of an Index to pass - nn.rs and wasi-nn.wit. Although it looks like `set-input` might be going away anyway.
rahulchaphalkar updated PR #9234 (assigned to abrown).
abrown submitted PR review.
abrown created PR review comment:
Here's the identical 44.7MB file checked in again.
abrown created PR review comment:
Can we find a smaller model or download this instead? Not all Wasmtime users probably want to download this file... twice.
abrown created PR review comment:
Can you re-add the newline?
abrown edited PR review comment.
abrown updated PR #9234 (assigned to abrown).
abrown updated PR #9234 (assigned to abrown).
abrown updated PR #9234 (assigned to abrown).
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
I've made the following changes:
- Created a repository to generate jit-compiled libtorch models, and to download models from its releases for tests and examples: https://github.com/rahulchaphalkar/libtorch-models (https://github.com/rahulchaphalkar/libtorch-models/releases/tag/v0.1).
- Switched from Resnet18 (45MB) to Squeezenet1.1 (4.5MB) as the file to download from the above repo.
- This `libtorch-models` repo can issue new releases if we update the pytorch/libtorch version in this backend from the current 2.4.0, to ensure model compatibility.
rahulchaphalkar created PR review comment:
Downloading from the external repo, same as the tests.
rahulchaphalkar submitted PR review.
rahulchaphalkar updated PR #9234 (assigned to abrown).
I think 51930fa and da1f467 have unintentionally updated some crates. Can you remove those commits?
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar commented on PR #9234:
I rolled back the 2 commits, but there's an issue with the lock file. I had previously attempted to fix this by deleting my lock file, rebasing off of latest main, and then doing a `cargo build` to generate my lock file changes on top of that.
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar commented on PR #9234:
The failing tests fail due to:

```
error: failed retrieving file 'mingw-w64-x86_64-headers-git-12.0.0.r329.g8f7b5ce36-1-any.pkg.tar.zst.sig' from mirror.clarkson.edu : Operation too slow. Less than 1 bytes/sec transferred the last 10 seconds
mingw-w64-x86_64-gcc-14.2.0-1-any downloading...
error: failed retrieving file 'mingw-w64-x86_64-mpfr-4.2.1-2-any.pkg.tar.zst' from mirror.msys2.org : Operation too slow. Less than 1 bytes/sec transferred the last 10 seconds
```

I'm hoping a rerun of CI would help.
rahulchaphalkar commented on PR #9234:
@abrown can you take a look
abrown submitted PR review.
abrown merged PR #9234.
Last updated: Dec 13 2025 at 19:03 UTC