rahulchaphalkar opened PR #9234 from rahulchaphalkar:pytorch-dyn-link-pr
to bytecodealliance:main
:
This change adds a PyTorch backend for wasi-nn.
tch crate is used for Libtorch bindings. I have added an image classification example to demonstrate its usage, which uses a torchscript model.
This backend is currently gated behind a wasi-nn feature flag--features pytorch
as due to dynamic linking, a Libtorch v2.4.0 installation on the system (specified byLIBTORCH=/path/to/libtorch
) is needed for building.
rahulchaphalkar requested alexcrichton for a review on PR #9234.
rahulchaphalkar requested wasmtime-core-reviewers for a review on PR #9234.
rahulchaphalkar requested wasmtime-default-reviewers for a review on PR #9234.
abrown assigned abrown to PR #9234.
alexcrichton requested abrown for a review on PR #9234.
abrown submitted PR review:
This is a good start. The main thing to fix is the handling of the input and output tensors.
abrown created PR review comment:
nit: merge these
abrown created PR review comment:
This function needs to handle the passed index: a model can have multiple inputs and the job this function needs to do is map the incoming tensor to the right one.
abrown created PR review comment:
This seems incorrect:
load_from_dir
is going to pass the path to a directory and this code will try to use it as a file.
abrown created PR review comment:
Storing the output tensor in the same location as the input tensor means that
set_input
followed immediately byget_output
would return the input tensor... probably not what you want here. It looks likeforward_ts
only returns a single tensor so perhaps just create anoutput
field for that and anotherinput: Vec<Tensor>
for the inputs.
abrown created PR review comment:
unimplemented!("the pytorch backend does not yet support GPU execution targets")
abrown created PR review comment:
unimplemented!("the pytorch backend does not yet support TPU execution targets")
abrown created PR review comment:
...since PyTorch backend does indeed support GPU execution.
abrown created PR review comment:
This is incorrect: we need to retrieve the data regardless of the type, so we need to first figure out how many bytes each
Kind
is before constructing the receiving buffer, like:let data = vec![0u8; size_of(ty) * numel]; self.1.copy_data_u8(&mut data, numel);
The
cargo vet
situation is a bit much:cargo vet diff zstd-safe 5.0.1+zstd.1.5.2 5.0.2+zstd.1.5.2 gyscos zstd 2 files changed, 4 insertions(+), 4 deletions(-) cargo vet diff zstd 0.11.1+zstd.1.5.2 0.11.2+zstd.1.5.2 gyscos zip 3 files changed, 5 insertions(+), 5 deletions(-) cargo vet diff num-complex 0.4.2 0.4.6 cuviper ndarray 6 files changed, 188 insertions(+), 48 deletions(-) NOTE: this project trusts Josh Stone (cuviper) - consider cargo vet trust num-complex or cargo vet trust --all cuviper cargo vet inspect constant_time_eq 0.1.5 cesarb zip 311 lines cargo vet diff sha1 0.10.5 0.10.6 newpavlov zip 7 files changed, 302 insertions(+), 20 deletions(-) cargo vet inspect rawpointer 0.2.1 bluss ndarray and matrixmultiply 559 lines cargo vet diff zip 0.6.4 0.6.6 Plecra tch and torch-sys 14 files changed, 604 insertions(+), 109 deletions(-) cargo vet inspect inout 0.1.3 newpavlov cipher 1112 lines NOTE: cargo vet import zcash would eliminate this cargo vet inspect pbkdf2 0.9.0 tarcieri zip 1120 lines cargo vet inspect bzip2 0.4.4 alexcrichton zip 2094 lines NOTE: this project trusts Alex Crichton (alexcrichton) - consider cargo vet trust bzip2 or cargo vet trust --all alexcrichton cargo vet inspect safetensors 0.3.3 Narsil tch 2200 lines cargo vet inspect cipher 0.4.4 newpavlov aes 2635 lines NOTE: cargo vet import zcash would reduce this to a [130](https://github.com/bytecodealliance/wasmtime/actions/runs/10836457564/job/30070281197?pr=9234#step:6:131)0-line diff cargo vet inspect password-hash 0.3.2 tarcieri pbkdf2 3139 lines cargo vet inspect base64ct 1.6.0 tarcieri password-hash 3381 lines cargo vet diff half 1.8.2 2.4.1 starkat99 tch 19 files changed, 2546 insertions(+), 958 deletions(-) cargo vet inspect time 0.1.44 jhpratt zip 3915 lines cargo vet inspect aes 0.7.5 tarcieri zip 6822 lines cargo vet inspect matrixmultiply 0.3.8 bluss ndarray 7934 lines cargo vet inspect ndarray 0.15.6 jturner314 tch 41996 lines cargo vet inspect torch-sys 0.17.0 LaurentMazare tch 52119 lines cargo vet inspect bzip2-sys 0.1.11+1.0.8 alexcrichton bzip2 264[133](https://github.com/bytecodealliance/wasmtime/actions/runs/10836457564/job/30070281197?pr=9234#step:6:134) lines NOTE: this project trusts Alex Crichton (alexcrichton) - consider cargo vet trust bzip2-sys or cargo vet trust --all alexcrichton cargo vet inspect tch 0.17.0 LaurentMazare wasmtime-wasi-nn 2287297 lines
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Changed to
path.join("model.pt")
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Good point. I added
inputs
as vector of tensors. Also changed structs in general to use named fields to make code more readable.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done, added
kind_to_size()
for conversions.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
This is one of the differences for this backend. The module's
forward
method should handle multiple inputs appropriately if it does support multiple inputs. The vector of input tensors being passed toforward
should be sufficient, no index or name is needed.
rahulchaphalkar commented on PR #9234:
This is a good start. The main thing to fix is the handling of the input and output tensors.
Thanks for the review, Andrew. I've marked smaller Nits as resolved, and I've addressed other comments as well, but kept them 'unresolved' as of now until you take a look.
abrown submitted PR review.
abrown created PR review comment:
This is almost there: we're still ignoring the index, though. What might need to happen is that we set up inputs as a
Vec<Option<TchTensor>>
filled withNone
and then set the right item based on the index. We are able to retrieve the number of inputs, right?
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
The index is being ignored because jit-compiled pytorch models expect tensors in order, similar to what is achieved by vector of tensors in
set_input()
. Do we need to use index here to keep things consistent with wasi-nn? I don't think I saw a direct way to retrieve the number of inputs to a model, I can look into that further. However, someone using pytorch would probably not intend on using index.
This was my previous comment on ignoring the index:This is one of the differences for this backend. The module's forward method should handle multiple inputs appropriately if it does support multiple inputs. The vector of input tensors being passed to forward should be sufficient, no index or name is needed.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Did you mean use the index as the position/index for the vector of tensors? That sounds good. I'll see if there's a way to determine the max number of inputs to the given model.
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Alright, so here's what I've changed -
set_input()
now uses theid
passed to it. If it isu32
, it assigns the tensor to the vectorinputs[id]
, assigningNone
if there are any empty spots below the givenid
. This will give the user flexibility to set input tensors non sequentially.compute()
will check the vector for anyNone
values, and give an error if present, before callingforward_ts()
.- There is no reliable way at this time to get max inputs for a model. However, assigning more inputs than available returns an error message detailing the max inputs the model expects at run time, so this is helpful for the user. The expectation, similar to other backends, is for the user to be aware of input size/shape etc.
- If
id
is a string, I've currently permitted only single input to be set, effectively ignoring the index. If more inputs are assigned, or ifu32
andString
indexes are used together, we give out an error.Let me know if this looks fine. Thanks.
rahulchaphalkar updated PR #9234 (assigned to abrown).
abrown submitted PR review.
abrown created PR review comment:
Yeah, looking at 727a4aa, I think that makes more sense. I was kind of hoping that
CModule::named_parameters
would return the list of inputs by name, but, if that's not the case, then let's just return an error forId::Name
side of things. In any case, we probably don't needid_type
: if we get anId::Index
we know were to put it in the vector and if we get anId::Name
then we either (a) look up its index innamed_parameters
or (b) if that's not possible, return an error.
abrown edited PR review comment.
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
I had added the Id:Name case because the wit test cases (
nn_wit_image_classification_pytorch
in this case) need a Name instead of Index to pass - nn.rs and wasi-nn.wit. Although it looks likeset-input
might be going away anyway.
rahulchaphalkar updated PR #9234 (assigned to abrown).
abrown submitted PR review.
abrown created PR review comment:
Here's the identical 44.7MB file checked in again.
abrown created PR review comment:
Can we find a smaller model or download this instead? Not all Wasmtime users probably want to download this file... twice.
abrown created PR review comment:
Can you re-add the new line?
abrown edited PR review comment.
abrown updated PR #9234 (assigned to abrown).
abrown updated PR #9234 (assigned to abrown).
abrown updated PR #9234 (assigned to abrown).
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
Done
rahulchaphalkar submitted PR review.
rahulchaphalkar created PR review comment:
I've made the following changes -
Created a repository to generate jit-compiled libtorch models, and download models from Release for tests and examples.
https://github.com/rahulchaphalkar/libtorch-models
https://github.com/rahulchaphalkar/libtorch-models/releases/tag/v0.1Switched from Resnet18(45MB) to Squeezenet1.1(4.5MB) as the file to download from above repo.
- This
libtorch-models
repo can issue new releases if we update the pytorch/libtorch version in this backend from current 2.4.0 to ensure model compatibility.
rahulchaphalkar created PR review comment:
Downloading from external repo, same as tests
rahulchaphalkar submitted PR review.
rahulchaphalkar updated PR #9234 (assigned to abrown).
I think 51930fa and da1f467 have unintentionally updated some crates. Can you remove those commits?
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar commented on PR #9234:
I rolled back 2 commits, but there's an issue with the lock file. I had previously attempted to fix this by deleting my lockfile, rebasing off of latest main, and then doing a
cargo build
to generate any of my lock file changes on top of that.
rahulchaphalkar updated PR #9234 (assigned to abrown).
rahulchaphalkar commented on PR #9234:
The failing tests fail due to
error: failed retrieving file 'mingw-w64-x86_64-headers-git-12.0.0.r329.g8f7b5ce36-1-any.pkg.tar.zst.sig' from mirror.clarkson.edu : Operation too slow. Less than 1 bytes/sec transferred the last 10 seconds mingw-w64-x86_64-gcc-14.2.0-1-any downloading... error: failed retrieving file 'mingw-w64-x86_64-mpfr-4.2.1-2-any.pkg.tar.zst' from mirror.msys2.org : Operation too slow. Less than 1 bytes/sec transferred the last 10 seconds
I'm hoping a rerun of CI would help.
rahulchaphalkar commented on PR #9234:
@abrown can you take a look
abrown submitted PR review.
abrown merged PR #9234.
Last updated: Jan 24 2025 at 00:11 UTC