Stream: wasi-nn

Topic: Named models in the Wasmtime CLI


view this post on Zulip Andrew Brown (Aug 09 2023 at 16:58):

I intend to merge a change to wasi-nn to add "named models," which adds a way to refer to ML models by name instead of having to pass in all of the bytes (see @Matthew Tamayo-Rios's PR here: https://github.com/WebAssembly/wasi-nn/pull/38). I'm looking for input on whether this functionality should be bubbled up to the Wasmtime CLI flags.

This change has been scoped down to only adding support for getting a handle to a named model for inference and updating the WIT bindings to the latest spec.

view this post on Zulip Andrew Brown (Aug 09 2023 at 17:02):

The way I'm thinking about it, it would look and feel approximately like preopened directories. Maybe the user passes in --preloaded-models=<format>:<directory> and the wasmtime-wasi-nn crate figures out the necessary bits to load those models and make them available under the directory name. Maybe we add a :<name> suffix on there so users have more control over the name.
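For readers following along, here is a rough sketch of how a value of the proposed form <format>:<directory>[:<name>] might be parsed. It is only an illustration of the idea in this message, not what Wasmtime implements: the PreloadSpec struct and parse_preload_spec function are hypothetical names, and the fallback of using the directory's last path component as the model name is an assumption drawn from "make them available with the directory name."

```rust
use std::path::PathBuf;

/// Hypothetical parsed form of one `--preloaded-models` value.
/// This is an illustrative type, not part of wasmtime-wasi-nn.
#[derive(Debug, PartialEq)]
struct PreloadSpec {
    format: String,
    directory: PathBuf,
    name: String,
}

/// Parse `<format>:<directory>[:<name>]`.
/// Assumption: when `:<name>` is absent, the model is named after the
/// last component of the directory path.
/// Limitation: splitting on ':' means paths that themselves contain a
/// colon (e.g. Windows drive letters) are not handled by this sketch.
fn parse_preload_spec(spec: &str) -> Result<PreloadSpec, String> {
    let mut parts = spec.splitn(3, ':');

    let format = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or_else(|| format!("missing format in `{spec}`"))?;

    let directory = parts
        .next()
        .filter(|s| !s.is_empty())
        .map(PathBuf::from)
        .ok_or_else(|| format!("missing directory in `{spec}`"))?;

    let name = match parts.next() {
        // An explicit, non-empty `:<name>` suffix wins.
        Some(n) if !n.is_empty() => n.to_string(),
        // A trailing ':' with nothing after it is treated as an error.
        Some(_) => return Err(format!("empty name in `{spec}`")),
        // Otherwise fall back to the directory's final component.
        None => directory
            .file_name()
            .and_then(|n| n.to_str())
            .map(str::to_string)
            .ok_or_else(|| format!("cannot derive a name from `{spec}`"))?,
    };

    Ok(PreloadSpec {
        format: format.to_string(),
        directory,
        name,
    })
}
```

With this sketch, "openvino:./models/mobilenet" would yield a model named mobilenet, while "openvino:./models/mobilenet:classifier" would override the name to classifier.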

view this post on Zulip Andrew Brown (Aug 09 2023 at 17:04):

My rationale for doing this in the CLI is to allow experimentation with this new API. This would also be available as a programmatic API on the wasi-nn host object (e.g., during construction) for embedders to use.
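To make the "programmatic API during construction" idea concrete, here is a minimal sketch of a host-side registry that an embedder fills in up front and that later resolves models by name. Everything in it is hypothetical (the NamedModels type, with_model, and load_by_name are made-up names, not the wasmtime-wasi-nn API), and it stores raw bytes purely for illustration; a real host would more likely hold backend-specific loaded graphs.

```rust
use std::collections::HashMap;

/// Hypothetical host-side registry of models keyed by name.
/// Illustrative only; not the wasmtime-wasi-nn host object.
#[derive(Default)]
struct NamedModels {
    models: HashMap<String, Vec<u8>>,
}

impl NamedModels {
    /// Builder-style registration, meant to be chained at construction time.
    /// Registering the same name twice replaces the earlier entry.
    fn with_model(mut self, name: &str, bytes: Vec<u8>) -> Self {
        self.models.insert(name.to_string(), bytes);
        self
    }

    /// Look up a previously registered model by name.
    /// Returns `None` when no model with that name was registered.
    fn load_by_name(&self, name: &str) -> Option<&[u8]> {
        self.models.get(name).map(Vec::as_slice)
    }
}
```

The point of the design is that the guest only ever supplies a name; which models exist, and where their bytes come from, stays under the embedder's control.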

view this post on Zulip Andrew Brown (Aug 09 2023 at 17:05):

Let me know any feedback before I go off and implement a bunch of stuff! cc: @Alex Crichton, @Dan Gohman, @fitzgen (he/him)

view this post on Zulip Alex Crichton (Aug 09 2023 at 17:20):

I dunno much about wasi-nn so I can't really comment on whether this seems right or not, but as for having a CLI option for wasi-nn that seems reasonable since we have a bunch for wasi-common

view this post on Zulip Angel M (Aug 22 2023 at 07:28):

Hello @Andrew Brown !

This would be amazing. I started integrating WASI-NN to Wasm Workers Server and I see named models as a great addition to the project. Could we help on this?

Introduce the new WASI-NN bindings to run Machine Learning (ML) inference in workers. It includes a new configuration parameter (features.wasi_nn) that allows you to set the allowed ML backends for...

view this post on Zulip Andrew Brown (Aug 22 2023 at 16:32):

cool, I don't think I'd ever looked too closely at that project; sort of neat. I think the next step to get named models working is to get a review on https://github.com/bytecodealliance/wasmtime/pull/6854.

This implements named models in Wasmtime; see the commit messages for more details.

view this post on Zulip Andrew Brown (Aug 22 2023 at 16:33):

I had been tagging @Pat Hickey with that sequence of PRs but he's probably busy with other stuff; @Alex Crichton, @Dan Gohman... do either of you want to take a look?

view this post on Zulip Andrew Brown (Aug 22 2023 at 16:38):

@Angel M, beyond that, I think the feature should be relatively "done"; I have some additional PRs to make with some cleanups and a testing overhaul for wasi-nn and @Matthew Tamayo-Rios has an additional backend to add in https://github.com/bytecodealliance/wasmtime/pull/6867. If you're interested in helping out, though, there are many little things that could be "made nicer" that you will see as you use wasi-nn--PRs for any of those would be appreciated. Also, if you're interested in adding new backends that would be helpful!

This implements a kserve backend allowing forwarding of wasi-nn calls over http to servers implementing the kserve protocol (documented here https://github.com/kserve/kserve/blob/master/docs/predic...

view this post on Zulip Angel M (Aug 23 2023 at 06:37):

Andrew Brown said:

cool, I don't think I'd ever looked too closely at that project; sort of neat. I think the next step to get named models working is to get a review on https://github.com/bytecodealliance/wasmtime/pull/6854.

I see the PR is already merged! :clap:

view this post on Zulip Angel M (Aug 23 2023 at 06:43):

Andrew Brown said:

Angel M, beyond that, I think the feature should be relatively "done"; I have some additional PRs to make with some cleanups and a testing overhaul for wasi-nn and Matthew Tamayo-Rios has an additional backend to add in https://github.com/bytecodealliance/wasmtime/pull/6867. If you're interested in helping out, though, there are many little things that could be "made nicer" that you will see as you use wasi-nn--PRs for any of those would be appreciated. Also, if you're interested in adding new backends that would be helpful!

Amazing! First thing, I will give it a try and test the named-models feature. This will allow wws to configure a set of predefined models per worker / function. Regarding the wasi-nn-PRs, is there any specific tag for those issues? Those little improvements seem to be a great way to get involved in the project :)

Definitely adding new backends is something we have in mind. Tensorflowlite or Pytorch could be great additions. I know there are some security concerns related to Tensorflow. Not sure about your take on that one.

view this post on Zulip Angel M (Aug 23 2023 at 06:46):

@Andrew Brown another topic that I have in mind is LLMs and dynamic inputs / outputs. I'm still very new to AI / ML, so maybe you have already figured out how to work with LLMs and WASI-NN. However, I couldn't find any examples of those. Let me open a separate conversation about this so we can close this topic.

view this post on Zulip Angel M (Aug 23 2023 at 06:46):

And thank you for all the context and responses!

view this post on Zulip Andrew Brown (Aug 23 2023 at 17:19):

Well, I haven't yet created issues for TODO work; I'm still in the middle of things so it's hard to see what should be fixed immediately and what I should postpone for later. I'll let you know once I have a bit more clarity on that. (e.g., see refactorings like https://github.com/bytecodealliance/wasmtime/pull/6893)

One improvement that came from discussions with @geekbeast is that BackendKind, the enum used for differentiating between ML implementation, is no longer necessary. Instead, we can use the generate...

view this post on Zulip Andrew Brown (Aug 23 2023 at 17:22):

Since TF accepts operators that can read/write files, do network I/O, etc., it seems like it would just open Wasmtime up to attacks. Some way of mitigating that would need to get figured out before moving https://github.com/bytecodealliance/wasmtime/pull/3977 forward. I haven't looked too closely at PyTorch yet.

Users will now be able to use either OpenVino or Tensorflow for their backend.

view this post on Zulip Andrew Brown (Aug 23 2023 at 17:23):

re: LLMs, maybe @Matthew Tamayo-Rios can comment further. He is planning to demo something at WasmCon and may have more comments about that.

view this post on Zulip Andrew Brown (Aug 23 2023 at 17:51):

@Angel M, I would say that any issues you find by using wasi-nn are extremely valuable. It is quite difficult to exhaustively test a thing like "ML", so eventually we need to have a better testing strategy.

view this post on Zulip Angel M (Aug 23 2023 at 18:35):

Thank you @Andrew Brown for the background on the different tasks and refactors. I will take a look at the different open issues to start getting familiar with the codebase.

view this post on Zulip Angel M (Aug 23 2023 at 18:38):

Regarding WasmCon, that's amazing! Do you plan to attend @Andrew Brown ? I will give two talks, so I'll be there for sure hehe. Would be great to chat with you both :big_smile:

view this post on Zulip Andrew Brown (Aug 23 2023 at 22:50):

Yeah, I'll be there. It will be good to meet in person!

view this post on Zulip Angel M (Aug 25 2023 at 14:29):

Amazing! We plan to work more and more on Wasm + AI, so it would be great to chat with you and learn more :)

view this post on Zulip Matthew Tamayo-Rios (Aug 28 2023 at 07:01):

I will also be there and will be showing off a prompt-based stable diffusion demo running through WASI-NN on top of Fastly's Compute@Edge infrastructure.

view this post on Zulip Matthew Tamayo-Rios (Aug 28 2023 at 07:01):

One thing to keep in mind as we consider additional backends is that we need to figure out a better testing story for said backends since most backends require loading the relevant library (openvino, libtorch, etc) in order to run.

view this post on Zulip Matthew Tamayo-Rios (Aug 28 2023 at 07:04):

The other big issue for TensorFlow is that the SavedModel format expects to be able to read from a directory on disk. There's an older h5 format, more limited in functionality, that can be read from bytes, but it is one more challenge to using TensorFlow models in a dynamic environment. There are some workarounds (e.g., you can implement a virtual drive), but I'm not sure they work outside of Python.

view this post on Zulip Matthew Tamayo-Rios (Aug 28 2023 at 07:05):

PyTorch also has a lot of outstanding security issues at the moment for untrusted models, because they use the pickle storage format.

view this post on Zulip Matthew Tamayo-Rios (Aug 28 2023 at 07:05):

They are looking at switching to SafeTensors but have not done so yet.


Last updated: Oct 23 2024 at 20:03 UTC