I intend to merge a change to wasi-nn to add "named models," which adds a way to refer to ML models by name instead of having to pass in all of the bytes (see @Matthew Tamayo-Rios's PR here: https://github.com/WebAssembly/wasi-nn/pull/38). I'm looking for input on whether this functionality should be bubbled up to the Wasmtime CLI flags.
The way I'm thinking about it, it would look and feel approximately like preopened directories. Maybe the user passes in --preloaded-models=<format>:<directory> and the wasmtime-wasi-nn crate figures out the necessary bits to load those models and make them available with the directory name. Maybe we add a :<name> suffix on there so users have more control of the name.
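To make the proposed flag shape concrete, here is a minimal sketch of how a value like <format>:<directory> with an optional :<name> suffix might be parsed. This is only an illustration under stated assumptions, not the actual Wasmtime implementation: the PreloadedModel struct and the parse_preloaded_model function are hypothetical names, and defaulting the model name to the last component of the directory path is one guess at what "available with the directory name" could mean.

```rust
use std::path::PathBuf;

/// Hypothetical parsed form of one `--preloaded-models` value.
/// None of these names come from Wasmtime; they exist only for this sketch.
#[derive(Debug, PartialEq)]
pub struct PreloadedModel {
    /// Backend/format identifier, e.g. "openvino".
    pub format: String,
    /// Directory holding the model files.
    pub directory: PathBuf,
    /// Name that guest code would use to refer to the model.
    pub name: String,
}

/// Parse `<format>:<directory>[:<name>]`.
///
/// Assumption: when `:<name>` is omitted, the name defaults to the final
/// component of the directory path. Paths that themselves contain `:`
/// (e.g. Windows drive letters) are not handled by this sketch.
pub fn parse_preloaded_model(value: &str) -> Result<PreloadedModel, String> {
    // Split into at most three pieces: format, directory, optional name.
    let mut parts = value.splitn(3, ':');

    let format = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or_else(|| format!("missing <format> in `{value}`"))?;

    let directory = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or_else(|| format!("missing <directory> in `{value}`"))?;
    let directory = PathBuf::from(directory);

    let name = match parts.next() {
        // An explicit, non-empty name was given.
        Some(n) if !n.is_empty() => n.to_string(),
        // A trailing `:` with nothing after it is treated as an error.
        Some(_) => return Err(format!("empty <name> in `{value}`")),
        // No name given: fall back to the directory's last path component.
        None => directory
            .file_name()
            .and_then(|n| n.to_str())
            .map(str::to_string)
            .ok_or_else(|| format!("cannot derive a name from `{value}`"))?,
    };

    Ok(PreloadedModel {
        format: format.to_string(),
        directory,
        name,
    })
}
```

With this sketch, openvino:./models/mobilenet would register the model under the name mobilenet, while openvino:./models/mobilenet:classifier would register it as classifier.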
My rationale for doing this in the CLI is to allow experimentation with this new API. This would also be available as a programmatic API on the wasi-nn host object (e.g., during construction) for embedders to use.
Let me know any feedback before I go off and implement a bunch of stuff! cc: @Alex Crichton, @Dan Gohman, @fitzgen (he/him)
I dunno much about wasi-nn so I can't really comment on whether this seems right or not, but as for having a CLI option for wasi-nn that seems reasonable since we have a bunch for wasi-common
Hello @Andrew Brown !
This would be amazing. I started integrating WASI-NN to Wasm Workers Server and I see named models as a great addition to the project. Could we help on this?
cool, I don't think I'd ever looked too closely at that project; sort of neat. I think the next step to get named models working is to get a review on https://github.com/bytecodealliance/wasmtime/pull/6854.
I had been tagging @Pat Hickey with that sequence of PRs but he's probably busy with other stuff; @Alex Crichton, @Dan Gohman... do either of you want to take a look?
@Angel M, beyond that, I think the feature should be relatively "done"; I have some additional PRs to make with some cleanups and a testing overhaul for wasi-nn and @Matthew Tamayo-Rios has an additional backend to add in https://github.com/bytecodealliance/wasmtime/pull/6867. If you're interested in helping out, though, there are many little things that could be "made nicer" that you will see as you use wasi-nn--PRs for any of those would be appreciated. Also, if you're interested in adding new backends that would be helpful!
Andrew Brown said:
cool, I don't think I'd ever looked too closely at that project; sort of neat. I think the next step to get named models working is to get a review on https://github.com/bytecodealliance/wasmtime/pull/6854.
I see the PR is already merged! :clap:
Andrew Brown said:
Angel M, beyond that, I think the feature should be relatively "done"; I have some additional PRs to make with some cleanups and a testing overhaul for wasi-nn and Matthew Tamayo-Rios has an additional backend to add in https://github.com/bytecodealliance/wasmtime/pull/6867. If you're interested in helping out, though, there are many little things that could be "made nicer" that you will see as you use wasi-nn--PRs for any of those would be appreciated. Also, if you're interested in adding new backends that would be helpful!
Amazing! First thing, I will give it a try and test the named-models feature. This will allow wws to configure a set of predefined models per worker / function. Regarding the wasi-nn PRs, is there any specific tag for those issues? Those little improvements seem to be a great way to get involved in the project :)
Adding new backends is definitely something we have in mind. TensorFlow Lite or PyTorch could be great additions. I know there are some security concerns related to TensorFlow. Not sure about your take on that one.
@Andrew Brown another topic that I have in mind is LLMs and dynamic inputs / outputs. I'm still very new to AI / ML, so maybe you have already figured out how to work with LLMs and WASI-NN. However, I couldn't find any examples of that. Let me open a separate conversation about this so we can close this topic.
And thank you for all the context and responses!
Well, I haven't yet created issues for TODO work; I'm still in the middle of things so it's hard to see what should be fixed immediately and what I should postpone for later. I'll let you know once I have a bit more clarity on that. (e.g., see refactorings like https://github.com/bytecodealliance/wasmtime/pull/6893)
Since TF accepts operators that can read/write files, do network I/O, etc., it seems like it would just open Wasmtime up to attacks. Some way of mitigating that would need to get figured out before moving https://github.com/bytecodealliance/wasmtime/pull/3977 forward. I haven't looked too closely at PyTorch yet.
re: LLMs, maybe @Matthew Tamayo-Rios can comment further. He is planning to demo something at WasmCon and may have more comments about that.
@Angel M, I would say that any issues you find by using wasi-nn are extremely valuable. It is quite difficult to exhaustively test a thing like "ML", so eventually we need to have a better testing strategy.
Thank you @Andrew Brown for the background on the different tasks and refactors. I will take a look at the different open issues to start getting familiar with the codebase.
Regarding WasmCon, that's amazing! Do you plan to attend @Andrew Brown ? I will give two talks, so I'll be there for sure hehe. Would be great to chat with you both :big_smile:
Yeah, I'll be there. It will be good to meet in person!
Amazing! We plan to work more and more on Wasm + AI, so it would be great to chat with you and learn more :)
I will also be there and will be showing off a prompt-based Stable Diffusion demo running through WASI-NN on top of Fastly's Compute@Edge infrastructure.
One thing to keep in mind as we consider additional backends is that we need to figure out a better testing story for said backends since most backends require loading the relevant library (openvino, libtorch, etc) in order to run.
The other big issue for TensorFlow is that the SavedModel format expects to be able to read from a directory on disk. There's an older h5 format, more limited in functionality, that can be read from bytes, but it is one more challenge to using TensorFlow models in a dynamic environment. There are some workarounds, but I'm not sure they work outside of Python (e.g., you can implement a virtual drive).
PyTorch also has a lot of outstanding security issues at the moment for untrusted models, because they use the pickle storage format.
They are looking at switching to SafeTensors but have not done so yet.
Last updated: Nov 22 2024 at 16:03 UTC