Hi, just joined the Zulip. I’m currently a postgrad working in ML, with a decent Rust component (mostly work with candle in Rust, and torch or Fortran outside of Rust). I'm somewhat unsure what the aim of wasi-nn is, even after reading the repo and chat: is it an abstraction layer (so existing ML frameworks would target wasi-nn instead of specific hardware, and the result would be portable) or a full NN framework with support for deserialising models, running inference, etc. wholly within wasi-nn?
"yes". I'll let @Andrew Brown respond more fully, but from my pov the intent was to provide an abstraction layer and the result happened to be shaped at the time like what people wanted to use. However, as LLMs exploded, there were clearly areas of the implementation to evolve. That's my take. wasi:nn is intended to provide an abstract api for people to use models portably and with a much higher degree of security.
There is always a tension in engineering between abstractions that prevent optimizations and optimizations that prevent easy reuse, so that balance is the discussion, typically...
your thoughts and questions are really useful, so keep asking!
@Kirp, yes, it is an abstraction layer but not at the level you suggest (between an ML framework and the hardware); rather, it's an abstraction layer between WebAssembly applications and ML frameworks. Why? For a WebAssembly application to do ML inference, it would either (a) have to compile an ML framework to WebAssembly (like Tract does) and link to it somehow, or (b) make system calls via some interface like WASI. One problem with (a) is that the WebAssembly language is not the most efficient way to execute ML operations and it is limited to CPUs. wasi-nn takes approach (b), but instead of taking on the rather difficult role of abstracting the hardware, it abstracts the ML framework itself.
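To make the "abstracts the ML framework itself" point concrete, here is a minimal sketch in plain Rust. It is not the wasi-nn API: `InferenceBackend`, `FrameworkA`, `FrameworkB`, and `app` are hypothetical names made up for illustration, and the "model" is a toy. It only shows the shape of the idea: the application is written once against an interface, and whichever framework sits behind that interface can be swapped without touching the application.

```rust
// Hypothetical sketch: none of these names come from the wasi-nn spec.

/// What the application sees: a framework-neutral inference interface.
trait InferenceBackend {
    /// Run a model (opaque bytes) on a flat f32 input tensor.
    fn infer(&self, model: &[u8], input: &[f32]) -> Vec<f32>;
}

/// Stand-in for one host-side ML framework.
struct FrameworkA;

/// Stand-in for a different host-side ML framework.
struct FrameworkB;

impl InferenceBackend for FrameworkA {
    fn infer(&self, model: &[u8], input: &[f32]) -> Vec<f32> {
        // Toy "model": scale every input by the first model byte.
        let scale = model.first().copied().unwrap_or(1) as f32;
        input.iter().map(|x| x * scale).collect()
    }
}

impl InferenceBackend for FrameworkB {
    fn infer(&self, model: &[u8], input: &[f32]) -> Vec<f32> {
        // Same toy model, implemented differently on purpose.
        let scale = model.first().copied().unwrap_or(1) as f32;
        let mut out = Vec::with_capacity(input.len());
        for x in input {
            out.push(scale * x);
        }
        out
    }
}

/// Application code: written once, with no knowledge of which
/// framework ends up doing the work.
fn app(backend: &dyn InferenceBackend) -> Vec<f32> {
    let model = [2u8]; // assumed toy model bytes
    backend.infer(&model, &[1.0, 2.5])
}
```

Calling `app(&FrameworkA)` and `app(&FrameworkB)` yields the same tensor; in the wasi-nn picture, the choice of backend is made by the WebAssembly runtime on the host side rather than by the application.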
If you're interested in discussing further, you may want to attend some of the ML working group meetings (see here) which are where we discuss changes to wasi-nn, e.g.
Sorry to communicate through memes, but this one already has a diagram in it showing my idea of what wasi-nn is doing. To clarify: is wasi-nn a layer above frameworks like Keras here, targeting them in this case? (image: 1F536886-C705-4F75-91B4-012B99BE606B.jpg)
So not like I initially thought where it takes the ml compiler role, but instead targets the different frameworks?
Heh, the meme is fine. I'm no Keras expert but I would say that your comparison makes some sense. At a conceptual level, wasi-nn and Keras both target different frameworks "underneath" them. The difference is that Keras probably has some actual implementation functionality to it; wasi-nn is essentially just an API, a standardized way to access the underlying ML functionality from within WebAssembly. In this sense wasi-nn is no different than other WASI APIs.
In general then, in order to target wasi-nn, I guess the plan is that consumers architect their models and frameworks to target wasi-nn directly, as opposed to having their existing models (in, say, torch/keras/candle) transpiled or compiled into wasi-nn?
(Sorry for repeating myself, I'm just still a bit unclear about whether the API aim for wasi-nn is low level (a bit above CUDA/ROCm but beneath, say, torch) or high level like current libraries. I presume the latter?)
To get a good idea of where wasi-nn fits, take a look at its definition. That may help you place it better. Users that want to use ML in WebAssembly via wasi-nn will provide their models as bytes to the wasi-nn interface, program their WebAssembly application using the wasi-nn interface, and rely on a WebAssembly runtime to pass on their model as well as input and output tensors to the underlying framework.
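The flow described above (model as bytes in, input tensor in, compute, output tensor out) can be sketched with a mock host in plain Rust. This is an illustration under assumptions, not the real bindings: `MockHost` and `guest_inference` are hypothetical, the method names only loosely echo the load / set-input / compute / get-output calls found in revisions of the wasi-nn interface, the execution-context step is left out for brevity, and the "framework" is a toy that adds a bias.

```rust
use std::collections::HashMap;

/// Hypothetical mock of the host side; not the real wasi-nn bindings.
#[derive(Default)]
struct MockHost {
    graphs: Vec<Vec<u8>>,               // loaded models, kept as opaque bytes
    inputs: HashMap<usize, Vec<f32>>,   // graph handle -> input tensor
    outputs: HashMap<usize, Vec<f32>>,  // graph handle -> output tensor
}

impl MockHost {
    /// Accept the model as bytes and hand back an opaque handle.
    fn load(&mut self, model_bytes: &[u8]) -> usize {
        self.graphs.push(model_bytes.to_vec());
        self.graphs.len() - 1
    }

    /// Copy an input tensor across the guest/host boundary.
    fn set_input(&mut self, graph: usize, tensor: &[f32]) {
        self.inputs.insert(graph, tensor.to_vec());
    }

    /// Toy stand-in for the underlying framework: adds the model's
    /// byte count to every input element.
    fn compute(&mut self, graph: usize) -> Result<(), String> {
        let model = self.graphs.get(graph).ok_or("unknown graph handle")?;
        let input = self.inputs.get(&graph).ok_or("no input set")?;
        let bias = model.len() as f32;
        let out: Vec<f32> = input.iter().map(|x| x + bias).collect();
        self.outputs.insert(graph, out);
        Ok(())
    }

    /// Read the output tensor back, if compute has run.
    fn get_output(&self, graph: usize) -> Option<&[f32]> {
        self.outputs.get(&graph).map(|v| v.as_slice())
    }
}

/// What the WebAssembly application would do, in this sketch.
fn guest_inference(host: &mut MockHost) -> Result<Vec<f32>, String> {
    let model_bytes = [0u8; 3]; // assumed placeholder for a real model file
    let graph = host.load(&model_bytes);
    host.set_input(graph, &[1.0, 2.0]);
    host.compute(graph)?;
    host.get_output(graph)
        .map(|t| t.to_vec())
        .ok_or_else(|| "no output available".to_string())
}
```

The guest never parses or executes the model itself; it only moves bytes and tensors through the interface, which is why the interface can stay small.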
Thanks for your patience, makes more sense now looking at the code for the graph!
Last updated: Dec 23 2024 at 13:07 UTC