Stream: wasi-nn

Topic: aim of wasi-nn


view this post on Zulip Kirp (Apr 23 2024 at 22:25):

Hi, just joined the Zulip now. I'm currently a postgrad working in ML, with a decent Rust component (mostly candle in Rust, and Torch or Fortran outside of Rust). I'm somewhat unsure what the aim of wasi-nn is, even after reading the repo and chat: is it an abstraction layer (so existing ML frameworks would target wasi-nn instead of specific hardware, and the result would be portable), or a full NN framework with support for deserializing models, running inference, etc., wholly within wasi-nn?

view this post on Zulip Ralph (Apr 24 2024 at 07:14):

"yes". I'll let @Andrew Brown respond more fully, but from my pov the intent was to provide an abstraction layer and the result happened to be shaped at the time like what people wanted to use. However, as LLMs exploded, there were clearly areas of the implementation to evolve. That's my take. wasi:nn is intended to provide an abstract api for people to use models portably and with a much higher degree of security.

view this post on Zulip Ralph (Apr 24 2024 at 07:15):

there is always a tension in engineering between abstractions that prevent optimizations and optimizations that prevent easy reuse, so that balance is typically the discussion.....

view this post on Zulip Ralph (Apr 24 2024 at 07:15):

your thoughts and questions are really useful, so keep asking!

view this post on Zulip Andrew Brown (Apr 24 2024 at 19:51):

@Kirp, yes, it is an abstraction layer but not at the level you suggest (between an ML framework and the hardware); rather, it's an abstraction layer between WebAssembly applications and ML frameworks. Why? For a WebAssembly application to do ML inference, it would either (a) have to compile an ML framework to WebAssembly (like Tract does) and link to it somehow, or (b) make system calls via some interface like WASI. One problem with (a) is that the WebAssembly language is not the most efficient way to execute ML operations and it is limited to CPUs. wasi-nn takes approach (b), but instead of taking on the rather difficult role of abstracting the hardware, it abstracts the ML framework itself.

Tiny, no-nonsense, self-contained TensorFlow and ONNX inference - sonos/tract
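The "abstract the framework itself" idea above can be sketched in plain Rust. This is only an illustration of the pattern, not the real wasi-nn API: the names (`Backend`, `DoublingBackend`) are hypothetical, and the real interface is defined in WIT/WITX rather than as a Rust trait. The point is that the application codes against one interface while the host can swap in any framework behind it.

```rust
// Hypothetical sketch of the framework-abstraction pattern behind
// wasi-nn. Names and signatures are illustrative only.
trait Backend {
    // Load a serialized model (opaque bytes) and return a graph handle.
    fn load(&mut self, model: &[u8]) -> Result<u32, String>;
    // Run inference: f32 input tensor in, f32 output tensor out.
    fn compute(&self, graph: u32, input: &[f32]) -> Result<Vec<f32>, String>;
}

// A stand-in implementation; a host could instead wire this trait to
// OpenVINO, ONNX Runtime, etc., without the application changing.
struct DoublingBackend {
    graphs: Vec<Vec<u8>>,
}

impl Backend for DoublingBackend {
    fn load(&mut self, model: &[u8]) -> Result<u32, String> {
        self.graphs.push(model.to_vec());
        Ok((self.graphs.len() - 1) as u32)
    }

    fn compute(&self, graph: u32, input: &[f32]) -> Result<Vec<f32>, String> {
        if graph as usize >= self.graphs.len() {
            return Err("unknown graph handle".to_string());
        }
        // Stand-in for real inference: double each element.
        Ok(input.iter().map(|x| x * 2.0).collect())
    }
}

fn main() {
    let mut backend = DoublingBackend { graphs: Vec::new() };
    let g = backend.load(b"fake model bytes").unwrap();
    let out = backend.compute(g, &[1.0, 2.0]).unwrap();
    println!("{:?}", out);
}
```

Because the application only sees the trait, "portability" here means the same guest code runs against whichever backend the host provides, which mirrors how a Wasm module using wasi-nn runs unchanged across runtimes with different ML frameworks underneath.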

view this post on Zulip Andrew Brown (Apr 24 2024 at 19:53):

If you're interested in discussing further, you may want to attend some of the ML working group meetings (see here) which are where we discuss changes to wasi-nn, e.g.


view this post on Zulip Kirp (Apr 24 2024 at 22:06):

Sorry to communicate through the form of memes, but this has a diagram showing my idea of what wasi-nn is doing, to clarify: is wasi-nn a layer above the frameworks like Keras etc. here, targeting them in this case?
1F536886-C705-4F75-91B4-012B99BE606B.jpg

view this post on Zulip Kirp (Apr 24 2024 at 22:07):

So not like I initially thought, where it takes the ML compiler role, but instead it targets the different frameworks?

view this post on Zulip Andrew Brown (Apr 25 2024 at 00:26):

Heh, the meme is fine. I'm no Keras expert but I would say that your comparison makes some sense. At a conceptual level, wasi-nn and Keras both target different frameworks "underneath" them. The difference is that Keras probably has some actual implementation functionality to it; wasi-nn is essentially just an API, a standardized way to access the underlying ML functionality from within WebAssembly. In this sense wasi-nn is no different than other WASI APIs.

view this post on Zulip Kirp (Apr 25 2024 at 01:40):

In general then, in order to target wasi-nn, I guess the plan is that consumers architect their models and frameworks to directly target wasi-nn, as opposed to their existing models (in, say, torch/keras/candle etc.) being transpiled/compiled into wasi-nn?

(Sorry for repeating myself, I'm just still a bit unclear about whether the API aim for wasi-nn is low level (as in a bit above CUDA/ROCm but beneath, say, torch) or high level like current libraries: I presume the latter?)

view this post on Zulip Andrew Brown (Apr 25 2024 at 16:55):

To get a good idea of where wasi-nn fits, take a look at its definition. That may help you place it better. Users that want to use ML in WebAssembly via wasi-nn will provide their models as bytes to the wasi-nn interface, program their WebAssembly application using the wasi-nn interface, and rely on a WebAssembly runtime to pass on their model as well as input and output tensors to the underlying framework.

Neural Network proposal for WASI - WebAssembly/wasi-nn
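The flow Andrew describes (provide model bytes, program against the interface, let the runtime hand everything to the backend) can be mirrored with an in-memory mock. To be clear, the types below are stand-ins I made up to show the shape of the call sequence (load, init an execution context, set input, compute, get output); real code would use the wasi-nn bindings and only runs inside a Wasm runtime with a wasi-nn-enabled host.

```rust
// Hypothetical mock mirroring the wasi-nn call sequence:
// load -> init_execution_context -> set_input -> compute -> get_output.
// These are NOT the real wasi-nn bindings, just their rough shape.
struct Graph {
    model: Vec<u8>, // in real wasi-nn these bytes go to the host backend
}

struct ExecutionContext {
    input: Vec<f32>,
    output: Vec<f32>,
}

impl Graph {
    // The application hands its serialized model over as opaque bytes.
    fn load(model: &[u8]) -> Graph {
        Graph { model: model.to_vec() }
    }

    fn init_execution_context(&self) -> ExecutionContext {
        // A real backend would build per-inference state from self.model.
        let _ = self.model.len();
        ExecutionContext { input: Vec::new(), output: Vec::new() }
    }
}

impl ExecutionContext {
    fn set_input(&mut self, tensor: &[f32]) {
        self.input = tensor.to_vec();
    }

    // Stand-in for inference: reduce the input tensor to its sum.
    fn compute(&mut self) {
        self.output = vec![self.input.iter().sum()];
    }

    fn get_output(&self) -> &[f32] {
        &self.output
    }
}

fn main() {
    let graph = Graph::load(b"model bytes read from disk");
    let mut ctx = graph.init_execution_context();
    ctx.set_input(&[1.0, 2.0, 3.0]);
    ctx.compute();
    println!("{:?}", ctx.get_output());
}
```

Note that nothing in the sequence inspects the model: the application treats it as opaque bytes, which is exactly why the runtime can route it to whatever framework the host has available.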

view this post on Zulip Kirp (Apr 25 2024 at 17:35):

Thanks for your patience, makes more sense now looking at the code for the graph!

view this post on Zulip Adrian Ibanez (Mar 20 2025 at 19:06):

@Andrew Brown May I ask for some clarifications regarding wasi-nn? To integrate wasi-nn on, e.g., an iOS platform, I would require the wasmtime runtime, something like sonos/tract abstracting some ML backend, and then the actual wasm binary. Correct?

Sounds like a lot of complexity and layers/interfaces to me. What are the typical scenarios that wasi-nn is targeting? Where would it work best? And why would/could it make sense to use such a setup instead of a single component that abstracts all layers away with one single API?

Fully aware that these are gross simplifications. I'm asking from a standpoint where building statically linked libraries for, e.g., the iOS platform comes with a lot of bumps in the road.
I was hoping that wasm could reduce some of those issues and avoid the fact that the whole application "runtime" has to be rebuilt whenever something changes.

view this post on Zulip Andrew Brown (Mar 20 2025 at 19:31):

Sure, let me clarify some things. But, first, let me define the term "framework": to me, an ML framework is the code responsible for executing an ML model (think MobileNet), which if you squint is essentially a program in a higher-level ML language (e.g., ONNX operators, PyTorch operators, etc.). And sometimes I also call a framework a "backend," especially in the context of wasi-nn. So when you're talking about sonos/tract, we should note that it is an additional layer "on top of" the framework itself, but not the actual framework (I'm no expert on tract but I think it compiles ONNX and TF models, right?).

So I think there are two broad approaches to ML in WebAssembly at the moment: (a) compile the framework itself to WebAssembly (the tract-style approach) and run inference entirely inside the guest, or (b) use an interface like wasi-nn so the guest hands the model and tensors to a framework running on the host.

So, are there too many layers? You'll have to be the judge of that, but hopefully my explanation gives you some context for how the layers are split across the system in both approaches.

view this post on Zulip Andrew Brown (Mar 20 2025 at 19:33):

I was just updating the documentation for a simple MobileNet example using wasi-nn in Wasmtime: https://github.com/bytecodealliance/wasmtime/blob/2b9da02f/crates/wasi-nn/examples/classification-example/README.md. It may be helpful to walk through that to understand the pieces better. Let me know if that helps!

A lightweight WebAssembly runtime that is fast, secure, and standards-compliant - bytecodealliance/wasmtime

view this post on Zulip Adrian Ibanez (Mar 20 2025 at 20:34):

@Andrew Brown thanks for the clarifications; that helps me understand the scope better. I wasn't aware of all the crates included in the repo. Great source of information, thanks.
More than happy to walk through the sample. I'm really only starting to understand all the different APIs and their "evolution" phases.


Last updated: Jan 09 2026 at 13:15 UTC