Stream: general

Topic: WASI clang++ protobuf stubbing for TensorFlow


view this post on Zulip Colin D Murphy (Nov 13 2025 at 19:58):

When ONNX Runtime loads ORT format models in WASM/WASI, protobuf's RepeatedPtrField calls constructors like onnx::TensorProto::TensorProto(google::protobuf::Arena*, bool) via function pointers (see onnxruntime/core/graph/graph.cc around the LoadFromOrtFormat method that processes initializers). wasmtime validates these function pointer signatures against the C++ ABI and rejects them as mismatched, even with stub implementations, because the function pointer calling convention doesn't match wasmtime's expected C++ member function signature.
References:
• ONNX Runtime: onnxruntime/core/graph/graph.cc - Graph::LoadFromOrtFormat() method that processes initializers using protobuf's RepeatedPtrField
• Protobuf: google/protobuf/repeated_ptr_field.h - TypeHandler::NewFromPrototype() which calls constructors via function pointers

I'm able to compile to wasip1 and run without issue. However, I can't get it to work either with the adapter or by compiling directly to wasip2.


view this post on Zulip Chris Fallin (Nov 13 2025 at 20:06):

I'm trying to parse this problem and having some trouble -- could you fill in some details?

Basically I'm very confused about what you're actually doing -- sorry!

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:07):

(If I guess that you're running C++ in the guest, and hitting some sort of signature mismatch, that would be a toolchain bug in your C++-to-Wasm compiler; but it's not 100% clear that's what you mean)

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:36):

Sorry. I wish I understood better. I'm able to build and compose the component I built with ONNX Runtime, but it fails when I try to run it.
The WASM module traps with "wasm trap: wasm unreachable instruction executed" when ONNX Runtime's Graph::LoadFromOrtFormat() processes initializers. The backtrace shows the trap occurs at address 0x29340 when attempting to call what should be onnx::TensorProto::TensorProto(google::protobuf::Arena*, bool).
What's happening:

1. Guest code: ONNX Runtime (onnxruntime/core/graph/graph.cc lines 6067-6097) uses protobuf's RepeatedPtrField to add TensorProto objects. Despite my patch to use UnsafeArenaAddAllocated, protobuf still invokes a constructor via function pointer.
2. Function pointer call: Protobuf calls the constructor through a function pointer stored in its type metadata.
3. wasmtime validation: When wasmtime executes the indirect call, it validates the function signature against the function table entry. This validation fails.
4. Trap: wasmtime traps with "unreachable" (either because validation inserted unreachable, or the called function immediately executes unreachable).

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:39):

OK. It sounds like, then, that this is either an ONNX issue or C++ compiler issue. "Wasmtime traps when calling a function pointer of the wrong signature" is spec-compliant behavior and the only thing we can do. If I were you I would work with the ONNX project to debug this further

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:43):

The problem is that I have to stub out the protobufs. They are trying to access system information that doesn't exist in the runtime. It's just frustrating that I didn't have this trouble with preview 1. It just worked.

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:45):

I don't know what to tell you, sorry -- it's surprising that changing the target would affect something in the guest like that, but from an engine perspective, at a signature mismatch there's nothing we can do; the standard says we must trap; it's very clearly a bug somewhere in the guest

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:46):

It doesn't match because I'm trying to stub out a mangled function name.

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:47):

I'm still missing some context here. Why are you trying to stub something out? Or if you have to, why not stub it out with the correct signature?
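
As an illustration of what "the correct signature" means for a stub that is reached through a function pointer, here is a minimal sketch. It is not code from ONNX Runtime or protobuf: `Arena`, `Message`, `FactoryFn`, `stub_factory` and `call_through_pointer` are hypothetical names chosen for the example. The idea is to let the C++ compiler verify that the stub's type is identical to the pointer type used at the call site, so a mismatch is a build error rather than a runtime trap.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the real arena and message types.
struct Arena {};
struct Message {
    int tag = 0;
};

// The pointer type the (hypothetical) caller uses for its indirect call.
using FactoryFn = Message* (*)(Arena*, bool);

// A stub intended to be called through FactoryFn.
Message* stub_factory(Arena* /*arena*/, bool /*owned*/) {
    static Message dummy;  // one shared dummy object, no allocation
    dummy.tag = 42;
    return &dummy;
}

// Compile-time guard: the build fails if the stub's type ever drifts away
// from the pointer type used at the call site.
static_assert(std::is_same<decltype(&stub_factory), FactoryFn>::value,
              "stub_factory must match FactoryFn exactly");

// Calls the factory indirectly and returns the tag of the produced object.
int call_through_pointer(FactoryFn fn) {
    Arena arena;
    Message* m = fn(&arena, false);
    return m->tag;
}
```

With this guard in place, a stub declared with different parameter types (for example `void*` and `int` instead of `Arena*` and `bool`) no longer compiles, which surfaces the problem before any indirect call is executed in the guest.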

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:48):

We'd need something like this, but for WASI: protobuf-emscripten (invokr/protobuf-emscripten on GitHub), Google's Protocol Buffers for emscripten.

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:49):

Or this: https://github.com/dsyer/protobuf-wasm


view this post on Zulip Chris Fallin (Nov 13 2025 at 20:51):

I'm still missing something, sorry. So you're building stubs equivalent to the above (it looks like the first one at least is just replacing atomics). Why not write those stubs with the correct signature?

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:51):

I'm sorry. I'm going to ask perplexity why stubbing protobufs is so hard. I'm not a C++ person. I hope this helps

Why stubbing protobuf signatures is hard:

1. Function pointer calls vs direct calls: Protobuf stores constructor function pointers in type metadata (TypeHandler::NewFromPrototype). When protobuf calls via these pointers, wasmtime validates the indirect call against the function table entry. The stub signature must match exactly what wasmtime expects for that table entry, not just the C++ signature.
2. C++ calling convention differences: For a C++ member function like TensorProto::TensorProto(Arena*, bool), the this pointer is implicit. When called via function pointer, this becomes explicit. The stub must match the function-pointer calling convention, which may differ from the member function signature.
3. Type system mismatch: We've tried stubbing with void* and int32_t/bool, but wasmtime may be validating against the actual C++ types (onnx::TensorProto, google::protobuf::Arena, bool). The error message doesn't specify what wasmtime expects, so we're guessing.
4. Multiple constructor variants: C++ has C1 (complete object) and C2 (base object) constructors. Both may be needed, and their signatures can differ subtly.
5. No diagnostic information: The error signature_mismatch:onnx::TensorProto::TensorProto(google::protobuf::Arena*, bool) doesn't show:
• What signature wasmtime expected
• What signature was provided
• Which parameter type mismatched

6. Function table vs symbol: The function exists as a symbol (0x26f74), but the trap occurs at a different address (0x29340), suggesting the function table entry has a different signature than the actual function.

The core issue: wasmtime validates indirect calls strictly, and we don't have visibility into what signature it expects for the function table entry. We're guessing based on the C++ signature, but the function-pointer calling convention may require a different signature.

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:52):

"I'm going to ask perplexity"

Asking an AI agent for a load of slop that doesn't answer my question is a gross disrespect of maintainers' time. Sorry, I'm not able to help further.

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:54):

I don't understand what to tell you. If it doesn't make sense then please point out where. This is obviously a difficult problem that I've spent the past week trying to work through.

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:55):

We don't have this problem with emscripten or preview1. So obviously something needs to work as we move towards preview 3.

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:56):

Copy/pasting a half-page of text at me from your agent is not answering the question. The question was: "why are you not able to write the stubs with the correct signature?"

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:56):

That's exactly the question it answered.

view this post on Zulip Colin D Murphy (Nov 13 2025 at 20:57):

It wasn't AI slop. It's a known difficulty with protobufs that requires separate projects.

view this post on Zulip Chris Fallin (Nov 13 2025 at 20:57):

I'm unable to help further, sorry! Best of luck.

view this post on Zulip Chris Fallin (Nov 13 2025 at 23:11):

I have a little time now so I'm going to demonstrate why this AI agent's answer makes no sense:

"When protobuf calls via these pointers, wasmtime validates the indirect call against the function table entry. The stub signature must match exactly what wasmtime expects for that table entry, not just the C++ signature."

The agent is:

Basically: this is a long essay that boils down to "the types are wrong and that's an error". My question remains: why not use the correct types?

view this post on Zulip Colin D Murphy (Nov 14 2025 at 00:04):

The issue lies with the WASI SDK. I agree.

view this post on Zulip Colin D Murphy (Nov 14 2025 at 00:07):

I'm glad we are in violent agreement. What do we need to get into the WASI SDK so that ONNX Runtime will work with components, specifically protobufs?

view this post on Zulip Alex Crichton (Nov 14 2025 at 00:18):

@Colin D Murphy the other point I believe Chris is making is that it's not quite reasonable to present a very large "black box" of ONNX/protobufs/etc and say "something's broken, how do I fix it?". Those projects are quite large and the possible modes of failure are even larger. The bug here could range anywhere from UB in ONNX to UB in protobufs to bugs in either to a bug in clang to a bug in wasi-sdk to a bug in wasmtime to an architecture-specific bug, etc. There's really no way to diagnose an issue like this when it's so large.

At the same time I at least personally don't think it's reasonable to expect others to reduce the issue for you. For example I know virtually nothing of ONNX/protobufs nor where to even start in terms of reduction. I know a lot about wasi-sdk, wasi-libc, and wasmtime, but "there's a trap here" is not sufficient knowledge to point to something and say "it's your fault figure it out".

I'd recommend trying to reduce this failure on your end. The less source code in play the better. For example if you could give us a modest C file with no dependencies and say "this passes on native but not on wasm" that is light-years easier to debug.
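
A reduced test case of the kind described above might look roughly like the following sketch. Everything in it is an assumption made for illustration: `Arena`, `Tensor`, `ConstructFn`, `construct_in_place` and `run_repro` are hypothetical names that only echo the thread, and the code does not reproduce the reported trap. It shows the shape of a dependency-free file that constructs an object through a function pointer and reports whether the result is what was expected.

```cpp
#include <cassert>
#include <cstdio>
#include <new>

// Hypothetical stand-ins; nothing here is taken from protobuf or ONNX Runtime.
struct Arena {
    int constructed = 0;
};

struct Tensor {
    Arena* arena;
    bool owned;
    Tensor(Arena* a, bool o) : arena(a), owned(o) {
        if (arena != nullptr) ++arena->constructed;
    }
};

// Type-erased constructor entry: builds an object in caller-provided storage.
using ConstructFn = void* (*)(void* storage, Arena* arena, bool owned);

template <typename T>
void* construct_in_place(void* storage, Arena* arena, bool owned) {
    return new (storage) T(arena, owned);
}

// Returns 0 on success, 1 if the object was not constructed as expected.
int run_repro() {
    Arena arena;
    alignas(Tensor) unsigned char storage[sizeof(Tensor)];

    // volatile stops the compiler from assuming the pointer's value,
    // so the call below stays an indirect call.
    ConstructFn volatile fn = &construct_in_place<Tensor>;
    Tensor* t = static_cast<Tensor*>(fn(storage, &arena, true));

    bool ok = t->arena == &arena && t->owned && arena.constructed == 1;
    std::printf("%s\n", ok ? "ok" : "mismatch");
    return ok ? 0 : 1;
}
```

Adding `int main() { return run_repro(); }` and building the same file natively and with the wasi-sdk clang++ for each target in question (for example wasm32-wasip1 and wasm32-wasip2, assuming the installed wasi-sdk provides both) gives a side-by-side comparison. If the small file behaves the same everywhere, the next step is to add back pieces of the real code until the behavior diverges.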

view this post on Zulip Alex Crichton (Nov 14 2025 at 00:19):

Also, I can reiterate, when it comes to debugging I would very strongly recommend against using LLMs. LLMs are not suitable when you're not already an expert in the problem space. When you're debugging something it's probably because you're not sure why it's going wrong, which by definition means you're not an expert in the problem space. LLMs are likely to add more confusion to the situation and they should not be used as justification for "the problem lies here because this LLM said so"

view this post on Zulip Colin D Murphy (Nov 14 2025 at 00:21):

Thanks for the feedback @Alex Crichton . Is it fair to ask the question: what is the WASI story for protobufs for preview 2? How do we get this to work for preview3?

view this post on Zulip Alex Crichton (Nov 14 2025 at 00:22):

I can also say that I understand that reduction of a problem like this is an arduous and unforgiving task. In the past I've sometimes been successful at this, sometimes not. It unfortunately doesn't change the ground truth though, in that it's up to you to figure out how to reproduce it, and more often than not if you can't figure it out then it's likely to go unsolved.

view this post on Zulip Alex Crichton (Nov 14 2025 at 00:24):

Well, for protobufs, I can't really answer that. I know nothing of protobufs beyond "they put stuff in bytes and take them out later" and I also understand that I'm grossly misrepresenting what they can do. Beyond that the WASI story isn't lined up for particular projects individually, so no, there's no specific story for protobufs.

More generally it's expected that everything "just works" with WASI and its versions. If you need sockets you'll need WASIp2-and-up, but otherwise it should be compatible. If it's not then that's in the realm of "someone needs to figure out why", and for now that's probably going to be you.

view this post on Zulip Colin D Murphy (Nov 14 2025 at 00:27):

Also, in my defense, I pointed to a very specific piece of code in a very popular library that used (from what I understand) another commonly used library. I think we should be able to ask questions about how such things should be supported, especially when they were supported in preview 1. It was only after several attempts to explain the problem that I resorted to asking an LLM why protobufs are so hard to stub.

view this post on Zulip Chris Fallin (Nov 14 2025 at 00:36):

I don't think that it's reasonable to conclude automatically that this is an issue in wasi-sdk and ask "what do we need to include in WASI-SDK". The whole idea of having a C/C++ compiler is that one can take a large C/C++ codebase and compile it with minimal effort. In theory we should never have to do a "protobuf-specific thing" in wasi-sdk.

You linked to several projects above that build protobufs for wasm environments that stub out atomics. Atomics are indeed problematic in Wasm MVP; with the right flags you might be able to use the wasm atomics instructions though. I'm curious why you're trying to stub out other things (you were pointing to constructors above). And then when you do, you use incorrect signatures and that leads to (as expected) a trap.

Basically: I think there are a lot of flawed assumptions here. The first is that protobufs should need specific support in the platform. That's not really how low-level bytecode platforms work; protobufs should work like any other code inside the guest. The next is that p1/p2 somehow change this story. To be blunt, there is a very high burden of proof to convince someone that a change in IO interfaces should cause a signature mismatch in unrelated C++ code.

The last flawed assumption is -- to re-emphasize a point that Alex makes above -- that we are "general tech support" for any programming problem that happens to be on top of the platform. I am happy to put time in when others are working in good faith with us, but you're throwing a giant ball of commingled problems at us here, insinuating that it's our fault and we need to fix it, and not able to explain details of what you've done or why. Per your point above that you're "not a C++ person", I suspect you need a C++ person here to look at your actual code to see what is wrong. That's beyond the scope of the level of free help I can offer, at least.

view this post on Zulip Chris Fallin (Nov 14 2025 at 00:50):

And to put a final point on it -- you say

"The issue lies with the WASI SDK. I agree."

No, I strongly don't agree. If I had to guess, you wrote some code somewhere with an incorrect signature in a modified version of protobufs, and that's causing the issue. You seem to say as much -- you aren't running pure upstream, you're modifying things. "My hacked-up version of some libraries has an error" is a problem only you can solve, sorry

view this post on Zulip Colin D Murphy (Nov 14 2025 at 02:21):

I most definitely had it working with preview1. You can ask anyone who was at Wasm Con in Atlanta where I demoed it on stage as part of my talk.
I'm not trying to cast any blame here, just trying to figure out how to get this working. I'll investigate and see what I need to do. I'm going to start with the emscripten implementation. I'd be perfectly willing to admit I'm wrong here. I don't really care about that.
cc @Mendy Berger @Bailey Hayes

view this post on Zulip Christof Petig (Nov 15 2025 at 08:47):

I typically use wasm-objdump -x to investigate function signature mismatches at the bytecode level. Look for the function which works and the one it complains about. That solved a similar problem for me in minutes, but I have been debugging C++ at the assembly level for decades, so it will probably send you on a journey to understand name mangling and wat syntax.

view this post on Zulip Colin D Murphy (Nov 17 2025 at 14:16):

Thanks @Christof Petig . Threads should reduce the need for many of the stubs for protobufs. Once we have that working, we'll see what remains. I think there will still need to be some sort of special handling for arenas. I wouldn't be pursuing this if I didn't see a lot of potential for wasmtime.

view this post on Zulip Bailey Hayes (Nov 19 2025 at 03:30):

Hey I'd like to take a stab at this one. It's easy to get lost in the sauce when diving deep on a problem with many, many variables. Using the experimental C++ bindgen target definitely seems to be part of the challenge here. I started there and a few fixes and coffees later I got a working http server component: https://github.com/ricochet/sample-wasi-http-cpp

The bindings there are using this branch on wit-bindgen: https://github.com/bytecodealliance/wit-bindgen/pull/1425

Turning to just protobuf+abseil, I definitely ran into rough edges. Abseil already has checks for __wasi__, but they seem incomplete and out of date. I resorted to stubs to compile for p1. I had to add additional stubs for abseil arenas/allocs (as you mentioned) specifically for p2. I found that a little confusing and plan on looking into it more. Abseil does have a mode that avoids syscalls and removes the need for many of the stubs (ABSL_FORCE_WAITER_MODE=4). I hope to contribute the appropriate WASI stubs back to abseil so folks won't have to do this themselves. I have both p1 and p2 working here: https://github.com/ricochet/wasi-protobuf


Last updated: Dec 06 2025 at 05:03 UTC