Perhaps this is obvious in hindsight, but it took me a while to debug in practice:
I've been working on WASIp2 support for the .NET runtime recently, which is currently based on the WASI 0.2.1 WIT files. In order to provide an ergonomic async/await experience without creating an API compatibility hazard, we've added a couple of internal functions to the runtime: RegisterWasiPollableHandle (for registering a wasi:io/poll#pollable to be polled when all tasks are blocked) and PollWasiEventLoopUntilResolved (for running the top-level event loop, i.e. running any tasks and polling any pollables). Although they are internal, they can be called via the UnsafeAccessor attribute, which allows e.g. wit-bindgen-generated code to call them even though they're not part of the public API. One consequence of that design is that application code must pass pollable handles as raw integers, which can lead to confusion if the app passes a 0.2.0 pollable handle instead of a 0.2.1 pollable handle; the host (e.g. Wasmtime) treats those as unrelated types and will trap the guest with an unknown handle index error if the former is passed where the latter is expected.
The issue was easy enough to resolve: update the application code to use 0.2.1 so it matches what the .NET runtime is using. However, I'm concerned that others may run into this both for .NET and other language toolchains. It's not going to be obvious to users that unknown handle index traps may mean multiple wasi:io versions are in use, and that the guest confused them somehow, which might, in turn, be due to a mismatch between e.g. the wit-bindgen-generated code they're using (possibly indirectly, via an SDK) and the version their toolchain and/or runtime is using. And this will become increasingly likely if we release new minor versions every two months or so.
In hindsight, I wonder if we should have left wasi:io at 0.2.0 indefinitely given its "special" status as part of the connective tissue of WASIp2. The problem will presumably go away with WASIp3, since most or all of wasi:io will be pushed down into the component model ABI, but we need to deal with it in the meantime.
Wild idea: consider making wasi:io resources of any 0.2.x version interchangeable with other 0.2.x versions.
Less wild idea: add special support to wasmtime and/or wasmtime-wasi to recognize when the guest has passed a resource handle of the wrong version and provide a detailed diagnostic, e.g. "expected handle of type wasi:io@0.2.1/poll#pollable; got wasi:io@0.2.0/poll#pollable".
For .NET specifically, the core issue is passing handles as integers and thereby losing critical type information, so we'll have to think about how to do that better. One idea I had was to split RegisterWasiPollableHandle into RegisterWasiPollableHandle_v0_2_0, RegisterWasiPollableHandle_v0_2_1, and so on as new versions are added. That would help provide early diagnostics in cases where application code is incompatible with the .NET runtime (i.e. catch it in the guest before we get to the host, at which point the only option is to trap). However, a given build of the .NET runtime could only support at most one of those functions since there's no way for the runtime to poll a mix of pollables of different versions concurrently. Hence my wild idea above.
Other thoughts?
thanks for documenting this issue. @Alex Crichton can we teach the resolver to resolve these two pollables to the same resource? I don't think this should even be wasi-specific, I'd hope this would work for any interface/resource that's depended on at a newer version than it was introduced at
I suspect this works for interfaces and functions today but maybe it doesn't for resources?
I think this is not dotnet specific problem, but I'm glad we discovered it soon enough.
When we generate C# proxies with wit-bindgen, they could be compiled into
Each of those dotnet (inner) components (of a single dotnet WASI outer component) could be generated at a different time, by a different version of wit-bindgen, from a different version of the same WIT world.
Even compiled by a different version of the C# compiler.
C#/dotnet is a strongly typed language even at runtime. A class with the same namespace and same name in a different assembly is not the same type.
If we want to pass WASI resources between different dotnet (inner) components, we will be forced to use proxies generated from a coherent set of WIT versions.
We can make that a little bit simpler if we generate the C# code just once and publish binary dotnet assemblies as NuGet packages that would be common to all apps.
That still makes application development difficult but not impossible. One dotnet app would have to use coherent versions of its (transitive) dotnet dependencies.
The runtime library itself could use a different (private) coherent set, given that we don't need to expose WASI resources directly on the runtime library's C# APIs.
Right now Pollable is breaching that boundary between application code and dotnet runtime code.
It's because we need to block on all pollables, regardless of whether they are runtime owned or application owned.
It needs to be a dotnet runtime function, because of how dotnet processes jobs and continuations.
I still need to learn more about WASIp3 promises, to see what happens when we have their C# proxies generated multiple times.
For the dotnet context: RegisterWasiPollableHandle and passing a numeric handle is just a temporary way to solve those strong-typing problems without too much dance around the public API contract of the dotnet runtime.
When/if we eventually expose such public APIs, they would have to be strongly typed, and the types would be owned by the dotnet runtime, not by wit-bindgen or an external NuGet package.
And it would become a backward compatibility burden from that point on, so it's better to postpone that using the trick Joel described.
@Alex Crichton @Luke Wagner and I chatted about this a bit today. Alex suggested that we can update wit-component (and also tools that use it, e.g. wasm-component-ld) to combine imports that span multiple minor versions. For example, if the input module imports both wasi:io@0.2.0 and wasi:io@0.2.1, the output component will only import wasi:io@0.2.1, and that import will satisfy the module-level imports for both versions. The upshot is that the host will only see one version of that import, in which case the issue above goes away.
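A minimal sketch of the "combine imports that span multiple minor versions" idea, to make the intended behavior concrete. This is not how wit-component implements it; the function names (parse_import, compat_track, unify_imports) are hypothetical, import names are assumed to have the form interface@major.minor.patch, and pre-release versions are ignored.

```python
def parse_import(name):
    """Split 'wasi:io/poll@0.2.1' into ('wasi:io/poll', (0, 2, 1))."""
    interface, _, version = name.partition("@")
    return interface, tuple(int(part) for part in version.split("."))


def compat_track(version):
    """Return the semver-compatible track a version belongs to.

    Assumption: 1.x.y versions are compatible within a major version,
    0.x.y versions within a minor version, and 0.0.y only with itself.
    """
    major, minor, patch = version
    if major > 0:
        return (major,)
    if minor > 0:
        return (0, minor)
    return (0, 0, patch)


def unify_imports(imports):
    """Map every module-level import to the highest compatible version seen."""
    best = {}
    for name in imports:
        interface, version = parse_import(name)
        key = (interface, compat_track(version))
        if key not in best or version > best[key]:
            best[key] = version

    unified = {}
    for name in imports:
        interface, version = parse_import(name)
        chosen = best[(interface, compat_track(version))]
        unified[name] = f"{interface}@{'.'.join(map(str, chosen))}"
    return unified


if __name__ == "__main__":
    mapping = unify_imports(["wasi:io/poll@0.2.0", "wasi:io/poll@0.2.1"])
    for module_import, component_import in mapping.items():
        print(f"{module_import} -> {component_import}")
```

With this mapping, both module-level imports are satisfied by a single component-level import, so the host only ever sees one pollable type.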
We use the same strategy for WAC, BTW. For example: when composing a component that imports wasi:io@0.2.0 with one that imports wasi:io@0.2.1, the output component can just import wasi:io@0.2.1 and use that to satisfy both inner components' imports.
to expand a bit more on what Joel said:
can we teach the resolver to resolve these two pollables to the same resource?
This is basically what we're going to try to do, with the subtlety that this is already happening in wasmtime itself, but we're going to be changing the component building process to do this and produce a different shape component.
I don't think this should even be wasi-specific
Indeed! The thinking is that this'll be at the generic "any WIT interface level" to unify and "pick the biggest" in a semver track as part of wit-component when wit-component sees duplicates.
I think this is not dotnet specific problem
Very much agree with this! Was going to be a problem for all other languages and runtimes too.
I still need to learn more about WASIp3 promises, to see what happens when we have their C# proxies generated multiple times.
At least for me WASIp3 is far enough away that I think it's tough to say exactly how it will shape up. The default is probably that it'll be a "break the world event", but I also suspect that by the time we get there that won't be suitable so we'll probably need to figure out alternatives to problems like this
I've opened https://github.com/bytecodealliance/wasm-tools/issues/1774 for this and I'll try to get that done this week
We use the same strategy for WAC, BTW
Meant to say "We can use..." here.
I've done some work on this this week and I've hit a snag. This is a big enough snag that I don't believe this strategy is going to work out, and I think we may need to readjust other parts of WASI/versioning/etc to accommodate this possibly.
A basic recap of the problem: within a single component or application we don't want to require that everything is in sync all the time. For example the adapter may use one version of WASI, wasi-libc another, and custom bindings yet another. The concrete case Joel ran into was wasi-libc and the adapter use 0.2.0 and the custom http bindings use 0.2.1. This is a situation we want to work.
The reason this doesn't work today is because of how components work. The best way to phrase this is that eventually the application is going to want to block. The application, however, is blocking over a set of 0.2.0 and 0.2.1 pollables (e.g. think sockets from wasi-libc and http bits from the custom bindings). There's not actually a function to block over 0.2.0 and 0.2.1 pollables simultaneously, that just doesn't exist.
The shape of this problem means that the host has little recourse to fix this. The host cannot detect what the guest component is trying to do without violating the semantics one way or another of the component model. This is where the unknown handle index error Joel mentioned came from. That happened because an 0.2.0 pollable index was passed to a function that wants 0.2.1, and Wasmtime raised a trap (according to component model semantics) in this situation.
This led to the original idea of solving this. Change wasm-component-ld to unify the imports here where only 0.2.1 pollables are imported into the application. The idea though was that this solution would be implemented at the WIT level (where wit-component merges worlds together) rather than being WASI-specific. This brings us to the snag that I have now encountered.
The basic idea of the solution was that there's an operation in wit_parser::Resolve where worlds are merged together. This happens when you take the world of the adapter, the world of wasi-libc, and the world of the custom bindings, and merge them all together to produce a final world which is what ends up being the interface of the component (e.g. all of what the component could import). This merging operation was where I was hoping to insert logic to say "ok let's just import 0.2.1, not 0.2.0"
In this function, though, we are now faced with a situation where, let's say, we're merging worlds A and B. World A has pollable 0.2.0, but it also has wasi:filesystem/types@0.2.0. World B has pollable 0.2.1 and also has wasi:http/types@0.2.1. Note that wasi:{filesystem,http}/types both depend on pollables. Given this situation it's not actually possible to delete the 0.2.0 pollable import. That breaks the import that wasi:filesystem/types@0.2.0 has. There's no way to upgrade wasi:filesystem/types@0.2.0. Put another way this situation gives rise to a problem where it's not possible to create a world which is derivative of the actual original WITs.
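The snag can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a "world" is modeled as a plain dict from interface name to the interfaces it uses, which is far simpler than what wit_parser::Resolve actually tracks, and the helper names are hypothetical.

```python
# Hypothetical, simplified worlds: interface name -> interfaces it `use`s.
WORLD_A = {
    "wasi:io/poll@0.2.0": [],
    "wasi:filesystem/types@0.2.0": ["wasi:io/poll@0.2.0"],
}
WORLD_B = {
    "wasi:io/poll@0.2.1": [],
    "wasi:http/types@0.2.1": ["wasi:io/poll@0.2.1"],
}


def merge_worlds(*worlds):
    """Union the imports of several worlds into one."""
    merged = {}
    for world in worlds:
        merged.update(world)
    return merged


def dangling_dependencies(world):
    """Return interfaces whose dependencies are missing from the world."""
    dangling = {}
    for name, deps in world.items():
        missing = [dep for dep in deps if dep not in world]
        if missing:
            dangling[name] = missing
    return dangling


merged = merge_worlds(WORLD_A, WORLD_B)

# Naive "just import 0.2.1, not 0.2.0": drop the old poll interface.
without_old_poll = {
    name: deps for name, deps in merged.items() if name != "wasi:io/poll@0.2.0"
}

if __name__ == "__main__":
    print(dangling_dependencies(merged))
    print(dangling_dependencies(without_old_poll))
```

The plain merge has no dangling dependencies, but dropping the 0.2.0 pollable leaves wasi:filesystem/types@0.2.0 pointing at an interface that is no longer in the world, and the only WIT that was written down for it says 0.2.0.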
Various possible solutions to this:
wasi:filesystem/types@0.2.0 to wasi:io/poll@0.2.1. This is similar to the previous point to me where we're inventing WIT that doesn't actually exist. I don't feel that this is the right operation because it's working at the wrong abstraction level.
wasm-component-ld. Basically it would always have the latest copy of WITs for WASI (or maybe all versions? I don't know). That would mean that there's a way to upgrade wasi:filesystem/types@0.2.0 to 0.2.1 or whatever version is desired, even if the component doesn't refer to it.
Personally I'm sort of out of ideas. I don't know how best to solve this. Until this is solved though I think we should put the brakes on WASI releases and not release 0.2.2 until we have a strategy for what to do.
As a small aside, I've realized that this is also a problem with BuildTargets.md, although a bit worse since the core module doesn't even know it's happening. For example the core module might import a poll function for 0.2.0 and 0.2.1 but the actual import string is the same, so wasm-ld will deduplicate them into a single import (I think). That then has no way of actually getting routed to the correct wasi:io/poll function, which would mean it'd be impossible to componentize the module.
This in turn sort of gives rise to me thinking that it's not right that functions which all have the same definition are available under multiple versions. That seems to be causing more headaches than it's solving so it may be the crux of what needs solving? Unsure.
Half-baked idea: truncate import name versions to the "semver compatible part" i.e. 0.2.0 -> 0.2, 1.2.3 -> 1 and stick the full version in a names-like custom section.
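For concreteness, the truncation rule could be sketched like this. The function names are hypothetical, and the handling of 0.0.x versions (kept in full, since each 0.0.x release is treated as its own compatibility track) is an assumption that the message above doesn't address.

```python
def truncate_version(version):
    """Truncate a 'major.minor.patch' string to its semver-compatible part."""
    major, minor, _patch = (int(part) for part in version.split("."))
    if major > 0:
        return str(major)      # 1.2.3 -> 1
    if minor > 0:
        return f"0.{minor}"    # 0.2.0 -> 0.2
    return version             # assumption: 0.0.x is kept in full


def truncate_import(name):
    """Return (truncated import name, full version to record elsewhere)."""
    interface, sep, version = name.partition("@")
    if not sep:
        return name, None      # unversioned import: nothing to truncate
    return f"{interface}@{truncate_version(version)}", version


if __name__ == "__main__":
    for name in ["wasi:io/poll@0.2.0", "wasi:io/poll@0.2.1", "ns:pkg/iface@1.2.3"]:
        print(name, "->", truncate_import(name))
```

Both wasi:io/poll@0.2.0 and wasi:io/poll@0.2.1 truncate to the same import name, and the second element of the returned pair is the full version that would go into the custom section.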
Yeah that was sort of the idea of BuildTargets.md where the truncated name is the actual wasm import name, but bridging the truncation with the semantics of the component model is the problem because the two are mismatched at that point and it's what I'm having trouble resolving
IIUC, I think bullet 2 is a principled solution to the problem (or at least I can't yet think of a problem with it). In particular: when wasi:filesystem/types@0.2.0 depends on wasi:io/poll@0.2.0 via use, semantically this means that the instance-type defined by wasi:filesystem/types@0.2.0 is parameterized by some abstract (unknown) implementation of the wasi:io/poll.pollable@0.2.0 semantic contract. Because semver, we know that wasi:io/poll.pollable@0.2.1 must be a valid implementation of the semantic contract of wasi:io/poll.pollable@0.2.0 (in the same way that justifies the host in supplying 0.2.1 to a component that imports 0.2.0), so it seems totally valid, in the final resolved C-M component type produced by the merge operation, to have the C-M alias inside the instance type of the imported wasi:filesystem/types@0.2.0 resolve to the pollable resource type exported by the imported wasi:io/poll@0.2.1 instance type.
As further supporting details: in the "standalone" representation of an interface, the instance type that contains the fields of the interface is wrapped with a component type containing one type import for each used resource type. When we resolve multiple interfaces together, these wrapping component types are removed because we are logically performing a substitution, replacing the import in the wrapping component type with the substitution argument determined by the resolution algorithm, and in this case, it so happens to be a resource type from a 0.2.1 import. This is really just a special case of the with clause we've discussed (here and here), which allows not just tweaking version numbers, but replacing a foo with a bar entirely.
I'm not sure how to square this with all the other tooling we have though. For example right now if you import a resource in an interface you're not importing just any resource with a particular signature but the exact resource from that exact interface at that exact version. The interface can then be satisfied with whatever, but from a WIT-to-component-model-and-back perspective I don't understand how the version would be optional in a sense.
What I'm worried about is that if you were to infer the WIT from such a component where substitution occurred you'd see that wasi:filesystem/types would import from wasi:io/poll@0.2.1, which to me feels weird in that it's creating something that no one ever wrote down anywhere. I'm worried that it'll have knock-on effects that we can't predict at this time and causes even more trouble down the road
I think it's ok for the operation of "resolving WIT together into a final C-M component type" to be a lossy operation that essentially loses what you wrote in the use. (This was already true since dead-code elimination is allowed to delete unused imports.) If you want to see the original WIT that produced some (import "ns:foo/bar" (instance ...)), you just have to go fetch the WIT package ns:foo and find the full bar type.
Just to sketch out what I'm imagining the resolved C-M type to be to see if we're imagining the same thing:
(component $Resolved
  (import "wasi:io/poll@0.2.1" (instance $poll
    (export "pollable" (type $P (sub resource)))
    ;; ...
  ))
  (alias export $poll "pollable" (type $pollable))
  (import "wasi:filesystem/types@0.2.0" (instance
    (alias outer $Resolved $pollable (type $pollable'))
    (export "some-operation" (func ... (result (borrow $pollable'))))
    ;; ...
  ))
)
So here, you get to see the original imported versions of wasi:io/poll and wasi:filesystem/types, and it's only the aliases that cross the chasm. And indeed the fact that use wasi:io/poll@0.2.0.{pollable} was written was lost, but I think that's unavoidably lost by the abovementioned substitution (and for more reasons, in general, than just this one relating to versioning, once we have with).
(I have to head out in a bit, but happy to discuss more next week!)
oh that's a good point about dead code elimination already causing WITs to diverge
ok I'll work on getting this route implemented
Also, on a somewhat orthogonal note, if others are interested in this thread I opened up https://github.com/WebAssembly/component-model/issues/395 after some discussions last week to assist in debugging the original issue in this thread by tweaking the component model ABI semantics to enable better error messages from the host
I tried this with the latest wasi-sdk packaged from https://github.com/WebAssembly/wasi-sdk/actions/runs/11129514597 with wasmtime 25. And I am still getting
1: error while executing at wasm backtrace:
0: 0xe28ffb - wit-component:shim!indirect-wasi:io/poll@0.2.1-poll
...
2: unknown handle index 1
Part of the code is using wasi-libc, which still uses 0.2.0. Other parts are generated from wit-bindgen with 0.2.1.
cat ~/wasi-sdk-nightly/VERSION
24.9gcec5cf4f6cf3
wasi-libc: 1b19fc65ad84
llvm: a4bf6cd7cfb1
llvm-version: 19.1.0
config: f992bcc08219
Would you please post the .wasm file that's giving you that error backtrace?
sorry for the delay,
dotnet.wasm
Interestingly, when I try to build a single binary I get:
Caused by:
0: failed to validate component output
1: duplicate export name `31` already defined (at offset 0x8c3e9f)
Which seems to be related if I understand properly
In that component I see (processed-by "wit-component" "0.215.0") for @producers, so I think there's a tooling/version mismatch in production there?
(it should be 217 for the fix)
er, 218 is the fix, sorry
@James Sturtevant what does wasm-component-ld --version say?
And have you double-checked you're using WASI-SDK 25? And don't have WASI_SDK_PATH pointing to an old location by accident?
I am seeing the build output use my path that has the build from the latest wasi-sdk branch:
Running clang: -target wasm32-unknown-wasip2 --sysroot /home/jstur/wasi-sdk-nightly/share/wasi-sysroot -std=gnu11 -DMONO_GENERATING_OFFSETS -isystem /home/jstur/wasi-sdk-nightly/share/wasi-sysroot/include
cat /home/jstur/wasi-sdk-nightly/VERSION
24.9gcec5cf4f6cf3
wasi-libc: 1b19fc65ad84
llvm: a4bf6cd7cfb1
llvm-version: 19.1.0
config: f992bcc08219
how about /home/jstur/wasi-sdk-nightly/share/wasi-sysroot/bin/wasm-component-ld --version?
there is no bin folder there
oh maybe /home/jstur/wasi-sdk-nightly/bin/wasm-component-ld then?
ls bin/wasm-component-ld --version
ls (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Richard M. Stallman and David MacKenzie.
doh...
bin/wasm-component-ld --version
wasm-component-ld 0.5.9
ok that's the expected version, so that's not executing for some reason
maybe try deleting the previous wasi-sdk version? just to make sure it's not accidentally getting used? (or moving it to a different location temporarily)
so I cleared the build output folder and now, I get
Running 'python3 /home/jstur/projects/runtime/src/mono/mono/tools/offsets-tool/offsets-tool.py --abi=wasm32-unknown-wasip2 --netcore --targetdir="/home/jstur/projects/runtime/artifacts/obj/mono/wasi.wasm.Debug" --monodir="/home/jstur/projects/runtime/src/mono" --nativedir="/home/jstur/projects/runtime/src/native" --outfile="/home/jstur/projects/runtime/artifacts/obj/mono/wasi.wasm.Debug/cross/offsets-wasm32-unknown-none.h" --libclang="/home/jstur/projects/runtime/artifacts/obj/mono/wasi.wasm.Debug/llvm//x64/lib/libclang.so" --sysroot="/home/jstur/wasi-sdk-nightly/share/wasi-sysroot" --wasi-sdk="/home/jstur/wasi-sdk-nightly"'
Running clang: -target wasm32-unknown-wasip2 --sysroot /home/jstur/wasi-sdk-nightly/share/wasi-sysroot -std=gnu11 -DMONO_GENERATING_OFFSETS -isystem /home/jstur/wasi-sdk-nightly/share/wasi-sysroot/include -isystem /home/jstur/wasi-sdk-nightly/lib/clang/18/include -isystem /home/jstur/projects/runtime/src/mono/wasi/mono-include -I /home/jstur/projects/runtime/src/mono -I /home/jstur/projects/runtime/src/mono/mono -I /home/jstur/projects/runtime/src/mono/mono/eglib -I /home/jstur/projects/runtime/src/native -I /home/jstur/projects/runtime/src/native/public -I /home/jstur/projects/runtime/artifacts/obj/mono/wasi.wasm.Debug -I /home/jstur/projects/runtime/artifacts/obj/mono/wasi.wasm.Debug/mono/eglib -DTARGET_WASI -DTARGET_WASM -D_WASI_EMULATED_PROCESS_CLOCKS -D_WASI_EMULATED_SIGNAL -D_WASI_EMULATED_MMAN -DHAVE_SGEN_GC -DHAVE_MOVING_COLLECTOR /home/jstur/projects/runtime/src/mono/mono/metadata/metadata-cross-helpers.c
/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/include/wasm32-wasip2/__struct_iovec.h:5:10: fatal error: 'stddef.h' file not found
If I build it with 24 it builds, and then it also builds when I switch to nightly. Something looks off in the way it's putting it all together with the caching. I'll dig deeper here...
is this using the system clang perhaps? As opposed to the wasi-sdk clang?
had a few meetings. I am past the missing header file: one of the build scripts had hard-coded the clang includes to point to a path for 18 (-isystem /home/jstur/wasi-sdk-nightly/lib/clang/18/include), but the new wasi-sdk is v19.
now it gives me:
-- Build files have been written to: /home/jstur/projects/runtime/artifacts/bin/native/net10.0-wasi-Debug-wasm
[ 14%] Linking C executable dotnet.wasm
EXEC : error : failed to encode component [/home/jstur/projects/runtime/src/mono/wasi/wasi.proj]
Caused by:
0: failed to validate component output
1: duplicate export name `57` already defined (at offset 0xdf82f5)
verifying, it is actually executing the correct versions...
I found log files that seem to indicate that it is using the version expected:
link line: [ "/home/jstur/wasi-sdk-nightly/bin/wasm-component-ld" -m wasm32 --wasm-ld-path /home/jstur/wasi-sdk-nightly/bin/wasm-ld -L/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2 /home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2/crt1-command.o CMakeFiles/cmTC_7ad0d.dir/CMakeCCompilerABI.c.obj -lc /home/jstur/wasi-sdk-nightly/lib/clang/19/lib/wasip2/libclang_rt.builtins-wasm32.a -o cmTC_7ad0d]
arg [/home/jstur/wasi-sdk-nightly/bin/wasm-component-ld] ==> ignore
arg [-m] ==> ignore
arg [wasm32] ==> ignore
arg [--wasm-ld-path] ==> ignore
arg [/home/jstur/wasi-sdk-nightly/bin/wasm-ld] ==> ignore
arg [-L/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2] ==> dir [/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2]
arg [/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2/crt1-command.o] ==> obj [/home/jstur/wasi-sdk-nightly/share/wasi-sysroot/lib/wasm32-wasip2/crt1-command.o]
arg [CMakeFiles/cmTC_7ad0d.dir/CMakeCCompilerABI.c.obj] ==> ignore
arg [-lc] ==> lib [c]
arg [/home/jstur/wasi-sdk-nightly/lib/clang/19/lib/wasip2/libclang_rt.builtins-wasm32.a] ==> lib [/home/jstur/wasi-sdk-nightly/lib/clang/19/lib/wasip2/libclang_rt.builtins-wasm32.a]
arg [-o] ==> ignore
arg [cmTC_7ad0d] ==> ignore
/home/jstur/wasi-sdk-nightly/bin/wasm-component-ld --version
wasm-component-ld 0.5.9
Interesting! That looks like a bug in wit-component I think. Can you capture the input core wasm module and arguments to the command here?
Oh you know just thinking about this I think I know where the issue is
If you could file an issue on wasm-tools about this I'll debug further tomorrow, I think I know how to reproduce
(also thanks for testing this is great to find before a full wasi sdk release!)
I am not sure I know how to capture the input core module
https://github.com/bytecodealliance/wasm-tools/issues/1850
Last updated: Nov 22 2024 at 17:03 UTC