Late last year we decided that we'd like to start running a standup-style update and to use the Ship WASIp3 project board (see topic for background) to coordinate work.
Let's kick things off with an async-style standup for this week!
Folks that have active work streams, will you post a quick update?
cc @Victor Adossi @Roman Volosatovs @Joel Dice
Bailey Hayes: WASI clocks and random 0.3.0-draft WIT definitions have been reviewed and merged. We adjusted the API for wasi-clocks now that fns are async, see: https://github.com/WebAssembly/wasi-clocks/blob/main/wit-0.3.0-draft/monotonic-clock.wit#L34-L44
We decided to rev the since versions, as 0.3.0 will make breaking changes to the 0.2.x series. Expect all 0.3.0-draft WITs to align with this. Once @Roman Volosatovs's PRs for filesystem and sockets merge, wasi-cli is the next target. I believe we're very close to having a draft for WASIp2 interfaces (an http draft already exists but will be updated once deps land).
Once the above is complete, adding support for WASIP3 interfaces in wasmtime and JCO can begin
If you're interested in jumping in, I see a couple of items that could begin now:
- the non-blocking attribute issue on the board; before starting this one, check in with @Luke Wagner

When picking up a new issue on the board, set yourself as the assignee.
Async/future/stream/error-context support has been merged into wasm-tools and wit-bindgen and is available in their latest releases. Still iterating on the Wasmtime PR, knocking out todo items one by one. Hoping to move that one to "ready-for-review" by the end of the month. Meanwhile, it is feature-complete (modulo the error-context work Victor is doing, which I'm reviewing now), so feel free to clone the branch, kick the tires, and start adding WASIp3 support to wasmtime-wasi and wasmtime-wasi-http.
Hey all, here's what I'm working on right now:
- wit-bindgen-rt and a draft to Joel's wasmtime branch (immense thanks to @Joel Dice who walked me through this)
- asyncify faster (unlikely to pan out)
- CC @Calvin Prewitt who is doing the actual bulk of the work of jco host async imports
wasi-sockets and wasi-filesystem WIT update PRs are ready for review
Will work on removing the wasi:io/error.error dep from wasi:http, and after that start the WASI implementation in Wasmtime
Thanks. I've got a few remarks on the wasi-sockets part. Hope to get them written down soon
What’s the status of implementing P3 streams and futures in wasmtime::component? That’s a prerequisite to a wasi impl, correct?
@Pat Hickey see my update above; that's feature-complete and has decent test coverage. Working to get that PR in a reviewable state (hopefully in the next week or two), but meanwhile you can use my PR branch.
Thanks. I will take a look. I've been factoring out wasi-io from wasmtime-wasi so that at least that much can be built no_std; idk how that will intersect with these new streams and futures, but hopefully they can plug together
Just a heads-up, I've started working on stream types with no <T>; added this ticket to the board: https://github.com/orgs/bytecodealliance/projects/16?pane=issue&itemId=94157967
i.e. implementing the spec change from https://github.com/WebAssembly/component-model/pull/440 in wasm-tools and wit-bindgen
Opened a draft PR to wasi:cli for 0.3.0: https://github.com/WebAssembly/wasi-cli/pull/52
A few smaller-scoped wasi:http updates to 0.3.0: https://github.com/WebAssembly/wasi-http/pull/143 https://github.com/WebAssembly/wasi-http/pull/142
A wasi:random fix: https://github.com/WebAssembly/wasi-random/pull/52
Added a draft PR for stream of unit: https://github.com/bytecodealliance/wasm-tools/pull/1978
Started the wasmtime WASI 0.3 impl https://github.com/bytecodealliance/wasmtime/pull/10061
and now with actual async: https://github.com/bytecodealliance/wasmtime/pull/10063/commits/171a532910790379ebbfd85ddbb62b6f60a1fbff
component build fails with:
thread 'main' panicked at crates/test-programs/artifacts/build.rs:176:10:
module can be translated to a component: failed to validate component output
Caused by:
canonical option `async` requires the component model async feature (at offset 0x82da1a)
Will try basing it on the async branch
btw, I've noticed that the wit-bindgen-rt dependency, which appears to be used by async, is missing out of the box.
I believe deps are normally re-exported by wit-bindgen, but in this case it appears that the macro attempts to import from ::wit-bindgen-rt
cc @Joel Dice idk if that's a known/fixed issue, or do you want me to take a look?
filed https://github.com/bytecodealliance/wit-bindgen/issues/1135, working on a fix
this seems to have fixed the issue: https://github.com/bytecodealliance/wit-bindgen/pull/1136
FYI, I'm starting to work on wit-bindgen C support for async/streams/futures, which will unblock wasip3 support in wasi-libc and the toolchains which depend on it.
I've rebased my Wasmtime WASI changes on the latest async branch from @Joel Dice in https://github.com/rvolosatovs/wasmtime/tree/async-wasi
The wasi:clocks@0.3.0 test fails for now, but looks like we're pretty close to getting this working end-to-end. On https://github.com/rvolosatovs/wasmtime/commit/8ea812e04a77ab7a396bd74876d8c8ef6f0b017b:
---- async_::preview3_sleep stdout ----
preopen: TempDir { path: "/var/folders/bq/thy1_b7x29l7s2wqw39r62yw0000gn/T/wasi_components_preview3_sleep.component_cp7fbR" }
thread 'async_::preview3_sleep' panicked at /Users/rvolosatovs/src/github.com/bytecodealliance/wasmtime/crates/wasi/src/runtime.rs:108:15:
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- sync::preview3_sleep stdout ----
preopen: TempDir { path: "/var/folders/bq/thy1_b7x29l7s2wqw39r62yw0000gn/T/wasi_components_preview3_sleep.component_bQriUp" }
thread 'sync::preview3_sleep' panicked at /Users/rvolosatovs/src/github.com/bytecodealliance/wasmtime/crates/wasmtime/src/runtime/component/concurrent.rs:534:54:
called `Option::unwrap()` on a `None` value
failures:
async_::preview3_sleep
sync::preview3_sleep
I'll start digging :)
I got the wasi:clocks@0.3.0 test passing with the async runtime in https://github.com/rvolosatovs/wasmtime/commit/c3406b2cb02a6013add16867a8c6debad8bbef39
thanks @Joel Dice for the help!
I noticed the same "current task missing" panic when attempting to run the same test with async_support(false), i.e. using sync host bindings. It makes sense, since in that case none of the async machinery seems to be used.
What is the current thinking around that - are we planning to support wasip3 on hosts without async_support at all?
@Alex Crichton and I discussed this a while ago. We don't think WASIp3 or the component model async ABI make sense to support in Wasmtime with async_support(false), and we're actually considering deprecating Wasmtime's component model sync APIs (i.e. making async_support(true) the default and, eventually, the only option for running components).
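A minimal sketch of the host configuration that direction implies, using standard Wasmtime APIs (exact defaults may differ by Wasmtime version):

```rust
use anyhow::Result;
use wasmtime::{Config, Engine};

fn p3_engine() -> Result<Engine> {
    let mut config = Config::new();
    config.async_support(true);        // always-on for component-model async / WASIp3 hosts
    config.wasm_component_model(true); // enable the component model
    Engine::new(&config)
}
```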
That sounds good to me and simplifies this effort - I won't be adding the sync bindings to wasip3 interfaces at all then :)
thanks Joel, that sounds right - we'll have to see about deprecating the sync stuff for p2 we already provide (hacky as it is, it does actually meet some users' needs), but I agree we shouldn't worry about sync interfaces to p3 or async CM
Do I understand correctly that the plan is that the CM host must support async, but that all components can still be fully sync (if one wants that)? If wasmtime will always use async, it may be useful to support compiling with a very cheap async runtime so that e.g. embedded use cases don't need to pull in a costly, featureful runtime
Yes, you should basically just need tasks (which can be implemented with a library e.g. https://docs.rs/futures/latest/futures/task/index.html), and whatever your platform needs for IO to make a custom executor
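To make that concrete, a minimal block_on built from the futures crate's task utilities; this is only an illustration (not a Wasmtime feature), and it assumes a std environment with thread parking, which an embedded host would replace with its own wait/wake primitive:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll};
use std::thread::{self, Thread};

use futures::task::{waker, ArcWake};

/// Wakes the parked thread that is driving the future.
struct ThreadWaker(Thread);

impl ArcWake for ThreadWaker {
    fn wake_by_ref(arc_self: &Arc<Self>) {
        arc_self.0.unpark();
    }
}

/// Drive one future to completion on the current thread, parking between polls.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = waker(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```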
I'm working on exactly that for P2 right now, the need is not limited to P3
the existing P2 wasmtime-wasi always runs inside tokio, with the sync bindings it just hides that tokio in the background.
@Pat Hickey @Joel Dice @Alex Crichton @Victor Adossi I've sent you all invites to a 30 min call tomorrow, please let me know if that time slot does not work for you.
If it does, then first item on the agenda would be discussing time slots and making this a regular meeting :)
(I used emails from Wasmtime git log mostly)
just realized I scheduled the meeting for next week :thinking: , just rescheduled for tomorrow, also moved it a bit later to accommodate the US West Coast
works for me yeah :+1:
Joel and I won't be around next Monday, but don't let my absence block y'all from meeting the Monday after that
I've created https://github.com/bytecodealliance/wasip3-prototyping and we should all have admin access there
I avoided a literal fork b/c I don't think GitHub lets you fork a repo within the same org
just pushed up wasmtime's current main to the main branch there as well
Small follow-up to the discussion from yesterday, just for the sake of completeness - without any changes to the wasmtime bindgen, we should be able to use dynamic dispatch to avoid the complex state machines in implementations. I haven't tested it, but I don't see why returning a Box<dyn FnOnce(..)> wouldn't work.
It's essentially the same thing we've been doing for a long time with async_trait, but it does seem like a bit of a setback, since we'd not be able to "just" use Rust async. It also seems non-obvious for embedders, especially those not completely familiar with dynamic dispatch/async (e.g. explicit Box<dyn ..> bounds would likely be required to avoid compiler errors).
Just throwing it out there as a potential solution; I believe having "true" Rust async trait support would be a lot nicer/better.
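For reference, the analogy being drawn is that #[async_trait] has always desugared an async method into a sync method returning a boxed, type-erased future; a hypothetical host trait, just to show the shape:

```rust
use std::future::Future;
use std::pin::Pin;

trait HostClock {
    // With async_trait you would write: async fn now(&mut self) -> u64;
    // What the macro actually puts in the trait is roughly this:
    fn now<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = u64> + Send + 'a>>;
}
```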
Can you clarify how returning a Box<dyn ...> would help? I'm not seeing what part of the problem it solves.
FWIW, I just sketched out a potential API based on a thread-local variable which is only set to a valid value when Wasmtime is polling futures created by functions registered using LinkerInstance::func_wrap_concurrent: https://gist.github.com/dicej/21519f9cf2e4d57a3316ea0b2167d281
diff --git a/crates/wasi/src/p3/sockets/host/types/tcp.rs b/crates/wasi/src/p3/sockets/host/types/tcp.rs
index e2a45eb6c..c5f3cbb7f 100644
--- a/crates/wasi/src/p3/sockets/host/types/tcp.rs
+++ b/crates/wasi/src/p3/sockets/host/types/tcp.rs
@@ -45,64 +45,65 @@ where
let ctx = store.data().sockets();
let allowed = ctx.allowed_network_uses.tcp;
let socket_addr_check = ctx.socket_addr_check.clone();
- let sock = store
+ let tcp_socket = match store
.data_mut()
.table()
.get_mut(&mut socket)
.context("failed to get socket resource from table")
- .map(|socket| {
- let tcp_state = mem::replace(&mut socket.tcp_state, TcpState::BindStarted);
- if let TcpState::Default(sock) = tcp_state {
- Some((sock, socket.family))
- } else {
- socket.tcp_state = tcp_state;
- None
- }
+ {
+ Ok(sock) => sock,
+ Err(err) => {
+ return Box::pin(async move {
+ for_any(Box::new(for_any(move |_| Err(err)))) as Box<dyn FnOnce(_) -> _>
+ }) as Pin<Box<dyn Future<Output = _>>>
+ }
+ };
+ let tcp_state = mem::replace(&mut tcp_socket.tcp_state, TcpState::BindStarted);
+ let sock = if let TcpState::Default(sock) = tcp_state {
+ sock
+ } else {
+ tcp_socket.tcp_state = tcp_state;
+ return Box::pin(async {
+ for_any(
+ Box::new(move |_| Ok(Err(ErrorCode::InvalidState))) as Box<dyn FnOnce(_) -> _>
+ )
});
+ };
let local_address = SocketAddr::from(local_address);
- async move {
- let res = match sock {
- Ok(sock)
- if !allowed
- || !socket_addr_check(local_address, SocketAddrUse::TcpBind).await =>
- {
- if let Some((sock, ..)) = sock {
- Ok(Ok((sock, Err(ErrorCode::AccessDenied))))
- } else {
- Ok(Err(ErrorCode::AccessDenied))
- }
- }
- Ok(Some((sock, family))) => {
- let res = bind(&sock, local_address, family);
- Ok(Ok((sock, res)))
- }
- Ok(None) => Ok(Err(ErrorCode::InvalidState)),
- Err(err) => Err(err),
- };
- for_any(move |mut store: StoreContextMut<'_, Self::TcpSocketData>| {
- let sock = res?;
- let socket = store
+ Box::pin(async move {
+ if !allowed || !socket_addr_check(local_address, SocketAddrUse::TcpBind).await {
+ return Box::new(move |mut store: StoreContextMut<'_, Self::TcpSocketData>| {
+ let tcp_socket = store
+ .data_mut()
+ .table()
+ .get_mut(&mut socket)
+ .context("failed to get socket resource from table")?;
+ tcp_socket.tcp_state = TcpState::Default(sock);
+ Ok(Err(ErrorCode::AccessDenied))
+ }) as Box<dyn FnOnce(_) -> _>;
+ }
+ Box::new(move |mut store: StoreContextMut<'_, Self::TcpSocketData>| {
+ let tcp_socket = store
.data_mut()
.table()
.get_mut(&mut socket)
.context("failed to get socket resource from table")?;
- let (sock, res) = match sock {
- Ok(sock) => sock,
- Err(err) => return Ok(Err(err)),
- };
ensure!(
- matches!(socket.tcp_state, TcpState::BindStarted),
+ matches!(tcp_socket.tcp_state, TcpState::BindStarted),
"corrupted socket state"
);
- if let Err(err) = res {
- socket.tcp_state = TcpState::Default(sock);
- Ok(Err(err))
- } else {
- socket.tcp_state = TcpState::Bound(sock);
- Ok(Ok(()))
+ match bind(&sock, local_address, tcp_socket.family) {
+ Ok(()) => {
+ tcp_socket.tcp_state = TcpState::Bound(sock);
+ Ok(Ok(()))
+ }
+ Err(err) => {
+ tcp_socket.tcp_state = TcpState::Default(sock);
+ Ok(Err(err))
+ }
}
})
- }
+ })
}
fn connect(
something roughly like this. I did not play the lifetime bingo to solve the "one type is more general than the other" and "implementation of FnOnce is not general enough" errors, but that's a potential workaround (which, it seems, would require even more helpers akin to for_any). I'm not sure if it's more ergonomic or clear.
Your sketch looks good to me, however, and I think it makes a lot of sense.
FYI, I just pushed my async branch to the main branch of the wasip3-prototyping repo. I won't rebase it anymore after that; we can merge from upstream main as necessary, which should be less disruptive than rebasing.
I missed wherever this came up on github - what's the reasoning behind dropping the -draft suffix on wasi 0.3 WIT package versions?
Looks like I've discovered what appears to be a bindgen bug https://github.com/bytecodealliance/wasip3-prototyping/issues/2, but I'm not entirely sure yet.
ok got it. so we can restore the -draft once that bug is fixed?
I think so; however, is there a particular reason we'd want the -draft suffix?
Since the interfaces are in a separate (draft) directory within proposal repositories anyway, the fact that those interfaces are "draft" should be clear, and not introducing the suffix seems to minimize the churn
It's a precedent we set with wasip2. We don't want anyone to accidentally think wasi 0.3.0 is ready. We're also expecting to make a series of updates as these are unlikely to be the final versions of the interfaces. -draft makes that easier.
yep. we have meaningful semver there, so let's use it.
it also means we can publish the draft versions to an OCI registry fearlessly.
Some fast feedback would be great on this issue in wasi-http:
https://github.com/WebAssembly/wasi-http/issues/153
It looks like a signature (body.finish) needs to change, given that internally it produces an error-context. There are many options for how to fix it, but it would be great to get that fix in upstream on the 0.3.0-draft
A quick summary of the status meeting earlier today, please let me know if I missed something:
- wasi-http/types.body#finish: reads from a future, which is fallible, but there is currently no way to propagate the error-context from the method. We need an issue discussion in wasi-http about how to handle that.

@Roman Volosatovs this fixes the borrow issue: https://github.com/bytecodealliance/wasip3-prototyping/pull/4
Awesome, thanks @Joel Dice ! Just tested it and can confirm that it fixed the issue I've encountered :)
Here's the wasmtime-wit-bindgen ergonomics refactor: https://github.com/bytecodealliance/wasip3-prototyping/pull/6
Looks great! Here's bind and connect with the new API: https://github.com/bytecodealliance/wasip3-prototyping/blob/7dbc8a9fd6a2348e772a32382c3e29c88cad397e/crates/wasi/src/p3/sockets/host/types/tcp.rs#L67-L153
See also https://github.com/bytecodealliance/wasip3-prototyping/pull/6#pullrequestreview-2612364221
A first, quick, incomplete attempt at TCP send looks roughly like this: https://github.com/bytecodealliance/wasip3-prototyping/blob/e84bce870bf7bafed014b0a536445e1e0deb22fa/crates/wasi/src/p3/sockets/host/types/tcp.rs#L263-L366
once I can get listen working, I'll be able to run some tests
(this also lacks shutdown on EOF, but just testing out the API for now)
@Roman Volosatovs this should unblock listen: https://github.com/bytecodealliance/wasip3-prototyping/pull/8
Thanks @Joel Dice !
Working on listen, I'm getting a segfault with this API, and I've also run into the failed to get promise due to cannot remove owned resource while borrowed again; looks like there's indeed a bug, which is likely similar to what happened with borrows.
I've also added a little commit to allow spawn on the store directly (from within with) (https://github.com/bytecodealliance/wasip3-prototyping/pull/1/commits/e4e4e5531aee37c0a99ea5e2d9364c30c001752d); however, even after reverting that I'm still getting the same result.
What's the most efficient way to proceed here?
Should I file issues with reproduction steps?
Would it help to get on a call and pair?
I've pushed my commit trace to https://github.com/bytecodealliance/wasip3-prototyping/pull/1
Filing issues would be ideal. They don't need detailed descriptions -- just brief descriptions of each issue with steps to repro.
BTW, I've been wondering how listen: func() -> result<stream<tcp-socket>, error-code> is supposed to work given that the tcp-socket borrow only lasts until listen returns, i.e. the callee no longer has access to it after returning, so it can't use it in the spawned task. Curious if @Luke Wagner has guidance on this.
filed the issue at https://github.com/bytecodealliance/wasip3-prototyping/issues/9
> BTW, I've been wondering how listen: func() -> result<stream<tcp-socket>, error-code> is supposed to work given that the tcp-socket borrow only lasts until listen returns, i.e. the callee no longer has access to it after returning, so it can't use it in the spawned task.
Roman and I discussed this in today's Wasmtime meeting and he clarified that the host implementation of listen does not attempt to use the borrowed resource as such after the function returns -- instead it clones the backing implementation and uses that, which makes sense. And I guess a component-implemented version could do the same thing by making the exported tcp-socket resource implementation contain cloneable, interior-mutable state which may be cloned for post-return use.
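A rough sketch of that pattern with stand-in types (TcpSocketResource and the accept loop below are illustrative, not the prototype's actual code; it also assumes a Tokio runtime is driving the host): the spawned task only ever touches a clone of the backing state, never the borrowed resource.

```rust
use std::sync::Arc;

use anyhow::Result;
use tokio::net::TcpListener;

/// Stand-in for the host's tcp-socket resource state.
struct TcpSocketResource {
    listener: Arc<TcpListener>, // cheaply-cloneable backing implementation
}

fn start_listening(socket: &mut TcpSocketResource) -> Result<()> {
    // The borrow of `socket` ends when this function returns; the task below
    // owns its own clone of the Arc, so it never uses the borrow post-return.
    let listener = Arc::clone(&socket.listener);
    tokio::spawn(async move {
        loop {
            match listener.accept().await {
                // here the accepted connection would be pushed onto the
                // stream<tcp-socket> handed back to the caller
                Ok((_conn, _addr)) => {}
                Err(_) => break,
            }
        }
    });
    Ok(())
}
```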
Filed an issue for the second error I've encountered: https://github.com/bytecodealliance/wasip3-prototyping/issues/10
@Roman Volosatovs The segfault you found is the same as the one I discovered last week in the context of epoch interruption (although it turns out it can happen even with epoch interruption disabled) and am in the process of addressing. Meanwhile, this tiny PR makes it panic instead of segfault.
Awesome, thanks for looking into it. Can confirm that it does not segfault anymore! :)
@Roman Volosatovs This fixes issue #10: https://github.com/bytecodealliance/wasip3-prototyping/pull/12. I'll continue working on #9, but might not finish it before the end of my day, in which case I should have a PR up tomorrow.
Do we want to get CI running on the prototyping repo? I must have missed the how/when/why of when it was disabled, but having some tests run at PR time is likely better than none.
It's not a showstopper as I think a forking workflow would allow all the changes anyone wanted, but just bringing it up here for discussion.
Yes, that would be great.
@Roman Volosatovs This PR addresses #9. BTW, I merged my other outstanding PRs (despite them not yet being approved) because juggling 4 different branches was irritating me. I'd also like to merge yours as soon as it's somewhat stable (#[ignore]-ing tests as necessary) so we don't need to jump back and forth so much.
With that PR, the tests that were failing due to #9 no longer panic, but they hang indefinitely, consuming no CPU. Not sure if that's a bug in the tests or in Wasmtime; I haven't had a chance to study them yet.
In general, I think we can be pretty aggressive about merging PRs for now and avoid long-lived branches. We're going to split it all up again for upstreaming anyway; for now let's keep things simple. Enabling CI as Victor suggested will help avoid stepping on each other's toes.
Sounds good, I've rebased https://github.com/bytecodealliance/wasip3-prototyping/pull/1 on the latest main and marked it ready for review.
Unfortunately, I was not able to enable all TCP tests yet, as it seems I'm hitting a deadlock with stream I/O; the host never receives the data that the guest sends. I'll file an issue later today
Yeah, that deadlock sounds like the hang I saw last week. I can look into that today. If you can minimize the test case that reproduces it at all, that would definitely help.
I tracked it down to these two:
guest: https://github.com/bytecodealliance/wasip3-prototyping/blob/33263c939f09ec0c68d200b7b8048441812657d4/crates/test-programs/src/bin/sockets_0_3_tcp_bind.rs#L71
host: https://github.com/bytecodealliance/wasip3-prototyping/blob/33263c939f09ec0c68d200b7b8048441812657d4/crates/wasi/src/p3/sockets/host/types/tcp.rs#L435
Neither of these calls ever resolves, as if the futures are not polled
I'm using Promise::into_future in the host, which seems to be the right thing to do according to the docs; I have not tried it with Promise::get
Yeah, Promise::into_future is the only one that makes sense in this context. I'll take a look.
Based on println debugging, the guest is not even reaching the line you linked to. Since the guest has lowered the tcp-socket.send import synchronously, and the implementation is trying to read the entire stream<u8> before returning, there's no way it will ever return; the guest is blocked on that call and will never write anything to the stream.
I think you'll want to lower the tcp-socket.send import asynchronously and await it concurrently with a future that feeds the write end of the stream (e.g. using join or FuturesUnordered). Happy to pair on this when you're available.
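The shape of that fix, with made-up guest-side names (new_stream, TcpSocket::send, and the writer type are not the real generated bindings): the async-lowered import and the task feeding the write end are driven concurrently instead of sequentially.

```rust
use futures::join;

async fn send_all(socket: &TcpSocket, data: Vec<u8>) -> Result<(), ErrorCode> {
    let (mut tx, rx) = new_stream::<u8>(); // keep the write end, hand over the read end
    let send = socket.send(rx);            // async-lowered import; not awaited on its own
    let feed = async move {
        tx.write(data).await;              // feed the write end...
        drop(tx);                          // ...then close it so `send` sees end-of-input
    };
    let (result, ()) = join!(send, feed);  // poll both to completion concurrently
    result
}
```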
Right, yeah, that makes sense, thanks!
I'll push a fix later tonight if I get a chance, tomorrow otherwise
had a few minutes to try it out https://github.com/bytecodealliance/wasip3-prototyping/pull/17; will need to figure out a way to properly adapt the wasip2 test expectations regarding the behavior, but looks like it works. tcp_streams still hangs, but that's likely due to some write handle not being dropped. Will continue tomorrow.
Thanks for taking a look, looks like we'll be able to have the complete TCP test suite working tomorrow
Interesting, looks like I found another issue - the BackgroundTask is not awoken by the executor after returning Poll::Pending.
I think it would be beneficial to pair later today!
if this is the waker used to poll these tasks, it would explain what I'm seeing: https://github.com/bytecodealliance/wasip3-prototyping/blob/f5fefc6f4e63b311850421c75d5cdb2e96784a7c/crates/wasmtime/src/runtime/component/concurrent.rs#L1459-L1469
https://github.com/bytecodealliance/wasip3-prototyping/blob/f5fefc6f4e63b311850421c75d5cdb2e96784a7c/crates/wasmtime/src/runtime/component/concurrent.rs#L1561
specifically, this future never gets polled again after returning Poll::Pending, even though the oneshot::Receiver poll should have registered the wakeup once the oneshot::Sender is dropped (I have verified that it indeed gets dropped) https://docs.rs/tokio/latest/tokio/sync/oneshot/struct.Receiver.html
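A generic illustration of that failure mode (not the prototype's code): if the Context a future is polled with carries a waker that never wakes anything, the executor is never told to poll again, so a Pending future that relies on wake-on-ready (like oneshot::Receiver) appears to hang forever.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll};

use futures::task::noop_waker_ref; // a waker that does nothing, like a "dummy waker"

fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(noop_waker_ref());
    // If this returns Pending, nothing will ever schedule the future again.
    fut.as_mut().poll(&mut cx)
}
```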
@Roman Volosatovs dummy_waker is only used in first_poll and poll_and_block, and those are only used for guest->host calls in host.rs, meaning dummy_waker should never be used for spawned tasks. Instead, spawned tasks are pushed directly to ConcurrentState::futures, and that is always polled with the Context received from the embedder's executor (i.e. Tokio in this case).
My guess is that we're not polling ConcurrentState::futures between when spawn is called and when we suspend the fiber and return control to the executor, i.e. the spawned future is not being polled at all. Or have you already verified that it's being polled at least once?
Happy to pair today when you're available. Meanwhile, I'll take a closer look.
It gets polled once for sure. I'll be available in a few minutes, we could jump on a call if you're free
https://meet.google.com/sta-snii-ffj
Ok, after looking at the code, I think you're right that the issue is related to the guest synchronously dropping the listening tcp socket, which in the host drops the oneshot::Sender, but the guest code keeps running after the drop call returns, before the host has a chance to poll ConcurrentState::futures again.
So we get the trap because the guest tries to bind again before the old socket has really stopped listening.
I'm wondering if there's a way we can make HostTcpSocket::drop finish everything it needs to do (including waiting for the spawned future to complete) before returning.
It's like the spawned future needs to send a confirmation back to HostTcpSocket::drop indicating it's done.
For my understanding then, are you saying the observed behavior is effectively a race condition?
Would the future eventually get polled on its own, or does it require something to trigger the poll?
Kind of, except that it's 100% deterministic.
yes, I think the future would be polled if the guest were to yield between the drop and the bind
Want to jump back on the call to chat for a bit?
or if (like I was saying), the host drop implementation were to wait for confirmation before returning
yup
https://meet.google.com/sta-snii-ffj
actually, following up on our discussion just now, I might try taking the task out from the drop directly, by wrapping the ListenTask itself
It does seem like a generally useful pattern, but I'm not actually sure the abort handle should be the responsibility of the runtime if we can figure that out by wrapping the task directly
Makes sense; want to try that before I start messing with the runtime code?
sounds good, let me try a few things!
diff --git a/crates/wasi/src/p3/sockets/host/types/tcp.rs b/crates/wasi/src/p3/sockets/host/types/tcp.rs
index d2b43129b..db000c471 100644
--- a/crates/wasi/src/p3/sockets/host/types/tcp.rs
+++ b/crates/wasi/src/p3/sockets/host/types/tcp.rs
@@ -4,8 +4,8 @@ use core::net::SocketAddr;
use core::pin::{pin, Pin};
use core::task::Poll;
-use std::net::Shutdown;
use std::sync::Arc;
+use std::{net::Shutdown, os::fd::AsRawFd as _};
use anyhow::{ensure, Context as _};
use io_lifetimes::AsSocketlike as _;
@@ -769,7 +769,8 @@ where
.delete(rep)
.context("failed to delete socket resource from table")?;
match sock.tcp_state {
- TcpState::Listening { abort, .. } => {
+ TcpState::Listening { abort, listener } => {
+ unsafe { libc::close(listener.as_raw_fd()) };
eprintln!("DROP LISTENER");
_ = abort.send(());
Ok(())
just as a very quick PoC, since I did not really find a nice way to close the listener yet, this seems to have fixed the issue.
Of course, I got
fatal runtime error: IO Safety violation: owned file descriptor already closed
but I guess that's expected :D
I also tried wrapping the listener in an Option<std::sync::Mutex<TcpListener>>, but that got pretty awkward pretty quickly, although it could potentially work as well
Yeah, I figured we'd need to use a Mutex somewhere given that the futures need to be Send + Sync
ah, actually, we could also use tokio::spawn directly and communicate with the BackgroundTask via a channel
that way the drop can just use tokio task machinery directly
yeah, I could see that working
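A small sketch of that idea with assumed names (ListenTaskState is not the real type): the accept loop runs as a plain Tokio task, and the resource keeps a stop channel plus the JoinHandle so that shutdown can signal the task and then wait for it to actually finish before returning.

```rust
use tokio::sync::oneshot;
use tokio::task::JoinHandle;

struct ListenTaskState {
    stop: oneshot::Sender<()>, // tells the accept loop to exit
    task: JoinHandle<()>,      // awaited to confirm the loop has really stopped
}

async fn shut_down(state: ListenTaskState) {
    let _ = state.stop.send(()); // ignore the error if the task already exited
    let _ = state.task.await;    // wait for confirmation before returning
}
```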
hmm, it mostly works, hitting some non-determinism, want to take a look?
sure, same google meet link?
yes, will be there in a sec
you people are rocking
took me 3 different channels of 3 different types to get there, but got the tcp_bind test consistently working https://github.com/bytecodealliance/wasip3-prototyping/pull/17 @Joel Dice
Yikes, that drop implementation is complicated. This should make it a lot simpler:
type Spawned =
Arc<Mutex<Option<Pin<Box<dyn Future<Output = Result<()>> + Send + Sync + 'static>>>>>;
/// Handle to a spawned task which may be used to abort it.
pub struct SpawnHandle(Spawned);
impl SpawnHandle {
/// Abort the task.
pub fn abort(&self) {
*self.0.lock().unwrap() = None;
}
/// Return an [`AbortOnDropHandle`] which, when dropped, will abort the
/// task.
pub fn abort_handle(&self) -> AbortOnDropHandle {
AbortOnDropHandle(self.0.clone())
}
}
/// Handle to a spawned task which will abort the task when dropped.
pub struct AbortOnDropHandle(Spawned);
impl Drop for AbortOnDropHandle {
fn drop(&mut self) {
*self.0.lock().unwrap() = None;
}
}
I should have a PR up by the end of the day.
Actually, the SpawnHandle/AbortOnDropHandle stuff is a bit trickier than I realized. I need to not just clear the Option but also wake the waker (i.e. the FuturesUnordered waker, in this case) so it knows to poll the wrapper future one last time and remove it from the set. Otherwise the wrapper future will be stuck in the FuturesUnordered forever.
It's doable, just need to spend some more time on it.
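A sketch of that missing piece with assumed field names (not the actual prototype code): besides clearing the future, the abort path also fires the waker captured at the last poll, so the executor that owns the wrapper future (the FuturesUnordered here) polls it one final time and drops it from the set.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Mutex;
use std::task::Waker;

use anyhow::Result;

struct SpawnedInner {
    future: Option<Pin<Box<dyn Future<Output = Result<()>> + Send + Sync + 'static>>>,
    waker: Option<Waker>, // stored each time the wrapper future is polled
}

fn abort(spawned: &Mutex<SpawnedInner>) {
    let mut inner = spawned.lock().unwrap();
    inner.future = None;
    if let Some(waker) = inner.waker.take() {
        waker.wake(); // without this, the wrapper sits in the FuturesUnordered forever
    }
}
```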
Got all TCP tests working https://github.com/bytecodealliance/wasip3-prototyping/pull/20
you people POUNDING!
@Joel Dice there's a merge conflict in https://github.com/bytecodealliance/wasip3-prototyping/pull/18 currently; however, https://github.com/bytecodealliance/wasip3-prototyping/pull/20 can be merged cleanly and it increases test coverage.
How about we merge https://github.com/bytecodealliance/wasip3-prototyping/pull/20 now and I rebase https://github.com/bytecodealliance/wasip3-prototyping/pull/18 on the resulting main?
pushed a rebased version to https://github.com/bytecodealliance/wasip3-prototyping/tree/dicej/another-bindgen-refactor-rebase
Yup, sounds good; I just merged #20
Awesome, should I push to your branch then?
Sure, thanks!
pushed. just a heads-up, I'm noticing some flakiness in the tcp_connect test; let's see if switching to the new spawn functionality fixes that. Otherwise, we can ignore it for now
FYI, in the past when I've seen non-deterministic test behavior, I've been able to make it more consistent by adding/removing/adjusting sleep calls in host functions (e.g. artificially delaying tcp socket operations), so if we still see issues even with the new spawn feature, that might help with debugging.
I cancelled the CI run; if I manage to fix it, I'll push a fix within the next 1.5h, otherwise I'll push an ignore for the flaky ones
@Joel Dice the fix for the race condition we discovered was fairly trivial https://github.com/bytecodealliance/wasip3-prototyping/pull/23
Note that I cannot actually use the abort handle machinery out of the box just yet, due to the need to explicitly close the stream write handle. Let's chat about it later today at the status update!
Hey @Pat Hickey wanted to get your opinion on https://github.com/WebAssembly/wasi-http/pull/154
After talking with Joel at the P3 sync, we want to have body.finish not be async (since it doesn't have to be, it's simply ferrying along the trailers), but right now creating a new future requires access to the store, which currently is only doable from async functions.
This is somewhat inconvenient, but it brought up the idea that the lack of trailers and trailers that are present but empty are different -- which means option<future<trailers>> might be a reasonable thing to put on the table. Right now, to make future<trailers> work, if there are no trailers, we essentially create a new future and make sure to write nothing to it.
I think you were against option<future<trailers>>, am I remembering correctly?
To elaborate on what Victor said above: there's currently a limitation in wasmtime-wit-bindgen's Component Model Async support such that sync functions can't create streams or futures, but that's a temporary limitation we plan to address soon, so no need to design around it.
thanks for the explanation, I'm fine with what you came up with
Thanks! Just to make sure I fully understand -- are you going for creating the future here (no option, i.e. returning future<trailers> no matter what) or returning an option<future<trailers>>?
If the former, a review on https://github.com/WebAssembly/wasi-http/pull/154 would be appreciated! If the latter -- I'll get to making that change ASAP
following the discussion from yesterday, I've replaced the Tokio task/channel magic with the AbortOnDropHandle machinery in https://github.com/bytecodealliance/wasip3-prototyping/pull/32, which does not rely on multi-threading anymore.
I think we need to talk a bit more about send and connect.
cc @Alex Crichton @Joel Dice
UDP implementation and tests: https://github.com/bytecodealliance/wasip3-prototyping/pull/33 - with this, we have the latest wasip3 wasi:sockets draft fully implemented and tested using the adapted wasip2 test suite. Will be starting on wasi:filesystem tomorrow.
I got sidetracked yesterday fixing the soundness issue Alex raised in wasmtime-wit-bindgen's concurrent_imports option, as well as restoring support for &mut T forwarding impls so we can generate add_to_linker functions again. The good news is I got it working (should have a PR up today). The bad news is I didn't work on anything else. These are on my TODO list, though:
- auto-closing {Stream|Future}{Writer|Reader} on drop

@Roman Volosatovs when you switched to AbortOnDropHandle, how did you handle closing the StreamWriter? Or is that still a TODO blocked on the "auto-closing {Stream|Future}{Writer|Reader} on drop" item I mentioned above?
I actually did not make any changes in that regard - there are two tasks, both spawned via the store. The tasks communicate over an mpsc.
The producer side deals with all the OS-level stuff, I/O etc., e.g. it owns the file descriptor (e.g. the TCP stream), and it does not use the Accessor at all.
The consumer side owns the StreamWriter and is responsible for closing it once the mpsc is closed and/or an unrecoverable error is encountered.
The abort handle of the producer task is stored as part of the resource state, so it's dropped once the resource is dropped, triggering e.g. the close of a TCP stream FD.
The consumer task will eventually (likely on the next poll) find out that the producer is closed and close the StreamWriter
Ah, that makes sense -- thanks for the explanation.
@Alex Crichton @Victor Adossi I've been thinking some more about how to support creating futures and streams in sync host functions, and the easiest thing for the time being is to just tell wasmtime-wit-bindgen to generate an async function if it needs to create a future or stream (even if it doesn't need to await) -- which is exactly what Victor's PR already does. The guest can call that synchronously and everything will work as it should.
Alex and I have ideas about making both the async and sync bindings more flexible and ergonomic in the future, but for the time being I think it's fine to tell the binding generator to generate an async function any time the implementation needs to await or create futures or streams.
Hello :wave: this is probably a silly question, but could someone explain the reason for changing the cli/stdio interface from getStdout() -> output-stream and getStderr() -> output-stream to setStdout(stream) and setStderr(stream)?
In the component model, the stream type always represents the readable end of a stream; there's no equivalent to output-stream for the writable end. Which is to say that the component which creates a stream using stream.new can only give away the readable end and keep the writable end for itself. So the only way to hook a stream up to stdout or stderr is to create that stream using stream.new, hold on to the writable end, and pass the readable end to e.g. setStdout.
The setStdout/setStderr API does raise an interesting question, though: if I compose two components and they both call setStdout, will one overwrite the other? I.e. is only one of the components going to be able to write to stdout?
I imagine the host could maintain some kind of magic per-component state so that they don't overwrite each other, but I'm not sure that would be virtualizable by a guest component.
@Joel Dice @Alex Crichton following our earlier discussion: https://github.com/WebAssembly/wasi-sockets/pull/119
I'll also address the listen/receive stream capturing the lifetime of the socket resource later. (On that note, lazy question: does the Rust guest wit-bindgen actually support this correctly, or does a returned stream capture the lifetime of the resource borrow the method was called on? In other words, are returned streams/futures 'static?)
I just opened an issue about the set-stdout/set-stderr question I raised above: https://github.com/WebAssembly/wasi-cli/issues/64
Joel, there are changes in wasm-tools now to update various intrinsics and line up with the spec; do you want to coordinate about landing this within wasmtime? I ask because jsturtevant is working on a bugfix at wasm-tools#2076 which will unblock wasi-tls landing (this is completely unrelated to async), but I bring this up as that'll require updating wasm-tools.
The literal update is pretty modest, but I suspect the actual changes to trampolines/tests will be a bit more invasive. In particular, core wasm signatures are changing, and things like waitable sets now need to be both created and destroyed around usages too
Getting that upstream sounds fine to me and won't disrupt the WASIp3 work until we try to merge it into the wasip3-prototyping repo, but we don't need to do that right away. So no need to delay the former, and I'm happy to help if anything comes up.
ok sounds good, I'll forge ahead when the update is necessary
although same question, but this time from wit-bindgen -- should I be careful about updating there or "update at will"?
I'd say update at will; we can delay updating the dep in the wasip3-prototyping repo as well.
@Joel Dice https://github.com/bytecodealliance/wasip3-prototyping/pull/40
test scenario::round_trip::async_round_trip_stackless_sync_import has been running for over 60 seconds
looks like CI caught it as well
filed an issue https://github.com/bytecodealliance/wasip3-prototyping/issues/41
question on CI: there's a copy of main.yml but trimmed down -- is this because when I originally made the repo I disabled main.yml (named "CI") and it was thought we needed to copy that to get something running?
I'm discovering in https://github.com/bytecodealliance/wasip3-prototyping/pull/43 that there's some sort of mixture or something going on where CI isn't actually running right now
Probably. In any case we're using wasip3-prototyping.yml (which we copied and modified) for PRs
or rather some jobs aren't running
is the goal to basically match the wasmtime repo?
if so I can set that up
(merge queues and all)
kind of, except we're skipping cargo vet for now
just vet? anything else?
That would be awesome -- I set up the copy there to mostly do the same thing as main but hopefully run slightly faster without tests that we wouldn't impact
and I think we added another test job for async specifically
the async tests aren't run otherwise?
maybe not? I haven't looked closely :shrug:
Yep, and another job for async specifically, mostly for visibility, the original goal was to only run those but figured there could be downstream effects
ok I'll work to try to figure out what's going on and see if changes are necessary
the async tests are under misc; I'm not sure if that's covered under test by default, but it entirely could be!
Yup happy with any changes there -- having merge queues and any tests that are necessary re-enabled would be nice
To be clear, the extra job Victor added runs cargo test -p component-async-tests, but there are other async-related tests elsewhere (e.g. bindgen and WA(S)T tests)
I would expect cargo test --all to run the component-async-tests, but maybe it wasn't for some reason?
I don't even know if CI runs cargo test --all
Ah, it does by default; run-tests.sh uses --workspace, so it wasn't necessary for me to use -p component-async-tests by default, but I did like having it broken out like fiber. Certainly happy with having it all get back to a state more similar to upstream
ok, in the main merge I'm deleting wasip3-prototyping.yml, I disabled cargo vet for anything outside of bytecodealliance/wasmtime (I'll land that upstream), and I'll work through the remaining failures
I'll have a follow-up which splits out the component-async-tests package to its own CI bucket which should still run on PRs by default.
I'm also enabling the merge queue. For now the strategy is set to "merge & commit", but I'm only going to leave that for this one PR. The merge strategy is defined at the repo level, not the PR level, so my plan is to change it to "squash & merge", like wasmtime has, after this merge lands
Squash & merge would complicate the eventual upstream contribution - I've set the strategy to "rebase" to allow for cherry-picking commits later
If others prefer squashes - that's fine, but I'd like to be able to keep merging my WASI PRs with rebase
oh sure I can leave that turned on yeah
it'll mostly just be annoying that if a merge from upstream back to wasip3 happens we'll have to update merge queue settings around that
Giving it a bit more thought, I don't think I actually mind squash merges at this point; my upcoming PRs should be rather small/well-scoped, so rebase or squash should not really matter any more, and the "incremental" commits I cared about are already in main
ok set to squash + merge and the next PR should go through smoothly in theory