Stream: wasmtime

Topic: tokio always reports readiness


view this post on Zulip Dave Bakker (badeend) (Feb 04 2024 at 20:53):

I'm investigating a bug in the wasi implementation, but can't figure it out.

If input-stream::blocking-read succeeds, the returned buffer must contain at least one byte. I noticed that this is not the case for TCP sockets.

I tracked it down to tokio::net::TcpStream::readable().await. That immediately reports readiness. And yet after that, the call to read call returns WouldBlock.
My initial instinct was that this would be a one-off spurious wakeup kind of deal. So I wrapped it in a loop, expecting the second await for readiness to correctly suspend.
Unfortunately, it ends up in a spin loop. Interestingly, it only happens in the async_ testsuite, not the sync tests.

A small repro here: https://github.com/badeend/wasmtime/commit/e788501c3ba8283670cfa5071d99a7565e711d32

sync::preview2_tcp_sample_application (Correct)

Outputs just once:

ready?
ready!

async_::preview2_tcp_sample_application (Incorrect)

Output (truncated):

ready?
ready!
WouldBlock
ready?
ready!
WouldBlock
ready?
ready!
WouldBlock
ready?
ready!
WouldBlock
ready?
ready!
WouldBlock
(repeated many more times...)

I see that tokio's readiness states have been a problem before in wasmtime. But I don't know if this is related, especially as in this case it's the async variant that's not working instead of the sync variant.

Any pointers?

A fast and secure runtime for WebAssembly. Contribute to badeend/wasmtime development by creating an account on GitHub.

view this post on Zulip Alex Crichton (Feb 05 2024 at 14:58):

Weird! You already found the problem line that I would have otherwise pointed out. The general shape of that problem was that we more-or-less weren't pumping the event loop or yielding properly, and once we returned to the event loop the I/O object got to realize it wasn't ready (or was, I forget which)

You also applied what I would have already suggested which is to throw a loop around the object. The documentation for try_read_buf has basically what should be happening at the wasm level as well.

Are you on Windows perhaps? If not, what system are you on?

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 15:02):

Linux

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 15:02):

(or WSL technically)

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 15:21):

I already tried sprinkling yield_now around in various places, but without effect

view this post on Zulip Alex Crichton (Feb 05 2024 at 15:42):

Can you run strace and find a loop of syscalls?

view this post on Zulip Alex Crichton (Feb 05 2024 at 15:42):

I can also test out when I land later too

view this post on Zulip Alex Crichton (Feb 05 2024 at 17:40):

I checked out the commit you referenced above (e788501c3ba8283670cfa5071d99a7565e711d32) and ran cargo test -p wasmtime-wasi sample_application and was unfortunately unable to reproduce:

test async_::preview2_udp_sample_application ... ok
test sync::preview2_udp_sample_application ... ok
test sync::preview2_tcp_sample_application ... ok
test async_::preview2_tcp_sample_application ... ok

I tested a second linux system and also didn't get a failure there. Should I be doing something else to reproduce though?

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:02):

Correct, the test doesn;t fail, because of the busy loop

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:03):

If you run with nocapture you see all the log outputs:

RUST_BACKTRACE=full WASMTIME_BACKTRACE_DETAILS=1 cargo test async_::preview2_tcp_sample_application --features wasmtime-wasi/preview2 --package wasmtime-wasi -- --nocapture

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:05):

Or revert the change in io.rs, then you'll see it fail.

view this post on Zulip Alex Crichton (Feb 05 2024 at 18:07):

Hm using that command I see one instance of ready? ready! but no more

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:10):

:thinking:

view this post on Zulip Alex Crichton (Feb 05 2024 at 18:11):

I'm also on a plane connected over ssh to the timings are probably much different than if it were local

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:35):

Hmm, when I run it on Windows natively, it works fine as well..

view this post on Zulip Dave Bakker (badeend) (Feb 05 2024 at 18:54):

May I ask what version of the Linux kernel you're running?

view this post on Zulip Alex Crichton (Feb 06 2024 at 15:42):

I tested Linux arm2-ci.infra.bytecodealliance.org 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 23:43:36 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux and Linux fhaweignbi 6.7.1-060701-generic #202401201133 SMP PREEMPT_DYNAMIC Sat Jan 20 11:43:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

view this post on Zulip Alex Crichton (Feb 06 2024 at 15:43):

so 5.15 and 6.7.1

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 12:45):

Alright thanks. I saw this stackoverflow post that describes almost the exact same issue, but that was closed without a satisfying answer.

Anyway, I ran the repro on a fresh ubuntu 22.04 install and the problem still persists there. Next, I ran a strace like you suggested:

[pid 35964] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 8
[pid 35964] epoll_ctl(5, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=0, u64=0}} <unfinished ...>
[pid 35965] <... epoll_wait resumed>[{events=EPOLLOUT|EPOLLHUP, data={u32=0, u64=0}}], 1024, 948) = 1
[pid 35964] <... epoll_ctl resumed>)    = 0
[pid 35965] epoll_wait(3,  <unfinished ...>
[pid 35964] connect(8, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("93.184.216.34")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)

(many more times...)

[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 0 (Timeout)
[pid 35964] poll([{fd=8, events=POLLOUT}], 1, 0) = 1 ([{fd=8, revents=POLLOUT}])
[pid 35965] <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=0, u64=0}}], 1024, 989) = 1
[pid 35964] getsockopt(8, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
[pid 35965] epoll_wait(3,  <unfinished ...>
[pid 35964] sendto(8, "GET / HTTP/1.1\r\nHost: example.co"..., 37, MSG_NOSIGNAL, NULL, 0) = 37
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7

(many more times...)

[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7)     = 7
[pid 35964] recvfrom(8, 0x7f73f4382b00, 10000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 35964] write(1, "WouldBlock\n", 11) = 11
[pid 35964] write(1, "ready?\n", 7)     = 7
[pid 35964] write(1, "ready!\n", 7 <unfinished ...>
[pid 35965] <... epoll_wait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=0, u64=0}}], 1024, -1) = 1
[pid 35964] <... write resumed>)        = 7
[pid 35965] epoll_wait(3,  <unfinished ...>
[pid 35964] recvfrom(8,  <unfinished ...>
[pid 35965] <... epoll_wait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=0, u64=0}}], 1024, -1) = 1
[pid 35964] <... recvfrom resumed>"HTTP/1.1 200 OK\r\nAge: 33142\r\nCac"..., 10000, 0, NULL, NULL) = 1590
[pid 35965] epoll_wait(3,  <unfinished ...>
[pid 35964] mprotect(0x7f7244120000, 65536, PROT_READ|PROT_WRITE) = 0
[pid 35964] epoll_ctl(5, EPOLL_CTL_DEL, 8, NULL) = 0
[pid 35964] close(8)                    = 0

First of all, the same problem apparently also happens while the socket is connecting.
Secondly, I'm not familar with the inner workings of tokio, but I would expect there to be a poll/epoll_wait call after the initial recvfrom returns EAGAIN.

I am developing Rust Tokio library for ISO-TP. CAN protocol, which lets you send larger messages. The program is aimed towards linux only. For this, I am using Tokio structure AsyncFd. When the wri...

view this post on Zulip Alex Crichton (Feb 07 2024 at 14:04):

Looking into this so the looping behavior on connect makes sense since we're not properly hooking into tokio, we need to do something along the lines of io.do_io(|| ...) or similar. Either that or we need to avoid using rustix here and need to use the tokio primitives, but I haven't looked too much into this.

view this post on Zulip Alex Crichton (Feb 07 2024 at 14:06):

I'm still not sure why after a read we don't hit epoll

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 14:47):

Even with these changes, connecting still performs a busy loop

…wrote TcpSocket::subscribe to only poll the event we're interested in

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 15:17):

I attached the debugger and logged tokio's internal ready state of the socket over the program's lifetime

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 15:19):

prettier names for the states:

0:  Ready::EMPTY
1:  Ready::READABLE
2:  Ready::WRITABLE
3:  Ready::READABLE | Ready::WRITABLE
4:  Ready::READ_CLOSED
10: Ready::WRITABLE | Ready::WRITE_CLOSED
14: Ready::WRITABLE | Ready::READ_CLOSED | Ready::WRITE_CLOSED
31: Ready::ALL

Results for the sync version:

wasi tcp-socket::create_tcp_socket
wasi tcp-socket::start_connect
tokio::io::Registration::try_io: ev.ready == 0
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
tokio::io::ScheduledIo::wake was called with: ready == 2
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
wasi output-stream::write
tokio::io::Registration::try_io: ev.ready == 2
crate::preview2::tcp::TcpReadStream: ready?
tokio::io::ScheduledIo::wake was called with: ready == 3
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 1
tokio::io::ScheduledIo::wake was called with: ready == 2
wasi tcp-socket::drop

Seems all ok to me.

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 15:20):

But the async_ version:

wasi tcp-socket::create_tcp_socket
tokio::io::ScheduledIo::wake was called with: ready == 14
wasi tcp-socket::start_connect
tokio::io::Registration::try_io: ev.ready == 10
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
wasi tcp-socket::finish_connect
crate::preview2::tcp::TcpSocket: ready?
crate::preview2::tcp::TcpSocket: ready!
tokio::io::ScheduledIo::wake was called with: ready == 2
wasi tcp-socket::finish_connect
wasi output-stream::write
tokio::io::Registration::try_io: ev.ready == 10
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio's TcpStream::try_read_buf returned WouldBlock
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio's TcpStream::try_read_buf returned WouldBlock
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio's TcpStream::try_read_buf returned WouldBlock
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio's TcpStream::try_read_buf returned WouldBlock
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio's TcpStream::try_read_buf returned WouldBlock
crate::preview2::tcp::TcpReadStream: ready?
crate::preview2::tcp::TcpReadStream: ready!
wasi input-stream::read
tokio::io::Registration::try_io: ev.ready == 4
tokio::io::ScheduledIo::wake was called with: ready == 3
tokio::io::ScheduledIo::wake was called with: ready == 2
wasi tcp-socket::drop
tokio::io::ScheduledIo::wake was called with: ready == 31

The socket immediately jumps to Ready::WRITABLE | Ready::READ_CLOSED | Ready::WRITE_CLOSED after creation.
:face_with_raised_eyebrow:

view this post on Zulip Alex Crichton (Feb 07 2024 at 15:46):

weird! From the syscalls above that's probably:

[pid 35964] epoll_ctl(5, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=0, u64=0}} <unfinished ...>
[pid 35965] <... epoll_wait resumed>[{events=EPOLLOUT|EPOLLHUP, data={u32=0, u64=0}}], 1024, 948) = 1

at the very beginning

view this post on Zulip Alex Crichton (Feb 07 2024 at 15:46):

if it thinks it's hung up that makes sense why it keeps returning immediately

view this post on Zulip Alex Crichton (Feb 07 2024 at 15:46):

why it things it's hung up though is a mystery

view this post on Zulip Alex Crichton (Feb 07 2024 at 15:48):

I wonder if this is like WSL-specific behavior where we have to avoid adding the socket to the epoll set until after the connect is issued? Or something like that?

view this post on Zulip Dave Bakker (badeend) (Feb 07 2024 at 16:12):

I don't think this is WSL specific.

  1. https://stackoverflow.com/questions/14765203/why-am-i-getting-the-epollhup-event-on-a-brand-new-socket.
  2. https://issues.apache.org/jira/browse/MESOS-9877 "In Linux, calling epoll() on a TCP socket before calling connect() will return an EPOLLHUP event on that socket."
  3. https://groups.google.com/g/comp.os.linux.development.apps/c/ifXtb2dsF3g "EPOLLHUP means the
    other end of a communication was closed. In this particular case, it
    is returned for an unconnected stream socket."

I suspect this has to do with the fact that we forcefully construct a tokio::net::TcpStream before socket has been connected and that tokio was just never designed to handle this.

I have some code that when run on a virtual machine is misbehaving for some reason. The order of initialization is: s_listen = socket(...) bind(s_listen, ...) epoll_ctl(epfd, EPOLL_CTL_ADD, s_lis...

view this post on Zulip Alex Crichton (Feb 08 2024 at 02:51):

This might be possible to fix by sticking to tokio types rather than using rustix perhaps? We might be able to leverage rustix for anything tokio doesn't implement but basic things like connection and such seems like it should be doable with tokio


Last updated: Oct 23 2024 at 20:03 UTC