Stream: git-wasmtime

Topic: wasmtime / PR #11658 fix(p3-http): flaky `content-length`...


view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 17:01):

rvolosatovs opened PR #11658 from rvolosatovs:fix/flaky-content-length-test to bytecodealliance:main:

Closes #11656

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 17:03):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 17:48):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 17:52):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 17:59):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 18:10):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 18:11):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2025 at 18:36):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 09:33):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 09:45):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:23):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:24):

rvolosatovs edited PR #11658:

Closes #11656

The test was flaky because of a bug in the test HTTP server logic: a short write on the 2nd connection would sometimes cause an error, after which the server stopped accepting connections and the 3rd connection could not be established.
The underlying cause is a race condition in the "short write" case: the client might not even have started sending the request body to the server by the time the consumer catches the error and the I/O driver task is consequently dropped. When the body had not yet started to be transmitted, Hyper treated the exchange as a success and the 3rd connection was accepted. In the rare cases where the request body had already started streaming, connection handling failed server-side due to the short write, which aborted the accept loop and caused a "connection refused" error in the guest for the 3rd connection. The content-length check for the 3rd case was therefore never triggered, which caused a panic on the transmit.expect_err: from the wasi:http perspective, the transmission future did not encounter errors because it never started. To address the last part, I've also pushed https://github.com/bytecodealliance/wasmtime/pull/11658/commits/db3cbac35da23760997cf4ba44964bf7a357af26 to make sure the content-length check happens early, even if the GuestBody has already been dropped by the time the guest tries to write.
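
The accept-loop failure described above can be illustrated with a minimal sketch. It is not the PR's actual test server: it uses blocking `std::net` sockets instead of Hyper, and `handle_conn`, `serve`, and `run_demo` are hypothetical names chosen for illustration. The point it shows is that a per-connection error (here, a short write from the client) is recorded and the loop keeps accepting, whereas propagating that error out of the loop would stop the server before the 3rd connection.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Hypothetical per-connection handler: expects exactly `body_len` bytes.
/// A client that sends fewer bytes and then closes its write half
/// (a "short write") makes `read_exact` fail with `UnexpectedEof`.
fn handle_conn(mut stream: TcpStream, body_len: usize) -> io::Result<()> {
    let mut buf = vec![0u8; body_len];
    stream.read_exact(&mut buf)?;
    stream.write_all(b"ok")
}

/// Accepts `n` connections and records whether each one succeeded.
/// A failure while handling one connection does not end the loop;
/// returning early on that failure would leave later clients unserved.
fn serve(listener: TcpListener, n: usize, body_len: usize) -> Vec<bool> {
    let mut outcomes = Vec::with_capacity(n);
    for _ in 0..n {
        match listener.accept() {
            Ok((stream, _peer)) => outcomes.push(handle_conn(stream, body_len).is_ok()),
            Err(_) => outcomes.push(false),
        }
    }
    outcomes
}

/// Drives three sequential clients against the server: full body,
/// short body, full body. Returns the server-side outcomes and the
/// replies each client read back.
fn run_demo() -> (Vec<bool>, Vec<String>) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind");
    let addr = listener.local_addr().expect("local_addr");
    let server = thread::spawn(move || serve(listener, 3, 4));

    let bodies: [&[u8]; 3] = [b"full", b"sh", b"full"];
    let mut replies = Vec::new();
    for body in bodies {
        let mut client = TcpStream::connect(addr).expect("connect");
        client.write_all(body).expect("write");
        // Closing the write half signals end of body to the server.
        client.shutdown(Shutdown::Write).expect("shutdown");
        let mut reply = String::new();
        let _ = client.read_to_string(&mut reply);
        replies.push(reply);
    }
    (server.join().expect("server thread"), replies)
}
```

Calling `run_demo()` from a `main` function should show the 2nd connection failing on the server while the 3rd is still accepted and answered, which is the behavior the flaky test depended on.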

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:26):

rvolosatovs edited PR #11658:

Closes #11656

We paired with @alexcrichton yesterday debugging this (thank you, @alexcrichton!) and here's the summary of what we've found out:
The test was flaky because of a bug in the test HTTP server logic: a short write on the 2nd connection would sometimes cause an error, after which the server stopped accepting connections and the 3rd connection could not be established.
The underlying cause is a race condition in the "short write" case: the client might not even have started sending the request body to the server by the time the consumer catches the error and the I/O driver task is consequently dropped. When the body had not yet started to be transmitted, Hyper treated the exchange as a success and the 3rd connection was accepted. In the rare cases where the request body had already started streaming, connection handling failed server-side due to the short write, which aborted the accept loop and caused a "connection refused" error in the guest for the 3rd connection. The content-length check for the 3rd case was therefore never triggered, which caused a panic on the transmit.expect_err: from the wasi:http perspective, the transmission future did not encounter errors because it had never even begun. To address the last part, I've also pushed https://github.com/bytecodealliance/wasmtime/pull/11658/commits/db3cbac35da23760997cf4ba44964bf7a357af26 to make sure the content-length check happens early, even if the GuestBody has already been dropped by the time the guest tries to write.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs has marked PR #11658 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs requested wasmtime-wasi-reviewers for a review on PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs requested fitzgen for a review on PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs requested wasmtime-core-reviewers for a review on PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs requested alexcrichton for a review on PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 10:29):

rvolosatovs has enabled auto merge for PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 11:10):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 11:11):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 13:22):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 13:24):

rvolosatovs commented on PR #11658:

I've been working on some refactoring and on adding content-length validation for requests whose bodies originate from the host. Since it's all related to content-length, I went ahead and pushed it to this PR to simplify the review/merge process.
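
As a rough illustration of what content-length validation of a streamed body involves, here is a minimal sketch. It is not the code from this PR: `ContentLengthCheck`, `BodySizeError`, `on_chunk`, and `finish` are hypothetical names, and the sketch only assumes that the declared length is known up front and that chunk sizes are reported as they are written. Checking each chunk before it is forwarded lets an over-long body be rejected early, and a final check catches a body that ended short.

```rust
/// Hypothetical error type for a body whose size disagrees with
/// the declared content-length.
#[derive(Debug, PartialEq)]
enum BodySizeError {
    /// Writing the chunk would exceed the declared length.
    TooLong { declared: u64, attempted: u64 },
    /// The body ended before the declared length was reached.
    TooShort { declared: u64, sent: u64 },
}

/// Hypothetical tracker comparing bytes written against a declared length.
struct ContentLengthCheck {
    declared: u64,
    sent: u64,
}

impl ContentLengthCheck {
    fn new(declared: u64) -> Self {
        Self { declared, sent: 0 }
    }

    /// Validates a chunk before it is forwarded. On error the chunk is
    /// not counted, so the tracker state stays unchanged.
    fn on_chunk(&mut self, len: usize) -> Result<(), BodySizeError> {
        let attempted = self.sent.saturating_add(len as u64);
        if attempted > self.declared {
            return Err(BodySizeError::TooLong {
                declared: self.declared,
                attempted,
            });
        }
        self.sent = attempted;
        Ok(())
    }

    /// Validates the total once the body has ended.
    fn finish(&self) -> Result<(), BodySizeError> {
        if self.sent < self.declared {
            return Err(BodySizeError::TooShort {
                declared: self.declared,
                sent: self.sent,
            });
        }
        Ok(())
    }
}
```

Because the check depends only on the declared length and a running byte count, it can run at the write site itself and does not need the receiving side of the body to still be alive.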

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 13:25):

rvolosatovs edited PR #11658:

Closes #11656

We paired with @alexcrichton yesterday debugging this (thank you, @alexcrichton!) and here's the summary of what we've found out:
The test was flaky because of a bug in the test HTTP server logic: a short write on the 2nd connection would sometimes cause an error, after which the server stopped accepting connections and the 3rd connection could not be established.
The underlying cause is a race condition in the "short write" case: the client might not even have started sending the request body to the server by the time the consumer catches the error and the I/O driver task is consequently dropped. When the body had not yet started to be transmitted, Hyper treated the exchange as a success and the 3rd connection was accepted. In the rare cases where the request body had already started streaming, connection handling failed server-side due to the short write, which aborted the accept loop and caused a "connection refused" error in the guest for the 3rd connection. The content-length check for the 3rd case was therefore never triggered, which caused a panic on the transmit.expect_err: from the wasi:http perspective, the transmission future did not encounter errors because it had never even begun. To address the last part, I've also pushed https://github.com/bytecodealliance/wasmtime/pull/11658/commits/db3cbac35da23760997cf4ba44964bf7a357af26 to make sure the content-length check happens early, even if the GuestBody has already been dropped by the time the guest tries to write.

In this PR I've also:

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 13:34):

rvolosatovs updated PR #11658.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 10 2025 at 13:43):

rvolosatovs updated PR #11658.


Last updated: Dec 06 2025 at 06:05 UTC