wasmtime / issue #6811 Wasi Preview 2 Async Race Condition · git-wasmtime

Moving streams to async has introduced a race condition. The AsyncWriteStream often gets dropped before the receiver inside the spawned task has time to receive a message.

This makes sense as there's nothing to tie the spawned task's completion to the Store that contains the AsyncWriteStream. It's very easy to drop the Store before any handlers are allowed to run to completition.

Test Case

use wasmtime::{
    component::{Component, Linker},
    Config, Engine, Store,
};
use wasmtime_wasi::preview2::{Table, WasiCtx, WasiCtxBuilder, WasiView};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    config.wasm_component_model(true).async_support(true);
    let engine = Engine::new(&config)?;
    let component = Component::from_file(&engine, "example.wasm")?;
    let mut table = Table::new();
    let ctx = WasiCtxBuilder::new().inherit_stdio().build(&mut table)?;
    let mut store = Store::new(&engine, Data { table, ctx });
    let mut linker = Linker::new(&engine);
    wasmtime_wasi::preview2::command::add_to_linker(&mut linker)?;
    let instance = linker.instantiate_async(&mut store, &component).await?;
    let run = instance.get_typed_func::<(), ()>(&mut store, "hello")?;
    run.call_async(&mut store, ()).await?;
    Ok(())
}

struct Data {
    table: Table,
    ctx: WasiCtx,
}

impl WasiView for Data {
    fn table(&self) -> &wasmtime_wasi::preview2::Table {
        &self.table
    }

    fn table_mut(&mut self) -> &mut wasmtime_wasi::preview2::Table {
        &mut self.table
    }

    fn ctx(&self) -> &wasmtime_wasi::preview2::WasiCtx {
        &self.ctx
    }

    fn ctx_mut(&mut self) -> &mut wasmtime_wasi::preview2::WasiCtx {
        &mut self.ctx
    }
}

The example.wasm file is just a simple wasm component that has one export hello which just uses Rust's println! to print "Hello, World!" to stdout.

This technically works, but only by coincidence. If we see when the "Hello, World" gets flushed to stdout, it happens after the spawned task that is handling IO is aborted in the Drop handler.

Steps to Reproduce

You can reproduce those by running the above example with a local checkout of wasmtime. Simply add a logging statement to the drop handler of AsyncWriteStream and you'll see that the drop (and thus the cancellation of the spawned task) happens before the output to stdout.

I've found this to be reliably reproducible in a user visible way if the wasm binary prints more than 1 line of text to stdout. The first line usually gets successfully flushed, but the second does not. Adding something to slow down the execution of the runtime can eliminates the bug (e.g., adding tokio::time::sleep(std::time::Duration::from_millis(1)).await; after the function call, gets rid of the race condition).

I first caught this bug first when upgrading Spin to the latest version of wasmtime. The exported function would complete and the containing Store would be dropped before the underlying AsyncWriteStream's spawned task had a chance to receive the bytes to be written to the inner AsyncWrite type. Only every ~10th invocation would see the spawned task actually process the bytes before the AsyncWriteStream's drop handler canceled the task.

Expected Results

I would expect that there would be some mechanism for ensuring that ongoing async writes complete before the Store is dropped.

Stream: git-wasmtime

Topic: wasmtime / issue #6811 Wasi Preview 2 Async Race Condition

Wasmtime GitHub notifications bot (Aug 07 2023 at 11:02):

Test Case

Steps to Reproduce

Expected Results

Versions and Environment

Wasmtime GitHub notifications bot (Aug 07 2023 at 11:02):

Wasmtime GitHub notifications bot (Aug 07 2023 at 15:32):

Wasmtime GitHub notifications bot (Aug 07 2023 at 17:53):

Wasmtime GitHub notifications bot (Aug 08 2023 at 08:59):

Wasmtime GitHub notifications bot (Aug 08 2023 at 09:00):

Wasmtime GitHub notifications bot (Aug 08 2023 at 15:42):

Wasmtime GitHub notifications bot (Sep 12 2023 at 18:14):

Test Case

Steps to Reproduce

Expected Results

Versions and Environment