wasmtime / issue #7243 wasi-sockets: Introduce UDP streams · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #7243 wasi-sockets: Introduce UDP streams

Wasmtime GitHub notifications bot (Oct 16 2023 at 16:28):

Thanks for this! I've got a few comments on some API design decisions as well as their implications on the implementation:

Primarily I think it would be best to avoid the usage of Mutex<T> in the implementation. I feel like it's best to reflect the concurrency of the underlying OS which is that mutexes aren't required and concurrent calls to various bits and pieces are ok. I realize that mutexes aren't held during blocking I/O operations as a part of this PR, but that effectively means that the lock is overhead and I'd prefer to do away with it. I tested this out locally and the main issue that I ran into was that the inbound/outbound streams need access to the remote_address() of the state of the socket. This is problematic because, theoretically, a start_connect can race with a send or receive call, meaning that a Mutex is basically required to arbitrate that race (or something atomic like that). Could this perhaps be altered at the API level where an inbound/outbound stream is returned from finish-connect in addition to finish-bind? That way the streams from bind are never connected and the streams from connect are always connected. That may have weird implications if all the streams are used, though, so I'm not 100% sure.

Bikeshed-wise the name inbound-* and outbound-* are inconsistent with wasi:io/streams usage of input-* and output-*. Is it perhaps worth it to use the same terminology?

Bikeshed-wise the oubound-datagram-stream type is similar to output-stream but has a different mechanism for receiving data. The output-stream type for example has a check-write method which returns how much can be written and only then can write be called. Additionally write traps if too much is given to it. The main rationale for this design decision is that it avoids the need for callees to receive a possibly unbounded amount of data forcibly within their module (since calling send will copy over all the data into the second module if there is one). This may best be solved with a check-write lookalike for outbound-datagram-stream as well perhaps?

Wasmtime GitHub notifications bot (Oct 16 2023 at 17:55):

badeend commented on issue #7243:

I think it would be best to avoid the usage of Mutex<T> in the implementation.

I agree, yet I couldn't figure out how to make it work.
My initial version had bind and connect both return streams. (Also I allowed connect to be called multiple times, but I didn't want to pollute this PR too much).
In that setup, calling connect has to invalidate the previously returned streams somehow. At which point I ended up at Mutex again. Allowing the previous streams to remain somehow usable might be easier to implement, but is arguably a worse API design.

Bikeshed-wise the name inbound-* and outbound-* are inconsistent with wasi:io/streams usage of input-* and output-*. Is it perhaps worth it to use the same terminology?

I (tried to) use the wasi-http naming scheme, as they feel more natural to the domain at hand (networking) as opposed to input&output.
Sidenote, I just noticed that wasi-http actually uses slightly different terminology (incoming&outgoing) :no_mouth:

For the stream types, I'm fine with either naming scheme. They're temporary anyway.
For the datagram types themselves I do prefer to keep calling them incoming&outgoing-datagram rather than input&output-datagram.

The output-stream type for example has a check-write method which returns how much can be written and only then can write be called

Good point. I'll look into it.

The main rationale for this design decision is that it avoids the need for callees to receive a possibly unbounded amount of data forcibly within their module

Even with a check-write method on outbound-datagram-stream that can still happen, right. I assume check-write will return the amount of datagrams that can be sent. But each datagram can still contain an unbounded number of bytes in the data field.

Wasmtime GitHub notifications bot (Oct 16 2023 at 18:05):

alexcrichton commented on issue #7243:

One way to get invalidation working without Mutex is to have a generational counter stored in each incoming/outgoing stream which is compared to the "current generation", an atomic counter, stored within the Arc. That way invalidation would be "increment the generation" and that may work out?

For naming ok sounds like we're already a bit inconsistent, so I think it's ok to keep the names here :+1:

For check-write you're correct that you can always still pass in more data, but implementations can trap when this happens vs being expected to work correctly. I also think that for datagrams the number returned from check-write should probably be "size of all datagrams summed up" since you're right that the size of an individual datagram can change. Unless we can place a hard cap on the size of the datagram though in which case implementations could trap if any individual datagram is too large.

Wasmtime GitHub notifications bot (Oct 16 2023 at 18:07):

badeend commented on issue #7243:

Potentially answering my own question:

calling connect has to invalidate the previously returned streams somehow

Rather than performing validation in the *-stream code, maybe we can require the consumer to have dropped all prior child streams before calling connect (again) and trap otherwise. That way it should be impossible to have multiple competing streams active at the same time.

Wasmtime GitHub notifications bot (Oct 16 2023 at 18:09):

alexcrichton commented on issue #7243:

Ah I think we raced there a bit, but I like your idea more of requiring the streams are dropped than my idea of a generation counter

Wasmtime GitHub notifications bot (Oct 23 2023 at 20:34):

badeend commented on issue #7243:

This is ready for review again.

I have:

Updated the API to disallow multiple pairs of active streams.

Updated the API to mostly match wasi-io's input-stream & output-stream.

Removed the mutex

Renamed the resources to align with wasi-http.

Wasmtime GitHub notifications bot (Oct 24 2023 at 20:00):

badeend commented on issue #7243:

Alright, looks like the MacOS issue is fixed now

Last updated: Apr 17 2025 at 22:03 UTC