Yesterday we discussed with @Joel Dice, @Till Schneidereit and others that TCP+TLS handshakes typically have significant network latency.
That latency is not a great fit for a hosting model where we want to create an application/component instance per incoming request/event.
In dotnet application servers this is solved by connection pooling. But that is not compatible with instance-per-request hosting.
So I wonder if we could have something like network session pooling as WASI interface. It could be implemented by the host or by another component with longer lifecycle.
We are designing the TLS stream as a transformer, without the TCP connection. But I wonder if a wrapper of that together with the TCP connection is the right direction? Let's call it "TLS session pooling".
Problems I can see are in security and state management of such session.
The session would typically be authenticated as a particular DB user, using a password or a private key on the TLS layer.
A generic session cache would have no way of knowing how to do the application-level (SQL) handshake, like login, selection of the database schema, etc.
So the creation of a new session would have to be done inside the application component?
Also the application should not release the session back to the pool, unless it's in some base state. For example, no open DB transaction.
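To make the "base state" rule concrete, here is a minimal Rust sketch of a pool that refuses sessions with an open transaction. `Session`, `SessionPool` and the `in_transaction` flag are hypothetical names used only for illustration. A real pool could not simply trust a flag set by application code and would need protocol-level knowledge to detect a dirty session.

```rust
use std::collections::VecDeque;

/// Hypothetical session type. `in_transaction` stands in for any
/// protocol state that would make reuse unsafe.
#[derive(Debug)]
pub struct Session {
    pub id: u32,
    pub in_transaction: bool,
}

/// Hypothetical pool of idle sessions.
#[derive(Default)]
pub struct SessionPool {
    idle: VecDeque<Session>,
}

impl SessionPool {
    /// Accepts the session only if it is in its base state.
    /// A dirty session is handed back to the caller, which would
    /// normally close it instead of pooling it.
    pub fn release(&mut self, session: Session) -> Result<(), Session> {
        if session.in_transaction {
            return Err(session);
        }
        self.idle.push_back(session);
        Ok(())
    }

    /// Takes the oldest idle session, if there is one.
    pub fn acquire(&mut self) -> Option<Session> {
        self.idle.pop_front()
    }
}
```

The `Err(session)` return keeps ownership with the caller, so a rejected session cannot end up in the pool by accident.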
Alternatively I can see a different design, where Microsoft.Data.SqlClient would be a long-lived WASI component living side by side with the short-lived request handler WASI component.
That would make the whole affair very specific to SQL server and dotnet. We could have bespoke WIT for that.
The benefit would be that the existing code is solving those security/state problems already.
I would call that "SQL server connection pooling component for dotnet".
In any case, it seems to me we could not make it work transparently without WASI specific changes in Microsoft.Data.SqlClient, right ?
I can see that other long lived protocols may have similar problems, if they are implemented in terms of wasi:sockets, rather than by the WASI host. Web Sockets and HTTP/3 come to mind.
So I think it would be good to establish at least some common best practice guidance.
this is of course a great and difficult conversation. I'd chatted a bit with Till periodically about similar things
what follows is my take, with which everyone should feel free to disagree.
I see several variables at play:
in very short-lived functions, CDN functions for example, you really do not WANT threading because scheduling/orch of external work is not really what the function is for. You want execution and cleanup. In these cases:
and for those kinds of functions, you're very likely to ship one component with everything in it, as there are only a few functions you're implementing/using.
for me, these don't cause too much trouble
what does is the real world, in which achieving high throughput requires things we do not yet have: threads, streams and so on. it also requires components that hold state (caches or pools) for the shorter-lived items that use them -- here, connection pooling is a great example. So I immediately think of layers of components, most of which do basic work we already do in native "servers" or "clients that handle lots of things on behalf of various functions"
wasi:sockets would be the bottom layer, in this view, and then wasi:tcp and wasi:http. Somewhere at the appropriate layer, wasi:tls would be involved -- it handles that portion of the network handshake.
so this kind of layering might require additions to wasi:http/tcp and so on to do the tls dancing by calling out to a wasi:tls implementation. in each case, we lean into the security boundary of a component to protect against memory attacks from outside the component (keeping always in mind the lack of readonly memory).
one quick note before replying to other parts: wasi:http is very intentionally not specified in terms of wasi:sockets, so that layering picture doesn't reflect how things are actually set up. This is pretty key, because it means that wasi:http isn't restricted to functionality that can be expressed in terms of wasi:sockets, nor does being able to implement wasi:http need to imply also being able to implement wasi:sockets (see browsers as an example of the latter)
in the usual case, these would all be shipped as a "stack" of components that do the right thing, used by a client that invokes the highest abstraction it's necessary to use.
I also don't think that we need to change any of this to address the issues Pavel raised
Pavel, Joel, and I had a chance to talk about this during a meeting yesterday, and my following thoughts are substantially based on that conversation:
I agree with Pavel that establishing a new TCP+TLS connection in terms of wasi:sockets each and every time will be prohibitively costly and inefficient. I don't think that'll change in any meaningful way with WASIp3+, nor do intra-component security considerations change the picture all that much, but I'd like to understand your argument about that better, @Ralph
Till, that's a great point -- and one I love. I'm using a layering metaphor merely to ensure that we take into account the component boundary as a feature for memory, and the fact that the higher-level abstraction most people should use -- like wasi:http -- means that you can't just reach down and grab wasi:sockets from guest code.
I also don't think threads really are involved all that much in this. In dotnet specifically, the connection pool is implemented in terms of threads, but that wouldn't have had to be the case. And to me the instance lifetime issues are the much more substantial concern
Till Schneidereit said:
Pavel, Joel, and I had a chance to talk about this during a meeting yesterday, and my following thoughts are substantially based on that conversation:
I agree with Pavel that establishing a new TCP+TLS connection in terms of wasi:sockets each and every time will be prohibitively costly and inefficient. I don't think that'll change in any meaningful way with WASIp3+, nor do intra-component security considerations change the picture all that much, but I'd like to understand your argument about that better, Ralph
Take the abstraction and figure out the path that relieves it. connection pooling is a cache to enable shorter-lived things to NOT do connection creation. and this happens at multiple layers with varying kinds of data. Caches are great things.
Ah, that makes sense, and I strongly agree!
(will continue in a bit, have to switch trains)
so I'm thinking out loud about how you would establish a coherent http/tls story that doesn't just open calls up to everyone. Maybe we need to! But I'd like to think that whatever needs to establish secure connections AND pool them might be their own components that are typically configured together. Yes, the user might code to wasi:sql (for one example) and oracle:sql (for another) but we wouldn't be building either with full access to all the calls involved.
I happen to love the component memory boundary as a feature, and I look for places to lean in.
but when we're building the innards of the core protocols, it's possible we can't do it easily -- yet. And this is where the threading/streams comes in. Once you have threads, you can have async scale processing that takes advantage of cores. That means that "prohibitively costly and inefficient" will become less so. Once you have streams, you can have network filters that can actually approach native speeds (which can't happen with copying that fast).
a real web server does several layers of caching different things and each one is managed using thread systems. They max out the OS functionality to the very best of their ability. There is no way we could hope to approach that in components until we have similar OS-like capabilities. Maybe even then we don't get close enough! But that's the point I'm trying to make about the difference between p2 and p3+.
I'm thinking that read-only "caching" of the sessions would be ideal from layering and security perspective. We know how to do that for HTTP. Maybe the host uses keep-alive, but individual HTTP requests are well isolated and long-lived aspect is no business of the application code.
This is not the case with SQL session, you use SET NOCOUNT in your session and now the session is "dirty".
F5's unitd absolutely screams using wasi:http and it's because it handles all the networking.
I'll be very interested to hear Till's ideas here when he gets on the next train. But wrt "This is not the case with SQL session, you use SET NOCOUNT in your session and now the session is 'dirty'", how would you model that abstractly now?
it sounds like a sql conn is potentially dirty and potentially clean.....
what are the consequences of a dirty conn, here? That the conn is long lived and has state floating around but is reused anyway?
And probably the implementation of the pooling in Microsoft.Data.SqlClient is able to deal with that already. Modeling it abstractly .. we can't trust the application code to say "i made it dirty" with confidence.
SHOULD the app code be able to do that? currently, the answer is yes?
and also, can it change the choice dynamically?
In the case of MSSQL specifically: when the short-lived client exits, the host can call sp_reset_connection to "clean up" the session and get it ready for the next client session, right?
does <3 mean yes?
:-)
It means I didn't know about it. And that I love that it's there. Is it "clean enough"? IDK
guess who gets to make that decision? :-P
question: does Microsoft.Data.SqlClient only connect to mssql? or can u use it against other dbs?
MSSQL only
there is dedicated https://github.com/npgsql/npgsql for example
But, AFAIK, every major database has its own equivalent of sp_reset_connection
So that's a research point, because something like that will really help this conversation about sql
Correct
does HTTP/3 have such thing ?
http/3 is wild, imho
that one requires thought. I still think the focus on lifetimes of things doing dependent caching for layers above is the thing that pops the design free
you can handle varying lifetimes in the same component, but without internal threading that's going to bog down
but kept separate, you have more possibilities and are likely leaning into the component memory boundary feature
that's my take -- you all are much more intelligent than me here
I don't see what threads have to do with this, though. The real issue to me seems to be: how to compose components with varying instance lifetimes.
E.g. in this case there should be a SQL "driver"/"service" instance with a longer lifetime than any of its short-lived consumers. _Without_ having to special case everything as special wasmtime/host behavior
What I'm saying is that currently if you wanna do pooling, your pool is going to want to scale out and that's done using async/threads.
if you wanted a wasi:connectionpooling impl, you're either going to have everything be a component inside or you're going to use async/threads inside because that's something you already know how to do
Right, in the current world, you'd need to set up one long-lived component instance that handles _all_ requests.
because you can't create pollable in guest, right ?
now, @Dave Bakker (badeend) you're right in your focus on "compose components with varying lifetimes". Right now, using wasi:http, we don't use threads to go fast! in fact, threads get in the way. The question becomes more important once "people" want to build a connectionpooling component. They can do it using subcomponents as shorter-lived items that the outer component manages. it's essentially a very small "serverless" approach to avoid threads.
well, I had written a long thing, which didn't go through because WIFI on trains, and now Zulip reloaded and (properly: correctly) decided that all of that was too poorly worded to retain
Till Schneidereit said:
well, I had written a long thing, which didn't go through because WIFI on trains, and now Zulip reloaded and (properly: correctly) decided that all of that was too poorly worded to retain
NEVER, TILL, NEVER!!!!!
in centralized services, you're always going to be handling the really large scale, long lived stuff outside the guest function
you don't need a "host component" for that, even if you could do it.
too kind, too kind
because you can't create pollable in guest, right ?
It Depends™. There exists a conversation somewhere on this Zulip with much more background on that. But the TLDR is somewhere on the spectrum between "No" and "Yes, but it will be a lot of work"
but I'm thinking again of the hardware gateway that won't be updated for years and for which someone might want the entire webserver in a component.
anyway, I propose we set threading aside, because I think we can fully assume that we want to have a way to do pooling without requiring (very) long-lived instances
Till Schneidereit said:
anyway, I propose we set threading aside, because I think we can fully assume that we want to have a way to do pooling without requiring (very) long-lived instances
agreed, officially set that aside, back to sql conn pooling/dirty/clean
While the "dirty/clean" aspect is an important prerequisite for connection sharing to work, it's not really of importance to the WASI/WIT/Components discussion. Either the underlying protocol supports it (HTTP, SQL, ..) and it can be implemented by an implementation-specific "driver". Or: the protocol doesn't support it, in which case there's also no need to think about it any further here :P
Great job! our work is done. :-)
Nice, let's take the rest of the day off :palm_tree:
already drinking
exactly! (And I now have proof that Zulip was correct in eating my homework: that's much more concise than what I had)
What I'm imagining as a minimum client connection pooling API for wasi:tls would be roughly this:
interface client-connection-pool {
    put: func(connection: client-connection, identities: option<list<borrow<private-identity>>>);
    get: func(identities: option<list<borrow<private-identity>>>) -> option<result<client-connection>>;
}
The idea being that for connections that make use of client certificates, you must prove that you'd be able to create a new connection with the same certificate, otherwise you shouldn't get to reuse an existing one.
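The "prove you could recreate it" rule can be sketched in Rust as a pool keyed by the set of identities presented. This is only an illustration under assumptions: `IdentityId` is a hypothetical stand-in for holding a `private-identity` handle, `ClientConnection` is a placeholder, and `get` is simplified to return a plain `Option` rather than the `option<result<...>>` from the WIT above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a `private-identity` handle.
/// Being able to present one is treated as the proof.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct IdentityId(pub u64);

/// Placeholder for a pooled TLS client connection.
#[derive(Debug, PartialEq)]
pub struct ClientConnection {
    pub host: String,
}

type PoolKey = Option<Vec<IdentityId>>;

/// Normalizes the presented identities so that order and
/// duplicates do not matter when matching.
fn key(identities: Option<&[IdentityId]>) -> PoolKey {
    identities.map(|ids| {
        let mut sorted = ids.to_vec();
        sorted.sort();
        sorted.dedup();
        sorted
    })
}

#[derive(Default)]
pub struct ClientConnectionPool {
    idle: HashMap<PoolKey, Vec<ClientConnection>>,
}

impl ClientConnectionPool {
    /// Stores a connection under the identities it was created with.
    pub fn put(&mut self, conn: ClientConnection, identities: Option<&[IdentityId]>) {
        self.idle.entry(key(identities)).or_default().push(conn);
    }

    /// Returns a connection only to a caller presenting the same identities.
    pub fn get(&mut self, identities: Option<&[IdentityId]>) -> Option<ClientConnection> {
        self.idle.get_mut(&key(identities))?.pop()
    }
}
```

A caller presenting no identities, or different ones, gets `None` and would have to open a fresh connection.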
I guess it'd make sense to add a few things such as optional TTL setting and such
Zulip has weird opinions on how to highlight .wit
@Dave Bakker (badeend) do you see any reason why we wouldn't be able to implement this kind of pool?
oh also, I think it'd make sense to have the same kind of pool for non-TLS socket connections
Hmm. I'd have to think about it more.
My initial reaction is that TCP/TLS sockets is the wrong abstraction level (too low) to provide a pool for, as resetting a connection requires higher-level knowledge on how to do that (e.g. the sp_reset_connection example from above).
my thinking is that, outside of a hypothetical wasi:mssql, it should be up to the component to ensure that the connection is ready for reuse. Yes, that does mean that there's a risk of improperly reusing a connection, but that seems pretty fundamental to me (again, outside of higher-level interfaces)
i think that's the proper responsibility of the component, absolutely.
Maybe, that can only work if the components are cooperating and can 100% trust each other.
I would imagine that the most common scenario is for this pool to be implemented by the host
Till Schneidereit said:
I would imagine that the most common scenario is for this pool to be implemented by the host
most commonly yes, but in the future a long lived client component could want to pool a large number of calls as well. But first things first.
would imagine that the most common scenario is for this pool to be implemented by the host
Ok. My comment was targeted at:
it should be up to the component to ensure that the connection is ready for reuse.
true, yes. But in that case it seems like you're fundamentally trusting the pooling component to give you back a connection with the same state that you'd have put it in, and conversely the pooling component fundamentally trusts its clients to properly clean up connections before putting them into the pool
yes, this must be the case. the pooling component is an inner component of the ultimate used connection manager. Does that make any sense?
ah, that gets to another thing Pavel and I talked about yesterday: ideally the pooling mechanism would be client-isolated. I.e., you'd not share a pool with other client components, so you get to rely on the exact set of properties you ensure for pooled connections
and certainly one would never ever ever share a pool across tenants
like, boooooooooooo
Makes sense
I think for component composition scenarios that'd largely Just Happen, but we'd absolutely want to specify this as part of the semantics
(then again, we don't even have the spec mechanisms for composing components with differing lifetimes, so who knows whether it'd still Just Happen once we have those)
that's interesting: I was unaware we hadn't fleshed out what happens with different lifetimes. hmmmm. Or are you just saying we don't have the language to describe that?
it seems like you're fundamentally trusting the pooling component to give you back a connection with the same state that you'd have put it in, and conversely the pooling component fundamentally trusts its clients to properly clean up connections before putting them into the pool
Right. And that's exactly why I'm doubting this solution path. Ideally, a component shouldn't have to worry about pooling at all and would be able to just say "give me a SQL[1] connection, I'll drop it when I'm done". And let the pooling component figure out how to reset & reuse the connection.
[1] (mentally replace "SQL" with your favorite protocol in your head)
yes, but that's the ultimate guest code's position!
what we're discussing here is the underlying imple components that actually do that work, right?
Ralph the PM writing code to call a db should just say, "give me a connection, I'll drop it when I'm done"
but something underneath that interface has to do the work of managing a pool, and underneath that actually implement the pool
it could be the same component, of course.
one big wasi:mssql for example
I realized when reading we are possibly dealing with MSDTC
because of "transaction context"
That document about connection pooling I linked above is good read. It mentions
oh, transactions, fun
@Dave Bakker (badeend) absolutely. But that seems to fundamentally require specific interfaces such as wasi:mssql, no?
The only real alternative way to set up pooling without abstracting all the connection handling completely would involve an interface with setup and teardown/cleanup hooks, where you'd say "give me a connection, and if you need to set it up, call this function, and if you need to reset it, call this one". But I don't think that'd address your concerns at all, because you'd still have to trust that the reset is done correctly
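A hook-based pool of that shape could look roughly like the Rust sketch below. `HookedPool` and its `setup`/`reset` closures are hypothetical names, not part of any WASI proposal. The pool owns no protocol knowledge and only calls back into the client, so it still has to trust that the reset hook does its job.

```rust
/// Hypothetical pool where the client supplies the protocol knowledge.
pub struct HookedPool<C> {
    idle: Vec<C>,
    /// Called when no idle connection is available (e.g. connect + login).
    setup: Box<dyn FnMut() -> C>,
    /// Called before a connection is pooled. Returns false if the
    /// connection could not be brought back to its base state.
    reset: Box<dyn FnMut(&mut C) -> bool>,
}

impl<C> HookedPool<C> {
    pub fn new(
        setup: impl FnMut() -> C + 'static,
        reset: impl FnMut(&mut C) -> bool + 'static,
    ) -> Self {
        Self {
            idle: Vec::new(),
            setup: Box::new(setup),
            reset: Box::new(reset),
        }
    }

    /// Reuses an idle connection, or calls the setup hook for a new one.
    pub fn get(&mut self) -> C {
        match self.idle.pop() {
            Some(conn) => conn,
            None => (self.setup)(),
        }
    }

    /// Runs the reset hook and keeps the connection only if it succeeds.
    /// A connection that fails to reset is dropped.
    pub fn put(&mut self, mut conn: C) {
        if (self.reset)(&mut conn) {
            self.idle.push(conn);
        }
    }

    /// Number of idle connections currently pooled.
    pub fn idle_len(&self) -> usize {
        self.idle.len()
    }
}
```

For MSSQL, the `reset` closure is where something like `sp_reset_connection` would be issued.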
I mean, just fundamentally something has to do the setup/reset/teardown. And I think we should provide an interface that lets that "something" be the client component. Which then allows us to implement things like a pooling wasi:mssql in user space
one important aspect here is that we've learned the hard way that even if we wanted to (and could) provide all the high-level interfaces, it'd not be enough: we'd not just have to provide all these interfaces, we'd also have to convince the world to change All The Code to make use of these interfaces instead of the implementations they already have in terms of a lower-level thing
Till Schneidereit said:
one important aspect here is that we've learned the hard way that even if we wanted to (and could) provide all the high-level interfaces, it'd not be enough: we'd not just have to provide all these interfaces, we'd also have to convince the world to change All The Code to make use of these interfaces instead of the implementations they already have in terms of a lower-level thing
this is the largest problem we face.
well said
that's not to say the high-level interfaces aren't a good thing: where possible and where people are asking for them, we should provide them. But we shouldn't force them on people
the higher level and different interfaces will become popular if they hit the sweet spot for users. that is the only path they have. it may well take time for a lot of them.
(yes, I'm one of the people who had to learn this the hard way. See also: wasi:grpc requiring substantially more work in the spec, host implementations, and all language ecosystems than extending wasi:http to support gRPC)
it will be better having wasi:grpc! But man, that will take time for adoption.
yes to both!
at a minimum, the best higher level abstractions will take 3-5 years before they hit the sweet spot. It seems like a long time, but it really isn't.
but, I guess the optimistic look at this is that we need time to make those all happen anyway
I think we got it right by-and-large with wasi:http, and I still believe in the fundamental approach to WASI (and WIT more generally) API design of "as high-level as feasible, but no higher". My thinking on the "feasible" bit has evolved a bit
yes. I have a feeling that "new" interfaces will have more adoption than "why did you screw up tcp?"
my hope is also that as we see things like component-based middleware, db connectors, etc, all of this will matter less and less, because we'll have much tighter pinch-points
it's going to happen
@Dave Bakker (badeend) how do you feel about TLS connection pooling after all this discussion?
I think that single-use/throw-away pre-opened anonymous TLS sessions would reduce the latency on the side of the application code. And they would be generic and secure. It would not bring scalability. But maybe that's good enough for an MVP?
absolutely it would
imho, that is.
so you're thinking of something that'd reduce this example to something like this?
// TCP setup:
let (tls_input, tls_output) = wasi_tls::connect("example.com")?.await?;
// Usage:
tls_output.blocking_write_and_flush(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n");
let http_response = tls_input.blocking_read();
println!("{http_response:?}");
Could we encourage pool-per-client by defining a standard resource but not a standard interface? Consumers would need to define how the pool is exposed but it would take deliberate effort to share between components.
edit: actually I'm not sure bindgen produces the same type for a resource in different interfaces, so maybe not all that useful
(i.e., let the import handle the DNS lookup, socket connection, and TLS handshake, so that that can all happen concurrently and preemptively)
Now I'm thinking how to make that transparent to existing C# Socket & SslStream APIs
yeah, I think we're in the same space of difficult-to-adopt abstractions
@Lann Martin you'd ultimately still import an interface that would provide a function for acquiring the pool resource handle though, right? so it'd still be the most straightforward thing to always return the same handle
the host could just hand out an unbound handle/resource on wasi:sockets connect, and if that is followed by a call to the TLS transform, it could take the connection from a different pool.
The problem is that it is disastrous to share a TLS connection pool with an untrusted component
but I just remember that there were sketches somewhere about shared and non-shared instance imports. We'd want a non-shared instance import here, I think
indeed, yes
but much in the same way as sharing an outgoing-handler, right?
an "optimistic pre-fetching pool" (or whatever you want to call the pre-warmed connection approach) definitely seems like the best bang-for-buck
i.e., to the degree we have this issue for connection pools, we also have it for anything that can establish outgoing connections
but much in the same way as sharing an outgoing-handler, right?
Not really. HTTP is pretty rigorously stateless
Lann Martin said:
The problem is that it is disastrous to share a TLS connection pool with an untrusted component
Those would be throw away, after the end of life of that resource. The host would consider it dirty and actually close the real connection. Is that not enough ?
what I mean is that a specific outgoing-handler provides specific capabilities, at least as long as the exporter applies some kind of restrictions to where requests can be sent to
@Pavel Šavara Yeah sorry; we should call the "pre-warmed connections" idea something other than "pooling". I like that idea.
a naive userspace implementation of outgoing-handler in a persistent instance would share its allowlist with all importers
same as a naive userspace implementation of a connection pool would share the pool with all importers
I guess I'm just thinking of the obvious "allow all" case for both http and a tls pool. For HTTP, a reused connection has pretty well understood state(lessness), assuming the implementation doesn't allow returning e.g. websockets to the pool. A TLS socket pool has too much flexibility here; e.g. a malicious component could set up an HTTP proxy on a socket and then put it in the pool masquerading as a "normal" HTTPS connection to the proxy host.
I agree that that is a very bad scenario. It seems to me like it's ultimately one example of the more fundamental issue that the moment you allow yourself to be imported by multiple components you'd better ensure that you retain the right level of isolation between the state you provide them with
What are the actual real-world use cases we're thinking of here, other than HTTP & SQL?
other databases, e.g. Redis, Mongo, etc.
MQTT
I guess any protocols on top of TCP or UDP?
where not all of them are actual real-world use cases, but there are enough that I think it makes sense for us to treat them as unbounded
malicious component
There's a continuum of "how much I trust the other component" from "I don't trust it at all" (in which case I probably shouldn't be using it) to "I wrote it myself and trust it completely, and I have other reasons besides security to make it a separate component" (e.g. different lifetimes, different languages, etc.).
@Lann Martin the more I think about it, the more I really don't think connection pools are special. Fundamentally, a very reasonable approximation is Thou Shalt Not Mix Capabilities.
I do think this poses very interesting problems for composition between long- and short-lived things, which I think we've only gotten away with ignoring so far because existing systems manage capabilities in the host pretty exclusively
as in, I don't even know if we'd have a mechanism by which a component that'd be imported by two other components would be able to tell apart which of those a call originated in
This is what unforgeable resources are for right? :shrug:
yeah, I just realize that's not the right question
better question: how do you ensure that a component importing you should get access to the same resource a previous instance of the same component definition did?
Say I have a single, long-lived component Pool, imported by an arbitrary series of instances of both A and B. How can I tell calls from instances of A apart from those of instances of B, so I can establish isolated caches for each of the component definitions?
one important aspect here is that we've learned the hard way that even if we wanted to (and could) provide all the high-level interfaces, it'd not be enough: we'd not just have to provide all these interfaces, we'd also have to convince the world to change All The Code to make use of these interfaces instead of the implementations they already have in terms of a lower-level thing
Everything discussed so far isn't POSIX (obviously), so "All The Code" will need to change anyway in some form or another. Right?
One possible answer could be "you don't, because that's not a setup you get to have". Instead, there could be one long-lived instance Pool-A imported by all instances of A, and another Pool-B imported by all instances of B
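The Pool-A / Pool-B arrangement can be sketched in Rust as a registry that hands each new instance a handle bound to the pool of its component definition. Everything here (`PoolRegistry`, `PoolHandle`, `handle_for`) is hypothetical and only models the wiring: the instance never names a pool itself, so it cannot reach into another definition's connections.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

type Shared<C> = Rc<RefCell<Vec<C>>>;

/// Hypothetical host-side registry: one pool per component definition.
pub struct PoolRegistry<C> {
    pools: HashMap<String, Shared<C>>,
}

impl<C> PoolRegistry<C> {
    pub fn new() -> Self {
        Self { pools: HashMap::new() }
    }

    /// Called by the (hypothetical) host when it instantiates a component.
    /// Instances of the same definition share a pool. Instances of
    /// different definitions never do.
    pub fn handle_for(&mut self, definition: &str) -> PoolHandle<C> {
        let pool = self.pools.entry(definition.to_string()).or_default();
        PoolHandle { pool: Rc::clone(pool) }
    }
}

/// The only thing a short-lived instance gets to see.
pub struct PoolHandle<C> {
    pool: Shared<C>,
}

impl<C> PoolHandle<C> {
    /// Returns a connection to this definition's pool.
    pub fn put(&self, conn: C) {
        self.pool.borrow_mut().push(conn);
    }

    /// Takes a connection from this definition's pool, if any.
    pub fn get(&self) -> Option<C> {
        self.pool.borrow_mut().pop()
    }
}
```

A later instance of A picks up what an earlier instance of A pooled, while instances of B see nothing of it.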
Everything discussed so far isn't POSIX (obviously), so "All The Code" will need to change anyway in some form or another. Right?
That was my intuition as well, but I don't think it holds, no: there's a huge difference between having to change an ecosystem's HTTP abstraction(s) and let everything on top work without modification, and having to change all the things on top individually
The only idea that comes to mind that would be compatible with existing code would be per-instance session management
as in, the thing we have now for sockets, and will have for TLS with Dave's proposal?
I agree that the required changes should be limited to the standard-library(-like) libraries, and should not impact each and every application
I guess it would have to be host magic that was aware of export call sites
or ~equivalently slice up composed components to give them wrapped copies of shared imports
yeah, I remember Luke sketching out something along the latter lines
cybernetic component implants :smile:
Would it make sense to add a "hint flag" to the component model that tells the host "instances of this (sub)component should be kept alive and reused if possible", i.e. the app will work fine even if the hint is ignored, but it will work better if the instances are reused? I believe we discussed this with @Luke Wagner and others already, but I don't recall if we discussed the scenario where e.g. a component has three subcomponents, only one of which has the hint flag attached to it, meaning the other two are not expected to be reused (and in fact shouldn't be reused), but they may use the "long lived" one to cache state.
with dynamic instantiation, you could imagine a component exporting an API that gives a fresh instance for an interface instance export, but provides that exported instance with imports that are shared internally. Then each of those short-lived instances could hold a session key
Are you guys still discussing session "pool" ? Meaning that the session would not be throw-away ? And Joel means "hint that I trust the pool" ? I think I got lost.
My comment was regarding the general problem of caching for otherwise short-lived instances. Could be data caching, connection caching, or whatever.
You don't even need dynamic instantiation per-se, just preprocessing to split shared imports plus a convention for how the host maps those split imports to their components
I guess I'm not convinced that a pool is inherently more dangerous. And OTOH, prewarmed connections would require a completely different approach to establishing connections from what's proposed right now, one which would be harder to integrate into content toolchains
Some prewarming strategies wouldn't require new interfaces; as a simple example you could immediately prewarm a connection upon opening a cold connection, optimistically assuming that it is likely to be used soon
And there could be configuration that for each IP the pre-warmer should keep 10 open un-used connections ready.
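A rough Rust sketch of such a pre-warmer follows, assuming a hypothetical `Prewarmer` type and a caller-supplied `connect` closure that stands in for DNS lookup, TCP connect and TLS handshake. Connections are single-use: they are handed out once and never returned. The refill happens synchronously here to keep the sketch short. A real host would presumably do it in the background.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical pre-warmer keeping `target` unused connections per address.
pub struct Prewarmer<C> {
    target: usize,
    connect: Box<dyn FnMut(&str) -> C>,
    warm: HashMap<String, VecDeque<C>>,
}

impl<C> Prewarmer<C> {
    pub fn new(target: usize, connect: impl FnMut(&str) -> C + 'static) -> Self {
        Self {
            target,
            connect: Box::new(connect),
            warm: HashMap::new(),
        }
    }

    /// Hands out a warm connection if one is ready, otherwise connects cold.
    /// Afterwards it optimistically tops the address back up to `target`.
    pub fn take(&mut self, addr: &str) -> C {
        let queue = self.warm.entry(addr.to_string()).or_default();
        let conn = match queue.pop_front() {
            Some(warm) => warm,
            None => (self.connect)(addr),
        };
        while queue.len() < self.target {
            queue.push_back((self.connect)(addr));
        }
        conn
    }

    /// Number of unused connections currently held for `addr`.
    pub fn warm_count(&self, addr: &str) -> usize {
        self.warm.get(addr).map_or(0, |queue| queue.len())
    }
}
```

The first `take` for an address pays the full handshake cost. Later calls only pay it when demand outruns the refill.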
given that Dave's proposal has multiple discrete steps, are you suggesting the host (or more generally, the exporter) would effectively record the sequence of these steps and then rerun them optimistically because they're likely to be repeated in exactly the same way?
@Till Schneidereit Earlier you gave a client-connection-pool example. Does this need to be specialized for any kind of transport (TCP and/or TLS, ..) or can it even be a generic duplex stream pool, like:
get-preopened-pool: func(name: string) -> option<io-pool>;
resource io-pool {
    open: func() -> tuple<input-stream, output-stream>;
    close: func(input: input-stream, output: output-stream);
}
ooh, that's a great question! I can't immediately see any reason why you'd not be able to generalize it.
The one thing you'd lose is having to provide a proof that you'd be able to recreate the same configuration
but maybe we could recreate even that
Till Schneidereit said:
given that Dave's proposal has multiple discrete steps, are you suggesting the host (or more generally, the exporter) would effectively record the sequence of these steps and then rerun them optimistically because they're likely to be repeated in exactly the same way?
I assumed that wasi:TCP & wasi:TLS are fused implementation. And that they know how to do that handshake, there are probably no application specific steps to replay, or are there ?
You could use secret tokens as pool keys :shrug:
Lann Martin said:
You could use secret tokens as pool keys :shrug:
I previously said "anonymous", meaning PK are out of scope.
yeah, and in fact establishing a TLS connection could return a key to use
Well "token" is maybe the wrong term. You could for example derive the pool key from a TLS private key
does fingerprint of that PK become part of the cache-key then ?
oh, and you'd be able to derive the same key/hash/token by an operation taking the same inputs
what Lann said :smile:
In case this isn't obvious: this requires very careful design; don't just hash the PK
I assumed that wasi:TCP & wasi:TLS are fused implementation. And that they know how to do that handshake, there are probably no application specific steps to replay, or are there ?
That's at least not how things are specified right now, and it gets us back to higher-level abstractions being harder to integrate into existing toolchains
hmm actually this is a good question for TLS in particular; do we actually need socket pools or do we just need session resumption?
TLS 1.3 has "0-RTT" session resumption, which just requires a secret resumption ticket iirc
You'd still eat the normal TCP startup but that's much faster than TLS
if we wanted to support a proof system for a general-purpose cache, we could do something like this:
With these things, you'd be able to have a chain such as
And to retrieve a connection from the pool, you'd do this:
How would the new instance ask for connection from pool? What would give it the token ?
actually, ignore all the "and token" parts of the first list: those aren't needed
yeah, exactly
one sec, sketching out example code
for backward compatibility: trade the token to the host for a reserved ip/port that can be passed to existing client code
I'm a bit overwhelmed by all the discussion above. Could someone explain why we need "tokens/private-keys/proof-system/etc.." ?
It is essentially about "authenticating" that your instance has permission/capability to get a particular connection from the pool
the concern I'm trying to address is that you shouldn't be able to reuse a connection if you'd not be able to create a new one with the same properties. For example if you lost access to a client certificate, you shouldn't be able to reuse a connection that used that certificate
Hmm. In my example (https://bytecodealliance.zulipchat.com/#narrow/stream/219900-wasi/topic/Database.20connection.20pooling.20or.20pre-warmed.20connections/near/469684262) there is no such thing as "the" pool. There is a set of preopened pools that you have access to. I.e. if a pool is preopened for your component instance, then you have access to those streams. Similar to files.
you shouldn't be able to reuse a connection if you'd not be able to create a new one with the same properties. For example if you lost access to a client certificate, you shouldn't be able to reuse a connection that used that certificate
Continuing with my example; that could be the responsibility of the child component who is putting in the streams into the pool. By putting them in, they're also giving access to use the streams. Capability-style.
It then becomes a game of logically divvying up separate kinds of streams (with different permissions / authority) into separate pools.
Yeah, "private imports" are a reasonable approach. I think the tooling just doesn't support them yet.
I don't think that addresses the issue I'm trying to address
the sequence of concern is
1. a establishes a connection using a client certificate, and stores it in the pool
2. b retrieves connection from pool, despite not having access to the certificate anymore
i.e., being able to put a connection into the pool isn't sufficient to ensure that it's also okay to later retrieve it from the pool
could we put client certificates out of scope ?
here's the simple client example, modified with what I have in mind:
// Pool lookup: derive a token for each step instead of performing it
fn from_pool() -> Result<(io::incoming_stream, io::outgoing_stream)> {
    let ip_token = wasi_sockets::get_address_resolution_token("example.com")?;
    let connection_token = wasi_sockets::get_connection_token(ip_token, 443)?;
    let tls_token = wasi_tls::get_client_connection_token(connection_token)?;
    wasi::io_pool::get(tls_token)
}
let (tls_input, tls_output) = match from_pool() {
    Ok(connection) => connection,
    // Nothing usable in the pool: establish a new connection
    _ => {
        // TCP setup:
        let ip = wasi_sockets::resolve_addresses("example.com").await?[0];
        let tcp_client = wasi_sockets::TcpSocket::new();
        let (tcp_input, tcp_output) = tcp_client.connect(ip, 443).await;
        // TLS setup:
        wasi_tls::ClientConnection::new(tcp_input, tcp_output)
            .connect("example.com")?
            .finish().await?
    }
};
// Usage:
tls_output.blocking_write_and_flush("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n");
let http_response = tls_input.blocking_read();
println!("{}", http_response);
// Reset and prepare for reuse
// [Do whatever is needed to reset the connection]
wasi::io_pool::put(tls_input, tls_output);
the same issue applies to allowlists for outgoing connections, and to the ability to do DNS resolution
access to the client certificate is revoked
While an active connection is using it?
no, this would happen in-between instantiations of a and b
I shouldn't have used "revoked": I mean only the access to the certificate, not the certificate itself
Yeah, but the physical connection is still alive in the meantime. For the native TLS implementation, it would still be considered "in use"
right. All I'm trying to ensure is that if you're losing access to the capability required to create a new connection, you should also lose the ability to reuse an existing connection established with those capabilities
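The invariant described above (you can only reuse a pooled connection if you could still create it from scratch) can be sketched in Rust as follows. This is an illustrative model under stated assumptions, not the proposed WASI API: `Host`, `Capabilities`, `Token`, and `connection_token` are hypothetical names, a `String` stands in for a connection, and tokens are opaque host-issued handles rather than anything derived by hashing key material.

```rust
use std::collections::{HashMap, HashSet};

/// Opaque proof that the holder could have created this connection.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Token(u64);

/// What an instance is currently allowed to do (hypothetical model).
pub struct Capabilities {
    pub allowed_hosts: HashSet<String>,
    pub client_certs: HashSet<String>,
}

/// Host-side state: the token registry plus the pooled connections.
#[derive(Default)]
pub struct Host {
    tokens: HashMap<(String, u16, String), Token>,
    pool: HashMap<Token, Vec<String>>, // String stands in for a connection
}

impl Host {
    /// Issue the token for (host, port, cert) only if `caps` would allow
    /// creating that connection from scratch right now.
    pub fn connection_token(
        &mut self,
        caps: &Capabilities,
        host: &str,
        port: u16,
        cert: &str,
    ) -> Option<Token> {
        if !caps.allowed_hosts.contains(host) || !caps.client_certs.contains(cert) {
            return None;
        }
        let next = Token(self.tokens.len() as u64);
        let key = (host.to_string(), port, cert.to_string());
        Some(*self.tokens.entry(key).or_insert(next))
    }

    /// Store a connection under the token that describes how it was made.
    pub fn put(&mut self, token: Token, conn: String) {
        self.pool.entry(token).or_default().push(conn);
    }

    /// Retrieve a pooled connection; only possible with a valid token.
    pub fn get(&mut self, token: Token) -> Option<String> {
        self.pool.get_mut(&token)?.pop()
    }
}
```

In this model, instance `b` without access to the certificate never obtains the token, so it cannot call `get` for the connection that `a` stored, even though the connection itself is still alive in the pool.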
if we want to do a TLS specific pool, that's easy to tie to the certificate. But I really like your idea of a more general IO pool, and I think this setup would enable it
that's SQL client connection code, which ideally would not need any modifications, if we were able to hide those "caching tokens" inside dotnet base class library. I don't know if that's possible.
Chatting with Joel and Lann a bit more about this, my impression is that, while, in the abstract, I see the value of enabling TLS connections to be pooled by the host (just like HTTP already allows), given the pure semantics of TLS (with no baked-in knowledge that we're, e.g., talking to a database with a "reset" command), it doesn't seem like something we can safely do in general without putting too much trust in the guests. Thus, I like the idea of using a long-lived instance that handles multiple requests while maintaining its own guest-implemented pool of long-lived TLS connections.
The issue Joel mentioned above for the "reuse hint" is C-M/#307 and my impression is that the reuse hint proposed in that issue is a perfect match for this use case and a good short-term "Step 1".
As for a later "Step 2", it does seem like, as Joel suggested, we can recover the per-request isolation using the "runtime instantiation" feature (which I think should be the last significant feature to add after Preview 3 before a 1.0-rc). With runtime instantiation, a long-lived root component could create 1 long-lived connection-pooling child instance, and then export a "handle request" function that internally uses runtime-instantiation to create a fresh request-handler child instance per request, with these dynamic children all importing the same connection-pooling instance. What's nice is that this would all be under producer toolchain control, which I think is important because I expect there are many fine-grained policy choices to tweak how this works that we wouldn't want to bake once-and-for-all into the spec or host implementation.
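The "Step 2" shape can be modelled in plain Rust to show the lifetimes involved. This is only an analogy under stated assumptions: ordinary structs stand in for component instances, `Rc<RefCell<...>>` stands in for the shared import, and `Root`, `ConnectionPool`, and `RequestHandler` are hypothetical names. It does not use any actual runtime-instantiation API.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Long-lived child: owns the pooled connections (strings stand in for them).
#[derive(Default)]
pub struct ConnectionPool {
    idle: Vec<String>,
    dialed: u32,
}

impl ConnectionPool {
    pub fn checkout(&mut self) -> String {
        match self.idle.pop() {
            Some(conn) => conn,
            None => {
                self.dialed += 1;
                format!("conn-{}", self.dialed)
            }
        }
    }

    pub fn checkin(&mut self, conn: String) {
        self.idle.push(conn);
    }
}

/// Short-lived child: created fresh per request, dropped afterwards,
/// so its private state cannot leak from one request to the next.
struct RequestHandler {
    pool: Rc<RefCell<ConnectionPool>>,
    scratch: Vec<String>,
}

impl RequestHandler {
    fn handle(mut self, request: &str) -> String {
        let conn = self.pool.borrow_mut().checkout();
        self.scratch.push(request.to_string());
        let response = format!("{} via {}", request, conn);
        self.pool.borrow_mut().checkin(conn);
        response
    }
}

/// Long-lived root: one pool, and a new handler for every request.
pub struct Root {
    pool: Rc<RefCell<ConnectionPool>>,
}

impl Root {
    pub fn new() -> Self {
        Self { pool: Rc::new(RefCell::new(ConnectionPool::default())) }
    }

    pub fn handle_request(&self, request: &str) -> String {
        let handler = RequestHandler {
            pool: Rc::clone(&self.pool),
            scratch: Vec::new(),
        };
        handler.handle(request)
    }

    pub fn connections_dialed(&self) -> u32 {
        self.pool.borrow().dialed
    }
}
```

Two calls to `handle_request` share one connection while each gets its own `RequestHandler`; policy choices such as when a connection is safe to check back in would live in the pooling child, under toolchain control.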
Last updated: Dec 23 2024 at 12:05 UTC