Hey all, thanks for all the amazing progress on the ecosystem lately -- it's been moving at :rocket: pace.
I've got a quite uninteresting question about `wit-bindgen` and in particular `wit-parser`'s `Resolve`.
Is it possible to distinguish between interfaces from top level world imports and their deps?
Here's an example -- if I have the following world:
world kv {
    import wasi:keyvalue/readwrite
}
The long and short of it is that I'd like to be able to tell which specific interface (in this case `readwrite`) was the "top level" import of the world.
When I use `wit-parser`, I get a hierarchy that looks like this (excuse the crude listing):
====> world? kv (pkg: Some(Id { idx: 3 }))
====> interface: Some("poll") => PKG: Some(Id { idx: 0 })
====> interface: Some("streams") => PKG: Some(Id { idx: 1 })
====> interface: Some("wasi-cloud-error") => PKG: Some(Id { idx: 2 })
====> interface: Some("types") => PKG: Some(Id { idx: 2 })
====> interface: Some("readwrite") => PKG: Some(Id { idx: 2 })
`types` and `wasi-cloud-error` get pulled in because of the `use` in `wasi:keyvalue/readwrite`, but they show up (as far as I can tell) just as if they were also `import`ed.
AFAIK this is how the resolution of a `use` is supposed to work (there's no bug, per se) -- but I was wondering if there's something I'm missing that could tell me the difference between something that was explicitly `import`ed and something brought in via a `use`.
The "ideal" answer would be to be able to tell that `readwrite` was the interface actually `import`ed at the world level, and the rest came in as a result of resolution.
The only way I can think of to do it now is to go through the imports at that layer and try to piece together which types come from which interfaces.
Here's the in-code `wit-parser` struct for the `Interface` for `readwrite` in the example:
Interface {
    name: Some("readwrite"),
    docs: Docs {
        contents: Some("A keyvalue interface that provides simple read and write operations.\n")
    },
    types: {
        "bucket": Id { idx: 28 },
        "error": Id { idx: 29 },
        "incoming-value": Id { idx: 30 },
        "key": Id { idx: 31 },
        "outgoing-value": Id { idx: 32 }
    },
    functions: {
        "get": Function {
            docs: Docs { contents: Some("Get the value associated with the key in the bucket. It returns a incoming-value\nthat can be consumed to get the value.\n\nIf the key does not exist in the bucket, it returns an error.\n") },
            name: "get",
            kind: Freestanding,
            params: [("bucket", Id(Id { idx: 28 })), ("key", Id(Id { idx: 31 }))],
            results: Anon(Id(Id { idx: 33 }))
        },
        "set": Function { ... },
        "delete": Function { ... },
        "exists": Function { ... }
    }
}
From what I can tell there's no way to know that "bucket" inside `readwrite` came in via the `use` below:
interface readwrite {
    use types.{bucket, error, incoming-value, key, outgoing-value}
    get: func(bucket: bucket, key: key) -> result<incoming-value, error>
    ....
}
I believe the `TypeDef` for the type should have an `owner` field (https://docs.rs/wit-parser/latest/wit_parser/struct.TypeDef.html#structfield.owner), so you'd look up the type in the `types` arena on the `Resolve`.
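A rough sketch of that lookup, using simplified stand-in types rather than the real `wit-parser` definitions. `TypeOwner`, `TypeDef` and `Resolve` here are cut-down models, `alias_of` is a hypothetical field, and the sketch assumes a type brought in with `use` is stored as an alias pointing at the original definition:

```rust
// Simplified stand-ins for illustration; NOT the real wit-parser definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TypeOwner {
    Interface(usize), // index into `Resolve::interfaces`
    None,
}

struct TypeDef {
    owner: TypeOwner,
    // Assumption: a type brought in with `use` is an alias of another type.
    alias_of: Option<usize>,
}

struct Resolve {
    types: Vec<TypeDef>,
    interfaces: Vec<String>, // interface names only, for brevity
}

impl Resolve {
    /// Follow alias links to the original definition and return the name of
    /// the interface that owns it.
    fn defining_interface(&self, mut type_idx: usize) -> Option<&str> {
        // Bound the walk so a malformed alias cycle cannot loop forever.
        for _ in 0..self.types.len() {
            let def = self.types.get(type_idx)?;
            match def.alias_of {
                Some(next) => type_idx = next,
                None => {
                    return match def.owner {
                        TypeOwner::Interface(i) => {
                            self.interfaces.get(i).map(String::as_str)
                        }
                        TypeOwner::None => None,
                    };
                }
            }
        }
        None
    }
}

/// Hypothetical sample data: `bucket` is defined in `types` and used by `readwrite`.
fn sample_resolve() -> Resolve {
    Resolve {
        interfaces: vec!["types".to_string(), "readwrite".to_string()],
        types: vec![
            TypeDef { owner: TypeOwner::Interface(0), alias_of: None }, // types.bucket
            TypeDef { owner: TypeOwner::Interface(1), alias_of: Some(0) }, // readwrite.bucket
        ],
    }
}
```

With this sample data, `sample_resolve().defining_interface(1)` walks from the `readwrite` alias back to `types`. This answers "where was this type defined", not "which interface did the world import".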
Thanks for the hint @Peter Huene , will poke around and see what I can find!
Ahh so that's not quite what I want -- I'm actually more concerned with the interfaces, not the types. Basically I'd like to know that `kv` imported `readwrite`.
It's unclear to me at least if/how knowing that `Bucket` came from `types` would help me figure that out...
It's like `World` should have an `imports` and a `resolved` member -- attempting to resolve the `readwrite` import would then lead to `WorldItem`s being added to `resolved` (not `imports` directly), so in the end `types` would never end up in `imports`.
I'm not sure if what I'm asking for is desirable for any other use case but mine though.
I assume the same thing would happen with `exports` -- if an `export` included some `use`, the pulled-in interfaces would end up under `World#exports` (rather than some `resolved` member, if it existed).
Currently there's not actually a way to do what you want to do, only listing written-down imports/exports. The motivation behind this is sort of twofold. One is that there's no way to express this in the component model, e.g. in the binary format it's just required that everything is there and there's no concept of what was written down. Two is that code generators and processors of WIT in theory shouldn't distinguish whether it was written down or not since all the transitive imports are regardless required.
Not to say this couldn't be at least partially implemented. It'd be easy enough to add a flag that's true when parsing and false when added via resolve. That wouldn't be preserved through the binary format though.
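To make the flag idea concrete, here is a minimal sketch over a hypothetical, simplified world model. None of this is `wit-parser` API (per the message above, no such flag exists today), and the dependency edges in `keyvalue_deps` are assumed purely for the example:

```rust
use std::collections::BTreeMap;

/// Hypothetical world model: maps an imported interface name to a flag that
/// is `true` if the import was written in the WIT source and `false` if it
/// was only added while resolving dependencies.
#[derive(Default)]
struct World {
    imports: BTreeMap<String, bool>,
}

impl World {
    /// Record an import and, transitively, everything it depends on.
    fn add_import(&mut self, name: &str, explicit: bool, deps: &BTreeMap<&str, Vec<&str>>) {
        let already_seen = self.imports.contains_key(name);
        // An explicit import always wins over an earlier implicit one.
        let flag = self.imports.entry(name.to_string()).or_insert(explicit);
        *flag = *flag || explicit;
        if already_seen {
            return; // dependencies were handled on the first visit
        }
        for dep in deps.get(name).into_iter().flatten() {
            self.add_import(dep, false, deps);
        }
    }

    /// Names of the imports that were written down, in sorted order.
    fn explicit_imports(&self) -> Vec<&str> {
        self.imports
            .iter()
            .filter(|(_, explicit)| **explicit)
            .map(|(name, _)| name.as_str())
            .collect()
    }
}

/// Assumed dependency edges between interfaces, for illustration only.
fn keyvalue_deps() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("readwrite", vec!["types"]),
        ("types", vec!["wasi-cloud-error", "streams"]),
        ("streams", vec!["poll"]),
    ])
}
```

Calling `add_import("readwrite", true, &keyvalue_deps())` on an empty `World` records all five interfaces but marks only `readwrite` as explicit. As noted above, a flag like this would live only in the parsed representation and would not survive a round trip through the binary format.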
Could you expand a bit on what your intended use case for this is?
> Currently there's not actually a way to do what you want to do, only listing written-down imports/exports. The motivation behind this is sort of twofold. One is that there's no way to express this in the component model, e.g. in the binary format it's just required that everything is there and there's no concept of what was written down. Two is that code generators and processors of WIT in theory shouldn't distinguish whether it was written down or not since all the transitive imports are regardless required.

This makes sense, thanks for explaining and confirming -- I get why the decision was made this way and the motivation behind it.

> Not to say this couldn't be at least partially implemented. It'd be easy enough to add a flag that's true when parsing and false when added via resolve. That wouldn't be preserved through the binary format though.

Yeah this would be great, being able to know when something was resolved in would be nice... but I also maybe don't think it's worth the trouble.

> Could you expand a bit on what your intended use case for this is?
What I'm working on is some code that turns each top level interface into an asynchronous remote interaction (a receive or send of a message) -- so taking something like:
get: func(bucket: bucket, key: key) -> result<incoming-value, error>
And turning it into
#[derive(Serialize, Deserialize)]
struct GetInvocation {
    bucket: Bucket,
    key: Key,
}

#[async_trait]
trait RemoteReadWrite {
    async fn get(...., invocation: GetInvocation) -> ... { ... }
}
The idea is to go from top level world specification automatically to an object that is trained to perform the contract remotely.
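For a feel of the transformation, here is a toy sketch that emits the invocation struct as source text from a function name and its parameters. This is not the actual wasmCloud codegen: `pascal_case` and `invocation_struct` are hypothetical helpers, and a real generator would more likely build token streams than strings:

```rust
/// Convert a kebab-case WIT name such as `incoming-value` to `IncomingValue`.
fn pascal_case(kebab: &str) -> String {
    kebab
        .split('-')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

/// Emit Rust source for a `<Func>Invocation` struct holding the function's
/// parameters. `params` is a list of (parameter name, WIT type name) pairs.
fn invocation_struct(func: &str, params: &[(&str, &str)]) -> String {
    let mut out = String::from("#[derive(Serialize, Deserialize)]\n");
    out.push_str(&format!("struct {}Invocation {{\n", pascal_case(func)));
    for (name, ty) in params {
        // WIT identifiers are kebab-case; Rust fields use snake_case.
        out.push_str(&format!("    {}: {},\n", name.replace('-', "_"), pascal_case(ty)));
    }
    out.push_str("}\n");
    out
}
```

Calling `invocation_struct("get", &[("bucket", "bucket"), ("key", "key")])` yields the `GetInvocation` struct shown above.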
To avoid getting too into the weeds, this is functionality that's meant to enhance the way wasmCloud does providers (we take messages that come in off of a NATS lattice).
The good news is that this is working at this point (it works with simple contracts), but when I pulled in `wasi:keyvalue` (which is a bit more complex than one might make), the automation started doing the above for things that don't make sense (like functions that manipulate `incoming-value`, etc).
So my current solution, which is a bit janky, is to have the user specify which interfaces they want this sort of... extra codegen to happen for (because in the general case, there's no way to know exactly which interfaces should be exposed, for sure, even if I was able to pin down the "top level" ones).
If you're curious, what my franken-bindgen invocation ends up looking like is something like this:
wasmcloud_provider_wit_bindgen::generate!(
    // Impl Struct
    KeyvalueProvider,
    // WIT Namespace & package
    "wasi:keyvalue",
    // Interfaces that should be exposed on the lattice
    [
        "wasi:keyvalue/readwrite",
        "wasi:keyvalue/atomic"
    ],
    "test-bindgen-kv-memory"
);
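The allowlist approach boils down to filtering everything the resolver produced by the interfaces the user named. A minimal sketch, assuming a hypothetical flattened view of the resolved world as (interface, function) pairs; `functions_to_expose` and the sample data are illustrative and not part of the macro above:

```rust
/// Keep only the functions whose interface is in the user-supplied list.
/// `resolved` is a hypothetical flattened view of the resolved world as
/// (interface, function) pairs.
fn functions_to_expose<'a>(
    resolved: &[(&'a str, &'a str)],
    exposed_interfaces: &[&str],
) -> Vec<(&'a str, &'a str)> {
    resolved
        .iter()
        .copied()
        .filter(|(iface, _)| exposed_interfaces.iter().any(|e| *e == *iface))
        .collect()
}

/// Illustrative sample: the function listed under `types` is an assumed
/// example of a helper that should not become a remote call.
fn sample_functions() -> Vec<(&'static str, &'static str)> {
    vec![
        ("wasi:keyvalue/types", "incoming-value-consume-sync"),
        ("wasi:keyvalue/readwrite", "get"),
        ("wasi:keyvalue/readwrite", "set"),
    ]
}
```

Passing `["wasi:keyvalue/readwrite", "wasi:keyvalue/atomic"]` as the allowlist keeps `get` and `set` and drops the helper from `types`, regardless of how each interface ended up in the world.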
ok thanks for the extra context!
For this though I think that the solution you've got already might be the way to go
looking a bit further into the future, with registry integration many interfaces are in theory going to be distributed as WIT-encoded-as-wasm which loses the "top level" concept since that's not representable in the binary format
so in that sense while it would be possible to implement what I mentioned above today once you transitioned to registry-based tooling it would no longer work
Thank you for taking the time to look at it @Alex Crichton, hopefully the description wasn't too hard to follow.
> looking a bit further into the future, with registry integration many interfaces are in theory going to be distributed as WIT-encoded-as-wasm which loses the "top level" concept since that's not representable in the binary format

Yeah the binary format not representing it certainly gives me pause for the other solution idea...

> so in that sense while it would be possible to implement what I mentioned above today once you transitioned to registry-based tooling it would no longer work

Yeah this makes sense -- I got the code compiling yesterday (diff timezone) so I've proved out the specification method now, and it looks like the way to go, with less breakage in the future.
Thanks so much for the help
Victor Adossi has marked this topic as resolved.
Last updated: Nov 22 2024 at 16:03 UTC