Stream: wit-bindgen

Topic: ✔ Distinguishing between "top level" imports and their deps


Victor Adossi (Sep 19 2023 at 01:07):

Hey all, thanks for all the amazing progress on the ecosystem lately -- it's been moving at :rocket: pace.

I've got a quite uninteresting question about wit-bindgen and in particular wit-parser's Resolve.

Is it possible to distinguish between interfaces from top level world imports and their deps?

Here's an example -- if I have the following world:

world kv {
    import wasi:keyvalue/readwrite

}

The long and short of it is that I'd like to be able to tell which specific interface (in this case readwrite) was the "top level" import of the world.

When I use wit-parser, I get a hierarchy that looks like this (excuse the crude listing):

====> world? kv (pkg: Some(Id { idx: 3 }))
====> interface: Some("poll") => PKG: Some(Id { idx: 0 })
====> interface: Some("streams") => PKG: Some(Id { idx: 1 })
====> interface: Some("wasi-cloud-error") => PKG: Some(Id { idx: 2 })
====> interface: Some("types") => PKG: Some(Id { idx: 2 })
====> interface: Some("readwrite") => PKG: Some(Id { idx: 2 })

types and wasi-cloud-error get pulled in because of the use in wasi:keyvalue/readwrite, but they show up (as far as I can tell) just as if they were also imported.

AFAIK this is how the resolution of a use is supposed to work (there's no bug, per se) -- but I was wondering if there's something I'm missing that could tell me the difference between something that was explicitly, directly imported and something brought in via a use.

The "ideal" answer would be to be able to tell that readwrite was the interface actually imported at the world level, and the rest came in as a result of resolution.

The only way I can think of to do it now is to go through the imports at that layer and try to piece together which types come from which interfaces.

Victor Adossi (Sep 19 2023 at 01:13):

Here's the in-code wit-parser struct for the Interface for readwrite in the example:

Interface {
  name: Some("readwrite"),
  docs: Docs {
    contents: Some("A keyvalue interface that provides simple read and write operations.\n")
  },
  types: {
    "bucket": Id { idx: 28 },
    "error": Id { idx: 29 },
    "incoming-value": Id { idx: 30 },
    "key": Id { idx: 31 },
    "outgoing-value": Id { idx: 32 }
  },
  functions: {
    "get": Function {
      docs: Docs { contents: Some("Get the value associated with the key in the bucket. It returns a incoming-value\nthat can be consumed to get the value.\n\nIf the key does not exist in the bucket, it returns an error.\n") },
      name: "get",
      kind: Freestanding,
      params: [("bucket", Id(Id { idx: 28 })), ("key", Id(Id { idx: 31 }))],
      results: Anon(Id(Id { idx: 33 }))
    },
    "set": Function { ... }
    "delete": Function { ... }
    "exists": Function { ... }
  }
}

From what I can tell there's no way to know that "bucket" inside readwrite came in via the use below:

interface readwrite {
    use types.{bucket, error, incoming-value, key, outgoing-value}

    get: func(bucket: bucket, key: key) -> result<incoming-value, error>

        ....
}

Peter Huene (Sep 19 2023 at 05:01):

I believe the TypeDef for the type should have an owner field (https://docs.rs/wit-parser/latest/wit_parser/struct.TypeDef.html#structfield.owner)

Peter Huene (Sep 19 2023 at 05:01):

so you'd look up the type in the types arena on the Resolve
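
A rough sketch of what that lookup could look like, using simplified stand-in types rather than the real wit-parser API. The names ToyResolve, ToyTypeDef, defining_interface, describe and example_resolve are hypothetical, and the sketch assumes that a type brought in with use is stored as an alias whose owner is the importing interface, so the alias chain has to be followed to reach the interface that originally defined it:

```rust
// Simplified stand-ins for illustration; these are NOT wit-parser's real types.
struct ToyTypeDef {
    name: &'static str,
    // Index into ToyResolve::interfaces of the interface that owns this entry.
    owner: Option<usize>,
    // Some(id) when this entry only aliases another type (assumed model of `use`).
    alias_of: Option<usize>,
}

struct ToyResolve {
    interfaces: Vec<&'static str>,
    types: Vec<ToyTypeDef>, // the "types arena", indexed by type id
}

impl ToyResolve {
    /// Follow alias links to the original definition and return the name of
    /// the interface that owns it, if any. Assumes the alias chain has no cycle.
    fn defining_interface(&self, mut id: usize) -> Option<&'static str> {
        while let Some(next) = self.types[id].alias_of {
            id = next;
        }
        self.types[id].owner.map(|i| self.interfaces[i])
    }

    /// Human-readable summary of who owns a type and where it was defined.
    fn describe(&self, id: usize) -> String {
        let ty = &self.types[id];
        let owner = ty.owner.map(|i| self.interfaces[i]).unwrap_or("<none>");
        let origin = self.defining_interface(id).unwrap_or("<none>");
        format!("{} (owner: {}, defined in: {})", ty.name, owner, origin)
    }
}

/// A tiny arena shaped like the example: `bucket` is defined in `types`
/// and pulled into `readwrite` through `use`.
fn example_resolve() -> ToyResolve {
    ToyResolve {
        interfaces: vec!["types", "readwrite"],
        types: vec![
            ToyTypeDef { name: "bucket", owner: Some(0), alias_of: None },
            ToyTypeDef { name: "bucket", owner: Some(1), alias_of: Some(0) },
        ],
    }
}
```

With this toy data, example_resolve().defining_interface(1) walks from the readwrite copy of bucket back to types. Whether the real arena behaves the same way is worth checking against the wit-parser docs linked above.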

Victor Adossi (Sep 19 2023 at 13:33):

Thanks for the hint @Peter Huene , will poke around and see what I can find!

Victor Adossi (Sep 19 2023 at 13:54):

Ahh so that's not quite what I want -- I'm actually more concerned with the interfaces, not the types. Basically, I'd like to know that kv imported readwrite.

It's unclear to me at least if/how knowing that Bucket came from types would help me figure that out...

It's like World should have an imports and a resolved member -- attempting to resolve the readwrite import would then lead to WorldItems being added to resolved (not to imports directly), so in the end types would never end up in imports.

I'm not sure if what I'm asking for is desirable for any other use case but mine though.

Victor Adossi (Sep 19 2023 at 13:56):

I assume the same thing would happen with exports -- if an export included some use, the pulled in interfaces would end up under World#exports (rather than some resolved member if it existed)

Alex Crichton (Sep 19 2023 at 14:30):

Currently there's not actually a way to do what you want to do, only listing written-down imports/exports. The motivation behind this is sort of twofold. One is that there's no way to express this in the component model, e.g. in the binary format it's just required that everything is there and there's no concept of what was written down. Two is that code generators and processors of WIT in theory shouldn't distinguish whether it was written down or not since all the transitive imports are regardless required.

Not to say this couldn't be at least partially implemented. It'd be easy enough to add a flag that's true when parsing and false when added via resolve. That wouldn't be preserved through the binary format though.

Could you expand a bit on what your intended use case for this is?
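
For illustration only, here is a minimal sketch of the "flag" idea on a toy resolver. It is not how wit-parser is implemented: ResolvedImport, written_down, resolve_imports and visit are hypothetical names, and the dependency edges passed in are assumed for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical record for one resolved world import.
#[derive(Debug, PartialEq)]
struct ResolvedImport {
    name: String,
    // true if the import was written in the world itself, false if it was
    // only pulled in because another interface depends on it.
    written_down: bool,
}

/// Resolve the declared imports plus everything they depend on. Dependencies
/// are emitted before the interface that needs them.
fn resolve_imports(
    declared: &[&str],
    deps: &BTreeMap<&str, Vec<&str>>,
) -> Vec<ResolvedImport> {
    let mut out = Vec::new();
    for name in declared {
        visit(name, true, deps, &mut out);
    }
    out
}

/// Depth-first walk; assumes the dependency graph has no cycles.
fn visit(
    name: &str,
    written_down: bool,
    deps: &BTreeMap<&str, Vec<&str>>,
    out: &mut Vec<ResolvedImport>,
) {
    if let Some(existing) = out.iter_mut().find(|i| i.name == name) {
        // Already resolved: an explicit mention upgrades an implicit one.
        existing.written_down |= written_down;
        return;
    }
    if let Some(children) = deps.get(name) {
        for dep in children {
            visit(dep, false, deps, out);
        }
    }
    out.push(ResolvedImport { name: name.to_string(), written_down });
}
```

Given a declared import of readwrite that depends on types, only readwrite would come back with written_down set. The flag lives purely in this in-memory structure, which matches the caveat that it could not survive a round trip through the binary format.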

Victor Adossi (Sep 19 2023 at 15:04):

"Currently there's not actually a way to do what you want to do, only listing written-down imports/exports. The motivation behind this is sort of twofold. One is that there's no way to express this in the component model, e.g. in the binary format it's just required that everything is there and there's no concept of what was written down. Two is that code generators and processors of WIT in theory shouldn't distinguish whether it was written down or not since all the transitive imports are regardless required."

This makes sense, thanks for explaining and confirming -- I get the decision made this way/the motivation.

"Not to say this couldn't be at least partially implemented. It'd be easy enough to add a flag that's true when parsing and false when added via resolve. That wouldn't be preserved through the binary format though."

Yeah this would be great, being able to know when something was resolved in would be nice... but I also maybe don't think it's worth the trouble.

"Could you expand a bit on what your intended use case for this is?"

What I'm working on is some code that turns each top level interface into an asynchronous remote interaction (a receive or send of a message) -- so taking something like:

    get: func(bucket: bucket, key: key) -> result<incoming-value, error>

And turning it into

#[derive(Serialize, Deserialize)]
struct GetInvocation {
  bucket: Bucket,
  key: Key,
}

#[async_trait]
trait RemoteReadWrite {
  async fn get(...., invocation: GetInvocation) -> ... { ... }
}

The idea is to go from top level world specification automatically to an object that is trained to perform the contract remotely.

To avoid getting too into the weeds, this is functionality that's meant to enhance the way wasmCloud does providers (we take messages that come in off of a NATS lattice).

The good news is that this is working at this point (it works with simple contracts), but when I pulled in wasi:keyvalue (which is a bit more complex than one might make), the automation started doing the above for things that don't make sense (like functions that manipulate incoming-value, etc.).
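
As a rough illustration of that transformation (not the actual wasmCloud code), here is a small sketch that renders the invocation struct text from a function name and its parameters. pascal_case and render_invocation_struct are hypothetical helpers, and the sketch only handles parameters whose types are plain named types:

```rust
/// Convert a kebab-case WIT identifier such as `incoming-value` to PascalCase.
fn pascal_case(ident: &str) -> String {
    ident
        .split('-')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // Uppercase the first character and keep the rest unchanged.
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect()
}

/// Render a `<Func>Invocation` struct as Rust source text.
/// `params` holds (parameter name, WIT type name) pairs.
fn render_invocation_struct(func_name: &str, params: &[(&str, &str)]) -> String {
    let mut out = String::new();
    out.push_str("#[derive(Serialize, Deserialize)]\n");
    out.push_str(&format!("struct {}Invocation {{\n", pascal_case(func_name)));
    for (name, ty) in params {
        // Field names become snake_case, type names PascalCase.
        out.push_str(&format!("  {}: {},\n", name.replace('-', "_"), pascal_case(ty)));
    }
    out.push_str("}\n");
    out
}
```

Calling render_invocation_struct("get", &[("bucket", "bucket"), ("key", "key")]) produces the GetInvocation struct text shown earlier. A real generator would also need to handle built-in and compound types such as list or result.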

Victor Adossi (Sep 19 2023 at 15:07):

So my current solution which is a bit janky is to have the user specify which interfaces they want this sort of... extra codegen to happen for (because in the general case, there's no way to know exactly which interfaces should be exposed, for sure, even if I was able to pin down the "top level" ones).

If you're curious, what my franken-bindgen invocation ends up looking like is something like this:

wasmcloud_provider_wit_bindgen::generate!(
    // Impl Struct
    KeyvalueProvider,
    // WIT Namespace & package
    "wasi:keyvalue",
    // Interfaces that should be exposed on the lattice
    [
     "wasi:keyvalue/readwrite",
     "wasi:keyvalue/atomic"
    ],
    "test-bindgen-kv-memory"
);
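
A minimal sketch of the selection step behind such a list, assuming the macro ends up with the fully qualified names of every resolved interface. select_exposed is a hypothetical helper, not part of wit-bindgen or wasmCloud:

```rust
/// Keep only the resolved interfaces the user asked to expose, preserving the
/// resolved order, and reject any requested name the world does not contain.
fn select_exposed<'a>(
    resolved: &[&'a str],
    requested: &[&'a str],
) -> Result<Vec<&'a str>, String> {
    for want in requested {
        if !resolved.contains(want) {
            return Err(format!(
                "interface `{}` is not part of the resolved world",
                want
            ));
        }
    }
    Ok(resolved
        .iter()
        .copied()
        .filter(|name| requested.contains(name))
        .collect())
}
```

Failing loudly on an unknown name keeps a typo in the list from silently producing no lattice-facing code for that interface.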

Alex Crichton (Sep 19 2023 at 17:18):

ok thanks for the extra context!

For this though I think that the solution you've got already might be the way to go

Alex Crichton (Sep 19 2023 at 17:18):

looking a bit further into the future, with registry integration many interfaces are in theory going to be distributed as WIT-encoded-as-wasm which loses the "top level" concept since that's not representable in the binary format

Alex Crichton (Sep 19 2023 at 17:19):

so in that sense while it would be possible to implement what I mentioned above today once you transitioned to registry-based tooling it would no longer work

Victor Adossi (Sep 20 2023 at 00:44):

Thank you for taking the time to look at it @Alex Crichton, hopefully the description wasn't too hard to follow.

"looking a bit further into the future, with registry integration many interfaces are in theory going to be distributed as WIT-encoded-as-wasm which loses the "top level" concept since that's not representable in the binary format"

Yeah, the binary format not representing it certainly gives me pause for the other solution idea...

"so in that sense while it would be possible to implement what I mentioned above today once you transitioned to registry-based tooling it would no longer work"

Yeah this makes sense -- I got the code compiling yesterday (diff timezone) so I've proved out the specification method now; it looks like the way to go and should have less breakage in the future.

Thanks so much for the help

Notification Bot (Sep 20 2023 at 00:44):

Victor Adossi has marked this topic as resolved.


Last updated: Dec 23 2024 at 12:05 UTC