Stream: git-wasmtime

Topic: wasmtime / issue #4442 support dynamic function calls in ...


view this post on Zulip Wasmtime GitHub notifications bot (Jul 13 2022 at 17:19):

dicej commented on issue #4442:

@alexcrichton @jameysharp This addresses #4310. I plan to push one more commit to add "sad path" tests, but this should be complete otherwise.

Please pay particular attention to the unsafe block in Func::call -- I copied it from TypedFunc::call_raw and modified it, but I want to make sure I did that correctly.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 13 2022 at 17:36):

github-actions[bot] commented on issue #4442:

Subscribe to Label Action

cc @peterhuene

<details>
This issue or pull request has been labeled: "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Jul 13 2022 at 19:50):

dicej commented on issue #4442:

I could use some advice about how to address the CI issue. I've added a new wasmtime-component-util crate with code that I wanted to share between wasmtime and wasmtime-component-macro. The publish script doesn't like that, though:

Caused by:
  no matching package named `wasmtime-component-util` found
  location searched: registry `crates-io`
  required by package `wasmtime-component-macro v0.40.0 (/home/runner/work/wasmtime/wasmtime/target/package/wasmtime-component-macro-0.40.0)`
thread 'main' panicked at 'failed to verify "crates/component-macro/Cargo.toml"', scripts/publish.rs:462:9

Is there a better way to reuse that code, or else some way to bootstrap the dependency and make publish happy?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 13 2022 at 20:02):

jameysharp commented on issue #4442:

I didn't know this, but there's a comment in publish.rs that says "note that this list must be topologically sorted by dependencies". Does it work if you swap the order of wasmtime-component-util and wasmtime-component-macro?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 13 2022 at 20:28):

dicej commented on issue #4442:

I didn't know this, but there's a comment in publish.rs that says "note that this list must be topologically sorted by dependencies". Does it work if you swap the order of wasmtime-component-util and wasmtime-component-macro?

Oh good call. I _did_ know that, but forgot about it. Will try that now.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 15:08):

dicej commented on issue #4442:

Thanks @alexcrichton and @jameysharp for your thoughtful and detailed feedback.

Alex: I agree with what you said about representing types and wanting to be able to get a type from a Val. For my first pass at this, I didn't have a Type enum -- I just used InterfaceType everywhere. That worked pretty well until the borrow checker made its verdict. I couldn't figure out how to pass a &ComponentTypes to Val::lower since it also takes a &mut StoreContextMut<T> since the former was borrowed from the latter. Now that I reflect on it, though -- I see I don't need to pass both of those things since I can always get the former from the latter "just in time" by reborrowing when needed. I'll give that another shot and see what happens. I'll also see what I can do about deduplicating Func::call and TypedFunc::call.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 15:21):

alexcrichton commented on issue #4442:

Hm actually I do think you'll want to pass around &ComponentTypes if you can. One trick I used with instances is that I took it out of the store temporarily, placing it back into the store when I was done with it during the function call.

I think this will want to be an argument because if an end user creates a Val they'll need to create it with a type for cases like a Record, so the end-user may create a foreign ComponentTypes. Hm maybe at least, I'm not sure. They'll always want to use types already registered within the component so perhaps not... Well in any case as long as it works!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 15:41):

dicej commented on issue #4442:

@alexcrichton Would you please point me to the code where you do that trick of pulling the ComponentTypes out of the store temporarily?

Also, here's a quick sketch of what I'm thinking. Please let me know what you think.

pub enum RecordIndex {
    Unit,
    Record(TypeRecordIndex),
    Tuple(TypeTupleIndex)
}

pub enum VariantIndex {
    Variant(TypeVariantIndex),
    Enum(TypeEnumIndex),
    Union(TypeUnionIndex),
    Option(TypeInterfaceIndex),
    Expected(TypeExpectedIndex)
}

pub enum Val {
    Bool(bool),
    S8(i8),
    U8(u8),
    S16(i16),
    U16(u16),
    S32(i32),
    U32(u32),
    S64(i64),
    U64(u64),
    Float32(u32),
    Float64(u64),
    Char(char),
    String(Rc<str>),
    List {
        element_type_index: TypeInterfaceIndex,
        values: Rc<[Val]>
    },
    Record {
        type_index: RecordIndex,
        fields: Rc<[Val]>
    },
    Variant {
        type_index: VariantIndex,
        discriminant: u32,
        value: Rc<Val>,
    },
    Flags {
        type_index: TypeFlagsIndex,
        count: u32,
        value: Rc<[u32]>,
    },
}

impl Val {
    pub fn ty() -> InterfaceType {
        ...
    }

    ...
}

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 15:44):

dicej commented on issue #4442:

Regarding the code sketch above: we'd need to add methods to ComponentTypes to create composite Vals, I guess.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 16:12):

alexcrichton commented on issue #4442:

That seems like a good start to me. I'd probably go ahead and use Box<T> everywhere instead of Rc<T> for Send/Sync reasons. We're already nowhere near "this is almost optimal perf" so I don't think there's much benefit to making it cheaply clone-able.

Otherwise though the type information will be a bit tricky since I would prefer to not simply reexport everything within wasmtime_environ::component::types. I think the wasmtime crate will need to have its own parallel type hierarchy with methods for us to audit, document, and maintain as part of an embedder API.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 18:22):

dicej commented on issue #4442:

@alexcrichton Can you elaborate on what you think the parallel type hierarchy in the wasmtime crate would look like? I'm having trouble imagining how to build that in such a way that it reuses ComponentTypes behind the scenes. I also wonder whether it would represent just despecialized types or the full range including specializations like unions, options, etc.

Here's my latest sketch:

pub struct TypeIndex(TypeInterfaceIndex);
pub struct RecordTypeIndex(TypeRecordIndex);
pub struct TupleTypeIndex(TypeTupleIndex);
pub struct VariantTypeIndex(TypeVariantIndex);
pub struct EnumTypeIndex(TypeEnumIndex);
...

pub enum AnyRecordTypeIndex {
    Unit,
    Record(RecordTypeIndex),
    Tuple(TupleTypeIndex)
}

pub enum AnyVariantTypeIndex {
    Variant(VariantTypeIndex),
    Enum(EnumTypeIndex),
    Union(UnionTypeIndex),
    Option(TypeIndex),
    Expected(ExpectedTypeIndex)
}

pub enum Type {
    Bool,
    S8,
    U8,
    S16,
    U16,
    S32,
    U32,
    S64,
    U64,
    Float32,
    Float64,
    Char,
    String,
    List(TypeIndex),
    Record(AnyRecordTypeIndex),
    Variant(AnyVariantTypeIndex),
    Flags(FlagTypeIndex),
}

impl Type {
    fn from(ty: &InterfaceType) -> Self {
        ...
    }
}

pub struct Field<'a> {
    name: &str,
    ty: Type,
}

pub struct Types(ComponentTypes);

impl Types {
    pub fn record_name(&self, index: RecordTypeIndex) -> &str {
        &self.0[index.0].name
    }

    pub fn record_fields(&self, index: RecordTypeIndex) -> impl Iterator<Field> {
        &self.0[index.0].fields.iter().map(|field| Field { name: &field.name, ty: Type::from(&field.ty) })
    }

    ...
}

pub enum Val {
    Bool(bool),
    S8(i8),
    U8(u8),
    S16(i16),
    U16(u16),
    S32(i32),
    U32(u32),
    S64(i64),
    U64(u64),
    Float32(u32),
    Float64(u64),
    Char(char),
    String(Box<str>),
    List {
        element_type: Type,
        values: Box<[Val]>
    },
    Record {
        ty: RecordType,
        fields: Box<[Val]>
    },
    Variant {
        ty: VariantType,
        discriminant: u32,
        value: Box<Val>,
    },
    Flags {
        ty: FlagsType,
        count: u32,
        value: Box<[u32]>,
    },
}

impl Val {
    pub fn ty(types: &Types) -> Type {
        ...
    }

    ...
}

The idea is to wrap the relevant wasmtime_environ index types with wasmtime equivalents, plus a Types type which wrap's wasmtime_environ's ComponentTypes and provide methods for getting names, fields, cases, etc. for various composite types. Lots of boilerplate, of course, but happy to do it if that's the right approach.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 18:35):

alexcrichton commented on issue #4442:

That looks along the lines of what I was thinking, but to be clear I haven't fully thought this through. I've got what I think should be the constraints in my head but said constraints don't always translate well to code... In any case what you've described captures all the main constraints I was thinking of. The actual ComponentTypes under the hood will be wrapped up in an Arc so we can "cheaply" clone that and move it around. One possible design tradeoff is to bundle the Arc into the Type itself. For example we could hide all the indices and instead have something liek:

enum Type {
    // ...
    Record(RecordType),
    // ...
}

struct RecordType {
    index: TypeRecordIndex,
    types: Arc<ComponentTypes>,
}

That's a whole lot of Arc cloning though and will lose perf very quickly. Another possible alternative is:

enum Type<'a> {
    // ...
    Record(RecordType<'a>),
    // ...
}

struct RecordType<'a> {
    index: TypeRecordIndex,
    types: &'a ComponentTypes,
}

but then getting that from a Val gets tricky. One possibility though could be:

enum Val {
    // ...
    Record(Record),
    // ...
}

struct Record {
    // ...
    ty: TypeRecordIndex,
    instance: Instance, // just an index
}

or something like that maybe?

I may be too naive also trying to prematurely optimize this. It might make the most sense to go with Type containing Arc<ComponentTypes> for now. I do think ideally we would hide all the indexes used under the hood if we can, but that's not necessarily required.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 14 2022 at 21:07):

dicej commented on issue #4442:

I've tried a few variations. Here's the latest:

pub struct Types<'a>(&'a ComponentTypes);

pub struct Case<'a> {
    name: &'a str,
    ty: Type,
}

#[derive(Copy, Clone)]
pub struct VariantIndex(TypeVariantIndex);

impl VariantIndex {
    pub fn name(self, types: Types) -> &str {
        &types.0[self.0].name
    }

    pub fn cases(self, types: Types) -> impl ExactSizeIterator<Case> {
        types.0[self.0].cases.iter().map(|case| Case {
            name: &case.name,
            ty: Type::from(&case.ty),
        })
    }
}

pub enum Type {
    // ...
    Variant(VariantIndex),
    // ...
}

impl Type {
    fn from(ty: &InterfaceType) -> Self {
        match ty {
            // ...
            InterfaceType::Variant(index) => Type::Variant(VariantIndex(*index)),
            // ...
        }
    }
}

pub enum Val {
    // ...
    Variant(Variant),
    // ...
}

impl Val {
    pub fn ty(&self) -> Type {
        match self {
            // ...
            Self::Variant { ty, .. } => Type::Variant(ty),
            // ...
        }
    }

    // ...
}

pub struct Variant {
    ty: VariantIndex,
    discriminant: u32,
    value: Box<Val>,
}

impl Variant {
    pub fn try_new(
        ty: VariantIndex,
        types: Types,
        discriminant: u32,
        value: Box<Val>,
    ) -> Result<Self> {
        // ... (here we would verify that `discriminant` and `value` match `ty` and throw an error otherwise)

        Ok(Self {
            ty,
            discriminant,
            value,
        })
    }
}

impl Func {
    pub fn types(&self, store: impl AsContext<'a>) -> Types<'a> {
        Types(&store.as_context()[self.0].types)
    }

    pub fn get_param_type(&self, store: impl AsContext, index: u32) -> Type {
        let data = &store.as_context()[self.0];
        Type::from(data.types[data.ty].params[index])
    }

    pub fn get_result_type(&self, store: impl AsContext) -> Type {
        let data = &store.as_context()[self.0];
        Type::from(data.types[data.ty].result)
    }

    // ...
}

#[test]
fn foo() -> Result<()> {
    let engine = super::engine();
    let mut store = Store::new(&engine, ());
    let component = Component::new(
        &engine,
        make_echo_component_with_params(
            r#"(variant (case "A" u32) (case "B" s32))"#,
            &[super::Type::I32, super::Type::I32],
        ),
    )?;
    let instance = Linker::new(&engine).instantiate(&mut store, &component)?;
    let func = instance.get_func(&mut store, "echo").unwrap();
    let ty = func.get_param_type(&store, 0);

    if let Type::Variant(index) = ty {
        let input = [Val::Variant(Variant::try_new(ty, func.types(&store))?)];
        let output = func.call(&mut store, &input)?;

        assert_eq!(input[0], output)
    } else {
        unreachable!()
    }

    Ok(())
}

The ergonomics aren't outstanding, but the efficiency is pretty good. My rationale for still exposing indexes is that Wasm itself uses indexes everywhere, so the paradigm will at least be familiar to users.

Happy to just use an Arc<ComponentTypes> everywhere if we want to prioritize ergonomics, though.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 15 2022 at 14:35):

dicej commented on issue #4442:

@alexcrichton Any thoughts on what I posted above? I want to make sure we agree on the design of Val and Type before I start rewriting things.

Meanwhile, I'll start working on #4307 and come back to this once we have consensus.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 15 2022 at 18:39):

alexcrichton commented on issue #4442:

Apologies for the delay, busy day yesterday and this morning! That looks good to me though. Some things may be tweaked over time but I think that's good to land.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 18 2022 at 16:23):

dicej commented on issue #4442:

I started implementing this according to what I proposed above, but now I'm having second thoughts. At first, I thought the natural place to do type checking was when creating the Val. For example:

impl Variant {
    pub fn try_new(
        ty: VariantIndex, // just a wrapper around a TypeVariantIndex
        types: Types,  // just a wrapper around a &ComponentTypes
        discriminant: u32,
        value: Box<Val>,
    ) -> Result<Self> {
        // ... (here we would verify that `discriminant` and `value` match `ty` and throw an error otherwise)

        Ok(Self {
            ty,
            discriminant,
            value,
        })
    }
}

However, that has a big hole: an embedding programmer could accidentally create a Val using a Types from one component instance and then use it with a different one. Since Val only has indexes, there's nothing to tie it to a specific component instance. They might even get confused and try to build new Vals which contain other Vals from other component instances.

Also, my plan was to provide accessor methods for e.g. getting the cases for a Type, such that each accessor would take a Types parameter. That would put the onus on the embedding programmer to make sure to always pass the correct Types instance, which I could imagine might get confusing when wrangling many component instances at once.

So now I think we're better off with @alexcrichton's suggestion of either embedding an Arc<ComponentTypes> or &ComponentTypes inside each relevant Type' variant alongside the index. Each corresponding Val variant would do the same. We'd still need to be careful to prevent the programmer from building Vals which contain Vals from other component instances, and also check once more in Func::call that the parameters all reference the correct ComponentTypes, but at least that's quick and easy to do.

Another crazy idea would be to borrow a trick from GhostCell and tag Vals and Types with a synthetic lifetime which uniquely identifies the component instance they may be used with. That would probably be optimal from a performance standpoint, as well as catch issues at compile time instead of runtime, but perhaps it's a bit too exotic for the problem at hand?

One final thought: @alexcrichton how strongly do you feel that one should be able to get a Type from a Val? I ask because I think there are ergonomic advantages to _not_ tying a Val to a specific type. It certainly makes using Func::call easier -- just pass a &[Val] in without needing to first ask the Func for its parameter Types and match against each of them. It could also be nice to use the output of a Func::call from one component instance as the input to a Func::call in another instance without any explicit conversion.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 18 2022 at 21:28):

alexcrichton commented on issue #4442:

Hm yes that is indeed a problem! FWIW this is solved at the store level with the store_id field in Stored<T>, basically a solution for ABA problem (ish). We could do something similar here where each component is assigned a unique index at runtime to prevent mixing types (although we would still want to sort of support equality of types across components which could perhaps get quite nasty quite quickly, unsure).

Otherwise though storing Arc<T> would also be a good solution. I don't know a ton about GhostCell myself but when I skimmed it in the past it looked like it wouldn't ever really lend itself well to an ergonomic API. I haven't tried using it historically though.

I dunno how to feel about getting a Type from a Val. I feel like it should be possible as a primitive operation, but I'm not exactly certain in this opinion. The current iteration where a record is just a list of Val also feels a bit odd to me in that there's no correlation with field names at construction to when the value is used, which seems like it may accidentally be easy to shoot yourself in the foot by mixing up field orders or things like that.

Personally I'm trying to stay analgous to the core wasm API, but I realize how it's a lot easier there than it is here. I don't really know the best answer to any of these questions. Our main use case for this is fuzzing at this point where getting a Type from Val may not matter too much.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 21 2022 at 04:14):

dicej commented on issue #4442:

I've overhauled the code according to the above feedback, and I decided to squash and rebase since so much was changing anyway. Please let me know what you think.

Highlights to note:

view this post on Zulip Wasmtime GitHub notifications bot (Jul 21 2022 at 23:13):

dicej commented on issue #4442:

Thanks so much for the thorough review, @alexcrichton. I've addressed some of the issues you raised (and marked the corresponding conversations "resolved") and will look at the others tomorrow.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 22 2022 at 18:37):

dicej commented on issue #4442:

@alexcrichton I believe I've addressed everything you mentioned. I'm still unsure about how Type equality should work, but I'm hoping the current implementation is good enough to start with and can be refined later.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 25 2022 at 17:51):

dicej commented on issue #4442:

@alexcrichton I've addressed your latest feedback and opened https://github.com/bytecodealliance/wasmtime/issues/4522.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 25 2022 at 18:38):

alexcrichton commented on issue #4442:

Thanks again for this @dicej!


Last updated: Dec 23 2024 at 12:05 UTC