dicej commented on issue #4442:
@alexcrichton @jameysharp This addresses #4310. I plan to push one more commit to add "sad path" tests, but this should be complete otherwise.
Please pay particular attention to the
unsafe
block inFunc::call
-- I copied it fromTypedFunc::call_raw
and modified it, but I want to make sure I did that correctly.
github-actions[bot] commented on issue #4442:
Subscribe to Label Action
cc @peterhuene
<details>
This issue or pull request has been labeled: "wasmtime:api"Thus the following users have been cc'd because of the following labels:
- peterhuene: wasmtime:api
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
dicej commented on issue #4442:
I could use some advice about how to address the CI issue. I've added a new
wasmtime-component-util
crate with code that I wanted to share betweenwasmtime
andwasmtime-component-macro
. Thepublish
script doesn't like that, though:Caused by: no matching package named `wasmtime-component-util` found location searched: registry `crates-io` required by package `wasmtime-component-macro v0.40.0 (/home/runner/work/wasmtime/wasmtime/target/package/wasmtime-component-macro-0.40.0)` thread 'main' panicked at 'failed to verify "crates/component-macro/Cargo.toml"', scripts/publish.rs:462:9
Is there a better way to reuse that code, or else some way to bootstrap the dependency and make
publish
happy?
jameysharp commented on issue #4442:
I didn't know this, but there's a comment in
publish.rs
that says "note that this list must be topologically sorted by dependencies". Does it work if you swap the order ofwasmtime-component-util
andwasmtime-component-macro
?
dicej commented on issue #4442:
I didn't know this, but there's a comment in
publish.rs
that says "note that this list must be topologically sorted by dependencies". Does it work if you swap the order ofwasmtime-component-util
andwasmtime-component-macro
?Oh good call. I _did_ know that, but forgot about it. Will try that now.
dicej commented on issue #4442:
Thanks @alexcrichton and @jameysharp for your thoughtful and detailed feedback.
Alex: I agree with what you said about representing types and wanting to be able to get a type from a
Val
. For my first pass at this, I didn't have aType
enum -- I just usedInterfaceType
everywhere. That worked pretty well until the borrow checker made its verdict. I couldn't figure out how to pass a&ComponentTypes
toVal::lower
since it also takes a&mut StoreContextMut<T>
since the former was borrowed from the latter. Now that I reflect on it, though -- I see I don't need to pass both of those things since I can always get the former from the latter "just in time" by reborrowing when needed. I'll give that another shot and see what happens. I'll also see what I can do about deduplicatingFunc::call
andTypedFunc::call
.
alexcrichton commented on issue #4442:
Hm actually I do think you'll want to pass around
&ComponentTypes
if you can. One trick I used with instances is that I took it out of the store temporarily, placing it back into the store when I was done with it during the function call.I think this will want to be an argument because if an end user creates a
Val
they'll need to create it with a type for cases like aRecord
, so the end-user may create a foreignComponentTypes
. Hm maybe at least, I'm not sure. They'll always want to use types already registered within the component so perhaps not... Well in any case as long as it works!
dicej commented on issue #4442:
@alexcrichton Would you please point me to the code where you do that trick of pulling the ComponentTypes out of the store temporarily?
Also, here's a quick sketch of what I'm thinking. Please let me know what you think.
pub enum RecordIndex { Unit, Record(TypeRecordIndex), Tuple(TypeTupleIndex) } pub enum VariantIndex { Variant(TypeVariantIndex), Enum(TypeEnumIndex), Union(TypeUnionIndex), Option(TypeInterfaceIndex), Expected(TypeExpectedIndex) } pub enum Val { Bool(bool), S8(i8), U8(u8), S16(i16), U16(u16), S32(i32), U32(u32), S64(i64), U64(u64), Float32(u32), Float64(u64), Char(char), String(Rc<str>), List { element_type_index: TypeInterfaceIndex, values: Rc<[Val]> }, Record { type_index: RecordIndex, fields: Rc<[Val]> }, Variant { type_index: VariantIndex, discriminant: u32, value: Rc<Val>, }, Flags { type_index: TypeFlagsIndex, count: u32, value: Rc<[u32]>, }, } impl Val { pub fn ty() -> InterfaceType { ... } ... }
dicej commented on issue #4442:
Regarding the code sketch above: we'd need to add methods to
ComponentTypes
to create compositeVal
s, I guess.
alexcrichton commented on issue #4442:
That seems like a good start to me. I'd probably go ahead and use
Box<T>
everywhere instead ofRc<T>
forSend
/Sync
reasons. We're already nowhere near "this is almost optimal perf" so I don't think there's much benefit to making it cheaply clone-able.Otherwise though the type information will be a bit tricky since I would prefer to not simply reexport everything within
wasmtime_environ::component::types
. I think thewasmtime
crate will need to have its own parallel type hierarchy with methods for us to audit, document, and maintain as part of an embedder API.
dicej commented on issue #4442:
@alexcrichton Can you elaborate on what you think the parallel type hierarchy in the
wasmtime
crate would look like? I'm having trouble imagining how to build that in such a way that it reusesComponentTypes
behind the scenes. I also wonder whether it would represent just despecialized types or the full range including specializations like unions, options, etc.Here's my latest sketch:
pub struct TypeIndex(TypeInterfaceIndex); pub struct RecordTypeIndex(TypeRecordIndex); pub struct TupleTypeIndex(TypeTupleIndex); pub struct VariantTypeIndex(TypeVariantIndex); pub struct EnumTypeIndex(TypeEnumIndex); ... pub enum AnyRecordTypeIndex { Unit, Record(RecordTypeIndex), Tuple(TupleTypeIndex) } pub enum AnyVariantTypeIndex { Variant(VariantTypeIndex), Enum(EnumTypeIndex), Union(UnionTypeIndex), Option(TypeIndex), Expected(ExpectedTypeIndex) } pub enum Type { Bool, S8, U8, S16, U16, S32, U32, S64, U64, Float32, Float64, Char, String, List(TypeIndex), Record(AnyRecordTypeIndex), Variant(AnyVariantTypeIndex), Flags(FlagTypeIndex), } impl Type { fn from(ty: &InterfaceType) -> Self { ... } } pub struct Field<'a> { name: &str, ty: Type, } pub struct Types(ComponentTypes); impl Types { pub fn record_name(&self, index: RecordTypeIndex) -> &str { &self.0[index.0].name } pub fn record_fields(&self, index: RecordTypeIndex) -> impl Iterator<Field> { &self.0[index.0].fields.iter().map(|field| Field { name: &field.name, ty: Type::from(&field.ty) }) } ... } pub enum Val { Bool(bool), S8(i8), U8(u8), S16(i16), U16(u16), S32(i32), U32(u32), S64(i64), U64(u64), Float32(u32), Float64(u64), Char(char), String(Box<str>), List { element_type: Type, values: Box<[Val]> }, Record { ty: RecordType, fields: Box<[Val]> }, Variant { ty: VariantType, discriminant: u32, value: Box<Val>, }, Flags { ty: FlagsType, count: u32, value: Box<[u32]>, }, } impl Val { pub fn ty(types: &Types) -> Type { ... } ... }
The idea is to wrap the relevant
wasmtime_environ
index types withwasmtime
equivalents, plus aTypes
type which wrap'swasmtime_environ
'sComponentTypes
and provide methods for getting names, fields, cases, etc. for various composite types. Lots of boilerplate, of course, but happy to do it if that's the right approach.
alexcrichton commented on issue #4442:
That looks along the lines of what I was thinking, but to be clear I haven't fully thought this through. I've got what I think should be the constraints in my head but said constraints don't always translate well to code... In any case what you've described captures all the main constraints I was thinking of. The actual
ComponentTypes
under the hood will be wrapped up in anArc
so we can "cheaply" clone that and move it around. One possible design tradeoff is to bundle theArc
into theType
itself. For example we could hide all the indices and instead have something liek:enum Type { // ... Record(RecordType), // ... } struct RecordType { index: TypeRecordIndex, types: Arc<ComponentTypes>, }
That's a whole lot of
Arc
cloning though and will lose perf very quickly. Another possible alternative is:enum Type<'a> { // ... Record(RecordType<'a>), // ... } struct RecordType<'a> { index: TypeRecordIndex, types: &'a ComponentTypes, }
but then getting that from a
Val
gets tricky. One possibility though could be:enum Val { // ... Record(Record), // ... } struct Record { // ... ty: TypeRecordIndex, instance: Instance, // just an index }
or something like that maybe?
I may be too naive also trying to prematurely optimize this. It might make the most sense to go with
Type
containingArc<ComponentTypes>
for now. I do think ideally we would hide all the indexes used under the hood if we can, but that's not necessarily required.
dicej commented on issue #4442:
I've tried a few variations. Here's the latest:
pub struct Types<'a>(&'a ComponentTypes); pub struct Case<'a> { name: &'a str, ty: Type, } #[derive(Copy, Clone)] pub struct VariantIndex(TypeVariantIndex); impl VariantIndex { pub fn name(self, types: Types) -> &str { &types.0[self.0].name } pub fn cases(self, types: Types) -> impl ExactSizeIterator<Case> { types.0[self.0].cases.iter().map(|case| Case { name: &case.name, ty: Type::from(&case.ty), }) } } pub enum Type { // ... Variant(VariantIndex), // ... } impl Type { fn from(ty: &InterfaceType) -> Self { match ty { // ... InterfaceType::Variant(index) => Type::Variant(VariantIndex(*index)), // ... } } } pub enum Val { // ... Variant(Variant), // ... } impl Val { pub fn ty(&self) -> Type { match self { // ... Self::Variant { ty, .. } => Type::Variant(ty), // ... } } // ... } pub struct Variant { ty: VariantIndex, discriminant: u32, value: Box<Val>, } impl Variant { pub fn try_new( ty: VariantIndex, types: Types, discriminant: u32, value: Box<Val>, ) -> Result<Self> { // ... (here we would verify that `discriminant` and `value` match `ty` and throw an error otherwise) Ok(Self { ty, discriminant, value, }) } } impl Func { pub fn types(&self, store: impl AsContext<'a>) -> Types<'a> { Types(&store.as_context()[self.0].types) } pub fn get_param_type(&self, store: impl AsContext, index: u32) -> Type { let data = &store.as_context()[self.0]; Type::from(data.types[data.ty].params[index]) } pub fn get_result_type(&self, store: impl AsContext) -> Type { let data = &store.as_context()[self.0]; Type::from(data.types[data.ty].result) } // ... } #[test] fn foo() -> Result<()> { let engine = super::engine(); let mut store = Store::new(&engine, ()); let component = Component::new( &engine, make_echo_component_with_params( r#"(variant (case "A" u32) (case "B" s32))"#, &[super::Type::I32, super::Type::I32], ), )?; let instance = Linker::new(&engine).instantiate(&mut store, &component)?; let func = instance.get_func(&mut store, "echo").unwrap(); let ty = func.get_param_type(&store, 0); if let Type::Variant(index) = ty { let input = [Val::Variant(Variant::try_new(ty, func.types(&store))?)]; let output = func.call(&mut store, &input)?; assert_eq!(input[0], output) } else { unreachable!() } Ok(()) }
The ergonomics aren't outstanding, but the efficiency is pretty good. My rationale for still exposing indexes is that Wasm itself uses indexes everywhere, so the paradigm will at least be familiar to users.
Happy to just use an
Arc<ComponentTypes>
everywhere if we want to prioritize ergonomics, though.
dicej commented on issue #4442:
@alexcrichton Any thoughts on what I posted above? I want to make sure we agree on the design of
Val
andType
before I start rewriting things.Meanwhile, I'll start working on #4307 and come back to this once we have consensus.
alexcrichton commented on issue #4442:
Apologies for the delay, busy day yesterday and this morning! That looks good to me though. Some things may be tweaked over time but I think that's good to land.
dicej commented on issue #4442:
I started implementing this according to what I proposed above, but now I'm having second thoughts. At first, I thought the natural place to do type checking was when creating the
Val
. For example:impl Variant { pub fn try_new( ty: VariantIndex, // just a wrapper around a TypeVariantIndex types: Types, // just a wrapper around a &ComponentTypes discriminant: u32, value: Box<Val>, ) -> Result<Self> { // ... (here we would verify that `discriminant` and `value` match `ty` and throw an error otherwise) Ok(Self { ty, discriminant, value, }) } }
However, that has a big hole: an embedding programmer could accidentally create a
Val
using aTypes
from one component instance and then use it with a different one. SinceVal
only has indexes, there's nothing to tie it to a specific component instance. They might even get confused and try to build newVal
s which contain otherVal
s from other component instances.Also, my plan was to provide accessor methods for e.g. getting the cases for a
Type
, such that each accessor would take aTypes
parameter. That would put the onus on the embedding programmer to make sure to always pass the correctTypes
instance, which I could imagine might get confusing when wrangling many component instances at once.So now I think we're better off with @alexcrichton's suggestion of either embedding an
Arc<ComponentTypes>
or&ComponentTypes
inside each relevantType
' variant alongside the index. Each correspondingVal
variant would do the same. We'd still need to be careful to prevent the programmer from buildingVal
s which containVal
s from other component instances, and also check once more inFunc::call
that the parameters all reference the correctComponentTypes
, but at least that's quick and easy to do.Another crazy idea would be to borrow a trick from GhostCell and tag
Val
s andType
s with a synthetic lifetime which uniquely identifies the component instance they may be used with. That would probably be optimal from a performance standpoint, as well as catch issues at compile time instead of runtime, but perhaps it's a bit too exotic for the problem at hand?One final thought: @alexcrichton how strongly do you feel that one should be able to get a
Type
from aVal
? I ask because I think there are ergonomic advantages to _not_ tying aVal
to a specific type. It certainly makes usingFunc::call
easier -- just pass a &[Val] in without needing to first ask theFunc
for its parameterType
s and match against each of them. It could also be nice to use the output of aFunc::call
from one component instance as the input to aFunc::call
in another instance without any explicit conversion.
alexcrichton commented on issue #4442:
Hm yes that is indeed a problem! FWIW this is solved at the store level with the
store_id
field inStored<T>
, basically a solution for ABA problem (ish). We could do something similar here where each component is assigned a unique index at runtime to prevent mixing types (although we would still want to sort of support equality of types across components which could perhaps get quite nasty quite quickly, unsure).Otherwise though storing
Arc<T>
would also be a good solution. I don't know a ton aboutGhostCell
myself but when I skimmed it in the past it looked like it wouldn't ever really lend itself well to an ergonomic API. I haven't tried using it historically though.I dunno how to feel about getting a
Type
from aVal
. I feel like it should be possible as a primitive operation, but I'm not exactly certain in this opinion. The current iteration where a record is just a list ofVal
also feels a bit odd to me in that there's no correlation with field names at construction to when the value is used, which seems like it may accidentally be easy to shoot yourself in the foot by mixing up field orders or things like that.Personally I'm trying to stay analgous to the core wasm API, but I realize how it's a lot easier there than it is here. I don't really know the best answer to any of these questions. Our main use case for this is fuzzing at this point where getting a
Type
fromVal
may not matter too much.
dicej commented on issue #4442:
I've overhauled the code according to the above feedback, and I decided to squash and rebase since so much was changing anyway. Please let me know what you think.
Highlights to note:
component::types::Type
is public, andVal
now has aty
method to get the type of that value as aType
. This should also be suitable for use withComponent::exports
andComponent::imports
when the time comes to implement those.- I manged to reuse
call_raw
so there's minimal duplication inFunc::call
and no newunsafe
code (but note thatcall_raw
has moved toFunc
along withpost_return
).- Type checking now mostly happens during
Val
creation, with one final, quick, linear-time check happening inFunc::call
.- The sad path tests are now checking for specific errors.
dicej commented on issue #4442:
Thanks so much for the thorough review, @alexcrichton. I've addressed some of the issues you raised (and marked the corresponding conversations "resolved") and will look at the others tomorrow.
dicej commented on issue #4442:
@alexcrichton I believe I've addressed everything you mentioned. I'm still unsure about how
Type
equality should work, but I'm hoping the current implementation is good enough to start with and can be refined later.
dicej commented on issue #4442:
@alexcrichton I've addressed your latest feedback and opened https://github.com/bytecodealliance/wasmtime/issues/4522.
alexcrichton commented on issue #4442:
Thanks again for this @dicej!
Last updated: Dec 23 2024 at 12:05 UTC