Stream: general

Topic: adding component model support to wasmi


view this post on Zulip monkeyontheloose (Jul 10 2023 at 11:14):

gm, im wondering how hard it would be and what would be the best steps to add component model support to wasmi,
hope this is the right place to ask this question :pray:

Zooming out I'm trying to use this lib https://github.com/DelphinusLab/zkWasm which uses wasmi for execution traces, unfortunately wasmi doesn't support component model yet

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 16:08):

I don't know whether wasmi directly interprets wasm opcodes or translates those to its own IR that it then interprets. the approach taken might vary depending on that.

in general, first you'd need support for parsing the text format and decoding the binary format. I think wasmi uses our wasm-tools crates, so if that is true then this first step is already complete.

at the runtime level, you'd need to either interpret the lifting and lowering instructions or translate those instructions to wasm opcodes or the internal IR if it exists and then interpret that. this is where the bulk of the effort will be.

maybe @Alex Crichton or others can add more details here / correct anything I misrepresented.

view this post on Zulip Alex Crichton (Jul 11 2023 at 16:29):

I would agree that the hard part here is likely going to be the lifting/lowering and all of those semantics. I've found the representation of everything to be quite difficult to pin down in Wasmtime but that's also because I'm trying to be careful about allocations/speed/etc and less-optimized solutions which are simpler to understand are probably more in the wheelhouse of wasmi which may make that part much easier

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 16:42):

right, all the work we do in FACT is basically something that an interpreter could skip

view this post on Zulip monkeyontheloose (Jul 11 2023 at 16:53):

  1. how many hours of work would this take (ballpark)?
  2. the zkWasm app uses wasmi to create execution traces which it then uses to create proofs, just to be sure, you can't use wasmtime which already supports CM for this, right?
  3. lifting and lowering instructions are new wasm opcodes?
  4. could you pls refer me to links that might be useful in understanding the work that needs to be done?

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 17:31):

all the relevant documents are linked from the README here: https://github.com/WebAssembly/component-model/

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 17:32):

I'd estimate a month of work for someone who is familiar with wasmi and wasm

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 17:33):

the lifting and lowering instructions are at the component model level, not core wasm. they can't be interspersed with regular wasm opcodes. they statically appear within a different context.

view this post on Zulip fitzgen (he/him) (Jul 11 2023 at 17:35):

you might be able to pre-process the wasm before running it to insert instrumentation to create execution traces and then run the instrumented wasm in your wasmtime embedding that provides the hooks that the aforementioned instrumentation relies upon. I don't really know that particular domain and what kind of traces it is taking so I can't really say more than that.

view this post on Zulip Notification Bot (Jul 12 2023 at 22:16):

Michelle Thalakottur has marked this topic as resolved.

view this post on Zulip Notification Bot (Jul 12 2023 at 22:16):

Michelle Thalakottur has marked this topic as unresolved.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:12):

this is also in my area of interest -- we're using wasmi (and have had to hand-roll an ABI) -- and I've .. struggled to understand obligations on the host and guest from the linked docs

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:13):

is it correct to understand that the "canonical ABI" is part of what's going to be standardized, or is the CM like .. parameterized by multiple possible ABIs, or something?

view this post on Zulip Dan Gohman (Jul 20 2023 at 17:19):

The CM is designed to support many possible ABIs. And, the Canonical ABI is a particular ABI that is being developed to be standardized.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:33):

hmm .. ok .. that's unfortunately even less to go on. I'll keep reading! but like .. just to provide feedback as a reader: I understand the wasm 1.0 model I think fairly well at this point, and I understand how wasmi implements it / am basically comfortable with the wasmi codebase, and .. reading the above docs (which I've attempted to read several times in the past ~12 months) brings me very little clarity about what I, or the wasmi maintainer, would need to do to the wasmi codebase to make it "support the CM". like it's not at all clear where the boundaries of responsibilities are between (say) an interpreter codebase, the embedding environment using the interpreter, guest code (wasm, wit or guest source-language) that users see when working in their modules, guest toolchain features (eg. rustc features), code assumed (in perpetuity, by design) to be generated by extra tools, and polyfills.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:34):

I get that there's a set of answers to that! but it is sure not something a non-system-designer-reader can parse from those docs, much less find a specification of.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:38):

I can't even really figure out the dependency graph of specs, like what other post-1.0 extensions (or parts of them) it depends on a runtime having implemented (which I suspect are "more than none" and might well be "more than wasmi supports", i.e. it might be quite a ways before the starting line for "supporting the CM")

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:39):

At least personally I agree that the docs right now could be better, and CanonicalABI.md is a bit dense approaching it from nothing. That being said all this is still in-development and not "finished" so to some degree this is expected. Not to say we couldn't do better!

From a wasmi perspective I would recommend considering CM support as roughly analogous to core wasm support. Wasmi presumably supports loading a binary-encoded wasm module and doing things with it. The component model at that high layer is the same way, you're given a binary and you enable doing things with it. What can be done is primarily different through a different set of types of values at runtime and through a different "shape" of a component (e.g. it does more than export a flat list of functions but can export bags of functions through instances and such)

view this post on Zulip Robin Brown (Jul 20 2023 at 17:39):

At the moment, the Canonical ABI is the only ABI you can use. It is parametric in that it can be configured using Canonical ABI options (canonopts) when lifting module exports to the Component level or lowering Component imports to the module level. This involves things like specifying the encoding being used for strings, the memory and allocator to use, etc.
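
A minimal sketch of how a runtime might carry these options around, assuming a Python model of the runtime. `CanonOpts` and `check_opts` are hypothetical names, not part of any spec or of wasmi; only the three encoding names and the four-argument `realloc` shape come from the Canonical ABI. The sketch conservatively requires both a memory and a realloc whenever strings cross the boundary, which is stricter than the real rules.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Encodings named in the Canonical ABI; everything else here is illustrative.
STRING_ENCODINGS = ("utf8", "utf16", "latin1+utf16")


@dataclass
class CanonOpts:
    """Hypothetical container for the options attached to one canon lift/lower."""
    string_encoding: str = "utf8"
    memory: Optional[bytearray] = None  # linear memory the runtime reads/writes
    # realloc(old_ptr, old_size, align, new_size) -> new_ptr, exported by the guest
    realloc: Optional[Callable[[int, int, int, int], int]] = None


def check_opts(opts: CanonOpts, passes_strings: bool) -> None:
    """Reject option sets that cannot support the function's types."""
    if opts.string_encoding not in STRING_ENCODINGS:
        raise ValueError(f"unknown string encoding: {opts.string_encoding}")
    # Assumption: this sketch demands both options whenever strings are involved.
    if passes_strings and opts.memory is None:
        raise ValueError("strings cross the boundary, so a memory is required")
    if passes_strings and opts.realloc is None:
        raise ValueError("the runtime allocates in the guest, so realloc is required")


# A utf16 guest with a 64-byte memory and a dummy allocator passes the check.
check_opts(CanonOpts("utf16", bytearray(64), lambda old, size, align, new: 0), True)
```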

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:39):

The CM does not depend on any core wasm features beyond the MVP, so you're safe in that regard

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:41):

One thing you might find helpful is to explore examples, which I might recommend Wasmtime's test suite for. There's tests/misc_testsuite/component-model/*.wast which has a lot of components of various shapes and sizes. There's also the wit-bindgen test suite which builds a bunch of components as part of its tests you can poke around with as well. For the "poking" I'd recommend the wasm-tools CLI since it's the only one I'm aware of with component model support

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:45):

it depends, surely, on extern refs, no? (luckily wasmi does seem to support those)

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:47):

If it helps, some more details on the ABI business is that on one hand there's the component model functions, aka "this function returns a string". On the other hand there's core wasm functions, aka "this function returns an integer". These two concepts are bridged through "lifting" and "lowering" where you lift a core wasm function into a component function and then you can lower a component function into a core wasm function. For example a host provides a component function, the component lowers it, then a core wasm inside the component imports it. Or alternatively a core wasm in a component exports a core wasm function, then a component lifts that, then a host calls it.

The lifting/lowering operations are currently only defined in the context of the "canonical ABI" which you see as canon lower and canon lift. This dictates exact ABI details such as what integer means what, how to handle many arguments, many returns, where do strings live, how do things get allocated, all that stuff. This is the main body of CanonicalABI.md. You might also find it useful to explore the canonical ABI through *.wit files and generated bindings code through wit-bindgen. For example this WIT file:

package my:example

world my-world {
    import foo: func() -> string
}

you can generate Rust bindings with wit-bindgen rust foo.wit and see what's generated to see how the ABI details there work.
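
To make "what integer means what" a bit more concrete, here is a rough sketch of the signature-flattening step, following my reading of CanonicalABI.md around the time of this thread. The type table is deliberately tiny, the type names are plain strings chosen for illustration, and `flatten_functype` here is a simplified stand-in for the spec's function of the same name, not a copy of it.

```python
MAX_FLAT_PARAMS = 16
MAX_FLAT_RESULTS = 1

# Core wasm types each component-level type flattens to (small subset only).
FLAT = {
    "bool": ["i32"],
    "u32": ["i32"],
    "u64": ["i64"],
    "string": ["i32", "i32"],  # pointer into linear memory, length
}


def flatten_types(types):
    return [core for ty in types for core in FLAT[ty]]


def flatten_functype(params, results, context):
    """Return (core params, core results) for a lifted or lowered function."""
    flat_params = flatten_types(params)
    flat_results = flatten_types(results)
    if len(flat_params) > MAX_FLAT_PARAMS:
        flat_params = ["i32"]  # too many: pass one pointer to a tuple in memory
    if len(flat_results) > MAX_FLAT_RESULTS:
        if context == "lift":
            flat_results = ["i32"]  # the core export returns a pointer
        else:  # "lower": the caller passes in a pointer for the results
            flat_params = flat_params + ["i32"]
            flat_results = []
    return flat_params, flat_results


# The imported `foo: func() -> string` from the WIT above, as core wasm sees it:
print(flatten_functype([], ["string"], "lower"))  # (['i32'], [])
```

So under these assumptions the guest imports `foo` as a function taking one `i32` (where to write the pointer/length pair) and returning nothing.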

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:48):

> it depends, surely, on extern refs, no?

The component model does not, no. If you're thinking that resources are connected to externrefs they're similar but not the same. Resources when lowered are an index into a component-specific table, which means that core wasm always sees resources as integers (think file descriptors)
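
A minimal sketch of the "index into a component-specific table" idea, assuming a Python model of the runtime. `HandleTable` is a hypothetical name, and the real rules (ownership, borrowing, which indices are valid) are more involved than this; the point is only that core wasm holds a plain integer while the runtime holds the object.

```python
class HandleTable:
    """Per-component table mapping small integers to host-side resources."""

    _EMPTY = object()  # marks a freed slot

    def __init__(self):
        self._slots = []
        self._free = []

    def insert(self, resource) -> int:
        """Store a resource and return the integer that core wasm will see."""
        if self._free:
            index = self._free.pop()
            self._slots[index] = resource
        else:
            index = len(self._slots)
            self._slots.append(resource)
        return index

    def get(self, index: int):
        if not 0 <= index < len(self._slots) or self._slots[index] is self._EMPTY:
            raise LookupError(f"invalid handle {index}")  # a real runtime would trap
        return self._slots[index]

    def remove(self, index: int):
        resource = self.get(index)
        self._slots[index] = self._EMPTY
        self._free.append(index)
        return resource


table = HandleTable()
handle = table.insert({"path": "/tmp/log"})  # some host object, like an open file
print(handle)  # 0
```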

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:49):

hmm so ok very basic baby question (which, apologies for not being able to parse out of the docs): does every module, as a wasm blob, contain its _own_ copy of lifting/lowering functions? or does the runtime provide some common set?

view this post on Zulip Lann Martin (Jul 20 2023 at 17:52):

On the guest (core wasm) side, lifting and lowering code is generated by wit-bindgen for the guest language. On the wasmtime host side it's mostly runtime with some macro generation to make the interfaces nicer

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:53):

lifting/lowering is represented as "lift this core wasm function" or "lower this component function", so it's a function without a body sort of where the "body" is implied by the canonical ABI

view this post on Zulip Alex Crichton (Jul 20 2023 at 17:53):

so in that sense I suppose you can think of it as runtimes provide lifting/lowering operations, and components specify what lifting/lowering they'll need

view this post on Zulip fitzgen (he/him) (Jul 20 2023 at 17:54):

it is the runtime's responsibility to do the lifting and lowering, and it is free to dedupe as much of them as it can

view this post on Zulip fitzgen (he/him) (Jul 20 2023 at 17:55):

(separately, the guest language might want to do another layer of massaging/translating/lifting/lowering from the canonical ABI into its data types, this is what @Lann Martin was getting at, I think. eg turn a (usize, usize) from the canonical ABI into a Box<str> in Rust or something like that)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:56):

> For example a host provides a component function, the component lowers it, then a core wasm inside the component imports it

I'd like to understand, in detail, all the actors involved in accomplishing this sentence. my host is a rust program, it has a wasm interpreter in it, it has rust types. currently it has a rust type that's the union of all possible core wasm types (i32/i64/f32/f64) and n-ary functions taking and returning N of those union types can be registered with the wasm interpreter and dispatched-to, using a little bit of rust type system fudging, using rust dyn fn objects and such. if I have a host function that has a structured component type .. I'm using .. some code generated by another tool that's VM specific? or non-VM-specific?

view this post on Zulip Lann Martin (Jul 20 2023 at 17:57):

One of the documentation pieces missing is a glossary :sweat_smile:

view this post on Zulip Graydon Hoare (Jul 20 2023 at 17:57):

> it is the runtime's responsibility to do the lifting and lowering, and it is free to dedupe as much of them as it can

Ok so .. there's like a list stapled to the module of functions that will need the runtime to provide lift/lower wrappers, the part of the runtime's job instantiating the module is to synthesize such wrappers?

view this post on Zulip Jamey Sharp (Jul 20 2023 at 17:59):

The list is part of the component, rather than in any core module inside the component, but yes, I think that's basically right.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:00):

(but those are lift/lower wrappers on the runtime's side, not the guest's side? if the guest uses ABI X and the runtime uses ABI Y, they're supposed to interoperate, right? so .. the guest isn't asking the runtime to synthesize lift/lower code for ABI X and inject it into the guest's module-space, is it?)

view this post on Zulip Lann Martin (Jul 20 2023 at 18:08):

I'm confused by the discussion of multiple ABIs. All of the component tooling currently being worked on assumes the Canonical ABI (which itself is parameterized in a couple of very limited ways).

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:14):

well, ok, setting aside "different ABIs" (above discussion wasn't clear on how much this can vary, apparently there are parameters and _maybe_ other ABIs in the future?), even just focusing on guest-vs-host, I want to understand who generated what code and whether they did so ahead of time, or at instantiation time. and if ahead of time, if it's at module-compilation time or host-and-VM compilation time.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:15):

wit-bindgen is going to spit out (a) some stuff that gets compiled-in to a guest module, but also (b) some stuff that gets compiled-in to a host-and-VM pair? and then there's some stuff (c) that the host-and-VM pair is supposed to synthesize on the fly when the component-and-module bundle shows up asking to be instantiated

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:16):

so I guess I'm wondering: is that correct? do (a), (b) and (c) all exist? if so I can I think ignore (a), need to teach wasmi to conform to the type signatures and expectations of (b), and need to teach wasmi to actually _do_ (c), unless it's provided by some standard crate in terms of (b)

view this post on Zulip Jamey Sharp (Jul 20 2023 at 18:19):

fwiw I think in the context of an interpreter you won't need to "synthesize" (c), just walk over the type information you get out of the component to transform between your host-side enum variants and the core wasm types expected by the guest
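
A rough sketch of that "walk over the type information" approach, assuming host values are ordinary Python values and types are either a primitive name or a `("record", [(field, type), ...])` tuple. That type representation and the name `lower_flat` are made up for illustration. Strings and lists are left out because they also need a memory and an allocator.

```python
def lower_flat(value, ty):
    """Turn one host-side value into a list of (core type, core value) pairs."""
    if ty == "bool":
        return [("i32", 1 if value else 0)]
    if ty == "u32":
        if not 0 <= value < 2**32:
            raise ValueError(f"{value} does not fit in u32")
        return [("i32", value)]
    if ty == "u64":
        if not 0 <= value < 2**64:
            raise ValueError(f"{value} does not fit in u64")
        return [("i64", value)]
    if ty == "char":
        return [("i32", ord(value))]  # a char travels as its code point
    if isinstance(ty, tuple) and ty[0] == "record":
        flat = []
        for name, field_ty in ty[1]:  # fields are flattened in declaration order
            flat += lower_flat(value[name], field_ty)
        return flat
    raise NotImplementedError(f"type not covered by this sketch: {ty!r}")


point = ("record", [("x", "u32"), ("y", "u32"), ("visible", "bool")])
print(lower_flat({"x": 3, "y": 4, "visible": True}, point))
# [('i32', 3), ('i32', 4), ('i32', 1)]
```

Nothing is generated ahead of time here: the interpreter just recurses over the type it read from the component each time a call crosses the boundary.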

view this post on Zulip Lann Martin (Jul 20 2023 at 18:24):

(b) is sort of true of wit-bindgen today but that's an internal detail of the current state of tooling; a runtime can get all of the type information it needs from the component binary itself

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:30):

ok but .. since the runtime needs to connect to a bunch of statically-typed embedder host functions .. I think the embedder probably wants to get a projection of the wit into a static type in the embedder language, no? or is the VM just supposed to make up a projection from the generalized component type system into type system entities in the host PL? (doing so would likely couple the embedder to a specific VM's choice of projection, much more so than currently since the current 1.0 type system is relatively simple to massage if you change VMs)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 18:32):

(a long time ago I worked on CORBA systems -- some number of people in the room now have all the blood draining out of their faces in horror but I will continue -- and we used to have tools generate "stubs and skeletons", the skeletons being static types and interfaces that host-side callbacks implement and then pass into the runtime to get called-back through. I'm not clear on whether that is assumed here.)

view this post on Zulip Jamey Sharp (Jul 20 2023 at 18:39):

my understanding is that if you specifically want static types in the embedder language then you want to use wit-bindgen to generate that glue, and that you need to teach wit-bindgen what that glue should look like for wasmi. it's also possible to introspect on an unknown component at runtime and offer the same kind of interface that you described wasmi having for core wasm ("a rust type that's the union of all possible core wasm types (i32/i64/f32/f64) and n-ary functions taking and returning N of those union types can be registered with the wasm interpreter and dispatched-to, using a little bit of rust type system fudging, using rust dyn fn objects"). both forms can be useful

view this post on Zulip fitzgen (he/him) (Jul 20 2023 at 18:42):

Graydon Hoare said:

(but those are lift/lower wrappers on the runtime's side, not the guest's side? if the guest uses ABI X and the runtime uses ABI Y, they're supposed to interoperate, right? so .. the guest isn't asking the runtime to synthesize lift/lower code for ABI X and inject it into the guest's module-space, is it?)

everyone speaks the canonical ABI, if a guest wants the data in another format after it receives it in canonical abi format, then it is free to do any further transformations of the data it wants to. but when sending and receiving data, it must be in the canonical ABI.

the trampolines that the runtime is responsible for are for getting two (or more) components linked together: they pass a string, say, and the trampoline has to do the copy from the source to the destination. it is the runtime's responsibility to create this trampoline (or do equivalent interpreted things) because each component is shared-nothing: they do not have access to each other's core instances' internal state.

backing up a bit:

for example, given:

this is the sequence of events when the host calls B's exported component function:

fin

does that make sense?

wit-bindgen is a bit of a distraction here. it is a tool for allowing people to write guest/host programs that talk canonical ABI. but it doesn't have any bearing on what the runtime has to do to support the component model. it is just sugar for turning (u32, u32) (the canonical ABI representation of a string) into Box<str> in rust and stuff like that

view this post on Zulip Lann Martin (Jul 20 2023 at 18:51):

Yes, you can use wit-bindgen to help generate host/embedder language bindings. wasmtime-py does that here: https://github.com/bytecodealliance/wasmtime-py/blob/main/rust/bindgen/src/bindgen.rs

view this post on Zulip Lann Martin (Jul 20 2023 at 18:53):

As a warning, that tooling is probably quite unstable (more so than the CM/CABI specs), though Alex would have the best perspective on that

view this post on Zulip Alex Crichton (Jul 20 2023 at 19:05):

Yeah I wouldn't rely on specific details there per se, but I do think that they can be an interesting way to explore how things work. I mentioned wit-bindgen rust above but the wasmtime-py support, available through python -m wasmtime.bindgen, can be a good way to explore components from a host side. The python support is built on Wasmtime's C API which only supports core modules, so the bindings generated by python -m wasmtime.bindgen can perhaps be helpful to read over and see how things are hooked up. You'll find liftings/lowerings there and such

view this post on Zulip Graydon Hoare (Jul 20 2023 at 19:20):

> A lifts the core wasm export function into a component function

(thanks for the details!) Can I .. dig into this point? A is a component, which is a byte blob. You're describing it doing something as a verb here: "A lifts ..". What does that mean? When does A do this? Or rather, which tool does what to accomplish this lifting? Or are you saying that the blob that is A contains a binary declaration in itself somewhere that declares a lifting-relationship between the core function and the component function?

view this post on Zulip Lann Martin (Jul 20 2023 at 19:27):

The latter: there are canon lift and canon lower operations which convert between core and CM functions

view this post on Zulip Lann Martin (Jul 20 2023 at 19:29):

e.g. (somewhat loosely): (canon lift <core-funcidx> <component-func-type>) produces a component function from a core function and the component function type

view this post on Zulip Jamey Sharp (Jul 20 2023 at 19:31):

I think what you're looking for is that A contains a declaration (the textual syntax is as shown by Lann) indicating that in order to run, it needs one of its functions lifted. the host is responsible for making that actually happen

view this post on Zulip fitzgen (he/him) (Jul 20 2023 at 20:55):

the wasm runtime is responsible for making that happen

(I like to reserve "host" for the embedder of the runtime, and I think that's how we usually use it)
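
One way to picture "the component declares it, the runtime makes it happen" is the sketch below. It assumes a hypothetical in-memory form of a `canon lift` declaration (`CanonLift`, with names where a real binary uses indices) and a runtime that builds a wrapper per declaration at instantiation time. It only handles `u32` parameters and a `bool` or `u32` result, which is enough to show the shape.

```python
from dataclasses import dataclass


@dataclass
class CanonLift:
    """Hypothetical decoded form of one `canon lift` declaration."""
    name: str       # name given to the resulting component function
    core_func: str  # core export being lifted (real components use indices)
    params: list    # component-level parameter types
    result: str     # component-level result type


def make_lifted(core_fn, decl):
    """Build the component-level function implied by the declaration."""
    def lifted(*args):
        if len(args) != len(decl.params):
            raise TypeError("wrong number of arguments")
        core_args = []
        for value, ty in zip(args, decl.params):
            if ty != "u32" or not 0 <= value < 2**32:
                raise ValueError(f"cannot lower {value!r} as {ty}")
            core_args.append(value)
        raw = core_fn(*core_args)
        # bool results arrive as an i32: zero is false, anything else is true
        return raw != 0 if decl.result == "bool" else raw
    return lifted


def instantiate(core_exports, canon_decls):
    """The runtime's job: turn each declaration into something callable."""
    return {d.name: make_lifted(core_exports[d.core_func], d) for d in canon_decls}


core_exports = {"is_even": lambda x: 1 if x % 2 == 0 else 0}  # stand-in for core wasm
funcs = instantiate(core_exports, [CanonLift("is-even", "is_even", ["u32"], "bool")])
print(funcs["is-even"](4))  # True
```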

view this post on Zulip Jamey Sharp (Jul 20 2023 at 21:08):

ah, that's a good distinction

view this post on Zulip Graydon Hoare (Jul 20 2023 at 21:29):

ok so .. just to be 100% clear (it's extremely hard for non-spec-editors to read the examples since they're full of abbreviated forms) .. the (canon ..) form is a declaration form, at the component level, and it defines .. the obligation to build a specific trampoline? or half-trampoline?

view this post on Zulip Graydon Hoare (Jul 20 2023 at 21:31):

aside, I find the textual definitions almost impossible to read due to all the abbreviations, the only way I've been able to understand wasm _at all_ is by referring to the binary structure definition of a module. like I have no idea at all -- despite staring for the past half hour -- how to parse this example:

  (func $run (param string) (result string) (canon lift
    (core func $main "run")
    (memory (core memory $libc "mem")) (realloc (func $libc "realloc"))
  ))

view this post on Zulip Alex Crichton (Jul 20 2023 at 21:35):

> like I have no idea at all -- despite staring for the past half hour

One thing that may help with this is to run the examples through wasm-tools print. That prints the binary form which has a lot more index annotations. Not exactly readable, but it may help perhaps

view this post on Zulip Graydon Hoare (Jul 20 2023 at 21:35):

I think that desugars to a canon definition -- desugaring the (func ... (canon lift ...)) into a (canon lift ... (func ...)) and then I think there's an inline-sugar declaration of a core func, maybe? and then .. I don't have any idea what the memory or realloc forms are in there, they do not look like externdescs to me through any desugaring path I can understand

view this post on Zulip Graydon Hoare (Jul 20 2023 at 21:40):

> One thing that may help with this is to run the examples through wasm-tools print.

wasm-tools I get from cargo install does not accept the examples in the docs. maybe there's a fresher version? or should I look instead at the examples in the repo, not the docs?

view this post on Zulip Graydon Hoare (Jul 20 2023 at 21:54):

oh, maybe the (memory) and (realloc) forms there are <canonopt> and the outer desugaring is .. somehow .. defining the lifting of the (core func ...) form?

view this post on Zulip Dan Gohman (Jul 20 2023 at 22:12):

Yes, they are canonopts. This syntax is defining a component function that wraps a core-wasm function, using the specified realloc and memory to communicate the string data with the core-wasm function.

view this post on Zulip Alex Crichton (Jul 20 2023 at 22:23):

ah yeah wasm-tools will only work in the context of a "full component" which in this case isn't there since it's just one func. There's also a number of syntax differences/typos in the spec so you can probably disregard my suggestion (the spec examples aren't tested yet)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 22:49):

> ah yeah wasm-tools will only work in the context of a "full component" which in this case isn't there since it's just one func.

No I don't just mean this one func. I mean the full component example in the explainer ("we can finally write a non-trivial component"). there appear to be enough syntax differences that I can't figure out how to edit it back to being-right (especially since I am reading it to try to figure out what it means, editing it isn't something I have a lot of confidence in)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 22:58):

it's ok, I can .. at least conceptually picture what this is doing, I think. to restate -- can you confirm? -- this is declaring -- entirely declaratively -- that the existing declared-earlier core func "run", in the $main instance, which had core type ((i32,i32)->i32) where it was declared, should be lifted to a CM func $run of CM type string->string, and the trampoline (or half-trampoline?) that the runtime needs to synthesize to do that should map "a CM string" conceptually to a linear-address-and-length pair in the "mem" memory of the $libc instance (itself in the $main instance), and should make allocations it needs in that instance by calling a realloc-shaped core export named "realloc" in the $libc instance.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 22:59):

is that right?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:01):

Indeed! That all sounds right to me
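
The restated reading above can be sketched in Python, with heavy assumptions: `FakeLibc` stands in for the `$libc` instance (a memory plus a bump-allocating `realloc`), `core_run` stands in for the core export `"run"` of type `(i32, i32) -> i32` and simply upper-cases its input, and `lift_run` is the wrapper the runtime would synthesize or interpret. All three names are hypothetical, and only utf-8 is handled.

```python
import struct


class FakeLibc:
    """Stand-in for $libc: one linear memory and a bump allocator."""

    def __init__(self, size=1024):
        self.mem = bytearray(size)
        self.next = 8  # keep low addresses unused

    def realloc(self, old_ptr, old_size, align, new_size):
        # Bump allocation only; old_ptr/old_size are ignored in this sketch.
        ptr = (self.next + align - 1) // align * align
        if ptr + new_size > len(self.mem):
            raise MemoryError("out of guest memory")
        self.next = ptr + new_size
        return ptr


def core_run(libc, ptr, length):
    """Stand-in core export "run": reads a string, returns a pointer to (ptr, len)."""
    text = bytes(libc.mem[ptr:ptr + length]).upper()
    out = libc.realloc(0, 0, 1, len(text))
    libc.mem[out:out + len(text)] = text
    ret = libc.realloc(0, 0, 4, 8)
    struct.pack_into("<II", libc.mem, ret, out, len(text))
    return ret


def lift_run(libc):
    """What `canon lift` with (memory ...) and (realloc ...) asks the runtime for."""
    def run(s: str) -> str:
        data = s.encode("utf-8")
        ptr = libc.realloc(0, 0, 1, len(data))   # allocate in the guest
        libc.mem[ptr:ptr + len(data)] = data     # copy the argument in
        ret = core_run(libc, ptr, len(data))     # call the core function
        out_ptr, out_len = struct.unpack_from("<II", libc.mem, ret)
        return bytes(libc.mem[out_ptr:out_ptr + out_len]).decode("utf-8")
    return run


run = lift_run(FakeLibc())
print(run("hello"))  # HELLO
```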

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:01):

Sorry we should go through the examples in the spec and validate they all use actual valid syntax -- as you have probably figured out all the examples were written before we had any parsers and we never went back and updated them.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:02):

(if so, what's the difference between the lift and the lower applied to "log"? log is external? if it's in "CM space" already why does it need to be lowered too? just to attempt to fuse it with a lift?)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:03):

s'ok, just pointing it out if you're doing docs-updates at some point. I can file a bug on the repo if you want but I'm not sure this repo is the long-term home of reference material anyway?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:04):

The one thing I might clarify is the "half-trampoline" aspect there. This is creating, as you say declaratively, a function in the component model which has type string -> string. The implementation of this function is defined as the "half trampoline" you're thinking of where the function when called with a string will pass through the string as defined by the canonical ABI using the memory/realloc options. Similarly when the wasm function returns it will interpret the return values using the memory/realloc options.

Whether this component model function actually concretely exists or not sort of depends on the runtime. For example, sometimes in Wasmtime it's "fused" with another component's request for the function, and that's where a whole trampoline, as you're thinking of it, exists. If the host uses this function directly it's sort of a half-trampoline.

Basically I wanted to point out the half/whole trampoline may not be quite the right way to think about it. It sort of is and sort of isn't, but may be worthwhile understanding the context here of "it's a function in its own right" and the definition of what that function is depends on the host.

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:05):

> I'm not sure this repo is the long-term home of reference material anyway?

Oh WebAssembly/component-model will probably stick around for quite a long time, so bugs are appreciated!

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:07):

the specifics of when half-trampolines exist or are fused-away is indeed something I need to get straight, glad you pointed it out. would it fuse, say, only when the canonopts match exactly, including their referents, so as to allow passing through not just an unaltered representation but crucially pointers in the same memory?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:07):

Ah sorry I don't have the link to what you're looking at on hand (I can poke at it though to take a look), but in general you're right that there's the "CM space" and the "core space" and lift/lower go between these two. So you'll lift from core->CM and lower from CM->core. Component boundaries only support CM things, so if you want to take a core function from one component to another you'll have to lift it to the CM then lower it somewhere else. This produces the "fused" operation where a runtime can do clever things if it so desires, but the "clever things" aren't necessarily strictly spec-mandated.

This is where the Python bits-and-pieces of CanonicalABI.md show up where a lift-then-lower is sort of like a curry of the canon_lift and canon_lower functions. I think though you have to squint a bit to see it line up for sure at the current time.

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:08):

I'm looking at the example a page or two down from this anchor: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md#canonical-definitions -- search for "we can finally write a non-trivial component"

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:09):

> the specifics of when half-trampolines exist or are fused-away is indeed something I need to get straight, glad you pointed it out. would it fuse, say, only when the canonopts match exactly, including their referents, so as to allow passing through not just an unaltered representation but crucially pointers in the same memory?

Ah specifically no! Fusing happens between entirely different components, which can't share any state. So their options fundamentally will be different (e.g. you're transferring a string from one linear memory to another). The way you can think about it is that component model values have an abstract definition of sorts, but the ABI makes it concrete on either end.

So for example if component A encodes strings as utf-8 but component B uses utf-16 they're both using the same concept of a "string" as a valid unicode thing (I forget the technical name) but they're represented differently. In this case the fused adapter, the lift/lower pair, would transcode from utf-8 to utf-16 when transferring strings

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:10):

so a lift will sort of interpret all the input parameters/options/etc into an abstract set of values which may or may not get re-ified in the host itself (e.g. a simple interpreter might always create an actual Val representation), and then the lower translates from these abstract representations back into the destination as configured
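
A small sketch of that lift-then-lower pair for a single string, in the style of a simple interpreter that always materializes the abstract value (a Python `str` here). The helper names are hypothetical, each component gets its own `bytearray` memory, and the length is counted in code units (bytes for utf-8, 16-bit units for utf-16), as I understand the Canonical ABI to do.

```python
def make_bump_realloc(memory):
    """Hypothetical guest allocator over one memory: bump pointer, never frees."""
    state = {"next": 8}

    def realloc(old_ptr, old_size, align, new_size):
        ptr = (state["next"] + align - 1) // align * align
        if ptr + new_size > len(memory):
            raise MemoryError("out of guest memory")
        state["next"] = ptr + new_size
        return ptr
    return realloc


def lift_string(memory, ptr, code_units, encoding):
    """Source side: read the concrete bytes into an abstract string."""
    if encoding == "utf8":
        return bytes(memory[ptr:ptr + code_units]).decode("utf-8")
    if encoding == "utf16":
        return bytes(memory[ptr:ptr + 2 * code_units]).decode("utf-16-le")
    raise NotImplementedError(encoding)


def lower_string(text, memory, realloc, encoding):
    """Destination side: write the abstract string in the destination's encoding."""
    if encoding == "utf8":
        data, align, unit = text.encode("utf-8"), 1, 1
    elif encoding == "utf16":
        data, align, unit = text.encode("utf-16-le"), 2, 2
    else:
        raise NotImplementedError(encoding)
    ptr = realloc(0, 0, align, len(data))
    memory[ptr:ptr + len(data)] = data
    return ptr, len(data) // unit


# Component A (utf-8) passes a string to component B (utf-16).
mem_a, mem_b = bytearray(64), bytearray(64)
raw = "h\u00e9llo".encode("utf-8")
mem_a[8:8 + len(raw)] = raw
text = lift_string(mem_a, 8, len(raw), "utf8")
ptr_b, units_b = lower_string(text, mem_b, make_bump_realloc(mem_b), "utf16")
print(units_b)  # 5
```

A compiling runtime could fuse the two halves and transcode directly from one memory to the other without ever building `text`; the observable result should be the same.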

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:11):

oh, wait, so in this case the fact that "log" and "run" are both declared to deal in the same memory "mem" is not nudging the system towards possibly fusing their lift/lower pair, because strings represented as i32,i32 pairs are literally pointing into the same memory space?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:11):

however the host represents it is entirely up to the host, so long as it can faithfully represent all possible values

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:11):

let me look more closely at the example

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:11):

aha so there is no fusing at all in this example

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:11):

np. you also don't have to hang around here answering my questions! I know you're very busy

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:12):

The log function, a component model thing, is lowered down into a core wasm thing which goes into the "blob" of core wasm

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:12):

somehow coming out of that core wasm is a "run" function which is lifted into a separate CM thing

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:12):

log/run, however, aren't connected at all

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:12):

except well through the core wasm I suppose

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:12):

let me see if I can find a fusing example

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:13):

ah ok so here's a "hello world" of fusing -- https://github.com/bytecodealliance/wasmtime/blob/53274fefe433944964bafd3f2942a942c33bf6c1/tests/misc_testsuite/component-model/fused.wast#L2-L21

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:13):

(I just was involved in building a wasmi-based system recently that had to roll its own ABI and everyone keeps showing up and asking "why didn't you use the wasm CM to link in definitions and allow inter-component magic?" and my answers are a bit wishy-washy somewhere between "the schedule doesn't seem to line up, it's not ready yet" and "I still actually have no idea how to adapt wasmi to support the CM, every time I try to understand that I get lost trying to understand the mechanisms implied by the CM" so I figured I might ask more about that...)

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:14):

no worries! Confusion is IMO a good way to shape how to word the docs when we get around to it :)

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:15):

but I mostly wanted to point out that everything related to adapter fusion of lift/lower pairs may have been misunderstood so far: fusion requires the same value to get lifted and then lowered, not just lifts/lowers of separate values in a component

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:15):

and it also requires a component boundary, e.g. the sub-component in that example above

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:15):

(not sure if it helps to see this though)

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:16):

looking. I think I still don't quite get what a lower _is_ besides meaninglessly "the inverse of a lift". like it's declaring a CM type maps to a given core type, but .. it's a bijection right? .. can that not be _exactly_ rewritten as a lift? why give it a separate form?

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:18):

(and like what entities-with-CM-types even exist that are not, themselves, lifts of core entities? when do you have such a CM entity you need to lower, independent of a core entity you need to lift?)

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:18):

To answer the second question first which may provide more context, the biggest answer is "host things"

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:18):

aka the host has a function that returns a string and wasm wants to use it

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:19):

but the host function basically doesn't know it's being called by wasm

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:19):

lift/lower are different halves of the operation which is what makes them necessary -- perhaps it may be helpful to ignore sub-components and fusion for now?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:20):

e.g. you get "host stuff" and you lower it, and to give functionality to the host you lift it

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:20):

ok. that's a good starting schema!

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:20):

but you can't swap those since the host stuff isn't a lift of anything, it just is host stuff
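
A rough Python sketch of the two halves in the single-component case, assuming strings are utf-8 and linear memory is a `bytearray`. Every name here (`host_log`, `lower_log`, `make_core_run`, `lift_run`) is hypothetical, and `make_core_run` merely stands in for the core wasm "blob".

```python
logged = []


def host_log(message: str) -> None:
    # Plain host function: it has no idea it is being called by wasm.
    logged.append(message)


def lower_log(host_func, memory: bytearray):
    """Lower: wrap host stuff in a core-looking function taking (ptr, len)."""
    def core_log(ptr: int, length: int) -> None:
        message = bytes(memory[ptr:ptr + length]).decode("utf-8")
        host_func(message)
    return core_log


def make_core_run(memory: bytearray, core_log):
    """Stand-in for core wasm: it only sees integers and its own memory."""
    def core_run() -> None:
        message = b"hello from core"
        memory[0:len(message)] = message
        core_log(0, len(message))
    return core_run


def lift_run(core_func):
    """Lift: expose a core export to the host as a component-level function."""
    def run() -> None:
        core_func()
    return run


memory = bytearray(64)
run = lift_run(make_core_run(memory, lower_log(host_log, memory)))
```

Here `host_log` is lowered because it starts life as host stuff, and `core_run` is lifted because it starts life as core wasm. The two wrappers are never composed on the same value, so there is nothing to fuse.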

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:21):

then we just throw in separate memories and an Owens-Flatt unit linking language :P

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:22):

heh yeah it's true that much of the interesting stuff doesn't come up until there's more than one component in the system, but some of the bits and pieces I've found are more helpfully motivated if they're ignored when first learning

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:22):

hmm .. ok but the "lower" of a host function is .. uh .. in core-space?

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:23):

so the runtime has to do the lower-side just to call the host function, because it's some wild win32 or cocoa API speaking UTF-16

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:23):

this is where the runtime sort of has a lot of flexibility

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:23):

somehow it needs to produce a "core looking thing", and how to interpret the arguments to the core-looking-thing is dictated by the canonical ABI

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:23):

and then what the host does with that is up to it

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:24):

but yeah if it receives a utf-8 string and talks to a cocoa utf-16 api then the host has to transcode

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:24):

here the host has the choice of representing strings as (encoding, wasm bytes) or it could unconditionally translate everything to a utf-16 string and run with that
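
The two host-side choices can be sketched in Python as follows, assuming memory is a byte buffer and the string's byte length is already known. `LazyString`, `lift_lazy`, and `lift_eager_utf16` are hypothetical names, not part of any runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyString:
    """Option 1: keep (encoding, wasm bytes) and transcode only on demand."""
    encoding: str
    raw: bytes

    def to_utf16(self) -> bytes:
        return self.raw.decode(self.encoding).encode("utf-16-le")


def lift_lazy(memory: bytearray, ptr: int, byte_len: int, encoding: str) -> LazyString:
    # Copy the bytes out of wasm memory, but leave them in the guest's encoding.
    return LazyString(encoding, bytes(memory[ptr:ptr + byte_len]))


def lift_eager_utf16(memory: bytearray, ptr: int, byte_len: int, encoding: str) -> bytes:
    """Option 2: unconditionally translate to utf-16 while lifting."""
    return bytes(memory[ptr:ptr + byte_len]).decode(encoding).encode("utf-16-le")
```

Both options represent every possible string value faithfully; they differ only in when the transcoding cost is paid.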

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:25):

if you've got some cocoa thing though and wasm wants to call it, the goop between wasm and cocoa is basically "the thing that lower produces"

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:25):

it's tough to point at it and say "yes it's this" since it's probably spread out a bit as it's doing type translation, crossing core wasm ABIs, etc

view this post on Zulip Graydon Hoare (Jul 20 2023 at 23:25):

mhm. ok, informative! it still seems to me like lift and lower are both just "CM-to-core bijections" where sometimes the bijectee is in host-world not VM-world, but .. meh .. doesn't matter!

I actually have to step out for a bit, but .. uh .. it's coming more into focus, and if it's ok with you I would love to return to this and pepper you with more questions another time?

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:25):

In wasmtime for example it's a mixture of Rust-monomorphized code plus a Cranelift-generated trampoline

view this post on Zulip Alex Crichton (Jul 20 2023 at 23:26):

Happy to help out!


Last updated: Nov 22 2024 at 16:03 UTC