Hi all, I'm trying to build a Neovim WASM plugin system with wasmtime. I'm using the component model to define the API functions, and I'm looking for a way for client applications to pass a callback to a host function. The host should be able to store the callback and invoke it when an event is triggered, which may happen after the host function call has returned. I am considering two approaches. The first passes the callback as an integer handle:
world api {
    // might not be the correct syntax
    type listener = s32;
    import register-listener: func(listener: listener);
    export call-listener: func(listener: listener, arg: string);
    export drop-listener: func(listener: listener);
}
And the client will implement it like
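// export this as call-listener
fn call_listener(listener: i32, arg: String) {
    // The handle is the address of a heap-allocated Box<dyn FnMut(String)>;
    // the extra Box keeps the pointer thin so it fits in an integer.
    let f = unsafe { &mut *(listener as usize as *mut Box<dyn FnMut(String)>) };
    f(arg);
}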
What is not clear to me:
1. How are function pointers converted to integers in WASM? I think the integer will just be an offset into the instance's linear memory. If so, since every WASM instance has its own linear memory, I would need a pair of instance ID and address to identify a function. Is that correct? In short, do I also need the instance ID to uniquely determine a client callback?
2. If the answer to 1. is yes, I'll need a way to find the instance that is calling the host function. How do I do that with the component model? For "vanilla" WASM without the component model, the function given to wrap in wasmtime can take an extra Caller argument, but it does not seem to give information about the calling instance. Moreover, with the component model I can only implement the member functions of the trait generated by bindgen, and those do not take a Caller argument.
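On question 1, a small sketch may help. In Rust compiled to wasm32, a pointer to a boxed closure is an offset into that instance's own linear memory, so the integer only means something inside the instance that produced it. The helper names below (into_handle, call_handle, drop_handle) are made up for illustration and are not part of any wasmtime or bindgen API:

```rust
// Hypothetical helpers: a sketch of the pointer-as-handle idea only.
type Callback = Box<dyn FnMut(String)>;

/// Move the closure to the heap and expose its address as an integer handle.
fn into_handle(cb: Callback) -> usize {
    // `Box<dyn FnMut>` is a fat pointer, so it is boxed once more to get a
    // thin pointer that fits in a single integer.
    Box::into_raw(Box::new(cb)) as usize
}

/// Invoke the callback behind `handle`.
///
/// # Safety
/// `handle` must come from `into_handle` in the same instance and must not
/// have been passed to `drop_handle` yet.
unsafe fn call_handle(handle: usize, arg: String) {
    let cb = unsafe { &mut *(handle as *mut Callback) };
    (*cb)(arg);
}

/// Free the callback behind `handle`.
///
/// # Safety
/// Same requirements as `call_handle`; the handle is invalid afterwards.
unsafe fn drop_handle(handle: usize) {
    drop(unsafe { Box::from_raw(handle as *mut Callback) });
}
```

Because the handle is a raw address, the host cannot validate it, and a stale or forged value is undefined behavior in the guest.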
The second approach is to use a resource:
world api {
// might not be the correct syntax
import register-listener: func(listener: listener)
export resource listener {
call: func(arg: string)
}
}
And the client can then expose a type, say a wrapper of Rc<dyn FnMut(String)>, as the type of the callback. Some questions:
1. Similarly, do I need the ID of the instance that owns the callback function to resolve it correctly? If wasmtime can convert the resource type listener into some typed handle that I can just call, as in listener.call(42), I won't need to know which instance is calling the register-listener function. If not, I still need to know which instance the callback belongs to and manually invoke the correct function implemented by that module.
What I would do is separate the "protocol" from the "API". The "protocol" is the raw WIT component world, which will be a bit awkward to work with because of the manual management, so you can create a Rust crate (and similar libraries in other languages) that provides a better experience for supplying callbacks.
"Deep down", I would represent the callback as a u32/u64, but make it an ID, not a direct memory address. The guest is responsible for managing a table from IDs to the actual callbacks. This makes it possible to store a closure in the table, for example.
It is true that you need to remember which instance the callback comes from. In the host, you can set a "current instance id" in the type that implements the world imports before calling any exported function. This ID will then be accessible through the &mut self argument of the imported function.
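A minimal sketch of the "current instance id" idea, written without any wasmtime types so it runs on its own. HostState, register_listener and call_into_instance are hypothetical stand-ins: in a real host the struct would live in the store data, and register_listener would be the method of the bindgen-generated imports trait.

```rust
/// Hypothetical host state (would be, or be part of, the `T` in `Store<T>`).
#[derive(Default)]
pub struct HostState {
    /// Set by the host right before it calls into an instance.
    pub current_instance: Option<u32>,
    /// (instance ID, guest callback ID) pairs registered so far.
    pub listeners: Vec<(u32, u32)>,
}

impl HostState {
    /// Stand-in for the imported `register-listener` function.
    /// It reads the instance ID that the host recorded before the call.
    pub fn register_listener(&mut self, listener: u32) {
        let instance = self
            .current_instance
            .expect("register_listener called outside of a guest call");
        self.listeners.push((instance, listener));
    }
}

/// Stand-in for "call an export of instance `id`": record the ID, run the
/// guest code (modelled here as a closure), then clear the ID again.
pub fn call_into_instance(
    state: &mut HostState,
    id: u32,
    guest: impl FnOnce(&mut HostState),
) {
    state.current_instance = Some(id);
    guest(&mut *state);
    state.current_instance = None;
}
```

Each listener is stored together with the instance that was being called when it was registered, so two instances can reuse the same callback ID without clashing.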
In the guest, the "convenient" API library/crate I would make would expose idiomatic functions like add_callback_handler(event, handler: impl Fn ... + 'static) and would manage storing the handler in the table automatically.
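The guest-side table could look roughly like this. It is only a sketch: CallbackTable and its methods are hypothetical names, and the wiring to the actual call-listener and drop-listener exports is left out.

```rust
use std::collections::HashMap;

/// Hypothetical guest-side table mapping opaque IDs to stored closures.
pub struct CallbackTable {
    next_id: u32,
    callbacks: HashMap<u32, Box<dyn FnMut(String)>>,
}

impl CallbackTable {
    pub fn new() -> Self {
        CallbackTable { next_id: 0, callbacks: HashMap::new() }
    }

    /// Store a closure and return the ID that would be handed to the host.
    pub fn register(&mut self, f: impl FnMut(String) + 'static) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.callbacks.insert(id, Box::new(f));
        id
    }

    /// What a `call-listener` export could do. Returns false for unknown
    /// IDs instead of touching memory, unlike a raw pointer handle.
    pub fn call(&mut self, id: u32, arg: String) -> bool {
        match self.callbacks.get_mut(&id) {
            Some(f) => {
                f(arg);
                true
            }
            None => false,
        }
    }

    /// What a `drop-listener` export could do.
    pub fn remove(&mut self, id: u32) -> bool {
        self.callbacks.remove(&id).is_some()
    }
}
```

A function like add_callback_handler would then call register on a table owned by the crate and pass the returned ID to the imported register-listener function.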
It's true that resources would be ideal for this, but I have no idea if/how they work.
Thank you for your answer. Just for clarification,
In host, you can set "current instance id" in your type that implements the world imports before calling any exported function.
I thought the type that implements the world would be the type T in Store<T>, and that the same store and its inner state would be shared across multiple instances. The documentation says
This T is suitable for storing Store-specific information which imported functions may want access to.
To me it isn't clear whether all instances share the same state or each one gets its own.
You can store more in the store than just the world imports. The second argument of the "add_to_linker" function fetches the world imports from the store:
MyWorld::add_to_linker(&mut linker, |state| state)?;
So if you have more data in the store, just tell the linker how to get the world imports from the store with the "lambda" function.
You could, for example, have a HashMap<id, imports> in your store, as well as a "currentId", and then do state.hashMap.get_mut(state.currentId).
This, however, is most likely not what you want. You could instead put the "currentId" into the struct that implements the world imports directly, so that it would be accessible from the imported function.
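To make the first variant concrete, here is a sketch of the projection step on its own. State, Imports and project are hypothetical names; project plays the role of the closure passed to add_to_linker, whose exact signature depends on the wasmtime version and is not reproduced here.

```rust
use std::collections::HashMap;

/// Hypothetical per-instance import state.
#[derive(Default)]
pub struct Imports {
    pub listeners: Vec<u32>,
}

/// Hypothetical store data holding more than just the imports.
pub struct State {
    pub current_id: u32,
    pub per_instance: HashMap<u32, Imports>,
}

/// Picks the imports that belong to the instance currently being called,
/// creating an empty entry the first time an instance is seen.
pub fn project(state: &mut State) -> &mut Imports {
    state.per_instance.entry(state.current_id).or_default()
}
```

The host still has to set current_id before each call into a guest, which is why keeping the ID directly in the imports struct is usually simpler.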
So I guess the question is how to track the current instance. I know I can use a stack: whenever an instance's function is called, push the instance ID onto it, and pop it when the call completes. But I think this information is surely stored somewhere in wasmtime, and I hope there is a way to surface it.
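The stack idea can be sketched in a few lines. CallStack and with_instance are hypothetical names, and guest calls are modelled as plain closures rather than real wasmtime calls:

```rust
/// Hypothetical tracker for nested host-to-instance calls.
#[derive(Default)]
pub struct CallStack {
    active: Vec<u32>,
}

impl CallStack {
    /// ID of the instance whose export is currently running, if any.
    pub fn current(&self) -> Option<u32> {
        self.active.last().copied()
    }

    /// Run `f` as if it were a call to an export of `instance`:
    /// push the ID before the call and pop it when the call returns.
    pub fn with_instance<R>(
        &mut self,
        instance: u32,
        f: impl FnOnce(&mut Self) -> R,
    ) -> R {
        self.active.push(instance);
        let result = f(&mut *self);
        self.active.pop();
        result
    }
}
```

This only records calls that the host itself makes. A call that goes directly from one instance to another never passes through the host, so it would not show up on the stack.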
take a look at this example - then imagine extending it so that you store some information in the MyImports struct that allows you to change the behavior of empty_error
https://github.com/bytecodealliance/wasmtime/blob/main/tests/all/component_model/bindgen/results.rs#L61-L74
a much more complex example is the implementation in the wasmtime-wasi crate, under preview2. it uses a table to track resources, and the WasiCtx to track information that isnt resources (e.g., environment variables, arguments)
But isn't it possible to have multiple instances in a single store? We can only store arbitrary data in Store but not Instance.
yes, you can have multiple linked instances in a single store. calls into the host do not have any way of indicating which instance in the store they come from.
Oh right, you can "shove" exports of one instance into imports of another instance, right? I have never done this, but that is the whole point of the "composable" component model. When one instance then calls that imported function, code execution will enter into the other instance without ever "returning" to the host ... at least from the user's perspective.
Again, I have never done this, so I could be wrong. Does it actually work like this? If so, then identifying which instance is calling the host imported function can be tricky. Wasmtime should probably add some API to expose this information to the user, if it doesn't already.
yes, this is called linking. there is linking available both in core wasm modules, and in components
linking between modules works like https://docs.wasmtime.dev/api/wasmtime/struct.Linker.html#method.module
this is a wasmtime functionality because modules themselves have no way of describing linking, so the engine has to do it for them
components do have ways to describe linking - if you want to compose two components together, you can create an outer component that contains both component definitions and describes which imports of one are satisfied by exports of the other
so, wasmtime's component linker does not provide a way to link multiple components to each other, just how to link a single component to the host. wasm-compose is used for component composition
either way, when either a module function or a component function calls into the host, wasmtime does not provide an indicator of what module-instance or component-instance it came from.
you can capture a backtrace, and that will provide you with a set of wasm frames, each of which indicates the module and function responsible https://docs.wasmtime.dev/api/wasmtime/struct.WasmBacktrace.html
but the instance is not part of that description
so far we haven't seen a use case where the instance is needed in that description. i'm not sure whether we have the runtime information available to determine it, at the moment
core dumps and profiling both want the instance to be included in the backtrace but we don't have a plan for how to do that yet as far as I know
Jamey Sharp said:
core dumps and profiling both want the instance to be included in the backtrace but we don't have a plan for how to do that yet as far as I know
Yeah it would require pushing an instance id or something as part of our stack frame, which seems not great
Or maybe always pushing the vmctx register in the prologue?
Pat Hickey said:
so far we haven't seen a use case where the instance is needed in that description.
I think the OP wants a way to identify which instance is calling the host function, so "sifting" through the trace log wouldn't really be ideal.
Just to make sure this isn't an XY problem, I will re-phrase what (I think) the OP wants: Any WASM component can "register" a listener by calling an imported function, for example "on_new_file_created(name: String, bytes: Vec<u8>)". This then needs to be stored on the host, and when a new file is created, the host needs to call all the listeners in their respective instances.
So it's kind of important to know which instance registered which listener. You could of course, as I already mentioned, save the current instance ID in the store's data before every exported function call, and then read that instance ID in the imported function. But if one instance calls another instance and that other instance registers the event handler, then you would get the first instance's ID, even though the registration is coming from the other instance. At least that's how I understand it.
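For the dispatch side, a rough sketch under the same assumptions: the host keeps (instance ID, listener ID) pairs and, when the event fires, calls back into whichever instance owns each listener. The Guest trait is a hypothetical stand-in for an instantiated component and its call-listener export.

```rust
use std::collections::HashMap;

/// Stand-in for an instantiated component exposing `call-listener`.
pub trait Guest {
    fn call_listener(&mut self, listener: u32, arg: &str);
}

/// Hypothetical host bookkeeping for instances and their listeners.
#[derive(Default)]
pub struct Host {
    pub instances: HashMap<u32, Box<dyn Guest>>,
    /// (instance ID, listener ID) pairs in registration order.
    pub listeners: Vec<(u32, u32)>,
}

impl Host {
    /// Deliver one event to every listener, each in its owning instance.
    /// Listeners whose instance no longer exists are skipped.
    /// Returns how many listeners were actually called.
    pub fn fire(&mut self, arg: &str) -> usize {
        let mut delivered = 0;
        for &(instance_id, listener) in &self.listeners {
            if let Some(guest) = self.instances.get_mut(&instance_id) {
                guest.call_listener(listener, arg);
                delivered += 1;
            }
        }
        delivered
    }
}
```

If the recorded instance ID is wrong, as in the nested-call case described above, the listener ID is looked up in the wrong instance, which is exactly the failure mode to watch for.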
if the wasm component needs to be called by the host, it can create a resource, call a host function passing that resource, and then the host can invoke a method on that resource to call back into the component.
at least, that will work once resource support has fully landed in wasmtime. it's currently in progress.
calling a method on a resource won't have any of the problems you describe wrt what the correct instance to call is
Last updated: Nov 22 2024 at 16:03 UTC