Stream: wasmtime

Topic: How to link several guest <-> host calls of one interaction?


Alex (Nov 15 2024 at 13:09):

Hi! I'm experimenting with Kotlin for a guest module, which is now possible because the latest Wasmtime builds fully support Wasm GC.

To pass rich data to guests compiled with the other stacks I've experimented with (e.g. TinyGo, AssemblyScript), I've used various versions of malloc and free to store data in the guest linear memory and pass a pointer to that data when calling a guest function. With Kotlin this pattern doesn't seem to work, since it provides neither a malloc function nor a means to implement one manually. Essentially it only allows allocations within a withScopedMemoryAllocator { } lambda call, and allocations made inside it will be garbage collected after the lambda completes.

As far as I can understand the only way to pass rich data to a Kotlin function (other than using stdin I guess) would be to do something like this:

@OptIn(UnsafeWasmMemoryApi::class)
@WasmExport("func_with_string_arg")
fun funcWithStringArg(stringSize: Int) = withScopedMemoryAllocator { allocator ->
    val ptr = allocator.allocate(stringSize)
    storeData(ptr.address.toInt()) // this calls the host which should now store the string at the provided address
    // read the data at ptr
}

This means that a host function storeData will be called from the guest while the host is calling the guest function funcWithStringArg. Does Wasmtime provide any means to connect these calls (the call chain might be deeper)?

I'm embedding Wasmtime in Go using wasmtime-go, the interface for a host exported function is like this:

func(caller *wasmtime.Caller, args []wasmtime.Val) ([]wasmtime.Val, *wasmtime.Trap)

The only relevant thing I can find here is caller.Context(), but it is an opaque object and nothing can be stored in it as far as I can tell. It looks like I could keep some sort of hashmap on the host side in the form of (storelikeReference, someId) => data, generate someId when calling a guest function, and expect the guest to pass someId back when calling storeData so that the host knows which data the guest is requesting. I.e. something like this:

@OptIn(UnsafeWasmMemoryApi::class)
@WasmExport("func_with_string_arg")
fun funcWithStringArg(stringSize: Int, dataId: Int) = withScopedMemoryAllocator { allocator ->
    val ptr = allocator.allocate(stringSize)
    storeData(ptr.address.toInt(), dataId) // host can understand by dataId what data to store at ptr
    // read the data at ptr
}
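
For illustration, the host side of that (storelikeReference, someId) => data map could be sketched in Go roughly as follows. The callRegistry type and its register/take methods are hypothetical names made up for this sketch and are not part of wasmtime-go; it assumes one registry per store, so only someId is used as the key.

```go
package main

import (
	"fmt"
	"sync"
)

// callRegistry is a hypothetical host-side table of argument payloads that
// are waiting to be fetched by a guest. It is not a wasmtime-go type.
type callRegistry struct {
	mu      sync.Mutex
	nextID  int32
	pending map[int32][]byte
}

func newCallRegistry() *callRegistry {
	return &callRegistry{pending: make(map[int32][]byte)}
}

// register stores data and returns the id the host passes to the guest
// function as its dataId argument.
func (r *callRegistry) register(data []byte) int32 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nextID++
	r.pending[r.nextID] = data
	return r.nextID
}

// take removes and returns the payload for id. ok is false when the id is
// unknown or was already consumed, which the host callback can turn into a trap.
func (r *callRegistry) take(id int32) (data []byte, ok bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	data, ok = r.pending[id]
	delete(r.pending, id)
	return data, ok
}

func main() {
	reg := newCallRegistry()
	id := reg.register([]byte("hello"))

	// The host would now call func_with_string_arg(len("hello"), id).
	// Inside the storeData host callback it would look the payload up:
	data, ok := reg.take(id)
	fmt.Println(string(data), ok)

	// A second lookup with the same id finds nothing.
	_, ok = reg.take(id)
	fmt.Println(ok)
}
```

Removing the entry on lookup keeps the table from growing and makes a stale or repeated dataId detectable.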

This approach seems error-prone and pretty convoluted. I really hope there are other options, but I can't find anything relevant in the docs. Maybe there is a better approach?
Thanks!

Alex Crichton (Nov 15 2024 at 23:18):

It's possible to get this working, but you're right it's a bit convoluted. Ideally it would be possible to call an allocator without first entering into Kotlin as that'll make this much easier, but that may also be a lot of work.

Otherwise, from the host function you can use the Caller as the context to get the memory's data pointer and length. In Rust there's Caller::get_export to make loading the memory easier; I don't think we have that bound in wasmtime-go. In Go, though, you can create your callback, acquire the memory after the wasm instance is created, and arrange for the memory to be accessible from the callback.
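
A minimal Go sketch of that last arrangement: the callback is built first and captures a holder whose memory is filled in only after instantiation. The lateMemory type and makeStoreData function are hypothetical names, and a plain byte slice stands in for the exported linear memory so the sketch runs without wasmtime-go.

```go
package main

import (
	"errors"
	"fmt"
)

// lateMemory is a hypothetical holder for the guest's linear memory. It is
// empty when the callback is created and bound once the instance exists.
type lateMemory struct {
	data []byte
}

// makeStoreData builds the host callback before any instance exists. The
// returned function copies arg into guest memory at ptr, with bounds checks.
func makeStoreData(mem *lateMemory, arg []byte) func(ptr int32) error {
	return func(ptr int32) error {
		if mem.data == nil {
			return errors.New("guest memory not bound yet")
		}
		// Wasm addresses are unsigned 32-bit values, so reinterpret the i32.
		start := int64(uint32(ptr))
		end := start + int64(len(arg))
		if end > int64(len(mem.data)) {
			return fmt.Errorf("write [%d, %d) is outside guest memory of %d bytes",
				start, end, len(mem.data))
		}
		copy(mem.data[start:end], arg)
		return nil
	}
}

func main() {
	mem := &lateMemory{}
	storeData := makeStoreData(mem, []byte("hi"))

	// After instantiation the host would bind the exported memory here.
	mem.data = make([]byte, 16)

	fmt.Println(storeData(4), string(mem.data[4:6]))
	fmt.Println(storeData(15)) // does not fit, so an error is reported
}
```

In a real embedding the error would be converted into a trap, and the slice would have to be re-read after any call that can grow the guest memory.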

Alex (Nov 16 2024 at 07:42):

Thank you for the response.

Well yes, there is a way to get access to the guest memory from the caller in wasmtime-go too. What I don't like about this approach is that when a host wants to call a guest function that accepts rich data and returns rich data, instead of just calling a guest function, the interaction is split into several calls and requires keeping some sort of global state on the host side:

  1. The host calls the guest function while storing the argument data somewhere (some sort of a global variable I guess?)
  2. During call (1) the guest function calls the host back with a pointer to the buffer the host can write the argument data to. The host has to somehow understand that this call is related to (1). There might be many guests running at once, and the only way to differentiate them is by using the caller, right? But can we get a caller reference before the guest callback? I mean, in (1) we don't have a caller instance yet, so should something from the store be used to connect the call in (1) with the guest callback?
  3. Then when the guest wants to return data from the function call it apparently can't simply write it to the linear memory and return a pointer because after the function call (1) completes, all allocations will be garbage-collected. So I guess to return data from the function call the guest again would have to call some host function that accepts the result of the (1) call.

The guest function would look something like that:

fun funcWithStringArg(stringSize: Int, callId: Int) = withScopedMemoryAllocator { allocator ->
    // allocate a buffer for the argument
    callHostToStoreArgumentData(argBufPtr.address.toInt(), callId) // host can understand by callId arguments of which call are requested
    // execute the function body
    // store the function result data in the linear memory
    callHostToReturnResultData(resultDataPtr.address.toInt(), callId) // host can understand by callId which call returned the result
}

I just can't believe this is how it is supposed to be done :smile:. It would probably be pretty slow too, with the overhead of several calls back and forth between the host and the guest, not to mention how convoluted and error-prone this becomes.

Alex Crichton (Nov 16 2024 at 13:58):

> I just can't believe this is how it is supposed to be done

Well, to some degree, something along these lines needs to happen when you communicate with something in a separate address space. Wasm lives in a separate address space which the host must copy into, and that's a fundamental property of sandboxing WebAssembly.

If you're interested, this is what the component model is intended to solve. There's no integration of the component model in the wasmtime-go bindings right now, but at a high level the component model is tasked with exactly this: communicating high-level data into and out of wasm modules.

As for how to implement all of this, there are various techniques in the Rust wasmtime crate, but we don't always do the best job of mapping 100% of them into per-language bindings. For example, the most common way in Rust to store state is to put it in the T of Store<T>, which you can access from Caller<T>. I think the C API lets you set a custom void* pointer on a store (IIRC, I may be misremembering), but I don't believe that is bound in Go yet. In that sense this is something that could be improved in wasmtime-go.
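
As a rough Go approximation of the Store<T> idea, per-store state can live in a struct whose methods serve as the host functions, since Go callbacks are closures and can capture it. The sketch below is an assumption-laden illustration, not wasmtime-go API: session, storeArg, returnResult, call and upperGuest are hypothetical names, a byte slice stands in for guest memory, a Go function imitates the guest export, and bounds checks are omitted for brevity. It assumes one session per store and one call in flight per store at a time.

```go
package main

import "fmt"

// session is hypothetical per-store host state, playing the role that T
// plays in Rust's Store<T>. One session is created per store/instance.
type session struct {
	mem    []byte // stand-in for the guest's linear memory
	arg    []byte // argument of the call currently in flight
	result []byte // result reported by the guest during that call
}

// storeArg is the host function the guest calls to receive its argument.
func (s *session) storeArg(ptr int32) {
	copy(s.mem[ptr:], s.arg)
}

// returnResult is the host function the guest calls before returning. The
// bytes are copied out because the guest buffer is gone after the call.
func (s *session) returnResult(ptr, size int32) {
	s.result = append([]byte(nil), s.mem[ptr:ptr+size]...)
}

// call wraps the whole interaction so the rest of the host sees one call.
func (s *session) call(guest func(size int32), arg []byte) []byte {
	s.arg, s.result = arg, nil
	guest(int32(len(arg)))
	return s.result
}

// upperGuest imitates a guest export: it asks the host for its argument,
// upper-cases ASCII letters in place and hands the result back.
func upperGuest(s *session) func(size int32) {
	return func(size int32) {
		const ptr = 8 // pretend the scoped allocator returned this address
		s.storeArg(ptr)
		for i := int32(0); i < size; i++ {
			if c := s.mem[ptr+i]; c >= 'a' && c <= 'z' {
				s.mem[ptr+i] = c - 32
			}
		}
		s.returnResult(ptr, size)
	}
}

func main() {
	s := &session{mem: make([]byte, 64)}
	fmt.Println(string(s.call(upperGuest(s), []byte("hello"))))
}
```

Because the state belongs to the session rather than a global table, no call id has to travel through the guest as long as calls on one store do not overlap.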

Alex (Nov 18 2024 at 07:08):

Thanks, yes, I've read the proposal for the component model, but it will probably be a while before it is possible to use it.
For example, the multi-value proposal was merged into the standard about 4 years ago, but when I checked how well it is supported in various stacks, it turned out to be usable only from Rust. Of all the other stacks I've tried, it only half-works in TinyGo (it works for imported functions but not for exported ones).

Also, as far as I can tell, Wasmtime is the only runtime at the moment that supports the component model, but there is no C API for it, so it is only usable from Rust. So, it seems unlikely that it would be possible to make use of the component model from anything other than Rust in the near future unfortunately. In general, my early impression from experimenting with WebAssembly for several weeks is that for non-browser use cases Wasm works great when both the host and guests use Rust. In almost any other stack I've tried, Wasm/WASI support is, at best, highly experimental, with various compatibility and performance issues :unamused:.

> For example the most common way in Rust to store state is to put it in the T of Store<T> which you can access from Caller<T>.

Thank you, I will look into that.


Last updated: Nov 22 2024 at 16:03 UTC