Hi,
I'm building a service which pipelines many gigabytes of data from a CouchDB cluster into an Elasticsearch cluster. Since the data needs to be transformed and the code for the transformation comes directly from my clients, I want to use Wasm components here. The transform code can load more data from the cluster via a host function. As loading from the cluster should be batched, I collect all requests for some time and then do one big request to the cluster.
At the moment I simulate both clusters as if they have unlimited resources.
I made a PoC which performs OK (about 100 Mbit/s throughput on my laptop) with 1000 instances of the same component, all running on the same engine with async support enabled. I have a central queue (deadqueue::limited) and each instance runs on its own tokio task.
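Roughly, the layout looks like this (a simplified sketch: the Doc type, queue capacity, and the actual wasm call are placeholders):

```rust
// Rough sketch of the described layout: one shared queue of documents and
// 1000 tokio tasks, each of which would own its own component instance.
use std::sync::Arc;
use deadqueue::limited::Queue;

type Doc = String; // placeholder for a CouchDB document

#[tokio::main]
async fn main() {
    let queue: Arc<Queue<Doc>> = Arc::new(Queue::new(10_000));

    for _ in 0..1000 {
        let queue = queue.clone();
        tokio::spawn(async move {
            loop {
                let doc = queue.pop().await;
                // ... hand `doc` to this task's wasm instance and index the result ...
                drop(doc);
            }
        });
    }

    // Producer side: `queue.push(doc).await` as documents stream in from CouchDB.
}
```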
It runs a little bit better if I have more engines and split the 1000 instances over them, but the sweet spot seems to be around 2 engines. (My CPU has 20 threads.)
I searched all over the documentation but can't find any best practices on how to design such a system. In particular, I don't know whether having multiple engines is supported at all. If I understood correctly, all instances on one engine each run on their own stack but on the same CPU, so having multiple engines should help here.
On one engine it does not really help to have the component function itself async, as component instances are not Clone, so I had to clone the linker, component, and engine to get a new instance for each runner (which runs in its own tokio task). So async here only helps me with the host function itself, or am I missing something?
By "engines" are you referring to wasmtime::Engine? There shouldn't be any reason to use multiple engines (assuming they would all have the same wasmtime::Config).
I think you are looking for the instantiate_pre family of functions. It does everything that can be shared between instances and then returns a pre-initialized module that can be instantiated as often as you want with minimal overhead. Each of these instances is separate and can run on a different CPU core.
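For example, something along these lines (a minimal sketch; the file path and the Host state type are placeholders):

```rust
// Minimal sketch: do the expensive, shareable work once and keep the
// resulting InstancePre around for the worker tasks.
use wasmtime::component::{Component, InstancePre, Linker};
use wasmtime::{Config, Engine};

struct Host; // your host state (host functions, WASI, ...)

fn setup() -> anyhow::Result<(Engine, InstancePre<Host>)> {
    let mut config = Config::new();
    config.async_support(true);
    let engine = Engine::new(&config)?;

    // Compilation happens once here, not once per instance.
    let component = Component::from_file(&engine, "transform.component.wasm")?;

    let mut linker = Linker::<Host>::new(&engine);
    // Register your host functions on the linker here.

    // InstancePre captures everything that can be shared between instances
    // and is cheap to clone into each worker task.
    let pre = linker.instantiate_pre(&component)?;
    Ok((engine, pre))
}
```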
Yes, multiple wasmtime::Engine. I changed some code and it seems I was wrong in my assumption that having multiple engines performs better. Thanks for the tip with instantiate_pre!
I still wonder how to call multiple functions on the same instance, which should run on one core with multiple stacks, as I cannot clone an instance and I need a mutable reference to a store to call a function on an instance.
You cannot run parallel function calls on a single instance or share a single Store today. An InstancePre (returned by instantiate_pre) can be cloned and run with its own Store.
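So each tokio task can take its own clone of the InstancePre and the Engine handle (both are cheap, reference-counted clones), build its own Store, and call the export from there. A minimal sketch, with a made-up "transform" export:

```rust
// Minimal sketch: one Store and one instance per tokio task, all created from
// the same InstancePre. The "transform" export and its signature are made up.
use wasmtime::component::InstancePre;
use wasmtime::{Engine, Store};

struct Host; // same placeholder host state as in the setup sketch

async fn run_worker(engine: Engine, pre: InstancePre<Host>) -> anyhow::Result<()> {
    // Stores are never shared: each call path owns its own Store.
    let mut store = Store::new(&engine, Host);
    let instance = pre.instantiate_async(&mut store).await?;

    // Hypothetical export: transform(input: string) -> string
    let transform =
        instance.get_typed_func::<(String,), (String,)>(&mut store, "transform")?;

    let (output,) = transform
        .call_async(&mut store, ("some document".to_string(),))
        .await?;
    // Component functions need post_return before the next call.
    transform.post_return_async(&mut store).await?;

    println!("{output}");
    Ok(())
}
```

Each worker would then be spawned as its own task, e.g. tokio::spawn(run_worker(engine.clone(), pre.clone())).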
Yes, the only reason for multiple Engines is if you have diverging configs; reuse as much as possible if you want performance :)
Each thread can have its own separate version of your function, but they can't share data, so you have to design your code in a map-reduce way.
Alright, that's good to know. I read a lot about stack switching in the Wasmtime documentation and was really confused about how I'm supposed to use it.
Wasmtime's own stack switching (fibers) is an internal detail of its async support. There is, separately, a proposal for stack switching in guest Wasm code, which is not yet implemented AFAIK.
Since I'm already discussing performance with you: as I don't want my tokio executor to lock up too much, would I need to call engine.increment_epoch every maybe 100 ns? The only example I found uses 1 second, which seems a bit much for the tokio executor in my experience.
Depends on the workload. Spin, which expects relatively IO-heavy workloads, currently uses 10ms. I haven't really tried to tune this experimentally
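For reference, a minimal sketch of wiring that up with a 10 ms tick (the helper names are made up, and the store data type is just () here):

```rust
// Minimal sketch of epoch-based yielding with a 10 ms tick.
use std::time::Duration;
use wasmtime::{Config, Engine, Store};

fn engine_with_epochs() -> anyhow::Result<Engine> {
    let mut config = Config::new();
    config.async_support(true);
    config.epoch_interruption(true);
    let engine = Engine::new(&config)?;

    // A plain background thread bumps the global epoch counter every 10 ms.
    let ticker = engine.clone();
    std::thread::spawn(move || loop {
        std::thread::sleep(Duration::from_millis(10));
        ticker.increment_epoch();
    });

    Ok(engine)
}

fn configure_store(store: &mut Store<()>) {
    // Instead of trapping when the deadline is hit, yield back to the tokio
    // executor and re-arm the deadline one tick into the future.
    store.set_epoch_deadline(1);
    store.epoch_deadline_async_yield_and_update(1);
}
```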
René Rössler has marked this topic as resolved.