Stream: general

Topic: Module linking, interface types & reducing memory copies


Bernard Kolobara (Oct 20 2021 at 12:00):

Hi :wave: ,

I'm working on a plugin system that builds on top of the module linking proposal and will eventually also support interface types. The idea is simple: I provide a set of core host functions that Wasm plugins build on top of, and the plugins export their own functions, which are used by guest Wasm modules. For example, you could build a WASI plugin that exports WASI-compatible functions but proxies the calls to the native host functions.

You can also chain multiple plugins together. I wanted to avoid copying buffers between each layer/plugin until you hit the host function. So I came up with a system where each Wasm instance exports its memory under a unique ID. I now pass a memory slice as a triple (ID, start, end) between Wasm instances, so that when you eventually want to read from or write to the slice, you grab the right memory and do it. Especially when writing bigger buffers, I want to avoid multiple copies.
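The (ID, start, end) scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: `MemoryRegistry`, `MemSlice`, and `forward` are hypothetical names, and plain `Vec<u8>` buffers stand in for the exported linear memories that a real host would reference.

```rust
use std::collections::HashMap;

/// Hypothetical ID under which an instance exports its memory.
type MemoryId = u32;

/// The (ID, start, end) triple that is passed between instances
/// instead of the bytes themselves.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MemSlice {
    id: MemoryId,
    start: usize,
    end: usize,
}

/// Stand-in for the host-side table of exported memories.
/// Assumption: a real host would hold references to instance
/// memories here, not owned Vecs.
#[derive(Default)]
struct MemoryRegistry {
    memories: HashMap<MemoryId, Vec<u8>>,
}

impl MemoryRegistry {
    /// Register a memory under its unique ID.
    fn register(&mut self, id: MemoryId, memory: Vec<u8>) {
        self.memories.insert(id, memory);
    }

    /// Resolve a slice to the bytes it names. Returns None if the ID
    /// is unknown or the range is out of bounds.
    fn read(&self, slice: MemSlice) -> Option<&[u8]> {
        self.memories.get(&slice.id)?.get(slice.start..slice.end)
    }

    /// Write into the range a slice names. The data must have exactly
    /// the length of the range.
    fn write(&mut self, slice: MemSlice, data: &[u8]) -> Option<()> {
        let target = self
            .memories
            .get_mut(&slice.id)?
            .get_mut(slice.start..slice.end)?;
        if target.len() != data.len() {
            return None;
        }
        target.copy_from_slice(data);
        Some(())
    }
}

/// A plugin layer that only forwards the triple; no bytes are copied.
fn forward(slice: MemSlice) -> MemSlice {
    slice
}
```

In this sketch, each layer in a chain of plugins would call something like `forward`, and only the final host function would call `read` or `write`, so the buffer contents are touched once regardless of how many layers sit in between.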

Now I'm trying to see how this system would fit into interface types. I have been going through the spec, but from my perspective it seems like copies always need to happen when you use interface types. Wasmtime could potentially optimize out some of the copying, but only between native host functions and Wasm instances. That can't work if I want to forward a slice of memory through another instance without first copying it into that instance.

I'm wondering if I'm missing something here and there is a better approach. Or will runtimes be able to optimize these copies away? I would love to support interface types, but I'm worried about all the additional copies if multiple plugins pass a buffer between each other until it hits the final host function.

Dan Gohman (Oct 20 2021 at 12:25):

Hello! One path for some forms of this problem is that we're working on adding a stream type to interface types, for streaming data between components. It's an evolving concept at this point and there's not a lot of documentation yet, but you can see some of the ideas here: https://docs.google.com/presentation/d/1WtnO_WlaoZu1wp4gI93yc7T_fWTuq3RZp8XUHlrQHl4/edit#slide=id.ge7fd2f6194_0_275

async functions and streams in Interface Types WASI subgroup August 12+26, 2021

Bernard Kolobara (Oct 20 2021 at 12:30):

Thanks for sharing the presentation. It's great to see that this issue is being considered and worked on.

Till Schneidereit (Oct 20 2021 at 12:36):

For a lot of use cases, streams will indeed be the right thing to use, and they can be forwarded without intermediate copies. For other uses, where streams aren't the right fit, resources will probably make sense: they allow the originating module to just pass on a handle, which can cheaply be forwarded through your chain of modules. Only the final destination needs to operate on the resource contents, e.g. by calling methods to read them.
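A rough sketch of the handle-forwarding idea, under loud assumptions: this is not the interface types or component model API. `Handle`, `ResourceTable`, and `plugin_layer` are hypothetical names chosen only to show that intermediate layers pass an opaque handle along, while the final destination is the one that calls a method to read the contents.

```rust
use std::collections::HashMap;

/// Opaque handle. Intermediate plugins can pass it along but have
/// no way to look at the bytes behind it.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Handle(u32);

/// Hypothetical table owned by whoever created the resources.
#[derive(Default)]
struct ResourceTable {
    next: u32,
    buffers: HashMap<Handle, Vec<u8>>,
}

impl ResourceTable {
    /// Create a resource and return a handle to it.
    fn create(&mut self, contents: Vec<u8>) -> Handle {
        let handle = Handle(self.next);
        self.next += 1;
        self.buffers.insert(handle, contents);
        handle
    }

    /// A "method" on the resource: copy up to `out.len()` bytes,
    /// starting at `offset`, into `out`. Returns the number of bytes
    /// copied, or None for an unknown handle or an offset past the end.
    fn read(&self, handle: Handle, offset: usize, out: &mut [u8]) -> Option<usize> {
        let buf = self.buffers.get(&handle)?;
        let available = buf.get(offset..)?;
        let n = available.len().min(out.len());
        out[..n].copy_from_slice(&available[..n]);
        Some(n)
    }
}

/// An intermediate plugin: forwards the handle, copies no contents.
fn plugin_layer(handle: Handle) -> Handle {
    handle
}
```

Here only the final call to `read` copies bytes; every `plugin_layer` hop moves a small integer. Unlike the exported-memory approach, the layers in between never gain access to the originating module's memory.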

Till Schneidereit (Oct 20 2021 at 12:38):

"Now I'm trying to see how this system would fit into interface types."

To also answer this question: I don't think it'd fit in, because it'd remove encapsulation. By default, a component shouldn't export its memory (or at least shouldn't need to), so that even the host only interacts with it via its public interfaces, there are no spooky actions at a distance, etc.

Bernard Kolobara (Oct 20 2021 at 12:44):

Yeah, the main issue as I see it is that plugins should have a higher level of trust than regular modules, but this is hard to express if they too are just Wasm modules that are linked together.

Dan Gohman (Oct 20 2021 at 13:07):

Our overall goal is to build a system that's efficient even if you don't have that higher level of trust :-).

Dan Gohman (Oct 20 2021 at 13:09):

This is, admittedly, a more complex goal. But it's an important one, because even in scenarios where one trusts individual modules, one still has to watch out for malicious input data. And as one scales up to large systems with many components, the potential for even trusted components to interact with each other's non-public interfaces in unintended ways goes up.


Last updated: Oct 23 2024 at 20:03 UTC