Stream: wasm

Topic: Multi-memory / shared memory


J Pratt (they/them) (Jul 07 2021 at 03:05):

Is anyone interested in multi-memory in Wasm up for a chat?
I'm interested in learning about the status of https://github.com/WebAssembly/multi-memory and any other projects in this space.
Particularly, it seems very useful to be able to have some of the memory of a wasm module shared with the host and possibly with other wasm modules, allowing zero copy semantics for that memory.
This seems possible to achieve with memory mapping, but multi-memory might be a good way to make the semantics clearer.
Anyway, hope someone else is interested in this.

Stephan Renatus (Jul 07 2021 at 07:36):

Hi! I'm interested in that, too. From the proposals doc, that spec is in the "implementation phase". From the wasmtime docs, it seems to be supported there. I haven't played with it myself yet, thanks for the reminder :D

Stephan Renatus (Jul 07 2021 at 07:37):

I have to admit I'm a bit surprised it's supported by Wasmtime. I had only used it through wasmtime-go, and the config option doesn't seem to be exposed there yet (am I missing something?).

fitzgen (he/him) (Jul 07 2021 at 16:05):

from rust, you can enable it with https://docs.rs/wasmtime/0.28.0/wasmtime/struct.Config.html#method.wasm_multi_memory

I think the go api is similar, but I'm not too familiar with it

Peter Huene (Jul 07 2021 at 16:34):

It looks like it's an oversight that it's missing from the C API and therefore all the (non-Rust) embeddings.

Peter Huene (Jul 07 2021 at 16:39):

I've opened an issue to add it: https://github.com/bytecodealliance/wasmtime/issues/3066

It appears that wasmtime_config_wasm_multi_memory_set is missing from the C API and therefore all the non-Rust language embeddings. This means that the other language embeddings can't enable mu...

Stephan Renatus (Jul 09 2021 at 08:00):

Thanks for working on that, too! @Peter Huene

Stephan Renatus (Jul 12 2021 at 11:14):

https://github.com/bytecodealliance/wasmtime-go/pull/91 review would be great :smiley:

Fixes #89.

J Pratt (they/them) (Aug 03 2021 at 02:30):

Following up on multi-memory:

What's the current story for how this will be used from Rust (and other languages)?

I've read some claims that LLVM already supports multiple address spaces (for AVR + Harvard arch) using either fat pointers or assumptions about read/write/executable data.

Source: https://users.rust-lang.org/t/rust-wasm-multi-memory-feature/54798/6

I'm reading https://github.com/WebAssembly/multi-memory/blob/master/proposals/multi-memory/Overview.md to find out more, but it seems focused on the WASM side, without particular concern for making this accessible to compilers.

Is the main usage going to be via a special 'copy from memory A to memory B' function and/or instruction?

(Thanks again for the help last time)

Pardon me if I'm wrong, but as far as I can tell the WASM multi-memory proposal doesn't exactly use fat pointers. There's simply a new immediate added to the memory instructions which indicates which memory to operate on - so the compiler has to know which ops go to which memory during compilation, which complicates things. That's why I was talking about special syntaxes. One use-case I was hoping to explore was to expose shared memory between modules while also giving each their own private h...

Alexandru Ene (Aug 03 2021 at 09:37):

Would a custom allocator suffice? This would be predominantly used to share data between wasm module and something else (another wasm module or some other component). Some sort of a custom allocator would be sufficient I'd think.

Alex Crichton (Aug 03 2021 at 14:01):

At this time AFAIK there are no plans to add language support to Rust for multi-memory and wasm. My assumption has been that multi-memory will come about via some sort of tooling later, as opposed to being compiled in. That being said, we'll probably follow whatever clang/C does in this regard.

Till Schneidereit (Aug 03 2021 at 14:06):

I wonder if doing this in wasm64 might actually not be preposterously hard? It seems like it should be possible to reserve a few bytes of each pointer to denote the memory, and at least support, say, 4 or 8 memories

Chris Fallin (Aug 03 2021 at 17:36):

Interesting idea; the hardest part would be making that efficient at load/store time (we really wouldn't want a load/store to such a pointer to require unpacking it each time). Also IIUC multimemory only has static memory indices; wouldn't this require a dynamically-indexed variant?

I guess given the latter, an interesting option might be a memarg variant that basically says "take the memory index from the top N bits of the pointer, dynamically". I have no idea what the history of the multimemory proposal looks like or if that's already been discussed though?
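A minimal sketch of the pointer-tagging idea floated above, assuming a made-up layout in which the top 3 bits of a 64-bit address select one of 8 memories and the remaining 61 bits are the byte offset. None of this is part of the multi-memory proposal; the names `pack_pointer`, `memory_index_of` and `offset_of` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical layout (not part of any proposal): the top kIndexBits bits of a
// 64-bit address select the memory, the remaining bits are the byte offset.
constexpr unsigned kIndexBits = 3;                 // enough for 8 memories
constexpr unsigned kOffsetBits = 64 - kIndexBits;  // 61 bits of offset
constexpr std::uint64_t kOffsetMask = (std::uint64_t{1} << kOffsetBits) - 1;
constexpr std::uint64_t kIndexMask = (std::uint64_t{1} << kIndexBits) - 1;

// Combine a memory index and an offset into one tagged 64-bit value.
// Bits that do not fit in either field are silently dropped.
constexpr std::uint64_t pack_pointer(std::uint64_t memory_index, std::uint64_t offset) {
  return ((memory_index & kIndexMask) << kOffsetBits) | (offset & kOffsetMask);
}

// Recover the memory index stored in the top bits.
constexpr std::uint64_t memory_index_of(std::uint64_t tagged) {
  return tagged >> kOffsetBits;
}

// Recover the byte offset stored in the low bits.
constexpr std::uint64_t offset_of(std::uint64_t tagged) {
  return tagged & kOffsetMask;
}
```

With this encoding a pointer into memory 0 is bit-for-bit the same as an untagged offset, so existing single-memory code would keep working unchanged.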

Michael Martin (Aug 10 2021 at 03:00):

I have a few questions...

Since the multimemory proposal doesn't modify pointer values, the compiler needs to know the required memidx for all load and store instructions. I can imagine this being controlled by a global flag set by the user (via a new system API), but that won't work when operating on pointers with different indexes.

My assumption has been that multi-memory will come about via some sort of tooling later, as opposed to being compiled in.

I just can't see how developers will be able to use multimemory without some new language affordances. A custom allocator will almost certainly be required, but ultimately we need some way to tie index info to individual pointers; something like:

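#include <cstdint>
#include <cstdlib>

template<typename T>
struct IndexedPtr {
  T *ptr;        // address within the selected memory
  uint32_t idx;  // wasm memory index (memidx) the address belongs to
  // could this use 'asm' style methods to implement indexed wasm ops?
};

// Hypothetical allocator that hands out space from the wasm memory `index`.
template<typename T>
IndexedPtr<T> mallocIndexed(size_t n, uint32_t index);

char *p1 = static_cast<char *>(malloc(n));  // normal pointer; compiler will use index 0 for ops using p1
IndexedPtr<char> p2 = mallocIndexed<char>(n, index);  // compiler somehow knows to set memidx for ops using p2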

And for languages that can't provide pointer type aliasing, this probably gets a lot more cumbersome.

@Chris Fallin: Presumably we wouldn't want to reserve any indexing bits from 32-bit pointers, since that would dramatically curtail the available addressing space. Maybe using a single bit to indicate that an index value follows in the next byte/word (i.e. pointers would take 40 or 64 bits but each segment can only address up to 2 GB) would be acceptable?

On efficiency: the proposal already requires that memarg is inspected, with index extraction and alignment shift required for non-zero indexing. Would pulling an index from the pointer be all that much slower?

Chris Fallin (Aug 10 2021 at 05:22):

Michael Martin said:

Chris Fallin: Presumably we wouldn't want to reserve any indexing bits from 32-bit pointers, since that would dramatically curtail the available addressing space. Maybe using a single bit to indicate that an index value follows in the next byte/word (i.e. pointers would take 40 or 64 bits but each segment can only address up to 2 GB) would be acceptable?

Right, my thought was in the context of Till's question "I wonder if doing this in wasm64 might actually not be preposterously hard?"; I think it would only really make sense with 64-bit pointers.

On efficiency: the proposal already requires that memarg is inspected, with index extraction and alignment shift required for non-zero indexing. Would pulling an index from the pointer be all that much slower?

The key distinction is that existing opcodes in the multimemory proposal take a static memarg (see the spec link above) so this can be inspected, and potentially resolved to a baked-in base address (or at least a fixed-offset load from a memory-base-pointers table). In contrast, taking bits from the pointer implies dynamic indexing, with at least a bounds-check of some sort. A little bit of overhead but maybe not too bad, I'm not sure.
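To make the static/dynamic distinction concrete, here is a toy sketch under stated assumptions: `Instance` is a hypothetical stand-in for an instance with two memories, and the dynamic variant assumes the memory index lives in the top 3 bits of a 64-bit pointer. Real engines work very differently; this only shows where the extra run-time check appears.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy stand-in for an instance with two linear memories (hypothetical type).
struct Instance {
  std::array<std::vector<std::uint8_t>, 2> memories;
};

// Static case: the memory index is a compile-time constant, like the memarg
// immediate, so it is validated once and needs no run-time index check.
template <std::size_t Index>
std::optional<std::uint8_t> load_static(const Instance& inst, std::uint64_t offset) {
  static_assert(Index < 2, "memory index is validated ahead of time");
  const auto& mem = std::get<Index>(inst.memories);
  if (offset >= mem.size()) return std::nullopt;  // out of bounds: would trap
  return mem[offset];
}

// Dynamic case: the index comes from the top 3 bits of the pointer (an assumed
// encoding), so each access also has to check the index and pick the memory.
inline std::optional<std::uint8_t> load_dynamic(const Instance& inst, std::uint64_t tagged) {
  const std::uint64_t index = tagged >> 61;
  const std::uint64_t offset = tagged & ((std::uint64_t{1} << 61) - 1);
  if (index >= inst.memories.size()) return std::nullopt;  // extra index check
  const auto& mem = inst.memories[index];
  if (offset >= mem.size()) return std::nullopt;  // out of bounds: would trap
  return mem[offset];
}
```

In the static version the choice of memory is fixed before the code runs; in the dynamic version it is data, which is the source of the overhead being discussed.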

Michael Martin (Aug 10 2021 at 06:03):

Ok, that makes sense. I'm still curious as to the higher-level direction for multimemory, though. Absent a 64-bit VM, the current proposal doesn't seem to be particularly useful for developers without further details of how indexed memory ops will be plumbed into the development language(s). Do you know if there's somewhere I can find out more about this?

Chris Fallin (Aug 10 2021 at 06:15):

my understanding is that in its present form it's still pretty useful in a multi-module situation; e.g. imagine one wants to link a DAG of modules into a single Wasm module, one could include each memory from the original modules and rewrite code to refer to the appropriate memory
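The rewriting step in that linking scenario can be sketched as plain index arithmetic. This is only an illustration, assuming the merged module simply lays out each input module's memories one after another; `merged_memory_bases` and `remap_memory_index` are hypothetical names, not functions of any existing tool.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// For each input module, the index its memory 0 receives in the merged module
// (an exclusive prefix sum over the per-module memory counts).
inline std::vector<std::uint32_t> merged_memory_bases(
    const std::vector<std::uint32_t>& memories_per_module) {
  std::vector<std::uint32_t> bases;
  bases.reserve(memories_per_module.size());
  std::uint32_t next = 0;
  for (std::uint32_t count : memories_per_module) {
    bases.push_back(next);
    next += count;
  }
  return bases;
}

// Rewrite a module-local memory index into the merged module's index space.
// Throws std::out_of_range if `module` is not a valid module number.
inline std::uint32_t remap_memory_index(const std::vector<std::uint32_t>& bases,
                                        std::size_t module,
                                        std::uint32_t local_index) {
  return bases.at(module) + local_index;
}
```

Because the memory index is a static immediate, a linker can apply this remapping once to every load and store without touching any pointer values.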

Chris Fallin (Aug 10 2021 at 06:16):

but cc @Alex Crichton @fitzgen (he/him) for more thoughts on that as I have a fairly surface-level understanding

Alex Crichton (Aug 10 2021 at 16:14):

Yeah my impression of multi-memory was that it is mostly a tooling/runtime thing that won't necessarily be exposed to languages themselves. For example I don't know how we'd implement this in Rust or how it'd be surfaced in Rust.

Alex Crichton (Aug 10 2021 at 16:15):

I could maybe imagine like memcpy intrinsics in/out of statically-allocated memories, but much more than that seems a bit implausible to me (e.g. a Vec<T> located in a foreign address space seems dubious to implement but also somewhat dubious in motivation as well)
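As a rough picture of what such a copy primitive has to do, here is a sketch that models two memories as byte vectors and copies between them with bounds checks. It is not a Rust or Wasm API; `Memory` and `copy_between` are hypothetical names, and a real intrinsic would trap rather than return `false`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model of a linear memory (hypothetical alias).
using Memory = std::vector<std::uint8_t>;

// Copy `len` bytes from src[src_offset..] to dst[dst_offset..].
// Returns false, leaving dst untouched, if either range is out of bounds.
inline bool copy_between(Memory& dst, std::size_t dst_offset,
                         const Memory& src, std::size_t src_offset,
                         std::size_t len) {
  // Written to avoid overflow in `offset + len`.
  if (src_offset > src.size() || len > src.size() - src_offset) return false;
  if (dst_offset > dst.size() || len > dst.size() - dst_offset) return false;
  if (len == 0) return true;
  // memmove stays correct even if both arguments refer to the same memory.
  std::memmove(dst.data() + dst_offset, src.data() + src_offset, len);
  return true;
}
```

Both ranges are checked before any byte is written, so a failed copy has no partial effect.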

Michael Martin (Aug 16 2021 at 03:20):

... but also somewhat dubious in motivation as well

Our use case is to share (preferably) read-only memory between modules that have their own module-specific working memory. The prime example is a static ML model being shared between isolated runtimes/modules performing different inference calculations. Such models may be quite large and the overhead of copying them for each inference is prohibitive.

For this case a statically reserved space in the standard wasm memory would suffice, but that needs to be mapped onto the existing, external ML model (which may be achievable with POSIX virtual memory manipulation by the wasm host).

J Pratt (they/them) (Aug 17 2021 at 03:46):

(For context, I work with Michael Martin)

In addition to this we have a second use case where there's a large data store that different modules should all read from, but not write to.
We'd ideally like to avoid keeping copies of the data store per module.
We'd also like to enforce that the modules can't write to the data store.

This could be something that multi-memory support could help with, particularly with something similar to memory mapping between the modules or linking the modules together (though this is less desirable, since the modules are not coupled).


Last updated: Nov 22 2024 at 16:03 UTC