Is anyone interested in multi-memory in Wasm up for a chat?
I'm interested in learning about the status of https://github.com/WebAssembly/multi-memory and any other projects in this space.
In particular, it seems very useful to be able to share some of a wasm module's memory with the host, and possibly with other wasm modules, allowing zero-copy semantics for that memory.
This seems possible to achieve with memory mapping, but multi-memory might be a good way to make the semantics clearer.
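For illustration, this is roughly what zero-copy, host-side access to a guest's exported memory looks like with the wasmtime Rust API (untested sketch; the trivial module and the "memory" export name are just placeholders):

use wasmtime::{Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let mut store = Store::new(&engine, ());

    // Placeholder module that only exports its memory; a real module would come from disk.
    let module = Module::new(&engine, r#"(module (memory (export "memory") 1))"#)?;
    let instance = Instance::new(&mut store, &module, &[])?;

    // `data` is a direct view into the guest's linear memory; no copy is made.
    let memory = instance.get_memory(&mut store, "memory").unwrap();
    let data: &[u8] = memory.data(&store);
    println!("guest memory is {} bytes", data.len());
    Ok(())
}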
Anyway, hope someone else is interested in this.
Hi! I'm interested in that, too. From the proposals doc, that spec is in the "implementation phase". From the wasmtime docs, it seems to be supported there. I haven't played with it myself yet, thanks for the reminder :D
I have to admit I'm a bit surprised it's supported by Wasmtime. I had only used it through wasmtime-go, and the config option doesn't seem to be exposed there yet (am I missing something?).
from rust, you can enable it with https://docs.rs/wasmtime/0.28.0/wasmtime/struct.Config.html#method.wasm_multi_memory
I think the go api is similar, but I'm not too familiar with it
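Roughly like this, I believe (untested sketch):

use wasmtime::{Config, Engine, Store};

fn main() -> anyhow::Result<()> {
    // Multi-memory is off by default in wasmtime, so opt in on the Config
    // before building the Engine.
    let mut config = Config::new();
    config.wasm_multi_memory(true);
    let engine = Engine::new(&config)?;
    let _store = Store::new(&engine, ());
    Ok(())
}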
It looks like it's an oversight that it's missing from the C API and therefore all the (non-Rust) embeddings.
I've opened an issue to add it: https://github.com/bytecodealliance/wasmtime/issues/3066
Thanks for work on that, too! @Peter Huene
https://github.com/bytecodealliance/wasmtime-go/pull/91 review would be great :smiley:
Following up on multi-memory:
What's the current story for how the usage of this will be from Rust (and other languages)?
I've read some claims that LLVM already supports multiple address spaces (for AVR + Harvard arch) using either fat pointers or assumptions about read/write/executable data.
Source: https://users.rust-lang.org/t/rust-wasm-multi-memory-feature/54798/6
I'm reading https://github.com/WebAssembly/multi-memory/blob/master/proposals/multi-memory/Overview.md to find out more, but it seems focused on the Wasm side, without particular concern for making this accessible to compilers.
Is the main usage going to be via a special 'copy from memory A to memory B' function and/or instruction?
(Thanks again for the help last time)
Would a custom allocator suffice? This would be predominantly used to share data between a wasm module and something else (another wasm module or some other component), and for that some sort of custom allocator would be sufficient, I'd think.
At this time, AFAIK there are no plans to add language support to Rust for multi-memory and wasm. My assumption has been that multi-memory will come about at some sort of tooling layer as opposed to being compiled in. That being said, we'll probably follow whatever clang/C does in this regard.
I wonder if doing this in wasm64 might actually not be preposterously hard? It seems like it should be possible to reserve a few bytes of each pointer to denote the memory, and at least support, say, 4 or 8 memories
Interesting idea; the hardest part would be making that efficient at load/store time (we really wouldn't want a load/store to such a pointer to require unpacking it each time). Also IIUC multimemory only has static memory indices; wouldn't this require a dynamically-indexed variant?
I guess given the latter, an interesting option might be a memarg variant that basically says "take the memory index from the top N bits of the pointer, dynamically". I have no idea what the history of the multimemory proposal looks like or if that's already been discussed though?
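To make that concrete, a purely hypothetical packing scheme might look like this (nothing in the proposal works this way today, and all the names here are made up):

// Hypothetical wasm64 pointer layout: the top 3 bits select one of up to 8
// memories, and the remaining 61 bits are the offset within that memory.
const INDEX_BITS: u32 = 3;
const OFFSET_BITS: u32 = 64 - INDEX_BITS;
const OFFSET_MASK: u64 = (1u64 << OFFSET_BITS) - 1;

fn pack(mem_index: u64, offset: u64) -> u64 {
    debug_assert!(mem_index < (1u64 << INDEX_BITS) && offset <= OFFSET_MASK);
    (mem_index << OFFSET_BITS) | offset
}

fn unpack(ptr: u64) -> (u64, u64) {
    (ptr >> OFFSET_BITS, ptr & OFFSET_MASK)
}

fn main() {
    let p = pack(3, 0x1000);
    assert_eq!(unpack(p), (3, 0x1000));
}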
I have a few questions...
Since the multimemory proposal doesn't modify pointer values, the compiler needs to know the required memidx for all load and store instructions. I can imagine this being controlled by a global flag set by the user (via a new system API), but that won't work when operating on pointers with different indexes.
My assumption has been that multi-memory will come about at some sort of tooling layer as opposed to being compiled in.
I just can't see how developers will be able to use multimemory without some new language affordances. A custom allocator will almost certainly be required, but ultimately we need some way to tie index info to individual pointers; something like:
#include <cstdint>
#include <cstdlib>

template <typename T>
struct IndexedPtr {
    T *ptr;
    uint32_t idx;
    // could this use 'asm' style methods to implement indexed wasm ops?
};

char *p1 = (char *)malloc(n);                  // normal pointer; compiler will use index 0 for ops using p1
IndexedPtr<char> p2 = mallocIndexed(n, index); // hypothetical allocator; compiler somehow knows to set memidx for ops using p2
And for languages that can't provide pointer type aliasing, this probably gets a lot more cumbersome.
@Chris Fallin: Presumably we wouldn't want to reserve any indexing bits from 32-bit pointers, since that would dramatically curtail the available address space. Maybe using a single bit to indicate that an index value follows in the next byte/word (i.e. pointers would take 40 or 64 bits but each segment can only address up to 2 GB) would be acceptable?
On efficiency: the proposal already requires that memarg is inspected, with index extraction and alignment shift required for non-zero indexing. Would pulling an index from the pointer be all that much slower?
Michael Martin said:
Chris Fallin: Presumably we wouldn't want to reserve any indexing bits from 32-bit pointers, since that would dramatically curtail the available address space. Maybe using a single bit to indicate that an index value follows in the next byte/word (i.e. pointers would take 40 or 64 bits but each segment can only address up to 2 GB) would be acceptable?
Right, my thought was in the context of Till's question "I wonder if doing this in wasm64 might actually not be preposterously hard?"; I think it would only really make sense with 64-bit pointers.
On efficiency: the proposal already requires that memarg is inspected, with index extraction and alignment shift required for non-zero indexing. Would pulling an index from the pointer be all that much slower?
The key distinction is that existing opcodes in the multimemory proposal take a static memarg (see the spec link above), so this can be inspected, and potentially resolved to a baked-in base address (or at least a fixed-offset load from a memory-base-pointers table). In contrast, taking bits from the pointer implies dynamic indexing, with at least a bounds check of some sort. A little bit of overhead, but maybe not too bad; I'm not sure.
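To illustrate the overhead, a dynamically indexed load would have to do something like the following on every access (rough sketch with made-up names, not what Cranelift actually emits):

// Descriptor for one linear memory; `base`/`byte_length` stand in for what the
// engine would keep in its memory-base-pointers table.
struct MemoryDesc {
    base: *mut u8,
    byte_length: u64,
}

unsafe fn dyn_load_u32(memories: &[MemoryDesc], packed_ptr: u64) -> u32 {
    let idx = (packed_ptr >> 61) as usize;        // index pulled from the pointer
    let offset = packed_ptr & ((1u64 << 61) - 1); // remaining offset bits
    let m = &memories[idx];                       // dynamic table lookup
    assert!(offset + 4 <= m.byte_length);         // per-access bounds check
    (m.base.add(offset as usize) as *const u32).read_unaligned()
}

fn main() {
    let mut mem0 = vec![0u8; 64];
    mem0[8..12].copy_from_slice(&42u32.to_le_bytes());
    let memories = [MemoryDesc { base: mem0.as_mut_ptr(), byte_length: 64 }];
    // A "pointer" with memory index 0 in the top bits and offset 8.
    assert_eq!(unsafe { dyn_load_u32(&memories, 8) }, 42);
}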
Ok, that makes sense. I'm still curious as to the higher-level direction for multimemory, though. Absent a 64-bit VM, the current proposal doesn't seem to be particularly useful for developers without further details of how indexed memory ops will be plumbed into the development language(s). Do you know if there's somewhere I can find out more about this?
my understanding is that in its present form it's still pretty useful in a multi-module situation; e.g. if one wants to link a DAG of modules into a single Wasm module, one could include each memory from the original modules and rewrite the code to refer to the appropriate memory
but cc @Alex Crichton @fitzgen (he/him) for more thoughts on that as I have a fairly surface-level understanding
Yeah my impression of multi-memory was that it is mostly a tooling/runtime thing that won't necessarily be exposed to languages themselves. For example I don't know how we'd implement this in Rust or how it'd be surfaced in Rust.
I could maybe imagine memcpy intrinsics in/out of statically-allocated memories, but much more than that seems a bit implausible to me (e.g. a Vec<T> located in a foreign address space seems dubious to implement but also somewhat dubious in motivation as well)
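Purely hypothetical sketch of what such intrinsics could look like, just to illustrate the shape (nothing like this exists today and the names are invented):

extern "C" {
    // Copy `len` bytes starting at `src` in the default memory into memory 1 at `dst_offset`.
    fn copy_into_memory1(dst_offset: u32, src: *const u8, len: u32);
    // Copy `len` bytes from memory 1 at `src_offset` to `dst` in the default memory.
    fn copy_from_memory1(dst: *mut u8, src_offset: u32, len: u32);
}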
... but also somewhat dubious in motivation as well
Our use case is to share (preferably) read-only memory between modules that have their own module-specific working memory. The prime example is a static ML model being shared between isolated runtimes/modules performing different inference calculations. Such models may be quite large and the overhead of copying them for each inference is prohibitive.
For this case a statically reserved space in the standard wasm memory would suffice, but that needs to be mapped onto the existing, external ML model (which may be achievable with POSIX virtual memory manipulation by the wasm host).
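For what it's worth, one way to get a single shared copy today is for the host to create a memory and have every instance import it. Rough, untested sketch with the wasmtime Rust API (the "env"/"model" import names are placeholders; this neither enforces read-only access nor memory-maps the model file, it just avoids per-instance copies):

use wasmtime::{Engine, Extern, Instance, Memory, MemoryType, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let mut store = Store::new(&engine, ());

    // Host-created memory holding the model; both instances import this same
    // memory, so only one copy of the data exists.
    let model_mem = Memory::new(&mut store, MemoryType::new(1, Some(1)))?;
    model_mem.data_mut(&mut store)[..5].copy_from_slice(b"model");

    // Stand-in for the real inference modules, which would import the memory the same way.
    let module = Module::new(&engine, r#"(module (import "env" "model" (memory 1 1)))"#)?;
    let imports = [Extern::from(model_mem)];
    let _instance_a = Instance::new(&mut store, &module, &imports)?;
    let _instance_b = Instance::new(&mut store, &module, &imports)?;
    Ok(())
}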
(For context, I work with Michael Martin)
In addition to this we have a second use case where there's a large data store that different modules should all read from, but not write to.
We'd ideally like to avoid keeping copies of the data store per module.
We'd also like to enforce that the modules can't write to the data store.
This could be something that multi-memory support could help with, particularly via something similar to memory mapping between the modules, or via linking the modules together (though the latter is less desirable, since the modules are not otherwise coupled).