abrown opened PR #4187 from shared-memory-vmcontext to main:
This change adds the ability to use shared memories in Wasmtime when the [threads proposal] is enabled. Shared memories are annotated as `shared` in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md
In order to implement this in Wasmtime, there are two main cases to cover:
- a program may simply create a shared memory and possibly export it; this means that Wasmtime itself must be able to create shared memories
- a user may create a shared memory externally and pass it in as an import during instantiation; this is the case when the program contains code like `(import "env" "memory" (memory 1 1 shared))`--this case is handled by a new Wasmtime API type--`SharedMemory` (see the sketch below)
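As a rough illustration of the second case, here is a minimal embedding sketch; it uses today's `wasmtime` API names (`Config::wasm_threads`, `MemoryType::shared`, `SharedMemory::new`), which may differ in detail from the API as it stood on this branch.

```rust
use wasmtime::{Config, Engine, Instance, MemoryType, Module, SharedMemory, Store};

fn main() -> anyhow::Result<()> {
    // Enable the threads proposal so `shared` memories validate.
    let mut config = Config::new();
    config.wasm_threads(true);
    let engine = Engine::new(&config)?;

    // Create the shared memory externally to Wasmtime's instantiation...
    let memory = SharedMemory::new(&engine, MemoryType::shared(1, 1))?;

    // ...and pass it in as an import during instantiation.
    let module = Module::new(
        &engine,
        r#"(module (import "env" "memory" (memory 1 1 shared)))"#,
    )?;
    let mut store = Store::new(&engine, ());
    let _instance = Instance::new(&mut store, &module, &[memory.into()])?;
    Ok(())
}
```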
Because of the first case, this change allows any of the current memory-creation mechanisms to work as-is. Wasmtime can still create either static or dynamic memories in either on-demand or pooling modes, and any of these memories can be considered shared. When shared, the `Memory` runtime container will lock appropriately during `memory.size` and `memory.grow` operations; since all memories use this container, it is an ideal place for implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses the same `Mmap` allocation under the hood as non-shared memories, but allows the user to perform the allocation externally to Wasmtime and share the memory across threads (via an `Arc`). The pointer address to the actual memory is carefully wired through and owned by the `SharedMemory` structure itself. This means that there are differing views of where to access the pointer (i.e., `VMMemoryDefinition`): for owned memories (the default), the `VMMemoryDefinition` is stored directly by the `VMContext`; in the `SharedMemory` case, however, this `VMContext` must point to this separate structure.

To ensure that the `VMContext` can always point to the correct `VMMemoryDefinition`, this change alters the `VMContext` structure. Since a `SharedMemory` owns its own `VMMemoryDefinition`, the `defined_memories` table in the `VMContext` becomes a sequence of pointers--in the shared memory case, they point to the `VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a new table, `owned_memories`.
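As an illustration only (these are not the actual Wasmtime definitions, and the fields are simplified), the resulting layout can be pictured like this:

```rust
#[repr(C)]
pub struct VMMemoryDefinition {
    pub base: *mut u8,
    pub current_length: usize, // read atomically when the memory is shared
}

// Conceptually, inside the VMContext:
//
//   defined_memories: [*mut VMMemoryDefinition; num_defined_memories]
//     - owned (non-shared) memory: the pointer targets an entry of
//       `owned_memories` below
//     - shared memory: the pointer targets the `VMMemoryDefinition` owned by
//       the `SharedMemory` structure, outside the `VMContext`
//
//   owned_memories: [VMMemoryDefinition; num_owned_memories]
```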
This change adds an additional indirection (through the `*mut VMMemoryDefinition` pointer) that could add overhead. Using an imported memory as a proxy, we measured a 1-3% overhead of this approach on the `pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will special-case the owned memory access (i.e., load a pointer directly to the `owned_memories` entry) for `memory.size` so that only shared memories (and imported memories, as before) incur the indirection cost.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
I think a refactoring to make this work would be to:
- change the signature of `RuntimeLinearMemory::grow` to take a `limiter_fn: Option<Fn<...>>` instead of the entire store
- remove the error reporting to the limiter (`Store::memory_grow_failed` -> `ResourceLimiter::memory_grow_failed`); right now it is unclear (except during testing) why a limiter would need to receive the growth failure--shouldn't it be the one producing this error instead? If indeed this _is_ needed, perhaps we could alter the suggestion above to an `Option<dyn ResourceLimiter>`?
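A hypothetical shape for the first suggestion above (the trait name comes from this thread, but the parameter types and closure signature are illustrative, loosely mirroring `ResourceLimiter::memory_growing`'s current/desired/maximum parameters):

```rust
// Sketch only: pass just the limiting hook into `grow` instead of the whole store.
trait RuntimeLinearMemory {
    fn grow(
        &mut self,
        delta_pages: u64,
        // Returns whether growth from `current` to `desired` bytes (with an
        // optional `maximum`) should be allowed.
        limiter_fn: Option<&mut dyn FnMut(usize, usize, Option<usize>) -> bool>,
    ) -> Option<(usize, usize)>;
}
```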
abrown edited PR review comment.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
It is unclear what to do at this point if the module's memory is a shared one... The current logic below is incorrect.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I see this was added to import `OwnedMemoryIndex`, but for most other types the `wasmtime-types` crate is largely reexported through the `wasmtime-environ` crate. Could the indexes be used through there perhaps?
alexcrichton created PR review comment:
I think this is no longer needed?
alexcrichton created PR review comment:
Now that shared memories are supported I think it's ok to remove this and the block below that was modified (and let wasmparser handle using shared memories without the threads feature enabled)
alexcrichton created PR review comment:
"instantiation" here I think should be reworded
alexcrichton created PR review comment:
Since `self` is `Copy` I think the `into_` here can be skipped and this can probably instead be `as_shared_memory` perhaps? (taking `&self` as an argument)
alexcrichton created PR review comment:
To me this isn't necessarily an error per se, I think a better signature for this function might be `-> Option<SharedMemory>` (since there's only one "error" condition here anyway)
alexcrichton created PR review comment:
leaving a note here to ensure this isn't forgotten before landing
alexcrichton created PR review comment:
This API isn't actually safe to provide because otherwise you could do something like:

```rust
let mut mem1: SharedMemory = ...;
let mut mem2 = mem1.clone();
let a = mem1.data_mut();
let b = mem2.data_mut();
// .. a and b alias here
```

which isn't memory-safe in Rust. In the `Memory` API we leverage that the owner of the memory is `Store<T>` so by taking `AsContextMut` we can thread through the lifetime of that borrow onto the borrow of memory, but here with no single owner we can't do that.

We actually can't even support `fn data(&self) -> &[u8]` either since that type signature in Rust assumes no writes will happen to the array (when in fact they can happen). For now all we can do I think is return perhaps `*mut [u8]` and then document how to be careful with it.

This also reminds me, we'll need to do something about `Memory::data{,_mut}` as well. That'll either need to return `None` or panic for a shared memory. I feel like panicking may be the best bet for now, but we'd need to clearly document that as well (and panicking isn't great but the set of embedders that don't want to think about threads is probably nontrivial to the point that returning an `Option` would be too onerous)

This also
alexcrichton created PR review comment:
I think the minimum/maximum here should be `u32` if this is a 32-bit memory constructor (and later with `MemoryType::shared64` the arguments would be `u64`)
alexcrichton created PR review comment:
Similar to above could the memory get reexported and the embedding api tested on the returned handle?
alexcrichton created PR review comment:
Could this test exercise growing the memory externally and ensuring that the instance sees the grown memory size? Also could you add a second instance here and ensure that both instances see the same size?
alexcrichton created PR review comment:
From what I was mentioning above about how `data_mut` isn't a sound API in Rust, for the memory modification here it's probably best to have an exported function from the wasm itself which modifies memory and that's invoked here to do the modifications on behalf of this test.

Otherwise though having an `unsafe` block here to do the modification is also fine (but this should be `unsafe` one way or another with a non-atomic write)
alexcrichton created PR review comment:
It's ok to skip the `Arc` wrappers here since the `Clone` implementation on the inner types already only clones an `Arc`
alexcrichton created PR review comment:
I believe this assertion can get tripped and as-written I think that's expected. I believe 4 different threads can all concurrently execute `grow`, they witness sizes N, N+1, N+2, and N+3, and then the threads all race to push onto the `sizes` list, meaning that there's no guarantee it'll be pushed in a sorted order.

I think though one thing that could be tested here is that the returned list is sorted and then assert that each element from 1 to N is present in the array?
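A small sketch of the suggested check (the `sizes` vector and its element type are assumptions about the test in question): sort what the threads observed, then require exactly the expected run of sizes to be present.

```rust
fn check_observed_sizes(mut sizes: Vec<u32>, initial_size: u32) {
    // Each concurrent `grow` should have witnessed a distinct previous size,
    // but the pushes may have happened out of order.
    sizes.sort();
    let expected: Vec<u32> = (initial_size..initial_size + sizes.len() as u32).collect();
    assert_eq!(sizes, expected);
}
```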
alexcrichton created PR review comment:
If an `atomic_load` doesn't take an offset in clif for now it's probably ok to emit an `i32.add` to add in the offset?
alexcrichton created PR review comment:
This got me thinking "does this need to be an atomic load?" and I think the answer is no because shared memories never have their base pointer change (so it's write-once only during initialization). That being said though one thing this did remind me of is that the `current_length` should never be used here. I think it's only used for dynamic memories, so could this have an assert below that the dynamic memory branch is never taken if the memory is shared?
alexcrichton created PR review comment:
I don't think that this is quite right because imported memories are listed in `memory_plans` (it's all memories in the module). I think that imported memories need to be excluded here?
alexcrichton created PR review comment:
Leaving a note here to be sure to fill this "TODO" in
alexcrichton created PR review comment:
Similar to above, I think that the calculation here needs to exclude imported memories first, and then exclude shared memories from the remaining set of memories to figure out the number of owned memories.
alexcrichton created PR review comment:
I think this should instead be an assert that the `memory` index specified is in-bounds
alexcrichton created PR review comment:
IIRC the reason for passing them explicitly was that it's somewhat tricky to calculate the maximum/minimum converting from u32-or-u64 to `usize`, so by passing them here it's "pre-validated" from the results of the previous calculations.
alexcrichton created PR review comment:
Since the outer `SharedMemory` is boxed up as well, could this perhaps be `SharedMemoryInner<T>` and the memory here is `memory: RwLock<T>`?
alexcrichton created PR review comment:
This I think has pretty deep implications which require an audit of callers of this function to ensure they still work correctly. Looking I see:

- `Instance::memory` - this is not ok for shared memories since it does a non-atomic read of the pointer
- `Instance::set_memory` - this is no longer valid and needs to be updated. Callers of this function are:
  - `Instance::memory_grow` - this'll need to be updated since the update only happens for owned memories
  - `Instance::get_exported_memory` - this is ok but I think callers of this need to be updated to do atomic reads of the length field
  - `Instance::memory_index` - this is no longer valid and needs to be updated (or deleted)
alexcrichton created PR review comment:
This growth operation, on success, needs to perform an atomic increment of the `LongTermVMMemoryDefinition` current length I think?
alexcrichton created PR review comment:
Could this function take `MemoryPlan` instead of recalculating it here?

Also, could this internally validate that the plan is `Static`?
alexcrichton created PR review comment:
This change is actually a bit worrisome to me because the purpose of this argument is to limit memory usage within a `Store`, but our current strategy of a per-`Store` memory limiter is not sufficient to protect against shared memories.

If `store` is `None` could this assert that the memory is shared? Additionally I think we'll need to keep a note on a tracking issue for shared memories to figure out how best to limit them since our current `ResourceLimiter` won't work
alexcrichton created PR review comment:
... or actually, here's an idea.

Right now we have `enum Extern { ..., Memory(Memory), ... }` and even shared memories here are represented and wrapped as a `Memory`. One thing we could do is statically make a distinction between `Memory` and `SharedMemory`, meaning that a `Memory` wouldn't wrap a `SharedMemory`. That way something like `instance.get_memory(...)` would fail because the shared memory wouldn't be classified as a `Memory`.

This is fine to do as a follow-up PR or just keep in the back of our heads now though. I realize this is a pretty different design from what you've been working on so far.
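A sketch of the statically-distinguished shape being floated here (placeholder types, not the existing definitions):

```rust
struct Memory;       // stand-in for the store-owned memory handle
struct SharedMemory; // stand-in for the new shared-memory handle

enum Extern {
    // ... other variants elided ...
    Memory(Memory),
    // A shared memory gets its own variant instead of being wrapped in `Memory`,
    // so e.g. `instance.get_memory(...)` would not classify it as a `Memory`.
    SharedMemory(SharedMemory),
}
```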
alexcrichton created PR review comment:
Oh sorry I sort of independently came to a similar conclusion that this would be a problem above before I saw your comment here.

I think for now let's skip limiting memory growth of shared memories with `ResourceLimiter`, so this would thread through a `None` or similar (and wherever uses the limiter would assert the store is present if the memory isn't shared).

Overall we're going to need a different strategy for limiting shared memories since growth would notify a random store which isn't necessarily the right "store". I would naively say that we'll need a limiter-per-shared-linear-memory but we can figure that out later, no need to have it covered for now.
alexcrichton created PR review comment:
Given that this is only debuginfo related and debuginfo is off-by-default I think it'd be fine to just return an error here for shared memories and have someone else come along later and figure this out if necessary.
alexcrichton created PR review comment:
Could this include some comments about why it's safe to do this?
alexcrichton created PR review comment:
(only applicable for multi-memory of course, but still something we'll want to handle, and perhaps a good test case to add eventually as well)
alexcrichton created PR review comment:
Or actually I see now that the downcast into a shared memory might be difficult with that, so could this otherwise be hardcoded as an `MmapMemory`?
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Not sure I understand what you mean by "dynamic memory branch" here... there is an `if is_shared ...` above?
abrown submitted PR review.
abrown created PR review comment:
Done in `grow` instead.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
`defined_memory_index` a few lines above was returning `Option<...>` so I kept the same level of abstraction--`defined_memory_index` can fail by indexing into the imported memories and this one can fail by not finding an owned memory within the index space. Looking at it again, now I'm wondering if the `.skip(...)` actually makes the lower part of this incorrect...
abrown submitted PR review.
abrown created PR review comment:
Right now there are multiple kinds of memory that could get wrapped into a `SharedMemory`? (See `new_*` methods).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown has marked PR #4187 as ready for review.
abrown requested alexcrichton for a review on PR #4187.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Can the tests in `test_vmmemory_import` below be expanded to test the offsets for this field as well?
alexcrichton created PR review comment:
With a split type now I think that this `preallocation` parameter can be removed since it's always `None`?
alexcrichton created PR review comment:
Could this perhaps filter for `!p.1.memory.shared` to calculate `num_owned_memories`? It's a little non-obvious otherwise what's going on here with the formula below.
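A minimal sketch of that calculation (the `MemoryPlan` type here is a stand-in for the wasmtime-environ plan referenced in this thread, and the imported-memory skip is an assumption carried over from the earlier comments):

```rust
struct MemoryPlan {
    shared: bool, // corresponds to `p.1.memory.shared` in the real type
}

fn num_owned_memories(memory_plans: &[MemoryPlan], num_imported_memories: usize) -> usize {
    memory_plans
        .iter()
        .skip(num_imported_memories) // imported memories are never owned here
        .filter(|plan| !plan.shared) // shared memories own their own definition
        .count()
}
```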
alexcrichton created PR review comment:
I think this function should go ahead and return `*mut [u8]` probably. `*const` is just as unsafe as `*mut` here since neither mean anything in this concurrent context.
alexcrichton created PR review comment:
Noting the TODO here
alexcrichton created PR review comment:
I think it's ok to put this `use` at the top of the module
alexcrichton created PR review comment:
Reading over this again, could this function perhaps be changed to take an `OwnedMemoryIndex` since those are the only ones we should ever set?
alexcrichton created PR review comment:
This is surprising to me in that `vmimport` is intended to be a cheap "read the field" sort of method rather than one that actually allocates within the provided store.

I thought we were going to do something where the `VMMemoryImport` here was stored within the `wasmtime_runtime::SharedMemory` itself? That would mean that this method would need to push `self.0.clone()` into the `StoreOpaque` somewhere to keep a strong reference to the shared memory, but I don't think we otherwise need to allocate a separate `VMContext` any more per store we're inserting within.

I know we've waffled a bit on this historically, so maybe I'm forgetting why this wasn't done?
alexcrichton created PR review comment:
Could this `memory_id` be learned from the `memory_plans.push` above like before?
alexcrichton created PR review comment:
Yeah this is the correct engine. If the shared memory came from internally within a module it's correct, and if it came from externally it was validated going in to be of the correct engine.
alexcrichton created PR review comment:
I'm still pretty uncomfortable with this, I don't think that we should sometimes apply store-related limiting to shared memories and sometimes not depending on whatever happens to be the context. I don't think stores should cover shared memories at all due to their shared nature, so I think that if a store is available it needs to be intentionally ignored for shared memories.
alexcrichton created PR review comment:
Thinking on this again, is the `RwLock` really necessary here? (as opposed to a `Mutex`)

There aren't that many "long running" operations that only need a read lock I think?
alexcrichton created PR review comment:
I'm a little worried that in the case of a shared memory this growth drops the error on the floor and it doesn't go anywhere. Could you perhaps add a `// FIXME`?

(in general I think it would be good to create a general tracking issue (or edit an existing issue) to keep track of all these items related to shared memories, I feel like there are quite a few follow-ups to be done before everything is "stable")
alexcrichton created PR review comment:
Noting the TODO here.

Additionally instead of using `if is_shared` you could probably also do:

```rust
match self.module.owned_memory_index(def_index) {
    Some(owned) => /* ... */,
    None => /* shared ... */,
}
```
alexcrichton created PR review comment:
Is it possible to panic here? I think that this `VMMemoryDefinition` should only ever be used for owned memories by the `wasmtime_runtime::Instance` to update the state of the memory itself.

Basically it seems like a bug if this is ever actually called for shared memories.
alexcrichton created PR review comment:
To perhaps stress this a bit more, could the loop happen in wasm itself? The growth loop could also perhaps happen entirely within wasm as well, where storing -1 in some region of memory signals "stop looping on `memory.size`" and both threads return?
alexcrichton created PR review comment:
Can the comment here be expanded a bit? I initially was wary of this since there's nothing atomic about this in terms of concurrent growth, but this is happening inside of a write lock so it should be fine. Additionally this field isn't only read by JIT-generated code but a ton of the runtime support reads this field as well (as I'm sure you're well aware of now)
alexcrichton created PR review comment:
Bumping this comment again
alexcrichton created PR review comment:
The above `defined_memory_index` is different though: there the `Option` doesn't mean out-of-bounds, it means "that's an imported memory" (which is the purpose of the function, to classify the index as either imported or locally defined).

Here I don't think an attempt should be made to handle out-of-bounds behavior. Otherwise the `None` return value either means "out of bounds" or "not an owned memory index", and given that nothing should ever be out of bounds internally I don't think it's necessary to handle the out of bounds case.
alexcrichton created PR review comment:
Although actually, it looks like this function is basically the fusion of `Memory::new_dynamic` and `SharedMemory::wrap`, so could this function be deleted entirely and callers get updated to using `Memory::new_dynamic`?

Or could this call `Memory::new_dynamic` and then do the cast back to `SharedMemory` as an assertion?
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
A comment here noting that this is the only access to the length field from JIT'd code for a shared memory would be helpful. Specifically I was worried that a dynamically-bounds-checked access could exist and went looking for logic that would generate atomic loads in the `heap_addr` lowering, or some equivalent change; then realized that we prevent that case by construction, by only allowing `Static`-type shared memories. Could we document that here?
cfallin created PR review comment:
This doc-comment seems a little unclear to me: if the result isn't exactly precise at all times, how might it differ? Is it always smaller or equal (because memory can only grow but it may be outdated)? Or something else? It seems we need a stronger statement here to rely on it at all.

I also wonder if this is necessary vs. a variant that uses `Ordering::Acquire` and makes a stronger claim? Or in other words, can we justify why a "fast and imprecise" variant is needed?
cfallin created PR review comment:
I'm not sure I fully understand the atomicity guarantees around how the actual growth (making pages accessible) and setting the new size (below) occur. A comment here might be good to have, explaining why it is OK for these to be two separate steps, and what happens if other concurrently-running threads run Wasm that use `memory.size` and probe pages for accessibility?

Specifically I'm curious about the case where another thread (i) accesses a memory address, finding that it does not trap, and execution continues; (ii) runs `memory.size`, and sees that the memory is supposedly smaller than the just-accessed address. This is possible if one imagines this (growing) thread to pause between the `grow()` above and the atomic store below.

Do we say that this is a racy Wasm program and so results are undefined? In other words, are all memory accesses that implicitly depend on the memory length actually non-synchronizing, and any Wasm program that depends on a certain access to an address trapping must ensure it synchronizes with any other threads growing the shared memory?

That seems like a reasonable enough answer to me, but we should (i) document it, and (ii) confirm that it's spec-compliant. And actually I hope that it is our answer, because providing a stronger unit of atomicity seems really hard (pause the world? Or set the new length first, and catch and handle a false out-of-bounds trap by taking a lock on length and re-checking to be sure in the signal handler?).
cfallin created PR review comment:
I think in the sense of "make sure we don't feed `current_length` to the heap-access machinery in Cranelift" safety, we could maybe make `current_length_offset` an `Option` here, and return `None` for the shared case. Then below wherever we use it, unwrap it with "expected current-length field (always present on non-shared memories)" or something of the sort as an `expect` error?
alexcrichton created PR review comment:
Actually that raises another question for me about atomic orderings which sort of furthers my concerns about using `ptr::copy` for `memory.copy` and friends. If one thread witnesses that the shared memory has a particular size I don't know what the guarantees are about witnessing other parts of memory in terms of memory orderings. For example should this be `Acquire`? Should this be `SeqCst`? Do wasm modules otherwise need to `atomic.fence` before calling host functionality or something like that?

I personally know very little about the atomic memory model for wasm and how it's all intended to work out so there may not be a great answer here. That being said it's almost certainly not wrong to fall back to `SeqCst` so I might say we should do that to be maximally conservative?
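The conservative option being discussed, in miniature (modeling the shared memory's length as an `AtomicUsize`, which is an assumption about the eventual representation):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Read the current byte length of a shared memory with the strongest ordering;
// this can be relaxed later if the wasm memory model turns out to permit it.
fn current_byte_size(current_length: &AtomicUsize) -> usize {
    current_length.load(Ordering::SeqCst)
}
```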
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown created PR review comment:
There's no `vmmemory_import_index` function to test against... should there be? (I don't think we use the index offset in the JIT-generated code).
abrown submitted PR review.
abrown created PR review comment:
`SharedMemory::vmimport` uses `Memory::_new` to calculate a `VMMemoryImport`.
abrown submitted PR review.
abrown created PR review comment:
Looking at `Tunables` (now for the n-th time), I can't really see how we could guarantee that only a static allocation is used.
abrown submitted PR review.
abrown created PR review comment:
Welp, not any more after removing the `Option` from the return of `owned_memory_index`!
abrown submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
alexcrichton created PR review comment:
Ah no you're right, disregard me!
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
The `vmimport` constructor doesn't need to create a `Memory` (which inserts into the memories table within the `Store`), so I think that the call to the trampoline helper here can be inlined into `vmimport` and then the instance handle can be used to extract the memory directly rather than going through `Memory`
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I don't think you'll want to determine that by looking at the tunables but rather by looking at the `MemoryPlan` that pops out. Additionally though this isn't the best place for the error but rather in `wasmtime_runtime::SharedMemory`: during the constructor you can `match` on the `MemoryPlan` and reject `Dynamic` variants.
abrown created PR review comment:
I tried but we aren't there yet: `instance.rs` uses this function both to set up the `vmctx` initially in `initialize_vmctx` and later to replace the `VMMemoryDefinition` as a part of `memory_grow`.
abrown submitted PR review.
alexcrichton created PR review comment:
Isn't that a bug though that needs fixing? The write of `VMMemoryDefinition` isn't atomic so it needs to be synchronized with everything else. I think that `memory_grow` needs to be updated to only update the local owned context and otherwise rely on the receiver to update?
alexcrichton submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
abrown submitted PR review.
abrown created PR review comment:
Yeah, the `RwLock` could be considered overkill here. Looking at https://github.com/WebAssembly/threads/issues/26 shows some different options for implementing the locking around this. Initially I was considering locking everything (even `memory.size` through a host call) but since the addition of `AtomicUsize` I migrated towards locking `memory.grow` and allowing atomic access to the length. The read locks used in this struct could eventually be replaced by atomic loads and this `RwLock` with a `Mutex`, but at the moment the extra safety seems fine and this type of refactor could be resolved in the tracking issue.
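A toy sketch of the strategy described here (a `Vec<u8>` stands in for the real `Mmap`-backed allocation, and these are not the actual Wasmtime types): `grow` serializes under the write lock while size queries go through the atomic length.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

struct SharedMemoryInner {
    memory: RwLock<Vec<u8>>,     // stand-in for the Mmap-backed allocation
    current_length: AtomicUsize, // length in bytes, updated under the write lock
}

impl SharedMemoryInner {
    fn byte_size(&self) -> usize {
        // `memory.size` only needs the atomically maintained length.
        self.current_length.load(Ordering::SeqCst)
    }

    fn grow(&self, delta: usize) {
        // `memory.grow` takes the write lock so concurrent grows serialize.
        let mut memory = self.memory.write().unwrap();
        let new_len = memory.len() + delta;
        memory.resize(new_len, 0);
        self.current_length.store(new_len, Ordering::SeqCst);
    }
}
```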
abrown submitted PR review.
abrown created PR review comment:
> what happens if other concurrently-running threads run Wasm that use memory.size and probe pages for accessibility?

The `test_memory_size_accessibility` test does exactly that, but not probing beyond the `memory.size`.

> Do we say that this is a racy Wasm program and so results are undefined?

Conrad's summary of the original discussion of this issue is: "multiple accesses in one thread racing with another thread's memory.grow that are in-bounds only after the grow commits may independently succeed or trap." See the discussion above that comment for more details, but my reading is that the weaker semantics that you described are the ones in the spec.
cfallin submitted PR review.
cfallin created PR review comment:
OK awesome -- just a comment here pointing to that and documenting why it's OK would be great then :-)
abrown submitted PR review.
abrown created PR review comment:
I believe the reason @alexcrichton suggested this relaxed function was for all the cases where the length is retrieved but no synchronization was necessary (i.e., anything related to owned memory). I could certainly make this `SeqCst` but that doesn't totally make sense for all the owned memory cases. Since, like @cfallin mentioned, `current_length()` can only differ by appearing smaller than what another thread sees, I will update the documentation comment accordingly. This function should be precise for any owned memories (regardless of what ordering is used) and will be safely imprecise for shared memories but under-estimating (at least until the spec changes to allow memory shrinking in some way).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown requested alexcrichton for a review on PR #4187.
abrown submitted PR review.
abrown created PR review comment:
I don't see it. `Instance::memory` is called by `Instance::get_memory` which is used by `memory_copy`, `memory_fill`, etc. That functionality should still be present for shared memory I think.
abrown created PR review comment:
The `VMMemoryDefinition` is stored on the `wasmtime_runtime::SharedMemory` already; maybe that's what you mean. Storing the `VMMemoryImport` would mean that the shared memory is associated with a single instance in a single store... can't these things be shared across stores?

In any case, this function is only used when importing a shared memory into an instance from `OwnedImports::push` so it seems like the extra work is only done then. If the above question is answered, perhaps this refactor is a suitable candidate for the shared memory tracking issue?
abrown submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sorry I couldn't place this comment well due to the PR not actually covering the relevant line of code, I mean the `set_memory` function just below this comment
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Reading over this again, I think this should return an error that this isn't supported because this won't work correctly with the pooling allocator, as the lifetime of the linear memory is tied to the lifetime of the instance, which isn't correct in the case of shared memories.
alexcrichton created PR review comment:
Can these asserts be deferred to `SharedMemory::wrap` and can this method call `SharedMemory::wrap`? Otherwise `Memory::new_dynamic`, which I believe is called for locally defined memories, isn't asserting these invariants (when it needs to be doing so)
alexcrichton submitted PR review.
alexcrichton created PR review comment:
This is a pretty minor point and it can be left to later, but my point here is that this can and probably should be replaced with `Mutex` outright. I don't think there's any need for the reader/writer aspect of `RwLock`, only the "`Lock`" part is needed.

Eventually moving some operations outside the lock entirely I think is somewhat orthogonal. Still good to do, but otherwise having `RwLock` here I think gives the wrong impression that this is somehow optimized or architected to have few writers and many readers.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sorry yeah the refactoring you've pushed up now is what I had in mind, still creating a local `VMContext` per-store but just not pushing that `VMContext` as a `Memory` into the `Store`
alexcrichton created PR review comment:
One thing to consider here is the effect on the synchronization of this precise field, the current length, but there's also the effects of whether we can see the rest of concurrent memory modifications or such. For example if we read the current length at a particular value what does that mean for reading other pieces of memory?

I'm not sure what the answer is but I'm pretty sure it's affected by the memory orderings here. I don't know what the desired memory orderings are so I would personally use `SeqCst` everywhere because I know it's not wrong. Then again `SeqCst` can also be buggy because it provides too strong of an ordering, which means we can't relax it in the future when it turns out we're not supposed to have such a strong ordering.

Anyway these are also theoretical concerns of mine and don't necessarily need any changes at this time.
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Ok, I conditioned the "replace the memory definition" code on whether the memory was shared or not, which then makes it possible to make this `unreachable!`. I tried changing `set_memory` to take an `OwnedMemoryIndex` but calculating that is not fast: we have to iterate through all of the memories to find the owned ones. We do this already during instantiation because we're iterating through the memories anyways, but it seems expensive to do for every `memory.grow` (we could obviously save off a map of "defined to owned indexes" during instantiation but then that means extra space).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Ok, see my comment here: it's tricky to re-calculate the owned index for every `memory.grow`.
abrown submitted PR review.
abrown created PR review comment:
Could this be done later as a part of the tracking issue?
abrown created PR review comment:
No, it is possible that multiple threads could be asking for `byte_size()` and this allows those calls to be concurrent. Eventually we could move to using the `current_length: AtomicUsize` exclusively for `byte_size`, but now that I think of it `maximum_byte_size` and `needs_init` still only need read access, not a full lock.
abrown submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown created PR review comment:
But we should still be able to create instances in the pooling allocator that contain `(memory _ _ shared)`, right? I agree that there is something to be fixed here (maybe in the tracking issue?) but it's not clear what that would be--prevent deallocation in the pool for memories once they are shared outside the instance?
abrown submitted PR review.
alexcrichton created PR review comment:
In theory that should work yeah but it's currently not sound to do so with the pooling allocator. I think it's fine to return an error and leave a "FIXME" for implementing this in the future. For now though I'm trying to make sure that there aren't any known unsoundnesses in the implementation and this is one that needs to be fixed.
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Oh, you said return an error... I used `todo!`. Does it matter?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Certainly!
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Nah, `todo!` is fine, just something to remind us to come back to this if anyone stumbles across it.
alexcrichton submitted PR review.
alexcrichton merged PR #4187.