abrown opened PR #4187 from shared-memory-vmcontext to main:
This change adds the ability to use shared memories in Wasmtime when the [threads proposal] is enabled. Shared memories are annotated as `shared` in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md
In order to implement this in Wasmtime, there are two main cases to cover:
- a program may simply create a shared memory and possibly export it; this means that Wasmtime itself must be able to create shared memories
- a user may create a shared memory externally and pass it in as an import during instantiation; this is the case when the program contains code like `(import "env" "memory" (memory 1 1 shared))`--this case is handled by a new Wasmtime API type--`SharedMemory` (see the sketch below)
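As a rough illustration of the second case, here is a minimal embedding sketch; it uses today's `wasmtime` API names (`Config::wasm_threads`, `MemoryType::shared`, `SharedMemory::new`), which may differ in detail from the API as it stood on this branch.

```rust
use wasmtime::{Config, Engine, Instance, MemoryType, Module, SharedMemory, Store};

fn main() -> anyhow::Result<()> {
    // Enable the threads proposal so `shared` memories validate.
    let mut config = Config::new();
    config.wasm_threads(true);
    let engine = Engine::new(&config)?;

    // Create the shared memory externally to Wasmtime's instantiation...
    let memory = SharedMemory::new(&engine, MemoryType::shared(1, 1))?;

    // ...and pass it in as an import during instantiation.
    let module = Module::new(
        &engine,
        r#"(module (import "env" "memory" (memory 1 1 shared)))"#,
    )?;
    let mut store = Store::new(&engine, ());
    let _instance = Instance::new(&mut store, &module, &[memory.into()])?;
    Ok(())
}
```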
Because of the first case, this change allows any of the current memory-creation mechanisms to work as-is. Wasmtime can still create either static or dynamic memories in either on-demand or pooling modes, and any of these memories can be considered shared. When shared, the `Memory` runtime container will lock appropriately during `memory.size` and `memory.grow` operations; since all memories use this container, it is an ideal place for implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses the same `Mmap` allocation under the hood as non-shared memories, but allows the user to perform the allocation externally to Wasmtime and share the memory across threads (via an `Arc`). The pointer address to the actual memory is carefully wired through and owned by the `SharedMemory` structure itself. This means that there are differing views of where to access the pointer (i.e., `VMMemoryDefinition`): for owned memories (the default), the `VMMemoryDefinition` is stored directly by the `VMContext`; in the `SharedMemory` case, however, this `VMContext` must point to this separate structure.

To ensure that the `VMContext` can always point to the correct `VMMemoryDefinition`, this change alters the `VMContext` structure. Since a `SharedMemory` owns its own `VMMemoryDefinition`, the `defined_memories` table in the `VMContext` becomes a sequence of pointers--in the shared memory case, they point to the `VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a new table, `owned_memories`.
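As an illustration only (these are not the actual Wasmtime definitions, and the fields are simplified), the resulting layout can be pictured like this:

```rust
#[repr(C)]
pub struct VMMemoryDefinition {
    pub base: *mut u8,
    pub current_length: usize, // read atomically when the memory is shared
}

// Conceptually, inside the VMContext:
//
//   defined_memories: [*mut VMMemoryDefinition; num_defined_memories]
//     - owned (non-shared) memory: the pointer targets an entry of
//       `owned_memories` below
//     - shared memory: the pointer targets the `VMMemoryDefinition` owned by
//       the `SharedMemory` structure, outside the `VMContext`
//
//   owned_memories: [VMMemoryDefinition; num_owned_memories]
```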
This change adds an additional indirection (through the `*mut VMMemoryDefinition` pointer) that could add overhead. Using an imported memory as a proxy, we measured a 1-3% overhead of this approach on the `pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will special-case the owned memory access (i.e., load a pointer directly to the `owned_memories` entry) for `memory.size` so that only shared memories (and imported memories, as before) incur the indirection cost.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
I think a refactoring to make this work would be to:
- change the signature of `RuntimeLinearMemory::grow` to take a `limiter_fn: Option<Fn<...>>` instead of the entire store
- remove the error reporting to the limiter (`Store::memory_grow_failed` -> `ResourceLimiter::memory_grow_failed`); right now it is unclear (except during testing) why a limiter would need to receive the growth failure--shouldn't it be the one producing this error instead? If indeed this _is_ needed, perhaps we could alter the suggestion above to an `Option<dyn ResourceLimiter>`?
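A hypothetical shape for the first suggestion above (the trait name comes from this thread, but the parameter types and closure signature are illustrative, loosely mirroring `ResourceLimiter::memory_growing`'s current/desired/maximum parameters):

```rust
// Sketch only: pass just the limiting hook into `grow` instead of the whole store.
trait RuntimeLinearMemory {
    fn grow(
        &mut self,
        delta_pages: u64,
        // Returns whether growth from `current` to `desired` bytes (with an
        // optional `maximum`) should be allowed.
        limiter_fn: Option<&mut dyn FnMut(usize, usize, Option<usize>) -> bool>,
    ) -> Option<(usize, usize)>;
}
```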
abrown edited PR review comment.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
It is unclear what to do at this point if the module's memory is a shared one... The current logic below is incorrect.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I see this was added to import `OwnedMemoryIndex`, but for most other types the `wasmtime-types` crate is largely reexported through the `wasmtime-environ` crate. Could the indexes be used through there perhaps?
alexcrichton created PR review comment:
I think this is no longer needed?
alexcrichton created PR review comment:
Now that shared memories are supported I think it's ok to remove this and the block below that was modified (and let wasmparser handle using shared memories without the threads feature enabled)
alexcrichton created PR review comment:
"instantiation" here I think should be reworded
alexcrichton created PR review comment:
Since `self` is `Copy` I think the `into_` here can be skipped and this can probably instead be `as_shared_memory` perhaps? (taking `&self` as an argument)
alexcrichton created PR review comment:
To me this isn't necessarily an error per se, I think a better signature for this function might be `-> Option<SharedMemory>` (since there's only one "error" condition here anyway)
alexcrichton created PR review comment:
leaving a note here to ensure this isn't forgotten before landing
alexcrichton created PR review comment:
This API isn't actually safe to provide because otherwise you could do something like:

```rust
let mut mem1: SharedMemory = ...;
let mut mem2 = mem1.clone();
let a = mem1.data_mut();
let b = mem2.data_mut();
// .. a and b alias here
```

which isn't memory-safe in Rust. In the `Memory` API we leverage that the owner of the memory is `Store<T>` so by taking `AsContextMut` we can thread through the lifetime of that borrow onto the borrow of memory, but here with no single owner we can't do that.

We actually can't even support `fn data(&self) -> &[u8]` either since that type signature in Rust assumes no writes will happen to the array (when in fact they can happen). For now all we can do I think is return perhaps `*mut [u8]` and then document how to be careful with it.

This also reminds me, we'll need to do something about `Memory::data{,_mut}` as well. That'll either need to return `None` or panic for a shared memory. I feel like panicking may be the best bet for now, but we'd need to clearly document that as well (and panicking isn't great but the set of embedders that don't want to think about threads is probably nontrivial to the point that returning an `Option` would be too onerous)

This also
alexcrichton created PR review comment:
I think the minimum/maximum here should be `u32` if this is a 32-bit memory constructor (and later with `MemoryType::shared64` the arguments would be `u64`)
alexcrichton created PR review comment:
Similar to above could the memory get reexported and the embedding api tested on the returned handle?
alexcrichton created PR review comment:
Could this test exercise growing the memory externally and ensuring that the instance sees the grown memory size? Also could you add a second instance here and ensure that both instances see the same size?
alexcrichton created PR review comment:
From what I was mentioning above about how `data_mut` isn't a sound API in Rust, for the memory modification here it's probably best to have an exported function from the wasm itself which modifies memory and that's invoked here to do the modifications on behalf of this test.

Otherwise though having an `unsafe` block here to do the modification is also fine (but this should be `unsafe` one way or another with a non-atomic write)
alexcrichton created PR review comment:
It's ok to skip the `Arc` wrappers here since the `Clone` implementation on the inner types already only clones an `Arc`
alexcrichton created PR review comment:
I believe this assertion can get tripped and as-written I think that's expected. I believe 4 different threads can all concurrently execute `grow`, they witness sizes N, N+1, N+2, and N+3, and then the threads all race to push onto the `sizes` list, meaning that there's no guarantee it'll be pushed in a sorted order.

I think though one thing that could be tested here is that the returned list is sorted and then assert that each element from 1 to N is present in the array?
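A small sketch of the suggested check (the `sizes` vector and its element type are assumptions about the test in question): sort what the threads observed, then require exactly the expected run of sizes to be present.

```rust
fn check_observed_sizes(mut sizes: Vec<u32>, initial_size: u32) {
    // Each concurrent `grow` should have witnessed a distinct previous size,
    // but the pushes may have happened out of order.
    sizes.sort();
    let expected: Vec<u32> = (initial_size..initial_size + sizes.len() as u32).collect();
    assert_eq!(sizes, expected);
}
```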
alexcrichton created PR review comment:
If an `atomic_load` doesn't take an offset in clif for now it's probably ok to emit an `i32.add` to add in the offset?
alexcrichton created PR review comment:
This got me thinking "does this need to be an atomic load?" and I think the answer is no because shared memories never have their base pointer change (so it's write-once only during initialization). That being said though one thing this did remind me of is that the `current_length` should never be used here. I think it's only used for dynamic memories, so could this have an assert below that the dynamic memory branch is never taken if the memory is shared?
alexcrichton created PR review comment:
I don't think that this is quite right because imported memories are listed in `memory_plans` (it's all memories in the module). I think that imported memories need to be excluded here?
alexcrichton created PR review comment:
Leaving a note here to be sure to fill this "TODO" in
alexcrichton created PR review comment:
Similar to above, I think that the calculation here needs to exclude imported memories first, and then exclude shared memories from the remaining set of memories to figure out the number of owned memories.
alexcrichton created PR review comment:
I think this should instead be an assert that the `memory` index specified is in-bounds
alexcrichton created PR review comment:
IIRC the reason for passing them explicitly was that it's somewhat tricky to calculate the maximum/minimum converting from u32-or-u64 to `usize`, so by passing them here it's "pre-validated" from the results of the previous calculations.
alexcrichton created PR review comment:
Since the outer `SharedMemory` is boxed up as well, could this perhaps be `SharedMemoryInner<T>` and the memory here is `memory: RwLock<T>`?
alexcrichton created PR review comment:
This I think has pretty deep implications which require an audit of callers of this function to ensure they still work correctly. Looking I see:

- `Instance::memory` - this is not ok for shared memories since it does a non-atomic read of the pointer
- `Instance::set_memory` - this is no longer valid and needs to be updated. Callers of this function are:
  - `Instance::memory_grow` - this'll need to be updated since the update only happens for owned memories
  - `Instance::get_exported_memory` - this is ok but I think callers of this need to be updated to do atomic reads of the length field
  - `Instance::memory_index` - this is no longer valid and needs to be updated (or deleted)
alexcrichton created PR review comment:
This growth operation, on success, needs to perform an atomic increment of the `LongTermVMMemoryDefinition` current length I think?
alexcrichton created PR review comment:
Could this function take `MemoryPlan` instead of recalculating it here?

Also, could this internally validate that the plan is `Static`?
alexcrichton created PR review comment:
This change is actually a bit worrisome to me because the purpose of this argument is to limit memory usage within a `Store`, but our current strategy of a per-`Store` memory limiter is not sufficient to protect against shared memories.

If `store` is `None` could this assert that the memory is shared? Additionally I think we'll need to keep a note on a tracking issue for shared memories to figure out how best to limit them since our current `ResourceLimiter` won't work
alexcrichton created PR review comment:
... or actually, here's an idea.

Right now we have `enum Extern { ..., Memory(Memory), ... }` and even shared memories here are represented and wrapped as a `Memory`. One thing we could do is statically make a distinction between `Memory` and `SharedMemory`, meaning that a `Memory` wouldn't wrap a `SharedMemory`. That way something like `instance.get_memory(...)` would fail because the shared memory wouldn't be classified as a `Memory`.

This is fine to do as a follow-up PR or just keep in the back of our heads now though. I realize this is a pretty different design from what you've been working on so far.
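A sketch of the statically-distinguished shape being floated here (placeholder types, not the existing definitions):

```rust
struct Memory;       // stand-in for the store-owned memory handle
struct SharedMemory; // stand-in for the new shared-memory handle

enum Extern {
    // ... other variants elided ...
    Memory(Memory),
    // A shared memory gets its own variant instead of being wrapped in `Memory`,
    // so e.g. `instance.get_memory(...)` would not classify it as a `Memory`.
    SharedMemory(SharedMemory),
}
```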
alexcrichton created PR review comment:
Oh sorry I sort of independently came to a similar conclusion that this would be a problem above before I saw your comment here.

I think for now let's skip limiting memory growth of shared memories with `ResourceLimiter`, so this would thread through a `None` or similar (and wherever uses the limiter would assert the store is present if the memory isn't shared).

Overall we're going to need a different strategy for limiting shared memories since growth would notify a random store which isn't necessarily the right "store". I would naively say that we'll need a limiter-per-shared-linear-memory but we can figure that out later, no need to have it covered for now.
alexcrichton created PR review comment:
Given that this is only debuginfo related and debuginfo is off-by-default I think it'd be fine to just return an error here for shared memories and have someone else come along later and figure this out if necessary.
alexcrichton created PR review comment:
Could this include some comments about why it's safe to do this?
alexcrichton created PR review comment:
(only applicable for multi-memory of course, but still something we'll want to handle, and perhaps a good test case to add eventually as well)
alexcrichton created PR review comment:
Or actually I see now that the downcast into a shared memory might be difficult with that, so could this otherwise be hardcoded as an `MmapMemory`?
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Not sure I understand what you mean by "dynamic memory branch" here... there is an `if is_shared ...` above?
abrown submitted PR review.
abrown created PR review comment:
Done in `grow` instead.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
`defined_memory_index` a few lines above was returning `Option<...>` so I kept the same level of abstraction--`defined_memory_index` can fail by indexing into the imported memories and this one can fail by not finding an owned memory within the index space. Looking at it again, now I'm wondering if the `.skip(...)` actually makes the lower part of this incorrect...
abrown submitted PR review.
abrown created PR review comment:
Right now there are multiple kinds of memory that could get wrapped into a `SharedMemory`? (See `new_*` methods).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown has marked PR #4187 as ready for review.
abrown requested alexcrichton for a review on PR #4187.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Can the tests in `test_vmmemory_import` below be expanded to test the offsets for this field as well?
alexcrichton created PR review comment:
With a split type now I think that this `preallocation` parameter can be removed since it's always `None`?
alexcrichton created PR review comment:
Could this perhaps filter for `!p.1.memory.shared` to calculate `num_owned_memories`? It's a little non-obvious otherwise what's going on here with the formula below.
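A minimal sketch of that calculation (the `MemoryPlan` type here is a stand-in for the wasmtime-environ plan referenced in this thread, and the imported-memory skip is an assumption carried over from the earlier comments):

```rust
struct MemoryPlan {
    shared: bool, // corresponds to `p.1.memory.shared` in the real type
}

fn num_owned_memories(memory_plans: &[MemoryPlan], num_imported_memories: usize) -> usize {
    memory_plans
        .iter()
        .skip(num_imported_memories) // imported memories are never owned here
        .filter(|plan| !plan.shared) // shared memories own their own definition
        .count()
}
```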
alexcrichton created PR review comment:
I think this function should go ahead and return `*mut [u8]` probably. `*const` is just as unsafe as `*mut` here since neither mean anything in this concurrent context.
alexcrichton created PR review comment:
Noting the TODO here
alexcrichton created PR review comment:
I think it's ok to put this `use` at the top of the module
alexcrichton created PR review comment:
Reading over this again, could this function perhaps be changed to take an `OwnedMemoryIndex` since those are the only ones we should ever set?
alexcrichton created PR review comment:
This is surprising to me in that `vmimport` is intended to be a cheap "read the field" sort of method rather than one that actually allocates within the provided store.

I thought we were going to do something where the `VMMemoryImport` here was stored within the `wasmtime_runtime::SharedMemory` itself? That would mean that this method would need to push `self.0.clone()` into the `StoreOpaque` somewhere to keep a strong reference to the shared memory, but I don't think we otherwise need to allocate a separate `VMContext` any more per store we're inserting within.

I know we've waffled a bit on this historically, so maybe I'm forgetting why this wasn't done?
alexcrichton created PR review comment:
Could this `memory_id` be learned from the `memory_plans.push` above like before?
alexcrichton created PR review comment:
Yeah this is the correct engine. If the shared memory came from internally within a module it's correct, and if it came from externally it was validated going in to be of the correct engine.
alexcrichton created PR review comment:
I'm still pretty uncomfortable with this, I don't think that we should sometimes apply store-related limiting to shared memories and sometimes not depending on whatever happens to be the context. I don't think stores should cover shared memories at all due to their shared nature, so I think that if a store is available it needs to be intentionally ignored for shared memories.
alexcrichton created PR review comment:
Thinking on this again, is the `RwLock` really necessary here? (as opposed to a `Mutex`)

There aren't that many "long running" operations that only need a read lock I think?
alexcrichton created PR review comment:
I'm a little worried that in the case of a shared memory this growth drops the error on the floor and it doesn't go anywhere. Could you perhaps add a `// FIXME`?

(in general I think it would be good to create a general tracking issue (or edit an existing issue) to keep track of all these items related to shared memories, I feel like there are quite a few follow-ups to be done before everything is "stable")
alexcrichton created PR review comment:
Noting the TODO here.

Additionally instead of using `if is_shared` you could probably also do:

```rust
match self.module.owned_memory_index(def_index) {
    Some(owned) => /* ... */,
    None => /* shared ... */,
}
```
alexcrichton created PR review comment:
Is it possible to panic here? I think that this `VMMemoryDefinition` should only ever be used for owned memories by the `wasmtime_runtime::Instance` to update the state of the memory itself.

Basically it seems like a bug if this is ever actually called for shared memories.
alexcrichton created PR review comment:
To perhaps stress this a bit more, could the loop happen in wasm itself? The growth loop could also perhaps happen entirely within wasm as well, where storing -1 in some region of memory signals "stop looping on `memory.size`" and both threads return?
alexcrichton created PR review comment:
Can the comment here be expanded a bit? I initially was wary of this since there's nothing atomic about this in terms of concurrent growth, but this is happening inside of a write lock so it should be fine. Additionally this field isn't only read by JIT-generated code but a ton of the runtime support reads this field as well (as I'm sure you're well aware of now)
alexcrichton created PR review comment:
Bumping this comment again
alexcrichton created PR review comment:
The above `defined_memory_index` is different though: there the `Option` doesn't mean out-of-bounds, it means "that's an imported memory" (which is the purpose of the function, to classify the index as either imported or locally defined).

Here I don't think an attempt should be made to handle out-of-bounds behavior. Otherwise the `None` return value either means "out of bounds" or "not an owned memory index", and given that nothing should ever be out of bounds internally I don't think it's necessary to handle the out of bounds case.
alexcrichton created PR review comment:
Although actually, it looks like this function is basically the fusion of `Memory::new_dynamic` and `SharedMemory::wrap`, so could this function be deleted entirely and callers get updated to using `Memory::new_dynamic`?

Or could this call `Memory::new_dynamic` and then do the cast back to `SharedMemory` as an assertion?
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
A comment here noting that this is the only access to the length field from JIT'd code for a shared memory would be helpful. Specifically I was worried that a dynamically-bounds-checked access could exist and went looking for logic that would generate atomic loads in the `heap_addr` lowering, or some equivalent change; then realized that we prevent that case by construction, by only allowing `Static`-type shared memories. Could we document that here?
cfallin created PR review comment:
This doc-comment seems a little unclear to me: if the result isn't exactly precise at all times, how might it differ? Is it always smaller or equal (because memory can only grow but it may be outdated)? Or something else? It seems we need a stronger statement here to rely on it at all.

I also wonder if this is necessary vs. a variant that uses `Ordering::Acquire` and makes a stronger claim? Or in other words, can we justify why a "fast and imprecise" variant is needed?
cfallin created PR review comment:
I'm not sure I fully understand the atomicity guarantees around how the actual growth (making pages accessible) and setting the new size (below) occur. A comment here might be good to have, explaining why it is OK for these to be two separate steps, and what happens if other concurrently-running threads run Wasm that use `memory.size` and probe pages for accessibility?

Specifically I'm curious about the case where another thread (i) accesses a memory address, finding that it does not trap, and execution continues; (ii) runs `memory.size`, and sees that the memory is supposedly smaller than the just-accessed address. This is possible if one imagines this (growing) thread to pause between the `grow()` above and the atomic store below.

Do we say that this is a racy Wasm program and so results are undefined? In other words, are all memory accesses that implicitly depend on the memory length actually non-synchronizing, and any Wasm program that depends on a certain access to an address trapping must ensure it synchronizes with any other threads growing the shared memory?

That seems like a reasonable enough answer to me, but we should (i) document it, and (ii) confirm that it's spec-compliant. And actually I hope that it is our answer, because providing a stronger unit of atomicity seems really hard (pause the world? Or set the new length first, and catch and handle a false out-of-bounds trap by taking a lock on length and re-checking to be sure in the signal handler?).
cfallin created PR review comment:
I think in the sense of "make sure we don't feed `current_length` to the heap-access machinery in Cranelift" safety, we could maybe make `current_length_offset` an `Option` here, and return `None` for the shared case. Then below wherever we use it, unwrap it with "expected current-length field (always present on non-shared memories)" or something of the sort as an `expect` error?
alexcrichton created PR review comment:
Actually that raises another question for me about atomic orderings which sort of furthers my concerns about using `ptr::copy` for `memory.copy` and friends. If one thread witnesses that the shared memory has a particular size I don't know what the guarantees are about witnessing other parts of memory in terms of memory orderings. For example should this be `Acquire`? Should this be `SeqCst`? Do wasm modules otherwise need to `atomic.fence` before calling host functionality or something like that?

I personally know very little about the atomic memory model for wasm and how it's all intended to work out so there may not be a great answer here. That being said it's almost certainly not wrong to fall back to `SeqCst` so I might say we should do that to be maximally conservative?
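The conservative option being discussed, in miniature (modeling the shared memory's length as an `AtomicUsize`, which is an assumption about the eventual representation):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Read the current byte length of a shared memory with the strongest ordering;
// this can be relaxed later if the wasm memory model turns out to permit it.
fn current_byte_size(current_length: &AtomicUsize) -> usize {
    current_length.load(Ordering::SeqCst)
}
```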
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown created PR review comment:
There's no `vmmemory_import_index` function to test against... should there be? (I don't think we use the index offset in the JIT-generated code).
abrown submitted PR review.
abrown created PR review comment:
`SharedMemory::vmimport` uses `Memory::_new` to calculate a `VMMemoryImport`.
abrown submitted PR review.
abrown created PR review comment:
Looking at `Tunables` (now for the n-th time), I can't really see how we could guarantee that only a static allocation is used.
abrown submitted PR review.
abrown created PR review comment:
Welp, not any more after removing the `Option` from the return of `owned_memory_index`!
abrown submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
alexcrichton created PR review comment:
Ah no you're right, disregard me!
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
The `vmimport` constructor doesn't need to create a `Memory` (which inserts into the memories table within the `Store`), so I think that the call to the trampoline helper here can be inlined into `vmimport` and then the instance handle can be used to extract the memory directly rather than going through `Memory`
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I don't think you'll want to determine that by looking at the tunables but rather by looking at the `MemoryPlan` that pops out. Additionally though this isn't the best place for the error but rather in `wasmtime_runtime::SharedMemory`: during the constructor you can `match` on the `MemoryPlan` and reject `Dynamic` variants.
abrown created PR review comment:
I tried but we aren't there yet: `instance.rs` uses this function both to set up the `vmctx` initially in `initialize_vmctx` and later to replace the `VMMemoryDefinition` as a part of `memory_grow`.
abrown submitted PR review.
alexcrichton created PR review comment:
Isn't that a bug though that needs fixing? The write of `VMMemoryDefinition` isn't atomic so it needs to be synchronized with everything else. I think that `memory_grow` needs to be updated to only update the local owned context and otherwise rely on the receiver to update?
alexcrichton submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
abrown submitted PR review.
abrown created PR review comment:
Yeah, the `RwLock` could be considered overkill here. Looking at https://github.com/WebAssembly/threads/issues/26 shows some different options for implementing the locking around this. Initially I was considering locking everything (even `memory.size` through a host call) but since the addition of `AtomicUsize` I migrated towards locking `memory.grow` and allowing atomic access to the length. The read locks used in this struct could eventually be replaced by atomic loads and this `RwLock` with a `Mutex`, but at the moment the extra safety seems fine and this type of refactor could be resolved in the tracking issue.
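A toy sketch of the strategy described here (a `Vec<u8>` stands in for the real `Mmap`-backed allocation, and these are not the actual Wasmtime types): `grow` serializes under the write lock while size queries go through the atomic length.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

struct SharedMemoryInner {
    memory: RwLock<Vec<u8>>,     // stand-in for the Mmap-backed allocation
    current_length: AtomicUsize, // length in bytes, updated under the write lock
}

impl SharedMemoryInner {
    fn byte_size(&self) -> usize {
        // `memory.size` only needs the atomically maintained length.
        self.current_length.load(Ordering::SeqCst)
    }

    fn grow(&self, delta: usize) {
        // `memory.grow` takes the write lock so concurrent grows serialize.
        let mut memory = self.memory.write().unwrap();
        let new_len = memory.len() + delta;
        memory.resize(new_len, 0);
        self.current_length.store(new_len, Ordering::SeqCst);
    }
}
```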
abrown submitted PR review.
abrown created PR review comment:
> what happens if other concurrently-running threads run Wasm that use memory.size and probe pages for accessibility?

The `test_memory_size_accessibility` test does exactly that, but not probing beyond the `memory.size`.

> Do we say that this is a racy Wasm program and so results are undefined?

Conrad's summary of the original discussion of this issue is: "multiple accesses in one thread racing with another thread's memory.grow that are in-bounds only after the grow commits may independently succeed or trap." See the discussion above that comment for more details, but my reading is that the weaker semantics that you described are the ones in the spec.
cfallin submitted PR review.
cfallin created PR review comment:
OK awesome -- just a comment here pointing to that and documenting why it's OK would be great then :-)
abrown submitted PR review.
abrown created PR review comment:
I believe the reason @alexcrichton suggested this relaxed function was for all the cases where the length is retrieved but no synchronization was necessary (i.e., anything related to owned memory). I could certainly make this `SeqCst` but that doesn't totally make sense for all the owned memory cases. Since, like @cfallin mentioned, `current_length()` can only differ by appearing smaller than what another thread sees, I will update the documentation comment accordingly. This function should be precise for any owned memories (regardless of what ordering is used) and will be safely imprecise for shared memories but under-estimating (at least until the spec changes to allow memory shrinking in some way).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown requested alexcrichton for a review on PR #4187.
abrown submitted PR review.
abrown created PR review comment:
I don't see it. `Instance::memory` is called by `Instance::get_memory` which is used by `memory_copy`, `memory_fill`, etc. That functionality should still be present for shared memory I think.
abrown created PR review comment:
The `VMMemoryDefinition` is stored on the `wasmtime_runtime::SharedMemory` already; maybe that's what you mean. Storing the `VMMemoryImport` would mean that the shared memory is associated with a single instance in a single store... can't these things be shared across stores?

In any case, this function is only used when importing a shared memory into an instance from `OwnedImports::push` so it seems like the extra work is only done then. If the above question is answered, perhaps this refactor is a suitable candidate for the shared memory tracking issue?
abrown submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sorry I couldn't place this comment well due to the PR not actually covering the relevant line of code, I mean the `set_memory` function just below this comment
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Reading over this again, I think this should return an error that this isn't supported because this won't work correctly with the pooling allocator, as the lifetime of the linear memory is tied to the lifetime of the instance, which isn't correct in the case of shared memories.
alexcrichton created PR review comment:
Can these asserts be deferred to `SharedMemory::wrap` and can this method call `SharedMemory::wrap`? Otherwise `Memory::new_dynamic`, which I believe is called for locally defined memories, isn't asserting these invariants (when it needs to be doing so)
alexcrichton submitted PR review.
alexcrichton created PR review comment:
This is a pretty minor point and it can be left to later, but my point here is that this can and probably should be replaced with `Mutex` outright. I don't think there's any need for the reader/writer aspect of `RwLock`, only the "`Lock`" part is needed.

Eventually moving some operations outside the lock entirely I think is somewhat orthogonal. Still good to do, but otherwise having `RwLock` here I think gives the wrong impression that this is somehow optimized or architected to have few writers and many readers.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sorry yeah the refactoring you've pushed up now is what I had in mind, still creating a local `VMContext` per-store but just not pushing that `VMContext` as a `Memory` into the `Store`
alexcrichton created PR review comment:
One thing to consider here is the effect on the synchronization of this precise field, the current length, but there's also the effects of whether we can see the rest of concurrent memory modifications or such. For example if we read the current length at a particular value what does that mean for reading other pieces of memory?

I'm not sure what the answer is but I'm pretty sure it's affected by the memory orderings here. I don't know what the desired memory orderings are so I would personally use `SeqCst` everywhere because I know it's not wrong. Then again `SeqCst` can also be buggy because it provides too strong of an ordering, which means we can't relax it in the future when it turns out we're not supposed to have such a strong ordering.

Anyway these are also theoretical concerns of mine and don't necessarily need any changes at this time.
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Ok, I conditioned the "replace the memory definition" code on whether the memory was shared or not, which then makes it possible to make this `unreachable!`. I tried changing `set_memory` to take an `OwnedMemoryIndex` but calculating that is not fast: we have to iterate through all of the memories to find the owned ones. We do this already during instantiation because we're iterating through the memories anyways, but it seems expensive to do for every `memory.grow` (we could obviously save off a map of "defined to owned indexes" during instantiation but then that means extra space).
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Ok, see my comment here: it's tricky to re-calculate the owned index for every `memory.grow`.
abrown submitted PR review.
abrown created PR review comment:
Could this be done later as a part of the tracking issue?
abrown created PR review comment:
No, it is possible that multiple threads could be asking for `byte_size()` and this allows those calls to be concurrent. Eventually we could move to using the `current_length: AtomicUsize` exclusively for `byte_size`, but now that I think of it `maximum_byte_size` and `needs_init` still only need read access, not a full lock.
abrown submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown created PR review comment:
But we should still be able to create instances in the pooling allocator that contain `(memory _ _ shared)`, right? I agree that there is something to be fixed here (maybe in the tracking issue?) but it's not clear what that would be--prevent deallocation in the pool for memories once they are shared outside the instance?
abrown submitted PR review.
alexcrichton created PR review comment:
In theory that should work yeah but it's currently not sound to do so with the pooling allocator. I think it's fine to return an error and leave a "FIXME" for implementing this in the future. For now though I'm trying to make sure that there aren't any known unsoundnesses in the implementation and this is one that needs to be fixed.
alexcrichton submitted PR review.
abrown updated PR #4187 from shared-memory-vmcontext to main.
abrown submitted PR review.
abrown created PR review comment:
Oh, you said return an error... I used `todo!`. Does it matter?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Certainly!
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Nah, `todo!` is fine, just something to remind us to come back to this if anyone stumbles across it.
alexcrichton submitted PR review.
alexcrichton merged PR #4187.