fitzgen opened PR #1832 from externref-stack-maps to master:
For host VM code, we use plain reference counting, where cloning increments the reference count, and dropping decrements it. We can avoid many of the on-stack increment/decrement operations that typically plague the performance of reference counting via Rust's ownership and borrowing system. Moving a `VMExternRef` avoids mutating its reference count, and borrowing it either avoids the reference count increment or delays it until if/when the `VMExternRef` is cloned.
When passing a `VMExternRef` into compiled Wasm code, we don't want to do reference count mutations for every compiled `local.{get,set}`, nor for every function call. Therefore, we use a variation of deferred reference counting, where we only mutate reference counts when storing `VMExternRef`s somewhere that outlives the activation: into a global or table. Simultaneously, we over-approximate the set of `VMExternRef`s that are inside Wasm function activations. Periodically, we walk the stack at GC safe points, and use stack map information to precisely identify the set of `VMExternRef`s inside Wasm activations. Then we take the difference between this precise set and our over-approximation, and decrement the reference count for each of the `VMExternRef`s that are in our over-approximation but not in the precise set. Finally, the over-approximation is replaced with the precise set.
The `VMExternRefActivationsTable` implements the over-approximated set of `VMExternRef`s referenced by Wasm activations. Calling a Wasm function and passing it a `VMExternRef` moves the `VMExternRef` into the table, and the compiled Wasm function logically "borrows" the `VMExternRef` from the table. Similarly, `global.get` and `table.get` operations clone the gotten `VMExternRef` into the `VMExternRefActivationsTable` and then "borrow" the reference out of the table.
When a `VMExternRef` is returned to host code from a Wasm function, the host increments the reference count (because the reference is logically "borrowed" from the `VMExternRefActivationsTable` and the reference count from the table will be dropped at the next GC).
For more general information on deferred reference counting, see An Examination of Deferred Reference Counting and Cycle Detection by Quinane: https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf
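The sweep described above (difference, decrement, replace) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ExternRef` is a hypothetical `Rc`-based stand-in for `VMExternRef`, `ActivationsTable` is a simplified stand-in for `VMExternRefActivationsTable`, and the precise set is handed in by the caller rather than discovered through stack maps.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

/// Hypothetical stand-in for `VMExternRef`: an `Rc` compared by pointer identity.
#[derive(Clone)]
struct ExternRef(Rc<String>);

impl PartialEq for ExternRef {
    fn eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}
impl Eq for ExternRef {}
impl Hash for ExternRef {
    fn hash<H: Hasher>(&self, state: &mut H) {
        (Rc::as_ptr(&self.0) as usize).hash(state);
    }
}

/// Simplified over-approximation of the references held by Wasm activations.
#[derive(Default)]
struct ActivationsTable {
    over_approximation: HashSet<ExternRef>,
}

impl ActivationsTable {
    /// Passing a reference into Wasm moves it into the table.
    fn insert(&mut self, r: ExternRef) {
        self.over_approximation.insert(r);
    }

    /// `precise` plays the role of the set found by walking the stack.
    /// Returns how many stale references were released.
    fn gc(&mut self, precise: HashSet<ExternRef>) -> usize {
        // Entries in the over-approximation that are no longer live.
        let stale = self.over_approximation.difference(&precise).count();
        // Replacing the set drops the old one, which decrements the count of
        // every old entry; entries still live are kept alive by `precise`.
        self.over_approximation = precise;
        stale
    }
}
```

In this sketch the decrement happens implicitly when the old set is dropped, so a reference that appears in both sets ends up with an unchanged count.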
cc #929
Fixes #1804
Depends on https://github.com/rust-lang/backtrace-rs/pull/341
fitzgen requested alexcrichton for a review on PR #1832.
bjorn3 submitted PR Review.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
Reminder to uncomment or remove.
bjorn3 created PR Review Comment:
self.infos.sort_unstable_by_key(|info| info.code_offset);
bjorn3 created PR Review Comment:
/// The range of PCs that this module covers. Different modules must
bjorn3 created PR Review Comment:
Maybe panic in the drop implementation instead and add an unsafe `finish` function?
bjorn3 created PR Review Comment:
flag_builder.enable("enable_safepoints").unwrap();
bjorn3 created PR Review Comment:
if enable { self.flags.enable("enable_safepoints").unwrap(); }
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
I think this line can now be removed
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
We've got a lot of various registries floating around wasmtime, would it be possible to insert this into an existing registry somehow? For example most of the code here looks like it's copying what we do for frame information.
alexcrichton created PR Review Comment:
FWIW I've been trying to clean this up over time where `Instance` doesn't do much memory management of its internals. Is it possible to package up this and the stack map registry elsewhere? I'm not really sure entirely what that would look like but I think it would be best if we could keep `Instance` having a pretty "raw" feeling rather than holding onto various items.
(e.g. shoving this into `Store` and passing it through for when we call wasm)
alexcrichton created PR Review Comment:
Also, to keep adding to this, I'd like to get around to cleaning up the interrupts part eventually too
alexcrichton created PR Review Comment:
Reading over this, I also don't think that this needs to hold onto this information? It looks like the store can hold onto it for all instances?
alexcrichton created PR Review Comment:
Sort of continuing my comment from earlier, but this is an example where it would be great to not have this duplication. Ideally there'd be one "register stuff" call, although I'm not sure yet if this is fundamentally required to be two separate steps.
alexcrichton created PR Review Comment:
Looks like this file became executable?
alexcrichton created PR Review Comment:
I think I may be missing something, but how come this is stored in an `Engine`? (vs a `Store`)
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
I really need to figure out why the heck this is happening...
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen requested alexcrichton for a review on PR #1832.
fitzgen updated PR #1832 from externref-stack-maps to master.
alexcrichton submitted PR Review.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Perhaps a merge conflict gone wrong?
alexcrichton created PR Review Comment:
Could the canary business here be moved into `catch_traps` to avoid some duplication?
alexcrichton created PR Review Comment:
I'd personally still prefer to avoid having this here if its sole purpose is to keep the field alive. I think that memory management should be deferred to `Store`.
alexcrichton created PR Review Comment:
This might be a bit more compact as `(0..size).map(|_| ...).collect()`
alexcrichton created PR Review Comment:
(same with `externref_activations_table` above too)
alexcrichton created PR Review Comment:
I don't actually see this end up getting used anywhere, was this perhaps from a previous iteration?
alexcrichton created PR Review Comment:
How is this field used from wasm code? I tried to read over but couldn't find much... It also looked like `VMOffsets` contained information about the `end` field below too, but it's not an `UnsafeCell`?
alexcrichton created PR Review Comment:
Since this registry only needs to be within a `Store`, how come this is an `RwLock` instead of `RefCell`?
alexcrichton created PR Review Comment:
Could this assert that `sp` is not null for wasm frames? (help catch bugs in `backtrace`)
alexcrichton created PR Review Comment:
Ah ok I confirmed on Zulip, but it sounds like this is planned to get used in the future. I presume that the idea is that with `table.get` you'd have a fast path for using these two fields to inline the `try_insert` method, and you'd fall back to an intrinsic for out-of-line growth and such?
Reading over all this dealing with chunks and such feels a bit overly complicated. Could this perhaps be simplified a bit? I'm thinking that this could perhaps be restructured to something that looks like:
    pub struct VMExternRefActivationsTable {
        // pointer into `fast_path_storage`
        next: UnsafeCell<NonNull<TableElem>>,
        // never changes after creation
        end: NonNull<TableElem>,
        // used by `try_insert` and stored via `next` in jit code
        fast_path_storage: Box<[TableElem]>,
        imprecise_roots: HashSet<VMExternRef>,
        precise_roots: HashSet<VMExternRef>,
    }
where here `fast_path_storage` is allocated once and never changes. When an overflow of that table happens it's drained and everything is moved into `imprecise_roots`. That way `imprecise_roots` is a growing set which deduplicates things (in case you `table.get` the same thing a bunch of times) and also handles reallocation for us (we don't need to manually double capacities and such).
On a GC we'd fill in `precise_roots` and then we'd `mem::swap` the precise/imprecise sets and then clear the imprecise one. Maybe with some other trickery around optimizing reference counts or something like that.
I'm mostly hoping that we can simplify the `chunks` list because:
- It seems somewhat complicated to manage, especially if we're trying to be clever about capacities and double them
- I'm a bit worried about the segmented-stacks thrashing problem where you keep allocating a doubly-bigger chunk but then freeing it, when it'd be more efficient to just double the size of the internal chunk.
- Roots aren't deduplicated in the chunks so if you pass around the same `VMExternRef` a bunch it may accidentally fill them up quite a lot.
With a persistent hash set I think it'd solve the duplication/capacity issues? Anyway curious what you think about this!
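To make the proposed layout concrete, here is a rough safe-Rust sketch of how the fast-path storage, the overflow into `imprecise_roots`, and the swap on GC might fit together. It is not the PR's code: `ExternRef` is a hypothetical `Rc`-based stand-in for `VMExternRef`, an index replaces the raw `next`/`end` pointers, and the method names `insert`, `drain_fast_path`, and `gc` are assumptions made for illustration.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::mem;
use std::rc::Rc;

/// Hypothetical stand-in for `VMExternRef`, compared by pointer identity.
#[derive(Clone)]
struct ExternRef(Rc<String>);

impl PartialEq for ExternRef {
    fn eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}
impl Eq for ExternRef {}
impl Hash for ExternRef {
    fn hash<H: Hasher>(&self, state: &mut H) {
        (Rc::as_ptr(&self.0) as usize).hash(state);
    }
}

struct ActivationsTable {
    /// Allocated once; its length never changes.
    fast_path_storage: Vec<Option<ExternRef>>,
    /// Index of the next free fast-path slot (a raw pointer in the proposal).
    next: usize,
    imprecise_roots: HashSet<ExternRef>,
    precise_roots: HashSet<ExternRef>,
}

impl ActivationsTable {
    fn new(fast_path_slots: usize) -> Self {
        Self {
            fast_path_storage: (0..fast_path_slots).map(|_| None).collect(),
            next: 0,
            imprecise_roots: HashSet::new(),
            precise_roots: HashSet::new(),
        }
    }

    /// Fast path: bump `next` if a slot is free, otherwise hand the value back.
    fn try_insert(&mut self, r: ExternRef) -> Result<(), ExternRef> {
        match self.fast_path_storage.get_mut(self.next) {
            Some(slot) => {
                *slot = Some(r);
                self.next += 1;
                Ok(())
            }
            None => Err(r),
        }
    }

    /// Slow path: on overflow, drain the fast-path slots into the hash set.
    fn insert(&mut self, r: ExternRef) {
        if let Err(r) = self.try_insert(r) {
            self.drain_fast_path();
            self.imprecise_roots.insert(r);
        }
    }

    fn drain_fast_path(&mut self) {
        for slot in &mut self.fast_path_storage[..self.next] {
            if let Some(r) = slot.take() {
                // Duplicates are dropped here, releasing their extra count.
                self.imprecise_roots.insert(r);
            }
        }
        self.next = 0;
    }

    /// `stack_roots` stands in for the roots found via stack maps.
    fn gc(&mut self, stack_roots: impl IntoIterator<Item = ExternRef>) {
        self.drain_fast_path();
        self.precise_roots.extend(stack_roots);
        mem::swap(&mut self.precise_roots, &mut self.imprecise_roots);
        // After the swap, `precise_roots` holds the stale over-approximation.
        self.precise_roots.clear();
    }
}
```

Because the hash set deduplicates, inserting the same reference repeatedly leaves at most one entry in `imprecise_roots` plus whatever currently sits in the fast-path slots.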
alexcrichton created PR Review Comment:
Because of this, could the activation table have a `Drop` which asserts that the `precise_stack_roots` set is empty? Otherwise we'd leak data accidentally b/c there's no dtor which drains the map.
(or maybe we should also assert that all chunks are `None` since they don't get frobbed on `Drop` either?)
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Yeah, when I get merge conflicts in `Cargo.lock` I generally just blow it away and let the next build recreate it, but this pulled in that problematic dep update we all talked about a couple days ago. Bleh.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
It probably doesn't matter in practice, but that would mean the canary is taken from a stack frame that has since been popped
/me shrugs
I guess I could switch the API to `table.with_stack_canary(|| { ... })` instead of an RAII thing? Would be harder to misuse this way too. I think I'll do that.
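A closure-scoped API of that shape might look roughly like the sketch below. Only the method name `with_stack_canary` comes from the comment above; the `ActivationsTable` type, the `stack_canary` field, and the idea of storing the canary as an address are assumptions made for illustration.

```rust
use std::cell::Cell;

/// Simplified, hypothetical stand-in for the activations table.
struct ActivationsTable {
    /// Address of a local in the frame that entered Wasm, if any.
    stack_canary: Cell<Option<usize>>,
}

impl ActivationsTable {
    /// Runs `f` with a canary taken from this function's own stack frame.
    /// The frame outlives `f`, so the canary cannot point at a popped frame.
    fn with_stack_canary<R>(&self, f: impl FnOnce() -> R) -> R {
        let canary = 0u8;
        let previous = self
            .stack_canary
            .replace(Some(&canary as *const u8 as usize));

        // Restore the previous canary even if `f` unwinds.
        struct Reset<'a>(&'a Cell<Option<usize>>, Option<usize>);
        impl Drop for Reset<'_> {
            fn drop(&mut self) {
                self.0.set(self.1);
            }
        }
        let _reset = Reset(&self.stack_canary, previous);

        f()
    }
}
```

Compared with handing out an RAII guard, the caller here has no way to keep the canary registered after the frame that owns it has returned.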
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
This is a good idea, however:
- we may still want to have the fast-path table grow/shrink based on the precise size of the last root set (probably fine to skip for now, but we'll likely want to investigate this in the future)
- we lose the ability to reclaim memory once we are no longer using excessive capacity in these sets (again, probably fine for now, but we'll likely want to handle this sometime in the future)
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Either way's fine by me, I was mostly just hoping that we can encapsulate everything necessary to enter wasm in one function in wasmtime, which currently is `catch_traps`. I think the `with_stack_canary` call can be embedded in there too? (we can perhaps rename `catch_traps` to `invoke_wasm` or something like that)
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Switching `fast_path_storage` to a `Vec<T>` would be an easy way to make the fast-path-table growable (that's a good point, and one thing I was curious how often it would come into play). For shrinking we could periodically call `HashSet::shrink_to_fit` I think?
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Yeah. Would be nice if `HashSet::shrink_to` was stable, so we could shrink capacity by half when the length is a quarter of the capacity, and keep the amortized O(1). We could do `shrink_to_fit` followed by `reserve` but this is suboptimal.
Anyways, I'm just gonna do the simplest thing here for now, and leave that stuff to when we actually see it in profiles.
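For reference, the halve-at-a-quarter policy mentioned above could be sketched like this on a toolchain where `HashSet::shrink_to` is available (it was stabilized in Rust 1.56, after this discussion). The helper name `shrink_if_sparse` is hypothetical.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Halve the capacity once the set is at most a quarter full.
/// Shrinking only at a quarter, rather than at a half, leaves slack after the
/// shrink, so alternating inserts and removals do not resize on every call.
fn shrink_if_sparse<T: Eq + Hash>(set: &mut HashSet<T>) {
    let capacity = set.capacity();
    if set.len() <= capacity / 4 {
        // `shrink_to` treats its argument as a lower bound on the capacity.
        set.shrink_to(capacity / 2);
    }
}
```

Such a helper would presumably be called after a GC has replaced the root set, when the live size is known.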
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Also, the `backtrace` update needs these dep updates.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
This runs into the weirdness at the `wasmtime` and `wasmtime-runtime` boundary again: this is inside `wasmtime-runtime` and so it has "no" knowledge of `Store` and whether anything else is holding the table alive for it.
Would you prefer that `InstanceHandle::new` took a `*mut VMExternRefActivationsTable` instead of an `Rc` and we added this to the safety invariants required to be maintained for `InstanceHandle::new`?
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Same question as above regarding `*mut` vs `Arc`: https://github.com/bytecodealliance/wasmtime/pull/1832/files?file-filters%5B%5D=.rs#r439681507
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Ah I misunderstood what you were asking for in the original comment. This is done now.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
This doesn't work for disabling reference types, and there is no `disable` function to go along with `enable` either.
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
It was unnecessary, and is now removed.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Otherwise `Module` is not `Send`:
    error[E0277]: `std::cell::RefCell<wasmtime_runtime::externref::StackMapRegistryInner>` cannot be shared between threads safely
       --> crates/wasmtime/src/module.rs:579:5
        |
    578 |     fn _assert<T: Send + Sync>() {}
        |                   ---- required by this bound in `module::_assert_send_sync::_assert`
    579 |     _assert::<Module>();
        |     ^^^^^^^^^^^^^^^^^ `std::cell::RefCell<wasmtime_runtime::externref::StackMapRegistryInner>` cannot be shared between threads safely
        |
        = help: within `wasmtime_runtime::externref::StackMapRegistry`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<wasmtime_runtime::externref::StackMapRegistryInner>`
        = note: required because it appears within the type `wasmtime_runtime::externref::StackMapRegistry`
        = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<wasmtime_runtime::externref::StackMapRegistry>`
        = note: required because it appears within the type `wasmtime_runtime::externref::StackMapRegistration`
        = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<wasmtime_runtime::externref::StackMapRegistration>`
        = note: required because it appears within the type `std::option::Option<std::sync::Arc<wasmtime_runtime::externref::StackMapRegistration>>`
        = note: required because it appears within the type `std::option::Option<std::option::Option<std::sync::Arc<wasmtime_runtime::externref::StackMapRegistration>>>`
        = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<std::option::Option<std::option::Option<std::sync::Arc<wasmtime_runtime::externref::StackMapRegistration>>>>`
        = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<std::sync::Mutex<std::option::Option<std::option::Option<std::sync::Arc<wasmtime_runtime::externref::StackMapRegistration>>>>>`
        = note: required because it appears within the type `module::Module`
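The constraint behind that error can be reproduced in miniature. In the sketch below the type names mirror those in the error message, but their fields are invented for illustration: a registry behind `Arc` must be `Sync` for the `Arc` to be `Send`, which `RwLock` provides and `RefCell` does not.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical contents; the real registry's fields are not shown in the thread.
#[derive(Default)]
struct StackMapRegistryInner {
    ranges: BTreeMap<usize, usize>,
}

/// With `RwLock` the registry is `Sync`, so `Arc<StackMapRegistry>` is `Send`.
/// Swapping in `std::cell::RefCell` here makes the assertion below fail to
/// compile with an error like the one quoted above.
#[derive(Default)]
struct StackMapRegistry {
    inner: RwLock<StackMapRegistryInner>,
}

/// Simplified stand-in for `Module`, which holds the registry via `Arc`.
struct Module {
    registry: Arc<StackMapRegistry>,
}

fn _assert<T: Send + Sync>() {}

/// Compile-time check in the style of the `_assert_send_sync` test in the error.
fn assert_module_is_send_sync() {
    _assert::<Module>();
}
```

A registry that lives purely inside a single-threaded `Store` would not need this, which is why the question only arises when something `Send`, like `Module`, holds a reference to it.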
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen updated PR #1832 from externref-stack-maps to master.
fitzgen updated PR #1832 from externref-stack-maps to master.
alexcrichton submitted PR Review.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
This seems to have gone backwards?
(along with a number of other crates?)
alexcrichton created PR Review Comment:
FWIW this is likely to break CI until this is fixed
alexcrichton created PR Review Comment:
Published now!
alexcrichton created PR Review Comment:
I think I commented on this last time too, so sorry if I missed your response in the meantime, but can there be a destructor for this type which debug-asserts that all slots in this chunk are None?
alexcrichton created PR Review Comment:
Here on insert_slow_path it always inserts in the hash set, but presumably after gc the chunk is empty?
alexcrichton created PR Review Comment:
Also, should this idiom perhaps be encapsulated in a method on the activation table?
alexcrichton created PR Review Comment:
Ah right yeah, I think we probably need to clean up that storage of the registrations in Module, but for now seems fine.
alexcrichton created PR Review Comment:
Instead of #[cfg] could this conditionally panic using if cfg! in the implementation of Deref and such?
alexcrichton created PR Review Comment:
Ah sorry missed this earlier, but yeah let's try passing around *mut here instead of an owned pointer.
alexcrichton created PR Review Comment:
Perhaps we should take a leaf out of Rc's book and use strong_count for forwards-compatibility with weak references if we ever add them?
alexcrichton created PR Review Comment:
To confirm, this for sure works, right? One thing I'd be worried about is rustc's rvalue-promotion where the 0 gets promoted into static data. If you did let canary = &0, for example, I think that'd point into static data rather than on the stack. I'd be a bit worried that eventually if rustc did more mir inlining or something like that it'd inline the definition of canary to here and do the rvalue promotion anyway.
In any case so long as this works today I'm fine, and I would imagine that if rustc ever changed it would cause tests to fail here too.
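To make the concern concrete, here is a small sketch (not the canary code from this PR; both function names are made up for illustration). The first function only compiles because &0 is promoted to a 'static constant, so such a reference says nothing about the current stack frame. The second takes the address of a real local and uses std::hint::black_box as a best-effort hint to keep that local around.

```rust
/// Compiles only because `&0` is promoted to a `'static` constant
/// (rvalue static promotion), so it does not point into this frame.
fn promoted() -> &'static i32 {
    &0
}

/// Takes the address of an actual local variable. `black_box` is only a
/// hint to the optimizer, not a guarantee about where the value lives.
fn stack_address() -> usize {
    let canary: usize = 0;
    std::hint::black_box(&canary) as *const usize as usize
}
```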
alexcrichton created PR Review Comment:
I think this'll need to look register_jit_code-ish, where we register a compiled module with a Store and metadata about that is stored in Store rather than the Module.
Honestly we should do this with register_frame_info too, but I can tackle that some other time.
alexcrichton created PR Review Comment:
You can probably avoid passing these two parameters explicitly and read them from the passed-in store argument inside the method.
alexcrichton created PR Review Comment:
Out of curiosity, could we always enable this? Presumably once reference types ships we'll enable this unconditionally anyway.
alexcrichton created PR Review Comment:
I think that this API may actually be incorrect for GC, because you can register a module into two stores. The second store won't actually get registered in its registry, which I think means GC won't be precise?
alexcrichton created PR Review Comment:
Answering my own question, if my thought earlier about the stack canary and rvalue promotion breaks things this assertion will break. If our stack canary never works then nothing will get gc'd from the above loop.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Sorry I didn't mention this, but since switching from the list-of-chunks representation to your new suggested one, the table automatically runs the Drop implementations, so dropping the table will now properly avoid leaks.
Therefore, I didn't think it was necessary to insert these debug assertions anymore. (Also, inserting them would require that we gc before dropping the table, to ensure that it gets swept. Everything is easier when we can rely on automatic Drops!)
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
I figured since it is already a slow path, it might as well do de-duplication, rather than use the bump chunk. I can leave a comment to this effect, and also make a method for this idiom.
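A rough sketch of the idea in this reply, with made-up names and a HashMap keyed by pointer standing in for the hash set: the fast path bumps into a fixed-size chunk, and once the chunk is full the slow path stores by pointer identity, so repeated inserts of the same reference hold only one extra strong count.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for `VMExternRef`.
type ExternRef = Rc<String>;

/// Hypothetical table: a bump-allocated chunk plus a de-duplicating overflow.
struct Table {
    chunk: Vec<Option<ExternRef>>,
    next: usize,
    slow: HashMap<*const String, ExternRef>,
}

impl Table {
    fn with_capacity(n: usize) -> Self {
        Table { chunk: vec![None; n], next: 0, slow: HashMap::new() }
    }

    /// Fast path: take the next free slot, with no de-duplication.
    fn insert(&mut self, r: ExternRef) {
        if self.next < self.chunk.len() {
            self.chunk[self.next] = Some(r);
            self.next += 1;
        } else {
            self.insert_slow_path(r);
        }
    }

    /// Slow path: keyed by pointer, so a repeated insert replaces the old
    /// entry (dropping it) instead of accumulating strong references.
    fn insert_slow_path(&mut self, r: ExternRef) {
        self.slow.insert(Rc::as_ptr(&r), r);
    }
}

/// Inserts the same reference three times into a one-slot table and returns
/// the resulting strong count.
fn slow_path_example() -> usize {
    let r = Rc::new(String::from("x"));
    let mut table = Table::with_capacity(1);
    table.insert(r.clone()); // lands in the chunk
    table.insert(r.clone()); // lands in the overflow map
    table.insert(r.clone()); // replaces the previous overflow entry
    Rc::strong_count(&r)
}
```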
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Whoops missed that, nice!
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
It does work today (the tests would fail if it didn't, since they are checking the reference counts and asserting that they go down after GC, which means we did sweep the table, which means we saw the canary).
I agree that it is slightly fragile. I don't know what we could do to avoid it without something like test::black_box or calling getcontext to get the SP (eww out-of-line call).
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Then we would need to always have the inner: T member and rely on LLVM to optimize it away when not used in non-cfg(debug_assertions) builds. Does that sound OK to you?
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Yeah given the limited usage of this I trust LLVM enough to figure everything out.
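The shape being agreed on here might look roughly like the following sketch. DebugOnly is a hypothetical name and this is not necessarily the code that landed: the inner field always exists, and Deref checks cfg!(debug_assertions) at run time (a compile-time constant the optimizer can fold) instead of gating the field behind #[cfg].

```rust
use std::ops::Deref;

/// Hypothetical wrapper for a value that should only be read in debug builds.
struct DebugOnly<T> {
    inner: T,
}

impl<T> Deref for DebugOnly<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // `cfg!` expands to `true` or `false`, so both branches type-check
        // in every build and the dead one can be optimized out.
        if cfg!(debug_assertions) {
            &self.inner
        } else {
            panic!("`DebugOnly` must only be dereferenced in debug builds")
        }
    }
}

/// Reads the value only when debug assertions are enabled.
fn debug_only_example() -> Option<u32> {
    let d = DebugOnly { inner: 5u32 };
    if cfg!(debug_assertions) { Some(*d) } else { None }
}
```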
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Good eye! I didn't realize that a single module could be registered with multiple stores!
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
We could, but it implies an additional pass over the IR by cranelift, and if we know there aren't any being used, then we can skip that unnecessary work.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Ah, yep, I see you figured it out :)
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Okay the new version fixes this by registering with the store, rather than with the module. This also means that we can get rid of StackMapRegistration entirely, and can instead rely on the store to keep the registry itself alive!
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
This seems fine for now, but I wanted to point this out because that optimization is something we likely need to implement in short order in that case. In theory there should be a fast path for functions which don't use anyref?
Put another way, if this is scoped to only reference types for performance reasons, then we need to file an issue about that and get it fixed before fully shipping reference types.
fitzgen submitted PR Review.
fitzgen created PR Review Comment:
Filed https://github.com/bytecodealliance/wasmtime/issues/1883
alexcrichton submitted PR Review.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
I think at this point this can be RefCell?
alexcrichton created PR Review Comment:
Since these strong references are dropped here, could this either take &... or plumb through the *mut business?
alexcrichton created PR Review Comment:
To confirm, is every function guaranteed to have a stack map, even if it doesn't use reference types?
alexcrichton created PR Review Comment:
I think this can be RefCell now?
alexcrichton created PR Review Comment:
Mind throwing a cfg! here for the target pointer width? (and below too)
alexcrichton created PR Review Comment:
Oh dear this is nasty, can we load flags from the Store instead of trying to recreate them here though?
alexcrichton created PR Review Comment:
(in that I think we have a Box<dyn TargetIsa> shoved somewhere in there)
fitzgen created PR Review Comment:
Good eye, this is a bug introduced during all the refactoring today. I'm going to just make StackMapRegistry::register_stack_maps idempotent, rather than trying to check if it's already registered here. This is a nice little cleanup, since the registry is in the best position to answer this question anyway.
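As a sketch of what an idempotent registration could look like (the Registry type, its key, and its payload are all assumptions for illustration, not the real StackMapRegistry API): registering the same code range a second time is a no-op, so callers do not need to check first.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Hypothetical registry keyed by the start address of a module's code.
#[derive(Default)]
struct Registry {
    // start address -> (end address, placeholder stack map payload)
    modules: BTreeMap<usize, (usize, Vec<u32>)>,
}

impl Registry {
    /// Returns `true` if the range was newly registered and `false` if it
    /// was already present, in which case nothing changes.
    fn register(&mut self, start: usize, end: usize, maps: Vec<u32>) -> bool {
        match self.modules.entry(start) {
            Entry::Occupied(_) => false,
            Entry::Vacant(slot) => {
                slot.insert((end, maps));
                true
            }
        }
    }
}

/// Registers the same range twice and reports both results plus the number
/// of registered modules.
fn register_twice() -> (bool, bool, usize) {
    let mut registry = Registry::default();
    let first = registry.register(0x1000, 0x2000, vec![1, 2, 3]);
    let second = registry.register(0x1000, 0x2000, vec![1, 2, 3]);
    (first, second, registry.modules.len())
}
```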
fitzgen submitted PR Review.
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen updated PR #1832 from externref-stack-maps
to master
fitzgen merged PR #1832.
alexcrichton created PR Review Comment:
For future instances of this, mind tagging this with a FIXME and an issue number so when we get around to aarch64 reference types we can make sure we run all the tests?
alexcrichton submitted PR Review.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Same comment here for the aarch64 testing
Last updated: Jan 24 2025 at 00:11 UTC