Stream: git-wasmtime

Topic: wasmtime / PR #13323 Inline the copying collector's bump ...


view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen opened PR #13323 from fitzgen:wasmtime-inline-copying-collector-alloc to bytecodealliance:main:

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested uweigand for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested dicej for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested wasmtime-core-reviewers for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen unassigned uweigand from PR #13323 Inline the copying collector's bump allocation in Wasm code.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested alexcrichton for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 07 2026 at 23:32):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 03:54):

github-actions[bot] added the label wasmtime:api on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 03:54):

github-actions[bot] added the label wasmtime:ref-types on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 03:55):

github-actions[bot] commented on PR #13323:

Subscribe to Label Action

cc @fitzgen

<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:memo: alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

These might be good methods to have on the PtrSize trait in case they ever, in the future, need to be parameterized on that?

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

Is this by-and-large emit_gc_raw_alloc from the drc collector? (modulo some control flow on the edges)

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

This is preexisting I think for other collectors, but there's 2 primary ways that this method is UB according to Miri:

  1. This is deriving a pointer from &self which is later mutated through in compiled code. That's UB in Rust because by deriving the pointer from &self you're saying that the contents can't be mutated. This is a playground example showing the UB (top right, Tools -> Miri). The fix for this specific issue is to change this method to &mut self and then have the implementation be NonNull::from(&mut self.vmctx_data).cast()
  2. The second problem is that this struture is stored inline with the CopyingHeap, which means that re-acquiring a mutable pointer to CopyingHeap will actually invalidate the previous pointer returned by this method. This is a playground example of the UB here.

Fixing the latter one is not going to be super easy. I can think a few possible ways:

I haven't double-checked, but I believe this would all plague the drc/null collectors as well. If you'd like I think it'd be reasonable to fix them all in a subsequent PR.

In terms of detecting all of this, in theory Miri should be able to capture everything here. I think that may mean we're not running much GC-compiled wasms through Miri, which might be a good addition to have? For miri coverage copying this line is probably the easiest. Along those lines rustup run nightly ./ci/miri-wast.sh ./tests/spec_testsuite/struct.wast locally shows a Miri violation which may be related to this, but it's always sort of hard to tell where exactly the miri error comes from due to the mix of pulley and lack of strict provenance...

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

I think the alias region here is wrong?

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

This might be able to use istore32 to avoid the ireduce ahead of it. (same for updating the bump ptr above)

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

How come this became optional? It's passed as Some at all call-sites it looks like. It seems a bit weird to store VMSharedTypeIndex::reserved_value in headers?

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

:speech_balloon: alexcrichton created PR review comment:

Are you sure about the big/little endian here? That shouldn't swap the order of the fields, but instead just the byte order within the fields?

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

alexcrichton unassigned dicej from PR #13323 Inline the copying collector's bump allocation in Wasm code.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 15:31):

alexcrichton requested alexcrichton for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 16:11):

fitzgen commented on PR #13323:

FWIW this is a 2% reduction in instructions retired on splay.wasm, but I haven't been able to measure a statistically significant difference in cycles yet. Will try bumping up the number of iterations.

view this post on Zulip Wasmtime GitHub notifications bot (May 08 2026 at 18:47):

alexcrichton unassigned alexcrichton from PR #13323 Inline the copying collector's bump allocation in Wasm code.

view this post on Zulip Wasmtime GitHub notifications bot (May 11 2026 at 17:41):

fitzgen commented on PR #13323:

FWIW this is a 2% reduction in instructions retired on splay.wasm, but I haven't been able to measure a statistically significant difference in cycles yet. Will try bumping up the number of iterations.

With 200 iterations:

execution :: cycles :: benchmarks/splay/splay.wasm

  Δ = 84634629.66 ± 43938351.82 (confidence = 99%)

  inline-alloc.so (-Wgc,function-references -Ccollector=copying) is 1.00x to 1.01x faster than main.so (-Wgc,function-references -Ccollector=copying)!

  [13062747333 13278335414.74 13771371538] inline-alloc.so (-Wgc,function-references -Ccollector=copying)
  [13062358191 13362970044.40 13846668642] main.so (-Wgc,function-references -Ccollector=copying)

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 18:55):

:memo: fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 18:55):

:speech_balloon: fitzgen created PR review comment:

We aren't storing into the VMContext but we are storing into a vmctx-style structure (VMCopyingHeapData), which we use the same alias region for (same as for e.g. VMStoreContext accesses).

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 18:56):

:memo: fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 18:56):

:speech_balloon: fitzgen created PR review comment:

Will handle this in a follow up.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 20:14):

:memo: alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 20:14):

:speech_balloon: alexcrichton created PR review comment:

Oh, another possible fix is like Store::data_mut where provenance of some Miri-safe pointer is used but the address is used like usual, meaning that the actual machine code doesn't change. This will still require an indirection, however, I think.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 20:40):

fitzgen updated PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 21:32):

:memo: fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 21:32):

:speech_balloon: fitzgen created PR review comment:

Yeah, its a bit funky and probably buggy so lets just do two stores for now (same as the null collector, fwiw).

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 21:36):

fitzgen updated PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 21:37):

fitzgen requested alexcrichton for a review on PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 22:11):

:thumbs_up: alexcrichton submitted PR review:

Any idea what's going on with the splay benchmark? Is it not actually allocating that much? Or is something else so massively dominating that allocation isn't factoring in?

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 22:11):

alexcrichton added PR #13323 Inline the copying collector's bump allocation in Wasm code to the merge queue.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 22:36):

github-merge-queue[bot] removed PR #13323 Inline the copying collector's bump allocation in Wasm code from the merge queue.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 22:54):

fitzgen commented on PR #13323:

Any idea what's going on with the splay benchmark? Is it not actually allocating that much? Or is something else so massively dominating that allocation isn't factoring in?

It has a few constants that control the size of the splay tree to create, the number of splay tree mutations to perform once its created, and one other thing that I forget. When porting it to Wasm, I replaced those constants with configurable parameters read from default.input, and then I also adjusted the default inputs to make executing it take a little longer and match some of our other benchmarks. That last bit involved mostly doing 10x as many (IIRC) mutations to the tree, while keeping it the same size, so this change could have made it less allocation-heavy in an overall relative sense, and exercising GC object accesses more. (IIRC also, trying to make the tree bigger was starting to hit GC OOM).

Also, it is written in an "idiomatic" style with lots of little functions, and we don't do any inlining by default. That could probably change things drastically for it.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 23:04):

fitzgen updated PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 23:04):

fitzgen has enabled auto merge for PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 23:27):

fitzgen added PR #13323 Inline the copying collector's bump allocation in Wasm code to the merge queue.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 23:53):

:check: fitzgen merged PR #13323.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2026 at 23:53):

fitzgen removed PR #13323 Inline the copying collector's bump allocation in Wasm code from the merge queue.

view this post on Zulip Wasmtime GitHub notifications bot (May 13 2026 at 19:00):

fitzgen commented on PR #13323:

Any idea what's going on with the splay benchmark? Is it not actually allocating that much? Or is something else so massively dominating that allocation isn't factoring in?

FWIW, here is a quick perf profile of splay.wasm:

<img width="2042" height="398" alt="image" src="https://github.com/user-attachments/assets/33a26afd-9e5d-4697-a07d-5af5e38d8940" />

Nearly all time is spent inside the collector itself.


Last updated: Jun 01 2026 at 09:49 UTC