fitzgen opened PR #12969 from fitzgen:free-list-improvements to bytecodealliance:main:
- Avoid using a
BTreeMapand use a cache-friendlyVecinstead.- When merging blocks in the free list, use linear search, only falling back to binary search when the free list is large.
- Add fast-path entry points that take a
u32size directly that has already been rounded to the free list's alignment.Altogether, this shaves off ~309B instructions retired (48%) from the benchmark in https://github.com/bytecodealliance/wasmtime/issues/11141
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
fitzgen requested pchickey for a review on PR #12969.
fitzgen requested wasmtime-core-reviewers for a review on PR #12969.
github-actions[bot] added the label wasmtime:api on PR #12969.
github-actions[bot] added the label wasmtime:ref-types on PR #12969.
github-actions[bot] commented on PR #12969:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"Thus the following users have been cc'd because of the following labels:
- fitzgen: wasmtime:ref-types
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
alexcrichton commented on PR #12969:
With this approach, can't deallocation be quadractic w.r.t. the number of blocks? Insofar as this has removal/insertion from a sorted vector which is O(n), so deallocating N items could result in quadratic runtime?
fitzgen commented on PR #12969:
Hm yeah I suppose so. I can re-add the
BTreeMapfor the non-bump portion of the free list, which should avoid that behavior.
alexcrichton commented on PR #12969:
For the bump part though, is this just a tradeoff of rss vs throughput (or something like that)? I'm not sure how big GC heaps are by default but if they're big enough this seems not great where it'll unconditionally fill up the entire linear memory before reusing anything that's been deallocated. Or is the initial size of linear memories typically smaller?
fitzgen commented on PR #12969:
Or is the initial size of linear memories typically smaller?
The initial size is one Wasm page.
After that, right now, we always prefer growing the GC heap to collecting when possible, but that is changing in https://github.com/bytecodealliance/wasmtime/pull/12942
fitzgen updated PR #12969.
fitzgen commented on PR #12969:
Latest commit switched back to
BTreeMapfor the non-bump portion of the free list.
fitzgen unassigned pchickey from PR #12969 Use bump allocation in DRC free list and other improvements.
fitzgen requested alexcrichton for a review on PR #12969.
alexcrichton submitted PR review:
Personally I still sort of feel like the true way forward here will be to compile the allocation algorithm to wasm itself. We talked about this historically, and I realize it's far out of scope for this PR, but I feel like we're going to otherwise endlessly golf various forms of allocators here and there with heuristics and such, whereas putting in wasm with a strict API would make it much easier to experiment and play with at least.
alexcrichton created PR review comment:
Oh, well, uh, I guess this is quadratic as well... I realize this is preexisting, though, so sorry I didn't see this earlier :(
Mind tagging this with a FIXME and an issue? We presumably don't want to make this tier 1 with a quadratic-complexity allocator...
alexcrichton created PR review comment:
I'd probably leave this off and let LLVM figure it out, the
#[cold]should be enough
alexcrichton created PR review comment:
Could this method always be defined, and internally it uses
if cfg!(...)to return early?Also, would this be something suitable instead for
gc_zealsince this could otherwise make N allocations quadratic I think in debug mode?
fitzgen submitted PR review.
fitzgen created PR review comment:
A single allocation will generally always be $O(n)$ in the worst case in the face of heap fragmentation. You can avoid it with moving collectors, where you have more control over the heap, but the DRC collector is highly unlikely to ever become moving.
But perhaps we can do something to prevent N repeated allocations from becoming $O(n^2)$...
fitzgen edited PR review comment.
fitzgen updated PR #12969.
fitzgen commented on PR #12969:
Personally I still sort of feel like the true way forward here will be to compile the allocation algorithm to wasm itself. We talked about this historically, and I realize it's far out of scope for this PR, but I feel like we're going to otherwise endlessly golf various forms of allocators here and there with heuristics and such, whereas putting in wasm with a strict API would make it much easier to experiment and play with at least.
I agree, I don't think that we need to touch the free list again after this PR (plus the follow up fixme) until if/when we self-host the free list.
fitzgen has enabled auto merge for PR #12969.
fitzgen added PR #12969 Use bump allocation in DRC free list and other improvements to the merge queue.
fitzgen merged PR #12969.
fitzgen removed PR #12969 Use bump allocation in DRC free list and other improvements from the merge queue.
Last updated: Apr 12 2026 at 23:10 UTC