fitzgen opened PR #8590 from fitzgen:decommit-queue to bytecodealliance:main:
This introduces a `DecommitQueue` for batching decommits together in the pooling allocator:
Deallocating a memory/table/stack enqueues their associated regions of memory for decommit; it no longer immediately returns the associated slot to the pool's free list. If the queue's length has reached the configured batch size, then we flush the queue by running all the decommits, and finally returning the memory/table/stack slots to their respective pools and free lists.
Additionally, if allocating a new memory/table/stack fails because the free list is empty (i.e. we've reached the max concurrently-allocated limit for this entity), then we fall back to a slow path before propagating the error. This slow path flushes the decommit queue and then retries the allocation, hoping that the flush reclaimed slots and made them available for this fallback allocation attempt. This involved defining a new `PoolConcurrencyLimitError` to match on, which is also exposed in the public embedder API.

It is also worth noting that we always use this new decommit queue now. To keep the existing behavior, where e.g. a memory's decommits happen immediately on deallocation, you can use a batch size of one. This effectively disables queueing, forcing all decommits to be flushed immediately.
The default decommit batch size is one.
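The enqueue-then-maybe-flush flow described above can be sketched roughly as follows. This is a simplified model, not Wasmtime's actual types: `Region`, the returned slot list, and the elided syscalls are all stand-ins.

```rust
// Simplified model of the batching behavior: deallocation enqueues a
// region, and once the queue reaches the configured batch size we
// "decommit" everything at once and return the slots to the free list.
struct Region {
    slot: usize, // pool slot to return to the free list after decommit
}

struct DecommitQueue {
    batch_size: usize,
    pending: Vec<Region>,
}

impl DecommitQueue {
    fn new(batch_size: usize) -> Self {
        assert!(batch_size >= 1);
        Self { batch_size, pending: Vec::new() }
    }

    /// Enqueue a region; flush if we've reached the batch size. Returns
    /// the slots made available by a flush (empty if no flush happened).
    fn deallocate(&mut self, region: Region) -> Vec<usize> {
        self.pending.push(region);
        if self.pending.len() >= self.batch_size {
            self.flush()
        } else {
            Vec::new()
        }
    }

    /// Run all pending decommits and return the freed slots.
    fn flush(&mut self) -> Vec<usize> {
        // Real code would issue the madvise/decommit syscalls here.
        self.pending.drain(..).map(|r| r.slot).collect()
    }
}

fn main() {
    // Batch size of one: every deallocation flushes immediately,
    // matching the pre-existing behavior.
    let mut q = DecommitQueue::new(1);
    assert_eq!(q.deallocate(Region { slot: 7 }), vec![7]);

    // Batch size of three: slots only come back once the queue fills.
    let mut q = DecommitQueue::new(3);
    assert!(q.deallocate(Region { slot: 0 }).is_empty());
    assert!(q.deallocate(Region { slot: 1 }).is_empty());
    assert_eq!(q.deallocate(Region { slot: 2 }), vec![0, 1, 2]);
    println!("ok");
}
```

With a batch size of one, the model degenerates to flush-on-every-deallocation, which is why that setting preserves the old behavior.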
This commit, with a batch size of one, consistently gives me an increase in `wasmtime serve`'s requests-per-second versus its parent commit, as measured by `benches/wasmtime-serve-rps.sh`. I get ~39K RPS on this commit compared to ~35K RPS on the parent commit. This is quite puzzling to me: I was expecting no change, and hoping there wouldn't be a regression. I was not expecting a speed up. I cannot explain this result at this time.
fitzgen requested alexcrichton for a review on PR #8590.
fitzgen requested wasmtime-fuzz-reviewers for a review on PR #8590.
fitzgen requested wasmtime-core-reviewers for a review on PR #8590.
fitzgen requested wasmtime-default-reviewers for a review on PR #8590.
fitzgen updated PR #8590.
fitzgen updated PR #8590.
alexcrichton submitted PR review:
Thanks and looks good! I've got some thoughts on refactorings below, but otherwise can you additionally add some tests that exercise the if-empty-then-flush-and-try-again path? Basically a test that fails if that path isn't there but succeeds if it is.
> This is quite puzzling to me. I was expecting no change, and hoping there wouldn't be a regression. I was not expecting a speed up. I cannot explain this result at this time.
I think I've seen things to this effect historically where very-close-together `madvise`s are slightly more optimal sometimes. The theory is that the kernel didn't have a chance to run any other threads between two calls to `madvise`, so the second one can skip IPIs since the kernel dynamically knows that no other cores need to be shot down, but that's only a guess.
alexcrichton created PR review comment:
Given that this happens under a lock and needs to happen as quickly as possible, could this avoid the `sysconf` on each call?
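One way to avoid the repeated call is to cache the value once per process, e.g. in a `std::sync::OnceLock`. A sketch of the pattern; `query_iov_max` here is a stub standing in for the real `sysconf(_SC_IOV_MAX)` call, which is not reproduced.

```rust
use std::sync::OnceLock;

// Stand-in for the real query; actual code would call
// libc::sysconf(libc::_SC_IOV_MAX) and fall back to a
// conservative default if it fails.
fn query_iov_max() -> usize {
    1024
}

/// Query the limit at most once for the lifetime of the process.
fn iov_max() -> usize {
    static IOV_MAX: OnceLock<usize> = OnceLock::new();
    *IOV_MAX.get_or_init(query_iov_max)
}

fn main() {
    assert_eq!(iov_max(), 1024);
    // Subsequent calls hit the cached value; no repeated syscall.
    assert_eq!(iov_max(), iov_max());
    println!("ok");
}
```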
alexcrichton created PR review comment:
Given the size of `DecommitQueue`, can a newtype be made for `libc::iovec` which has these impls?
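The newtype idea could look roughly like this. A sketch only: `RawIoVec` is a locally defined struct with the same shape as `libc::iovec` so the example is self-contained, and the `unsafe impl`s are the "these impls" in question, sound only under the ownership invariant documented in the comment.

```rust
use std::ffi::c_void;

// Stand-in for libc::iovec (same shape: base pointer + length).
#[repr(C)]
struct RawIoVec {
    iov_base: *mut c_void,
    iov_len: usize,
}

/// Newtype over the raw iovec so we can implement `Send`/`Sync`
/// for it, which raw pointers do not get automatically.
struct IoVec(RawIoVec);

// SAFETY: the queue owns the regions these iovecs point at until they
// are decommitted, so moving/sharing them across threads is sound.
// This is an invariant the surrounding code must uphold, not something
// the compiler can check.
unsafe impl Send for IoVec {}
unsafe impl Sync for IoVec {}

impl IoVec {
    fn new(ptr: *mut u8, len: usize) -> Self {
        IoVec(RawIoVec { iov_base: ptr.cast(), iov_len: len })
    }

    fn len(&self) -> usize {
        self.0.iov_len
    }
}

fn main() {
    let mut buf = [0u8; 16];
    let iov = IoVec::new(buf.as_mut_ptr(), buf.len());
    assert_eq!(iov.len(), 16);
    println!("ok");
}
```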
alexcrichton created PR review comment:
Can this API fail? And if so, could `batch_size` be used as-is?
alexcrichton created PR review comment:
Instead of passing around a closure here and in the `*_pool.rs` files below, would you be ok adding a small abstraction which has `&mut DecommitQueue` internally along with the `bool` that `pooling.rs` is tracking?
alexcrichton created PR review comment:
This is probably going to need `#[cfg]` or to move into the `sys` module to handle Windows/macOS/Miri as well.
alexcrichton created PR review comment:
Could you add a helper method for this? Something like `fn allocate_and_maybe_decommit<T>(&self, f: impl Fn() -> Result<T>) -> Result<T> { ... }` to avoid copying this in a few places? Also it might be worth having `flush_decommit_queue` return whether it actually did anything and skip the re-allocate if the flush didn't flush.
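The suggested helper, including the "skip the retry if the flush did nothing" refinement, might look something like this. It is a sketch: `Pool`, `AllocError`, and the counters are simplified stand-ins for the real allocator internals, not Wasmtime's actual API.

```rust
// Sketch of the retry-on-limit helper: try the allocation, and if it
// fails because the pool is at its concurrency limit, flush the
// decommit queue and retry once -- but only if the flush freed anything.
#[derive(Debug, PartialEq)]
enum AllocError {
    LimitReached,
}

struct Pool {
    available: usize,
    queued_decommits: usize,
}

impl Pool {
    /// Returns `true` if the flush actually reclaimed any slots.
    fn flush_decommit_queue(&mut self) -> bool {
        if self.queued_decommits == 0 {
            return false;
        }
        self.available += self.queued_decommits;
        self.queued_decommits = 0;
        true
    }

    fn allocate_and_maybe_flush<T>(
        &mut self,
        f: impl Fn(&mut Pool) -> Result<T, AllocError>,
    ) -> Result<T, AllocError> {
        match f(self) {
            Err(AllocError::LimitReached) => {
                if self.flush_decommit_queue() {
                    // The flush reclaimed slots: retry the allocation once.
                    f(self)
                } else {
                    // Nothing to flush: propagate the limit error.
                    Err(AllocError::LimitReached)
                }
            }
            result => result,
        }
    }
}

fn try_alloc(pool: &mut Pool) -> Result<usize, AllocError> {
    if pool.available == 0 {
        Err(AllocError::LimitReached)
    } else {
        pool.available -= 1;
        Ok(pool.available)
    }
}

fn main() {
    // Free list empty, but two decommits are queued: the slow path
    // flushes them and the retry succeeds.
    let mut pool = Pool { available: 0, queued_decommits: 2 };
    assert!(pool.allocate_and_maybe_flush(try_alloc).is_ok());

    // Nothing queued: the flush does nothing, so skip the retry
    // and propagate the limit error.
    let mut pool = Pool { available: 0, queued_decommits: 0 };
    assert_eq!(
        pool.allocate_and_maybe_flush(try_alloc),
        Err(AllocError::LimitReached)
    );
    println!("ok");
}
```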
alexcrichton created PR review comment:
Hm, actually, on second thought I have a different idea. The duplication in the three deallocate methods in `pooling.rs` is pretty subtle and feels sort of hard to get right. It also has the subtlety that the `Mutex<DeallocationQueue>` is held across function calls, such as this one, and it's getting to the point where it's easy to run into lock-ordering issues or deadlocks. For example, in `deallocate_fiber_stack` the deallocation queue lock may be held across `self.stacks.deallocate`, which internally acquires a different lock.

This leads me to a different way to model this which might be a bit less error-prone. This `decommit` argument could be replaced by a structure with `SmallVec`s inside of it which is pushed onto. Once this step has completed, there's a general method for "clear out this thing with small vectors or, if empty, do the deallocation immediately" or something like that. That would keep the critical section as small as possible while, I think, also making it a bit clearer how to reuse/shape code using it.
fitzgen updated PR #8590.
jameysharp submitted PR review.
jameysharp created PR review comment:
On glibc, `sysconf(_SC_IOV_MAX)` returns a compile-time constant, so I don't think we should worry about its speed: https://github.com/bminor/glibc/blob/dd5f891c1ad9f1b43b9db93afe2a55cbb7a6194e/sysdeps/posix/sysconf.c#L412-L418
github-actions[bot] commented on PR #8590:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "fuzzing", "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:
- fitzgen: fuzzing
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
github-actions[bot] commented on PR #8590:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

- [ ] If you added a new `Config` method, you wrote extensive documentation for it.

  <details>

  Our documentation should be of the following form:

  ```text
  Short, simple summary sentence.

  More details. These details can be multiple paragraphs. There should be
  information about not just the method, but its parameters and results as
  well.

  Is this method fallible? If so, when can it return an error?

  Can this method panic? If so, when does it panic?

  Example

  Optional example here.
  ```

  </details>

- [ ] If you added a new `Config` method, or modified an existing one, you ensured that this configuration is exercised by the fuzz targets.

  <details>

  For example, if you expose a new strategy for allocating the next instance slot inside the pooling allocator, you should ensure that at least one of our fuzz targets exercises that new strategy.

  Often, all that is required of you is to ensure that there is a knob for this configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one of its nested `struct`s).

  Rarely, this may require authoring a new fuzz target to specifically test this configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

  </details>

- [ ] If you are enabling a configuration option by default, make sure that it has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the <code>.github/label-messager.json</code> configuration file.

</details>
jameysharp submitted PR review.
jameysharp created PR review comment:
I think the least-code way to do what you're suggesting is to pass an empty `DecommitQueue` into e.g. `image.clear_and_remain_ready` and unconditionally call `push_memory` on it; then pass it to a common method.

That method should check if its raw iovec queue is empty, and if so, immediately flush it, which will call `self.memories.deallocate` on the only memory pushed into it. Otherwise, take the lock and append its contents to the global `DecommitQueue`, and then if that queue is full, immediately flush that.

Having the ability to append one `DecommitQueue` into another might be nice for maintaining thread-local queues or something too, someday.

Also it might make sense to move the `batch_size` field to `PoolingInstanceAllocator`. We don't need to check it during `enqueue_raw`, especially with this proposed change, so we might as well store it alongside all the other data we need access to during flush.
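The local-queue-then-merge shape proposed here can be sketched like so. A simplified model: `(usize, usize)` pairs stand in for raw iovecs, and the actual decommit/deallocate calls are elided.

```rust
use std::sync::Mutex;

// Sketch of the suggested shape: build a small local queue outside any
// lock, then append it into the shared global queue under a short
// critical section, flushing the global queue if it is now full.
#[derive(Default)]
struct DecommitQueue {
    regions: Vec<(usize, usize)>, // (start, len) stand-ins for raw iovecs
}

impl DecommitQueue {
    fn push(&mut self, start: usize, len: usize) {
        self.regions.push((start, len));
    }

    /// Append another queue's contents into this one, draining it.
    fn append(&mut self, other: &mut DecommitQueue) {
        self.regions.append(&mut other.regions);
    }
}

fn merge_or_flush(global: &Mutex<DecommitQueue>, mut local: DecommitQueue, batch_size: usize) {
    if local.regions.is_empty() {
        // Nothing to decommit: the slot can be deallocated
        // immediately (elided here).
        return;
    }
    // Keep the critical section small: just the append and the
    // full-queue check.
    let mut global = global.lock().unwrap();
    global.append(&mut local);
    if global.regions.len() >= batch_size {
        // Flush: issue the batched decommits (elided), then clear.
        global.regions.clear();
    }
}

fn main() {
    let global = Mutex::new(DecommitQueue::default());
    let mut local = DecommitQueue::default();
    local.push(0x1000, 4096);
    merge_or_flush(&global, local, 4);
    // Below the batch size, so the region is still queued globally.
    assert_eq!(global.lock().unwrap().regions.len(), 1);
    println!("ok");
}
```

The same `append` is what would let thread-local queues merge into the global one, as suggested above.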
jameysharp submitted PR review.
jameysharp created PR review comment:
I gather you're suggesting the closure parameter would be, for example: `|| self.memories.allocate(request, memory_plan, memory_index)`. That makes sense to me!
fitzgen submitted PR review.
fitzgen created PR review comment:
We have to handle the case where the batch size is larger than `IOV_MAX` anyways, since a single wasm memory can push two memory regions to decommit and so we can always blow our target batch size. Therefore, I think we should just remove this bit, and allow the batch decommit to break things up into smaller chunks if it must, since it has to have that logic anyways.
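The chunking being discussed amounts to a loop over fixed-size slices of the queued regions. A sketch: `decommit_batch` stands in for the actual vectored decommit call, which must never receive more than `IOV_MAX` entries.

```rust
// Break an arbitrarily long list of queued regions into chunks no
// larger than IOV_MAX before handing each chunk to the (vectored)
// decommit syscall. `decommit_batch` is a stand-in for that syscall.
fn decommit_all(
    regions: &[(usize, usize)],
    iov_max: usize,
    decommit_batch: &mut impl FnMut(&[(usize, usize)]),
) {
    for chunk in regions.chunks(iov_max) {
        decommit_batch(chunk);
    }
}

fn main() {
    let regions: Vec<(usize, usize)> = (0..10).map(|i| (i * 4096, 4096)).collect();
    let mut batches = Vec::new();
    decommit_all(&regions, 4, &mut |chunk| batches.push(chunk.len()));
    // 10 regions with IOV_MAX = 4 -> batches of 4, 4, and 2.
    assert_eq!(batches, vec![4, 4, 2]);
    println!("ok");
}
```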
fitzgen updated PR #8590.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sounds good to me :+1:
fitzgen updated PR #8590.
fitzgen updated PR #8590.
fitzgen commented on PR #8590:
@alexcrichton I think this is ready for re-review!
fitzgen edited a comment on PR #8590:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

- [x] If you added a new `Config` method, you wrote extensive documentation for it.

  <details>

  Our documentation should be of the following form:

  ```text
  Short, simple summary sentence.

  More details. These details can be multiple paragraphs. There should be
  information about not just the method, but its parameters and results as
  well.

  Is this method fallible? If so, when can it return an error?

  Can this method panic? If so, when does it panic?

  Example

  Optional example here.
  ```

  </details>

- [ ] If you added a new `Config` method, or modified an existing one, you ensured that this configuration is exercised by the fuzz targets.

  <details>

  For example, if you expose a new strategy for allocating the next instance slot inside the pooling allocator, you should ensure that at least one of our fuzz targets exercises that new strategy.

  Often, all that is required of you is to ensure that there is a knob for this configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one of its nested `struct`s).

  Rarely, this may require authoring a new fuzz target to specifically test this configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

  </details>

- [ ] If you are enabling a configuration option by default, make sure that it has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the <code>.github/label-messager.json</code> configuration file.

</details>
fitzgen edited a comment on PR #8590:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

- [x] If you added a new `Config` method, you wrote extensive documentation for it.

  <details>

  Our documentation should be of the following form:

  ```text
  Short, simple summary sentence.

  More details. These details can be multiple paragraphs. There should be
  information about not just the method, but its parameters and results as
  well.

  Is this method fallible? If so, when can it return an error?

  Can this method panic? If so, when does it panic?

  Example

  Optional example here.
  ```

  </details>

- [x] If you added a new `Config` method, or modified an existing one, you ensured that this configuration is exercised by the fuzz targets.

  <details>

  For example, if you expose a new strategy for allocating the next instance slot inside the pooling allocator, you should ensure that at least one of our fuzz targets exercises that new strategy.

  Often, all that is required of you is to ensure that there is a knob for this configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one of its nested `struct`s).

  Rarely, this may require authoring a new fuzz target to specifically test this configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

  </details>

- [ ] If you are enabling a configuration option by default, make sure that it has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the <code>.github/label-messager.json</code> configuration file.

</details>
fitzgen edited a comment on PR #8590:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

- [x] If you added a new `Config` method, you wrote extensive documentation for it.

  <details>

  Our documentation should be of the following form:

  ```text
  Short, simple summary sentence.

  More details. These details can be multiple paragraphs. There should be
  information about not just the method, but its parameters and results as
  well.

  Is this method fallible? If so, when can it return an error?

  Can this method panic? If so, when does it panic?

  Example

  Optional example here.
  ```

  </details>

- [x] If you added a new `Config` method, or modified an existing one, you ensured that this configuration is exercised by the fuzz targets.

  <details>

  For example, if you expose a new strategy for allocating the next instance slot inside the pooling allocator, you should ensure that at least one of our fuzz targets exercises that new strategy.

  Often, all that is required of you is to ensure that there is a knob for this configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one of its nested `struct`s).

  Rarely, this may require authoring a new fuzz target to specifically test this configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

  </details>

- [x] If you are enabling a configuration option by default, make sure that it has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the <code>.github/label-messager.json</code> configuration file.

</details>
alexcrichton submitted PR review:
Nice!
jameysharp commented on PR #8590:
Looks like the reason CI is failing is that we can only use `libc::iovec` on `cfg(unix)`. Elsewhere I guess we should define an alternative `struct iovec` with the same fields? We need it to match the system ABI if we use it in syscalls, but on non-Unix (non-Linux, even) any pair of pointer and length will do.
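A definition along those lines might look like this. A sketch only: the field names mirror `libc::iovec`, but the exact `#[cfg]` arrangement (use `libc::iovec` on Unix, this struct elsewhere) is an assumption and is not shown; off-Unix the struct never reaches a syscall, so only the pointer + length pair matters, not the exact ABI.

```rust
use std::ffi::c_void;

// Portable stand-in for libc::iovec: the same two fields, so the
// queue's bookkeeping code can be identical on every platform. Real
// code would gate this behind #[cfg(not(unix))] and use libc::iovec
// on #[cfg(unix)].
#[repr(C)]
#[allow(non_camel_case_types)]
struct iovec {
    iov_base: *mut c_void,
    iov_len: usize,
}

fn main() {
    let mut page = [0u8; 4096];
    let iov = iovec {
        iov_base: page.as_mut_ptr().cast::<c_void>(),
        iov_len: page.len(),
    };
    assert_eq!(iov.iov_len, 4096);
    println!("ok");
}
```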
fitzgen updated PR #8590.
fitzgen has enabled auto merge for PR #8590.
fitzgen updated PR #8590.
fitzgen has enabled auto merge for PR #8590.
fitzgen updated PR #8590.
fitzgen has enabled auto merge for PR #8590.
fitzgen updated PR #8590.
fitzgen has enabled auto merge for PR #8590.
fitzgen updated PR #8590.
fitzgen updated PR #8590.
fitzgen updated PR #8590.
fitzgen merged PR #8590.
Last updated: Dec 13 2025 at 19:03 UTC