fitzgen opened PR #12685 from fitzgen:bforest-fallible-insertion-methods to bytecodealliance:main:
fitzgen requested wasmtime-fuzz-reviewers for a review on PR #12685.
fitzgen requested wasmtime-compiler-reviewers for a review on PR #12685.
fitzgen requested alexcrichton for a review on PR #12685.
fitzgen requested wasmtime-default-reviewers for a review on PR #12685.
alexcrichton commented on PR #12685:
Before going too too far down this route (this PR is fine in isolation though), could you detail a bit more the end state here? I'm assuming this is eventually going to make its way into the `ModuleRegistry` within a `Store` in wasmtime to replace the `BTreeMap`s there currently. I'm not familiar with the data structures in this crate, but currently `K` and `V` are both bounded by `Copy`, which doesn't satisfy the current usage within `ModuleRegistry`.

So overall I'm not quite sure I see the vision of where this is going to end up, hence the question -- could you detail a bit more about how this is going to make its way into Wasmtime? We have a surprisingly large number of `BTreeMap`s used in the runtime, and is this aimed at all of them or just a subset?
github-actions[bot] added the label cranelift on PR #12685.
github-actions[bot] added the label fuzzing on PR #12685.
github-actions[bot] commented on PR #12685:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "fuzzing"

Thus the following users have been cc'd because of the following labels:
- fitzgen: fuzzing
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
fitzgen commented on PR #12685:
> Before going too too far down this route (this PR is fine in isolation though), could you detail a bit more the end state here? I'm assuming this is eventually going to make its way into the `ModuleRegistry` within a `Store` in wasmtime to replace the `BTreeMap`s there currently. I'm not familiar with the data structures in this crate, but currently `K` and `V` are both bounded by `Copy`, which doesn't satisfy the current usage within `ModuleRegistry`.
>
> So overall I'm not quite sure I see the vision of where this is going to end up, hence the question -- could you detail a bit more about how this is going to make its way into Wasmtime? We have a surprisingly large number of `BTreeMap`s used in the runtime, and is this aimed at all of them or just a subset?

Yeah, we do use `BTreeMap`s in a bunch of places in the core runtime, and we use them in ways that make it hard to replace them with other data structures.

My plan is to create a `wasmtime_environ::collections::BTreeMap` that looks something like this:

```rust
#[derive(...)]
struct BTreeMapValueIndex(u32);
entity_impl!(BTreeMapValueIndex);

pub struct BTreeMap<K, V>
where
    K: Copy,
{
    values: PrimaryMap<BTreeMapValueIndex, V>,
    forest: cranelift_bforest::MapForest<K, BTreeMapValueIndex>,
    map: cranelift_bforest::Map<K, BTreeMapValueIndex>,
}

impl<K, V> BTreeMap<K, V>
where
    K: Copy,
{
    pub fn get(&self, key: K) -> Option<&V>
    where
        K: Ord,
    {
        let idx = self.map.get(key, &self.forest, &())?;
        Some(&self.values[idx])
    }

    // etc...
}
```

I haven't done a full audit, but everything I've encountered so far has `Copy` keys (which, as you note, `cranelift_bforest::Map` requires), so we wouldn't need another indirection for keys in the map like we have with values here. But if we have to, we can add that as well. It would unfortunately have the extra wrinkle of requiring some kind of hash consing or something, but that is not insurmountable.

If you are concerned about the performance implications this extra indirection has, we can put it behind a cargo feature. When the feature is not enabled, we would wrap `std::collections::BTreeMap` instead of doing the above, exposing the same OOM-handling methods/interface/API to users of the collection, but not actually attempting to handle OOM internally in this scenario.
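For comparison's sake, the general shape of such a fallible-insertion API can be sketched with plain `std` types. This is a hypothetical illustration (the names `TryVecMap` and `try_insert` are made up here, not the actual `cranelift_bforest` or Wasmtime API), using `Vec::try_reserve` to surface allocation failure as an error instead of aborting:

```rust
use std::collections::TryReserveError;

/// A tiny sketch (not Wasmtime's actual API) of the fallible-insertion
/// pattern: reserve capacity fallibly before mutating, so allocation
/// failure is reported as an error rather than aborting the process.
pub struct TryVecMap<K, V> {
    // Kept sorted by key so lookups can use binary search.
    entries: Vec<(K, V)>,
}

impl<K: Ord, V> TryVecMap<K, V> {
    pub fn new() -> Self {
        TryVecMap { entries: Vec::new() }
    }

    /// Insert `key -> value`, returning the previous value (if any) on
    /// success, or `Err` if the underlying allocation fails.
    pub fn try_insert(&mut self, key: K, value: V) -> Result<Option<V>, TryReserveError> {
        match self.entries.binary_search_by(|(k, _)| k.cmp(&key)) {
            Ok(i) => Ok(Some(std::mem::replace(&mut self.entries[i].1, value))),
            Err(i) => {
                // The only allocating step: make it fallible up front so the
                // subsequent `insert` cannot trigger a reallocation.
                self.entries.try_reserve(1)?;
                self.entries.insert(i, (key, value));
                Ok(None)
            }
        }
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.entries
            .binary_search_by(|(k, _)| k.cmp(key))
            .ok()
            .map(|i| &self.entries[i].1)
    }
}
```

The key point is that the only allocating operation happens fallibly before any mutation, which mirrors what fallible-insertion methods on `bforest` maps aim to provide.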
fitzgen edited a comment on PR #12685.
alexcrichton commented on PR #12685:
Ah ok that makes sense yeah. And to confirm, I'm not familiar with the implementation of bforest, but given the documented caveats you're confident that this is a suitable replacement for our usages of `BTreeMap`? I'm not sure if that `BTreeMap` would, for example, end up being many times larger in the by-value size than `std::collections::BTreeMap` with the various collections it contains.

Also, that design may be a bit tricky insofar as it won't easily support removal without further maps/etc for tracking that. I don't think we need to remove from `Store`-based maps in `ModuleRegistry`, but we will need a solution for removing from the global map that has all loaded modules.

Perf-wise I'm not familiar enough with bforest to have much of an opinion on that. I don't think we need a hyper-optimal data structure here, just something that isn't O(n) in the worst case. I do think this is a reasonable enough time to rethink our data structures, though, perhaps?

For example, `BTreeMap` feels rightly necessary for the global map (or some sort of sorted map thingy), but for `ModuleRegistry` in theory `BTreeMap` is way overkill for our use case. Almost all stores have just a single module or component in them and it's quite rare to have a very large number, so we might actually be able to get away with just a plain old sorted `Vec<T>`. I realize that I just also said we shouldn't do O(n) things, which runs counter to that, but we could in theory also dynamically switch between a `Vec<T>` and a map after crossing some threshold (e.g. 100+ modules).

It's been a bit though since I looked at `ModuleRegistry` too closely, but WDYT about maybe first refactoring to remove some `BTreeMap`s to reduce the number of constraints on our future `TryBTreeMap`?
fitzgen commented on PR #12685:
> Ah ok that makes sense yeah. And to confirm, I'm not familiar with the implementation of bforest, but given the documented caveats you're confident that this is a suitable replacement for our usages of `BTreeMap`? I'm not sure if that `BTreeMap` would, for example, end up being many times larger in the by-value size than `std::collections::BTreeMap` with the various collections it contains.

Those documented caveats seem workable, AFAICT.

I haven't investigated exact size differences, but I think it should be within a constant factor of the same size.

> Also, that design may be a bit tricky insofar as it won't easily support removal without further maps/etc for tracking that. I don't think we need to remove from `Store`-based maps in `ModuleRegistry`, but we will need a solution for removing from the global map that has all loaded modules.

I guess my initial sketch had `values: PrimaryMap<BTreeMapValueIndex, V>,` but we probably actually want `values: Slab<V>,` so we have a free list for reusing entries and don't require a value for every entry in the arena. Then `remove` is simply something like this:

```rust
pub fn remove(&mut self, key: K) -> Option<V> {
    let id = self.map.remove(key, &mut self.forest, &())?;
    Some(self.values.dealloc(id))
}
```

> but for `ModuleRegistry` in theory `BTreeMap` is way overkill for our use case. Almost all stores have just a single module or component in them and it's quite rare to have a very large number, so we might actually be able to get away with just a plain old sorted `Vec<T>`. I realize that I just also said we shouldn't do O(n) things, which runs counter to that, but we could in theory also dynamically switch between a `Vec<T>` and a map after crossing some threshold (e.g. 100+ modules).
>
> It's been a bit though since I looked at `ModuleRegistry` too closely, but WDYT about maybe first refactoring to remove some `BTreeMap`s to reduce the number of constraints on our future `TryBTreeMap`?

Generally a fan of re-thinking data structures where it makes sense as we go through and touch various things.

However, in the particular case of `ModuleRegistry`, and doing something like

```rust
enum ModuleRegistryData {
    Few(Vec<Module>),
    Many(BTreeMap<RegisteredModuleId, Module>),
}
```

I'm fairly meh. We need a `BTreeMap` implementation either way, so it isn't saving implementation work. Also, a single BTree node is roughly equivalent to a `Vec` of a small size if you squint, so I am not sure it would make a meaningful perf difference.

What might be worth it would be something like this tho:

```rust
enum ModuleRegistryData {
    One(Module),
    Many(BTreeMap<RegisteredModuleId, Module>),
}
```

This would let us avoid any allocation at all in the extremely common case of a single module in the registry.
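For illustration, the `One`/`Many` promotion described above could be sketched like this with `std` types; `u64` and `String` stand in for `RegisteredModuleId` and `Module`, and all names here are hypothetical:

```rust
use std::collections::BTreeMap;

/// Sketch of the allocation-avoiding registry shape: the common
/// single-module case stores the entry inline; a real map is only
/// allocated once a second module is registered.
enum RegistryData {
    Empty,
    One(u64, String),
    Many(BTreeMap<u64, String>),
}

impl RegistryData {
    fn insert(&mut self, id: u64, module: String) {
        match self {
            // First module: no container allocation at all.
            RegistryData::Empty => *self = RegistryData::One(id, module),
            // Second module: promote the inline entry into a map.
            RegistryData::One(..) => {
                let mut map = BTreeMap::new();
                if let RegistryData::One(k, v) = std::mem::replace(self, RegistryData::Empty) {
                    map.insert(k, v);
                }
                map.insert(id, module);
                *self = RegistryData::Many(map);
            }
            RegistryData::Many(map) => {
                map.insert(id, module);
            }
        }
    }

    fn get(&self, id: u64) -> Option<&String> {
        match self {
            RegistryData::Empty => None,
            RegistryData::One(k, v) if *k == id => Some(v),
            RegistryData::One(..) => None,
            RegistryData::Many(map) => map.get(&id),
        }
    }
}
```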
alexcrichton submitted PR review:
Well, in any case, this is of course fine as-is, and probably best to keep discussing in further PRs/issues/etc.
> I think it should be within a constant factor of the same size

Mind double-checking? I'd basically want to make sure the constant isn't so big as to be egregious. `Store<T>` is probably already way too big as-is...

> we probably actually want `values: Slab<V>,`

makes sense yeah

> and doing something like ... I'm fairly meh

I agree yeah, I don't think we've crossed the complexity threshold to those just yet, and I agree it doesn't remove the need for `BTreeMap`, so it's probably fine to defer that to some future optimizations.

Along the lines of rethinking data structures, one example is that within `ModuleRegistry` I don't think that either of `ModuleRegistry::{store_code, modules}` need to be a `BTreeMap`. Those look like they can both be hash maps. `ModuleRegistry::loaded_code` is kind of the most fundamental one, but we can probably get away with removing `LoadedCode::modules`. The latter needs range queries, but we can probably switch that to a sorted vector since it's not dynamically added to (it's a per-component thing, so we could precompute it at component-load-time or something like that).

The `GlobalRegistry` is also pretty fundamentally a `BTreeMap`, but overall I think we'll only need two in the long run?
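The sorted-vector replacement for the range queries mentioned above might look roughly like this. It's a generic sketch with made-up names (not the actual `LoadedCode` layout), using `partition_point` for the binary search:

```rust
/// Sketch: a precomputed, sorted `(start, len, id)` table that answers
/// "which entry covers this address?" queries, as a stand-in for a
/// `BTreeMap` range lookup. Assumes the ranges are non-overlapping.
struct SortedRanges {
    // Sorted by `start` at construction time; never mutated afterwards.
    entries: Vec<(usize, usize, u32)>,
}

impl SortedRanges {
    fn new(mut entries: Vec<(usize, usize, u32)>) -> Self {
        entries.sort_by_key(|&(start, _, _)| start);
        SortedRanges { entries }
    }

    /// Find the entry whose `[start, start + len)` range contains `addr`.
    /// Equivalent to `map.range(..=addr).next_back()` plus a bounds check.
    fn lookup(&self, addr: usize) -> Option<u32> {
        // Index of the first entry with `start > addr`; the candidate is
        // the entry just before it (if any).
        let i = self.entries.partition_point(|&(start, _, _)| start <= addr);
        let (start, len, id) = *self.entries.get(i.checked_sub(1)?)?;
        if addr < start + len { Some(id) } else { None }
    }
}
```

Since the table is built once and only queried afterwards, there is no O(n) insertion cost on the lookup path.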
cfallin commented on PR #12685:
> don't think ... `ModuleRegistry::modules` needs to be a BTreeMap

Agreed, but one caveat I should mention is that its iteration order is exposed via the debug APIs (`Store::debug_all_modules()`), so if we use a HashMap instead we should probably sort in that method before returning, for determinism. (We don't have a strict determinism guarantee stated for debugging, but it's nice to trend in that direction IMHO.) Just something to keep in mind in general I guess.
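The sort-before-returning idea could be as simple as the following sketch (generic, hypothetical names; not the actual `Store::debug_all_modules()` implementation):

```rust
use std::collections::HashMap;

/// Sketch: expose a `HashMap`'s contents in a deterministic order by
/// sorting keys at the API boundary, rather than relying on the
/// unstable iteration order of the hash map itself.
fn sorted_values<K: Ord + Copy, V: Clone>(map: &HashMap<K, V>) -> Vec<(K, V)> {
    let mut out: Vec<(K, V)> = map.iter().map(|(&k, v)| (k, v.clone())).collect();
    out.sort_by_key(|&(k, _)| k);
    out
}
```

The sort costs O(n log n) per call, which is typically acceptable on a debug-only path.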
alexcrichton commented on PR #12685:
Good point yeah, that'd be a better fit for `IndexMap` than `HashMap` in that case
fitzgen added PR #12685 Add insertion methods with fallible allocation to cranelift_bforest::{Map,Set} to the merge queue
fitzgen merged PR #12685.
fitzgen removed PR #12685 Add insertion methods with fallible allocation to cranelift_bforest::{Map,Set} from the merge queue
fitzgen commented on PR #12685:
> Mind double-checking? I'd basically want to make sure the constant isn't so big as to be egregious. `Store<T>` is probably already way too big as-is...

Whipped up a quick example to check the max resident size (in kilobytes) of a process that inserts `10_000_000` entries into a `BTreeMap<K, V>` with `std::collections::BTreeMap<K, V>` vs the `bforest`-based `BTreeMap` sketched above.

It looks like the overheads get smaller the larger the value gets, and for `u128` and larger values, the `bforest`-based implementation actually starts outperforming the `std` implementation:

| K | V | std-based | bforest-based | Overhead (bforest/std) |
|---|---|-----------|---------------|------------------------|
| u32 | u8 | 154720 | 288572 | 1.87 |
| u32 | u32 | 206880 | 288588 | 1.39 |
| u32 | u64 | 284708 | 327684 | 1.15 |
| u32 | u128 | 414908 | 405752 | 0.98 |
| u32 | [u8; 64] | 1274536 | 874644 | 0.69 |

Increasing the size of the keys does cause more overhead in the `bforest`-based implementation, but not by too much:

| K | V | std-based | bforest-based | Overhead (bforest/std) |
|---|---|-----------|---------------|------------------------|
| u32 | u128 | 414908 | 405752 | 0.98 |
| u64 | u128 | 493052 | 509948 | 1.03 |
| u128 | u128 | 649228 | 718268 | 1.11 |
| [u128; 2] | u128 | 935820 | 1082820 | 1.16 |

Overall, I think this is probably fine?
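As an aside, one way to take this kind of measurement from inside a Rust process on Linux is to read the `VmHWM` (peak resident set size) field from `/proc/self/status`. This is just an illustrative sketch, not necessarily how the numbers above were collected (they could equally have come from `/usr/bin/time`):

```rust
/// Read this process's peak resident set size ("VmHWM", in kB) from
/// /proc/self/status. Linux-only; returns None elsewhere or if the
/// field cannot be parsed.
fn max_resident_kb() -> Option<u64> {
    let status = std::fs::read_to_string("/proc/self/status").ok()?;
    for line in status.lines() {
        // The line looks like: "VmHWM:      123456 kB"
        if let Some(rest) = line.strip_prefix("VmHWM:") {
            return rest.trim().trim_end_matches(" kB").trim().parse().ok();
        }
    }
    None
}
```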
alexcrichton commented on PR #12685:
Sounds reasonable to me :+1:, thanks for checking!
Last updated: Mar 23 2026 at 16:19 UTC