jameysharp opened issue #5829:
Feature
The
EntityList/ListPooltypes declared incranelift/entity/src/list.rsallocate arrays from a large pool of memory. Empty lists use no space in the pool. Otherwise, lists use a minimum of 4 elements of the pool. With a little cleverness, I believe we can arrange that lists of length 1 use no space in the pool either. Alternatively, we can at least reduce the minimum pool allocation to 2 elements instead of 4 and test whether that's an improvement.Benefit
If we have a lot of length-1
EntityLists, then this could save a lot of memory. It might also save on memory bandwidth, reduce cache pressure, and avoid pipeline stalls from chasing indirections.Part of completing this issue is checking whether this change really does improve performance. It could be that the extra complexity of handling singleton lists specially takes more time than we save, or that most lists have at least two elements, or that I've misjudged the possible microarchitectural benefits.
That said, my guess is that we do have a lot of length-1 lists, especially now that
BlockCalls are stored in theValueListPool. (A branch to a block that takes no block-parameters is stored as a singleton list containing only theBlockindex.)Implementation
The basic idea is similar to smallvec.
EntityListholds an index to the location of the list within theListPool. For a singleton list, we can store the one list element inside theEntityList, instead of interpreting it as an index into the pool.There are two details that make this worth considering:
All of our types that implement
EntityRefareu32internally, and so isEntityList. If entities were larger this wouldn't be worth doing.Some bit-patterns in an
EntityListare unused. A storage block within aListPoolalways starts at an offset that is at least a multiple of 4, and the index stored inEntityListis one past the start of the block. So currently, anEntityListis either zero (representing an empty list), or its least-significant two bits are01.And one detail that makes this tricky: An
EntityListcan return a slice, possibly mutable, of its contents. To make this work with singleton lists stored directly inside thestruct EntityList<T>itself, we'd need to be able to borrow a&mut [T]out of it without enabling callers to break our invariants.One way to do that is when someone asks for a mutable slice, promote a singleton list into a pool-backed list. This is bad if we need to do that for most lists, since it undoes any advantages we might get from making this change in the first place. For
BlockCallspecifically we can do better, because at any given time we only want to access either the first element or a slice of the remaining elements, and for singleton lists we can return&mut []without needing any backing storage.Alternatives
The simplest alternative which might help is to reduce the minimum block allocation size from 4 to 2 in
ListPool. If all our lists were singletons, that change would cut memory usage in these pools by 50%; counting the size of thestruct EntityListit's a savings of 40%. That might be a better choice than all of the above due to its simplicity, but doesn't save as much memory.
jameysharp labeled issue #5829:
Feature
The
EntityList/ListPooltypes declared incranelift/entity/src/list.rsallocate arrays from a large pool of memory. Empty lists use no space in the pool. Otherwise, lists use a minimum of 4 elements of the pool. With a little cleverness, I believe we can arrange that lists of length 1 use no space in the pool either. Alternatively, we can at least reduce the minimum pool allocation to 2 elements instead of 4 and test whether that's an improvement.Benefit
If we have a lot of length-1
EntityLists, then this could save a lot of memory. It might also save on memory bandwidth, reduce cache pressure, and avoid pipeline stalls from chasing indirections.Part of completing this issue is checking whether this change really does improve performance. It could be that the extra complexity of handling singleton lists specially takes more time than we save, or that most lists have at least two elements, or that I've misjudged the possible microarchitectural benefits.
That said, my guess is that we do have a lot of length-1 lists, especially now that
BlockCalls are stored in theValueListPool. (A branch to a block that takes no block-parameters is stored as a singleton list containing only theBlockindex.)Implementation
The basic idea is similar to smallvec.
EntityListholds an index to the location of the list within theListPool. For a singleton list, we can store the one list element inside theEntityList, instead of interpreting it as an index into the pool.There are two details that make this worth considering:
All of our types that implement
EntityRefareu32internally, and so isEntityList. If entities were larger this wouldn't be worth doing.Some bit-patterns in an
EntityListare unused. A storage block within aListPoolalways starts at an offset that is at least a multiple of 4, and the index stored inEntityListis one past the start of the block. So currently, anEntityListis either zero (representing an empty list), or its least-significant two bits are01.And one detail that makes this tricky: An
EntityListcan return a slice, possibly mutable, of its contents. To make this work with singleton lists stored directly inside thestruct EntityList<T>itself, we'd need to be able to borrow a&mut [T]out of it without enabling callers to break our invariants.One way to do that is when someone asks for a mutable slice, promote a singleton list into a pool-backed list. This is bad if we need to do that for most lists, since it undoes any advantages we might get from making this change in the first place. For
BlockCallspecifically we can do better, because at any given time we only want to access either the first element or a slice of the remaining elements, and for singleton lists we can return&mut []without needing any backing storage.Alternatives
The simplest alternative which might help is to reduce the minimum block allocation size from 4 to 2 in
ListPool. If all our lists were singletons, that change would cut memory usage in these pools by 50%; counting the size of thestruct EntityListit's a savings of 40%. That might be a better choice than all of the above due to its simplicity, but doesn't save as much memory.
jameysharp labeled issue #5829:
Feature
The
EntityList/ListPooltypes declared incranelift/entity/src/list.rsallocate arrays from a large pool of memory. Empty lists use no space in the pool. Otherwise, lists use a minimum of 4 elements of the pool. With a little cleverness, I believe we can arrange that lists of length 1 use no space in the pool either. Alternatively, we can at least reduce the minimum pool allocation to 2 elements instead of 4 and test whether that's an improvement.Benefit
If we have a lot of length-1
EntityLists, then this could save a lot of memory. It might also save on memory bandwidth, reduce cache pressure, and avoid pipeline stalls from chasing indirections.Part of completing this issue is checking whether this change really does improve performance. It could be that the extra complexity of handling singleton lists specially takes more time than we save, or that most lists have at least two elements, or that I've misjudged the possible microarchitectural benefits.
That said, my guess is that we do have a lot of length-1 lists, especially now that
BlockCalls are stored in theValueListPool. (A branch to a block that takes no block-parameters is stored as a singleton list containing only theBlockindex.)Implementation
The basic idea is similar to smallvec.
EntityListholds an index to the location of the list within theListPool. For a singleton list, we can store the one list element inside theEntityList, instead of interpreting it as an index into the pool.There are two details that make this worth considering:
All of our types that implement
EntityRefareu32internally, and so isEntityList. If entities were larger this wouldn't be worth doing.Some bit-patterns in an
EntityListare unused. A storage block within aListPoolalways starts at an offset that is at least a multiple of 4, and the index stored inEntityListis one past the start of the block. So currently, anEntityListis either zero (representing an empty list), or its least-significant two bits are01.And one detail that makes this tricky: An
EntityListcan return a slice, possibly mutable, of its contents. To make this work with singleton lists stored directly inside thestruct EntityList<T>itself, we'd need to be able to borrow a&mut [T]out of it without enabling callers to break our invariants.One way to do that is when someone asks for a mutable slice, promote a singleton list into a pool-backed list. This is bad if we need to do that for most lists, since it undoes any advantages we might get from making this change in the first place. For
BlockCallspecifically we can do better, because at any given time we only want to access either the first element or a slice of the remaining elements, and for singleton lists we can return&mut []without needing any backing storage.Alternatives
The simplest alternative which might help is to reduce the minimum block allocation size from 4 to 2 in
ListPool. If all our lists were singletons, that change would cut memory usage in these pools by 50%; counting the size of thestruct EntityListit's a savings of 40%. That might be a better choice than all of the above due to its simplicity, but doesn't save as much memory.
Last updated: Dec 13 2025 at 21:03 UTC