jakubDoka opened issue #5026:
Feature
Removal of
PackedOption
. MakingEntityRef
a generic struct, whereT
would be the type it is pointing to.Benefit
PackedOption
is making code verbose and unnatural. This proposal allows usingOption
while preserving advantages ofPackedOption
s compactness. Also eliminates need forReservedValue
.Changing
EntityRef
from trait to generic struct can greatly simplifycranelift_entity
implementation and save us some names when defining entity data. The structure can also implement all of the useful traits on one place so no shorthand macro is needed.Implementation
Here is a quick demonstration of how this can work. This solution requires (forever) unstable feature.
#![feature(rustc_attrs)] use std::{marker::PhantomData, mem, ops::{Index, IndexMut}}; pub struct PrimaryMap<T> { data: Vec<T>, } impl<T> PrimaryMap<T> { pub fn insert(&mut self, value: T) -> EntityRef<T> { let index = self.data.len(); self.data.push(value); EntityRef::new(index as u32).expect("no more entities can be spawned") } } impl<T> Index<EntityRef<T>> for PrimaryMap<T> { type Output = T; fn index(&self, index: EntityRef<T>) -> &Self::Output { &self.data[index.index()] } } impl<T> IndexMut<EntityRef<T>> for PrimaryMap<T> { fn index_mut(&mut self, index: EntityRef<T>) -> &mut Self::Output { &mut self.data[index.index()] } } // We might want to choose a different index representation but most common one is // as default. Also the name is a bit long for something common, `ERef` or `VRef` // (Virtual Reference) might be a better fit. #[repr(transparent)] pub struct EntityRef<T, R = NonMaxU32>(R, PhantomData<*const T>); impl<T, R: EntityRefRepr> EntityRef<T, R> { pub fn new(args: R::InitArgs) -> Option<Self> { R::new(args).map(|r| Self(r, PhantomData)) } pub fn index(&self) -> usize { self.0.index() } pub fn args(self) -> R::InitArgs { self.0.args() } } // Making representation as flexible as possible. pub trait EntityRefRepr: Sized { type InitArgs; fn new(args: Self::InitArgs) -> Option<Self>; fn index(&self) -> usize; fn args(self) -> Self::InitArgs; } // Compiler is able to optimize the size const _: () = assert!(std::mem::size_of::<Option<EntityRef<()>>>() == 4); // Works even on structures that contain non-optional `EntityRef` const _: () = assert!(std::mem::size_of::<Option<(EntityRef<(), u32)>>>() == 8); // If we cannot use nightly, this solution is basically useless :( #[rustc_layout_scalar_valid_range_end(4294967294)] #[repr(transparent)] pub struct NonMaxU32(u32); impl NonMaxU32 { pub fn new(value: u32) -> Option<Self> { unsafe { mem::transmute(Self(value)) } // might not be correct implementation } pub fn get(&self) -> u32 { self.0 } } impl EntityRefRepr for NonMaxU32 { type InitArgs = u32; fn new(args: Self::InitArgs) -> Option<Self> { Self::new(args) } fn index(&self) -> usize { self.get() as usize } fn args(self) -> Self::InitArgs { self.get() } }
More extensive demonstration can be found here.
bjorn3 commented on issue #5026:
rust // If we cannot use nightly, this solution is basically useless :(
Yeah, we can't use unstable.
jakubDoka commented on issue #5026:
Okay, was not sure about that, I'll close this.
jakubDoka closed issue #5026:
Feature
Removal of
PackedOption
. MakingEntityRef
a generic struct, whereT
would be the type it is pointing to.Benefit
PackedOption
is making code verbose and unnatural. This proposal allows usingOption
while preserving advantages ofPackedOption
s compactness. Also eliminates need forReservedValue
.Changing
EntityRef
from trait to generic struct can greatly simplifycranelift_entity
implementation and save us some names when defining entity data. The structure can also implement all of the useful traits on one place so no shorthand macro is needed.Implementation
Here is a quick demonstration of how this can work. This solution requires (forever) unstable feature.
#![feature(rustc_attrs)] use std::{marker::PhantomData, mem, ops::{Index, IndexMut}}; pub struct PrimaryMap<T> { data: Vec<T>, } impl<T> PrimaryMap<T> { pub fn insert(&mut self, value: T) -> EntityRef<T> { let index = self.data.len(); self.data.push(value); EntityRef::new(index as u32).expect("no more entities can be spawned") } } impl<T> Index<EntityRef<T>> for PrimaryMap<T> { type Output = T; fn index(&self, index: EntityRef<T>) -> &Self::Output { &self.data[index.index()] } } impl<T> IndexMut<EntityRef<T>> for PrimaryMap<T> { fn index_mut(&mut self, index: EntityRef<T>) -> &mut Self::Output { &mut self.data[index.index()] } } // We might want to choose a different index representation but most common one is // as default. Also the name is a bit long for something common, `ERef` or `VRef` // (Virtual Reference) might be a better fit. #[repr(transparent)] pub struct EntityRef<T, R = NonMaxU32>(R, PhantomData<*const T>); impl<T, R: EntityRefRepr> EntityRef<T, R> { pub fn new(args: R::InitArgs) -> Option<Self> { R::new(args).map(|r| Self(r, PhantomData)) } pub fn index(&self) -> usize { self.0.index() } pub fn args(self) -> R::InitArgs { self.0.args() } } // Making representation as flexible as possible. pub trait EntityRefRepr: Sized { type InitArgs; fn new(args: Self::InitArgs) -> Option<Self>; fn index(&self) -> usize; fn args(self) -> Self::InitArgs; } // Compiler is able to optimize the size const _: () = assert!(std::mem::size_of::<Option<EntityRef<()>>>() == 4); // Works even on structures that contain non-optional `EntityRef` const _: () = assert!(std::mem::size_of::<Option<(EntityRef<(), u32)>>>() == 8); // If we cannot use nightly, this solution is basically useless :( #[rustc_layout_scalar_valid_range_end(4294967294)] #[repr(transparent)] pub struct NonMaxU32(u32); impl NonMaxU32 { pub fn new(value: u32) -> Option<Self> { unsafe { mem::transmute(Self(value)) } // might not be correct implementation } pub fn get(&self) -> u32 { self.0 } } impl EntityRefRepr for NonMaxU32 { type InitArgs = u32; fn new(args: Self::InitArgs) -> Option<Self> { Self::new(args) } fn index(&self) -> usize { self.get() as usize } fn args(self) -> Self::InitArgs { self.get() } }
More extensive demonstration can be found here.
jameysharp commented on issue #5026:
PackedOption
has been bothering me too, so I was recently also thinking about ways to remove it.I'd considered using
NonZeroU32
and making the reserved value be 0 instead ofu32::MAX
. That would work on stable rustc, but I haven't gotten past idly thinking about it, so maybe there are other difficulties I haven't thought of. For example, does subtracting 1 to get a valid array index inSecondaryMap
impose a significant compile-time cost? Individually that's very cheap, but we index into those maps a lot, so maybe it matters.If you want to try a few things I'd be happy to answer questions and review PRs.
Regardless, I'm re-opening this issue because unless we have evidence that the current design is the best option we have, I think it's worth considering alternatives to
PackedOption
.
jameysharp reopened issue #5026:
Feature
Removal of
PackedOption
. MakingEntityRef
a generic struct, whereT
would be the type it is pointing to.Benefit
PackedOption
is making code verbose and unnatural. This proposal allows usingOption
while preserving advantages ofPackedOption
s compactness. Also eliminates need forReservedValue
.Changing
EntityRef
from trait to generic struct can greatly simplifycranelift_entity
implementation and save us some names when defining entity data. The structure can also implement all of the useful traits on one place so no shorthand macro is needed.Implementation
Here is a quick demonstration of how this can work. This solution requires (forever) unstable feature.
#![feature(rustc_attrs)] use std::{marker::PhantomData, mem, ops::{Index, IndexMut}}; pub struct PrimaryMap<T> { data: Vec<T>, } impl<T> PrimaryMap<T> { pub fn insert(&mut self, value: T) -> EntityRef<T> { let index = self.data.len(); self.data.push(value); EntityRef::new(index as u32).expect("no more entities can be spawned") } } impl<T> Index<EntityRef<T>> for PrimaryMap<T> { type Output = T; fn index(&self, index: EntityRef<T>) -> &Self::Output { &self.data[index.index()] } } impl<T> IndexMut<EntityRef<T>> for PrimaryMap<T> { fn index_mut(&mut self, index: EntityRef<T>) -> &mut Self::Output { &mut self.data[index.index()] } } // We might want to choose a different index representation but most common one is // as default. Also the name is a bit long for something common, `ERef` or `VRef` // (Virtual Reference) might be a better fit. #[repr(transparent)] pub struct EntityRef<T, R = NonMaxU32>(R, PhantomData<*const T>); impl<T, R: EntityRefRepr> EntityRef<T, R> { pub fn new(args: R::InitArgs) -> Option<Self> { R::new(args).map(|r| Self(r, PhantomData)) } pub fn index(&self) -> usize { self.0.index() } pub fn args(self) -> R::InitArgs { self.0.args() } } // Making representation as flexible as possible. pub trait EntityRefRepr: Sized { type InitArgs; fn new(args: Self::InitArgs) -> Option<Self>; fn index(&self) -> usize; fn args(self) -> Self::InitArgs; } // Compiler is able to optimize the size const _: () = assert!(std::mem::size_of::<Option<EntityRef<()>>>() == 4); // Works even on structures that contain non-optional `EntityRef` const _: () = assert!(std::mem::size_of::<Option<(EntityRef<(), u32)>>>() == 8); // If we cannot use nightly, this solution is basically useless :( #[rustc_layout_scalar_valid_range_end(4294967294)] #[repr(transparent)] pub struct NonMaxU32(u32); impl NonMaxU32 { pub fn new(value: u32) -> Option<Self> { unsafe { mem::transmute(Self(value)) } // might not be correct implementation } pub fn get(&self) -> u32 { self.0 } } impl EntityRefRepr for NonMaxU32 { type InitArgs = u32; fn new(args: Self::InitArgs) -> Option<Self> { Self::new(args) } fn index(&self) -> usize { self.get() as usize } fn args(self) -> Self::InitArgs { self.get() } }
More extensive demonstration can be found here.
cfallin commented on issue #5026:
It's definitely worth thinking about options (... uh, pun definitely not intended) here. That said, I'd be a little concerned about an "off by one" invariant: it feels like a bug waiting to happen, or at the very least, a source of confusion when debugging. In addition to that I suspect it might actually carry a measurable cost, given how frequently we access entities given their IDs.
IMHO we probably need to wait until we can encode the invariant for the compiler so it can reason about the niche on its own (i.e. wait for something like the above to become possible in stable Rust).
sunfishcode commented on issue #5026:
An RFC for a way to encode these kinds of invariants for the compiler to reason about was just opened yesterday: https://github.com/rust-lang/rfcs/pull/3334
meithecatte commented on issue #5026:
I'd be a little concerned about an "off by one" invariant: it feels like a bug waiting to happen, or at the very least, a source of confusion when debugging.
The way I'd approach this is to encapsulate this into a
NonMaxU32
type, that does the +/-1 internally during conversions. Then you get a seamless transition once Rust supports something like this natively, and in the meantime, you don't need to worry about correctness.I am not sure how much actual debuggers are used by cranelift developers (as opposed to
dbg!
). Handling them properly would require some kind of debugger integration. But cranelift already uses many debugger-unfriendly things – such as data that can't really be interpreted without external context structs, so I imagine thatdbg!
-style debugging would be much more common – which can be addressed with a properDebug
implementation.In addition to that I suspect it might actually carry a measurable cost, given how frequently we access entities given their IDs.
I wouldn't be too surprised if this turned out to be in the noise. When used as an index, the +/-1 will just be folded into the addressing mode. An extra byte to fetch, and probably zero cost when actually executing (this is me going purely off vibes, I don't know the exact numbers for address generation latency and I don't have the proficiency with the relevant documentation to look it up in reasonable time).
It might even be the case that it'll be a performance win, because the niche at 0 will be easier to test for than
u32::MAX
.In fact, there is a chance that the biggest difference will happen whether Rust has support for custom niches or not – the current implementation is "unsafe", with the check for accidentally creating the reserved value by converting a
u32
being only adebug_assert!
. Switching toNonZeroU32
will force actually performing this bounds-check.
jakubDoka commented on issue #5026:
you could also use an array container optimized for 1-based indexing, meaning that the base pointer is always offset by -1 to cancel the index +1 offset, tho miri will be angry
jakubDoka edited a comment on issue #5026:
you could also use an array container optimized for 1-based indexing, meaning that the base pointer is always offset by -1 to cancel the index +1 offset, tho miri will be angry
edit: another way that surprisingly respects provenance, is offsetting the base by length and subtracting the index, which inverts the order of everything
hanna-kruppe commented on issue #5026:
The sign-bound crate demonstrates a neat trick that works on stable and avoids extra arithmetic on indexing as well as unusual addressing modes that are less efficient on some architectures:
- Define a
repr(u8)
enum that has variants for every valid value of the most significant byte of your integers.sign-bound
uses this for positive and negative integers, so it has two enums of 127 variants each, but for cranelift-entity's purpose the range can be0..=254
.- Define a a
#[repr(C, align(4))]
struct that represents an u32 with the MSB stored as this enum, plus threeu8
s for the remaining bytes. The correct order of these fields depends oncfg(target_endian)
. Doing this for all integer types, assign-bound
does, needs some serious macro-fu, but cranelift-entity only needs the u32 case.I'll call this struct
NonMaxU32
for brevity, although that's not quite correct. This type is guaranteed to have the same representation in memory as an u32 integer thanks to therepr
annotations, so it can be transmuted into and fromu32
-- as long as the u32 is less than0xFF_00_00_00
. This restriction on the MSB enables rustc to makeOption<NonMaxU32>
the same size. While this layout optimization is not guaranteed, it's fairly simple and has existed for a long time, so I wouldn't be worried about relying on it for performance. However, there's some other limitations:
- There are no guarantees about the memory layout of
Option<NonMaxU32>
, nor about how it's handled by calling conventions (unlikerepr(transparent)
wrappers likeNonZeroU32
). So the guarantee aboutPackedOption
's layout added in https://github.com/bytecodealliance/wasmtime/pull/9697 can't be salvaged with this approach.- Getting good performance relies on
NonZeroU32
being passed around and operated on as an ordinary integer, not as an aggregate with four fields. This is not true for the C calling convention on several platforms. Empirically it seems to work fine with the Rust calling convention today (see https://rust.godbolt.org/z/3sd3K9sdK), but it's not guaranteed.- On some architectures,
0xFF_00_00_00
is a more awkward to encode as immediate operand thanu32::MAX
and0
. Thus, the comparisons done byEntityRef::new
and by pattern matching onOption<NonMaxU32>
may sometimes require an extra instruction or a longer instruction encoding.- Because the entire range
0xFF_00_00_00..=u32::MAX
is invalid, a checked conversionu32 -> Option<NonMaxU32>
has to do a comparison and normalize out-of-range values to theNone
representative (unlikeu32 -> Option<NonZeroU32>
, which is just a transmute).- It can't represent quite as many distinct values as a type that only excludes
u32::MAX
, but the difference is tiny, relatively speaking (4.278 billion vs 4.295 billion).
hanna-kruppe edited a comment on issue #5026:
The sign-bound crate demonstrates a neat trick that works on stable and avoids extra arithmetic on indexing as well as unusual addressing modes that are less efficient on some architectures:
- Define a
repr(u8)
enum that has variants for every valid value of the most significant byte of your integers.sign-bound
uses this for positive and negative integers, so it has two enums of 127 variants each, but for cranelift-entity's purpose the range can be0..=254
.- Define a a
#[repr(C, align(4))]
struct that represents an u32 with the MSB stored as this enum, plus threeu8
s for the remaining bytes. The correct order of these fields depends oncfg(target_endian)
. Doing this for all integer types, assign-bound
does, needs some serious macro-fu, but cranelift-entity only needs the u32 case.I'll call this struct
NonMaxU32
for brevity, although that's not quite correct. This type is guaranteed to have the same representation in memory as an u32 integer thanks to therepr
annotations, so it can be transmuted into and fromu32
-- as long as the u32 is less than0xFF_00_00_00
. This restriction on the MSB enables rustc to makeOption<NonMaxU32>
the same size. While this layout optimization is not guaranteed, it's fairly simple and has existed for a long time, so I wouldn't be worried about relying on it for performance. However, there's some other limitations:
- There are no guarantees about the memory layout of
Option<NonMaxU32>
, nor about how it's handled by calling conventions (unlikerepr(transparent)
wrappers likeNonZeroU32
). So the guarantee aboutPackedOption
's layout added in https://github.com/bytecodealliance/wasmtime/pull/9697 can't be salvaged with this approach.- Getting good performance relies on the struct being passed around and operated on as an ordinary integer, not as an aggregate with four fields. This is not true for the C calling convention on several platforms. Empirically it seems to work fine with the Rust calling convention today (see https://rust.godbolt.org/z/3sd3K9sdK), but it's not guaranteed.
- On some architectures,
0xFF_00_00_00
is a more awkward to encode as immediate operand thanu32::MAX
and0
. Thus, the comparisons done byEntityRef::new
and by pattern matching onOption<NonMaxU32>
may sometimes require an extra instruction or a longer instruction encoding.- Because the entire range
0xFF_00_00_00..=u32::MAX
is invalid, a checked conversionu32 -> Option<NonMaxU32>
has to do a comparison and normalize out-of-range values to theNone
representative (unlikeu32 -> Option<NonZeroU32>
, which is just a transmute).- It can't represent quite as many distinct values as a type that only excludes
u32::MAX
, but the difference is tiny, relatively speaking (4.278 billion vs 4.295 billion).
Last updated: Jan 24 2025 at 00:11 UTC