alexcrichton transferred Issue #642:
Cranelift uses arrays and indices extensively, rather than pointer-based data structures. This allows us to use 32-bit indices rather than 64-bit pointers on 64-bit platforms, and has some other nice properties, however these indices appear to Rust as random accesses, so they get bounds checked.
The following patch to disable bounds checks in
PrimaryMap
speeds up Cranelift compile times by over 8%, which is considerable:diff --git a/lib/entity/src/primary.rs b/lib/entity/src/primary.rs index 6d710c2e..911223bd 100644 --- a/lib/entity/src/primary.rs +++ b/lib/entity/src/primary.rs @@ -150,7 +150,7 @@ where type Output = V; fn index(&self, k: K) -> &V { - &self.elems[k.index()] + unsafe { self.elems.get_unchecked(k.index()) } } } @@ -160,7 +160,7 @@ where K: EntityRef, { fn index_mut(&mut self, k: K) -> &mut V { - &mut self.elems[k.index()] + unsafe { self.elems.get_unchecked_mut(k.index()) } } }Of course, this is not completely safe.
But, the common pattern of these indices in Cranelift is to use them relatively safely. We don't use this indices for iterating; we use custom index types that don't even permit arithmetic. Most
PrimaryMap
keys are produced byPrimaryMap
itself when we push new entries:let key = my_primary_map.push(x); // returns the key of the newly pushed element
These keys are in bounds from the start.
Yet, there are some ways things could go wrong. Besides a few minor details which a relatively easy to fix, there are a few deep problems. Nothing outright prevents keys from living across calls to
clear()
. Similarly, nothing prevents indices from living while we exit aPrimaryMap
's scope and then re-enter it, creating a newPrimaryMap
. And, while we use distinct index types to catch using an index in the wrong map, and don't tend to use more than one instance ofPrimaryMap
with the same type at a time, nothing enforces that.We don't tend to do the kinds of things which would lead to trouble. But all the same, disabling the bounds checks is not completely safe.
So I don't yet know what the best thing to do here is. Should we add a cargo feature to allow users to decide whether they want the safe or unsafe version? Is there a way to do what we're doing in Rust safely without paying for bounds checks? Should we re-architect Cranelift's core data structures to some other form?
cc @nnethercote
Last updated: Dec 23 2024 at 13:07 UTC