Stream: git-cranelift

Topic: cranelift / Issue #642 Impact of bounds checks on compile...


view this post on Zulip GitHub (Feb 28 2020 at 23:25):

alexcrichton transferred Issue #642:

Cranelift uses arrays and indices extensively, rather than pointer-based data structures. This allows us to use 32-bit indices rather than 64-bit pointers on 64-bit platforms, and has some other nice properties, however these indices appear to Rust as random accesses, so they get bounds checked.

The following patch to disable bounds checks in PrimaryMap speeds up Cranelift compile times by over 8%, which is considerable:

diff --git a/lib/entity/src/primary.rs b/lib/entity/src/primary.rs
index 6d710c2e..911223bd 100644
--- a/lib/entity/src/primary.rs
+++ b/lib/entity/src/primary.rs
@@ -150,7 +150,7 @@ where
     type Output = V;

     fn index(&self, k: K) -> &V {

-        &self.elems[k.index()]
+        unsafe { self.elems.get_unchecked(k.index()) }
     }
 }

@@ -160,7 +160,7 @@ where
     K: EntityRef,
 {
     fn index_mut(&mut self, k: K) -> &mut V {

-        &mut self.elems[k.index()]
+        unsafe { self.elems.get_unchecked_mut(k.index()) }
     }
 }

Of course, this is not completely safe.

But, the common pattern of these indices in Cranelift is to use them relatively safely. We don't use this indices for iterating; we use custom index types that don't even permit arithmetic. Most PrimaryMap keys are produced by PrimaryMap itself when we push new entries:

let key = my_primary_map.push(x); // returns the key of the newly pushed element

These keys are in bounds from the start.

Yet, there are some ways things could go wrong. Besides a few minor details which a relatively easy to fix, there are a few deep problems. Nothing outright prevents keys from living across calls to clear(). Similarly, nothing prevents indices from living while we exit a PrimaryMap's scope and then re-enter it, creating a new PrimaryMap. And, while we use distinct index types to catch using an index in the wrong map, and don't tend to use more than one instance of PrimaryMap with the same type at a time, nothing enforces that.

We don't tend to do the kinds of things which would lead to trouble. But all the same, disabling the bounds checks is not completely safe.

So I don't yet know what the best thing to do here is. Should we add a cargo feature to allow users to decide whether they want the safe or unsafe version? Is there a way to do what we're doing in Rust safely without paying for bounds checks? Should we re-architect Cranelift's core data structures to some other form?

cc @nnethercote


Last updated: Oct 23 2024 at 20:03 UTC