I have been benchmarking rustc with LLVM and cg_clif today. One fun thing I noticed is that, when comparing cg_llvm and cg_clif, the cpu-clock metric improves more than the instruction count. This seems to suggest that Cranelift has a slightly higher IPC than LLVM.
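To spell out the arithmetic behind that inference: IPC is instructions divided by cycles, so if time drops by more than the instruction count does, IPC must have gone up. A minimal sketch, assuming both runs execute at the same clock frequency (so cpu-clock time is proportional to cycles); `relative_ipc` is a hypothetical helper and the ratios in the note below are made-up numbers, not measurements from this thread:

```rust
/// Relative IPC of run B versus run A.
///
/// `instr_ratio` is instructions(B) / instructions(A) and
/// `time_ratio` is cpu_clock(B) / cpu_clock(A).
/// Assumption: both runs use the same clock frequency, so the time
/// ratio stands in for the cycle ratio.
fn relative_ipc(instr_ratio: f64, time_ratio: f64) -> f64 {
    // IPC = instructions / cycles, so the ratio of IPCs is the
    // instruction ratio divided by the cycle (here: time) ratio.
    instr_ratio / time_ratio
}
```

For example, if one backend needed 0.80x the instructions but only 0.75x the cpu-clock time of the other, `relative_ipc(0.80, 0.75)` comes out at roughly 1.067, i.e. about 7% higher IPC.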
Interesting, thanks for measuring that! I suspect this may be due to better cache locality (LLVM has heavily pointer-based IR data structures, whereas Cranelift is largely index-based with dense arrays).
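To illustrate the distinction, here is a toy sketch of the two layouts. These are hypothetical types made up for this example, not the actual IR data structures of LLVM or Cranelift. In the pointer-based version every node is a separate heap allocation, so a traversal chases pointers to wherever the allocator put each node; in the index-based version all nodes sit in one contiguous `Vec` and references are small integer indices.

```rust
/// Pointer-based list: every node is its own heap allocation.
struct PtrNode {
    value: u64,
    next: Option<Box<PtrNode>>,
}

/// Builds a pointer-based list holding `values` in order.
fn build_ptr(values: &[u64]) -> Option<Box<PtrNode>> {
    let mut head = None;
    for &value in values.iter().rev() {
        head = Some(Box::new(PtrNode { value, next: head }));
    }
    head
}

/// Walks the list by following one pointer per node.
fn sum_ptr(head: &Option<Box<PtrNode>>) -> u64 {
    let mut total = 0;
    let mut cur = head.as_deref();
    while let Some(node) = cur {
        total += node.value;
        cur = node.next.as_deref();
    }
    total
}

/// Index-based reference: a 32-bit index into a dense array.
#[derive(Clone, Copy)]
struct NodeId(u32);

struct IdxNode {
    value: u64,
    next: Option<NodeId>,
}

/// All nodes live next to each other in a single allocation.
struct Arena {
    nodes: Vec<IdxNode>,
}

impl Arena {
    /// Builds an arena where node `i` links to node `i + 1`.
    fn from_values(values: &[u64]) -> Self {
        let n = values.len();
        let nodes = values
            .iter()
            .enumerate()
            .map(|(i, &value)| IdxNode {
                value,
                next: if i + 1 < n { Some(NodeId((i + 1) as u32)) } else { None },
            })
            .collect();
        Arena { nodes }
    }

    /// Walks the list by indexing into the dense array.
    fn sum(&self) -> u64 {
        let mut total = 0;
        let mut cur = if self.nodes.is_empty() { None } else { Some(NodeId(0)) };
        while let Some(NodeId(i)) = cur {
            let node = &self.nodes[i as usize];
            total += node.value;
            cur = node.next;
        }
        total
    }
}
```

Both traversals compute the same result; the difference is only in where the nodes live in memory and how large each reference is (a 32-bit index versus a 64-bit pointer on typical targets).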
It might also be due to bounds checks. I noticed the same behavior when benchmarking a Rust app and its C++ equivalent: the Rust one had a bounds check on the critical path, which increased the instruction count but not the CPU time, thanks to branch prediction.
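As a rough illustration of that kind of bounds check (a made-up example, not the app that was benchmarked): indexing a slice with an arbitrary index makes the compiler emit a compare and a branch to the panic path, which the unchecked variant skips. The function names are hypothetical, and whether a given check survives optimization depends on the surrounding code.

```rust
/// Sums `data[i]` for each `i` in `indices`, with bounds checks.
/// Each `data[i]` compiles to a length comparison plus a branch to
/// the panic path. When the indices are always in range, that branch
/// is never taken, so it is easy for the branch predictor.
fn sum_indexed(data: &[u32], indices: &[usize]) -> u64 {
    let mut total = 0u64;
    for &i in indices {
        total += u64::from(data[i]);
    }
    total
}

/// Same loop without the bounds checks.
///
/// # Safety
/// Every index in `indices` must be less than `data.len()`.
unsafe fn sum_unchecked(data: &[u32], indices: &[usize]) -> u64 {
    let mut total = 0u64;
    for &i in indices {
        // SAFETY: the caller guarantees `i < data.len()`.
        total += u64::from(unsafe { *data.get_unchecked(i) });
    }
    total
}
```

The checked version executes more instructions per element, but extra instructions that are cheap and well predicted do not necessarily cost extra time.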
Last updated: Jan 24 2025 at 00:11 UTC