jameysharp opened PR #5850 from branch-once
to main
:
This uses the
cmov
that was necessary anyway for Spectre mitigation to clamp the table index, instead of zeroing it. By then placing the default target as the last entry in the table, we can use just one branch instruction in all cases.This is a net savings of two bytes in the encoding of x64's br_table pseudoinstruction.
I haven't done any benchmarking yet. I'm guessing either this will be faster because it doesn't branch as often, or it will be slower because it executes more instructions in the common case where the bounds check succeeds.
I also haven't updated the comments in the implementation because first I wanted to see if this works at all (it passes the test suite!) and whether folks have strong arguments for or against this change.
It's just a random idea I had and thought I'd try out.
jameysharp requested cfallin for a review on PR #5850.
jameysharp updated PR #5850 from branch-once
to main
.
jameysharp has marked PR #5850 as ready for review.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Can we add a comment here that this is safe because
tmp2
is written only afteridx
is used, and likewise a comment inemit.rs
in the listing of the emitted sequence that if this property changes, we need to update regalloc metadata?
jameysharp updated PR #5850 from branch-once
to main
.
jameysharp has enabled auto merge for PR #5850.
jameysharp merged PR #5850.
Last updated: Dec 23 2024 at 12:05 UTC