fitzgen labeled issue #6051:
I'm seeing a bunch of 3-5 target
br_tables inspidermonkey.wasm, and I suspect that a series of conditional branches might perform better than these smallbr_tables and their indirect jumps. That they are showing up inspidermonkey.wasmmeans that LLVM's Wasm backend is regularly emitting them, so this wouldn't be too overly specific of an optimization.Could potentially do this in the mid-end or during lowering. Haven't thought too much about it.
fitzgen opened issue #6051:
I'm seeing a bunch of 3-5 target
br_tables inspidermonkey.wasm, and I suspect that a series of conditional branches might perform better than these smallbr_tables and their indirect jumps. That they are showing up inspidermonkey.wasmmeans that LLVM's Wasm backend is regularly emitting them, so this wouldn't be too overly specific of an optimization.Could potentially do this in the mid-end or during lowering. Haven't thought too much about it.
cfallin commented on issue #6051:
We could definitely do it during lowering; we could have pseudoinstruction sequences for powers of two (2, 4, 8 targets at least) that are full binary trees, and if we have e.g. 7 targets instead then fill in the remainders with the default label.
One consideration here is to ensure that we don't lose the Spectre mitigation properties: we have a cmove to short-circuit the loaded target to default on out-of-bounds, and modern CPUs (i) guarantee that cmoves are not data-speculated, generally (or have a special incantation a la aarch64's
csdbbarrier to get this guarantee), and (ii) have thought specifically about the Spectre implications of indirect branch prediction and mitigated that. A tree of conditional branches may also be safe (I'm not aware of any CPUs that feed data from speculative execution back into the direction predictor) but we'd have to be sure.
afonso360 commented on issue #6051:
We have similar logic in the the switch interface where it can emit trees of branches if it thinks is beneficial. We could do a quick test by replacing all
br_tables with that, it also falls back tobr_tablewhen necessary.
jameysharp commented on issue #6051:
If you give the switch helper in cranelift-frontend a single contiguous range of cases, it will always generate a single
br_tablefor the whole range. The only exceptions are when there are no cases (which just jumps to the default case) or when there's exactly one case (which generates abrifto choose between the case and the default). So that isn't as helpful as it looks.I think I'd weakly argue against doing this during lowering, because I think that will impede branch optimization. As long as we represent the search tree explicitly in CLIF, the block lowering order will be able to see that control flow and have the opportunity to fuse jump-target blocks into the tree. If we use pseudo-instructions then the branches will be emitted too late for that.
Last updated: Dec 06 2025 at 06:05 UTC