fitzgen opened PR #12006 from fitzgen:dont-select-select to bytecodealliance:main:
selects cannot be speculated through on some of our targets (e.g. x64) so strongly prefer not choosing them.
Using a target-specific cost function, so that we only pessimize `select` when it makes sense, is left for follow up work and is tracked in #12005.
fitzgen requested alexcrichton for a review on PR #12006.
fitzgen requested wasmtime-compiler-reviewers for a review on PR #12006.
fitzgen edited PR #12006:
selects cannot be speculated through on some of our targets (e.g. x64) so strongly prefer not choosing them.
Using a target-specific cost function, so that we only pessimize `select` when it makes sense for the target, is left for follow up work and is tracked in #12005. FWIW, no CPUs on the market do value speculation today, as far as I am aware, so this particular case is somewhat hypothetical (but the larger goal of target-specific cost functions would still be useful).
cfallin commented on PR #12006:
> selects cannot be speculated through on some of our targets (e.g. x64) so strongly prefer not choosing them.
FWIW, I think this is losing a little nuance and the reality doesn't merit a cost of 50 (!). In more detail, select (cmove) works like any other multi-input operator on modern out-of-order CPUs; the main difference is that it has three inputs (flags/condition, source for the CMOVcc, old register value for the CMOVcc). When all three inputs are ready, the instruction will execute.
When folks say that "select isn't speculated through" what they mean is that the CPU doesn't have a condition predictor (like the branch predictor) that will allow the instruction to speculatively go without the condition input ready. It doesn't mean, however, that the instruction blocks all speculation or serves as a pipeline barrier/flush. That would be far worse!
Supporting source: Agner Fog's instruction latency tables show on a reasonable x86-64 baseline (Skylake, circa 2015), CMOVcc reg/reg form is 1 uop and has a latency of 1 cycle and a reciprocal throughput of 0.5 (so two CMOVccs can complete per cycle).
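To make the point above concrete, here is a minimal toy model (not a simulator of any real CPU) of how a three-input select is scheduled like any other dataflow operation: its result is ready one latency after its slowest input. The function names and the cycle numbers are hypothetical and chosen only for illustration; the 1-cycle latency is the Skylake reg/reg figure quoted above.

```rust
/// Toy model of out-of-order issue: an instruction can execute once all of
/// its inputs are ready, and its result is available `latency` cycles later.
/// (Hypothetical helper, for illustration only.)
fn result_ready_at(input_ready_cycles: &[u32], latency: u32) -> u32 {
    let last_input = input_ready_cycles.iter().copied().max().unwrap_or(0);
    last_input + latency
}

/// A CMOVcc-style select has three inputs: the condition (flags), the
/// source value, and the old register value. It waits for all three,
/// because there is no predictor that guesses the condition.
fn select_ready_at(cond: u32, source: u32, old_value: u32) -> u32 {
    // Assumed latency: the 1-cycle Skylake reg/reg figure cited above.
    const SELECT_LATENCY: u32 = 1;
    result_ready_at(&[cond, source, old_value], SELECT_LATENCY)
}
```

In this model a select whose condition arrives late simply finishes late, while an unrelated instruction whose inputs are already available is not held up by it, which is the difference between "not speculated through" and a pipeline barrier.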
alexcrichton requested cfallin for a review on PR #12006.
fitzgen updated PR #12006.
fitzgen commented on PR #12006:
@cfallin I was indeed not thinking in a nuanced way about this, thanks for the clarifications/reality check.
I removed the `select` special case, so it will hit the default cost, which will be slightly greater than the cost of adds/etc.
I left `imul` at cost 10 though.
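As a rough sketch of the resulting shape of the cost function (this is not the actual Cranelift code): the `Op` enum, the `op_cost` name, and every number except the `imul` cost of 10 are assumptions made for illustration. The only properties taken from the discussion are that `select` has no special case, that the default is slightly above the cost of adds, and that `imul` stays at 10.

```rust
/// Hypothetical opcode set, standing in for the real instruction opcodes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Iconst,
    Iadd,
    Isub,
    Imul,
    Select,
}

/// Hypothetical per-opcode cost used when choosing between equivalent
/// expressions. Lower is cheaper.
fn op_cost(op: Op) -> u32 {
    match op {
        // Assumed value: constants are the cheapest thing to materialize.
        Op::Iconst => 1,
        // Assumed value: simple ALU operations.
        Op::Iadd | Op::Isub => 3,
        // From the discussion: multiplication is kept at cost 10.
        Op::Imul => 10,
        // No special case for `select`: it falls through to the default,
        // assumed here to be slightly above the cost of adds.
        _ => 4,
    }
}
```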
Last updated: Dec 06 2025 at 07:03 UTC