wasmtime / issue #12789 Using `csdb` in `JTSequence` on m... · git-wasmtime

Wasmtime GitHub notifications bot (Mar 16 2026 at 22:33):

alexcrichton opened issue #12789:

I have vague recollections of discussing this in the past, but I don't see a dedicated issue for it. Currently, in the aarch64 backend, the JTSequence instruction (the jump table that br_table in wasm uses) includes a csdb instruction as a Spectre mitigation to prevent speculation. The PR that introduced this, https://github.com/bytecodealliance/wasmtime/pull/4555, ran some benchmarks and found it to have little impact, but I've been made aware locally that it can have a much larger impact on macOS. IIRC this is macOS-specific, but I forget.

To reproduce this I was using a coremark.wasm (such as this one) and found that it prints a score of ~15k by default with wasmtime. I commented out the csdb, rebuilt wasmtime, and the score jumped to ~38k. In short, this instruction has a noticeable cost on at least macOS.
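For scale, the two approximate scores quoted above can be turned into a ratio. This is only a small arithmetic sketch; the helper name `speedup` is made up for illustration, and the inputs are the rounded ~15k and ~38k figures, not exact measurements.

```rust
/// Ratio of the score after a change to the score before it.
/// `speedup` is a hypothetical helper, not part of Wasmtime.
fn speedup(before: f64, after: f64) -> f64 {
    after / before
}
```

With the rounded scores, `speedup(15_000.0, 38_000.0)` comes out to roughly 2.5.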

Do others remember any historical discussion we've had about this? Is this a macOS "bug" fixed in some future version of macOS silicon? Is this something fundamental that we stand by? (In comparison, v8 performs over 2x better than Wasmtime on this same benchmark, presumably because it doesn't use csdb. I can't easily confirm that, but it would be at least one data point.)

Wasmtime GitHub notifications bot (Mar 16 2026 at 22:33):

alexcrichton added the cranelift:area:aarch64 label to Issue #12789.

Wasmtime GitHub notifications bot (Mar 16 2026 at 22:36):

alexcrichton commented on issue #12789:

Another data point: that same repository hosting coremark.wasm contains a lua.wasm with a few assorted "benchmarks" where Wasmtime/Cranelift are 6x slower than v8, and one smoke test locally shows that this is explained by csdb as well. My branch of Wasmtime with csdb removed performs on par with v8, whereas with the stock CLI I can reproduce the original results.

Wasmtime GitHub notifications bot (Mar 16 2026 at 22:41):

cfallin commented on issue #12789:

There's an extensive Zulip thread here about this very topic! I was also chasing this down at the time and hoping to be able to justify removing csdb.

Unfortunately it seems that no one who really knows if it's necessary (as in "actually necessary on known microarchitectures") can comment straightforwardly, and the architecture docs say that we must do it if we want to be Spectre-safe. I was trying to argue from a place of "I'm not aware of value speculation that would actually cause issues in our br_table implementation" but reading between the lines, it seems that wasn't fully supported.

It's pretty unfortunate that others can do benchmarking and show huge gains relative to Wasmtime+Cranelift on the basis of our Spectre safety mitigations that the arch docs say we must have!

IIRC, one workaround discussed at the time was that wasmtime-cli could potentially disable Spectre mitigations, on the basis that only one instance was running (or, I guess, only do so for a single-instance component). Or alternately, we could add a -C i-like-to-live-dangerously-and-spectre-cannot-hurt-me=true option.

Wasmtime GitHub notifications bot (Mar 18 2026 at 04:55):

alexcrichton commented on issue #12789:

I posted #12798 to at least provide an option for testing locally with this disabled.

I also feel like we've talked about this before, but do we feel that we have a bullet-proof-enough spectre story that it's worth taking a 2x performance hit, by default, on a popular platform that people more-often-than-not benchmark on? I realize that's a bit of a loaded question, but I'm hesitant to have hand-wavy reasons that none of us understand to take such a large performance hit.

Wasmtime GitHub notifications bot (Mar 18 2026 at 09:41):

cfallin commented on issue #12789:

Yeah, I'm with you on this one honestly: surveys of peer engines, and various benchmarking, have shown that we're alone on this one and I think we should consider changing the default to avoid such a large unilateral penalty.

Taking the view "what if we didn't have this mitigation today and someone proposed it", I would have asked for benchmarks, I would see the 2x penalty, and I would require extraordinary evidence that this is actually necessary to mitigate a real security issue. We don't have that, only handwavy docs-say-we-should-do-this, and I don't think I would have seen that as sufficient for on-by-default.

So I guess I'm saying: I approved #12798 and I'm happy to approve another PR that flips the default if you want. We could also discuss further at the next Wasmtime meeting if you think that's needed.

Wasmtime GitHub notifications bot (Mar 18 2026 at 09:57):

cfallin closed issue #12789.

Wasmtime GitHub notifications bot (Mar 18 2026 at 10:18):

cfallin reopened issue #12789.

Wasmtime GitHub notifications bot (Mar 18 2026 at 11:41):

tschneidereit commented on issue #12789:

I agree with this: we wouldn't be likely to accept this as a regression enabled by default.

If we do end up switching the default, we should make sure to highlight that fact very prominently in the change log at least, though. Alternatively, we could change the default for the cli and not set a default for embedders, forcing them to make a choice on this.

And relatedly, should that choice be about this particular mitigation, or about all of them, with the ability to be more fine-grained if desired, but the obvious path being to either enable or disable all of them?

Wasmtime GitHub notifications bot (Apr 01 2026 at 16:13):

alexcrichton commented on issue #12789:

Conclusion of discussion from Cranelift's meeting today:

- Add a new Cranelift setting for "emit csdb on aarch64"
- Turn this setting off-by-default
- Conditional emission of csdb based on this setting in the aarch64 backend
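A rough sketch of what those three points could look like is below. This is not Cranelift's actual code: `Aarch64Flags`, `emit_csdb`, and `jt_sequence` are hypothetical names (the thread does not give the real setting name), and the instruction mnemonics are only an illustrative bounds-checked jump-table sequence, not a verified copy of what JTSequence emits.

```rust
/// Hypothetical stand-in for the backend's settings; not Cranelift's real type.
#[derive(Clone, Copy, Debug, Default)]
struct Aarch64Flags {
    /// Hypothetical setting name. `Default` yields `false`, i.e. off-by-default.
    emit_csdb: bool,
}

/// Returns the mnemonics of an illustrative jump-table sequence.
/// The csdb barrier is included only when the setting is enabled.
fn jt_sequence(flags: Aarch64Flags) -> Vec<&'static str> {
    // Bounds check, then clamp the index to zero on the out-of-bounds path.
    let mut insts = vec!["b.hs default", "csel idx, xzr, idx, hs"];
    if flags.emit_csdb {
        // Speculation barrier, emitted conditionally.
        insts.push("csdb");
    }
    // Load the table entry and jump through it.
    insts.extend_from_slice(&[
        "adr base, table",
        "ldrsw off, [base, idx, uxtw #2]",
        "add base, base, off",
        "br base",
    ]);
    insts
}
```

Under these assumptions, `jt_sequence(Aarch64Flags::default())` contains no csdb, while `jt_sequence(Aarch64Flags { emit_csdb: true })` places it right after the index clamp.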

Wasmtime GitHub notifications bot (Apr 01 2026 at 22:59):

alexcrichton closed issue #12789.

Last updated: Jun 01 2026 at 09:49 UTC