bongjunj opened issue #12139:
Hi,
this is a follow-up to https://github.com/bytecodealliance/wasmtime/issues/12106.
Although we've removed two kinds of performance-regressing mid-end ISLE rules,
a significant performance degradation still remains, along with other suspected cases.
(There is, of course, a bright side: we have significant performance improvements in many cases!)

Performance Regression:
- shootout-switch
- pulldown-cmark
First, here is the backing data of the performance regression:
| Benchmark | No Opt | Main | Main Speedup |
|---|---:|---:|---:|
| blake3-scalar | 317,727 | 317,719 | 0.00% |
| blake3-simd | 313,115 | 306,232 | 2.25% |
| bz2 | 87,201,400 | 86,337,330 | 1.00% |
| pulldown-cmark | 6,580,174 | 6,905,992 | -4.72% |
| regex | 209,743,816 | 210,183,175 | -0.21% |
| shootout-ackermann | 8,498,140 | 7,764,439 | 9.45% |
| shootout-base64 | 381,721,177 | 352,724,661 | 8.22% |
| shootout-ctype | 830,813,398 | 796,486,698 | 4.31% |
| shootout-ed25519 | 9,583,747,723 | 9,395,321,203 | 2.01% |
| shootout-fib2 | 3,009,269,670 | 3,010,509,565 | -0.04% |
| shootout-gimli | 5,338,258 | 5,401,697 | -1.17% |
| shootout-heapsort | 2,382,073,831 | 2,375,914,107 | 0.26% |
| shootout-keccak | 25,168,386 | 21,112,482 | 19.21% |
| shootout-matrix | 538,696,036 | 544,739,691 | -1.11% |
| shootout-memmove | 36,156,621 | 36,115,998 | 0.11% |
| shootout-minicsv | 1,481,713,625 | 1,291,534,227 | 14.73% |
| shootout-nestedloop | 449 | 442 | 1.43% |
| shootout-random | 630,328,205 | 439,691,474 | 43.36% |
| shootout-ratelimit | 39,148,817 | 39,956,714 | -2.02% |
| shootout-seqhash | 8,869,585,125 | 8,639,110,150 | 2.67% |
| shootout-sieve | 905,404,028 | 840,777,681 | 7.69% |
| shootout-switch | 139,525,474 | 153,663,682 | -9.20% |
| shootout-xblabla20 | 2,891,404 | 2,907,369 | -0.55% |
| shootout-xchacha20 | 4,384,703 | 4,395,319 | -0.24% |
| spidermonkey | 636,104,785 | 631,998,404 | 0.65% |

Unlike the previous cases, the cause is not obvious.
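As a sanity check on the table, the "Main Speedup" column appears to be consistent with `(No Opt / Main - 1) * 100`. That formula is inferred from the rows, not stated in the issue, so treat the sketch below as an assumption; the function name `speedup_percent` is hypothetical. Rows with very small counts (such as shootout-nestedloop) may not reproduce exactly, presumably because the displayed counts are rounded.

```python
def speedup_percent(no_opt: float, main: float) -> float:
    """Relative speedup of `main` over `no_opt`, in percent.

    Assumes lower counts are better, so a negative result is a regression.
    The formula is inferred from the table, not taken from the issue text.
    """
    return (no_opt / main - 1.0) * 100.0


# Counts copied from the table above: (No Opt, Main).
ROWS = {
    "pulldown-cmark": (6_580_174, 6_905_992),
    "shootout-switch": (139_525_474, 153_663_682),
    "shootout-keccak": (25_168_386, 21_112_482),
    "shootout-random": (630_328_205, 439_691_474),
}

if __name__ == "__main__":
    for name, (no_opt, main) in ROWS.items():
        print(f"{name}: {speedup_percent(no_opt, main):.2f}%")
```

Under this reading, the two regressions called out at the top (shootout-switch and pulldown-cmark) are the rows with the most negative values.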
    19245 clif/v-no-opts/shootout-switch/wasm[0]--function[9]--__original_main.clif
    19241 clif/v-main/shootout-switch/wasm[0]--function[9]--__original_main.clif

The number of instructions does not increase significantly from no-opt to main.
However, the applied optimizations make the program use a long-lived value:

    --- /data/bongjun/clif/v-no-opts/shootout-switch/wasm[0]--function[9]--__original_main.clif 2025-12-08 12:43:58.406738645 +0000
    +++ /data/bongjun/clif/v-main/shootout-switch/wasm[0]--function[9]--__original_main.clif 2025-12-08 12:49:01.961085326 +0000
    -       v8572 = iconst.i32 1066
    -       v8573 = iconst.i32 0
    -@d20b  v4324 = call fn1(v0, v0, v8572, v8573)  ; v8572 = 1066, v8573 = 0
    -       v8574 = iadd.i64 v105, v106  ; v106 = 3584
    -@d219  v4333 = load.i32 little heap v8574
    -       v8575 = iconst.i32 6
    -       v8576 = icmp uge v4333, v8575  ; v8575 = 6
    +       v8603 = iconst.i32 1066
    +       v8604 = iconst.i32 0
    +@d20b  v4324 = call fn1(v0, v0, v8603, v8604)  ; v8603 = 1066, v8604 = 0
    +       v8605 = iadd.i64 v11, v106  ; v106 = 3584
    +@d219  v4333 = load.i32 little heap v8605
    +       v8606 = iconst.i32 6
    +       v8607 = icmp uge v4333, v8606  ; v8606 = 6

See `v8574` and `v8605`, which use `v105` and `v11`, respectively.
`v11` is defined at the beginning, but `v105` is defined later than `v11`:

    block0(v0: i64, v1: i64):
    @01f0  v5 = load.i32 notrap aligned table v0+256
    @01f6  v6 = iconst.i32 16
    @01f8  v7 = isub v5, v6  ; v6 = 16
    @01fb  store notrap aligned table v7, v0+256
    @0203  v9 = iconst.i32 0x2710
    @0207  v11 = load.i64 notrap aligned readonly can_move checked v0+56
    ...
    @02d6  v105 = iadd.i64 v11, v4439

This might increase register pressure, causing more spills, which can degrade performance.
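To make the register-pressure argument concrete, here is a toy model of live ranges over a straight-line program. It is only a sketch and not how Cranelift's register allocator works: the four-instruction programs, the names `x` and `y`, and the helper names `live_ranges` and `max_pressure` are all hypothetical. It assumes `v105` still has another use in both versions, so redirecting one late use from `v105` to `v11` keeps both values live at the same time.

```python
def live_ranges(program):
    """Map each value to (definition index, index of its last use).

    `program` is a list of (result, operands) pairs in execution order.
    A value that is never used gets a range ending at its definition.
    """
    ranges = {}
    for idx, (result, operands) in enumerate(program):
        ranges[result] = (idx, idx)
        for op in operands:
            start, _ = ranges[op]  # operands must be defined earlier
            ranges[op] = (start, idx)
    return ranges


def max_pressure(program):
    """Largest number of values that are live on entry to any instruction."""
    ranges = live_ranges(program)
    return max(
        sum(1 for start, end in ranges.values() if start < idx <= end)
        for idx in range(len(program))
    )


# Toy "no-opt" shape: the late use reads v105, so v11 dies early.
NO_OPT = [("v11", []), ("v105", ["v11"]), ("x", ["v105"]), ("y", ["v105"])]
# Toy "main" shape: the late use reads v11 instead, stretching its live range.
MAIN = [("v11", []), ("v105", ["v11"]), ("x", ["v105"]), ("y", ["v11"])]

if __name__ == "__main__":
    print(live_ranges(MAIN)["v11"], max_pressure(NO_OPT), max_pressure(MAIN))
```

In this toy, the rewritten program needs two values live at once where the original needs one. Whether the real function spills more as a result would have to be confirmed from the generated machine code.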
alexcrichton added the cranelift label to Issue #12139.
alexcrichton added the cranelift:goal:optimize-speed label to Issue #12139.
alexcrichton added the performance label to Issue #12139.
bongjunj commented on issue #12139:
Investigating the generated CLIF for the `shootout-switch` case, I noticed there are over 2,500 unused constants in a block:
https://gist.github.com/bongjunj/08cf48d4e5827cf8ed270f26442e2604#file-main-clif-L216-L240

The constants defined in the block are never referenced anywhere in the function.
Is this intended behavior for translating `switch`?

To repro this, run
`wasmtime compile sightglass/benchmarks/shootout/shootout-switch.wasm --emit-clif <dir>`
and look up the `shootout-switch/wasm[0]--function[9]--__original_main.clif` file.

cc @cfallin
bongjunj edited a comment on issue #12139:
Investigating the generated CLIF for the `shootout-switch` case, I noticed there are over 2,500 unused constants in a block:
https://gist.github.com/bongjunj/08cf48d4e5827cf8ed270f26442e2604#file-main-clif-L216-L240
For example, look up `v185` in the CLIF file: no uses are found.

The constants defined in the block are never referenced anywhere in the function.
Is this intended behavior for translating `switch`?

To repro this, run
`wasmtime compile sightglass/benchmarks/shootout/shootout-switch.wasm --emit-clif <dir>`
and look up the `shootout-switch/wasm[0]--function[9]--__original_main.clif` file.

cc @cfallin
bongjunj edited a comment on issue #12139:
Investigating the generated CLIF for the `shootout-switch` case, I noticed there are over 2,500 unused constants in a block:
https://gist.github.com/bongjunj/08cf48d4e5827cf8ed270f26442e2604#file-main-clif-L216-L240
For example, look up `v185` in the CLIF file: no uses are found.

This happens not only in the specified block but also in another block:

    block3:
           v4434 = iconst.i32 16
           v4435 = iadd.i32 v16, v4434  ; v4434 = 16
           v4436 = iconst.i32 0
    @0236  v31 = iconst.i32 7
    @0231  v28 = iconst.i32 12
    @0243  v38 = iconst.i32 6
    @023e  v36 = iconst.i32 8
    @0250  v45 = iconst.i32 5
    @024b  v43 = iconst.i32 4
    @0267  v57 = iconst.i32 3
    @0262  v55 = iconst.i32 -4
    @0274  v64 = iconst.i32 2
    @026f  v62 = iconst.i32 -8
    @0281  v71 = iconst.i32 1
    @027c  v69 = iconst.i32 -12
    @0289  v76 = iconst.i32 -16
           v4409 = iconst.i32 9992
    @0293  v81 = iconst.i32 32
    @022d  jump block4(v4435, v4436)  ; v4436 = 0
    ...
    @028e  v78 = uextend.i64 v4463
    @028e  v80 = iadd.i64 v11, v78
    @028e  store little heap v30, v80
           v4413 = icmp ne v30, v4409  ; v4409 = 9992

Surprisingly, none of the constants from `v31` to `v76` in the block are used anywhere in the function.
The constants defined in the block are never referenced anywhere in the function.
Is this intended behavior for translating `switch`?

Furthermore, the constant `v4409`, which is newly generated as a result of optimization, is not placed close to its use in `v4413`. I see no valid reason to materialize such a constant far from its only use.

To repro this, run
`wasmtime compile sightglass/benchmarks/shootout/shootout-switch.wasm --emit-clif <dir>`
and look up the `shootout-switch/wasm[0]--function[9]--__original_main.clif` file.

cc @cfallin
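One way to check the "defined but never referenced" observation without searching by hand is a small text scan over the emitted `.clif` file. The script below is a rough sketch, not a Cranelift tool: it assumes values are spelled `vN`, that anything after `;` on a line is a comment, and that an optional `@hex` source-location prefix may precede an instruction. The helper name `unused_values` and the sample text are hypothetical.

```python
import re

VALUE = re.compile(r"\bv\d+\b")                 # SSA values such as v31
SRCLOC = re.compile(r"^\s*@[0-9a-fA-F]+\s+")    # optional prefix such as @0236
BLOCK_HEADER = re.compile(r"^\s*block\d+")      # e.g. block0(v0: i64):


def unused_values(clif_text):
    """Return values that are defined but never used, in definition order."""
    defined = []
    used = set()
    for line in clif_text.splitlines():
        code = line.split(";", 1)[0]            # drop trailing comments
        code = SRCLOC.sub("", code)             # drop source-location prefix
        if BLOCK_HEADER.match(code):
            defined.extend(VALUE.findall(code)) # block parameters are definitions
        elif "=" in code:
            lhs, rhs = code.split("=", 1)
            defined.extend(VALUE.findall(lhs))  # instruction results
            used.update(VALUE.findall(rhs))     # instruction operands
        else:
            used.update(VALUE.findall(code))    # stores, jumps, returns, ...
    return [v for v in defined if v not in used]


# Hypothetical miniature function in the same textual style as the excerpts above.
SAMPLE = """\
block0(v0: i64):
@0236  v31 = iconst.i32 7
       v4409 = iconst.i32 9992
@028e  v30 = load.i32 notrap aligned v0+8
       v4413 = icmp ne v30, v4409  ; v4409 = 9992
       return v4413
"""

if __name__ == "__main__":
    print(unused_values(SAMPLE))
```

To run it against the real output, read the `__original_main.clif` file produced by the repro command and pass its contents to `unused_values`; the length of the returned list gives a count to compare against the "over 2,500" figure.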
Last updated: Dec 13 2025 at 19:03 UTC