wasmtime / PR #5004 x86_64: align loop headers to 64 bytes · git-wasmtime

This PR introduces alignment for the loop headers to 64 bytes on x86_64. I benchmarked this on AMD Zen 3 x5950 and it produced the following results.

instantiation :: cycles :: benchmarks/shootout-gimli/benchmark.wasm

  Δ = 26281.32 ± 4598.52 (confidence = 99%)

  pep-align-loops.so is 1.44x to 1.63x faster than main.so!

  [53890 75407.92 118592] main.so
  [36448 49126.60 87992] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-seqhash/benchmark.wasm

  Δ = 34856.12 ± 3933.07 (confidence = 99%)

  pep-align-loops.so is 1.46x to 1.57x faster than main.so!

  [82756 102507.28 150960] main.so
  [50592 67651.16 106658] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-keccak/benchmark.wasm

  Δ = 28160.16 ± 5341.58 (confidence = 99%)

  pep-align-loops.so is 1.40x to 1.59x faster than main.so!

  [59942 85105.40 132532] main.so
  [42262 56945.24 91494] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-heapsort/benchmark.wasm

  Δ = 30212.40 ± 4510.98 (confidence = 99%)

  pep-align-loops.so is 1.41x to 1.56x faster than main.so!

  [68714 92486.12 158032] main.so
  [47056 62273.72 97852] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-xchacha20/benchmark.wasm

  Δ = 26062.70 ± 4747.21 (confidence = 99%)

  pep-align-loops.so is 1.39x to 1.56x faster than main.so!

  [59262 81086.94 130798] main.so
  [40392 55024.24 87312] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-fib2/benchmark.wasm

  Δ = 34163.20 ± 3949.15 (confidence = 99%)

  pep-align-loops.so is 1.42x to 1.53x faster than main.so!

  [83232 106564.84 146948] main.so
  [55590 72401.64 104278] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-sieve/benchmark.wasm

  Δ = 30320.18 ± 4710.86 (confidence = 99%)

  pep-align-loops.so is 1.39x to 1.53x faster than main.so!

  [73882 96454.94 189584] main.so
  [50558 66134.76 110466] pep-align-loops.so

instantiation :: cycles :: benchmarks/blake3-simd/benchmark.wasm

  Δ = 26868.16 ± 4298.60 (confidence = 99%)

  pep-align-loops.so is 1.38x to 1.53x faster than main.so!

  [63682 85859.86 126990] main.so
  [43044 58991.70 83606] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-memmove/benchmark.wasm

  Δ = 25414.32 ± 4195.94 (confidence = 99%)

  pep-align-loops.so is 1.37x to 1.52x faster than main.so!

  [57460 82822.64 127364] main.so
  [46376 57408.32 98498] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-ed25519/benchmark.wasm

  Δ = 39825.22 ± 5512.82 (confidence = 99%)

  pep-align-loops.so is 1.38x to 1.50x faster than main.so!

  [93432 130104.74 168538] main.so
  [57222 90279.52 166124] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-minicsv/benchmark.wasm

  Δ = 24848.90 ± 3936.23 (confidence = 99%)

  pep-align-loops.so is 1.37x to 1.51x faster than main.so!

  [60656 81573.48 123760] main.so
  [41990 56724.58 90338] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-random/benchmark.wasm

  Δ = 26969.48 ± 4123.30 (confidence = 99%)

  pep-align-loops.so is 1.35x to 1.48x faster than main.so!

  [67694 91377.04 137054] main.so
  [46206 64407.56 93228] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-nestedloop/benchmark.wasm

  Δ = 25948.46 ± 4511.46 (confidence = 99%)

  pep-align-loops.so is 1.34x to 1.48x faster than main.so!

  [69088 89657.66 136612] main.so
  [46784 63709.20 90916] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-ctype/benchmark.wasm

  Δ = 27127.58 ± 4332.85 (confidence = 99%)

  pep-align-loops.so is 1.33x to 1.45x faster than main.so!

  [76228 96649.76 169218] main.so
  [55964 69522.18 109922] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-xblabla20/benchmark.wasm

  Δ = 20529.54 ± 4116.39 (confidence = 99%)

  pep-align-loops.so is 1.29x to 1.44x faster than main.so!

  [58446 76712.84 112710] main.so
  [42534 56183.30 83572] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-matrix/benchmark.wasm

  Δ = 24574.18 ± 4205.02 (confidence = 99%)

  pep-align-loops.so is 1.30x to 1.43x faster than main.so!

  [73576 92209.02 127092] main.so
  [50830 67634.84 120972] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-switch/benchmark.wasm

  Δ = 28385.24 ± 4903.24 (confidence = 99%)

  pep-align-loops.so is 1.30x to 1.42x faster than main.so!

  [78370 107118.02 161194] main.so
  [61778 78732.78 114172] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-ratelimit/benchmark.wasm

  Δ = 23482.78 ± 4058.42 (confidence = 99%)

  pep-align-loops.so is 1.29x to 1.42x faster than main.so!

  [66096 89596.12 135694] main.so
  [48586 66113.34 98260] pep-align-loops.so

instantiation :: cycles :: benchmarks/blake3-scalar/benchmark.wasm

  Δ = 28377.42 ± 5294.94 (confidence = 99%)

  pep-align-loops.so is 1.29x to 1.42x faster than main.so!

  [85680 108899.96 149872] main.so
  [63138 80522.54 135082] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-base64/benchmark.wasm

  Δ = 22593.34 ± 4545.94 (confidence = 99%)

  pep-align-loops.so is 1.26x to 1.40x faster than main.so!

  [71060 91037.72 146880] main.so
  [54060 68444.38 107202] pep-align-loops.so

instantiation :: cycles :: benchmarks/noop/benchmark.wasm

  Δ = 17670.82 ± 4192.78 (confidence = 99%)

  pep-align-loops.so is 1.25x to 1.40x faster than main.so!

  [59908 72167.72 107202] main.so
  [43724 54496.90 93670] pep-align-loops.so

instantiation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 35288.26 ± 5934.66 (confidence = 99%)

  pep-align-loops.so is 1.26x to 1.37x faster than main.so!

  [102340 146073.52 194038] main.so
  [80784 110785.26 180472] pep-align-loops.so

instantiation :: cycles :: benchmarks/shootout-ackermann/benchmark.wasm

  Δ = 21964.34 ± 4552.01 (confidence = 99%)

  pep-align-loops.so is 1.24x to 1.36x faster than main.so!

  [71502 95173.48 139842] main.so
  [54060 73209.14 108086] pep-align-loops.so

instantiation :: cycles :: benchmarks/intgemm-simd/benchmark.wasm

  Δ = 30523.16 ± 5684.40 (confidence = 99%)

  pep-align-loops.so is 1.21x to 1.31x faster than main.so!

  [127704 147993.16 218076] main.so
  [85170 117470.00 152490] pep-align-loops.so

instantiation :: cycles :: benchmarks/meshoptimizer/benchmark.wasm

  Δ = 27309.14 ± 3876.14 (confidence = 99%)

  pep-align-loops.so is 1.19x to 1.25x faster than main.so!

  [131002 152892.56 180472] main.so
  [89828 125583.42 153068] pep-align-loops.so

execution :: cycles :: benchmarks/shootout-sieve/benchmark.wasm

  Δ = 160632987.42 ± 5972804.43 (confidence = 99%)

  pep-align-loops.so is 1.17x to 1.18x faster than main.so!

  [1039973776 1079490706.44 1137124492] main.so
  [884091222 918857719.02 940632032] pep-align-loops.so

instantiation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 22790.88 ± 7872.59 (confidence = 99%)

  pep-align-loops.so is 1.09x to 1.19x faster than main.so!

  [151266 187293.76 236504] main.so
  [115294 164502.88 223210] pep-align-loops.so

execution :: cycles :: benchmarks/shootout-xchacha20/benchmark.wasm

  Δ = 750342.94 ± 343121.40 (confidence = 99%)

  main.so is 1.07x to 1.18x faster than pep-align-loops.so!

  [4639198 5917878.50 8029610] main.so
  [5627170 6668221.44 9286080] pep-align-loops.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 40131994.18 ± 6247250.64 (confidence = 99%)

  pep-align-loops.so is 1.08x to 1.11x faster than main.so!

  [447057432 471406190.64 514834086] main.so
  [398536440 431274196.46 494554072] pep-align-loops.so

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 17023452.18 ± 6101055.67 (confidence = 99%)

  pep-align-loops.so is 1.05x to 1.11x faster than main.so!

  [197441876 225073129.96 300179132] main.so
  [186926662 208049677.78 259271080] pep-align-loops.so

execution :: cycles :: benchmarks/shootout-fib2/benchmark.wasm

  Δ = 229041837.76 ± 62294164.86 (confidence = 99%)

  main.so is 1.06x to 1.10x faster than pep-align-loops.so!

  [2760966328 2877473442.90 2963631080] main.so
  [3000497042 3106515280.66 5393578372] pep-align-loops.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 680581047.42 ± 58488688.83 (confidence = 99%)

  pep-align-loops.so is 1.06x to 1.07x faster than main.so!

  [10586670916 10705050364.66 10875778390] main.so
  [9403424098 10024469317.24 10473503548] pep-align-loops.so

compilation :: cycles :: benchmarks/blake3-scalar/benchmark.wasm

  Δ = 14350847.50 ± 5439729.54 (confidence = 99%)

  pep-align-loops.so is 1.04x to 1.09x faster than main.so!

  [223399074 234169891.16 298711012] main.so
  [203019610 219819043.66 281894306] pep-align-loops.so

execution :: cycles :: benchmarks/shootout-switch/benchmark.wasm

  Δ = 6943520.80 ± 883372.86 (confidence = 99%)

  main.so is 1.06x to 1.07x faster than pep-align-loops.so!

  [103285506 107387684.20 112425352] main.so
  [108359326 114331205.00 123381886] pep-align-loops.so

compilation :: cycles :: benchmarks/shootout-ackermann/benchmark.wasm

  Δ = 4700652.32 ± 4147779.68 (confidence = 99%)

  pep-align-loops.so is 1.01x to 1.11x faster than main.so!

  [78591068 87546072.66 131549298] main.so
  [74343754 82845420.34 119083402] pep-align-loops.so

compilation :: cycles :: benchmarks/shootout-ed25519/benchmark.wasm

  Δ = 32031030.76 ± 10733541.38 (confidence = 99%)

  pep-align-loops.so is 1.04x to 1.07x faster than main.so!


[message truncated]

Stream: git-wasmtime

Topic: wasmtime / PR #5004 x86_64: align loop headers to 64 bytes

Wasmtime GitHub notifications bot (Oct 04 2022 at 10:40):

Wasmtime GitHub notifications bot (Oct 04 2022 at 10:40):

Wasmtime GitHub notifications bot (Oct 04 2022 at 10:40):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:06):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:26):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:26):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:27):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:27):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:29):

Wasmtime GitHub notifications bot (Oct 04 2022 at 11:29):

Wasmtime GitHub notifications bot (May 21 2024 at 22:18):

Wasmtime GitHub notifications bot (May 21 2024 at 22:34):

Wasmtime GitHub notifications bot (May 21 2024 at 23:44):

Wasmtime GitHub notifications bot (May 23 2024 at 00:27):