jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
A negative change factor means clockticks are expected to be reduced by the patch.
|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Compilation|-0.055|
|benchmarks/blake3-simd|x86_64|Compilation|-0.018|
|benchmarks/bz2|x86_64|Compilation|-0.017|
|benchmarks/intgemm-simd|x86_64|Compilation|0.000|
|benchmarks/meshoptimizer|x86_64|Compilation|-0.004|
|benchmarks/noop|x86_64|Compilation|-0.001|
|benchmarks/pulldown-cmark|x86_64|Compilation|-0.062|
|benchmarks/shootout-ackermann|x86_64|Compilation|0.019|
|benchmarks/shootout-base64|x86_64|Compilation|0.004|
|benchmarks/shootout-ctype|x86_64|Compilation|-0.018|
|benchmarks/shootout-ed25519|x86_64|Compilation|0.000|
|benchmarks/shootout-fib2|x86_64|Compilation|-0.014|
|benchmarks/shootout-gimli|x86_64|Compilation|0.051|
|benchmarks/shootout-heapsort|x86_64|Compilation|0.012|
|benchmarks/shootout-keccak|x86_64|Compilation|0.027|
|benchmarks/shootout-matrix|x86_64|Compilation|-0.017|
|benchmarks/shootout-memmove|x86_64|Compilation|-0.022|
|benchmarks/shootout-minicsv|x86_64|Compilation|0.016|
|benchmarks/shootout-nestedloop|x86_64|Compilation|-0.026|
|benchmarks/shootout-random|x86_64|Compilation|0.027|
|benchmarks/shootout-ratelimit|x86_64|Compilation|-0.013|
|benchmarks/shootout-seqhash|x86_64|Compilation|0.306|
|benchmarks/shootout-sieve|x86_64|Compilation|0.015|
|benchmarks/shootout-switch|x86_64|Compilation|0.054|
|benchmarks/shootout-xblabla20|x86_64|Compilation|0.033|
|benchmarks/shootout-xchacha20|x86_64|Compilation|-0.086|
|benchmarks/spidermonkey|x86_64|Compilation|-0.013|

|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Instantiation|0.071|
|benchmarks/blake3-simd|x86_64|Instantiation|-0.037|
|benchmarks/bz2|x86_64|Instantiation|-0.019|
|benchmarks/intgemm-simd|x86_64|Instantiation|0.322|
|benchmarks/meshoptimizer|x86_64|Instantiation|0.075|
|benchmarks/noop|x86_64|Instantiation|-0.066|
|benchmarks/pulldown-cmark|x86_64|Instantiation|-0.043|
|benchmarks/shootout-ackermann|x86_64|Instantiation|0.003|
|benchmarks/shootout-base64|x86_64|Instantiation|0.065|
|benchmarks/shootout-ctype|x86_64|Instantiation|0.049|
|benchmarks/shootout-ed25519|x86_64|Instantiation|0.010|
|benchmarks/shootout-fib2|x86_64|Instantiation|-0.060|
|benchmarks/shootout-gimli|x86_64|Instantiation|0.025|
|benchmarks/shootout-heapsort|x86_64|Instantiation|0.008|
|benchmarks/shootout-keccak|x86_64|Instantiation|-0.030|
|benchmarks/shootout-matrix|x86_64|Instantiation|0.039|
|benchmarks/shootout-memmove|x86_64|Instantiation|0.005|
|benchmarks/shootout-minicsv|x86_64|Instantiation|-0.003|
|benchmarks/shootout-nestedloop|x86_64|Instantiation|0.072|
|benchmarks/shootout-random|x86_64|Instantiation|-0.003|
|benchmarks/shootout-ratelimit|x86_64|Instantiation|0.021|
|benchmarks/shootout-seqhash|x86_64|Instantiation|-0.111|
|benchmarks/shootout-sieve|x86_64|Instantiation|-0.036|
|benchmarks/shootout-switch|x86_64|Instantiation|0.017|
|benchmarks/shootout-xblabla20|x86_64|Instantiation|0.022|
|benchmarks/shootout-xchacha20|x86_64|Instantiation|-0.049|
|benchmarks/spidermonkey|x86_64|Instantiation|-0.147|

|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Execution|0.003|
|benchmarks/blake3-simd|x86_64|Execution|-0.011|
|benchmarks/bz2|x86_64|Execution|-0.033|
|benchmarks/intgemm-simd|x86_64|Execution|-0.007|
|benchmarks/meshoptimizer|x86_64|Execution|0.000|
|benchmarks/noop|x86_64|Execution|0.454|
|benchmarks/pulldown-cmark|x86_64|Execution|-0.002|
|benchmarks/shootout-ackermann|x86_64|Execution|-0.350|
|benchmarks/shootout-base64|x86_64|Execution|0.003|
|benchmarks/shootout-ctype|x86_64|Execution|0.003|
|benchmarks/shootout-ed25519|x86_64|Execution|-0.000|
|benchmarks/shootout-fib2|x86_64|Execution|-0.000|
|benchmarks/shootout-gimli|x86_64|Execution|-0.017|
|benchmarks/shootout-heapsort|x86_64|Execution|0.000|
|benchmarks/shootout-keccak|x86_64|Execution|-0.006|
|benchmarks/shootout-matrix|x86_64|Execution|0.001|
|benchmarks/shootout-memmove|x86_64|Execution|0.002|
|benchmarks/shootout-minicsv|x86_64|Execution|-0.001|
|benchmarks/shootout-nestedloop|x86_64|Execution|0.003|
|benchmarks/shootout-random|x86_64|Execution|0.003|
|benchmarks/shootout-ratelimit|x86_64|Execution|-0.000|
|benchmarks/shootout-seqhash|x86_64|Execution|0.004|
|benchmarks/shootout-sieve|x86_64|Execution|-0.001|
|benchmarks/shootout-switch|x86_64|Execution|-0.001|
|benchmarks/shootout-xblabla20|x86_64|Execution|-0.025|
|benchmarks/shootout-xchacha20|x86_64|Execution|0.010|
|benchmarks/spidermonkey|x86_64|Execution|-0.000|

Averages (x64):
|phase|change_factor|
|-|-|
|Compilation|0.007|
|Execution|0.001|
|Instantiation|0.007|
jlb6740 commented on issue #5060:
@cfallin I think this is the one to address some previous comments. Note, https://github.com/bytecodealliance/wasmtime/pull/5064 is a separate PR to look at better stabilizing results.
cfallin commented on issue #5060:
@jlb6740 a few thoughts:
- Can we label "Averages" at the bottom "Geomeans" (and to check: they are geometric means?) so the reader knows what kind they are?
- To me at least, multiplying all the numbers by 100 and presenting them as percentages would make them a bit easier to visually parse. E.g. `0.003, 0.001, -0.004` is ever-so-slightly harder to read than `0.3%, 0.1%, -0.4%`.
- Would it be possible to sort by effect size (most sped-up to most slowed-down for example)?
jlb6740 commented on issue #5060:
@cfallin. These are averages of what we decided to call the change factor (python pandas calls it percent change). I definitely don't think we want to take the geomean of these numbers. The numbers are already a percentage (and not based on a diff), so I think taking the arithmetic average is appropriate there. Note, I updated the comments to include the formula that is used:
Some Factor = (Patched_CT - Main_CT) / (Main_CT)
or
Patched Clock Ticks = Main Clock Ticks + (Main Clock Ticks * (Some Factor))
This makes sense to me, but multiplying "Some Factor" by 100 would break this formula.
0.003 factor is not the same as .3%.
1.003 factor is equivalent to .3%
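To make the arithmetic-mean versus geomean point concrete, here is a minimal Python sketch. The clocktick counts are made up for illustration, and the variable names are hypothetical; nothing here is taken from the actual benchmark script. It averages fractional changes arithmetically, as described above, and shows the geometric mean of the raw `Patched_CT / Main_CT` ratios next to it for comparison.

```python
import math

# Hypothetical (Main_CT, Patched_CT) pairs for three benchmarks.
# These numbers are invented for illustration only.
samples = [(1000.0, 1010.0), (2000.0, 1900.0), (500.0, 500.0)]

# Fractional change per benchmark: (Patched_CT - Main_CT) / Main_CT
changes = [(patched - main) / main for main, patched in samples]

# Raw ratio per benchmark: Patched_CT / Main_CT
ratios = [patched / main for main, patched in samples]

# Arithmetic mean of the fractional changes.
arith_mean_change = sum(changes) / len(changes)

# Geometric mean of the raw ratios (computed via logs).
geomean_ratio = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

print(f"arithmetic mean of changes: {arith_mean_change:+.4f}")
print(f"geomean of ratios minus 1:  {geomean_ratio - 1:+.4f}")
```

For small changes the two summaries land close together; they drift apart as individual ratios move further from 1.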
jlb6740 commented on issue #5060:
As far as the effect size, I think that is calculated in sightglass so it should be doable. That said, I've found it takes a while to iterate on these changes, because the runs take so long and you don't know until they finish whether you've broken something or whether the format is not quite what you want. I don't want this to languish for a week while I attend to other things. Is it OK to do these other updates in another iteration? But of course we should close on the labels and formula here.
cfallin commented on issue #5060:
> These are averages of what we decided to call the change factor (python pandas calls it percent change). I definitely don't think we want to take the geomean of these numbers. The numbers are already a percentage (and not based on a diff) .. so I think taking the arithmetic average is appropriate there.
Ah, right, a geomean is warranted if raw ratio of runtime, sorry; I had been thinking in those terms and not fractional-change terms. (Can we call it "arithmetic mean" in the header then to specify to the reader which it is?)
> Some Factor = (Patched_CT - Main_CT) / (Main_CT)
> This makes sense to me, but multiplying "Some Factor" by 100 would break this formula.
> 0.003 factor is not the same as .3%.
> 1.003 factor is equivalent to .3%

Isn't this just the definition of fractional change? The `- Main_CT / Main_CT` term is effectively subtracting one (the above formula can be rearranged to `(Patched_CT / Main_CT) - 1`). So if this formula reports a result of 0.003, then that is a multiplicative factor of 1.003 from old to new, and that is a 0.3% shift. In other words, if I have a runtime ratio of 1.003, then the run got 0.3% slower; to go from 1.003 to 0.3%, I subtract one and multiply by 100.
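The rearrangement above can be checked numerically. This is a small Python sketch with made-up clocktick counts (the function name `fractional_change` is hypothetical): the fractional change, the runtime ratio minus one, and the percent shift all describe the same quantity.

```python
def fractional_change(main_ct: float, patched_ct: float) -> float:
    """(Patched_CT - Main_CT) / Main_CT, the formula quoted in the thread."""
    return (patched_ct - main_ct) / main_ct


# Made-up clocktick counts, chosen so the patch is 0.3% slower.
main_ct, patched_ct = 1000.0, 1003.0

change = fractional_change(main_ct, patched_ct)  # fractional change
ratio = patched_ct / main_ct                     # multiplicative factor
percent = change * 100                           # percent shift

# The ratio minus one is the fractional change (up to float rounding).
assert abs((ratio - 1) - change) < 1e-12
# Main * (1 + change) recovers the patched count.
assert abs(main_ct * (1 + change) - patched_ct) < 1e-9

print(f"change={change:.3f} ratio={ratio:.3f} percent={percent:.1f}%")
# prints: change=0.003 ratio=1.003 percent=0.3%
```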
jlb6740 commented on issue #5060:
Ok .. I may be confusing something here but this is how I see it:
Patch = Main (1 + Factor) is what the current formula states.
So .. let's say Main is 100 and Factor is -.003 then the Patch Clockticks would decrease to (100 - .3) = 99.7
If we take the same thing with %
So .. let's say Main is 100 and Percentage is -.3% then the Patch Clockticks would decrease to (100 - 100 * .003) = 99.7
Ok .. this is why I don't like percentages :grinning_face_with_smiling_eyes:. I think you are right. Hopefully I can change this factor to a percentage in pandas with ease. I'll just make that change as I do think it is easier to read.
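As a rough illustration of the switch to percentages, here is a plain-Python sketch (not the pandas code the bot actually uses, which is not shown in this thread). The helper names `apply_change`, `as_percent`, and `table_row` are hypothetical. It reproduces the 100 to 99.7 example and renders a fractional change as a percent string in a markdown row.

```python
def apply_change(main_ct: float, factor: float) -> float:
    """Patch = Main * (1 + Factor), the rearranged formula from the thread."""
    return main_ct * (1 + factor)


def as_percent(factor: float, digits: int = 3) -> str:
    """Render a fractional change as a percent string, e.g. -0.003 -> '-0.300%'."""
    return f"{factor:.{digits}%}"


def table_row(wasm: str, arch: str, phase: str, factor: float) -> str:
    """Build one markdown row in the |wasm|arch|phase|change| layout."""
    return f"|{wasm}|{arch}|{phase}|{as_percent(factor)}|"


# Main is 100 and the factor is -0.003, as in the example above.
patched = apply_change(100.0, -0.003)
row = table_row("benchmarks/noop", "x86_64", "Execution", -0.003)

print(round(patched, 6))
print(row)
```

Python's `%` format type multiplies by 100 before formatting, so the stored value stays a fraction and only the presentation changes.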
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
A negative change factor means clockticks are expected to be reduced by the patch.
|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Compilation|-2.067%|
|benchmarks/blake3-simd|x86_64|Compilation|-9.731%|
|benchmarks/bz2|x86_64|Compilation|5.358%|
|benchmarks/intgemm-simd|x86_64|Compilation|0.691%|
|benchmarks/meshoptimizer|x86_64|Compilation|-1.587%|
|benchmarks/noop|x86_64|Compilation|-1.157%|
|benchmarks/pulldown-cmark|x86_64|Compilation|-7.943%|
|benchmarks/shootout-ackermann|x86_64|Compilation|-3.513%|
|benchmarks/shootout-base64|x86_64|Compilation|4.014%|
|benchmarks/shootout-ctype|x86_64|Compilation|-1.046%|
|benchmarks/shootout-ed25519|x86_64|Compilation|0.191%|
|benchmarks/shootout-fib2|x86_64|Compilation|-1.799%|
|benchmarks/shootout-gimli|x86_64|Compilation|-1.431%|
|benchmarks/shootout-heapsort|x86_64|Compilation|2.065%|
|benchmarks/shootout-keccak|x86_64|Compilation|-2.097%|
|benchmarks/shootout-matrix|x86_64|Compilation|-5.134%|
|benchmarks/shootout-memmove|x86_64|Compilation|-2.702%|
|benchmarks/shootout-minicsv|x86_64|Compilation|-2.506%|
|benchmarks/shootout-nestedloop|x86_64|Compilation|-2.062%|
|benchmarks/shootout-random|x86_64|Compilation|-0.815%|
|benchmarks/shootout-ratelimit|x86_64|Compilation|-0.740%|
|benchmarks/shootout-seqhash|x86_64|Compilation|9.665%|
|benchmarks/shootout-sieve|x86_64|Compilation|0.483%|
|benchmarks/shootout-switch|x86_64|Compilation|-0.386%|
|benchmarks/shootout-xblabla20|x86_64|Compilation|-3.094%|
|benchmarks/shootout-xchacha20|x86_64|Compilation|-5.356%|
|benchmarks/spidermonkey|x86_64|Compilation|-0.783%|

|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Instantiation|1.770%|
|benchmarks/blake3-simd|x86_64|Instantiation|-21.610%|
|benchmarks/bz2|x86_64|Instantiation|6.219%|
|benchmarks/intgemm-simd|x86_64|Instantiation|-2.431%|
|benchmarks/meshoptimizer|x86_64|Instantiation|-1.874%|
|benchmarks/noop|x86_64|Instantiation|-4.099%|
|benchmarks/pulldown-cmark|x86_64|Instantiation|-7.633%|
|benchmarks/shootout-ackermann|x86_64|Instantiation|0.835%|
|benchmarks/shootout-base64|x86_64|Instantiation|4.207%|
|benchmarks/shootout-ctype|x86_64|Instantiation|0.682%|
|benchmarks/shootout-ed25519|x86_64|Instantiation|-1.963%|
|benchmarks/shootout-fib2|x86_64|Instantiation|1.302%|
|benchmarks/shootout-gimli|x86_64|Instantiation|11.633%|
|benchmarks/shootout-heapsort|x86_64|Instantiation|5.706%|
|benchmarks/shootout-keccak|x86_64|Instantiation|11.342%|
|benchmarks/shootout-matrix|x86_64|Instantiation|25.500%|
|benchmarks/shootout-memmove|x86_64|Instantiation|9.481%|
|benchmarks/shootout-minicsv|x86_64|Instantiation|0.009%|
|benchmarks/shootout-nestedloop|x86_64|Instantiation|1.024%|
|benchmarks/shootout-random|x86_64|Instantiation|-2.402%|
|benchmarks/shootout-ratelimit|x86_64|Instantiation|-4.735%|
|benchmarks/shootout-seqhash|x86_64|Instantiation|1.122%|
|benchmarks/shootout-sieve|x86_64|Instantiation|3.037%|
|benchmarks/shootout-switch|x86_64|Instantiation|-2.586%|
|benchmarks/shootout-xblabla20|x86_64|Instantiation|2.925%|
|benchmarks/shootout-xchacha20|x86_64|Instantiation|5.166%|
|benchmarks/spidermonkey|x86_64|Instantiation|-2.497%|

|wasm|arch|phase|change_factor|
|-|-|-|-|
|benchmarks/blake3-scalar|x86_64|Execution|-0.293%|
|benchmarks/blake3-simd|x86_64|Execution|-18.240%|
|benchmarks/bz2|x86_64|Execution|-0.483%|
|benchmarks/intgemm-simd|x86_64|Execution|-0.143%|
|benchmarks/meshoptimizer|x86_64|Execution|-0.128%|
|benchmarks/noop|x86_64|Execution|-31.053%|
|benchmarks/pulldown-cmark|x86_64|Execution|0.558%|
|benchmarks/shootout-ackermann|x86_64|Execution|-38.342%|
|benchmarks/shootout-base64|x86_64|Execution|-0.241%|
|benchmarks/shootout-ctype|x86_64|Execution|0.106%|
|benchmarks/shootout-ed25519|x86_64|Execution|-0.239%|
|benchmarks/shootout-fib2|x86_64|Execution|-0.129%|
|benchmarks/shootout-gimli|x86_64|Execution|2.898%|
|benchmarks/shootout-heapsort|x86_64|Execution|0.041%|
|benchmarks/shootout-keccak|x86_64|Execution|0.485%|
|benchmarks/shootout-matrix|x86_64|Execution|-0.063%|
|benchmarks/shootout-memmove|x86_64|Execution|0.026%|
|benchmarks/shootout-minicsv|x86_64|Execution|-0.130%|
|benchmarks/shootout-nestedloop|x86_64|Execution|-0.345%|
|benchmarks/shootout-random|x86_64|Execution|0.014%|
|benchmarks/shootout-ratelimit|x86_64|Execution|0.223%|
|benchmarks/shootout-seqhash|x86_64|Execution|-0.153%|
|benchmarks/shootout-sieve|x86_64|Execution|0.215%|
|benchmarks/shootout-switch|x86_64|Execution|0.132%|
|benchmarks/shootout-xblabla20|x86_64|Execution|-3.346%|
|benchmarks/shootout-xchacha20|x86_64|Execution|-1.062%|
|benchmarks/spidermonkey|x86_64|Execution|0.034%|

Averages (x64):
|phase|change_factor|
|-|-|
|Compilation|-1.240%|
|Execution|-3.321%|
|Instantiation|1.486%|
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
A negative change factor means clockticks are expected to be reduced by the patch.
|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Compilation|-1.983%|
|benchmarks/blake3-simd|x86_64|Compilation|4.863%|
|benchmarks/bz2|x86_64|Compilation|0.741%|
|benchmarks/intgemm-simd|x86_64|Compilation|0.432%|
|benchmarks/meshoptimizer|x86_64|Compilation|1.810%|
|benchmarks/noop|x86_64|Compilation|0.412%|
|benchmarks/pulldown-cmark|x86_64|Compilation|-3.803%|
|benchmarks/shootout-ackermann|x86_64|Compilation|-1.363%|
|benchmarks/shootout-base64|x86_64|Compilation|2.021%|
|benchmarks/shootout-ctype|x86_64|Compilation|5.066%|
|benchmarks/shootout-ed25519|x86_64|Compilation|-1.554%|
|benchmarks/shootout-fib2|x86_64|Compilation|1.328%|
|benchmarks/shootout-gimli|x86_64|Compilation|-3.708%|
|benchmarks/shootout-heapsort|x86_64|Compilation|-2.336%|
|benchmarks/shootout-keccak|x86_64|Compilation|-0.795%|
|benchmarks/shootout-matrix|x86_64|Compilation|-1.155%|
|benchmarks/shootout-memmove|x86_64|Compilation|6.049%|
|benchmarks/shootout-minicsv|x86_64|Compilation|0.892%|
|benchmarks/shootout-nestedloop|x86_64|Compilation|2.154%|
|benchmarks/shootout-random|x86_64|Compilation|-1.150%|
|benchmarks/shootout-ratelimit|x86_64|Compilation|-5.721%|
|benchmarks/shootout-seqhash|x86_64|Compilation|-0.942%|
|benchmarks/shootout-sieve|x86_64|Compilation|-1.220%|
|benchmarks/shootout-switch|x86_64|Compilation|-0.195%|
|benchmarks/shootout-xblabla20|x86_64|Compilation|-1.808%|
|benchmarks/shootout-xchacha20|x86_64|Compilation|-3.225%|
|benchmarks/spidermonkey|x86_64|Compilation|1.566%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Instantiation|14.789%|
|benchmarks/blake3-simd|x86_64|Instantiation|-6.792%|
|benchmarks/bz2|x86_64|Instantiation|7.797%|
|benchmarks/intgemm-simd|x86_64|Instantiation|54.875%|
|benchmarks/meshoptimizer|x86_64|Instantiation|19.314%|
|benchmarks/noop|x86_64|Instantiation|6.291%|
|benchmarks/pulldown-cmark|x86_64|Instantiation|-0.683%|
|benchmarks/shootout-ackermann|x86_64|Instantiation|9.086%|
|benchmarks/shootout-base64|x86_64|Instantiation|8.598%|
|benchmarks/shootout-ctype|x86_64|Instantiation|5.321%|
|benchmarks/shootout-ed25519|x86_64|Instantiation|-1.979%|
|benchmarks/shootout-fib2|x86_64|Instantiation|2.882%|
|benchmarks/shootout-gimli|x86_64|Instantiation|14.625%|
|benchmarks/shootout-heapsort|x86_64|Instantiation|6.266%|
|benchmarks/shootout-keccak|x86_64|Instantiation|-2.256%|
|benchmarks/shootout-matrix|x86_64|Instantiation|-1.146%|
|benchmarks/shootout-memmove|x86_64|Instantiation|2.591%|
|benchmarks/shootout-minicsv|x86_64|Instantiation|-15.002%|
|benchmarks/shootout-nestedloop|x86_64|Instantiation|2.356%|
|benchmarks/shootout-random|x86_64|Instantiation|-0.031%|
|benchmarks/shootout-ratelimit|x86_64|Instantiation|-0.988%|
|benchmarks/shootout-seqhash|x86_64|Instantiation|22.805%|
|benchmarks/shootout-sieve|x86_64|Instantiation|5.798%|
|benchmarks/shootout-switch|x86_64|Instantiation|5.490%|
|benchmarks/shootout-xblabla20|x86_64|Instantiation|3.263%|
|benchmarks/shootout-xchacha20|x86_64|Instantiation|4.046%|
|benchmarks/spidermonkey|x86_64|Instantiation|0.671%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Execution|-0.843%|
|benchmarks/blake3-simd|x86_64|Execution|0.142%|
|benchmarks/bz2|x86_64|Execution|2.205%|
|benchmarks/intgemm-simd|x86_64|Execution|-0.128%|
|benchmarks/meshoptimizer|x86_64|Execution|-0.003%|
|benchmarks/noop|x86_64|Execution|9.814%|
|benchmarks/pulldown-cmark|x86_64|Execution|-1.530%|
|benchmarks/shootout-ackermann|x86_64|Execution|4.071%|
|benchmarks/shootout-base64|x86_64|Execution|0.085%|
|benchmarks/shootout-ctype|x86_64|Execution|0.046%|
|benchmarks/shootout-ed25519|x86_64|Execution|-0.558%|
|benchmarks/shootout-fib2|x86_64|Execution|0.010%|
|benchmarks/shootout-gimli|x86_64|Execution|0.266%|
|benchmarks/shootout-heapsort|x86_64|Execution|0.027%|
|benchmarks/shootout-keccak|x86_64|Execution|-0.302%|
|benchmarks/shootout-matrix|x86_64|Execution|-0.601%|
|benchmarks/shootout-memmove|x86_64|Execution|-0.028%|
|benchmarks/shootout-minicsv|x86_64|Execution|1.168%|
|benchmarks/shootout-nestedloop|x86_64|Execution|0.272%|
|benchmarks/shootout-random|x86_64|Execution|0.189%|
|benchmarks/shootout-ratelimit|x86_64|Execution|-0.064%|
|benchmarks/shootout-seqhash|x86_64|Execution|0.254%|
|benchmarks/shootout-sieve|x86_64|Execution|0.099%|
|benchmarks/shootout-switch|x86_64|Execution|-0.192%|
|benchmarks/shootout-xblabla20|x86_64|Execution|-2.433%|
|benchmarks/shootout-xchacha20|x86_64|Execution|0.044%|
|benchmarks/spidermonkey|x86_64|Execution|-0.012%|

Averages (x64):
|phase|change_factor|
|-|:-:|
|Compilation|-0.134%|
|Execution|0.444%|
|Instantiation|6.222%|
jlb6740 commented on issue #5060:
@cfallin .. Ok. I think this is in the direction of the format we want. Is it OK to merge this as is? Adding an effect size column and sorting based on that we can do in iteration with another patch? Also .. about that. I kind of like the benchmarks being in the same predictable alphabetical order each table where you don't have to search for where a specific benchmark result is. Maybe we want to add the column but not sort?
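One way to get both properties (a predictable alphabetical order and a visible effect ordering) is to add a rank column without re-sorting. The Python sketch below is only an illustration of that idea; the `rank` column and the tuple layout are assumptions, not what the bot emits. The four values are Compilation results taken from the table in the preceding bot comment.

```python
# (benchmark, percent change) pairs copied from the Compilation table above.
rows = [
    ("benchmarks/shootout-memmove", 6.049),
    ("benchmarks/blake3-scalar", -1.983),
    ("benchmarks/bz2", 0.741),
    ("benchmarks/blake3-simd", 4.863),
]

# Most sped-up (most negative) first, most slowed-down last.
by_effect = sorted(rows, key=lambda r: r[1])

# Rank each benchmark by effect, starting at 1 for the most sped-up.
rank = {name: i + 1 for i, (name, _) in enumerate(by_effect)}

# Keep the table itself in alphabetical order and attach the rank.
alphabetical = sorted(rows, key=lambda r: r[0])

print("|wasm|percent_change|rank|")
print("|-|:-:|:-:|")
for name, pct in alphabetical:
    print(f"|{name}|{pct:.3f}%|{rank[name]}|")
```

A reader can then scan for a benchmark by name and still see where it falls relative to the others.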
cfallin commented on issue #5060:
One final request (sorry!). "Change factor" makes sense to me if it's a pure ratio where 1.000 is no change; but since we're presenting only the delta (subtracting 1, or subtracting `Main_CT` in the numerator, or subtracting 100%, all equivalent), this is a delta, not a factor. It's also now a percent. So, ironically I guess, the original `pct_change` title (can we spell it out as "Percent change" though?) is actually now accurate, now that we've adjusted the formula -- could we go to that? Happy to approve+merge after that!
fitzgen commented on issue #5060:
Can we mark whether the change is statistically significant or not? Otherwise there is no way to know whether to trust it or not.
fitzgen commented on issue #5060:
I think I've communicated this in various one-off meetings over the years, but for posterity, my ideal output would be something like:
```md
# Benchmark Results

<details>
<summary>Methods and Configuration</summary>

* Baseline: `main` at commit a1b2c3
* Comparison: `feature-branch` at commit d4e5f6
* Significance level: 0.01
* Processes: N
* Iterations per process: M
* Engine flags: ...
* Etc...

</details>

## Statistically Significant Results

<table>
  <thead>
    <tr>
      <th>Wasm Input</th>
      <th>Architecture</th>
      <th>Phase</th>
      <th>Effect Size</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>spidermonkey.wasm</code></td>
      <td>x64</td>
      <td>Compilation</td>
      <td>1.03 ± 0.01</td>
    </tr>
    <!-- etc... sorted by largest absolute effect size --->
  </tbody>
</table>

## Statistically Insignificant Results

<details>
<summary>Statistically insignificant results; hidden by default</summary>

<!-- same type of table as above -->

</details>
```
The important bits being:
- Highlighting statistically significant changes
- Hiding statistically insignificant changes behind a `<details>` by default
- Showing the effect size (what you've been calling a change factor) with its confidence interval
FWIW, we designed the `sightglass-analysis` crate to expose reusable functions to compute statistical significance, effect size, and confidence interval: https://github.com/bytecodealliance/sightglass/blob/main/crates/analysis/src/effect_size.rs#L16
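To illustrate what "statistically significant" could mean for a single row, here is a minimal Python sketch. It is not the `sightglass-analysis` implementation, and the function names and sample data are hypothetical. It assumes a normal approximation for the confidence interval of the difference in means (a Student's t interval would be more appropriate for small sample counts) and treats a result as significant when that interval excludes zero.

```python
import math
from statistics import NormalDist, mean, variance


def diff_confidence_interval(baseline, comparison, significance=0.01):
    """Return (difference in means, half-width of its confidence interval).

    Assumption: uses a normal approximation with unpooled variances.
    """
    diff = mean(comparison) - mean(baseline)
    std_err = math.sqrt(
        variance(baseline) / len(baseline) + variance(comparison) / len(comparison)
    )
    z = NormalDist().inv_cdf(1 - significance / 2)
    return diff, z * std_err


def is_significant(diff, half_width):
    """Significant when the interval [diff - hw, diff + hw] excludes zero."""
    return abs(diff) > half_width


# Made-up cycle counts for illustration only.
baseline = [100, 102, 98, 101, 99]
shifted = [110, 112, 108, 111, 109]  # clearly slower than baseline
noisy = [100, 103, 97, 102, 99]      # roughly the same as baseline

for label, samples in (("shifted", shifted), ("noisy", noisy)):
    diff, hw = diff_confidence_interval(baseline, samples)
    verdict = "significant" if is_significant(diff, hw) else "not significant"
    print(f"{label}: {diff:+.2f} ± {hw:.2f} ({verdict})")
```

Rows that come out "not significant" are the ones that would go behind the `<details>` fold in the layout sketched above.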
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
Change factor shows patch effect on x64 if merged compared to current head for main.

Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
A negative change factor means clockticks are expected to be reduced by the patch.

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Compilation|-4.631%|
|benchmarks/blake3-simd|x86_64|Compilation|2.186%|
|benchmarks/bz2|x86_64|Compilation|-0.093%|
|benchmarks/intgemm-simd|x86_64|Compilation|-1.441%|
|benchmarks/meshoptimizer|x86_64|Compilation|-0.102%|
|benchmarks/noop|x86_64|Compilation|-1.682%|
|benchmarks/pulldown-cmark|x86_64|Compilation|2.070%|
|benchmarks/shootout-ackermann|x86_64|Compilation|-4.602%|
|benchmarks/shootout-base64|x86_64|Compilation|-1.885%|
|benchmarks/shootout-ctype|x86_64|Compilation|-2.564%|
|benchmarks/shootout-ed25519|x86_64|Compilation|-3.418%|
|benchmarks/shootout-fib2|x86_64|Compilation|-1.847%|
|benchmarks/shootout-gimli|x86_64|Compilation|-3.495%|
|benchmarks/shootout-heapsort|x86_64|Compilation|6.073%|
|benchmarks/shootout-keccak|x86_64|Compilation|0.739%|
|benchmarks/shootout-matrix|x86_64|Compilation|-8.418%|
|benchmarks/shootout-memmove|x86_64|Compilation|-4.909%|
|benchmarks/shootout-minicsv|x86_64|Compilation|0.607%|
|benchmarks/shootout-nestedloop|x86_64|Compilation|-2.425%|
|benchmarks/shootout-random|x86_64|Compilation|-2.287%|
|benchmarks/shootout-ratelimit|x86_64|Compilation|3.886%|
|benchmarks/shootout-seqhash|x86_64|Compilation|-2.022%|
|benchmarks/shootout-sieve|x86_64|Compilation|-1.459%|
|benchmarks/shootout-switch|x86_64|Compilation|-4.815%|
|benchmarks/shootout-xblabla20|x86_64|Compilation|3.422%|
|benchmarks/shootout-xchacha20|x86_64|Compilation|-3.053%|
|benchmarks/spidermonkey|x86_64|Compilation|-0.316%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Instantiation|-30.403%|
|benchmarks/blake3-simd|x86_64|Instantiation|-3.762%|
|benchmarks/bz2|x86_64|Instantiation|7.500%|
|benchmarks/intgemm-simd|x86_64|Instantiation|-1.067%|
|benchmarks/meshoptimizer|x86_64|Instantiation|-2.640%|
|benchmarks/noop|x86_64|Instantiation|-7.303%|
|benchmarks/pulldown-cmark|x86_64|Instantiation|7.822%|
|benchmarks/shootout-ackermann|x86_64|Instantiation|1.992%|
|benchmarks/shootout-base64|x86_64|Instantiation|7.792%|
|benchmarks/shootout-ctype|x86_64|Instantiation|2.504%|
|benchmarks/shootout-ed25519|x86_64|Instantiation|-4.106%|
|benchmarks/shootout-fib2|x86_64|Instantiation|0.998%|
|benchmarks/shootout-gimli|x86_64|Instantiation|-0.532%|
|benchmarks/shootout-heapsort|x86_64|Instantiation|2.495%|
|benchmarks/shootout-keccak|x86_64|Instantiation|-0.245%|
|benchmarks/shootout-matrix|x86_64|Instantiation|3.993%|
|benchmarks/shootout-memmove|x86_64|Instantiation|-10.591%|
|benchmarks/shootout-minicsv|x86_64|Instantiation|-7.414%|
|benchmarks/shootout-nestedloop|x86_64|Instantiation|2.388%|
|benchmarks/shootout-random|x86_64|Instantiation|-2.975%|
|benchmarks/shootout-ratelimit|x86_64|Instantiation|4.045%|
|benchmarks/shootout-seqhash|x86_64|Instantiation|6.400%|
|benchmarks/shootout-sieve|x86_64|Instantiation|-0.312%|
|benchmarks/shootout-switch|x86_64|Instantiation|-0.480%|
|benchmarks/shootout-xblabla20|x86_64|Instantiation|-2.197%|
|benchmarks/shootout-xchacha20|x86_64|Instantiation|-12.450%|
|benchmarks/spidermonkey|x86_64|Instantiation|1.493%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Execution|0.093%|
|benchmarks/blake3-simd|x86_64|Execution|-0.428%|
|benchmarks/bz2|x86_64|Execution|1.562%|
|benchmarks/intgemm-simd|x86_64|Execution|0.053%|
|benchmarks/meshoptimizer|x86_64|Execution|0.030%|
|benchmarks/noop|x86_64|Execution|4.042%|
|benchmarks/pulldown-cmark|x86_64|Execution|0.612%|
|benchmarks/shootout-ackermann|x86_64|Execution|18.860%|
|benchmarks/shootout-base64|x86_64|Execution|0.292%|
|benchmarks/shootout-ctype|x86_64|Execution|-0.073%|
|benchmarks/shootout-ed25519|x86_64|Execution|0.082%|
|benchmarks/shootout-fib2|x86_64|Execution|-0.094%|
|benchmarks/shootout-gimli|x86_64|Execution|2.252%|
|benchmarks/shootout-heapsort|x86_64|Execution|-0.083%|
|benchmarks/shootout-keccak|x86_64|Execution|-1.394%|
|benchmarks/shootout-matrix|x86_64|Execution|0.033%|
|benchmarks/shootout-memmove|x86_64|Execution|-0.033%|
|benchmarks/shootout-minicsv|x86_64|Execution|-0.293%|
|benchmarks/shootout-nestedloop|x86_64|Execution|0.112%|
|benchmarks/shootout-random|x86_64|Execution|-0.454%|
|benchmarks/shootout-ratelimit|x86_64|Execution|-0.319%|
|benchmarks/shootout-seqhash|x86_64|Execution|-5.423%|
|benchmarks/shootout-sieve|x86_64|Execution|-0.227%|
|benchmarks/shootout-switch|x86_64|Execution|-0.083%|
|benchmarks/shootout-xblabla20|x86_64|Execution|2.013%|
|benchmarks/shootout-xchacha20|x86_64|Execution|-3.499%|
|benchmarks/spidermonkey|x86_64|Execution|0.748%|

Averages (x64):
|phase|change_factor|
|-|:-:|
|Compilation|-1.351%|
|Execution|0.681%|
|Instantiation|-1.372%|
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
Change factor shows patch effect on x64 if merged compared to current head for main.

Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
A negative change factor means clockticks are expected to be reduced by the patch.

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Compilation|-2.264%|
|benchmarks/blake3-simd|x86_64|Compilation|-0.952%|
|benchmarks/bz2|x86_64|Compilation|-1.169%|
|benchmarks/intgemm-simd|x86_64|Compilation|-0.073%|
|benchmarks/meshoptimizer|x86_64|Compilation|-0.470%|
|benchmarks/noop|x86_64|Compilation|1.484%|
|benchmarks/pulldown-cmark|x86_64|Compilation|2.920%|
|benchmarks/shootout-ackermann|x86_64|Compilation|-1.525%|
|benchmarks/shootout-base64|x86_64|Compilation|3.301%|
|benchmarks/shootout-ctype|x86_64|Compilation|-2.706%|
|benchmarks/shootout-ed25519|x86_64|Compilation|1.641%|
|benchmarks/shootout-fib2|x86_64|Compilation|-0.268%|
|benchmarks/shootout-gimli|x86_64|Compilation|6.551%|
|benchmarks/shootout-heapsort|x86_64|Compilation|1.015%|
|benchmarks/shootout-keccak|x86_64|Compilation|-1.059%|
|benchmarks/shootout-matrix|x86_64|Compilation|1.295%|
|benchmarks/shootout-memmove|x86_64|Compilation|-5.877%|
|benchmarks/shootout-minicsv|x86_64|Compilation|1.779%|
|benchmarks/shootout-nestedloop|x86_64|Compilation|0.122%|
|benchmarks/shootout-random|x86_64|Compilation|-0.916%|
|benchmarks/shootout-ratelimit|x86_64|Compilation|2.141%|
|benchmarks/shootout-seqhash|x86_64|Compilation|1.004%|
|benchmarks/shootout-sieve|x86_64|Compilation|0.260%|
|benchmarks/shootout-switch|x86_64|Compilation|3.673%|
|benchmarks/shootout-xblabla20|x86_64|Compilation|-2.125%|
|benchmarks/shootout-xchacha20|x86_64|Compilation|0.693%|
|benchmarks/spidermonkey|x86_64|Compilation|-0.094%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Instantiation|-8.503%|
|benchmarks/blake3-simd|x86_64|Instantiation|11.889%|
|benchmarks/bz2|x86_64|Instantiation|10.633%|
|benchmarks/intgemm-simd|x86_64|Instantiation|-20.765%|
|benchmarks/meshoptimizer|x86_64|Instantiation|15.852%|
|benchmarks/noop|x86_64|Instantiation|3.321%|
|benchmarks/pulldown-cmark|x86_64|Instantiation|-3.284%|
|benchmarks/shootout-ackermann|x86_64|Instantiation|-0.097%|
|benchmarks/shootout-base64|x86_64|Instantiation|-1.808%|
|benchmarks/shootout-ctype|x86_64|Instantiation|-15.002%|
|benchmarks/shootout-ed25519|x86_64|Instantiation|0.630%|
|benchmarks/shootout-fib2|x86_64|Instantiation|-0.610%|
|benchmarks/shootout-gimli|x86_64|Instantiation|15.031%|
|benchmarks/shootout-heapsort|x86_64|Instantiation|25.119%|
|benchmarks/shootout-keccak|x86_64|Instantiation|5.948%|
|benchmarks/shootout-matrix|x86_64|Instantiation|5.849%|
|benchmarks/shootout-memmove|x86_64|Instantiation|2.700%|
|benchmarks/shootout-minicsv|x86_64|Instantiation|-25.233%|
|benchmarks/shootout-nestedloop|x86_64|Instantiation|3.077%|
|benchmarks/shootout-random|x86_64|Instantiation|1.315%|
|benchmarks/shootout-ratelimit|x86_64|Instantiation|0.327%|
|benchmarks/shootout-seqhash|x86_64|Instantiation|-4.867%|
|benchmarks/shootout-sieve|x86_64|Instantiation|3.642%|
|benchmarks/shootout-switch|x86_64|Instantiation|-0.317%|
|benchmarks/shootout-xblabla20|x86_64|Instantiation|3.272%|
|benchmarks/shootout-xchacha20|x86_64|Instantiation|11.555%|
|benchmarks/spidermonkey|x86_64|Instantiation|-2.845%|

|wasm|arch|phase|change_factor|
|-|:-:|:-:|:-:|
|benchmarks/blake3-scalar|x86_64|Execution|-1.544%|
|benchmarks/blake3-simd|x86_64|Execution|-3.132%|
|benchmarks/bz2|x86_64|Execution|2.717%|
|benchmarks/intgemm-simd|x86_64|Execution|0.179%|
|benchmarks/meshoptimizer|x86_64|Execution|0.007%|
|benchmarks/noop|x86_64|Execution|-4.482%|
|benchmarks/pulldown-cmark|x86_64|Execution|-0.157%|
|benchmarks/shootout-ackermann|x86_64|Execution|-19.695%|
|benchmarks/shootout-base64|x86_64|Execution|-0.069%|
|benchmarks/shootout-ctype|x86_64|Execution|0.295%|
|benchmarks/shootout-ed25519|x86_64|Execution|0.082%|
|benchmarks/shootout-fib2|x86_64|Execution|0.060%|
|benchmarks/shootout-gimli|x86_64|Execution|-0.897%|
|benchmarks/shootout-heapsort|x86_64|Execution|0.105%|
|benchmarks/shootout-keccak|x86_64|Execution|-0.574%|
|benchmarks/shootout-matrix|x86_64|Execution|-0.213%|
|benchmarks/shootout-memmove|x86_64|Execution|0.211%|
|benchmarks/shootout-minicsv|x86_64|Execution|0.063%|
|benchmarks/shootout-nestedloop|x86_64|Execution|-0.075%|
|benchmarks/shootout-random|x86_64|Execution|0.101%|
|benchmarks/shootout-ratelimit|x86_64|Execution|0.270%|
|benchmarks/shootout-seqhash|x86_64|Execution|0.039%|
|benchmarks/shootout-sieve|x86_64|Execution|0.001%|
|benchmarks/shootout-switch|x86_64|Execution|0.447%|
|benchmarks/shootout-xblabla20|x86_64|Execution|1.642%|
|benchmarks/shootout-xchacha20|x86_64|Execution|1.066%|
|benchmarks/spidermonkey|x86_64|Execution|0.598%|

Averages (x64):
|phase|change_factor|
|-|:-:|
|Compilation|0.310%|
|Execution|-0.850%|
|Instantiation|1.364%|
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
execution :: cycles :: benchmarks/bz2/benchmark.wasm
Δ = 1513143.28 ± 1231561.44 (confidence = 99%)
main.so is 1.00x to 1.02x faster than commit.so!
[129798904 131817054.80 138038694] commit.so
[127515208 130303911.52 132506288] main.so

instantiation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[170034 196676.08 218436] commit.so
[173046 202642.40 373914] main.so

instantiation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[101652 121869.28 137242] commit.so
[106258 118855.28 133740] main.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[309698748 346607655.52 374942674] commit.so
[301942394 349979492.32 394267312] main.so

compilation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[215154062 234460165.44 267407768] commit.so
[217859454 236701668.96 272852560] main.so

execution :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[9344088 9469022.72 9651890] commit.so
[9343350 9504960.00 9678308] main.so

execution :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[1090353462 1104235121.84 1117828124] commit.so
[1091742374 1107584374.72 1126788480] main.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[7228229446 7366056318.80 7478912292] commit.so
[7210593826 7359829007.52 7506138438] main.so

instantiation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[543920 582452.72 736118] commit.so
[548450 582405.60 752954] main.so
jlb6740 commented on issue #5060:
/bench_x64
jlb6740 commented on issue #5060:
instantiation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[103568 119207.92 135522] commit.so
[106822 126276.88 272912] main.so
instantiation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[170928 190962.88 212534] commit.so
[173412 200371.76 304700] main.so
instantiation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[542924 569039.60 605244] commit.so
[539128 581279.04 757404] main.so
compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[318068526 344151444.88 407384632] commit.so
[306264764 350683370.00 383642122] main.so
compilation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[215556738 235741942.72 275455752] commit.so
[217886770 233020488.80 280170694] main.so
execution :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[127738454 131527589.52 137189426] commit.so
[127496598 130751974.40 133060306] main.so
execution :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[1091080444 1106099439.76 1164561264] commit.so
[1089452668 1099986181.04 1121721354] main.so
compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[7177863870 7295968763.20 7438418366] commit.so
[7180327178 7320723278.88 7475606826] main.so
execution :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[9359830 9498456.32 9801272] commit.so
[9336598 9483072.80 9601184] main.so
jlb6740 commented on issue #5060:
Hi @cfallin @fitzgen, instead of having the current table, how about we just highlight a few of the default benchmarks and print the default message? As has already been suggested, I think any formatting of the output printed in the message should be done in sightglass itself, so we can work on a GitHub markdown format there. Also, this patch increases the number of iterations and parallel processes to help stabilize results, which allows us to close https://github.com/bytecodealliance/wasmtime/pull/5064
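As a rough illustration of the markdown formatting discussed above, the sketch below turns the default text output into a GitHub markdown table. It is only a hypothetical Python prototype: the real formatter would live in sightglass, and `to_markdown`, the column names, and the reading of the bracketed values as min, mean, and max are all assumptions made for this example.

```python
import re

# "execution :: cycles :: benchmarks/bz2/benchmark.wasm"
HEADER = re.compile(r"^(\w+) :: (\w+) :: (\S+)$")
# "[9359830 9498456.32 9801272] commit.so" (assumed to be [min mean max] engine)
SAMPLE = re.compile(r"^\[(\d+) ([\d.]+) (\d+)\] (\S+)$")


def to_markdown(report: str) -> str:
    """Convert default-style benchmark output into a markdown table (hypothetical)."""
    rows = ["|phase|benchmark|engine|min|mean|max|", "|-|-|-|-:|-:|-:|"]
    phase = bench = None
    for raw in report.splitlines():
        line = raw.strip()
        if m := HEADER.match(line):
            # A header line starts a new benchmark entry.
            phase, _event, bench = m.groups()
        elif (m := SAMPLE.match(line)) and phase is not None:
            # A sample line adds one row for the current entry.
            low, mean, high, engine = m.groups()
            rows.append(f"|{phase}|{bench}|{engine}|{low}|{mean}|{high}|")
        # Any other line (for example "No difference in performance.") is skipped.
    return "\n".join(rows)


sample = """\
execution :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[9359830 9498456.32 9801272] commit.so
[9336598 9483072.80 9601184] main.so
"""
print(to_markdown(sample))
```

Keeping the parsing separate from the GitHub action would let the action post whatever the tool prints without reformatting it.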
fitzgen commented on issue #5060:
> Instead of having the current table, how about we just highlight a few of the default benchmarks and print the default message?
Happy with using the default output (and, separately from the GitHub action, with growing a "markdown" output format in the sightglass tool).
Not sure what you mean about "just highlight a few of the default benchmarks". Do you mean just run the `default.suite` set of benchmarks? If so, fine by me.
cfallin commented on issue #5060:
> Will wait for @cfallin approval
Ah, sorry, didn't realize you were waiting for me here too. Yes, it seems fine to me.
One thing that I just realized is that the results come as a comment via your personal GitHub account. I think that we should change that -- we shouldn't have a dependence on one person's account (it's liable to break if you change or delete your account, it's problematic if one day you aren't working on Wasmtime/Cranelift any more, etc). We don't have to do it in this PR but would you be able to create a dedicated bot account for posting these comments, and give the infra a token for that instead (and somehow share the appropriate details with various folks across BA so we always have access)?
jlb6740 edited a comment on issue #5060:
> > Will wait for @cfallin approval
>
> Ah, sorry, didn't realize you were waiting for me here too. Yes, it seems fine to me.
@cfallin It says it is still waiting for the +1. I will use a bot account to send results back. You may need to create that account here though; I'm not sure I have permission.
Last updated: Dec 23 2024 at 12:05 UTC