github-actions[bot] commented on issue #4061:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"
Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
</details>
cfallin commented on issue #4061:
Some Sightglass results:
- Execution time (wallclock speedups, higher is faster):
  - `pulldown-cmark`: no diff
  - `spidermonkey`: 1.01 - 1.02x (99% conf)
  - `bz2`: no diff
  - `meshoptimizer`: 1.16x - 1.19x (99% conf) (!)
  - `blake3-scalar`: no diff
  - `blake3-simd`: no diff
- Compilation time (wallclock speedups, higher is faster):
  - `pulldown-cmark`: no diff
  - `spidermonkey`: 0.89x - 0.95x (99% conf) (I will profile this tomorrow)
  - `bz2`: no diff
  - `meshoptimizer`: no diff
  - `blake3-scalar`: no diff
  - `blake3-simd`: no diff

So a speedup on two benchmarks, SpiderMonkey (1-2%) and meshoptimizer (16-19%); the latter I suspect is due to some opportunities for compare-load or compare-immediate merging in hot blocks (can double-check with hotblocks tool tomorrow).
The compile time regression on SpiderMonkey I'll take a look at and resolve -- we don't want to take that hit. I suspect the multiple-use analysis can be made faster if needed.
cfallin commented on issue #4061:
@abrown I went ahead and reworked the `ValueUseState` analysis for better compile time; Sightglass now shows no difference in compile time with this PR vs baseline for SpiderMonkey (or the other benchmarks). Could you take another look at the new algorithm?
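For readers following along, here is a minimal sketch of what a multiple-use analysis of this kind might look like. It is not the code from this PR: the toy `Inst` type, the `UseState` enum, and `compute_use_states` are hypothetical names for illustration, and the sketch assumes that a value counts as multiply-used either when it has more than one direct use or when it feeds (transitively) a value that is itself multiply-used. It does one counting pass and then a worklist pass, so each value is pushed at most once.

```rust
/// Three-point lattice describing how often a value is used (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum UseState {
    Unused,
    Once,
    Multiple,
}

impl UseState {
    /// Record one more direct use.
    fn inc(self) -> UseState {
        match self {
            UseState::Unused => UseState::Once,
            _ => UseState::Multiple,
        }
    }
}

/// Toy instruction: optionally defines `result` from `args` (value indices).
struct Inst {
    result: Option<usize>,
    args: Vec<usize>,
}

/// Compute a use state for each of `num_values` values.
fn compute_use_states(insts: &[Inst], num_values: usize) -> Vec<UseState> {
    let mut states = vec![UseState::Unused; num_values];
    // def[v] is the index of the instruction defining v, if any.
    let mut def: Vec<Option<usize>> = vec![None; num_values];

    // Pass 1: count direct uses and record definitions.
    for (i, inst) in insts.iter().enumerate() {
        if let Some(r) = inst.result {
            def[r] = Some(i);
        }
        for &a in &inst.args {
            states[a] = states[a].inc();
        }
    }

    // Pass 2: push "Multiple" down into the operands of multiply-used values.
    let mut stack: Vec<usize> = (0..num_values)
        .filter(|&v| states[v] == UseState::Multiple)
        .collect();
    while let Some(v) = stack.pop() {
        if let Some(i) = def[v] {
            for &a in &insts[i].args {
                if states[a] != UseState::Multiple {
                    states[a] = UseState::Multiple;
                    stack.push(a);
                }
            }
        }
    }
    states
}

/// Small example: v0 and v3 are parameters; v2 has two direct users.
fn example_states() -> Vec<UseState> {
    let insts = vec![
        Inst { result: Some(1), args: vec![0] },  // v1 = op v0
        Inst { result: Some(2), args: vec![1] },  // v2 = op v1
        Inst { result: None, args: vec![2, 3] },  // side effect using v2, v3
        Inst { result: Some(4), args: vec![2] },  // v4 = op v2
    ];
    compute_use_states(&insts, 5)
}

fn main() {
    println!("{:?}", example_states());
}
```

In the example, v2 has two direct users, so v1 and v0 become `Multiple` as well even though each has a single direct use; v3 stays `Once` and v4 stays `Unused`.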
Last updated: Dec 23 2024 at 12:05 UTC