fitzgen commented on Issue #4:
Thanks for the comments @jlb6740! Replies inline below.
Evaluating the speedup of a proposed performance optimization highlights the need for developers to be able to quickly run and maybe selectively run a benchmark that targets aspects of the compiler/runtime that they are targeting. This means being able to run offline before pushing a patch for review. This points to a need for a tight coupling between the distribution of wasmtime/cranelift and the runner in addition to maybe a separate offline/remote/nightly runner.
Yep, and I would expect that they would share 99% of their code. The main difference would be that a developer running locally would supply two commit hashes or branches to compare (e.g.
main
andmy-optimization-branch
) and the runner would build and run benchmarks for both, where as in the offline/remote/nightly automation the runner would only build and run the latestmain
version of wasmtime.Testing performance regressions can be done in tiers, a much faster run performs a smoke test that's part of patch merger and a separately the more comprehensive daily test as mentioned.
Yes, exactly. This is the point of having multiple input workloads for our benchmark candidates: one small workload to get a quick smoke test, and one larger workload for in-depth data.
Maybe this is second tier reporting and not default, but I would add a breakdown of wall time into user time and system time. Depending on the benchmark, kernel time can be dominant and so is good to factor in sooner rather than later during offline analysis.
Good idea, this is not only something we can record in the runner, but also something we can classify candidate benchmark programs by in our PCA.
Also for candidate benchmarks, some of the requirements may force us from just taking a code as is without modification (adding means for more than one associated input for example) that may not be so critical.
Yeah, ideally we would only be need to add
bench.start
andbench.end
calls. As a practicality, it may make more sense to err on the side of avoiding benchmark programs that aren't well-factored from their workloads, because otherwise updating such programs in the future may become a real chore.Real widely used programs is certainly desirable, but I personally don't have a problem with micro benchmarks and synthetic benchmarks if they serve their purpose. Microbenchmarks can be great at stress testing to the limits certain aspects of VM that you simply aren't going to get from a real world program. Furthermore, with real world programs you aren't always enlightened into what makes them tick, what aspects are you really timing and testing. With micro benchmarks it is no secret what you are testing and you can know when you should anticipate change in performance.
Regarding not knowing what makes the program tick: by requiring source code for candidate programs, we can hopefully avoid this to some degree. Agreed that micro-benchmarks are easier to quickly grok, though.
Micro-benchmarks have their place, and we can include some as we find blind spots and want to target specific things (like Lars also mentioned) but I think it makes sense, especially as we create the initial set of candidates, to focus on real programs.
I agree with the drawback of not being able to add certain benchmarks because they assume javascript, but I will say there may be somethings we can harvest from those benchmarks with a little work or no work at all. JetStream2 for example has some benchmarks such as gcc_loops that can be compiled directly to wasm without modification and in fact is capable of being auto-vectorization. I've used it myself when looking at SIMD performance of Wasmtime but there maybe others there.
Ah, great! Yeah, we can definitely include that as a candidate then.
The main thing I'd touch on here is the runner. Can and should we evolve Sightglass's benchmark runner. My initial thoughts here are that I think ultimately it may be best to have a Sightglass 2.0. The way the driver is setup now it is really not intended to be tightly coupled with any one VM in order to allow adding any workload without criteria and any runtime. The original code is written in rust and perhaps much of this can be reused, though then again some of the ideas that @abrown has mentioned may suggest gutting this or may suggest a completely fresh start with only Wasmtime and the requirements of this RFC in mind.
Yeah, it has become clear to me that we can replace just the runner with a 2.0 version, and then make it spit out data in a compatible format that the rest of sightglass expects. This way we still keep the server, UI, and graphs and all that working. I'll start prototyping this on top of @abrown's prototype of the new runner.
Also, not all workloads simply measure performance by time spent. Workloads can of course run for a fixed amount of time and then report the amount work done in terms of some metric or score. Other workloads have already put in place timers and print out the time of the important code for you. Do we want our infrastructure to allow workloads/benchmarks to report their own time where we can (through script) parse that data and still report it consistent with all the other benchmark report out?
The downside with workloads like "run X for Y amount of time" is that it is both harder to verify that the program produced the correct result (and we don't want to measure a buggy/broken program!) and counters like instructions retired are not semi-deterministic anymore, which makes comparing executions more difficult. Additionally, it is a different kind of program that the runner will have to interface with and understand how to run and report measurements for, which is additional engineering work for us. Ultimately, I don't think we want to add these kinds of benchmark programs to the suite.
@penzin pointed me to https://github.com/binji/raw-wasm; since you suggest here that we add benchmarks that uses some Wasm-related tools. https://github.com/binji/raw-wasm could provide some inputs to those tools.
abrown edited a comment on Issue #4:
@penzin pointed me to https://github.com/binji/raw-wasm; since you suggest here that we add benchmarks that use some Wasm-related tools. https://github.com/binji/raw-wasm could provide some inputs to those tools.
fitzgen commented on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [ ] @alexcrichton
- [ ] @cfallin
- [ ] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [ ] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [ ] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
alexcrichton edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [ ] @cfallin
- [ ] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [ ] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [ ] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
tschneidereit edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [ ] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [ ] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [ ] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
tschneidereit edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [ ] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [ ] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [x] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
tschneidereit edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [x] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [ ] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [x] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
tschneidereit edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [x] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [x] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [x] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [ ] @lars-t-hansen
- [ ] @julian-seward1
fitzgen commented on Issue #4:
Entering Final Call Period
Once any stakeholder from a different group has signed off, the RFC will move into a 10 calendar day final comment period (FCP), long enough to ensure that other stakeholders have at least a full business week to respond.
The FCP will end on January 22nd.
lars-t-hansen commented on Issue #4:
I appear unable to approve this, but FWIW I have no objections.
tschneidereit edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [x] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [x] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [x] @abrown
- [ ] @jlb6740
Mozilla
- [ ] @yurydelendik
- [x] @lars-t-hansen
- [ ] @julian-seward1
tschneidereit commented on Issue #4:
I appear unable to approve this, but FWIW I have no objections.
Yeah, this is unfortunately tricky to fix without an rfc-bot doing it for us. For now, using gh reviews seems like the best path forward, with the champion setting the check marks. Apologies for the rough edges here!
yurydelendik edited a comment on Issue #4:
Motion to finalize with a disposition to merge
I'm proposing that we merge this RFC and I'll add the
motion-to-finalize
anddisposition-merge
tags to the PR. The PR has received numerous pieces of positive feedback and we have milestones for incremental implementation. Prototype work on the implementation has already begun.Stakeholders sign-off
Tagging all employees of BA-affiliated companies who have committed to the wasmtime/cranelift repo in the last three months plus anyone who has given feedback on this PR as a stakeholder.
Fastly
- [x] @alexcrichton
- [x] @cfallin
- [x] @sunfishcode
- [ ] @iximeow
- [X] @fitzgen
- [ ] @pchickey
- [ ] @fst-crenshaw
- [x] @tschneidereit
IBM
- [ ] @uweigand
Intel
- [x] @abrown
- [ ] @jlb6740
Mozilla
- [x] @yurydelendik
- [x] @lars-t-hansen
- [ ] @julian-seward1
fitzgen commented on Issue #4:
The FCP has elapsed without any objections being raised. I'm going to merge this. Thanks everybody!
Last updated: Jan 24 2025 at 00:11 UTC