mmcloughlin opened issue #7549:
I was musing about whether it would be possible for us to generate inputs to ISLE lowering that would exercise all code paths. So it occurred to me to check what coverage the cranelift unit/integration tests already have.
The line coverage reports suggest high coverage, but there are some gaps that might be worth discussion.
Coverage Generation
I found the
taiki-e/cargo-llvm-cov
tool to be very easy to use for coverage report generation.In the
cranelift
directory:cargo llvm-cov --disable-default-ignore-filename-regex --ignore-filename-regex='cargo/' --html
By default the
llvm-cov
tool won't report coverage for the ISLE generated files. The--disable-default-ignore-filename-regex
was necessary to disable this default behavior, and then--ignore-filename-regex='cargo/'
ignores coverage from external dependencies.Results
High-level coverage metrics for the ISLE generated files (columns are function/line/region):
![Screen Shot 2023-11-15 at 8 00 46 PM](https://github.com/bytecodealliance/wasmtime/assets/7133685/35a186da-045a-4847-9d16-cc8c2d04dbe3)
There's a lot of uncovered code to sift through, a lot of which is clearly not a concern (including uncovered
unreachable!
statements). But there seem to be some cases in here that might be worth looking into. For example, inconstructor_to_amode_add
:![Screen Shot 2023-11-15 at 8 02 50 PM](https://github.com/bytecodealliance/wasmtime/assets/7133685/3111469c-fa32-4678-9b71-931ccd722bfd)
Questions
- Are any of these coverage gaps interesting, and worth adding tests for?
- Are these gaps covered by fuzzing? (related #6293)
- Is 100% line coverage actually a reasonable goal here? What is the right metric? (related #1151)
cc @jameysharp @fitzgen
mmcloughlin commented on issue #7549:
To answer the fuzzing question, here's a recent report:
It appears the
to_amode_add
case above is covered by the fuzzer. I haven't looked any deeper into the differences though.(Fuzzing report from https://introspector.oss-fuzz.com/project-profile?project=wasmtime)
alexcrichton commented on issue #7549:
Cranelift I think gets a few primary sources of coverage via our testing, but they're unfortunately not all in the same place:
- Wasmtime-based fuzzing (e.g. the
differential
fuzzer)- Cranelift-based fuzzing (the
cranelift-fuzzgen
fuzzer)- Cranelift's
filetest
-based test suite- Wasmtime's own test test suite which includes the upstream WebAssembly spec test suite
If something isn't covered by any of those then I think it'd be good to enhance. That being said while it would be nice for at least one of those suites to cover everything it's probably not the most practical. I suspect the best place to add new tests would be the
filetest
-based test suite of Cranelift.IMO coverage of certain files is more important than others. For example I think the lowering rules are a great showcase of what should ideally be all covered. Runtime bits in Wasmtime I also think should all hopefully be covered via one way or another. I haven't ever found a good way to manage things like
unreachable!()
which discounts from looking like you're entirely covered but you're still practically covered.
fitzgen commented on issue #7549:
As I mentioned offline, I think the important part for ISLE-generated code isn't reaching 100% line coverage, but 100% rule RHS coverage. If we do the former, we will also do the latter, but the latter is what is important IMO.
I could imagine emitting some sort of probes/counters in the ISLE-generated code to track this kind of thing.
Even better would be a test case generator that takes an ISLE lowering rule and creates a clif filetest and we could run that generator on each of our backends' complete rule sets, as you were musing about, to get 100% rule coverage. Big plus one to that idea from me!
Last updated: Nov 22 2024 at 16:03 UTC