I'm curious what algorithm ISLE uses to check patterns for coverage and compile to Rust expressions. I know Rust uses an extended version of https://dl.acm.org/doi/10.1145/1411304.1411311 and http://moscova.inria.fr/~maranget/papers/warn/warn.pdf (extended to handle ranges and uninhabited types), but that doesn't handle non-linear pattern variables. Was there a paper you used as a guide or is it a home-grown algorithm?
We actually don't have a "completeness" check; it's valid for a constructor to be partial and to fail to match (in which case it returns None). We do have an "overlap" check however that @Trevor Elliott built here: https://github.com/bytecodealliance/wasmtime/pull/4906; he may be able to say more about how it corresponds to the literature/common algorithms!
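For concreteness, here is a rough sketch of the shape a partial constructor compiles down to; the type and function names below are invented for illustration and are not the actual generated identifiers.

```rust
// Illustrative stand-ins; the real generated code uses Cranelift's own types
// and a context trait, but the Option-returning shape is the point.
struct Inst(u8);
struct InstOutput;

// A partial term compiles to a function that tries each rule in priority
// order and returns None if no rule matches.
fn constructor_lower(inst: &Inst) -> Option<InstOutput> {
    if inst.0 == 0 {
        // One rule's pattern matched; evaluate its right-hand side.
        return Some(InstOutput);
    }
    // ... further (lower-priority) rules would be tried here ...
    None // no rule matched: the partial constructor fails
}

fn main() {
    // The lowering driver typically turns this None into a loud panic
    // ("no rule matched") rather than a silent miscompile.
    assert!(constructor_lower(&Inst(1)).is_none());
}
```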
originally it used an algorithm described in a paper from Xavier Leroy (https://xavierleroy.org/bibrefs/Leroy-ZINC.html), but @Jamey Sharp realized that it would be significantly less computationally expensive to check each rule in a decl group against every other rule.
the check itself comes down to a pretty simple comparison of the patterns being matched: if they match the same known constructor then we recursively check each argument. if we make it all the way through the arguments without finding a pattern that differs between the two rules, we raise an error. this is somewhat complicated by extractors defined in Rust, as there's logic that we can't inspect in the ISLE compiler. originally I treated these as wildcards unless both rules used them, in which case the check would recurse. I believe that Jamey might have come up with some better handling for those when he worked through the changes for the trie_again IR.
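A minimal sketch of that pairwise check under the simplifying assumptions described above (this is not the actual ISLE compiler code; the Pattern type and rule names are invented for illustration):

```rust
// Simplified pattern representation for the sketch.
enum Pattern {
    Wildcard,                       // `_` or an unconstrained binding
    Const(i64),                     // integer literal
    Ctor(String, Vec<Pattern>),     // known constructor with sub-patterns
    External(String, Vec<Pattern>), // extractor defined in Rust (opaque logic)
}

/// Conservatively decide whether two patterns can match the same input.
fn overlaps(a: &Pattern, b: &Pattern) -> bool {
    use Pattern::*;
    match (a, b) {
        // A wildcard overlaps with anything.
        (Wildcard, _) | (_, Wildcard) => true,
        (Const(x), Const(y)) => x == y,
        // Same known constructor: overlap only if every argument pair overlaps.
        (Ctor(n1, args1), Ctor(n2, args2)) => {
            n1 == n2 && args1.iter().zip(args2).all(|(p, q)| overlaps(p, q))
        }
        // Different known shapes can never match the same value.
        (Ctor(..), Const(..)) | (Const(..), Ctor(..)) => false,
        // Both rules use the same external extractor: recurse into arguments.
        (External(n1, args1), External(n2, args2)) if n1 == n2 => {
            args1.iter().zip(args2).all(|(p, q)| overlaps(p, q))
        }
        // Otherwise an external extractor hides arbitrary Rust logic, so
        // treat it as a wildcard.
        (External(..), _) | (_, External(..)) => true,
    }
}

/// Check each rule in a decl group against every other rule.
fn check_decl_group(rules: &[(&str, Pattern)]) {
    for (i, (name_a, pat_a)) in rules.iter().enumerate() {
        for (name_b, pat_b) in &rules[i + 1..] {
            if overlaps(pat_a, pat_b) {
                println!("error: rules {name_a} and {name_b} overlap");
            }
        }
    }
}

fn main() {
    use Pattern::*;
    let rules = [
        ("iadd_with_zero", Ctor("iadd".into(), vec![Wildcard, Const(0)])),
        ("iadd_general", Ctor("iadd".into(), vec![Wildcard, Wildcard])),
    ];
    check_decl_group(&rules); // reports that the two rules overlap
}
```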
Yeah, I still basically treated external terms as wildcards unless they match syntactically. There's also complexity due to equality constraints between terms. However, the "trie again" representation made dealing with and-patterns much easier, if I recall correctly. And there's an additional trick where if we can prove that a higher-priority rule matches every input that a lower-priority rule does then we can report that the lower-priority rule is unreachable, which I added as an extension of the previous definition of "overlap".
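A similarly hedged sketch of that unreachability extension (again invented for illustration, not the real implementation): a higher-priority rule makes a lower-priority rule unreachable if its pattern matches at least every input the lower-priority pattern matches.

```rust
// Simplified pattern representation, as in the previous sketch.
enum Pattern {
    Wildcard,
    Const(i64),
    Ctor(String, Vec<Pattern>),
}

/// Does pattern `a` match every input that pattern `b` matches?
fn subsumes(a: &Pattern, b: &Pattern) -> bool {
    use Pattern::*;
    match (a, b) {
        // A wildcard matches everything, so it subsumes any pattern.
        (Wildcard, _) => true,
        // A more specific pattern cannot subsume a wildcard.
        (_, Wildcard) => false,
        (Const(x), Const(y)) => x == y,
        // Same constructor: each argument of `a` must subsume the
        // corresponding argument of `b`.
        (Ctor(n1, args1), Ctor(n2, args2)) => {
            n1 == n2 && args1.iter().zip(args2).all(|(p, q)| subsumes(p, q))
        }
        _ => false,
    }
}

fn main() {
    use Pattern::*;
    // A low-priority rule matching (iadd _ 0) is unreachable if a rule
    // matching (iadd _ _) exists at a higher priority, because the latter
    // matches strictly more inputs.
    let high = Ctor("iadd".into(), vec![Wildcard, Wildcard]);
    let low = Ctor("iadd".into(), vec![Wildcard, Const(0)]);
    println!("lower-priority rule is unreachable: {}", subsumes(&high, &low));
}
```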
I put a lot of thought into the algorithm for building the Rust match tree as well; it is not based on any paper, but is based on my past experience with combinatorial search in general. I'd be interested in talking about it more at some point.
We actually don't have a "completeness" check; it's valid for a constructor to be partial and to fail to match (in which case it returns None)
That's concerning. Does that mean lowering could fail due to IR that is not matched by any pattern?
Yes, and in practice we have a bunch of them.
Here's a list of all the instructions that the Cranelift fuzzer is not allowed to generate. And there are even more than that: for example, vector types whose width is not 128 bits are unimplemented by most backends, along with some other instructions that the fuzzer doesn't generate.
No more concerning than any other part of the compiler that has not been proved to be total :-)
It's historically a result of the "gradual migration" approach that we took with ISLE's design: we wanted to be able to move over lowering patterns that override the legacy code, then eventually delete the legacy code. Now that everything is moved over (has been for two years or so) it would be a reasonable next step to add a "total" mode to decls and check coverage; but that's a large project (you or anyone else is welcome to work on it!)
In the meantime, because we have fuzzing (as Afonso mentions above, at the CLIF level and also at the Wasm level), we tend to catch "holes" pretty quickly; and the failure mode is loud and obvious in any case (a compiler panic rather than a miscompile)
There are already limited cases where we prove that an ISLE constructor is total and tweak the code we generate for it accordingly, though it's currently only when there's a catch-all low-priority rule. There might be other situations where we could generate better pattern matches if we proved that some fall-through case is unreachable. (I think I left a comment somewhere in the serialize module about that, even.) And we could do better error reporting if we synthesized a rule for each missing case and could tell the user exactly what pattern failed to match; I think Trevor tried to do that a couple years ago but it was more complicated than it seemed at first glance.
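As a hypothetical illustration of that first point (the names and bodies below are made up, not the actual generated code): a term the compiler can prove total, e.g. because a catch-all lowest-priority rule exists, can compile to a function that returns its result directly, whereas a partial term has to thread an Option through every caller.

```rust
struct Ty(u32); // stand-in for the matched value

// Partial term: no rule is guaranteed to fire, so callers get an Option.
fn partial_lower(x: &Ty) -> Option<u32> {
    if x.0 == 0 {
        return Some(0); // rule for a specific case
    }
    None // fall through: no rule matched
}

// Total term: a catch-all rule guarantees a result, so the Option (and the
// caller-side unwrap/panic) can disappear from the generated code.
fn total_lower(x: &Ty) -> u32 {
    if x.0 == 0 {
        return 0; // higher-priority rule for a specific case
    }
    1 // catch-all lowest-priority rule
}

fn main() {
    assert_eq!(partial_lower(&Ty(5)), None);
    assert_eq!(total_lower(&Ty(5)), 1);
}
```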