Equivalence/Transformation verifier of ISLE rules · cranelift

Hello!
Some time ago I discovered Cranelift, it's a pretty cool project! Yesterday I read some of the code (fixed a good-first-issue at first, maybe I can push it in a few days) and seems you lack some kind of after-transformation verification/equivalence checker (like LLVM does) of your rules and also have seen it's one of your roadmap entries. Thought about it for a bit and came up with a simple solution (that probably was already mentioned..), symbolic representations (which are used everywhere) with a solver with your rules (for the equivalence checker) or just running tests on those symbolic inputs.
What do you think? This doesn't sound too bad to implement (... probably?). Was this already mentioned/worked on? Is this too simple and doesn't answer some needs?
New to the community, hello!

bjorn3 (Mar 08 2023 at 15:37):

GitHub - avanhatt/wasmtime: Standalone JIT-style runtime for WebAssembly, using Cranelift

Standalone JIT-style runtime for WebAssembly, using Cranelift - GitHub - avanhatt/wasmtime: Standalone JIT-style runtime for WebAssembly, using Cranelift

bjorn3 (Mar 08 2023 at 15:38):

ghostway (Mar 08 2023 at 15:53):

Interesting. Doesn't seem like I can contribute to that -- seems like it is very disorganized (with a lot of weird Z3 notation I guess) unless you have some document overviewing it? It seems very close to what I imagined, anyway

Jamey Sharp (Mar 08 2023 at 16:26):

That is very much work-in-progress, but has already discovered previously-unknown bugs. So yes, we agree with you: checking equivalence of ISLE rules is a good idea!

I'd certainly be interested to see your approach to the idea if this is something you want to work on. The level of complexity in that team's implementation (and I think they'd agree that it is quite complex) is, I think, necessary to handle the full suite of ISLE rules. But you might find there's a subset of rules where you can do something simpler or faster, and that could be valuable. You could also explore something like property-based testing or fuzzing, which might be a nice complement to the complete search that an SMT solver like Z3 does. Or you might just have fun experimenting, which I think is worth-while too. In any case I'd encourage you to explore your idea if you want to!

ghostway (Mar 08 2023 at 16:41):

What is the scope of their implementation and why is it complex (the former might answer the latter)?

Chris Fallin (Mar 08 2023 at 16:57):

@ghostway it's complex because it's quite a hard problem! I think you may be missing some context here -- this is an ongoing research collaboration and most of the context is in weekly meetings that we have with @Alexa VanHattum and others. I don't think the repo is intended for public consumption yet necessarily, which is why it doesn't have polished documentation. The intent of the collaboration is to eventually publish something, after which time it'll be upstreamed and much better documented

ghostway (Mar 08 2023 at 17:05):

I see I'm missing a lot of context then! Depending on whether this is something the future publishers would like to share, I'd be happy to come to one of the meetings.
I'm not well versed on these things, just had an idea when I've seen it on your roadmap, so I probably don't have a full understanding of the problem.
Thanks for the warm welcome, anyway!

Chris Fallin (Mar 08 2023 at 23:30):

it's probably up to @Alexa VanHattum how we share the current effort; I suspect when the work is merged upstream and fully publicly documented that might be a good time to jump in though

ghostway (Mar 09 2023 at 06:00):

Sure. Is there a rough date for when that's going to happen? In the meantime I'll see what I can make up of this problem

Alexa VanHattum (Mar 09 2023 at 09:49):

Hi @ghostway ! As other have mentioned, our current implementation is very much a research prototype at this stage; we’ll add comprehensive documentation before merging upstream.

As for scope/complexity: our current focus is ISLE rules on integer types (i8, i16, i32, and i64) for the ARM aarch64 backend (though we’ve also been able to apply our prototype to rules in the midend optimizer and x86 backends). The implementation is complex because ISLE is much more expressive than most compilers’ term rewriting DSLs: lowering rules in particular require combining the semantics of ISLE matches themselves, clif (Cranelift IR) terms, ARM aarch64 instructions, and many external Rust functions from external constructors and extractors. Many of these terms do not have semantics that map directly onto a symbolic representation like SMT, so we have custom encodings. In addition, ISLE rules are polymorphic over clif types, so we need custom logic for type inference and type enumeration (SMT requires static bitvector widths).

In terms of contributing: our meetings are currently focused on the final push toward publication, so while you’re welcome to attend now (send me a DM and I can forward info), the meeting topics will likely be fairly inaccessible without context. By late April/early May we should be in a much better place to onboard new contributors. I also typically give weekly updates at the general Cranelift meetings, which have information on attending here. And as Chris and Jamey have said, there are a lot of cool complimentary approaches someone could take to checking transformation equivalence, that I would also be excited to see happen and would be happy to give input on if that's helpful!

meetings/cranelift at main · bytecodealliance/meetings

Contribute to bytecodealliance/meetings development by creating an account on GitHub.

Cranelift: Rewrite `or(and(x, y), not(y)) => or(x, not(y))` by fitzgen · Pull Request #5676 · bytecodealliance/wasmtime

Thanks again to Souper for pointing this one out! cc @regehr

Cranelift: ISLE: aarch64: `imm12_from_negated_value` rules don't apply for i32, i16 · Issue #5903 · bytecodealliance/wasmtime

Our prototype verifier seems to have found a completeness/performance (not correctness) bug in the handling of some narrow immediate values in aarch64. aarch64 has the following rules that check if...

Jamey Sharp (Mar 09 2023 at 15:54):

cranelift: Fuzz mid-end optimizations · Issue #5967 · bytecodealliance/wasmtime

Feature Currently, the cranelift-fuzzgen fuzz target performs differential fuzzing between our Cranelift interpreter and the native backend for the host. The native backend may be configured to run...

ghostway (Mar 10 2023 at 06:48):

@Alexa VanHattum Hello!
Yes, I've seen some rust stuff in the ISLE which was interesting. Why is it conserved to particular backends? Is clif itself too expressive, or is it a limitation on the optimizations you use?
I'll see what I can do about both meetings. Thanks for the details!
Yeah, fuzzing the interpreter does sound good as well (but then, why are you testing on particular backends?)

ghostway (Mar 10 2023 at 13:20):

Aha, I see. ISLE is also for emitting, not only transforming the graph. That's actually very interesting, like LLVM's tablegen but for transforming as well (kind of, not exactly)...

ghostway (Mar 10 2023 at 16:38):

About your idea @Jamey Sharp, it wouldn't be enough to add an optimize function to TestFileCompiler and just do self.ctx.optimize(isa), right? So how would that be done?

Jamey Sharp (Mar 10 2023 at 17:14):

Jamey Sharp (Mar 10 2023 at 19:24):

And to clarify, it isn't that I want to fuzz the interpreter; we're using the interpreter as a reference implementation of CLIF semantics, so to some extent it's "trusted". Instead I want to check that this reference implementation says that a CLIF program after mid-end optimization still means the same thing that it did before optimization.

Jamey Sharp (Mar 10 2023 at 20:07):

@ghostway, I've updated issue 5967 with more detail about how I think that would work. Feel free to ask more questions either here or in issue comments! I'd prefer to discuss that idea in issue comments so there's a record of any questions and answers there.

ghostway (Mar 10 2023 at 20:11):

Thanks for the clarifications on the issue. I'll try to work on it tomorrow maybe.

ghostway (Mar 10 2023 at 20:14):

About the clarifications, yes it's what I did. Just need to add that boolean (if I understood exactly what you wanted. We will see in the eventual pr) :)

ghostway (Mar 11 2023 at 16:57):

@Jamey Sharp I finally got to this, and as a sanity check I switched it to always use the optimized one (with the interpreter). I added a to_optimized method to the TestCase (creates a new instance of TestCase) and I am getting Trap(User(HeapMisaligned)) just after starting. Do you have an idea why would that happen?

Afonso Bordado (Mar 11 2023 at 19:34):

Hey @ghostway , the fuzzgen library sometimes creates CLIF functions that purposefully trap. (this should be less than 1% of the functions that it generates).

What we do in the fuzzer is that if it traps in the interpreter we don't actually run it on the host. In this case It is probably a good idea to ignore the traps.

Or, if we are comparing the interpreter vs the interpreter we can record the original test case results (including the exact trap) and compare it against the optimized version, that way we can ensure that we don't change traps via optimizations.

ghostway (Mar 11 2023 at 20:19):

@Afonso Bordado Hello,
I see, but you already continue if the result of the first run is a trap; so it wouldn't even get to the second run
Also, is compiling and optimizing for the interpreter too expensive (even for small compiles like testcases)? I don't see a pretty way to inter-changeably optimize or compile code without inflating the code considerably

bjorn3 (Mar 11 2023 at 20:22):

@ghostway I think you replied in the wrong topic. If you click the button to edit your message you can edit the topic name of your message to the right topic.

Afonso Bordado (Mar 12 2023 at 00:35):

Well, the first step is always running the interpreter, so after that we can branch and do two completely different things.

It is a bit more code, but it should be fairly clean to split after the first interpreter run.

IIRC running the compiler is the most expensive step, I think I measured it once and it was like 50% of the fuzzer time (This was a long time ago, so I might be misremembering some stuff). For optimizations I haven't measured it recently and we had a bunch of changes in the optimizations that we do.

We shouldn't need to run the compile step for the interpreter, just the optimizations.

ghostway (Mar 12 2023 at 06:19):

Yes, but having two or more (because of lifetimes), possibly uninitialized, variables is not what I'd like to see in rust code. I didn't want to use closures, but they might just be the prettiest way you can do this (I don't want to duplicate that for loop). It still crashes for me, anyway. Let's talk over code and not hypotheticals, maybe this afternoon/evening I'll commit this (morning for me right now)

Stream: cranelift

Topic: Equivalence/Transformation verifier of ISLE rules

ghostway (Mar 08 2023 at 14:04):