Hi all -- just wanted to leave a note here that (as of yesterday) I made a Motion to Finalize the RFC regarding our instruction-selector DSL. Here is the link: https://github.com/bytecodealliance/rfcs/pull/15. @fitzgen (he/him) gave an excellent presentation on the latest progress yesterday as well (link).
Merging this RFC would mean that we've decided to move forward with defining our instruction-selection backends in Cranelift with the DSL we've been prototyping. The details of the bindings and definitions can of course be tweaked over time, just as for any other part of the codebase.
So far it has a few approvals, but I wanted to make sure folks saw this -- especially if you've participated in the conversations so far and are OK with this direction, please head over to the RFC and give your final approval! Or, if not, let us know what issues still remain.
cc @Benjamin Bouvier @Anton Kirilov @Sam Parker @Johnnie Birch @Ulrich Weigand especially, who have participated actively in the last several discussions in our meetings -- thanks for the time and patience on this :-)
Hi! I was absent from the meeting because of a public holiday here. I was a bit curious about the next steps, especially around the approach: does it mean we're going to have the new system checked in to the wasmtime codebase soon, and then start incrementally rewriting the lowering code to use it (as opposed to developing the new system on the side and switching the backends over to it entirely in one go)? In the latter case, do we have benchmarks running on a very frequent basis, to make sure that compile times don't regress?
Hey Ben -- sorry, didn't realize the date was a conflict for .fr! (FWIW this seems like a totally valid reason to ask for agenda items to move in the future, if important folks can't make it :-) )
The tl;dr of my answer is (i) gradual, not all-at-once switch, and (ii) yes, we'll benchmark, and avoid any perf regressions. In more detail:
The way the integration branch works now is that the generated code from the patterns is invoked first, and if it returns None, then the original handwritten code is invoked. The idea is that this will allow us to land a lot of little PRs that move over a few instruction lowerings at a time, all the while keeping the backend working; this is less risky than trying to match behavior exactly and switch over all at once, especially if the backend is a moving target.
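To make that shape concrete, here's a minimal, self-contained sketch of the "generated first, handwritten fallback" flow. All of the names and types here (Ctx, Inst, generated_lower, handwritten_lower) are stand-ins I made up for illustration, not the actual integration-branch API:

```rust
// Hypothetical stand-ins for Cranelift's lowering context and instruction
// handle; the real types in the integration branch differ.
struct Ctx;
struct Inst;

/// Stand-in for the ISLE-generated entry point: returns Some(()) when one
/// of the generated rules matched and emitted machine code, None otherwise.
fn generated_lower(_ctx: &mut Ctx, _inst: &Inst) -> Option<()> {
    None // pretend no generated rule matches this instruction yet
}

/// Stand-in for the existing handwritten lowering code.
fn handwritten_lower(_ctx: &mut Ctx, _inst: &Inst) {
    println!("lowered by the handwritten backend");
}

/// The generated rules get the first shot; if none match, we keep the old
/// behavior, so a partially migrated backend still lowers everything.
fn lower(ctx: &mut Ctx, inst: &Inst) {
    if generated_lower(ctx, inst).is_none() {
        handwritten_lower(ctx, inst);
    }
}

fn main() {
    let mut ctx = Ctx;
    lower(&mut ctx, &Inst);
}
```

The point is just that each migration step is additive: a new generated rule takes over one case, and everything else falls through to the existing code unchanged.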
Performance should be at parity or faster, by construction: we start by generating code that is the same as what we write by hand today, then improve from there. Actually, in the current code we're already doing a few tricks that the handwritten backends do not, like matching directly on the InstructionData rather than on the opcode first and then extracting the data separately (see the sketch below). In the future we can algorithmically optimize the order of matching and share matching effort across rules; @fitzgen had a great idea yesterday, for example, to do a pass that reorders matches to increase the amount of work shared between rules (isle#11). The cool thing about ideas like that is that it's now practical to make such changes across all backend code; it would've been very hard or impossible to "transpose" the whole backend or turn it inside out for performance with the handwritten methodology.
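For the InstructionData point, here's a rough illustration of the two matching styles using simplified stand-in types (not Cranelift's real Opcode/InstructionData definitions): opcode first with a separate operand extraction, versus a single match over the whole instruction-data variant:

```rust
#[derive(Clone, Copy)]
enum Opcode {
    Iadd,
    Isub,
}

// Very simplified stand-in for InstructionData.
enum InstData {
    Binary { opcode: Opcode, args: [u32; 2] },
}

impl InstData {
    fn opcode(&self) -> Opcode {
        match self {
            InstData::Binary { opcode, .. } => *opcode,
        }
    }
    fn args(&self) -> [u32; 2] {
        match self {
            InstData::Binary { args, .. } => *args,
        }
    }
}

// Handwritten style: classify by opcode first, then extract operands
// in a separate step.
fn lower_by_opcode(data: &InstData) {
    match data.opcode() {
        Opcode::Iadd => {
            let [a, b] = data.args();
            println!("add v{a}, v{b}");
        }
        Opcode::Isub => {
            let [a, b] = data.args();
            println!("sub v{a}, v{b}");
        }
    }
}

// Generated style: one match on the full variant binds the opcode and
// operands together, avoiding the second lookup.
fn lower_on_inst_data(data: &InstData) {
    match data {
        InstData::Binary { opcode: Opcode::Iadd, args: [a, b] } => {
            println!("add v{a}, v{b}");
        }
        InstData::Binary { opcode: Opcode::Isub, args: [a, b] } => {
            println!("sub v{a}, v{b}");
        }
    }
}

fn main() {
    for data in [
        InstData::Binary { opcode: Opcode::Iadd, args: [0, 1] },
        InstData::Binary { opcode: Opcode::Isub, args: [2, 3] },
    ] {
        lower_by_opcode(&data);
        lower_on_inst_data(&data);
    }
}
```

The generated code can go further still (e.g. reordering and sharing matches across rules, as in isle#11), but even the one-step match avoids the redundant opcode-then-operands lookup that the handwritten style tends to do.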
To be sure of this, we will benchmark compile time when we put up the initial PR that brings in the framework and the first few lowering rules, and we will not merge until any slowdown is fixed. As we move more code over, we expect compile time to remain the same or get faster, but it would be nice to have benchmarks for that too. I don't know if we want to go to the length of requiring contributors to benchmark manually in every PR -- that seems extreme when we don't have such a requirement for other changes, and hopefully an initial benchmark plus the argument that the generated code is the same or better is enough -- but I'm curious what you and others think about that.
To some degree, "continuous benchmarking" is the domain of RFCs #3 and #4, in which we agreed we wanted infrastructure like this; I haven't kept up with the progress on implementing these but I know we at least allocated a CI machine for it earlier in the year. @Johnnie Birch do you have any updates on the continuous benchmarking infrastructure?