cfallin opened PR #3560 from isle-docs
to main
:
This documentation provides details for all of the ISLE language
features, and detailed rationale for why many of them are designed in
the way that they are. It is hopefully both a reasonable tutorial and
reference for someone looking to understand the DSL.Note that this documentation is separate from and orthogonal to the
work to document the Cranelift bindings and integration work that
@fitzgen has covered well in #3556. This document can link to that one
and vice-versa once they are both in-tree.In addition to @fitzgen I'm picking a few folks for reviewers who would
likely be consumers of this doc; please do let me know if there are holes
or not-clear parts anywhere!<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested abrown for a review on PR #3560.
cfallin requested alexcrichton for a review on PR #3560.
cfallin requested fitzgen for a review on PR #3560.
cfallin updated PR #3560 from isle-docs
to main
.
avanhatt submitted PR review.
avanhatt submitted PR review.
avanhatt created PR review comment:
nit: the point about "until no more rules are applicable" isn't necessarily a characteristic of term rewriting systems in general (many systems have rules that can apply an infinite number of times, like
a -> -(-a)
. I'd suggest leaving that part out or replacing with something like "until the term meets some application-specific end condition".
avanhatt created PR review comment:
Could add here that programs don't need to consider the complex tree when changing a single rule, since it is auto-generated by the ISLE compiler.
avanhatt created PR review comment:
I might add here that term rewriting allows a nice separation between specifying when rules are possible and when they are profitable/desired (since not all term rewriting systems choose to apply more specific rules first).
avanhatt created PR review comment:
nit: closing paren
avanhatt created PR review comment:
Can all ISLE terms/expressions operate on either CLIF or the machine dependent IR? That is, can we think of rules are operating on the language that is the union of the input and output languages? Or does something prevent compiler writers from adding a term that rewrites a machine dependent IR term back into a higher level term?
avanhatt created PR review comment:
Give example?
avanhatt created PR review comment:
And boolean constants, as below?
avanhatt created PR review comment:
This is cool!
Does this mean
(A 1 _) -> ...
and(A 2 _) -> ...
would share the function created for(A ...)
?
avanhatt created PR review comment:
Are wildcards the same as symbolic variables here? Sorry if this if a CLIF distinction I'm just unaware of!
avanhatt created PR review comment:
Again, this is system/application specific.
avanhatt created PR review comment:
Maybe just me, but I don't follow the sentence starting with "Here...". Does here mean in the previous generic instruction selection example, or in ISEL?
avanhatt created PR review comment:
Can wildcards be used as constructions if they are nested? I.e.
(A (_ 1 x))
?
avanhatt created PR review comment:
Could be fun here to add a footnote that some systems (e.g., equality graphs in theorem proving) support applying multiple rules "at once" and only choosing the best one after exploring many possible results. Sadly, I'm not sure that there are any more general-audience-friendly write-ups of this: https://cseweb.ucsd.edu/~lerner/papers/popl09.html
avanhatt created PR review comment:
Could call out here that
symbolic
is more special than the other primitives, since it's not a construct in the language itself (I assume?)
avanhatt created PR review comment:
nit: the
(extern ...)
syntax was not introduced above but is here in the "summary", I would leave the syntax for the next section
avanhatt created PR review comment:
nit: around
avanhatt created PR review comment:
Maybe name this as the "context trait" here to make the rest of this section more clear?
avanhatt created PR review comment:
This is a bit confusing since before in the general parts "constructor" was also used for LHS. I'm also not seeing yet how this works with the heuristic of precedence for more-specific rules (though maybe that's coming)
avanhatt created PR review comment:
nit: spell out instruction selection
avanhatt created PR review comment:
Does the system error if two rules with identical priorities both apply? Or fall back to the previous heuristic? What if both rules complete the match?
avanhatt created PR review comment:
Why usually and not always?
avanhatt created PR review comment:
In this case, backtracking is impossible, correct? So nodes can only be used in backtracking if they are not enum destructors?
avanhatt created PR review comment:
Questions/comments at this point:
- Might be nice to show how this decision trie algorithm handles the more-and-less specific rule examples from the beginning. In practice, what does the prioritization mean for how lowering instructions are selected?
- Maybe add a brief reference that instruction selection is in general NP-complete, so accepting backtracking/some level of possible inefficiency is unavoidable.
- What should engineers consider when adding new ISEL rules? Is all of Cranelift's lowering handled by ISEL (I assume no?) and what does the interaction between ISEL and the more traditional instruction selection backend look like? On this point, might be worth clarifying which invariants are enforced by the compiler vs the engineer (as I think I commented in another place, having ISEL or the Rust compiler itself rule out side effects in the LHS seems valuable, as does enforcement of which sublanguage can be on the LHS/RHS).
avanhatt created PR review comment:
This post is about a specific tool but a little more general-audience-friendly: https://blog.sigplan.org/2021/04/06/equality-saturation-with-egg/
avanhatt created PR review comment:
Would this be disallowed if the middle range were
[0, 1]
? Where in the system would a failure happen if this were attempted?
avanhatt created PR review comment:
A range because child nodes can differ in priority?
avanhatt created PR review comment:
This also will make testing/fuzzing much easier, presumably :D
avanhatt edited PR review comment.
avanhatt edited PR review comment.
abrown submitted PR review.
abrown created PR review comment:
interface) of sorts to allow interaction with Rust code, and
abrown created PR review comment:
Is this right?
abrown submitted PR review.
fitzgen submitted PR review.
fitzgen created PR review comment:
"backtracking" is a little strange to see here to me, because we specifically don't do any backtracking
fitzgen submitted PR review.
fitzgen created PR review comment:
Elsewhere (and at the top of this document) we've defined it as "instruction selection lowering expressions"
This document will describe ISLE (Instruction Selection Lowering
fitzgen created PR review comment:
These last two bullet points are indented one space deeper than the earlier bullet points, fwiw.
fitzgen created PR review comment:
interface) of sorts to allow interacting with Rust code, and
fitzgen created PR review comment:
omg I've been writing footnotes by hand in markdown for years, I had no idea that (at least for github markdown) this existed
fitzgen created PR review comment:
This file should probably belong in
cranelift/isle/docs/reference.md
or something next to the other ISLE-specific documentation that is in the isle README.md (and could probably be split out into multiple.md
files)e.g. we don't have the ISLE compiler implementation docs under
cranelift/docs/*
and I don't think it would make sense to move them there
fitzgen created PR review comment:
the term is exactly equal to this constant.
fitzgen created PR review comment:
Does this section have any information that isn't present in that comment?
This level of implementation detail seems better reserved for code comments (as linked to here) or implementation overviews (as in the README), not language reference.
I think this section can probably be removed.
fitzgen created PR review comment:
I personally feel like there is way more background here than necessary and it will overwhelm people who just want to learn the semantics of ISLE so they can hack on Cranelift's instruction selection.
Like everything about Turing completeness seems like an aside that, while interesting, is mostly irrelevant.
fitzgen submitted PR review.
fitzgen created PR review comment:
Err actually this may have been a github rendering artifact
cfallin submitted PR review.
cfallin created PR review comment:
We do actually, depending on how you look at it! Specifically one can see the decision-trie traversal as a depth-first attempt at finding a path to a leaf node, where we backtrack up the tree when a match op fails. In the generated Rust this manifests as a control-flow edge that falls out of the
if match_condition { ... }
without returning aSome(result)
. The "explicit backtracking stack pop" step that one would see in an FSM-like implementation implicitly happens because (i) PC encodes a path through match ops, and we effectively truncate that path by falling through, and (ii) local let-bindings go out of scope, backing up our state.
cfallin submitted PR review.
cfallin created PR review comment:
Apparently the syntax even allows symbolic footnote names (though I tend to just do numbers so it reads ok as plain text too)! And GitHub's markdown renderer recently started to support actually rendering them; it didn't last time I wrote an RFC :-)
cfallin submitted PR review.
cfallin created PR review comment:
Maybe, but that's IMHO there is value in introducing the philosophy behind the approach and the language as well; I'm thinking a bit broader than just folks who want to hack on Cranelift, and this is meant to be reference documentation more than a tutorial. E.g. someone might not realize that it's possible to solve a problem in a certain way by recursing through term-rewrites, unless one is already familiar with the general field; one might not understand how term-rewriting came about and what it can be used for. I want to be helpful to that sort of newcomer who may not be as well-read in the field :-)
Said another way, somewhere we should do a deep-dive with all the details and thinking behind ISLE; this is meant to be that deep-dive, more than a quickstart.
There is definitely room for a "learn just enough about the DSL to be proficient at instruction lowering rules" tutorial to exist, though; or I guess it already sort of does, in the docs (README's tutorial at least) that you've written?
cfallin edited PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
Thinking about this a bit more -- I can probably wrap the whole TRS section in a "This section provides background on the general term-rewriting-system concept; please feel free to skip ahead if you just want details on ISLE"
goto
; that gives the context and background for those who want it without making it a necessary prerequisite.
cfallin updated PR #3560 from isle-docs
to main
.
cfallin updated PR #3560 from isle-docs
to main
.
cfallin updated PR #3560 from isle-docs
to main
.
cfallin created PR review comment:
Ah, OK, I was following your PR's lead (
cranelift/docs/isle-integration.md
) but I've moved both this and the ISLE-Cranelift integration doc intocranelift/isle/docs/
. (Also renamedisle-integration.md
tocranelift-integration.md
since it's now within the ISLE tree.) This is nicer, agreed!
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Computer architect muscle memory! Fixed.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, I think so, but I agree it's pretty terse, so I expanded it with an explanation :-)
cfallin submitted PR review.
cfallin created PR review comment:
Done as above, added a reader-goto around the section "if you just want details...".
cfallin submitted PR review.
cfallin created PR review comment:
Good point, the description of the algorithm itself is probably better left to the single canonical place in the source; removed that. I left the high-level description of the trie and added a note ("We describe this data structure below with the intent to provide an understanding of how the DSL compiler...") as I think it's useful to have this mental model whenever questions of how the generated code will work come up.
cfallin submitted PR review.
cfallin created PR review comment:
Yes, that's a good point! Updated as suggested.
cfallin created PR review comment:
Added this point, thanks.
cfallin submitted PR review.
cfallin created PR review comment:
Updated (removed "symbolic" here and clarified elsewhere) -- sorry for the confusion here. "Symbolic" in this context just meant an externally-defined, imported symbolic name for a particular concrete value; e.g.
$MYVALUE
for au32
. It's not actually a different type of primitive so I removed its mention here.
cfallin submitted PR review.
cfallin created PR review comment:
Just a pattern-matching operator, not anything CLIF-specific!
cfallin submitted PR review.
cfallin created PR review comment:
Thanks, I added a footnote on this!
cfallin submitted PR review.
cfallin created PR review comment:
Yes, exactly; all of the logic to rewrite
(A ...)
is invoked from one function entry point.
cfallin submitted PR review.
cfallin created PR review comment:
Done.
cfallin submitted PR review.
cfallin created PR review comment:
No, a constructor form has to have a concrete constructor name; no polymorphism in that dimension. Added clarification, thanks.
cfallin submitted PR review.
cfallin created PR review comment:
Clarified, thanks.
cfallin submitted PR review.
cfallin created PR review comment:
My mistake, it actually is always, in this context; I was thinking of constructor signatures in general, as they are infallible when imported (Rust code called from the RHS is not allowed to fail/backtrack). In this context we're only discussing internal constructors, so that doesn't apply; updated.
cfallin submitted PR review.
cfallin created PR review comment:
Done.
cfallin submitted PR review.
cfallin created PR review comment:
Indeed! No concrete plans yet to independently test the generated code but possibly we could come up with an alternate implementation of the input side (the extractors), maybe even inventing the operand tree as we go (carry the
arbitrary::Unstructured
in the context object and then generate things on demand). cc @fitzgen for possible fuzzing opportunity :-)
cfallin submitted PR review.
cfallin created PR review comment:
Updated, thanks.
cfallin submitted PR review.
cfallin created PR review comment:
ISLE itself is agnostic to the underlying IR, it just knows about extractors and constructors and various types that the user defines; so one could have
(decl compile_to_machine1 (CLIF) MachInst1)
and(decl compile_to_machine2 (CLIF) MachInst2)
and(decl optimize (CLIF) CLIF)
and define rules for each. At least with the first two, we have a type-level guarantee that lowering goes in the proper direction. The latter could have convergence issues but we could limit the number of passes rather than chase a fixpoint indefinitely. Later in the doc ("A Note on Heterogeneous Types") touches on this distinction from other TRS-based instruction selector systems; hopefully that clarifies a bit more!
cfallin submitted PR review.
cfallin created PR review comment:
There will be an error if two rules have exactly the same LHS pattern, but not otherwise; the more-specific-first heuristic breaks the tie.
I added some text about this and about how it's implementation-defined (i.e. the heuristic could change, and rules shouldn't rely on it for correctness, unlike priorities which have specifically-defined semantics).
cfallin submitted PR review.
cfallin created PR review comment:
Updated language here to talk about "terms" instead, thanks for pointing this out!
cfallin submitted PR review.
cfallin created PR review comment:
Yes, exactly. I ended up removing these few paragraphs because the code also has a similar block comment describing the trie building, but the idea is that it captures the range so that we can allow subtrees to merge as long as there is no intermediate-priority rule that has to be tried before part of the subtree and after another part.
cfallin created PR review comment:
I created #3574 to record this.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Backtracking (up the decision trie) is still possible if no match arm matches; e.g. one can have
match value { Value::A { ref x, ref y } => { ... return Some(...); } Value::B { ref z } => { ... return Some(...); } _ => {} } None
We always add the
_ => {}
catch-all, as we don't do exhaustiveness checking on the match arms. (We silence the unreachable-code warning so that rustc, which does do this, doesn't complain!)
cfallin submitted PR review.
cfallin created PR review comment:
Yes, we would avoid building a trie like that, but it wouldn't cause a failure, but rather a tree-split. The block comment in
trie.rs
covers this a bit better than this summary here (which I removed in favor of just pointing to the block comment).
cfallin submitted PR review.
cfallin created PR review comment:
show how this decision trie algorithm handles the more-and-less specific rule examples from the beginning
There's a bit more discussion in the block comment that's referenced now, but as far as the more-specific-first ordering, it's a bit subtle. IMHO the best way may not be to try to get the programmer to predict the current algorithm but just to make it a more explicit and well-defined ordering. I created #3575 for this.
brief reference that instruction selection is in general NP-complete
Added a note that collapsing the rule patterns to a fast-dispatch
match
is not guaranteed, and that worst-case complexity per rewrite is equivalent to a sequential attempt at each rule in turn. The way that we integrate ISLE into our lowering/isel framework I think takes care of the weighted set cover kind of NP-completeness that one generally has when doing isel over the dataflow graph (we do greedy matching) but the complexity is certainly still something to be aware of.What should engineers consider when adding new ISLE rules? [interaction between ISLE and traditional isel, other details]
@fitzgen's doc I think covers this pretty well, hopefully! If not then we can definitely add more pointers/tips as we encounter/come up with them.
fitzgen submitted PR review.
fitzgen created PR review comment:
Wait, but the
isle-integration.md
stuff is Cranelift-specific, so it does make sense to be insidecranelift/docs
.Where as this documentation is just about ISLE and not any specific embedding, so it makes sense it should be under
cranelift/isle/docs
so it is a part of the ISLE crate/package.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, two different perspectives I guess. I had figured putting all ISLE-related documentation in one place made sense, otherwise a newcomer who just knows "ISLE is an isel thing" and goes looking for its documentation might miss some of it. I don't care too much either way though, so I'm happy to move it back.
cfallin updated PR #3560 from isle-docs
to main
.
cfallin merged PR #3560.
Last updated: Jan 24 2025 at 00:11 UTC