Stream: git-wasmtime

Topic: wasmtime / issue #4717 ISLE rule matching order


view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 00:20):

elliottt opened issue #4717:

Feature

While working on finishing the migration of the x64 backend to ISLE, I encountered a case where guarding a rule on specific ISA features being enabled caused the rule to never fire. The problem was that the rule I added overlapped with another rule except in the ISA feature check, and the other rule took precedence. Here's an example from elsewhere in the codebase:
https://github.com/bytecodealliance/wasmtime/blob/0f944937c0c3d90946eaa6199f0baa5ed991c80d/cranelift/codegen/src/isa/x64/lower.isle#L1799-L1809

The LHSes of the two rules differ only in the patterns for the first argument of (has_type): both rules match (ty_32_or_64 ty), while the rule on line 1799 also matches (use_bmi1). The rule on line 1799 has been given a priority of 1 to ensure that it's checked before the rule on line 1806, but ideally we wouldn't need this annotation here.
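
To make the shape concrete, here is a rough Rust model (illustrative only: `Ctx` and the result strings are invented, and the ISLE extractors are modeled as `Option`-returning functions). Under top-down, first-match-wins semantics, simply writing the (use_bmi1)-guarded rule first would replace the priority annotation:

```rust
// Sketch: rules as a chain of fallible checks, tried in source order.
// `ty_32_or_64` and `use_bmi1` mirror the ISLE extractor names; the
// logic and `Ctx` are invented for illustration.

#[derive(Clone, Copy)]
struct Ctx { has_bmi1: bool, ty_bits: u8 }

// Fallible extractor: succeeds only for 32- or 64-bit types.
fn ty_32_or_64(ctx: Ctx) -> Option<u8> {
    matches!(ctx.ty_bits, 32 | 64).then_some(ctx.ty_bits)
}

// Fallible extractor: succeeds only when BMI1 is enabled.
fn use_bmi1(ctx: Ctx) -> Option<()> {
    ctx.has_bmi1.then_some(())
}

fn lower(ctx: Ctx) -> Option<&'static str> {
    // Rule written first (the line-1799-style rule): BMI1-specific.
    if let (Some(_ty), Some(())) = (ty_32_or_64(ctx), use_bmi1(ctx)) {
        return Some("andn");
    }
    // Rule written second (the line-1806-style rule): general case,
    // reached only when the rule above did not match.
    if let Some(_ty) = ty_32_or_64(ctx) {
        return Some("and+not");
    }
    None
}

fn main() {
    assert_eq!(lower(Ctx { has_bmi1: true, ty_bits: 64 }), Some("andn"));
    assert_eq!(lower(Ctx { has_bmi1: false, ty_bits: 32 }), Some("and+not"));
}
```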

My proposal is that we switch to compiling ISLE by matching rules top-down, left-to-right.

Benefit

The benefit of this change would be that we could tell statically when a rule was not reachable, and raise an error when it would never fire.

Implementation

For matching on enum values we could take the approach outlined in https://www.cs.tufts.edu/~nr/cs257/archive/luc-maranget/jun08.pdf. For handling extractors we could take inspiration from GHC's implementation of view patterns.
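
As a rough illustration of how the two techniques combine (a toy sketch, far simpler than anything islec would emit; `Ty`, `isa_flag_enabled`, and the result strings are invented): matching on enum constructors compiles to a single decision per constructor, Maranget-style, while a fallible extractor becomes an in-order, view-pattern-style guard that falls through to the next rule on failure. Under this shape, a rule shadowed behind the same constructor is statically detectable:

```rust
// Toy decision-tree sketch of top-down, left-to-right compilation.

enum Ty { I32, I64, F32 }

// Opaque fallible extractor (view-pattern style): the compiler cannot
// know when it succeeds, only that failure falls through.
fn isa_flag_enabled(flags: u32) -> Option<()> {
    (flags & 1 != 0).then_some(())
}

fn lower(ty: Ty, flags: u32) -> Option<&'static str> {
    match ty {
        Ty::I32 => {
            // First I32 rule: guarded by the opaque extractor.
            if isa_flag_enabled(flags).is_some() {
                return Some("i32 fast form");
            }
            // Second I32 rule: any further I32 rule after this
            // unguarded one would be statically dead, and the
            // compiler could report it as unreachable.
            Some("i32 general form")
        }
        Ty::I64 => Some("i64 form"),
        // No rule covers F32: also statically known.
        Ty::F32 => None,
    }
}

fn main() {
    assert_eq!(lower(Ty::I32, 0), Some("i32 general form"));
    assert_eq!(lower(Ty::I32, 1), Some("i32 fast form"));
    assert_eq!(lower(Ty::F32, 0), None);
}
```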

Alternatives

We could continue to develop heuristics to determine when rules might be unconditionally shadowed, and inform the programmer through an error or warning.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 00:30):

cfallin commented on issue #4717:

Thanks @elliottt for bringing this up. I have some thoughts on this, inspired by our conversations over the past week and our experience with this aspect of ISLE over the past year, that I've written up below.

ISLE rule-ordering semantics simplification

This issue proposes to simplify the ISLE rule-ordering semantics to
"first matching rule wins", as a result of a year of experience with
the language as it currently stands (a more complex ordering
heuristic) and the issues that this causes.

Overview: Current Design

ISLE currently has semantics for rule ordering that generally follow
the "most specific rule wins" principle. The goal is that the user, or
a tool, should be able to write rules in any order and the resulting
compiler backend should be the same no matter how these rules are
ordered.

Inside the ISLE compiler, this is implemented by sorting edges when
inserting into the match trie. The sort-order is implied by the
PatternInst enum and the order of its arms: lower-ordered ops come
first at any given node in the tree.

The ISLE language contains a concession to allow for manual ordering:
any rule can be annotated with an explicit numeric priority, and rules
are sorted by priority first before being merged into a match trie.
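
A rough model of the mechanism just described (the op kinds and `Rule` below are simplified stand-ins, not the real PatternInst; real islec sorts trie edges at each node rather than whole rules): explicit priority dominates, and within a priority level the declaration order of the op-kind enum decides, which is what makes "most specific first" emerge:

```rust
// Simplified model of the two-level ordering: priority first, then
// PatternInst-style op kind, as with sorted trie edges.

#[derive(PartialEq, Eq, PartialOrd, Ord)]
enum EdgeOp {
    MatchEqual,   // constant test: lowest-ordered, tried first
    MatchVariant, // enum-arm test
    Extract,      // external extractor: tried last
}

struct Rule { prio: i64, first_op: EdgeOp, name: &'static str }

fn order_rules(rules: &mut Vec<Rule>) {
    // Higher explicit priority wins outright; otherwise the op-kind
    // declaration order decides.
    rules.sort_by(|a, b| b.prio.cmp(&a.prio).then(a.first_op.cmp(&b.first_op)));
}

fn main() {
    let mut rules = vec![
        Rule { prio: 0, first_op: EdgeOp::Extract, name: "general" },
        Rule { prio: 0, first_op: EdgeOp::MatchEqual, name: "const-case" },
        Rule { prio: 1, first_op: EdgeOp::Extract, name: "bmi1-case" },
    ];
    order_rules(&mut rules);
    let names: Vec<_> = rules.iter().map(|r| r.name).collect();
    // Explicit priority dominates; then constant tests beat extractors.
    assert_eq!(names, ["bmi1-case", "const-case", "general"]);
}
```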

Reasons for Current Design

Intuitive "More-Specific-First" Matching

The first reason given for the current design is that it allows for a
natural "general case then more specific cases" ordering of rule
cases. One can start a group of rules with the most basic handling,
and then get into trickier cases later. This aids reading
comprehension.

Compatibility with Rule-Generating Tools

One might build a tool that generates a large body of lowering rules
from some algorithm that finds equivalences (as in
superoptimization). For such a tool, it is convenient to be able to
(i) just dump the list of rules, without sorting first; and (ii) allow
the user to include these rules without having to think about how they
order with respect to handwritten rules.

Independence of Rules for Verification

When verifying the correctness of ISLE lowering rules against
semantics of the input and output language, it is useful if the
semantics of the language are as-if a rule could fire when the
left-hand side matches at all, regardless of other rules that
exist. An "unordered" / "unspecified" semantics is closer to this than
"first rule wins".

Issues

Unclear mental model

The primary issue is that the mental model required to be held by the
ISLE programmer is unclear. If the ISLE programmer truly does not
care which rule applies, then all is well. But in real development of
compiler backends, one does very much care: one needs to make sure
that some special cases are actually handled (or else they are useless
dead code that, worse, may be untested); and sometimes one rule that
is shadowed by another is actually incorrect in the shadowed case, so
one relies on the ordering heuristic for correctness.

In general, development of ISLE rules requires the programmer to
understand how the language operates so that they can control which
rule fires. If the ordering rules are complex and subtle, or worse are
"unspecified", then we get a body of rules that either has silent
shadowing, or brittle reliance on particular ordering heuristics that
may change. Even if we attempt to clearly specify the semantics, there
are subtleties that arise because of the way that external Rust code
can be invoked: we don't know when some conditions are disjoint or
not, fallible or infallible, and a set of prioritization rules that
relies on "fallible before infallible" and disjointness thus requires,
in the limit, a full and complete understanding of our helpers'
semantics (via explicit specification or some other method).

One can make an argument that when building a complex artifact, one
needs one's tools to be as predictable as possible. Complexity is
fine, if it is in service of a well-understood goal (e.g., compiling
Rust code to x86 machine code). But unclear semantics are not.

"Experimental nature" of Priority Usage

The clearest distillation of the issue comes in our standard advice
for getting proper lowering output: just write the rules, and if the
ordering is wrong, apply some priorities.

Consider what this is implying: it is saying that the DSL's semantics
are unknowable, or at least it's not worth knowing ahead of time what
will happen; instead, one must experiment and then apply corrective
action if the heuristic gets it wrong.

This is problematic because the developer may not always take the time
to do so ("I'm just adding a quick lowering case, it will probably be
hit") -- and because in cases where the metacompiler does get it
right without priorities, an implicit dependence on the current
heuristics is baked in at that point and must be frozen (see below).

In my opinion, it is not feasible to write high-reliability software
if one's tools are unpredictable. Predictable and repeatable behavior
is the cornerstone of writing correct lowerings.

"Most specific first" is not simple or unambiguous

But isn't "most specific first" a predictable design point?

It turns out, empirically, the answer is "no". We have had a number of
bugs where the heuristics do not do what we would expect, mostly
because fallible and infallible extractors are not properly ordered
(see e.g. #4685), and the interaction of explicit priorities with
implicit ones is complex (#4093, #4117). We have a pending issue to
explicitly define what the ordering even is (#3575) but we haven't
been able to do so in a simple way (and it is changing as we fix the
above bugs).

The basic issue is that because we are doing more than simple
tree-matching -- because we can invoke arbitrary Rust code and we have
written helpers that check predicates and conditions in various ways,
and because some checks are fallible and some are not -- "specific" is
unclear. This is aside from the general issue (noted
here)
that when one has a tree-shaped rather than list-shaped pattern match,
specificity is a partial order. (Two patterns can both subsume a
general one by being specific in different dimensions.) One can break
that tie arbitrarily, but that just adds more complexity.
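
The partial-order point can be made precise with a tiny subsumption check (a hypothetical model: patterns are flattened to one entry per argument, whereas real ISLE patterns are trees):

```rust
// `subsumes(p, q)` holds when every input matched by q is also matched
// by p, i.e. p is at least as general as q.

#[derive(Clone, Copy, PartialEq)]
enum Pat { Any, Const }

fn subsumes(p: &[Pat; 2], q: &[Pat; 2]) -> bool {
    p.iter().zip(q).all(|(a, b)| *a == Pat::Any || a == b)
}

fn main() {
    let general = [Pat::Any, Pat::Any];
    let rule_a = [Pat::Const, Pat::Any]; // specific in argument 1
    let rule_b = [Pat::Any, Pat::Const]; // specific in argument 2
    // Both refine the general rule...
    assert!(subsumes(&general, &rule_a) && subsumes(&general, &rule_b));
    // ...but neither subsumes the other: "more specific" is only a
    // partial order, and any ordering of rule_a vs. rule_b is an
    // arbitrary tie-break.
    assert!(!subsumes(&rule_a, &rule_b) && !subsumes(&rule_b, &rule_a));
}
```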

In general, humans (other than those who hack on islec) seem not to
think in terms of these complex heuristics, or at least seem to
sometimes have a different expectation than what the compiler really
does.

Empirical evidence of mental-model mismatch

To add some evidence to this claim, see the following code reviews:

Developers working in ISLE seem to have a mental model that rules are
evaluated top-down. This is likely for several reasons. First, it is
the natural "reading order"; thinking in terms of parallel,
reorderable conditions is not natural for most programmers. Second,
existing languages with match facilities condition programmers this
way, including Rust (as well as ML, Haskell, Scala, etc). Third, it
is clear that there must be some order, and if the ordering
heuristic is otherwise muddy or badly specified, or sometimes changing
(see above), then a programmer will likely fall back on "first rule
first".

In general we want to follow the Principle of Least Surprise: we
should have a simple, predictable tool that behaves like most people
who pick it up think it should.

The task at hand (writing a correct compiler) is hard enough; any
complexity in the tool that is not necessary should be
discarded. Explicitness is a strong principle that results in tools
one can trust and do precise work with.

Subtle bugs in heuristics, and heuristics as load-bearing for correctness

Finally, as seen in several of the above-linked issues and PRs, we
have had multiple bugs in the ordering heuristics and have had to make
changes to get them right.

If a hard-to-understand heuristic is difficult for an ISLE developer
to keep track of, a hard-to-understand heuristic *that sometimes
changes* is even worse.

The "experimental answer" to rule ordering -- try it out, add a
priority if needed -- implicitly creates dependencies on the current
heuristics. If we do so, and then change the heuristics to fix other
cases to behave as we expect they would, we are liable to silently
alter how rules apply (action-at-a-distance) and subtly break parts of
the compiler. Test coverage can hedge against this, but relying on
test coverage to patch over footguns in the language is a game that
dynamically-typed languages play; we don't want to do that if we can
help it.

This potential for latent correctness bugs is terrifying and is
itself a strong reason to migrate away from this status quo,
[message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2022 at 01:05):

jameysharp commented on issue #4717:

For another source on compiling pattern-match constructs with these top-down, left-to-right semantics, Simon Peyton Jones' 1987 book, "The Implementation of Functional Programming Languages", is available as a free PDF. Chapter 5 covers a reasonably straightforward implementation.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 00:33):

elliottt commented on issue #4717:

I ran into another instance of this today: the sse_cmp_op rule in the x64 lowering has overlapping cases for vector floating-point types. The rules on lines 1504 and 1505 overlap with the rules on lines 1506 and 1507 respectively:
https://github.com/bytecodealliance/wasmtime/blob/c569e7bea5ab9c2f9a617aa1ba7de3d926d880be/cranelift/codegen/src/isa/x64/inst.isle#L1499-L1507

Any change in the heuristic that put the constant patterns before the fallible extractors would cause a panic during instruction emission, as the vector_all_ones rule uses the result of sse_cmp_op as the op for an XmmRmR instruction, and the cmpps and cmppd instructions require an immediate argument:
https://github.com/bytecodealliance/wasmtime/blob/c569e7bea5ab9c2f9a617aa1ba7de3d926d880be/cranelift/codegen/src/isa/x64/inst.isle#L1520-L1527

view this post on Zulip Wasmtime GitHub notifications bot (Aug 17 2022 at 17:03):

elliottt commented on issue #4717:

It's worth pointing out that for the example in https://github.com/bytecodealliance/wasmtime/issues/4717#issuecomment-1217308075 we would still not be able to raise shadowing errors: the patterns that shadow the constant patterns are made up of fallible external extractors, and we don't have any insight into what will cause them to succeed or fail. This is the same situation that comes up in Haskell with view patterns.

However, we would benefit from predictable rule matching order, as the problematic rules on lines 1506 and 1507 would not suddenly fire due to heuristic changes.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 21 2022 at 13:36):

jameysharp commented on issue #4717:

It's occurred to me that we're trying to solve two different issues which might benefit from being treated separately. We're currently using priorities and heuristics to express validity preconditions as well as preferences between multiple valid rules. In other words, a key part of the optimizer's cost model is implicit in the ISLE rules for the backend. Changing the matching order to top-down doesn't change that.

We could instead require every rule to express all of its preconditions, and have a separate way of expressing the cost model. That might mean that for usability we need new ISLE constructs for concisely expressing preconditions, I dunno. Ideally we'd let the user choose between several cost models when running Cranelift.

In that approach, the order of rules doesn't matter, just like today. Formal verification is especially important then because, like today, if you get the preconditions wrong in some rule, it may happen to work a lot of the time due to other rules getting selected first; and like today, you can't tell if that's happening just by looking at the ISLE rules.

One way to use a cost model is to impose a total ordering on rules. Different cost models impose different orderings so you can build different instruction selectors from the same rules. This can be done while building the compiler so there's no compile-time cost for that flexibility.
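
One way to read that suggestion (a hypothetical sketch; `Rule` and its cost fields are invented): rules carry measurable attributes instead of priorities, and a chosen cost model sorts them into a total order before the matcher is generated, so the same rule set yields different instruction selectors:

```rust
// Rules carry attributes; a cost model maps them to a sort key applied
// at compiler-build time.

struct Rule { name: &'static str, latency: u32, code_size: u32 }

fn order_rules(mut rules: Vec<Rule>, cost: impl Fn(&Rule) -> u32) -> Vec<Rule> {
    // Stable sort: lower cost tried first, ties keep source order.
    rules.sort_by_key(|r| cost(r));
    rules
}

fn main() {
    let rules = || vec![
        Rule { name: "lea-form", latency: 2, code_size: 3 },
        Rule { name: "add-form", latency: 1, code_size: 4 },
    ];
    let for_speed = order_rules(rules(), |r| r.latency);
    let for_size = order_rules(rules(), |r| r.code_size);
    assert_eq!(for_speed[0].name, "add-form");
    assert_eq!(for_size[0].name, "lea-form");
}
```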

But where it really helps to keep ISLE rules independent of cost model is if you start using e-graphs in the backend. The general idea is to take the full e-graph from mid-end optimization, apply all valid lowering rules at all e-nodes, and combine the results in a new e-graph. Now you can use the cost model in a way that takes dynamic context into account when there are partially overlapping rules, during extraction instead of during lowering. I think this subsumes peephole optimization.

I've hand-waved away a lot of important details. The main thing is that I think it's worth thinking about other ways to let the consumer of ISLE rules decide what priority to give each rule, or even to find all matching rules. At the same time, backend developers do need to be able to control those priorities based on their knowledge of the ISA. I'm just no longer convinced that control should be expressed in the ISLE rules.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 22 2022 at 02:21):

cfallin commented on issue #4717:

Thanks @jameysharp for your thoughts here!

We could instead require every rule to express all of its preconditions, and have a separate way of expressing the cost model. That might mean that for usability we need new ISLE constructs for concisely expressing preconditions, I dunno. Ideally we'd let the user choose between several cost models when running Cranelift.

Unfortunately, this proposal is basically the status quo today already. A correct rule should express all conditions necessary for it to fire. And the priority mechanism is a way for the user to provide a cost model. It's possible I'm not understanding some subtlety of your proposal, but to me it reduces more or less to: keep the current design, but do it without bugs (or add some unspecified other mechanism to help do it without bugs, but then... let's talk about that other mechanism, if we have one).

The main body of my argument is that, in practice, this isn't working. The difficulty is not coming up with a working abstraction -- rules that explicitly encode all conditions needed for them to fire -- but the way that that requirement plays out in practice as compared to developers' working mental models. See the "Empirical evidence of mental-model mismatch" section in my comment above.

Concretely: how do we ensure that this "all preconditions included" is the case? I already very carefully review for this, and (i) I have still missed some cases, leading to the bugs linked above, and (ii) we shouldn't have to rely on "think really hard and never make a mistake" to uphold such a property. Formal verification is one answer, but this too feels wrong: we shouldn't stick with an empirically difficult-to-use abstraction early in the pipeline just because we can catch it with a very heavyweight technique later. Bugs are cheaper the earlier one catches them, and all that.

But where it really helps to keep ISLE rules independent of cost model is if you start using e-graphs in the backend. The general idea is to take the full e-graph from mid-end optimization, apply all valid lowering rules at all e-nodes, and combine the results in a new e-graph. Now you can use the cost model in a way that takes dynamic context into account when there are partially overlapping rules, during extraction instead of during lowering. I think this subsumes peephole optimization.

This I think deserves a largely separate discussion, and I do agree that it's interesting to think about a "late extraction" approach, but really quickly, I think that the proper way to approach this while designing for correctness would be to provide an explicit "multiple options" ("unordered rules") mechanism that works within a broader context of nailed-down, simple-to-understand total ordering, not the other way around. This is because I want the programmer to have to opt into considering all options and testing them.

That I think leads to another way of expressing my broad concerns here, related to the "experimental approach" concerns above: if the default is any-rule-could-fire with some unknowable (and explicitly reconfigurable, under your proposal!) heuristic driving it, then the user has to consider how all possible rule firing orders could result in correct or incorrect lowerings. Human brains mostly aren't built (or mine isn't anyway) for considering combinatorial possibilities at every step. (This is why unstructured pthreads programming is so much harder than structured concurrency; ordered by default, explicitly unordered where specified, is far easier than the opposite.) The analogy to structured concurrency tells us how to proceed here as well: we want well-specified combinators that say "I really do mean that these three different rules are all equally valid and I have thought about the implications", rather than "I happened upon a bug experimentally in the everything-parallel world and want to serialize just this bit as a likely fix".

Does that make some sense at least? Happy to talk further in the Cranelift meeting tomorrow; I really am curious how others feel about this, especially folks with more scars from real ISLE programming and debugging :-)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 22 2022 at 06:59):

bjorn3 commented on issue #4717:

Concretely: how do we ensure that this "all preconditions included" is the case?

Maybe by having a fuzzer that changes rule priorities and disables rules at random, and then doing differential fuzzing of program execution against the rule set used for regular compilation, where failure to codegen isn't considered a crash?
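
A sketch of that oracle (every type and function here is a made-up stand-in; this is not an existing Cranelift fuzz target):

```rust
// Differential oracle: a mutated rule set (shuffled priorities, rules
// disabled) may fail to produce code, but any code it does produce
// must behave like the baseline.

struct TestCase; // stand-in for a generated input program
struct RuleSet;  // stand-in for an ISLE rule configuration
struct Compiled; // stand-in for compiled output

fn compile(_t: &TestCase, _r: &RuleSet) -> Option<Compiled> { Some(Compiled) }
fn execute(_c: Compiled) -> u64 { 0 } // stand-in for running the output

fn oracle_ok(input: &TestCase, baseline: &RuleSet, mutated: &RuleSet) -> bool {
    let reference = execute(compile(input, baseline).expect("baseline compiles"));
    match compile(input, mutated) {
        None => true, // failure to codegen is not a crash under mutation
        Some(c) => execute(c) == reference,
    }
}

fn main() {
    assert!(oracle_ok(&TestCase, &RuleSet, &RuleSet));
}
```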

view this post on Zulip Wasmtime GitHub notifications bot (Aug 22 2022 at 15:10):

cfallin commented on issue #4717:

Maybe by having a fuzzer which changes rule priority and disables rules at random and then doing differential fuzzing of the program execution with the rule set used for regular compilation where failure to codegen won't be considered a crash?

That's certainly an option (actually it's even an item in the 2022 Cranelift roadmap, IIRC); but I think that it sort of sits in a similar place to verification in this regard: it would catch issues but I think at a higher cost than simplifying semantics and making it easier to avoid bugs in the first place.

In other words, my main concern at the moment is that we have a tool whose behavior is hard to understand and predict, and which seems to differ from many people's intuitions, and hence is hard to use; if we fix that, then we can still do all sorts of fuzzing and verification, but the need to lean on them will be less, and they can spend their time finding other interesting bugs.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 22 2022 at 18:24):

cfallin commented on issue #4717:

We had a good discussion on this issue today in the Cranelift meeting; thanks all for the input!

I unfortunately didn't take detailed comment-by-comment notes, but by the end of the discussion a consensus emerged that the overlap checking is actually the most interesting part; if we get that right, the textual ordering of the rules themselves is not as important. There is still the question of whether we have "locally unordered" groups of rules, allowing for the flexibility of some sort of cost model, or whether we have a total order, but overlap checking and explicit opt-in to overlap let us solve the high-order correctness+explicitness issue.

Basically the idea is: by default it is an error for one rule to shadow another. Shadowing should be clearly semantically defined at the language level, not just "engine can't prove disjoint": for example, we could say that any two rules are disjoint if they match a different enum arm or different constant against the same value (same expression from root arguments). Conversely we should probably say that external extractors have no disjointness at the language-semantics level, and allow for negation at the language level ("this extractor did not match") rather than building ad-hoc pairs of "A" and "not A" extractors.
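
A sketch of that disjointness test (heavily simplified: patterns are flattened to argument lists, whereas the real checker would compare matched positions in pattern trees):

```rust
// Two rules may coexist without an explicit overlap opt-in only if some
// shared position provably separates them: a different constant or a
// different enum arm. External extractors are opaque and prove nothing.

enum Pat {
    Any,
    Const(i64),
    Variant(&'static str),
    Extern(&'static str), // external extractor: no language-level disjointness
}

fn disjoint_at(a: &Pat, b: &Pat) -> bool {
    match (a, b) {
        (Pat::Const(x), Pat::Const(y)) => x != y,
        (Pat::Variant(x), Pat::Variant(y)) => x != y,
        // Wildcards overlap everything; extractors must be assumed to overlap.
        _ => false,
    }
}

fn rules_disjoint(a: &[Pat], b: &[Pat]) -> bool {
    a.iter().zip(b).any(|(p, q)| disjoint_at(p, q))
}

fn main() {
    use Pat::*;
    // Different constants at the same position: provably disjoint.
    assert!(rules_disjoint(&[Const(32), Any], &[Const(64), Any]));
    // Extractor vs. wildcard: overlap must be assumed, so by default
    // this pair would be a shadowing error unless explicitly opted in.
    assert!(!rules_disjoint(&[Extern("use_bmi1"), Any], &[Any, Any]));
}
```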

Then we make the existing rules we have "disjoint-clean" by adding in "not the earlier rule R, and ..." prefixes to their left-hand sides. This becomes easier if we can name rules explicitly (thanks @avanhatt for this idea).

Then we can allow explicit "overlap with unconstrained priority" if/when we want: rather than "not R, and ..." as a left hand side, we can say "either R or ..."; or perhaps group R1 and R2 (with overlapping left-hand sides) in one syntactic "unordered group" (but there are reasons why this may not be practical when generated code is involved); or perhaps a separate out-of-band declaration that R1 and R2 are overlapping.

I think that captures most of what we settled on, but I am probably forgetting some details; my apologies if so; please do add anything I've missed.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 02 2022 at 15:51):

akirilov-arm labeled issue #4717.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 02 2022 at 15:52):

github-actions[bot] commented on issue #4717:

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "isle"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 00:48):

jameysharp closed issue #4717.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 00:48):

jameysharp commented on issue #4717:

I think this issue has largely been resolved by the overlap checker and trie-again work. Explicit rule priorities are obnoxious sometimes but it seems to be pretty clear to people how they work.

