elliottt opened PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
:
Resolve overlap in the ISLE prelude and the x64 inst module by introducing new types that allow better sharing of extractor resuls, or falling back on priorities.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt edited PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
:
Resolve overlap in the ISLE prelude and the x64 inst module by introducing new types that allow better sharing of extractor resuls, or falling back on priorities.
This PR makes the following changes in overlap counts for the different backends:
branch x64 aarch64 s390x main 168 214 446 this 138 212 440 <!-- Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
re-indenting here is fine but could you (i) align the comment above (line 180) with new indent, and (ii) fix the alignment of the comment below (line 185)?
cfallin created PR review comment:
Incomplete comment?
cfallin created PR review comment:
Rather than the
(and subpat var)
idiom here, could we write either(and var subpat)
, or preferablyvar @ subpat
? It's a little hard for me to visually parse what's happening here otherwise. (Likewise for the three cases below.)
cfallin created PR review comment:
i128
s, actually! (This may be an outdated comment from somewhere else?)
cfallin created PR review comment:
Can we have SingleGpr and MultiGpr here instead?
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
Without or patterns, that makes the existing behavior of
is_gpr_type
impossible to capture. Is it the case that everywhere it's used now it implies the single register case?
elliottt submitted PR review.
elliottt created PR review comment:
This was the fault of an overzealous lisp mode in neovim. I reverted the changes it made and disabled it :)
elliottt created PR review comment:
Thanks, fixed!
elliottt submitted PR review.
elliottt submitted PR review.
elliottt created PR review comment:
We can't write
var @ subpat
because ISLE doesn't currently support that for the RHS of an extractor definition. I standardized on the(and pat var)
form to ensure that these newly introducedand
patterns wouldn't run afoul of theand
order bug that we discussed in the meeting today. Happy to reorder them once we fix that bug, as I like the(and var pat)
form more as well.
elliottt created PR review comment:
Yep, it used to be in the middle of the
imm
rules, so I moved it above. I'll fix the comment :+1:
elliottt submitted PR review.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
cfallin created PR review comment:
Can we use two rules then?
(rule (is_gpr_type (RegisterClass.SingleGpr)) $true) (rule (is_gpr_type (RegisterClass.MultiGpr)) $true) (rule (is_gpr_type (RegisterClass.Xmm)) $false)
In general I'm not a huge fan of isolated bool flags in data structures as the meaning isn't carried with them (one has to look back to the type definition) but if it isn't possible to do here for some reason I'm fine with the original too.
cfallin submitted PR review.
cfallin edited PR review comment.
cfallin created PR review comment:
Urgh, sorry, just realized (with your pointing out) that this is an extractor definition; in that case the original is fine!
cfallin submitted PR review.
cfallin created PR review comment:
Ah, yeah, we should fix that but for now this is fine then.
cfallin submitted PR review.
elliottt has marked PR #4941 as ready for review.
jameysharp created PR review comment:
Ah, I'm glad to see this! I remember we both had to think hard about what order these rules would match in, and it's nice that the overlap checker agrees that this is confusing.
jameysharp created PR review comment:
I'm not super fond of this
intcc_is_ordered
extractor. Is there a reason not to put the unordered cases at priority 3, matchingEqual
andNotEqual
directly, and let the ordered case be a wildcard fallback at priority 2?
jameysharp submitted PR review.
jameysharp created PR review comment:
Do values of B128 type fit in a single general-purpose register?
jameysharp created PR review comment:
I got very confused trying to see why
(range_nonempty)
is necessary. But you're adding it here because otherwiserange_singleton
overlaps withrange_unwrap
, right?
jameysharp submitted PR review.
jameysharp created PR review comment:
I really like this RangeView approach!
jameysharp created PR review comment:
Why does this exclude sign-extended loads of vector types? The overlap checker can't tell that
multi_lane
andfits_in_32
don't overlap, so I understand needing to do something, but I'd think those two rules just need different priorities. Since they actually don't overlap, it shouldn't matter which way around you assign the priorities, right?
jameysharp created PR review comment:
Hmmm, is it better to duplicate the rule like this, or to put the single-gpr case at a higher priority?
elliottt created PR review comment:
Yep, exactly. Originally I had a third constructor in
RangeView
,Singleton
, that identified the case where there was a single element in the list. The problem was that using the one function in the implementation ofrange_singleton
andrange_unwrap
now meant that there was an unhandled case here. I fell back on implementingrange_singleton
in terms ofrange_unwrap
and an explicit assertion about the tail, but maybe removingrange_singleton
entirely would make this more clear?
elliottt submitted PR review.
elliottt submitted PR review.
elliottt created PR review comment:
No. As currently implemented, they would be either 128
0
s, or 1281
s.
elliottt created PR review comment:
I think that would be fine, however the rule that I've been trying to follow is to use priorities as a fall back for when there's not a clear way to split the rules by a concrete pattern. Maybe there's a different name we could use here that would make it seem less objectionable?
elliottt submitted PR review.
elliottt submitted PR review.
elliottt created PR review comment:
The original implementation used named the
ExtKind
value as it was handling these two cases. As we don't have or-patterns, and we're trying to handle the remaining cases that differ fromExtKind.SignExtend
on line 1615 above, duplicating the rule was the easiest path forward.
elliottt created PR review comment:
In the future, I'd love to use an or-pattern here. Given the current compilation for rules at different priorities, I'm tempted to say that duplication for rules with a small RHS is preferable. I have an idea that I'd like to try out for using the overlap information to allow earlier match reuse in the trie, and in that case using different priorities wouldn't be nearly as costly.
elliottt submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
Then the expression for
single_register
should bety != I128 && ty != B128
, right?
jameysharp submitted PR review.
jameysharp created PR review comment:
Is it possible to have a right-hand side that emits nothing? If so, I think the best way to write
copy_to_regs_range
would be a(range_empty)
case that's a no-op, and a(range_unwrap head tail)
case that emits one move before recursing ontail
. Essentially, a normalfold
-based implementation ofmap
. That should generate less code as well as being clearly non-overlapping.
elliottt submitted PR review.
elliottt created PR review comment:
The previous implementation of
is_single_register_type
only checked that the type wasn't equal toI128
, so I opted to preserve the existing behavior.
https://github.com/bytecodealliance/wasmtime/blob/3a2b32bf4df83ab3d624f8c187116d02adb2c69c/cranelift/codegen/src/isa/x64/lower/isle.rs#L567-L573
jameysharp created PR review comment:
I think the name
intcc_is_ordered
is a fine description of what the extractor does. I just don't like the contortions you had to go through to convince the overlap checker that theEqual
andNotEqual
cases don't overlap with the(intcc_is_ordered $true)
case. If we had or-patterns this would be(and (or (IntCC.Equal) (IntCC.NotEqual)) (IntCC.NotEqual))
, which is a confusing way to say(IntCC.NotEqual)
. This is trying to express a priority relationship without using priorities, and I'd rather just use priorities.
jameysharp submitted PR review.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt created PR review comment:
The thing that stopped me from doing this before was that I didn't have a way to make a
Unit
value. I'll addunit
to the prelude, because I like this approach more than the current singleton checking one.One nice feature of the current implementation is that it will fail to match any rules if
copy_regs_to_range
is given an empty range, but I'm not sure that's worth keeping the current confusing implementation over.
elliottt submitted PR review.
elliottt submitted PR review.
elliottt created PR review comment:
That's not what I was suggesting for the use of or-patterns: I think we would not have the
and
anymore for theEqual
and NotEqualcases, and instead would list the remaining cases where we have
(intcc_is_ordered $true)` now. That feels pretty natural to me, and is likely how you'd write it in rust if you wanted to enumerate all the cases, right?
elliottt edited PR review comment.
jameysharp submitted PR review.
jameysharp created PR review comment:
Okay, to check: the
multi_lane
andfits_in_32
extractors never match the same types, right? My understanding was thatmulti_lane
only matches vector types, and thatfits_in_32
only matches types that are at most 32 bits wide, and that the narrowest vector types we have are 64 bits wide.If that's the case, then I think this change is incorrect: previously, this rule would fire for vector loads regardless of the sign extension mode, but this version of the rule only fires for the
None
andZeroExtend
modes.Now maybe that's okay, if for instance vector loads always use mode
None
, or something along those lines. But if so, that needs at least a comment or something.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
At any rate, I've removed
intcc_is_ordered
, and put explicit priorities in place instead. We can revisit this if we decide to implement or-patterns in the future.
jameysharp submitted PR review.
jameysharp created PR review comment:
I guess I don't feel strongly about duplication for rules with a small RHS. I'm just more concerned about maintainability of the ISLE source than I am about the performance of the ISLE-generated code right now. We know we'll fix the latter by improving the ISLE compiler, but the former is harder to recover.
jameysharp submitted PR review.
jameysharp created PR review comment:
Oh, sorry, I wasn't clear: I know you weren't suggesting expanding
intcc_is_ordered
to an or-pattern. I just meant that's how I had to reason about what the and-patterns were doing, and it was a confusing thing to reason about.I agree, listing the remaining cases with an or-pattern instead of priorities feels pretty natural to me, too. Only it's a bit of a shame that we can't check for match completeness, in case somebody accidentally misses a condition code.
elliottt created PR review comment:
So apparently
i8x4
is a type we support. I've removed the duplication here, and resolved this overlap with priorities instead.
elliottt submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
I have concerns about the previous implementation then. :sweat_smile: I'm guessing this means we've never handed a B128 to the x64 lowering rules that used this check. And that seems... not unlikely, I suppose?
elliottt submitted PR review.
elliottt created PR review comment:
OK, I've switched to resolving this with priorities.
elliottt submitted PR review.
elliottt created PR review comment:
Sure, but let's keep two things in mind while reviewing this PR: 1) we're not trying to fix existing bugs, only resolve overlaps, 2) we're talking about removing all the boolean types in the short-term.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
jameysharp submitted PR review.
jameysharp created PR review comment:
Ooh, I really like how this worked out once you added
(unit)
. That strikes me as a great general-purpose thing for the prelude anyway.If we want to keep the "fails on empty range" behavior, we could split the term: an outer
copy_to_regs_range
that just checks for a non-empty range before delegating to this recursive term that generates the actual copies. That would make the intent explicit, too.
jameysharp created PR review comment:
Yep, I agree with both of your points. I think it's still worth noting existing bugs during any review, and in general opening issues to track them if we decide not to fix them immediately. But in this specific case, you're right, it doesn't matter since we're going to remove the boolean types.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
range_nonempty
isn't needed now so you could delete it.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt updated PR #4941 from trevor/resolve-x64-inst-prelude-overlaps
to main
.
elliottt merged PR #4941.
Last updated: Nov 22 2024 at 17:03 UTC