scottmcm opened PR #7636 from scottmcm:spaceship
to bytecodealliance:main
:
Previous zulip conversation: <https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift/topic/3-way.20compare.20.28Rust.20.60cmp.60.2C.20C.2B.2B20.20.60.3C.3D.3E.60.29/near/404519375>
This PR adds a bunch of ISLE opt patterns for three-way comparison on primitives, as exposed by
Ord::cmp
in Rust or as<=>
"spaceship" in C++20. It's split in two parts:
- Detecting nested
select
s that are actually doing this, of which there are sadly many possible forms and no well-known-best way. By doing this cranelift can emit a much simpler MachInst pattern for it than if it needs to use cmovs and constants.- Adding simplifications when common comparisons are done on the result of the spaceship. This is particularly relevant for any code written against a 3-way-comparison interface but that only needs a 2-way result. For example, if you use
.sort_by(|a, b| b.cmp(a))
in Rust to reverse-sort something, it immediately just checks forOrdering::Less
, and if you have#[derive(PartialOrd)] struct Foo(u32);
then the generated<
operator is doing the equivalent of(a <=> b) < 0
, which this will improve toa < b
.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
scottmcm requested abrown for a review on PR #7636.
scottmcm requested wasmtime-compiler-reviewers for a review on PR #7636.
scottmcm submitted PR review.
scottmcm created PR review comment:
For example, this is the select pair that Clang 17 appears to use today: https://cpp.godbolt.org/z/nx16oGrPj
%cmp.lt = icmp ult i32 %a, %b %sel.lt = select i1 %cmp.lt, i8 -1, i8 1 %cmp.eq = icmp eq i32 %a, %b %sel.eq = select i1 %cmp.eq, i8 0, i8 %sel.lt
for which it emits the surprisingly-long assembly
xor ecx, ecx cmp edi, esi mov eax, 0 sbb eax, eax or al, 1 cmp edi, esi movzx eax, al cmove eax, ecx
whereas with this PR, cranelift can do just
function %spaceship_u32(i32, i32) -> i8 { block0(v0: i32, v1: i32): v2 = icmp ugt v0, v1 v3 = icmp ult v0, v1 v4 = isub v2, v3 return v4 } ; VCode: ; pushq %rbp ; movq %rsp, %rbp ; block0: ; cmpl %esi, %edi ; setnbe %al ; cmpl %esi, %edi ; setb %r10b ; subl %eax, %r10d, %eax ; movq %rbp, %rsp ; popq %rbp ; ret
(That does have an extra
cmp
, but there's enough ISLE in this already that I'll leave fixing that for a future PR.)
scottmcm submitted PR review.
scottmcm created PR review comment:
This is (after LLVM optimizations) the one that Rust started using in https://github.com/rust-lang/rust/issues/63758 : https://rust.godbolt.org/z/rrrvx8xce
github-actions[bot] commented on PR #7636:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "isle"Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
alexcrichton submitted PR review:
This looks great to me, thanks for adding these!
I've left a few comments, although they're mostly just cosmetic. The one about
(extractor)
might be interesting to explore if you're curious but don't take it as a requirement for this PR.
alexcrichton submitted PR review:
This looks great to me, thanks for adding these!
I've left a few comments, although they're mostly just cosmetic. The one about
(extractor)
might be interesting to explore if you're curious but don't take it as a requirement for this PR.
alexcrichton created PR review comment:
In the x64 backend there's a
maybe_uextend
extractor which you might be able to use here to deduplicate these rules.Additionally the comment above includes:
;; x < y ? -1 : x == y ? 0 : +1
but these rules don't seem to match that, is that a mistake?
alexcrichton created PR review comment:
One possible way you could simplify this is something like:
(decl spaceship_s (Value Value) Value) (extractor (spaceship_s v) (isub _ (sgt rty x y) (slt rty x y)))
I forget the precise syntax for
(extractor ...)
but I think it can be used along the lines of a macro-of-sorts to reduce the repetition here. That being said I've found them historically a bit cumbersome to use and I'm not sure you can pattern-match out therty
still as well (and maybety
too?) so this may not amount to much after all.
alexcrichton created PR review comment:
Doesn't need to be included in this PR specifically, but to make sure I understand, these can all be duplicated for
uextend
with the exception of changingsigned_cond_code
, right?
scottmcm submitted PR review.
scottmcm created PR review comment:
It does and it doesn't depending on your perspective.
That case is picked up by that pattern, as you can see in the
cmp_u1a
test https://github.com/bytecodealliance/wasmtime/pull/7636/files#diff-f7401bd506e4a5000af4eed4a8d0b9a4138c423291933d35ec683ff6a88f63f8R208But it happens via this
select (icmp ...) 0 1
simplification pattern: https://github.com/bytecodealliance/wasmtime/pull/7615/files#diff-ec130b69b5b51fce365916a438a4e1178165f56a6d4f0a3ec89d6bbafec0b454R12-R16And thus the pattern here sees it as
x < y ? -1 : x != y
.
scottmcm submitted PR review.
scottmcm created PR review comment:
There's definitely a more interesting pattern in here trying to get out.
I started down a path of "I think this actually works for any fits-in-
i8
constant", but then I think it's more like the "known bits" pattern thatuextend
has above, sincei8
is just the simple version of that, and I started adding new things to the prelude and such to try to match them well, then thought better of it, reverted all that, and just did the basic "compare with zero" stuff like the "zero-extended integers aren't zero" pattern just above it.So yeah, I think there's far more possible here, but I didn't want to go there this PR.
scottmcm submitted PR review.
scottmcm created PR review comment:
I'll give this (and the
maybe_uextend
from the other thread) a try. There's so much copy-pasting here -- and corresponding opportunities to forget to update theugt
s tosgt
s and vice versa -- that even if it's a bit cumbersome it may well be worth it.
scottmcm updated PR #7636.
scottmcm commented on PR #7636:
Adding the extractors and constructors worked great! Huge improvement, thanks.
maybe_uextend
didn't migrate trivially to the common prelude, so I think I'll skip that one for now. It doesn't help nearly as much as thespaceship_[su]
patterns anyway.
alexcrichton submitted PR review:
Nice I agree it worked out well! Thanks again!
alexcrichton commented on PR #7636:
Failure looks unrelated to this PR, I'll debug that separately.
alexcrichton commented on PR #7636:
Ok once https://github.com/bytecodealliance/wasmtime/pull/7651 lands I'll retry landing this
alexcrichton merged PR #7636.
Last updated: Nov 22 2024 at 16:03 UTC