afonso360 opened PR #6196 from afonso360:long-bmask
to bytecodealliance:main
:
:wave: Hey,
This is something I wrote in #5888 and then @elliottt pointed out that It's just
bmask
. I checked if we had a rule for this in the mid end and we don't.This is a really long long rule, so I'm not sure how often it's going to fire. But I also got that lowering from here so it is out there in the real world.
This is based on top of #6140 so that we don't have to rebase that again.
afonso360 requested fitzgen for a review on PR #6196.
afonso360 requested wasmtime-compiler-reviewers for a review on PR #6196.
afonso360 edited PR #6196:
:wave: Hey,
This is something I wrote in #5888 and then @elliottt pointed out that It's just
bmask
. I checked if we had a rule for this in the mid end and we don't.This is a really long long rule, and I'm not sure how often it's going to fire. But I also got that lowering from here so it is out there in the real world.
This is based on top of #6140 so that we don't have to rebase that again.
afonso360 updated PR #6196.
afonso360 edited PR #6196:
:wave: Hey,
This is something I wrote in #5888 and then @elliottt pointed out that It's just
bmask
. I checked if we had a rule for this in the mid end and we don't.This is a really long long rule, and I'm not sure how often it's going to fire. But I also got that lowering from here so it is out there in the real world.
This is based on top of #6140 so that we don't have to rebase that again.
I'm going to let the fuzzer run on this for a while.
afonso360 edited PR #6196:
:wave: Hey,
This is something I wrote in #5888 and then @elliottt pointed out that It's just
bmask
. I checked if we had a rule for this in the mid end and we don't.This is a really long rule, and I'm not sure how often it's going to fire. But I also got that lowering from here so it is out there in the real world.
This is based on top of #6140 so that we don't have to rebase that again.
I'm going to let the fuzzer run on this for a while.
afonso360 updated PR #6196.
afonso360 updated PR #6196.
afonso360 updated PR #6196.
fitzgen submitted PR review.
fitzgen submitted PR review.
fitzgen created PR review comment:
Is this actually correct? I couldn't convince myself, so I asked Z3 to find an
x
for whichbmask(x)
and!(((x | ((!x) + 1)) >> (ty_bits - 1)) - 1)
were not equal (that is, I asked it to find a counterexample):(declare-const x (_ BitVec 32)) (assert (! (= (ite (= x #x00000000) #x00000000 #xffffffff) ;; !(((x | ((!x) + 1)) >> 31) - 1) (bvnot (bvsub (bvlshr (bvor x (bvadd (bvnot x) #x00000001)) #x0000001f) #x00000001))))) (check-sat) (get-model)
It reports
sat
, meaning that it found a counterexample, and it says the counterexample is zero:sat ( (define-fun x () (_ BitVec 32) #x00000000) )
So unless I'm missing something and/or translated the expression incorrectly, I think this optimization is incorrect?
fitzgen submitted PR review.
fitzgen created PR review comment:
Then again, there is an explicit test for 0 in the run tests, so I must have done something wrong...
cfallin submitted PR review.
cfallin created PR review comment:
The SMT counterexample is confusing me too:
x1 := (bvnot x)
withx := 0
->0xffffffff
x2 := (bvadd x1 1)
->0
x3 := (bvor x2 x)
->(bvor 0 0)
->0
x4 := (bvlshr x3 31)
->0
x5 := (bvsub x4 1)
->0xffffffff
x6 := (bvnot x5)
->0
and on the LHS for the if-then-else we have
(ite (= 0 0) 0 0xffffffff)
->0
. Seems like the optimization did what it was supposed to? I too am confused why we get SAT here.
jameysharp submitted PR review.
jameysharp created PR review comment:
That Z3 expression looks like a correct translation of that pattern to my eyes, so I'm confused too.
I think the rule seems correct:
!input + 1
is equivalent to-input
in two's-complement.- 0 is the only value which, when negated, leaves the sign bit clear. (
INT_MIN
leaves the sign bit set; all other cases flip the sign bit.)- Shifting the sign bit down after
x | -x
then gives us 0 if the input was 0, and 1 otherwise.- Subtracting 1 means we get -1 if the input was 0, and 0 otherwise.
- Inverting the result should then give us the right answer.
cfallin submitted PR review.
cfallin created PR review comment:
And I think this is a reasonable informal proof, by cases:
Prove:
;; bmask(input) = !(((input | ((!input) + 1)) >> 63) - 1)
(for 64-bit input)
- Take the three cases
input == 0
: we have the above evaluation; the RHS produces0
andbmask(0) == 0
.input < 0
:input
has MSB = 1, thusinput | ...
has MSB = 1. Right-shift by 63, we have1
;1 - 1
=0
; outer bnot gives us all-ones.input > 0
:input
has MSB = 0, thus!input
has MSB = 1. Furthermore!input != 0xffff...ffff
(else we would have hadinput == 0
). So!input + 1
also has MSB = 1 (because it will not wrap around, because of previous statement). Soinput | ((!input) + 1)
has MSB = 1. Right-shift by 63, we have1
;1 - 1
=0
; outer bnot gives us all-ones.
cfallin submitted PR review.
cfallin created PR review comment:
(@afonso360 a comment to this effect in the source would be very useful for the future!)
cfallin submitted PR review.
cfallin created PR review comment:
Ah, and after posting I see @jameysharp 's comment too. Both of us arrived at the same reasoning more or less so I'm fairly confident here :-)
jameysharp submitted PR review.
jameysharp created PR review comment:
We can extract out more broadly-applicable rules that should compose nicely into this one.
Rewrite both of these patterns to
(ineg ty input)
:
(iadd ty (bnot ty input) (iconst ty (u64_from_imm64 1)))
(bnot ty (isub ty input (iconst ty (u64_from_imm64 1))))
In a 64-bit word,
-(x >> 63)
, where right-shift is unsigned, should I think be equivalent tox >> 63
with a signed right-shift. Or in general, rewrite negation of unsigned shifts by N-1 bits in N-bit words.Then this rule only needs to match
(x | -x) >> (N-1)
, with a signed right-shift.That should make all these rules more broadly applicable as well as (somewhat) easier to understand.
x | -x
leaves the trailing 0s alone and sets the remaining bits to 1. (I found the inverse of this pattern in "Hacker's Delight", on page 12.) I don't think there's a useful rule we can extract from that pattern alone.
elliottt submitted PR review.
elliottt created PR review comment:
The problem with the smtlib version was the use of
!
, which is for annotating expressions: switching tonot
fixes the problem:(declare-const x (_ BitVec 32)) (assert (not (= (ite (= x #x00000000) #x00000000 #xffffffff) ;; !(((x | ((!x) + 1)) >> 31) - 1) (bvnot (bvsub (bvlshr (bvor x (bvadd (bvnot x) #x00000001)) #x0000001f) #x00000001))))) (check-sat) (get-model)
produces the following output:
% z3 -smt2 test.smt unsat (error "line 14 column 10: model is not available")
fitzgen submitted PR review.
fitzgen submitted PR review.
fitzgen created PR review comment:
Nitpick: instead of hard coding
63
lets make thisbits(ty) - 1
.Also, lets add a comment with an informal proof / argument about why this is correct, as @cfallin mentioned.
jameysharp submitted PR review.
jameysharp created PR review comment:
I think the correctness argument is much less important to write out if this is split into the four separate rules I suggested. They're individually much easier to verify.
afonso360 updated PR #6196.
afonso360 updated PR #6196.
afonso360 updated PR #6196.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
This comment is great, thank you!
I guess we should add the commutative version of this too, for
(-x | x)
, to go with(x | -x)
.
jameysharp created PR review comment:
I'd like a couple more cases, maybe instead of the cases for 2 or 3:
- 0xFFFF_FFFF_FFFF_FFFF
- 0x8000_0000_0000_0000
I'd also be fine with narrowing this to
i32
ori16
so the interesting edge cases have smaller constants…
jameysharp created PR review comment:
This rule would also work for
(x + (-1))
and its commutative equivalent.I considered suggesting that we should have a rule that transforms
isub
where the second operand is constant intoiadd
of the negation of that constant. Then any rules involving addition with a constant can fire forisub
as well. For example,(!x) - (-1)
would fire the rules above.That's not ideal though since it replaces an instruction with another instruction of equal cost, and I think we're converging on the idea that every egraph rule should rewrite to an expression of lower total cost. So in this case we'd want six rules in total:
!x + 1 == 1 + !x == !x - (-1) == -x
!(x - 1) == !(x + (-1)) == !((-1) + x) == -x
jameysharp created PR review comment:
Elsewhere in this file these comments were written in the other order, with the pattern we're looking for first and the simpler equivalent afterward, and I think that makes a little more sense. So this example would instead be:
;; (!x) + 1 == ineg(x)
And similarly through the rest of the rules in this PR.
afonso360 updated PR #6196.
afonso360 submitted PR review.
jameysharp merged PR #6196.
Last updated: Nov 22 2024 at 16:03 UTC