bjorn3 commented on issue #4721:
cc https://github.com/bytecodealliance/wasmtime/issues/3250
One of my arguments against this is that it makes the clif ir harder to read and harder to generate.
jameysharp commented on issue #4721:
Hah, I guess I accidentally implemented what @cfallin proposed just by being confused. Thanks for digging that link up! It's helpful context.
I noticed a couple things that I think are existing bugs while exploring this:
The backends all panic if they see these opcodes during lowering... Except that x64 has rules in
lower.isle
to matchiadd_imm
. Can those rules ever fire?I see
simple_preopt
was intended to rewrite "(iadd (iconst x) (iadd (iconst y) z))" to fold the constants together, and it uses the*_imm
opcodes as an intermediate step. But it looks like that only actually happens if the constant is the first operand, not the second, because only one case has a recursive call tosimplify
.Independent of whether we keep the
*_imm
instructions, I'm also wondering: Cansimple_preopt
be replaced with ISLE and match on the full tree structure, instead of rewriting to this intermediate form to avoid having to look three layers down in the tree of operators? I wonder if it would reduce compile time by not modifying instructions that it can't actually improve. Is that something that should happen after the e-graph work lands?
afonso360 commented on issue #4721:
Wow, there is a lot more stuff attached to
_imm
than i expected!Is it possible to print a comment with the const value if one of the sides of the
iadd
/isub
/etc.. is a const? That way we could recover some of the readability.For example:
v0 = iconst.i16 0xABCD v1 = ... v2 = ... v3 = iadd.i16 v2, v0 ; v0 = 0xABCD
It would also help in cases other than
*_imm
. But probably does not cover all cases.
bjorn3 commented on issue #4721:
Independent of whether we keep the *_imm instructions, I'm also wondering: Can simple_preopt be replaced with ISLE and match on the full tree structure
I believe that is exactly what will happen with the egraph PR if it gets merged. https://github.com/bytecodealliance/wasmtime/pull/4249/files#diff-a596f1a407284b0b3591c17053fe0eff7f0229b70a2fcd9282fc2aedf9cd38a4
bjorn3 commented on issue #4721:
Is it possible to print a comment with the const value if one of the sides of the iadd/isub/etc.. is a const? That way we could recover some of the readability.
I like that! It would resolve my readability concerns. It doesn't resolve my clif ir generation concern though.
afonso360 commented on issue #4721:
It doesn't resolve my clif ir generation concern though.
Reading the PR it looks like we keep the
_imm
builder functions, but they now emit theop
+iconst
, so there should no difference to users of cranelift. It that what you were referring to or did I miss something?
afonso360 edited a comment on issue #4721:
It doesn't resolve my clif ir generation concern though.
Reading the PR it looks like we keep the
_imm
builder functions, but they now emit theop
+iconst
, so there should no difference to users of cranelift. It that what you were referring to or did I miss something?Although I do think we should note in the docs that those helpers are not actual instructions by themselves.
bjorn3 commented on issue #4721:
Reading the PR it looks like we keep the _imm builder functions, but they now emit the op + iconst, so there should no difference to users of cranelift.
Right, missed that.
cfallin commented on issue #4721:
@jameysharp thanks for exploring here. As others have pointed out, there's a lot of prior thought on this, and overall I think we do want to remove these variants in due time.
The backends all panic if they see these opcodes during lowering... Except that x64 has rules in lower.isle to match iadd_imm. Can those rules ever fire?
Currently the legalizer rewrites them from
_imm
form to op-with-iconst before the lowering backends see the CLIF; this is why we don't need to handle them in each backend.I see simple_preopt was intended to rewrite "(iadd (iconst x) (iadd (iconst y) z))" to fold the constants together, and it uses the *_imm opcodes as an intermediate step. But it looks like that only actually happens if the constant is the first operand, not the second, because only one case has a recursive call to simplify.
Independent of whether we keep the *_imm instructions, I'm also wondering: Can simple_preopt be replaced with ISLE and match on the full tree structure, instead of rewriting to this intermediate form to avoid having to look three layers down in the tree of operators? I wonder if it would reduce compile time by not modifying instructions that it can't actually improve. Is that something that should happen after the e-graph work lands?
Indeed, that's one of the main effects of my mid-end optimizer work; it will replace
simple_preopt
just as you are suggesting.Now, regarding
iadd
vsiadd_imm
and friends:I think that breaking the ops into smaller pieces, and keeping
iadd
separate from itsimm
, makes the most sense from an optimization-rules perspective: we otherwise need rules to match both. (For example, multiply-add needs to also match multiply-add-imm.) Likewise it's useful to keepiconst
separate because we want to do special things with it sometimes, like rematerialization and/or use of constant pools.Part of the reason for the original
_imm
forms, aside from the code-size advantage that @sunfishcode mentions in #3250, was that old Cranelift rewrote CLIF until each op had a direct machine-instruction equivalent; soiadd_imm
was useful to represent the reg-immediate form of the x86 add instruction. We no longer have that constraint, though.So I think the right way forward is in spirit similar to what you've done here, but (i) do it after the mid-end optimizer lands (pending RFC approval!) because there are some immediate followup thoughts in how to make legalization ride on top of it; and (ii) generate the
_imm
helpers on theInstBuilder
in a programmatic way, rather than manually writing them.
jameysharp commented on issue #4721:
Is it possible to print a comment with the const value if one of the sides of the
iadd
/isub
/etc.. is a const? That way we could recover some of the readability.
...
v3 = iadd.i16 v2, v0 ; v0 = 0xABCD
This turned out to be really easy to do. I've opened #4725 for that.
The backends all panic if they see these opcodes during lowering... Except that x64 has rules in lower.isle to match iadd_imm. Can those rules ever fire?
Currently the legalizer rewrites them from
_imm
form to op-with-iconst before the lowering backends see the CLIF; this is why we don't need to handle them in each backend.Right. So these rules in x64's lower.isle can never match, yeah?
So I think the right way forward is in spirit similar to what you've done here, but (i) do it after the mid-end optimizer lands (pending RFC approval!) because there are some immediate followup thoughts in how to make legalization ride on top of it; and (ii) generate the
_imm
helpers on theInstBuilder
in a programmatic way, rather than manually writing them.(i) makes sense. (ii) isn't immediately (!) obvious to me: is it that you want to make this pattern available as a convenience on more instructions, or you're worried about the correctness of the hand-written wrappers, or...?
cfallin commented on issue #4721:
(i) makes sense. (ii) isn't immediately (!) obvious to me: is it that you want to make this pattern available as a convenience on more instructions, or you're worried about the correctness of the hand-written wrappers, or...?
Basically, generating the code is less error-prone and allows us to extend the pattern to arbitrary instructions as we choose fit; it's a lower-entropy encoding of the system design. We already generate
.iadd()
so generating.iadd_imm()
given a special flag isn't a huge stretch. In contrast, mixing generated code and handwritten code (this PR) feels asymmetric and is difficult to maintain and extend in the future.
cfallin commented on issue #4721:
Right. So these rules in x64's lower.isle can never match, yeah?
Apparently so! We can go ahead and delete those; I suspect fuzz coverage, if we were to get it in terms of ISLE source, would show this as well.
jameysharp commented on issue #4721:
Right. So these rules in x64's lower.isle can never match, yeah?
Apparently so! We can go ahead and delete those; I suspect fuzz coverage, if we were to get it in terms of ISLE source, would show this as well.
Great, I've opened #4726 for deleting those.
Basically, generating the code is less error-prone and allows us to extend the pattern to arbitrary instructions as we choose fit; it's a lower-entropy encoding of the system design. We already generate
.iadd()
so generating.iadd_imm()
given a special flag isn't a huge stretch. In contrast, mixing generated code and handwritten code (this PR) feels asymmetric and is difficult to maintain and extend in the future.Okay, I can see that. It's not obvious to me what code that should generate, though, because the current
InstBuilder
traits assume each method will build exactly one instruction, consuming the builder. These_imm
variants, being a kind of macro for multiple instructions, are fundamentally different.But we can sort that out when it's time to re-visit this after your e-graph work lands.
Last updated: Nov 22 2024 at 17:03 UTC