github-actions[bot] commented on Issue #2022:
Subscribe to Label Action
cc @bnjbvr, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:peepmatic"Thus the following users have been cc'd because of the following labels:
- bnjbvr: cranelift
- fitzgen: cranelift:area:peepmatic
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
MaxGraey commented on Issue #2022:
I do think that Peepmatic support for bnot instructions is useful, so if you
would be interested in separating out just the changes required to support
bnot, then I would be happy to merge them.What if I just add support same transform also for
simple_preopt.rs
for parity?
MaxGraey edited a comment on Issue #2022:
I do think that Peepmatic support for bnot instructions is useful, so if you
would be interested in separating out just the changes required to support
bnot, then I would be happy to merge them.What if I just add support same transform also for
simple_preopt.rs
for parity in this PR?
fitzgen commented on Issue #2022:
What if I just add support same transform also for
simple_preopt.rs
for parity in this PR?
simple_preopt.rs
optimizations need to be careful to carry their weight (which is part of what peepmatic is trying to improve upon) and I'm not sure what the exact calculus is that makes an optimization worth it or not. Perhaps @sunfishcode or @bnjbvr could comment on whether this would be a welcome addition tosimple_preopt.rs
.If so, then yes let's add it
preopt.peepmatic
as well!
bnjbvr commented on Issue #2022:
Thanks for working on this!
simple_preopt.rs optimizations need to be careful to carry their weight (which is part of what peepmatic is trying to improve upon) and I'm not sure what the exact calculus is that makes an optimization worth it or not. Perhaps @sunfishcode or @bnjbvr could comment on whether this would be a welcome addition to simple_preopt.rs.
The simple preopt pass was designed to catch frequent patterns that lead to better code generation and that were not directly produced by the wasm producer. We expect most of these optimizations to have already happened in e.g. LLVM. But the IR translation makes it so that some new pattern-matching opportunities on the CLIR arise. Most of simple preopt phase was mostly acting as an helper for instruction selection (legalization) that happened later in the pipeline, by selecting the
_imm
variants so that legalization would use machine instructions with immediates; the "pre" in preopt refers to the fact that it runs before legalization. However, instruction selection happens in a totally different place now, with the new target backends, so this part is becoming less useful.Integer transforms like could be valuable at the
speed
optimization level; there's some overhead for pattern-matching every single instruction, so we try to be careful to not put every optimization we can think of. Usually our line of thinking has been:
- is it reasonable that the CLIF producer already performed such an optimization? If yes, do no replicate it in Cranelift.
- data-driven approach: for large CLIF inputs (from large wasm programs), how many times is the optimization effective? if it's very frequent (there's no "official" threshold, so common sense is required here), then do it in Cranelift.
For instance, constant propagation doesn't quite happen in Cranelift, because we really expect the producer to have done it. For other producers which don't have advanced optimization passes like this, there's an external crate for
cranelift-preopt
that does exactly this.So considering the answers to the two questions above, it might be worth putting it in simple_preopt, or implementing it in the
preopt
crate?
MaxGraey commented on Issue #2022:
@bnjbvr thanks for explanation!
Main goal of this PR is speedup code produced by webassembly. As you know wasm doesn't havenot
operation and emulate it viax ^ -1
. Also perhaps it could be useful even for LLVM code built with-O0
for some reasons.
julian-seward1 commented on Issue #2022:
Also, for something as simple as
x ^ -1
, the new backend(s) that are under development, could easily enough match that in their instruction selectors and directly emit anot
insn as appropriate. There's no requirement for any more heavyweight machinery to handle this particular case.
MaxGraey commented on Issue #2022:
Hmm, could you explain which kind of rules are preferable for peepmatic DSL? It would really help me.
Last updated: Nov 22 2024 at 16:03 UTC