Stream: git-wasmtime

Topic: wasmtime / Issue #1709 Add the ability to build and regis...


view this post on Zulip Wasmtime GitHub notifications bot (May 14 2020 at 18:07):

fitzgen opened Issue #1709:

E.g. it may make sense for the cg_clif (the cranelift-based backend for rustc) to have its own peephole optimizations pass that contains optimizations specific to code generated by cg_clif.

view this post on Zulip Wasmtime GitHub notifications bot (May 14 2020 at 18:07):

fitzgen labeled Issue #1709:

E.g. it may make sense for the cg_clif (the cranelift-based backend for rustc) to have its own peephole optimizations pass that contains optimizations specific to code generated by cg_clif.

view this post on Zulip Wasmtime GitHub notifications bot (May 14 2020 at 18:07):

github-actions[bot] commented on Issue #1709:

Subscribe to Label Action

cc @fitzgen

<details>
This issue or pull request has been labeled: "cranelift:area:peepmatic"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (May 14 2020 at 18:07):

github-actions[bot] commented on Issue #1709:

Subscribe to Label Action

cc @bnjbvr

<details>
This issue or pull request has been labeled: "cranelift"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Jul 15 2020 at 23:12):

MaxGraey commented on Issue #1709:

Regarding other peepmatic transform rules, I'm wondering whether it is possible to implement a simplification like this for integers that actually behave as booleans. Something like:

(=> (when (icmp_imm eq 0 $x)
      (bit-width $x 1))
    (bxor_imm 1 $x))

and other transforms that involve the i1 (boolean) type. Or does this not make sense for peepmatic?
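The precondition in the rule above matters, and a small check makes that concrete. The following is a minimal Rust sketch, not peepmatic or Cranelift code: the function names are hypothetical, and `u8` stands in for a value whose bit width the rule would have to constrain. It models `icmp_imm eq 0 $x` and `bxor_imm 1 $x` and compares their results.

```rust
/// Models `icmp_imm eq 0 $x`: yields 1 when `x` is zero, otherwise 0.
fn icmp_eq_zero(x: u8) -> u8 {
    (x == 0) as u8
}

/// Models `bxor_imm 1 $x`: flips the lowest bit of `x`.
fn bxor_one(x: u8) -> u8 {
    x ^ 1
}

/// Hypothetical helper: true when both forms give the same result for `x`.
fn rewrite_is_sound_for(x: u8) -> bool {
    icmp_eq_zero(x) == bxor_one(x)
}

fn main() {
    // Only 0 and 1 are valid values of a 1-bit quantity; 2 and 3 are
    // included to show where the rewrite stops being valid.
    for x in 0u8..=3 {
        println!(
            "x = {}: icmp = {}, bxor = {}, equal = {}",
            x,
            icmp_eq_zero(x),
            bxor_one(x),
            rewrite_is_sound_for(x)
        );
    }
}
```

The two forms agree for 0 and 1 and disagree for wider values, which is why the rule needs a guard such as `(bit-width $x 1)`.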

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 00:59):

fitzgen commented on Issue #1709:

Although there is a b1 type for 1-bit booleans, there isn't really an i1 type for 1-bit integers.

As for specifically replacing an icmp_imm with a bxor_imm, I'm not convinced it would improve codegen since both instructions have a single uop and latency of 1 cycle on x86_64 (I don't know about aarch64).

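On the more general topic of "what kind of optimizations make sense for peepmatic?": I don't think it makes sense to focus on adding one-off peepmatic optimizations to Cranelift at this time. Instead, I am planning on writing two things: a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR, and a Souper-to-peepmatic translator.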

With these two bits in hand, we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes. This should be much more fruitful (and less buggy!) than writing optimizations by hand.

But of course by the time we have these synthesized optimizations from Souper, we need a way to hook them into Cranelift, which is what resolving this issue should provide :)

If you're interested in helping out with this stuff, let me know and I can try and divide up these tasks into smaller bits! Also, I probably didn't explain everything super well, so if you have questions about what I am talking about, don't hesitate to ask questions.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 07:44):

MaxGraey commented on Issue #1709:

a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR
we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes

That sounds very cool! As I understand it, the work is currently happening in this repo?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 07:50):

MaxGraey edited a comment on Issue #1709:

a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR
we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes

That sounds very cool! As I understand it, progress is happening here?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 07:52):

MaxGraey edited a comment on Issue #1709:

a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR
we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes

That sounds very cool! As I understand it, progress is happening here?

As for specifically replacing an icmp_imm with a bxor_imm, I'm not convinced it would improve codegen since both instructions have a single uop and latency of 1 cycle on x86_64 (I don't know about aarch64).

Thanks for the explanation. I only ask because LLVM's cost model seems to always prefer bitwise xor for i1 types

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 07:52):

MaxGraey edited a comment on Issue #1709:

a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR
we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes

That sounds very cool! As I understand it, progress is happening here?

As for specifically replacing an icmp_imm with a bxor_imm, I'm not convinced it would improve codegen since both instructions have a single uop and latency of 1 cycle on x86_64 (I don't know about aarch64).

Thanks for the explanation. I only ask because LLVM's cost model seems to always prefer bitwise xor for i1/b1 types

view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2020 at 07:53):

MaxGraey edited a comment on Issue #1709:

a left-hand side extractor, that extracts candidate LHSes from clif IR into Souper IR
we will be able to automatically find missing optimizations, and generate the optimal RHS for each of our given LHSes

That sounds very cool! As I understand it, progress is happening here?

As for specifically replacing an icmp_imm with a bxor_imm, I'm not convinced it would improve codegen since both instructions have a single uop and latency of 1 cycle on x86_64 (I don't know about aarch64).

Thanks for the explanation. I only ask because LLVM's cost model seems to always prefer bitwise xor for i1/b1 types in this case

view this post on Zulip Wasmtime GitHub notifications bot (Jul 17 2020 at 16:45):

fitzgen commented on Issue #1709:

That sounds very cool! As I understand it, progress is happening here?

Sort of. Jubi and I are working closely together, but her project is taking a slightly different path (translating Souper optimizations into Rust source code that implements a peephole pass directly, rather than using peepmatic, which didn't exist when she started). That said, we are sharing notes and brainstorming together.

I haven't started working on the extractor or the Souper-to-peepmatic stuff yet, because I've been busy with reference types.

I only ask because LLVM's cost model seems to always prefer bitwise xor for i1/b1 types in this case

It could be canonicalization, like you mentioned in the other thread. Or perhaps on some microarchs it makes a difference.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 17 2020 at 19:41):

MaxGraey commented on Issue #1709:

I haven't started working on the extractor or the Souper-to-peepmatic stuff yet, because I've been busy with reference types

Ah, got it. Yeah, implementing reference types is really tricky for VMs and compilers. I heard LLVM / lld has had some difficulties with that.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 17 2020 at 19:41):

MaxGraey edited a comment on Issue #1709:

I haven't started working on the extractor or the Souper-to-peepmatic stuff yet, because I've been busy with reference types

Ah, got it. Yeah, implementing reference types is really tricky for VMs and compilers. I heard LLVM / lld has had some difficulties with that as well

view this post on Zulip Wasmtime GitHub notifications bot (Jul 18 2020 at 09:01):

MaxGraey commented on Issue #1709:

I'm wondering whether it makes sense to add an optimization for a double equal-to-zero. For example, Binaryen (wasm target), when optimizing for size, usually prefers (i32.eqz (i32.eqz (local.get $x))) over (i32.ne (local.get $x) (i32.const 0)) because the first version is one byte smaller
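The size claim can be checked against the WebAssembly binary encoding. The sketch below is a hedged illustration, not Binaryen or Cranelift code: the helper names are hypothetical, and it assumes the local index fits in a single LEB128 byte (index below 128). It shows that the two forms compute the same value and that the double `i32.eqz` encoding is one byte shorter.

```rust
// Opcodes from the WebAssembly binary format.
const LOCAL_GET: u8 = 0x20;
const I32_CONST: u8 = 0x41;
const I32_EQZ: u8 = 0x45;
const I32_NE: u8 = 0x47;

/// Models `i32.eqz`: 1 when `x` is zero, otherwise 0.
fn eqz(x: i32) -> i32 {
    (x == 0) as i32
}

/// Models `(i32.ne x (i32.const 0))`: 1 when `x` is nonzero, otherwise 0.
fn ne_zero(x: i32) -> i32 {
    (x != 0) as i32
}

/// Encodes `(i32.eqz (i32.eqz (local.get $x)))`.
/// Assumes `local` < 128 so the index is a single LEB128 byte.
fn encode_double_eqz(local: u8) -> Vec<u8> {
    vec![LOCAL_GET, local, I32_EQZ, I32_EQZ]
}

/// Encodes `(i32.ne (local.get $x) (i32.const 0))`.
/// The constant 0 takes one LEB128 byte after the `i32.const` opcode.
fn encode_ne_zero(local: u8) -> Vec<u8> {
    vec![LOCAL_GET, local, I32_CONST, 0x00, I32_NE]
}

fn main() {
    for x in [0, 1, -1, 42, i32::MIN, i32::MAX] {
        assert_eq!(eqz(eqz(x)), ne_zero(x));
    }
    println!(
        "double eqz: {} bytes, ne with const 0: {} bytes",
        encode_double_eqz(0).len(),
        encode_ne_zero(0).len()
    );
}
```

The saving comes from dropping the `i32.const 0` operand (two bytes) in exchange for one extra one-byte `i32.eqz`. Whether the shorter form is also preferable for generated machine code is a separate question that this sketch does not address.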


Last updated: Dec 23 2024 at 12:05 UTC