Stream: git-wasmtime

Topic: wasmtime / issue #4123 Cranelift: improve codegen for boo...


Wasmtime GitHub notifications bot (May 10 2022 at 21:48):

cfallin labeled issue #4123:

The code generation for (i) branches on booleans, and (ii) branches on integer values that come from compares but are not directly observable (e.g. in a different basic block), is suboptimal. We often see:

The root causes are:

Some combination of more aggressive pattern matching and demanded-bits analysis could improve the codegen in these cases.

Wasmtime GitHub notifications bot (May 10 2022 at 21:48):

cfallin opened issue #4123:

The code generation for (i) branches on booleans, and (ii) branches on integer values that come from compares but are not directly observable (e.g. in a different basic block), is suboptimal. We often see:

The root causes are:

Some combination of more aggressive pattern matching and demanded-bits analysis could improve the codegen in these cases.

Wasmtime GitHub notifications bot (May 31 2022 at 10:00):

sparker-arm commented on issue #4123:

Would preventing the legalization of br_icmp also help here? We could possibly even do the opposite and combine br and icmp when the icmp has a single user.
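
To illustrate the single-user condition in the suggestion above, here is a minimal sketch on a toy, straight-line SSA instruction list. It is not code from the thread and does not use Cranelift's real data structures: the `Inst` enum, `use_count`, and `fuse_single_use_compares` are all hypothetical names, and the sketch assumes each value is defined exactly once.

```rust
// Toy IR for illustration only. Every name here is hypothetical and
// unrelated to Cranelift's actual CLIF types.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// v[dst] = (v[lhs] == v[rhs])
    IcmpEq { dst: u32, lhs: u32, rhs: u32 },
    /// Branch if v[cond] is nonzero.
    Brnz { cond: u32 },
    /// Fused compare-and-branch.
    BrIcmpEq { lhs: u32, rhs: u32 },
    /// Any other consumer of a value.
    Use { val: u32 },
}

/// Count how many operand slots in `insts` read value `v`.
fn use_count(insts: &[Inst], v: u32) -> usize {
    insts
        .iter()
        .map(|i| match i {
            Inst::IcmpEq { lhs, rhs, .. } | Inst::BrIcmpEq { lhs, rhs } => {
                (*lhs == v) as usize + (*rhs == v) as usize
            }
            Inst::Brnz { cond } => (*cond == v) as usize,
            Inst::Use { val } => (*val == v) as usize,
        })
        .sum()
}

/// Fuse `IcmpEq` + `Brnz` pairs, but only when the branch is the
/// compare's sole user; otherwise the compare must stay materialized.
fn fuse_single_use_compares(insts: &[Inst]) -> Vec<Inst> {
    // Collect (dst, lhs, rhs) for compares whose only use is a Brnz.
    let fusable: Vec<(u32, u32, u32)> = insts
        .iter()
        .filter_map(|i| match i {
            Inst::IcmpEq { dst, lhs, rhs }
                if use_count(insts, *dst) == 1
                    && insts.contains(&Inst::Brnz { cond: *dst }) =>
            {
                Some((*dst, *lhs, *rhs))
            }
            _ => None,
        })
        .collect();

    insts
        .iter()
        .filter_map(|i| match i {
            // Drop the compare: it is folded into the branch below.
            Inst::IcmpEq { dst, .. } if fusable.iter().any(|f| f.0 == *dst) => None,
            Inst::Brnz { cond } => match fusable.iter().find(|f| f.0 == *cond) {
                Some(&(_, lhs, rhs)) => Some(Inst::BrIcmpEq { lhs, rhs }),
                None => Some(i.clone()),
            },
            other => Some(other.clone()),
        })
        .collect()
}
```

With this sketch, a compare followed only by its branch collapses into one `BrIcmpEq`, while a compare whose result is also read elsewhere is left untouched.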

Wasmtime GitHub notifications bot (May 31 2022 at 16:43):

cfallin commented on issue #4123:

Possibly... I'm a bit torn because our general principle is to decompose ops at the CLIF level, to allow for better optimization (this has certainly been our rule for SIMD for example). I could imagine a case where a cset produces a bool hoisted out of a loop, and the loop branches on that rather than a fresh compare, which reduces the loop-carried live set by one register. (This actually makes me think we might eventually want to have a notion of "hoisted for perf reasons, do not re-merge" fed from the mid-end into isel pattern matching, but that's a separate conversation!)

An ad-hoc fusing pass is also somewhat brittle; imho it's better to have one place where we reason about macro-op matching (namely isel).

I think we should be able to do OK with better pattern matching, at least to remove the masking to start; but this is still pretty open to investigation!
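
One way to picture "removing the masking" is a known-zero-bits check on a toy expression tree. This is a sketch, not code from the thread. It assumes the mask in question is an AND with a constant applied to a compare result that can only be 0 or 1, which the thread does not spell out. It uses a simple "which bits may be set" analysis rather than a full demanded-bits analysis, and `Expr`, `maybe_set`, and `strip_redundant_masks` are hypothetical names, not Cranelift APIs.

```rust
// Toy expression tree for illustration only; not Cranelift's IR.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Arg(u32),
    Const(u64),
    /// Integer equality compare; assumed to yield exactly 0 or 1.
    CmpEq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Conservative set of bits that may be 1 in the result.
/// Any bit outside this set is known to be zero.
fn maybe_set(e: &Expr) -> u64 {
    match e {
        Expr::Arg(_) => u64::MAX, // nothing known about arguments
        Expr::Const(c) => *c,
        Expr::CmpEq(..) => 1, // only the low bit can be set
        Expr::And(a, b) => maybe_set(a) & maybe_set(b),
    }
}

/// Remove `x & mask` when every bit `x` may set is already inside `mask`.
fn strip_redundant_masks(e: Expr) -> Expr {
    match e {
        Expr::And(a, b) => {
            let a = strip_redundant_masks(*a);
            let b = strip_redundant_masks(*b);
            match b {
                // The mask clears only bits that are already known zero.
                Expr::Const(mask) if maybe_set(&a) & !mask == 0 => a,
                b => Expr::And(Box::new(a), Box::new(b)),
            }
        }
        Expr::CmpEq(a, b) => Expr::CmpEq(
            Box::new(strip_redundant_masks(*a)),
            Box::new(strip_redundant_masks(*b)),
        ),
        other => other,
    }
}
```

Under these assumptions, `CmpEq(a, b) & 1` simplifies to `CmpEq(a, b)`, while `Arg(0) & 1` keeps its mask because nothing is known about the argument's upper bits.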


Last updated: Oct 23 2024 at 20:03 UTC