whitequark edited Issue #1743:
While looking into another issue I minimized the following testcase, which produces invalid IR:
target x86_64 function u0:323() system_v { block0: v221 = bconst.b1 false v222 = bconst.b1 false v223 = bnot v221 v224 = band v223, v222 trap unreachable }
[-] v223 = bnot v221 ;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ; error: inst2: v223 is a real GPR value defined by a ghost instruction
It seems to me that it should work but I don't understand what's the missing legalization or selection rule is in this case. @iximeow?
github-actions[bot] commented on Issue #1743:
Subscribe to Label Action
cc @bnjbvr
<details>
This issue or pull request has been labeled: "cranelift"Thus the following users have been cc'd because of the following labels:
- bnjbvr: cranelift
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
whitequark commented on Issue #1743:
I've tried legalizing it like this but it doesn't work:
--- a/cranelift/codegen/meta/src/shared/legalize.rs +++ b/cranelift/codegen/meta/src/shared/legalize.rs @@ -5,6 +5,7 @@ use crate::cdsl::xform::{TransformGroupBuilder, TransformGroups}; use crate::shared::immediates::Immediates; use crate::shared::types::Float::{F32, F64}; use crate::shared::types::Int::{I128, I16, I32, I64, I8}; +use crate::shared::types::Bool::B1; use cranelift_codegen_shared::condcodes::{CondCode, IntCC}; #[allow(clippy::many_single_char_names, clippy::cognitive_complexity)] @@ -396,6 +397,14 @@ pub(crate) fn define(insts: &InstructionGroup, imm: &Immediates) -> TransformGro ); } + widen.legalize( + def!(a = bnot.B1(b)), + vec![ + def!(x = bint.I32(b)), + def!(a = icmp_imm.I32(intcc_eq, x, Literal::constant(&imm.uimm32, 0))) + ] + ); + // Widen instructions with one input operand. for &op in &[bnot, popcnt] { for &int_ty in &[I8, I16] {
whitequark edited a comment on Issue #1743:
I've tried legalizing it like this but it doesn't do anything:
--- a/cranelift/codegen/meta/src/shared/legalize.rs +++ b/cranelift/codegen/meta/src/shared/legalize.rs @@ -5,6 +5,7 @@ use crate::cdsl::xform::{TransformGroupBuilder, TransformGroups}; use crate::shared::immediates::Immediates; use crate::shared::types::Float::{F32, F64}; use crate::shared::types::Int::{I128, I16, I32, I64, I8}; +use crate::shared::types::Bool::B1; use cranelift_codegen_shared::condcodes::{CondCode, IntCC}; #[allow(clippy::many_single_char_names, clippy::cognitive_complexity)] @@ -396,6 +397,14 @@ pub(crate) fn define(insts: &InstructionGroup, imm: &Immediates) -> TransformGro ); } + widen.legalize( + def!(a = bnot.B1(b)), + vec![ + def!(x = bint.I32(b)), + def!(a = icmp_imm.I32(intcc_eq, x, Literal::constant(&imm.uimm32, 0))) + ] + ); + // Widen instructions with one input operand. for &op in &[bnot, popcnt] { for &int_ty in &[I8, I16] {
bjorn3 commented on Issue #1743:
You can try expand instead of widen.
whitequark commented on Issue #1743:
Tried that, it breaks with a really strange message:
error[E0425]: cannot find value `typeof_x` in this scope --> /home/whitequark/Projects/wasmtime/target/debug/build/cranelift-codegen-ca1b406eb4035fa6/out/legalizer.rs:195:68 | 195 | let predicate = predicate && TYPE_SETS[0].contains(typeof_x); | ^^^^^^^^ help: a local variable with a similar name exists: `typeof_a` error[E0425]: cannot find value `typeof_x` in this scope --> /home/whitequark/Projects/wasmtime/target/debug/build/cranelift-codegen-ca1b406eb4035fa6/out/legalizer.rs:197:46 | 197 | let y = pos.ins().iconst(typeof_x, -1); | ^^^^^^^^ help: a local variable with a similar name exists: `typeof_a` error[E0425]: cannot find value `x` in this scope --> /home/whitequark/Projects/wasmtime/target/debug/build/cranelift-codegen-ca1b406eb4035fa6/out/legalizer.rs:198:61 | 198 | let a = pos.func.dfg.replace(inst).bxor(x, y); | ^ help: a local variable with a similar name exists: `a`
iximeow commented on Issue #1743:
I think that entirely confused the legalizer codegen :D I'd expect the
Expand bnot using xor
bit in the shared legalizer to cover this case://# Expand bnot using xor. let minus_one = Literal::constant(&imm.imm64, -1); expand.legalize( def!(a = bnot(x)), vec![def!(y = iconst(minus_one)), def!(a = bxor(x, y))], );
I'm a little at a loss as to why this doesn't apply.
That said, adding a direct encoding for
B1
to x86 does make your test pass:diff --git a/cranelift/codegen/meta/src/isa/x86/encodings.rs b/cranelift/codegen/meta/src/isa/x86/encodings.rs index 7863e2bd8..9f1517253 100644 --- a/cranelift/codegen/meta/src/isa/x86/encodings.rs +++ b/cranelift/codegen/meta/src/isa/x86/encodings.rs @@ -1454,6 +1454,7 @@ fn define_alu( // x86 has a bitwise not instruction NOT. e.enc_i32_i64(bnot, rec_ur.opcodes(&NOT).rrr(2)); e.enc_b32_b64(bnot, rec_ur.opcodes(&NOT).rrr(2)); + e.enc_both(bnot.bind(B1), rec_ur.opcodes(&NOT).rrr(2)); // Also add a `b1` encodings for the logic instructions. // TODO: Should this be done with 8-bit instructions? It would improve partial register
iximeow commented on Issue #1743:
I'm a little at a loss as to why this doesn't apply.
Theory: the integer immediate forbids this legalization from applying for an incompatible
B1
operand?
abrown commented on Issue #1743:
I was just looking at this but because of other problems:
B1
s are going to look in theexpand_flags
transform group for legalizations, https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/codegen/meta/src/isa/x86/mod.rs#L40-L59. If it can't find it there, there is no "look in the next transform group" logic so it will just fail.
iximeow commented on Issue #1743:
expand_flags
ought to be chained withexpand
rules afterward: https://github.com/bytecodealliance/wasmtime/blob/162fcd3d756a5b0f2d39a22968b99243f404ce81/cranelift/codegen/meta/src/shared/legalize.rs#L954-L964
x86_expand
is then chained toexpand_flags
later downstream: https://github.com/bytecodealliance/wasmtime/blob/4ec16fa057774dcd0d82dce05f8160689f0fe050/cranelift/codegen/meta/src/isa/x86/legalize.rs#L11-L19so I'm actually more confused at
B1
types beingexpand_flags
rather thanx86_expand
:(
abrown commented on Issue #1743:
Ah, I think you're right: there's chaining for the DSL legalizations (I think?) but not for the custom functions?
abrown closed Issue #1743:
While looking into another issue I minimized the following testcase, which produces invalid IR:
target x86_64 function u0:323() system_v { block0: v221 = bconst.b1 false v222 = bconst.b1 false v223 = bnot v221 v224 = band v223, v222 trap unreachable }
[-] v223 = bnot v221 ;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ; error: inst2: v223 is a real GPR value defined by a ghost instruction
It seems to me that it should work but I don't understand what's the missing legalization or selection rule is in this case. @iximeow?
Last updated: Dec 23 2024 at 13:07 UTC