ggreif opened PR #13334 from ggreif:gabor/ctz-clz-brif-lowering to bytecodealliance:main:
Summary
Follow-up to #13332. That PR added egraph rules collapsing `(eq (ctz X) 0)` / `(ne (ctz X) 0)` / `(eq (clz X) 0)` / `(ne (clz X) 0)` to direct LSB / sign-bit tests, but only when the comparison is mediated by an explicit `icmp`. The wasm front-end translates wasm `if (ctz X)` to `brif (ireduce.i32 (ctz.i64 X))` directly (no `icmp`), so the egraph rules don't fire on the wasm-natural shape.

This PR closes the gap by specialising `is_nonzero` in the x64 backend, the helper that all `brif`/`select`/`trapif` lowerings funnel through.

Rules
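As a sanity check on the bit identities the rules in this section rely on, here is a minimal Rust sketch. It is not part of the PR: the function names are hypothetical, the x64 flags are modelled with plain integer arithmetic, and 64-bit operands are assumed.

```rust
/// Unspecialised condition: count trailing zeros, then test the count for non-zero.
pub fn ctz_nonzero_slow(x: u64) -> bool {
    x.trailing_zeros() != 0
}

/// Specialised condition: models `test x, 1` followed by a branch on ZF set (CC.Z).
/// The trailing-zero count is non-zero exactly when bit 0 is clear.
pub fn ctz_nonzero_fast(x: u64) -> bool {
    x & 1 == 0
}

/// Unspecialised condition: count leading zeros, then test the count for non-zero.
pub fn clz_nonzero_slow(x: u64) -> bool {
    x.leading_zeros() != 0
}

/// Specialised condition: models `test x, x` followed by a branch on SF clear (CC.NS).
/// The leading-zero count is non-zero exactly when the sign bit is clear.
pub fn clz_nonzero_fast(x: u64) -> bool {
    (x as i64) >= 0
}

/// Models `ireduce.i32 (ctz.i64 x)`: the count lies in [0, 64], so truncating
/// the 64-bit count to 32 bits loses nothing.
pub fn ctz_reduced(x: u64) -> u32 {
    (x.trailing_zeros() as u64) as u32
}
```

Both pairs agree on the edge cases too: for `x = 0` the count equals the bit width, which is non-zero, and both fast forms return `true`.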
In `cranelift/codegen/src/isa/x64/inst.isle`:

    ;; ctz(val) != 0 iff bit 0 of val is clear: `test val, 1`, branch on ZF set.
    (rule 3 (is_nonzero (ctz (ty_32_or_64 ty) val))
          (CondResult.CC (x64_test ty val (RegMemImm.Imm 1)) (CC.Z)))
    (rule 3 (is_nonzero (ireduce _ (ctz (ty_32_or_64 ty) val)))
          (CondResult.CC (x64_test ty val (RegMemImm.Imm 1)) (CC.Z)))
    ;; clz(val) != 0 iff the sign bit of val is clear: `test val, val`, branch on SF clear.
    (rule 3 (is_nonzero (clz (ty_32_or_64 ty) val))
          (let ((gpr Gpr val))
            (CondResult.CC (x64_test ty gpr gpr) (CC.NS))))
    (rule 3 (is_nonzero (ireduce _ (clz (ty_32_or_64 ty) val)))
          (let ((gpr Gpr val))
            (CondResult.CC (x64_test ty gpr gpr) (CC.NS))))

The `ireduce` variant catches the wasm front-end's `i32.wrap_i64` over a 64-bit `ctz`/`clz`, a no-op on values in [0, bitwidth].

Test deltas (`tests/disas/ctz-clz-bool-condition.wat`)
| consumer | before | after |
|---|---|---|
| `if_ctz_bare_i32` | 5 insns (`bsfl` + `cmovel` + `test` + `jne`) | 2 (`testl $1, %edx; je`) |
| `if_ctz_bare_i64` | 5 insns (`bsfq` + `cmovq` + `test` + `jne`) | 2 (`testq $1, %rdx; je`) |
| `if_clz_bare_i32` | 7 insns (`bsr` + `cmov` + `sub` + `test` + `jne`) | 2 (`testl` + `jns`) |

The icmp-mediated cases (collapsed by #13332's egraph rules) are unchanged. The numeric-comparison negative test (`(ctz X) == 4`) stays untouched.

Motivation
Motoko's `moc` codegen emits `i64.ctz X; i32.wrap_i64; if` for compactness/sign tests in the EOP backend (see caffeinelabs/motoko#6103). Before this PR, that lowers to 5 native instructions per dispatch; after, 2.

A concrete idiomatic example: in Motoko, the `let-else` pattern over `Result`

    let #ok payload = queryProp(...) else return defaultValue;

desugars to a 2-arm refutable variant match (`#ok` vs `#err`). The variant-tag hashes are `hash("ok") = 0x611C` (LSB 0) and `hash("err") = 0x4D0765` (LSB 1), so the LSB alone distinguishes the two arms. The planned variant-switch `BitTest` dispatch (caffeinelabs/motoko's `gabor/variant-switch`) recognizes this and emits a single LSB-test for the dispatch; combined with this PR, the entire let-else lowers to `load hash; testq $1, ...; jcc` on x64: three instructions for a pattern match. Every `Result`-returning API and every `let-else`-style early return collapses to this shape.

Aggregated across hot paths (variant-switch dispatch, GC compact/heap discriminator, sign tests, …) this is meaningful.
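To illustrate why one bit test is enough for the `#ok`/`#err` case, here is a small Rust sketch using the two tag hashes quoted above. The `Arm` type and both dispatch functions are hypothetical names introduced for illustration. They only model the idea and are not what `moc` or the planned `BitTest` dispatch actually emits, and they assume the tag is known to be one of the two hashes.

```rust
/// Variant-tag hashes as quoted in the PR description.
pub const TAG_OK: u32 = 0x611C; // LSB 0
pub const TAG_ERR: u32 = 0x4D0765; // LSB 1

/// Hypothetical stand-in for the two arms of the refutable match.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm {
    Ok,
    Err,
}

/// Dispatch on the LSB only; valid only when `tag` is TAG_OK or TAG_ERR.
pub fn dispatch_by_lsb(tag: u32) -> Arm {
    if tag & 1 == 0 {
        Arm::Ok
    } else {
        Arm::Err
    }
}

/// The same decision phrased as "is the trailing-zero count non-zero?",
/// the condition shape that this PR's `is_nonzero` rules reduce to a bit test.
pub fn dispatch_by_ctz(tag: u32) -> Arm {
    if (tag as u64).trailing_zeros() != 0 {
        Arm::Ok
    } else {
        Arm::Err
    }
}
```

Both functions pick the same arm for the two tags, which is the equivalence that lets the count-based form be lowered as a single `test`.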
Follow-ups (not in this PR)
- aarch64, riscv64, s390x analogues — separate PRs once x64 reviewer feedback lands.
- `select`-consumer variant: `select` already routes through `is_nonzero_cmp` → `is_nonzero`, so this PR's rules cover it too without extra work.
Last updated: Jun 01 2026 at 09:49 UTC