Stream: git-wasmtime

Topic: wasmtime / issue #12368 Cranelift: aarch64: user-controll...


Wasmtime GitHub notifications bot (Jan 18 2026 at 03:44):

mmcloughlin opened issue #12368:

The following lower_fmla rules enable user-controlled recursion.

https://github.com/bytecodealliance/wasmtime/blob/v40.0.2/cranelift/codegen/src/isa/aarch64/lower.isle#L606-L615

These rules will peel away an arbitrary number of fneg arguments to an fma
instruction, alternating between a fused-multiply-add and
fused-multiply-subtract operation.

.clif Test Case

Generate a CLIF function with a stack of fneg arguments using a script like
fneg.py:

import sys


def generate_fneg_fma_rec(count):
    print("function %fma_fneg_rec(f32x4, f32x4, f32x4) -> f32x4 {")
    print("block0(v1: f32x4, v2: f32x4, v3: f32x4):")

    n = 4
    for _ in range(count):
        print(f"    v{n} = fneg v{n - 2}")
        print(f"    v{n + 1} = fneg v{n - 1}")
        n += 2

    print(f"    v{n} = fma v{n - 2}, v{n - 1}, v1")
    print(f"    return v{n}")
    print("}")


def main():
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    generate_fneg_fma_rec(count)


if __name__ == "__main__":
    main()

Steps to Reproduce

Generate a large instance and compile with clif-util (at v40.0.2):

$ python3 fneg.py 100000 >fneg100000.clif
$ cargo run --bin clif-util -- compile --target aarch64 -p --disasm fneg100000.clif
...

Expected Results

Expect function to compile and execute successfully. Ideally, it would be
optimized to a single fma.

Actual Results

For a sufficiently large instance, the rule recursion overflows the stack:

thread 'main' (2897042) has overflowed its stack
fatal runtime error: stack overflow, aborting
fish: Job 1, 'cargo run --bin clif-util -- co' terminated by signal SIGABRT (Abort)

Versions and Environment

Cranelift version or commit: v40.0.2

Operating system: macOS

Architecture: AArch64

Extra Info

Related #12333

Wasmtime GitHub notifications bot (Jan 18 2026 at 03:44):

mmcloughlin added the bug label to Issue #12368.

Wasmtime GitHub notifications bot (Jan 18 2026 at 03:44):

mmcloughlin added the cranelift label to Issue #12368.

Wasmtime GitHub notifications bot (Jan 18 2026 at 03:45):

mmcloughlin edited issue #12368.

Wasmtime GitHub notifications bot (Jan 18 2026 at 03:45):

mmcloughlin edited issue #12368:

The following lower_fmla rules enable user-controlled recursion.

;; Special case: if one of the multiplicands is `fneg` then peel that away,
;; reverse the operation being performed, and then recurse on `lower_fmla`
;; again to generate the actual instruction.
;;
;; Note that these are the highest priority cases for `lower_fmla` to peel
;; away as many `fneg` operations as possible.
(rule 5 (lower_fmla op (fneg x) y z size)
        (lower_fmla (neg_fmla op) x y z size))
(rule 6 (lower_fmla op x (fneg y) z size)
        (lower_fmla (neg_fmla op) x y z size))

https://github.com/bytecodealliance/wasmtime/blob/v40.0.2/cranelift/codegen/src/isa/aarch64/lower.isle#L606-L615

These rules will peel away an arbitrary number of fneg arguments to an fma
instruction, alternating between a fused-multiply-add and
fused-multiply-subtract operation.
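
To make the recursion concrete, here is a minimal Python model of rules 5 and 6. It is only a sketch: the Fneg class, the "fmla"/"fmls" strings, and the function names are illustrative stand-ins, not Cranelift's actual types or generated code. It assumes each rule match costs one host stack frame, which is what the reported overflow suggests. An iterative variant that flips the operation in a loop reaches the same result with constant stack depth.

```python
import sys


class Fneg:
    """Hypothetical stand-in for a CLIF `fneg` node wrapping one operand."""
    __slots__ = ("arg",)

    def __init__(self, arg):
        self.arg = arg


def neg_fmla(op):
    # Flip between fused multiply-add and fused multiply-subtract.
    return "fmls" if op == "fmla" else "fmla"


def lower_recursive(op, x, y, z):
    # Models rules 6 and 5: peel one fneg, flip the op, recurse.
    # Each peeled fneg costs one stack frame. The order in which the two
    # multiplicands are checked does not change the result.
    if isinstance(y, Fneg):
        return lower_recursive(neg_fmla(op), x, y.arg, z)
    if isinstance(x, Fneg):
        return lower_recursive(neg_fmla(op), x.arg, y, z)
    return (op, x, y, z)


def lower_iterative(op, x, y, z):
    # Same peeling, done in loops so the stack depth stays constant.
    while isinstance(x, Fneg):
        op, x = neg_fmla(op), x.arg
    while isinstance(y, Fneg):
        op, y = neg_fmla(op), y.arg
    return (op, x, y, z)


def chain(leaf, depth):
    # Wrap `leaf` in `depth` nested fneg nodes.
    value = leaf
    for _ in range(depth):
        value = Fneg(value)
    return value


deep = sys.getrecursionlimit() * 2
x, y = chain("a", deep), chain("b", deep)

print(lower_iterative("fmla", x, y, "c"))  # ('fmla', 'a', 'b', 'c')
try:
    lower_recursive("fmla", x, y, "c")
except RecursionError:
    print("recursive peeling exceeded the interpreter's recursion limit")
```

Python raises a catchable RecursionError where the native compiler aborts, but the cause is the same: the recursion depth is set by the input, not by the rules.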

.clif Test Case

Generate a CLIF function with a stack of fneg arguments using a script like
fneg.py:

import sys


def generate_fneg_fma_rec(count):
    print("function %fma_fneg_rec(f32x4, f32x4, f32x4) -> f32x4 {")
    print("block0(v1: f32x4, v2: f32x4, v3: f32x4):")

    n = 4
    for _ in range(count):
        print(f"    v{n} = fneg v{n - 2}")
        print(f"    v{n + 1} = fneg v{n - 1}")
        n += 2

    print(f"    v{n} = fma v{n - 2}, v{n - 1}, v1")
    print(f"    return v{n}")
    print("}")


def main():
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    generate_fneg_fma_rec(count)


if __name__ == "__main__":
    main()
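
As a sanity check on the shape of the generated input, the sketch below rebuilds the same CLIF text as a list of lines and counts how many fneg instructions feed each fma multiplicand. The helper names fneg_fma_clif and fneg_depths are made up for this illustration, and the regex parsing only handles the exact lines this generator emits.

```python
import re


def fneg_fma_clif(count):
    """Return the same CLIF text that fneg.py prints, as a list of lines."""
    lines = [
        "function %fma_fneg_rec(f32x4, f32x4, f32x4) -> f32x4 {",
        "block0(v1: f32x4, v2: f32x4, v3: f32x4):",
    ]
    n = 4
    for _ in range(count):
        lines.append(f"    v{n} = fneg v{n - 2}")
        lines.append(f"    v{n + 1} = fneg v{n - 1}")
        n += 2
    lines.append(f"    v{n} = fma v{n - 2}, v{n - 1}, v1")
    lines.append(f"    return v{n}")
    lines.append("}")
    return lines


def fneg_depths(lines):
    """For each fma multiplicand, return (root value, number of fnegs above it)."""
    producer = {}  # value defined by an fneg -> the operand it negates
    fma_args = None
    for line in lines:
        m = re.fullmatch(r"\s*(v\d+) = fneg (v\d+)", line)
        if m:
            producer[m.group(1)] = m.group(2)
            continue
        m = re.fullmatch(r"\s*v\d+ = fma (v\d+), (v\d+), (v\d+)", line)
        if m:
            fma_args = m.groups()
    depths = []
    for value in fma_args[:2]:
        depth = 0
        while value in producer:
            value = producer[value]
            depth += 1
        depths.append((value, depth))
    return depths


print("\n".join(fneg_fma_clif(1)))
print(fneg_depths(fneg_fma_clif(1000)))  # [('v2', 1000), ('v3', 1000)]
```

Both multiplicands sit under a chain of `count` fneg instructions, so the two peeling rules together can match 2 * count times for a single fma.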

Steps to Reproduce

Generate a large instance:

python3 fneg.py 100000 >fneg100000.clif

Compile with clif-util (at v40.0.2):

cargo run --bin clif-util -- compile --target aarch64 -p --disasm fneg100000.clif

Expected Results

Expect function to compile and execute successfully. Ideally, it would be
optimized to a single fma.
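
The reason a single fma should suffice for this particular test case: both multiplicands carry the same number of negations, so the signs cancel in the product whatever the count. A small numeric sketch, using a plain multiply-add rather than a true fused operation (an assumption made to keep the example portable; rounding differs but the sign behaviour does not), with exactly representable inputs:

```python
def apply_fnegs(value, count):
    # Negate `value` `count` times, like a chain of fneg instructions.
    for _ in range(count):
        value = -value
    return value


def script_result(count, v1, v2, v3):
    # What the generated function computes: fma(fneg^count(v2), fneg^count(v3), v1).
    return apply_fnegs(v2, count) * apply_fnegs(v3, count) + v1


baseline = 1.5 * -2.25 + 0.5  # the same computation with no fneg at all

for count in range(6):
    assert script_result(count, 0.5, 1.5, -2.25) == baseline

print(baseline)  # -2.875
```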

Actual Results

For a sufficiently large instance, the rule recursion overflows the stack:

thread 'main' (2897042) has overflowed its stack
fatal runtime error: stack overflow, aborting
fish: Job 1, 'cargo run --bin clif-util -- co' terminated by signal SIGABRT (Abort)

Versions and Environment

Cranelift version or commit: v40.0.2

Operating system: macOS

Architecture: AArch64

Extra Info

Related #12333

Wasmtime GitHub notifications bot (Jan 20 2026 at 20:02):

alexcrichton closed issue #12368.


Last updated: Jan 29 2026 at 13:25 UTC