Stream: git-wasmtime

Topic: wasmtime / issue #4462 Cranelift: Missing SIMD `fma` lowe...


Wasmtime GitHub notifications bot (Jul 17 2022 at 13:29):

afonso360 opened issue #4462:

:wave: Hey,

It looks like we don't have any lowering for the fma instruction when used with SIMD types.

.clif Test Case

function %fma_f32x4(f32x4, f32x4, f32x4) -> f32x4 {
block0(v0: f32x4, v1: f32x4, v2: f32x4):
    v3 = fma v0, v1, v2
    return v3
}
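For reference, the result this test case is expected to produce can be modeled lane by lane. The following is only a minimal sketch, assuming `fma` computes `x * y + z` per lane with a single rounding; the function name `fma_f32x4_reference` and the `[f32; 4]` representation are hypothetical and not part of Cranelift.

```rust
/// Lane-wise reference model for `fma` on an f32x4 value.
/// The vector is modeled as `[f32; 4]` (an assumption for illustration).
/// `f32::mul_add` computes `x * y + z` with a single rounding step.
fn fma_f32x4_reference(x: [f32; 4], y: [f32; 4], z: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for lane in 0..4 {
        out[lane] = x[lane].mul_add(y[lane], z[lane]);
    }
    out
}
```

A separate multiply followed by an add rounds twice, so it can differ from this model in the last bit; a lowering has to preserve the fused behavior.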

Steps to Reproduce

cargo run -- compile --target x86_64 ./the-above.clif

Expected Results

A successful compilation.

Actual Results

We don't have this implemented

thread 'main' panicked at 'internal error: entered unreachable code: implemented in ISLE: inst = `v3 = fma.f32x4 v0, v1, v2`, type = `Some(types::F32X4)`', cranelift\codegen\src\isa\x64\lower.rs:808:9

Versions and Environment

Cranelift version or commit: main
Operating system: Windows
Architecture: x86_64

Extra Info

In #4460 I tried to implement this using the vfmadd231ps instruction, but I had issues implementing emit for it and I'd like some help.

It looks like that instruction is only available in VEX encoding (or EVEX for avx512 but I don't have a machine with that). Is our EvexInstruction encoder suitable for emitting VEX instructions, or are they completely different? Do we have a way to emit VEX instructions?
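To make the encoding question concrete, here is a rough sketch of the register-to-register form of `vfmadd231ps` (`VEX.128.66.0F38.W0 B8 /r`) with a three-byte VEX prefix. VEX prefixes start with `0xC4` or `0xC5`, while EVEX uses a four-byte prefix starting with `0x62`, so the two layouts are not interchangeable. The helper name `encode_vfmadd231ps_rr` is hypothetical and this is not Cranelift's encoder API; memory operands and the 256-bit form are left out.

```rust
/// Encode `vfmadd231ps dst_acc, mul1, mul2` for XMM registers 0..=15.
/// Semantics: dst_acc = mul1 * mul2 + dst_acc.
/// Hypothetical helper for illustration only, not part of Cranelift.
fn encode_vfmadd231ps_rr(dst_acc: u8, mul1: u8, mul2: u8) -> [u8; 5] {
    assert!(dst_acc < 16 && mul1 < 16 && mul2 < 16);
    // Byte 1: inverted R, X, B extension bits, then the opcode map
    // selector (0b00010 selects the 0F 38 map).
    let r: u8 = !(dst_acc >> 3) & 1;
    let x: u8 = 1; // no index register in the register-register form
    let b: u8 = !(mul2 >> 3) & 1;
    let byte1 = (r << 7) | (x << 6) | (b << 5) | 0b00010;
    // Byte 2: W = 0, inverted vvvv (first multiplicand), L = 0 (128-bit),
    // pp = 0b01 (implied 0x66 prefix).
    let vvvv: u8 = !mul1 & 0xF;
    let byte2 = (vvvv << 3) | 0b01;
    // ModRM: mod = 0b11 (register direct), reg = dst_acc, rm = mul2.
    let modrm = 0xC0 | ((dst_acc & 7) << 3) | (mul2 & 7);
    [0xC4, byte1, byte2, 0xB8, modrm]
}
```

With these assumptions, `encode_vfmadd231ps_rr(0, 1, 2)` yields the bytes `C4 E2 71 B8 C2`, which can be checked against an assembler's output for `vfmadd231ps xmm0, xmm1, xmm2`.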

I'm not too familiar with x86 but I'd like to pick this up with some help.

cc: @abrown

Wasmtime GitHub notifications bot (Jul 17 2022 at 13:29):

afonso360 labeled issue #4462.

Wasmtime GitHub notifications bot (Jul 17 2022 at 13:29):

afonso360 labeled issue #4462.

Wasmtime GitHub notifications bot (Jul 17 2022 at 13:40):

afonso360 edited issue #4462:

In #4460 I tried to implement this using the vfmadd231ps instruction, but I had issues encoding the instruction and I'd like some help.

Wasmtime GitHub notifications bot (Jul 18 2022 at 17:08):

abrown commented on issue #4462:

Hm, so here are some thoughts:

Hopefully that information helps. Ping me on Zulip if you want to have a more "live" discussion.

Wasmtime GitHub notifications bot (Jul 25 2022 at 22:01):

cfallin closed issue #4462.


Last updated: Jan 24 2025 at 00:11 UTC