wasmtime / PR #10691 Add initial `f16` and `f128` support... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #10691 Add initial `f16` and `f128` support...

Wasmtime GitHub notifications bot (Apr 28 2025 at 21:43):

beetrees opened PR #10691 from beetrees:f16-f128-s390x-mvp to bytecodealliance:main:

This PR adds initial support for passing `f16` and `f128` values around to the s390x backend. Support is added for the `load`, `store`, `bitcast`, `f16const` and `f128const` CLIF instructions.

Note that the s390x ABI specification currently does not specify the ABI for `f16`. However, Clang recently added support for `f16` in https://github.com/llvm/llvm-project/pull/109164 (as opposed to LLVM just supporting it at the LLVM IR level) using a straightforward extrapolation of the ABI (passing `f16` in floating point registers just like `f32` and `f64`), so on that basis I've not put the `f16` ABI behind the `enable_llvm_abi_extensions` setting.

f16/f128 issue: https://github.com/bytecodealliance/wasmtime/issues/8312

Wasmtime GitHub notifications bot (Apr 28 2025 at 21:43):

beetrees requested wasmtime-compiler-reviewers for a review on PR #10691.

Wasmtime GitHub notifications bot (Apr 28 2025 at 21:43):

beetrees requested cfallin for a review on PR #10691.

Wasmtime GitHub notifications bot (Apr 28 2025 at 22:19):

cfallin commented on PR #10691:

cc @uweigand -- would you mind reviewing the s390x backend changes here? (Thanks!)

Wasmtime GitHub notifications bot (Apr 28 2025 at 23:44):

github-actions[bot] commented on PR #10691:

Subscribe to Label Action

cc @cfallin, @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:machinst", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle

fitzgen: isle

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
</details>

Wasmtime GitHub notifications bot (Apr 29 2025 at 12:35):

uweigand commented on PR #10691:

Thanks for working on s390x! For f128, current hardware actually has two different ways of representing them - there's the "traditional" way of splitting an f128 value across a pair of 64-bit floating-point registers, but since z14 we can also hold f128 values in a single vector register, with a full set of arithmetic operations available.

This patch is based on the traditional FPR pair approach, but I would much prefer using single vector registers, for several reasons:

- z14 is the minimum architecture level for cranelift, so the VR-based instructions are always available.
- Handling a single VR is much simpler than handling a pair of FPRs.
- Specifically, while your patch hasn't run into this problem yet, once you actually want to use any of the traditional f128 arithmetic instructions, those only operate on architected pairs (N, N+2), not any two random FPRs. But the cranelift regalloc actually isn't capable of allocating pairs, so those would have to be hardcoded everywhere.

Note that the one place where we'd still require pairs (even when using VRs everywhere else) is for function arguments and returns, as the ABI was defined for older systems. But for the ABI we have to hard-code registers anyway, so that should be much less of an issue. (For our own non-system ABIs, we could even choose to pass f128 in VR as well.)

LLVM (and GCC) will also use VRs to hold f128 if you build with -march=z14 or higher.
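
The pairing constraint mentioned above can be sketched in Rust. This is only an illustration and not code from the PR: the function name `fpr_pair_partner` is hypothetical, and the rule that only FPRs 0, 1, 4, 5, 8, 9, 12 and 13 may start an (N, N+2) pair is an assumption based on the z/Architecture Principles of Operation rather than on anything stated in this thread.

```rust
/// Hypothetical helper: given the number of the first FPR of a candidate
/// pair, return the register that must hold the second half of a traditional
/// extended (f128) operand, or `None` if `first` cannot start a pair.
///
/// Assumption: valid first registers are 0, 1, 4, 5, 8, 9, 12 and 13,
/// i.e. register numbers below 16 whose bit 1 is clear.
pub fn fpr_pair_partner(first: u8) -> Option<u8> {
    if first < 16 && first & 0b10 == 0 {
        // The second half always lives two registers above the first.
        Some(first + 2)
    } else {
        None
    }
}

/// Check whether two independently chosen FPRs happen to form a valid pair.
pub fn is_architected_pair(hi: u8, lo: u8) -> bool {
    fpr_pair_partner(hi) == Some(lo)
}
```

Under these assumptions `fpr_pair_partner(4)` gives `Some(6)` while `fpr_pair_partner(2)` gives `None`, so a register allocator that picks the two halves independently would usually not produce a usable pair.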

Wasmtime GitHub notifications bot (Apr 29 2025 at 13:40):

beetrees updated PR #10691.

Wasmtime GitHub notifications bot (Apr 29 2025 at 13:45):

beetrees commented on PR #10691:

Done. f128 is now stored in a single vector register.

> Note that the one place where we'd still require pairs (even when using VRs everywhere else) is for function arguments and returns, as the ABI was defined for older systems. But for the ABI we have to hard-code registers anyway, so that should be much less of an issue. (For our own non-system ABIs, we could even choose to pass f128 in VR as well.)

I'm unsure what you mean by this; according to the s390x ABI specification f128 is always passed and returned indirectly.
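
The argument-passing rules discussed in this thread (f16 in floating-point registers like f32 and f64, f128 indirectly) can be summarized in a small Rust sketch. The names `FloatArgLoc` and `classify_float_arg` are hypothetical and do not correspond to Cranelift's ABI code; the sketch also ignores what happens once the available FPRs are exhausted.

```rust
/// Hypothetical description of where a floating-point argument goes.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum FloatArgLoc {
    /// Passed directly in a floating-point register.
    Fpr,
    /// Passed indirectly: the caller passes a pointer to the value.
    Indirect,
}

/// Classify a float argument by its width in bits, following the rules
/// stated in the thread. Returns `None` for widths that are not one of
/// the four float types discussed here.
pub fn classify_float_arg(bits: u16) -> Option<FloatArgLoc> {
    match bits {
        // f16 follows the same convention as f32 and f64
        // (the Clang extrapolation of the ABI mentioned in the PR text).
        16 | 32 | 64 => Some(FloatArgLoc::Fpr),
        // f128 is always passed and returned indirectly.
        128 => Some(FloatArgLoc::Indirect),
        _ => None,
    }
}
```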

Wasmtime GitHub notifications bot (Apr 29 2025 at 13:57):

uweigand commented on PR #10691:

> Done. f128 is now stored in a single vector register.

Thanks, I'll have a closer look shortly.

> I'm unsure what you mean by this; according to the s390x ABI specification f128 is always passed and returned indirectly.

Sorry, I must have gotten confused. You're correct, of course.

Wasmtime GitHub notifications bot (Apr 29 2025 at 16:21):

uweigand commented on PR #10691:

OK, this all LGTM now. Thanks again!

Wasmtime GitHub notifications bot (Apr 29 2025 at 16:30):

cfallin submitted PR review:

Thanks for the review, @uweigand! Giving this an approval based on that, and merging.

Wasmtime GitHub notifications bot (Apr 29 2025 at 16:31):

cfallin submitted PR review:

Thanks for the review, @uweigand! Giving this an approval based on that, and merging.

Wasmtime GitHub notifications bot (Apr 29 2025 at 16:50):

cfallin commented on PR #10691:

@alexcrichton it looks like the merge queue checks failed on a diff of a C++ header -- IIRC you had worked on this recently?

Wasmtime GitHub notifications bot (Apr 29 2025 at 17:47):

alexcrichton commented on PR #10691:

I think that was a failure to download wasm.hh and it manifested as a weird error, so I'm going to re-queue, but I could very well be wrong...

Wasmtime GitHub notifications bot (Apr 29 2025 at 18:09):

alexcrichton merged PR #10691.

Wasmtime GitHub notifications bot (Jun 24 2025 at 10:15):

tgross35 commented on PR #10691:

For posterity, the most recently published principles of operation doc now specifies f16 https://www.ibm.com/docs/en/module_1678991624569/pdf/SA22-7832-14.pdf

Wasmtime GitHub notifications bot (Jun 24 2025 at 11:04):

uweigand commented on PR #10691:

> For posterity, the most recently published principles of operation doc now specifies f16 https://www.ibm.com/docs/en/module_1678991624569/pdf/SA22-7832-14.pdf

To be clear, this was already in the previous (z16) version. Both for z16 and z17, the "BFP tiny" (f16) format is only used in a few special operations involving the AI Unit (NNPA and related instructions); there is no general arithmetic (or even conversion) support for this data type.
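
Since the comment above says the hardware offers no general conversion for this type, widening an f16 would have to be done in software. The following Rust sketch shows one way to do that, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). That "BFP tiny" uses exactly this layout is an assumption here, and the function name `f16_bits_to_f32` is hypothetical, not something from the PR.

```rust
/// Hypothetical helper: widen an IEEE 754 binary16 bit pattern to `f32`.
/// The conversion is exact, since every binary16 value is representable
/// as an `f32`.
pub fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = ((bits >> 15) & 0x1) as u32;
    let exp = ((bits >> 10) & 0x1f) as u32;
    let frac = (bits & 0x3ff) as u32;

    match exp {
        // Zero and subnormals: the value is frac * 2^-24, which an f32
        // can represent exactly.
        0 => {
            let magnitude = frac as f32 / 16_777_216.0;
            if sign == 1 { -magnitude } else { magnitude }
        }
        // Infinity and NaN: all-ones exponent, fraction bits shifted up
        // so that the NaN payload is kept.
        0x1f => f32::from_bits((sign << 31) | 0x7f80_0000 | (frac << 13)),
        // Normal numbers: rebias the exponent (127 - 15 = 112) and widen
        // the fraction from 10 to 23 bits.
        _ => f32::from_bits((sign << 31) | ((exp + 112) << 23) | (frac << 13)),
    }
}
```

For instance, the bit pattern `0x3C00` widens to `1.0` and `0x7BFF` (the largest finite binary16 value) widens to `65504.0`.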

Last updated: Feb 24 2026 at 05:28 UTC