beetrees opened PR #10652 from beetrees:f16-f128-riscv64-mvp to bytecodealliance:main:
On the riscv64 backend, this PR adds initial support for using `f16` without the `Zfh` extension being enabled, as well as initial support for `f128`. Support is added for the `load`, `store`, `bitcast`, `select`, `f16const`, `f128const`, `bnot`, `band`, `bor` and `bxor` CLIF instructions, as well as adding the `zfhmin` and `zvfh` target features (alongside the pre-existing `zfh` target feature).
cc @afonso360 who previously added initial `f16` support in #9135
`f16`/`f128` issue: #8312
beetrees requested cfallin for a review on PR #10652.
beetrees requested wasmtime-compiler-reviewers for a review on PR #10652.
beetrees requested wasmtime-default-reviewers for a review on PR #10652.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
"Zfhmin: Minimal Half-Precision Floating-Point",
bjorn3 submitted PR review.
bjorn3 created PR review comment:
"Zvfh: Vector Extension for Half-Precision Floating-Point",
beetrees updated PR #10652.
beetrees submitted PR review.
beetrees created PR review comment:
Done
beetrees submitted PR review.
beetrees created PR review comment:
Done
github-actions[bot] commented on PR #10652:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle"
Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
cfallin submitted PR review:
A few nits but overall this looks plausible to me -- thanks for working through the tedious details!
cfallin created PR review comment:
Could we add a comment here describing why we're OR'ing in all ones in the high bits? I'm not sure if I understand why myself -- does this allow more immediates to fit into better/smaller codegen patterns for `imm` perhaps?
cfallin created PR review comment:
Likewise here as above -- where does the constant come from?
cfallin created PR review comment:
Is there any particular reason that we remove `aarch64 has_fp16` from the list of targets tested by this runtest and those below? Can we keep that while adding `riscv64` below it?
beetrees updated PR #10652.
beetrees submitted PR review.
beetrees created PR review comment:
I've added a comment. RISC-V stores smaller floats inside larger floating-point registers using NaN-boxing, meaning that the smaller float will be stored in the lower bits and all the other bits will be set to 1 (see section 21.2 of the RISC-V ISA manual for details). The correctly-sized `fmv` instructions do this automatically, but when using a 32-bit `fmv` for a 16-bit float the NaN-boxing has to be done manually.
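To illustrate the NaN-boxing described above, here is a minimal Rust sketch of the bit manipulation involved. It is not the code from this PR: the helper names `nan_box_f16_in_32` and `unbox_f16_from_32` are hypothetical, and the sketch only models the register contents as plain integers, assuming a 16-bit float held in a 32-bit container.

```rust
/// Hypothetical helper (not part of Cranelift): NaN-box a 16-bit float bit
/// pattern into a 32-bit container by setting all upper 16 bits to 1.
fn nan_box_f16_in_32(bits: u16) -> u32 {
    // The f16 payload stays in the low 16 bits; the high 16 bits become all ones.
    0xFFFF_0000 | u32::from(bits)
}

/// Hypothetical helper: recover the f16 bit pattern from a 32-bit container,
/// returning `None` if the upper bits are not all ones (i.e. not NaN-boxed).
fn unbox_f16_from_32(boxed: u32) -> Option<u16> {
    if boxed >> 16 == 0xFFFF {
        // Truncation keeps exactly the low 16 bits, which hold the payload.
        Some(boxed as u16)
    } else {
        None
    }
}
```

For example, `nan_box_f16_in_32(0x3C00)` (the binary16 encoding of 1.0) yields `0xFFFF_3C00`. Reinterpreted as a 32-bit float, that pattern is a NaN, which is where the name comes from.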
beetrees updated PR #10652.
beetrees submitted PR review.
beetrees created PR review comment:
This was a leftover from when `bitcast-f16-f128.clif` was split into `f16-bitcast.clif` and `f128-bitcast.clif` in #9135. The `aarch64` directives were just moved to `f128-bitcast.clif`, with `f16-bitcast.clif` only being tested on x86_64 and riscv64. This PR adds `aarch64` (both with and without `has_fp16`) to `f16-bitcast.clif` and removes the `has_fp16` case from `f128-bitcast.clif` as `has_fp16` is only relevant to testing `f16`, not `f128`.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
Does this rule only apply when 16-bit float instructions are not supported? If so, I think we could get away with only NaN-boxing on the ABI boundary, leaving the high bits undefined within the function. I don't know if that will be faster though.
beetrees submitted PR review.
beetrees created PR review comment:
When the `Zfhmin` extension is not enabled (which is the only case where NaN-boxing is done manually), the only place NaN-boxing is required is at the ABI boundary. However, I doubt it would be beneficial to move NaN-boxing from bitcasts to the ABI boundary as I expect that most uses of `f16` are going to be calls to other functions (such as `__extendhfsf2` from `compiler_builtins`). It would be faster to NaN-box the value at most once (the value will already be NaN-boxed if it is an argument to the function or return value from another function) rather than every time it is used.
beetrees edited PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, right, of course -- saw the seemingly unrelated change but that cleanup makes sense, thanks.
cfallin submitted PR review:
Thanks, LGTM!
cfallin merged PR #10652.
Last updated: Dec 06 2025 at 07:03 UTC