alexcrichton requested afonso360 for a review on PR #7327.
alexcrichton opened PR #7327 from alexcrichton:rv64-frm
to bytecodealliance:main
:
This commit started out with the goal of deleting the
FcvtToInt
macro instruction by moving it into ISLE but it ended up having some surrounding refactors as well. The original goal is achieved and all float-to-int conversion now happens in ISLE. Additionally handling of the FRM (float rounding mode) state of each instruction is now explicit which means that Cranelift code emission now defaults toFRM::RNE
(round-to-nearest ties-to-evens) rather than the runtime state in thefcsr
. This was a theoretical bug with the previous code where if the register didn't haveRNE
in it then the code wouldn't perform correctly.
alexcrichton requested abrown for a review on PR #7327.
alexcrichton requested wasmtime-compiler-reviewers for a review on PR #7327.
afonso360 submitted PR review:
Neat! I like explicitly including FRM in the instructions instead of relying on FCSR.
This also fixes #5992 and #5993, could you add those test cases as regression tests and enable those instruction in fuzzgen here?
I've been fuzzing this with those instruction enabled since yesterday and it hasn't found anything.
afonso360 submitted PR review:
Neat! I like explicitly including FRM in the instructions instead of relying on FCSR.
This also fixes #5992 and #5993, could you add those test cases as regression tests and enable those instruction in fuzzgen here?
I've been fuzzing this with those instruction enabled since yesterday and it hasn't found anything.
afonso360 created PR review comment:
I went looking for an answer to this, the psABI document states:
The Floating-Point Control and Status Register (fcsr) must have thread storage duration in accordance with C11 section 7.6 "Floating-point environment <fenv.h>".
Which doesn't mean a whole lot to me, but is clarified in the PR that introduced that sentence to mean that OS's are responsible for preserving it per thread, and otherwise programs are allowed to do pretty much whatever they feel like.
I couldn't find any information as to whether using
fcsr
is better than encoding the rounding mode in the instruction, but I personally like including the rounding mode explicitly it in the instruction, as long as it doesn't break anything.
afonso360 created PR review comment:
I found this slightly hard to read, do you think something like this would be more readable? This is purely personal preference so feel free to ignore it.
The comment regarding
fcvt
would be helpful though since it took me a while to figure that out.match (float, int) { // Saturating cases for larger integers are handled using the `fcvt.{w,d}.{s,d}` instruction directly, that // automatically saturates up/down to the correct limit. (I8, F32) if saturating => f32::from(i8::MIN).to_bits().into(), (I8, F64) if saturating => f64::from(i8::MIN).to_bits(), (I16, F32) if saturating => f32::from(i16::MIN).to_bits().into(), (I16, F64) if saturating => f64::from(i16::MIN).to_bits(), (_, F32) if !saturating => f32_cvt_to_int_bounds(true, int.bits()).0.to_bits().into(), (_, F64) if !saturating => f64_cvt_to_int_bounds(true, int.bits()).0.to_bits(), _ => unimplemented!(), }
afonso360 edited PR review comment.
alexcrichton updated PR #7327.
alexcrichton updated PR #7327.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
No I struggled trying to make this readable, and I like what you have, so replacing!
alexcrichton updated PR #7327.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Oh nice, thanks! Ok so sounds like
fcsr
isn't something that needs to be preserved across a function call so it could be used here.To clarify though I wasn't thinking of using
fcsr
for rounding modes here but rather instead for overflow detection. Basically using the bits described by this in the spec:The accrued exception flags indicate the exception conditions that have arisen on any floating-point arithmetic instruction since the field was last reset by software, as shown in Table 11.2. The base RISC-V ISA does not support generating a trap on the setting of a floating-point exception flag.
In theory just before this float-to-int conversion it might be possible to set the
fcsr
exception flags to zero, perform the conversion, and then maybe trap if the exception flags are nonzero. I haven't found good docs yet as to how precisely the flags are set on thefcvt.*
instructions so I was holding off on possibly trying this until the future. (and that LLVM doesn't seem to use this and I presume LLVM folks have thought more about perf than I have)
alexcrichton updated PR #7327.
alexcrichton edited PR #7327:
This commit started out with the goal of deleting the
FcvtToInt
macro instruction by moving it into ISLE but it ended up having some surrounding refactors as well. The original goal is achieved and all float-to-int conversion now happens in ISLE. Additionally handling of the FRM (float rounding mode) state of each instruction is now explicit which means that Cranelift code emission now defaults toFRM::RNE
(round-to-nearest ties-to-evens) rather than the runtime state in thefcsr
. This was a theoretical bug with the previous code where if the register didn't haveRNE
in it then the code wouldn't perform correctly.Closes #6015
afonso360 submitted PR review.
afonso360 created PR review comment:
Oh yeah, that's a neat idea. I have no idea if that is better or worse than manually checking, but if we could get the same semantics, it would at least be a much shorter sequence. I'd like to benchmark that, but I only have an in-order core to test and I'm not sure if that would bias the results against larger machines.
afonso360 merged PR #7327.
Last updated: Dec 23 2024 at 12:05 UTC