Stream: git-wasmtime

Topic: wasmtime / PR #7327 riscv64: Refactor FRM and fcvt-to-int...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 22 2023 at 20:49):

alexcrichton requested afonso360 for a review on PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 22 2023 at 20:49):

alexcrichton opened PR #7327 from alexcrichton:rv64-frm to bytecodealliance:main:

This commit started out with the goal of deleting the FcvtToInt macro instruction by moving it into ISLE but it ended up having some surrounding refactors as well. The original goal is achieved and all float-to-int conversion now happens in ISLE. Additionally handling of the FRM (float rounding mode) state of each instruction is now explicit which means that Cranelift code emission now defaults to FRM::RNE (round-to-nearest ties-to-evens) rather than the runtime state in the fcsr. This was a theoretical bug with the previous code where if the register didn't have RNE in it then the code wouldn't perform correctly.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 22 2023 at 20:49):

alexcrichton requested abrown for a review on PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 22 2023 at 20:49):

alexcrichton requested wasmtime-compiler-reviewers for a review on PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 12:34):

afonso360 submitted PR review:

Neat! I like explicitly including FRM in the instructions instead of relying on FCSR.

This also fixes #5992 and #5993, could you add those test cases as regression tests and enable those instruction in fuzzgen here?

I've been fuzzing this with those instruction enabled since yesterday and it hasn't found anything.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 12:34):

afonso360 submitted PR review:

Neat! I like explicitly including FRM in the instructions instead of relying on FCSR.

This also fixes #5992 and #5993, could you add those test cases as regression tests and enable those instruction in fuzzgen here?

I've been fuzzing this with those instruction enabled since yesterday and it hasn't found anything.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 12:34):

afonso360 created PR review comment:

I went looking for an answer to this, the psABI document states:

The Floating-Point Control and Status Register (fcsr) must have thread storage duration in accordance with C11 section 7.6 "Floating-point environment <fenv.h>".

Which doesn't mean a whole lot to me, but is clarified in the PR that introduced that sentence to mean that OS's are responsible for preserving it per thread, and otherwise programs are allowed to do pretty much whatever they feel like.


I couldn't find any information as to whether using fcsr is better than encoding the rounding mode in the instruction, but I personally like including the rounding mode explicitly it in the instruction, as long as it doesn't break anything.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 12:34):

afonso360 created PR review comment:

I found this slightly hard to read, do you think something like this would be more readable? This is purely personal preference so feel free to ignore it.

The comment regarding fcvt would be helpful though since it took me a while to figure that out.

match (float, int) {
    // Saturating cases for larger integers are handled using the `fcvt.{w,d}.{s,d}` instruction directly, that
    // automatically saturates up/down to the correct limit.
    (I8, F32) if saturating => f32::from(i8::MIN).to_bits().into(),
    (I8, F64) if saturating => f64::from(i8::MIN).to_bits(),
    (I16, F32) if saturating => f32::from(i16::MIN).to_bits().into(),
    (I16, F64) if saturating => f64::from(i16::MIN).to_bits(),

    (_, F32) if !saturating => f32_cvt_to_int_bounds(true, int.bits()).0.to_bits().into(),
    (_, F64) if !saturating => f64_cvt_to_int_bounds(true, int.bits()).0.to_bits(),
    _ => unimplemented!(),
}

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 12:38):

afonso360 edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:29):

alexcrichton updated PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:42):

alexcrichton updated PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:46):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:46):

alexcrichton created PR review comment:

No I struggled trying to make this readable, and I like what you have, so replacing!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:46):

alexcrichton updated PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:49):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 14:49):

alexcrichton created PR review comment:

Oh nice, thanks! Ok so sounds like fcsr isn't something that needs to be preserved across a function call so it could be used here.

To clarify though I wasn't thinking of using fcsr for rounding modes here but rather instead for overflow detection. Basically using the bits described by this in the spec:

The accrued exception flags indicate the exception conditions that have arisen on any floating-point arithmetic instruction since the field was last reset by software, as shown in Table 11.2. The base RISC-V ISA does not support generating a trap on the setting of a floating-point exception flag.

In theory just before this float-to-int conversion it might be possible to set the fcsr exception flags to zero, perform the conversion, and then maybe trap if the exception flags are nonzero. I haven't found good docs yet as to how precisely the flags are set on the fcvt.* instructions so I was holding off on possibly trying this until the future. (and that LLVM doesn't seem to use this and I presume LLVM folks have thought more about perf than I have)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 15:16):

alexcrichton updated PR #7327.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 15:17):

alexcrichton edited PR #7327:

This commit started out with the goal of deleting the FcvtToInt macro instruction by moving it into ISLE but it ended up having some surrounding refactors as well. The original goal is achieved and all float-to-int conversion now happens in ISLE. Additionally handling of the FRM (float rounding mode) state of each instruction is now explicit which means that Cranelift code emission now defaults to FRM::RNE (round-to-nearest ties-to-evens) rather than the runtime state in the fcsr. This was a theoretical bug with the previous code where if the register didn't have RNE in it then the code wouldn't perform correctly.

Closes #6015

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 20:18):

afonso360 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 20:18):

afonso360 created PR review comment:

Oh yeah, that's a neat idea. I have no idea if that is better or worse than manually checking, but if we could get the same semantics, it would at least be a much shorter sequence. I'd like to benchmark that, but I only have an in-order core to test and I'm not sure if that would bias the results against larger machines.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2023 at 21:11):

afonso360 merged PR #7327.


Last updated: Nov 22 2024 at 17:03 UTC