primoly opened issue #8967:
NaNs in IEEE floats are encoded just like infinities, except that at least one of the mantissa bits is set to one. Wasm NaNs in results of float ops (_except_ `fabs`, `fneg` and `fcopysign`) have the additional requirement that the most significant bit of the mantissa must be one (arithmetic NaN). Is this actually honoured by Cranelift codegen? If NaNs are always canonicalised then this is guaranteed (as the canonical NaN is also an arithmetic one), but if NaNs are simply propagated then this could be an issue.
https://webassembly.github.io/spec/core/exec/numerics.html#nan-propagation
https://webassembly.github.io/spec/core/syntax/values.html#syntax-float
https://webassembly.github.io/spec/core/syntax/values.html#arithmetic-nan
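For illustration (not part of the original report): a minimal Rust sketch of the bit-level distinction being asked about, assuming the standard IEEE-754 single-precision layout. An "arithmetic" NaN in the Wasm sense is any NaN whose most significant significand bit is set; the canonical NaN `0x7fc00000` is the arithmetic NaN with all other significand bits clear.

```rust
// Illustrative helpers (not from Cranelift): classify an f32 bit pattern
// using the Wasm spec's terminology.

/// True if `bits` encodes a NaN: exponent all ones and a non-zero significand.
fn is_nan_f32(bits: u32) -> bool {
    (bits & 0x7f80_0000) == 0x7f80_0000 && (bits & 0x007f_ffff) != 0
}

/// True if `bits` encodes an arithmetic NaN: a NaN whose most significant
/// significand bit (bit 22 for f32) is set.
fn is_arithmetic_nan_f32(bits: u32) -> bool {
    is_nan_f32(bits) && (bits & 0x0040_0000) != 0
}

fn main() {
    assert!(is_arithmetic_nan_f32(0x7fc0_0000)); // canonical NaN
    assert!(is_arithmetic_nan_f32(f32::NAN.to_bits())); // Rust's NAN constant is a quiet NaN, hence arithmetic
    assert!(!is_arithmetic_nan_f32(0x7f80_0001)); // NaN with the MSB of the significand clear
    assert!(!is_arithmetic_nan_f32(0x7f80_0000)); // +infinity, not a NaN
}
```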
primoly edited issue #8967:
NaNs in IEEE floats are encoded just like infinities, except that at least one of the mantissa bits is set to one. Wasm NaNs in results of float ops (_except_ `fabs`, `fneg` and `fcopysign`) have the additional requirement that the most significant bit of the mantissa must be one (arithmetic NaN). Is this actually honoured by Cranelift codegen? If NaNs are always canonicalised then this is guaranteed (as the canonical NaN is also an arithmetic one), but if NaNs are simply propagated then this could be an issue.
https://webassembly.github.io/spec/core/exec/numerics.html#nan-propagation
https://webassembly.github.io/spec/core/syntax/values.html#syntax-float
https://webassembly.github.io/spec/core/syntax/values.html#arithmetic-nan
alexcrichton commented on issue #8967:
I believe that in theory all Cranelift operations/backends/etc. do what wasm wants. That being said, I also wouldn't preclude the existence of a bug somewhere. I haven't read up on the context of #8954, but is there a specific case you're wondering about?
cfallin commented on issue #8967:
I think this is a question of auditing the backends to ensure that each ISA generates the appropriate kind of NaN. I'm not familiar enough with the FP subsets of our supported ISAs to say for sure; @abrown, @afonso360, @uweigand, others, thoughts on x64, aarch64, riscv64, s390x? Do they always generate arithmetic NaNs?
cfallin commented on issue #8967:
(As an additional note, I'd be surprised if Wasm were spec'd in such a way that e.g. a `f32.add` required more than a bare FP add instruction on common ISAs -- that would seem to go against the "predictable performance" design, and I haven't seen "ensure it's an arithmetic NaN" logic in other engines -- but this is good to double-check.)
primoly commented on issue #8967:
> (As an additional note, I'd be surprised if Wasm were spec'd in such a way that e.g. a `f32.add` required more than a bare FP add instruction on common ISAs -- that would seem to go against the "predictable performance" design, and I haven't seen "ensure it's an arithmetic NaN" logic in other engines -- but this is good to double-check.)

I think you can only observe the exact bits of a float with certain instructions (memory stores, reinterpreting bits as an int, others?), so if those instructions do some kind of canonicalisation guaranteeing arithmetic NaNs, that would likely be enough. For stores to memory the performance impact should be negligible: stores are so expensive that the additional conversion to arithmetic NaNs wouldn't add much.
fitzgen commented on issue #8967:
> (As an additional note, I'd be surprised if Wasm were spec'd in such a way that e.g. a `f32.add` required more than a bare FP add instruction on common ISAs -- that would seem to go against the "predictable performance" design, and I haven't seen "ensure it's an arithmetic NaN" logic in other engines -- but this is good to double-check.)

+1 to this; I'd also be very surprised.
primoly edited a comment on issue #8967:
> (As an additional note, I'd be surprised if Wasm were spec'd in such a way that e.g. a `f32.add` required more than a bare FP add instruction on common ISAs -- that would seem to go against the "predictable performance" design, and I haven't seen "ensure it's an arithmetic NaN" logic in other engines -- but this is good to double-check.)

I think you can only observe the exact bits of a float with certain instructions (memory stores, reinterpreting bits as an int, others?), so if those instructions do some kind of canonicalisation guaranteeing arithmetic NaNs, that would likely be enough. For stores to memory the performance impact should be negligible: stores are so expensive that the additional conversion to arithmetic NaNs wouldn't add much.
Edit: I don't actually know much about the performance of CPU instructions; maybe with the speed of L1 caches, stores are not as expensive as I wrote above.
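To make the idea in this comment concrete (a hypothetical sketch, not Wasmtime's actual strategy): an engine that canonicalises NaNs only where their bits become observable, e.g. before an `f32.store` or `i32.reinterpret_f32`, could insert something equivalent to the following, at the cost of one NaN check per observation point.

```rust
// Hypothetical sketch: canonicalise a NaN only at an observation point
// (e.g. before f32.store or i32.reinterpret_f32). This is not Wasmtime's
// actual implementation; it only illustrates the idea from the comment above.
fn canonicalize_observed_f32(x: f32) -> u32 {
    const CANONICAL_NAN_F32: u32 = 0x7fc0_0000;
    if x.is_nan() {
        // Any NaN payload collapses to the canonical (arithmetic) NaN.
        CANONICAL_NAN_F32
    } else {
        x.to_bits()
    }
}
```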
afonso360 commented on issue #8967:
For RISC-V the behavior is fairly straightforward. (See §21.3. "NaN Generation and Propagation" of the ISA Unprivileged Manual)

> Except when otherwise stated, if the result of a floating-point operation is NaN, it is the canonical NaN. The canonical NaN has a positive sign and all significand bits clear except the MSB, a.k.a. the quiet bit. For single-precision floating-point, this corresponds to the pattern 0x7fc00000

AFAIK the exceptions to this are `fmin`/`fmax` and the sign injection instructions. So I think this is compliant with the WASM Spec directly, and for `fmin`/`fmax` we do some corrections.
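For context, a hypothetical sketch (not the actual Cranelift lowering) of the kind of correction needed for `fmin`: RISC-V `fmin.s` returns the non-NaN operand when exactly one input is NaN, whereas Wasm's `f32.min` must return a NaN in that case, so the lowering has to substitute a NaN (the canonical one is allowed).

```rust
// Hypothetical sketch of the behaviour Wasm requires from f32.min, which the
// bare RISC-V fmin.s instruction does not provide on its own (fmin.s returns
// the non-NaN operand when exactly one input is NaN). This is not the actual
// Cranelift lowering, just the required semantics written out in Rust.
fn wasm_f32_min(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() {
        // Wasm requires a NaN result; returning the canonical NaN is allowed.
        f32::from_bits(0x7fc0_0000)
    } else if a == 0.0 && b == 0.0 {
        // Wasm's min treats -0.0 as smaller than +0.0.
        if a.is_sign_negative() { a } else { b }
    } else if a < b {
        a
    } else {
        b
    }
}
```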
uweigand commented on issue #8967:
The s390x ISA does not use the term "arithmetic NaN". However, the definition of this term in the WebAssembly spec (most significant payload bit is 1) corresponds exactly to what the s390x ISA calls "quiet NaN":
> There are two types of NaNs, signaling and quiet. A signaling NaN (SNaN) is distinguished from the corresponding quiet NaN (QNaN) by the leftmost fraction bit: zero for the SNaN and one for the QNaN. A special QNaN is supplied as the default result for an IEEE-invalid-operation exception; it has a plus sign and a leftmost fraction bit of one, with the remaining fraction bits being set to zeros.
The ISA guarantees that the result of any floating-point instruction (except those only manipulating the sign bit) will always be a quiet NaN (if it is a NaN). More specifically if the output is a NaN, it is either a copy of one of the input operands (possibly transformed from a signaling NaN to a quiet NaN), or else the default NaN described in the quote above (which corresponds to what WebAssembly calls the "canonical NaN").
Given that behavior of the ISA, it seems unlikely that code generated by the wasmtime backend would ever be able to violate the WebAssembly NaN propagation rules on s390x (although I haven't explicitly verified all patterns).
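As a small illustration (assumed IEEE-754 double layout, not code from the s390x backend): on an ISA where the leftmost fraction bit is the quiet bit, the quiet-NaN check coincides exactly with the Wasm "arithmetic NaN" check; for f64 that bit is bit 51.

```rust
// Sketch (assumed IEEE-754 double layout, not code from the s390x backend):
// where the leftmost fraction bit is the quiet bit, "quiet NaN" and Wasm's
// "arithmetic NaN" are the same predicate. For f64 that bit is bit 51.
fn is_quiet_nan_f64(bits: u64) -> bool {
    let is_nan = (bits & 0x7ff0_0000_0000_0000) == 0x7ff0_0000_0000_0000
        && (bits & 0x000f_ffff_ffff_ffff) != 0;
    is_nan && (bits & 0x0008_0000_0000_0000) != 0
}

fn main() {
    // The default QNaN from the quote above: plus sign, leftmost fraction bit
    // one, remaining fraction bits zero. This is also Wasm's canonical f64 NaN.
    assert!(is_quiet_nan_f64(0x7ff8_0000_0000_0000));
    // A signalling NaN (quiet bit clear) is not an arithmetic NaN.
    assert!(!is_quiet_nan_f64(0x7ff0_0000_0000_0001));
}
```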
primoly closed issue #8967:
NaNs in IEEE floats are encoded just like infinities, except that at least one of the mantissa bits is set to one. Wasm NaNs in results of float ops (_except_ `fabs`, `fneg` and `fcopysign`) have the additional requirement that the most significant bit of the mantissa must be one (arithmetic NaN). Is this actually honoured by Cranelift codegen? If NaNs are always canonicalised then this is guaranteed (as the canonical NaN is also an arithmetic one), but if NaNs are simply propagated then this could be an issue.
https://webassembly.github.io/spec/core/exec/numerics.html#nan-propagation
https://webassembly.github.io/spec/core/syntax/values.html#syntax-float
https://webassembly.github.io/spec/core/syntax/values.html#arithmetic-nan
primoly commented on issue #8967:
Thanks everyone, especially @afonso360 and @uweigand for digging up the RISC-V and s390x specifications. Now that I know of the difference between quiet and signalling NaNs, it became clear to me why Wasm made that choice about the "arithmetic NaN" (quiet NaN). I haven't checked how this is handled on x86-64 and AArch64, but I presume they either return the canonical NaN (as RISC-V does) or some arithmetic NaN (as s390x does).