alexcrichton opened PR #5567 from libcalls
to main
:
Long ago Wasmtime used to have logic for resolving relocations post-compilation for libcalls which I ended up removing during refactorings last year. As #5563 points out, however, it's possible to get Wasmtime to panic by disabling SSE features which forces Cranelift to use libcalls for some floating-point operations instead. Note that this also requires disabling SIMD because SIMD support has a baseline of SSE 4.2.
This commit pulls back the old implementations of various libcalls and reimplements logic necessary to have them work on CPUs without SSE 4.2
Closes #5563
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
alexcrichton has marked PR #5567 as ready for review.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
It'd be nice if there was a good way to share these implementations of
round_ties_even
with the ones incranelift/codegen/src/ir/immediates.rs
. None come to my mind off-hand since the runtime can be built in a configuration without the compiler, but it'd be nice.The existing version there doesn't have the "Check for NaNs" section that this version does. I haven't thought hard enough about this to understand what that's for, but maybe you could add a comment linking to some reference for why this is the right implementation?
The existing version does have a link to rust-lang/rust#96710 which I think is worth copying here.
jameysharp created PR review comment:
Minor typo:
/// Returns the libcall corresponding to the provided symbol name,
jameysharp created PR review comment:
Even after reading the above comment several times I still had a hard time figuring out that these calls to
arbitrary()
can't be moved into the loop.How about something like this to avoid having to keep two lists of feature names in sync, and hopefully make it less tempting to inline the variables into the loop?
let new_flags = ["has_sse3", "has_ssse3", "has_sse41", "has_sse42"] .into_iter() .map(|name| (name, u.arbitrary()?)) .collect::<arbitrary::Result<HashMap<&'static str, bool>>()?; for (name, val) in flags { if let Some(new_value) = new_flags.get(name) { *val = new_value.to_string(); } }
Also, I think this is okay, but just to check: Each of these features implies the previous one (e.g. if you have SSE4.2 you also have SSE4.1). Is it okay to turn some off without turning off later ones?
alexcrichton created PR review comment:
Is it okay to turn some off without turning off later ones?
While I'm not 100% certain myself I believe that this question is up to Cranelift mostly. I believe Cranelift is mostly structured around "if this feature is active emit this instr" while it doesn't make sense to disable sse3 but leave sse4.1 enabled I don't think it will break anything since it would be a sort of "chaos mode" for cranelift stressing it.
If this becomes a problem for Cranelift, however, we can tweak the fuzz input to respect what the actual CPU feature hierarchy is.
alexcrichton submitted PR review.
alexcrichton updated PR #5567 from libcalls
to main
.
alexcrichton updated PR #5567 from libcalls
to main
.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
These libcalls were removed in https://github.com/bytecodealliance/wasmtime/pull/4470 and I copied their source from just before that commit.
Apparently
nearest
has a slightly storied history:
- https://github.com/bytecodealliance/wasmtime/pull/2219 fixed it for NaN cases
- https://github.com/bytecodealliance/wasmtime/pull/2171 made the implementation more efficient based on musl
- https://github.com/bytecodealliance/wasmtime/pull/34 was the original implementation
I ran
cargo run wast ./tests/spec_testsuite/f32.wast --cranelift-set has_sse41=false --wasm-features=-simd
and commented out the "Check for NaNs" block and the tests failed, so the block is definitely load-bearing. That may mean that Cranelift's implementation needs to be updated to account for NaN's perhaps? (I don't really understand the actual underlying code here I'm just copying bytes)
jameysharp submitted PR review.
alexcrichton merged PR #5567.
Last updated: Jan 24 2025 at 00:11 UTC