MozarellaMan opened PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
:
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[x] This has been discussed in issue #..., or if not, please tell us why
here.[x] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
This is for #5444I decided to just start with the Intel CPUs for now just to get feedback on whether this is an alright way to sync up + update cranelift's to match LLVM's presets. I used this (thanks @bjorn3!) as a reference for the features and presets used by LLVM.
Changes
- added
Copy, Clone
derive toPresetIndex
to enable composing presets multiple times- added missing x86 Intel CPU presets (old and new)
- updated existing presets with apparently missing features (
cannonlake
andicelake
)Notes
- What's the best way to test this? Can it be meaningfully tested?
- Should I include links to the exact feature lists for each preset as a comment with each preset or just one link at the top of all the presets?
- In some cases, LLVM has a different name for the same architecture feature e.g. LLVM has
FeatureDQI
, cranelift hashas_avx512dq
- this led to me updating the existing presets with features that those archtectures have (or at least LLVM says they have those features!)
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
abrown created PR review comment:
Is this duplicated from up above?
abrown submitted PR review.
abrown submitted PR review.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
Looks like LLVM has an alias for
silvermont
calledslm
which I guess we should mirror here.
jameysharp created PR review comment:
I think there's a mismatch between how we specify features and how LLVM does it.
Apparently LLVM's
FeatureSSE41
impliesFeatureSSSE3
which impliesFeatureSSE3
, for example. Our boolean settings don't imply other settings. Instead, if you look at thenehalem
preset for example, it's specified aspreset!(has_sse3 && has_ssse3 && has_sse41 && ...)
.So these definitions for
core2
andpenryn
are both missinghas_sse3
, andpenryn
is missinghas_ssse3
. Later presets are similarly missing implied features.I'm not sure whether it's better to follow the existing pattern and duplicate these implied feature sets everywhere, or to introduce new presets for features which are related in this way:
let sse3 = settings.add_preset("sse3", "SSE3 and earlier.", preset!(has_sse3)); let ssse3 = settings.add_preset("ssse3", "SSSE3 and earlier.", preset!(sse3 && has_ssse3)); let sse41 = settings.add_preset("sse41", "SSE4.1 and earlier.", preset!(ssse3 && has_sse41)); let sse42 = settings.add_preset("sse42", "SSE4.2 and earlier.", preset!(sse41 && has_sse42)); ... settings.add_preset("nocona", "Nocona microarchitecture.", preset!(sse3)); settings.add_preset("core2", "Core 2 microarchitecture.", preset!(ssse3)); settings.add_preset("penryn", "Penryn microarchitecture.", preset!(sse41)); let nehalem = settings.add_preset("nehalem", "Nehalem microarchitecture.", preset!(sse42 && has_popcnt));
Either way the flattened list of features is included in comments in the generated Rust in
target/*/build/cranelift-codegen-*/out/settings-x86.rs
, which is nice for double-checking the results.(@fitzgen, perhaps you have opinions on this?)
jameysharp created PR review comment:
Let's match how LLVM defines Haswell. Apparently we were missing AVX/AVX2 and FMA.
preset!(ivy_bridge && has_avx2 && has_bmi1 && has_bmi2 && has_fma && has_lzcnt),
jameysharp created PR review comment:
Since LLVM claims Haswell included FMA, we can drop it from the preset for Broadwell.
jameysharp created PR review comment:
I appreciate that you're keeping the preset with the
icelake
name that we used previously. But let's do something more like this to emphasize that it's a compatibility alias, not from LLVM:let icelake_client = settings.add_preset("icelake-client", ..., preset!(cannonlake && has_avx512bitalg)); // LLVM doesn't use the name "icelake" but Cranelift did in the past; alias it settings.add_preset("icelake", ..., preset!(icelake_client));
jameysharp created PR review comment:
Yeah, this copy of the
alderlake
definition should be deleted. But LLVM is confusing here because it defines Alder Lake between Sapphire Rapids and Raptor Lake, but uses it as a base for earlier microarchitectures. Maybe leave a comment here noting it was defined above, and a comment at the definition above noting that LLVM defines it later.
jameysharp created PR review comment:
If I've read the LLVM tables correctly, Nocona is the first Intel 64-bit microarchitecture, so these are the 32-bit targets I'd like to remove.
// Netburst
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
MozarellaMan submitted PR review.
MozarellaMan created PR review comment:
Personally, I would prefer the features implying the others - it makes sense now that you pointed it out. The added description in the setting preset makes that more obivous.
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
jameysharp submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
If I'm reading the LLVM tables correctly, it exposes this as _both_ "slm" and "silvermont", but defines them identically. So we should do the same. I think it makes sense to use the longer name "silvermont" as the let-bound preset, and define the "slm" preset in terms of "silvermont".
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
jameysharp submitted PR review.
jameysharp created PR review comment:
I think the duplicate definition of
alderlake
is the last issue remaining here. Once that's fixed I think this PR is ready!
MozarellaMan updated PR #5490 from update_x86_target_intel_cpu_presets_to_match_llvm
to main
.
jameysharp submitted PR review.
jameysharp has enabled auto merge for PR #5490.
jameysharp merged PR #5490.
Last updated: Dec 23 2024 at 12:05 UTC