Stream: git-cranelift

Topic: cranelift / Issue #955 Reduce codegen annoyances


view this post on Zulip GitHub (Feb 28 2020 at 23:26):

alexcrichton transferred Issue #955:

Having worked with cranelift for a few weeks now, I've had several discussions about how to improve the code in the cranelift-codegen module. I am sure that moving meta from Python to Rust has made the code easier to understand and use but here are some things I've noticed as I've worked in this module:

  1. __static references__: cranelift-codegen contains many by_name retrievals throughout the encodings, formats, legalization, etc. This means that when I add a new instruction, e.g., I can't encode it until I've found the alphabetically-ordered location to insert a by_name("my_inst")--and I forget to do this every time. @bnjbvr mentioned that we might be able to use lazy_static to initialize them. Then we could use instructions, e.g., by importing crate::shared and writing shared::splat where they are used. This may work for instructions, formats, recipes (maybe?), but we may run into issues with settings since SettingGroupBuilder::build does more than just collect predicates.
  2. __eliminate templates__: cranelift-codegen has the concept of Recipes; the x86 subdirectory has the additional concept of Templates (previously TailRecipes). Perhaps I just don't understand the purpose of Template clearly enough (and would appreciate an explanation), but having both Recipe and Template is very confusing; it is unclear when to build one or the other and the difference seems minimal. Is there a way Recipe can be improved to cover any features it is missing and Template eliminated?
  3. __recipe naming convention__: The x86 recipes have a confusing assortment of names: sometimes they make sense (null and null_fpr) but other times the naming is less than clear (e.g. brid might stand for a branch recipe with a double-word immediate?). This makes it difficult to find existing recipes that may already do what I need. Can we explain the naming convention in the documentation and/or adopt a naming convention that is more explanatory?
  4. __non-string recipes__: The recipe code itself (i.e. in emit) is difficult to change because it is a string, not Rust code. If a recipe is coded incorrectly, usually the build will fail; if the code was statically analyzable the errors would be clearer sooner. If instead emit could be passed a closure or function to do the appropriate recipe work, then the code would be statically analyzable: .emit(|sink, bits, regs| { ... }. This won't work as-is because each recipe uses different parameters and types, but perhaps a procedural macro wrapped around this (i.e. emit(recipe_from!(|sink, bits, regs| { ... })) could reduce a real Rust closure/function to the strings emit expects while allowing IDEs to statically analyze the code.
  5. __re-factor PerCpuModeEncodings__: The x86 encodings use a PerCpuModeEncodings struct to join an instruction, its bound types, its encodings, its recipes, etc. Currently there are a confusing assortment of methods for adding different subsets of these (e.g. enc_x86_64_instp, enc_both, enc_x86_64, enc_i32_i64, etc.). Perhaps a builder-like pattern would allow us to remove some of these methods and clarify what is happening; something like: e.inst(band).with_types([B1, B8, B16]).uses_recipe(rec_rr.opcodes(...)).when_isa([use_sse3]).when_predicate(...).encode();. For instructions that have several encodings following a pattern (i.e. like that of enc_i32_i64) we could add a .encode_with_pattern(...) accepting an enum for the various patterns.

I am opening this issue to discuss if any of the above are feasible and desirable. Please add to this list if there are other items.


Last updated: Oct 23 2024 at 20:03 UTC