cfallin edited issue #1150:
Motivation
Cranelift's IR went through a huge overhaul, from the original Python generator to one written in Rust. This has made it easier to work on, as we're using the same language to generate the code that is used in the rest of the project. However, it is still not ideal for contributing new instructions or changing existing ones, as it is not clear what steps are required or which parts of the meta code generator need to be modified to correctly add an instruction.
This could be partially solved by providing better documentation. However, documentation doesn't improve the experience of contributing to the codebase itself, and it is likely to fall out of date as time goes on, similar to the current situation, as there's no strict link between an instruction and its documentation besides what's available in `InstBuilder`.
Instead, I would like to propose rewriting at least some of the meta code generator with a greater focus on data locality, so that if someone wants to contribute to the IR, the number of overall steps is reduced. This should also help with documenting the IR format, as the documentation can be more closely coupled to the instruction definitions.
Proposed Solution
I would like to propose using a data format to encode information about an instruction and how it is encoded. I've used YAML to illustrate what this could look like, though I'm not advocating for the use of YAML specifically.
    ---
    name: jump
    doc: >
      Jump.
      Unconditionally jump to an extended basic block, passing the specified
      EBB arguments. The number and types of arguments must match the
      destination EBB.
    attributes:
      - terminator
      - branch
    operands_in:
      - Ebb # This would also imply `args`
    encodings:
      # Encode from recipe
      x86:
        recipe: x86_JMP
      # If this was YAML specifically you could use anchors for recipes. e.g.
      x86:
        << *X86_JMP
      # Or encode directly
      riscv:
        emit: >
          let dest = i64::from(func.offsets[destination]);
          let disp = dest - i64::from(sink.offset());
          put_uj(bits, disp, 0, sink);
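To make the shape of such a definition concrete, here is a minimal Rust sketch of the in-memory form the `jump` example might take once loaded. Everything in it is an assumption for illustration: `InstructionDef`, `Encoding`, `jump_def`, and `render_doc` are hypothetical names, not part of Cranelift, and the definition is built by hand rather than deserialised so that the sketch has no dependencies.

```rust
use std::collections::BTreeMap;

/// How an instruction is encoded for one target ISA (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
pub enum Encoding {
    /// Reuse a named recipe, e.g. `x86_JMP` from the example above.
    Recipe(String),
    /// Inline emission code, kept as text for the generator to splice in.
    Emit(String),
}

/// One declarative instruction definition, mirroring the fields of the example.
#[derive(Debug, Clone, PartialEq)]
pub struct InstructionDef {
    pub name: String,
    pub doc: String,
    pub attributes: Vec<String>,
    pub operands_in: Vec<String>,
    /// Keyed by ISA name; a BTreeMap keeps iteration order deterministic.
    pub encodings: BTreeMap<String, Encoding>,
}

/// Build the `jump` definition by hand (a real loader would read it from a data file).
pub fn jump_def() -> InstructionDef {
    let mut encodings = BTreeMap::new();
    encodings.insert("x86".to_string(), Encoding::Recipe("x86_JMP".to_string()));
    encodings.insert(
        "riscv".to_string(),
        Encoding::Emit(
            "let dest = i64::from(func.offsets[destination]);\n\
             let disp = dest - i64::from(sink.offset());\n\
             put_uj(bits, disp, 0, sink);"
                .to_string(),
        ),
    );
    InstructionDef {
        name: "jump".to_string(),
        doc: "Jump. Unconditionally jump to an extended basic block, \
              passing the specified EBB arguments."
            .to_string(),
        attributes: vec!["terminator".to_string(), "branch".to_string()],
        operands_in: vec!["Ebb".to_string()],
        encodings,
    }
}

/// Render a documentation entry directly from the definition,
/// so the text is derived from the same data as the generated code.
pub fn render_doc(def: &InstructionDef) -> String {
    format!(
        "{} ({})\n{}\nInputs: {}\nTargets: {}",
        def.name,
        def.attributes.join(", "),
        def.doc,
        def.operands_in.join(", "),
        def.encodings.keys().cloned().collect::<Vec<_>>().join(", "),
    )
}
```

Calling `render_doc(&jump_def())` yields a short text entry derived from the same data the code generator would consume, which is one way the documentation could stay coupled to the instruction definition.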
The main advantage of this approach is that it moves the current imperative style of `cranelift-codegen` into something that is more declarative and data-oriented. To me this provides more clarity around how an instruction is defined and used: if you wanted to create a new instruction, you could simply copy and paste from another already working instruction.
Drawbacks
While the goal of this proposal is to simplify working on the IR language itself, it could make the underlying meta code more complex, as well as potentially increasing the time to build `cranelift-codegen`, since there would now be a deserialisation step that wasn't there before.
Alternatives
Instead of changing the system, there could be a greater focus on building resources for contributing to Cranelift that explain how to use the current system.
cfallin labeled issue #1150.
Last updated: Jan 24 2025 at 00:11 UTC