cfallin opened issue #4125:
We currently define an `enum` to represent machine instructions for each backend. This is fairly high-level and generally nice, allowing us to use Rust types to model arguments and modes, etc.
However, there is a lot of "glue" in the layer of code that supports doing things with these enums that is very tedious and error-prone to write:
- Each instruction enum has a `get_operands` that must list every Reg as an early/late def/use/mod or tmp (and missing one is a bug);
- Each instruction enum has an `emit` that must use an `AllocationConsumer` to take regalloc results in the same order as `get_operands` provided vregs, and mismatching the two is a subtle and hard-to-catch bug;
- Each instruction enum has a pretty-printing implementation, with similar `AllocationConsumer` usage constraints, and also with very repetitive code ("print the operand and these three regs");
- There are a number of other kinds of metadata (is a move, is a safepoint) that are fairly mechanical `match`es over the enum;
- Possibly others (but the above are the main error-prone bits).
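To make the ordering hazard in the list above concrete, here is a minimal sketch. It uses toy stand-in types (`VReg`, `PReg`, `Operand`, `AllocConsumer`, and a one-variant `Inst`) that are hypothetical and much simpler than Cranelift's real definitions; only the shape of the problem is meant to carry over.

```rust
// Toy stand-ins for illustration only; these are NOT Cranelift's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct VReg(u32);
#[derive(Clone, Copy, Debug, PartialEq)]
struct PReg(u8);

#[derive(Clone, Copy, Debug, PartialEq)]
enum Operand {
    Use(VReg),
    Def(VReg),
}

enum Inst {
    Add { dst: VReg, lhs: VReg, rhs: VReg },
}

/// Hypothetical consumer: hands out allocations strictly in the order
/// in which the operands were reported to the register allocator.
struct AllocConsumer<'a> {
    allocs: std::slice::Iter<'a, PReg>,
}

impl<'a> AllocConsumer<'a> {
    fn new(allocs: &'a [PReg]) -> Self {
        Self { allocs: allocs.iter() }
    }
    fn next(&mut self) -> PReg {
        *self.allocs.next().expect("too few allocations")
    }
}

impl Inst {
    /// Hand-written: reports operands in the order lhs, rhs, dst.
    fn get_operands(&self) -> Vec<Operand> {
        match self {
            Inst::Add { dst, lhs, rhs } => {
                vec![Operand::Use(*lhs), Operand::Use(*rhs), Operand::Def(*dst)]
            }
        }
    }

    /// Hand-written: must pull allocations in exactly the same order.
    /// Swapping any two `next()` calls still compiles and silently
    /// emits the wrong registers.
    fn emit(&self, allocs: &mut AllocConsumer<'_>) -> String {
        match self {
            Inst::Add { .. } => {
                let lhs = allocs.next();
                let rhs = allocs.next();
                let dst = allocs.next();
                format!("add p{}, p{}, p{}", dst.0, lhs.0, rhs.0)
            }
        }
    }
}
```

Nothing in the type system ties the two `match` arms together, which is why the mismatch is hard to catch: the only link between `get_operands` and `emit` is the order of statements in two separate functions.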
Ideally, we could allow the backend author to define a new "assembler library" by:
- Writing the enum of instruction formats, with decorations/annotations for "reg use/def" and other metadata;
- Providing an implementation of a generated trait for emission with one method per instruction format (enum arm), with typesafe named arguments for registers (with generated code matching up allocations, avoiding any errors);
- Either specifying a "default" for pretty-printing (opcode, `Display` trait for other bits, registers?) or else using a similar trait approach as for emission.
Then we could generate the rest: essentially the whole implementation of the `MachInst` trait, or at least most of it.
There has been some informal discussion about whether expanded information about instructions should be in toml/yaml-type files or something else; I believe it might actually be better to keep this in the ISLE, with some sort of annotation syntax, both to reduce the cognitive load of languages that must be learnt, and to keep as few sources-of-truth around as possible.
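As a rough illustration of the proposal, the sketch below shows one possible split between author-written and generated code. Everything here is an assumption made for illustration: the annotation syntax (shown only as comments), the `Emit` trait, the `TextEmitter` type, and the toy `VReg`/`PReg`/`Operand` types are all hypothetical, and the "generated" parts are written out by hand rather than produced by any real tool.

```rust
// Toy stand-ins for illustration only; these are NOT Cranelift's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct VReg(u32);
#[derive(Clone, Copy, Debug, PartialEq)]
struct PReg(u8);
#[derive(Clone, Copy, Debug, PartialEq)]
enum Operand {
    Use(VReg),
    Def(VReg),
}

// Author-written: the instruction formats. The def/use annotations are
// shown as comments because the real annotation syntax is undecided.
enum Inst {
    Add { /* def */ dst: VReg, /* use */ lhs: VReg, /* use */ rhs: VReg },
    Mov { /* def */ dst: VReg, /* use */ src: VReg },
}

// "Generated": one emission method per enum arm, with named, typed
// physical-register arguments.
trait Emit {
    fn add(&mut self, dst: PReg, lhs: PReg, rhs: PReg);
    fn mov(&mut self, dst: PReg, src: PReg);
}

// "Generated": operand collection, allocation matching, and metadata all
// derive from the same field order, so they cannot drift apart.
impl Inst {
    fn get_operands(&self) -> Vec<Operand> {
        match *self {
            Inst::Add { dst, lhs, rhs } => {
                vec![Operand::Def(dst), Operand::Use(lhs), Operand::Use(rhs)]
            }
            Inst::Mov { dst, src } => vec![Operand::Def(dst), Operand::Use(src)],
        }
    }

    /// `allocs` holds one physical register per operand, in the order
    /// returned by `get_operands`.
    fn emit<E: Emit>(&self, allocs: &[PReg], emitter: &mut E) {
        match *self {
            Inst::Add { .. } => emitter.add(allocs[0], allocs[1], allocs[2]),
            Inst::Mov { .. } => emitter.mov(allocs[0], allocs[1]),
        }
    }

    fn is_move(&self) -> bool {
        matches!(self, Inst::Mov { .. })
    }
}

// Author-written: the only per-instruction logic left to write by hand.
// This one produces text; a real backend would write machine-code bytes.
struct TextEmitter(Vec<String>);

impl Emit for TextEmitter {
    fn add(&mut self, dst: PReg, lhs: PReg, rhs: PReg) {
        self.0.push(format!("add p{}, p{}, p{}", dst.0, lhs.0, rhs.0));
    }
    fn mov(&mut self, dst: PReg, src: PReg) {
        self.0.push(format!("mov p{}, p{}", dst.0, src.0));
    }
}
```

In this arrangement the author never touches allocation indices: the emitter methods receive registers by name, and the index bookkeeping lives entirely in the generated `emit`.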
abrown commented on issue #4125:
I think this would be a good idea (I've had it in the past but it seems like a lot of work). I think an advantage of this is that it would make it easier to refactor the x64 `Inst` variants to more closely align with the ISA instructions; IMO, it is not always clear how to encode an instruction given the current "classes of instructions" available. If this issue were implemented with a way to template the shapes of instructions (e.g., `<hole> xmm, xmm/m128`) then it would be possible to generate `Inst` variants that match the ISA.
Last updated: Jan 24 2025 at 00:11 UTC