Stream: git-wasmtime

Topic: wasmtime / issue #4125 Cranelift: generate more from decl...

Wasmtime GitHub notifications bot (May 10 2022 at 22:25):

We currently define an enum to represent machine instructions for each backend. This is fairly high-level and generally nice, allowing us to use Rust types to model arguments and modes, etc.

However, there is a lot of "glue" code supporting these enums that is tedious and error-prone to write:

- Each instruction enum has a get_operands that must list every Reg as an early/late def/use/mod or tmp (and missing one is a bug);

- Each instruction enum has an emit that must use an AllocationConsumer to take regalloc results in the same order in which get_operands provided the vregs, and mismatching the two is a subtle and hard-to-catch bug;

- Each instruction enum has a pretty-printing implementation, with similar AllocationConsumer usage constraints, and also with very repetitive code ("print the operand and these three regs");

- There are a number of other kinds of metadata (is a move, is a safepoint) that are fairly mechanical matches over the enum;

- Possibly others (but the above are the main error-prone bits).
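
To make the ordering hazard concrete, here is a minimal sketch. Every type in it (VReg, PReg, Operand, Inst, AllocConsumer) is a simplified, hypothetical stand-in and not Cranelift's or regalloc2's real API; it only illustrates how an emit that consumes allocations in a different order than get_operands reported them still compiles but produces the wrong registers.

```rust
// Hypothetical, simplified stand-ins; these are NOT Cranelift's real types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VReg(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PReg(pub u8);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Operand {
    Use(VReg),
    Def(VReg),
}

pub enum Inst {
    Add { dst: VReg, lhs: VReg, rhs: VReg },
}

/// Reports operands to the register allocator in the order: lhs, rhs, dst.
pub fn get_operands(inst: &Inst) -> Vec<Operand> {
    match inst {
        Inst::Add { dst, lhs, rhs } => {
            vec![Operand::Use(*lhs), Operand::Use(*rhs), Operand::Def(*dst)]
        }
    }
}

/// Hands back allocations one at a time, in the order operands were reported.
pub struct AllocConsumer<'a> {
    allocs: std::slice::Iter<'a, PReg>,
}

impl<'a> AllocConsumer<'a> {
    pub fn new(allocs: &'a [PReg]) -> Self {
        Self { allocs: allocs.iter() }
    }

    pub fn next_reg(&mut self) -> PReg {
        *self.allocs.next().expect("ran out of allocations")
    }
}

/// Consumes allocations in the same order as `get_operands`: lhs, rhs, dst.
pub fn emit_correct(inst: &Inst, allocs: &[PReg]) -> String {
    let mut c = AllocConsumer::new(allocs);
    match inst {
        Inst::Add { .. } => {
            let lhs = c.next_reg();
            let rhs = c.next_reg();
            let dst = c.next_reg();
            format!("add p{}, p{}, p{}", dst.0, lhs.0, rhs.0)
        }
    }
}

/// Consumes dst first. This compiles fine but silently swaps registers.
pub fn emit_buggy(inst: &Inst, allocs: &[PReg]) -> String {
    let mut c = AllocConsumer::new(allocs);
    match inst {
        Inst::Add { .. } => {
            let dst = c.next_reg();
            let lhs = c.next_reg();
            let rhs = c.next_reg();
            format!("add p{}, p{}, p{}", dst.0, lhs.0, rhs.0)
        }
    }
}

/// Runs both emitters on the same allocations (given for lhs, rhs, dst).
pub fn demo() -> (String, String) {
    let inst = Inst::Add { dst: VReg(0), lhs: VReg(1), rhs: VReg(2) };
    let allocs = [PReg(1), PReg(2), PReg(3)];
    (emit_correct(&inst, &allocs), emit_buggy(&inst, &allocs))
}
```

Nothing in the type system ties the two orders together in this sketch, which is the class of mistake the list above describes.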

Ideally, we could allow the backend author to define a new "assembler library" by:

- Writing the enum of instruction formats, with decorations/annotations for "reg use/def" and other metadata;

- Providing an implementation of a generated trait for emission with one method per instruction format (enum arm), with typesafe named arguments for registers (with generated code matching up allocations, avoiding any errors);

- Either specifying a "default" for pretty-printing (opcode, Display trait for other bits, registers?) or else using a similar trait approach as for emission.

Then we could generate the rest: essentially the whole implementation of the MachInst trait, or at least most of it.
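
One way to picture the proposal is the sketch below, which uses a macro_rules! macro as a stand-in for whatever generator would actually be used. The macro name define_insts!, the defs/uses annotation keywords, the Emitter trait, and all types are hypothetical and chosen only for illustration; the issue does not specify any of them. From a single annotated description, the macro produces the instruction enum, get_operands, an emission trait with one method per format, and the dispatch glue that pairs allocations with named arguments.

```rust
// Hypothetical stand-in types; not Cranelift's real API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VReg(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PReg(pub u8);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Operand {
    Use(VReg),
    Def(VReg),
}

// Maps an annotation keyword to the matching operand constructor.
macro_rules! operand {
    (uses, $r:expr) => { Operand::Use($r) };
    (defs, $r:expr) => { Operand::Def($r) };
}

// Assumed syntax: `Variant => method_name { defs|uses field, ... }`.
macro_rules! define_insts {
    ($( $variant:ident => $method:ident { $( $kind:ident $field:ident ),* } )*) => {
        #[derive(Debug)]
        pub enum Inst {
            $( $variant { $( $field: VReg ),* } ),*
        }

        /// One method per instruction format, with named register arguments.
        pub trait Emitter {
            $( fn $method(&mut self, $( $field: PReg ),* ); )*
        }

        /// Generated: operands listed in declaration order.
        pub fn get_operands(inst: &Inst) -> Vec<Operand> {
            match inst {
                $( Inst::$variant { $( $field ),* } => {
                    vec![ $( operand!($kind, *$field) ),* ]
                } )*
            }
        }

        /// Generated: allocations are consumed in the same declaration order,
        /// so the backend author never matches them up by hand.
        pub fn emit<E: Emitter>(inst: &Inst, allocs: &[PReg], e: &mut E) {
            let mut it = allocs.iter().copied();
            match inst {
                $( Inst::$variant { .. } => {
                    $( let $field = it.next().expect("missing allocation"); )*
                    e.$method($( $field ),*);
                } )*
            }
        }
    };
}

define_insts! {
    Add => emit_add { defs dst, uses lhs, uses rhs }
    Mov => emit_mov { defs dst, uses src }
}

/// What the backend author writes: only the per-format emission logic.
pub struct TextEmitter {
    pub out: Vec<String>,
}

impl Emitter for TextEmitter {
    fn emit_add(&mut self, dst: PReg, lhs: PReg, rhs: PReg) {
        self.out.push(format!("add p{}, p{}, p{}", dst.0, lhs.0, rhs.0));
    }

    fn emit_mov(&mut self, dst: PReg, src: PReg) {
        self.out.push(format!("mov p{}, p{}", dst.0, src.0));
    }
}

/// Emits one Add using allocations given in declaration order (dst, lhs, rhs).
pub fn demo() -> Vec<String> {
    let inst = Inst::Add { dst: VReg(0), lhs: VReg(1), rhs: VReg(2) };
    let mut e = TextEmitter { out: Vec::new() };
    emit(&inst, &[PReg(5), PReg(6), PReg(7)], &mut e);
    e.out
}
```

Because get_operands and emit are both derived from the same field list, their orders cannot drift apart in this sketch. A real design would also need to cover early/late and mod/tmp operands, non-register fields, and pretty-printing, none of which are modeled here.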

There has been some informal discussion about whether the expanded information about instructions should live in TOML/YAML-style files or somewhere else; I believe it might actually be better to keep it in ISLE, with some sort of annotation syntax, both to reduce the number of languages that must be learnt and to keep as few sources of truth around as possible.

Wasmtime GitHub notifications bot (May 10 2022 at 22:25):

cfallin labeled issue #4125.

Wasmtime GitHub notifications bot (May 10 2022 at 23:04):

abrown commented on issue #4125:

I think this would be a good idea (I've had the same thought in the past, but it seems like a lot of work). One advantage is that it would make it easier to refactor the x64 Inst variants to align more closely with the ISA instructions; IMO, it is not always clear how to encode an instruction given the current "classes of instructions" available. If this issue were implemented with a way to template the shapes of instructions (e.g., <hole> xmm, xmm/m128), then it would be possible to generate Inst variants that match the ISA.
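
As a rough illustration of the templating idea, the sketch below treats "<hole> xmm, xmm/m128" as a shape whose hole is filled by a list of mnemonics, producing one variant per ISA instruction. The macro xmm_rm_insts!, the types Xmm, XmmMem, and XmmInst, and the textual rendering are all hypothetical and are not the x64 backend's actual definitions; encoding is left out entirely.

```rust
use std::fmt;

// Hypothetical operand types for the shape `<hole> xmm, xmm/m128`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Xmm(pub u8);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum XmmMem {
    Reg(Xmm),
    // A deliberately simplified 128-bit memory operand: base register + offset.
    Mem128 { base_gpr: u8, disp: i32 },
}

impl fmt::Display for Xmm {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "xmm{}", self.0)
    }
}

impl fmt::Display for XmmMem {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            XmmMem::Reg(r) => write!(f, "{}", r),
            // `{:+}` always prints the sign of the displacement.
            XmmMem::Mem128 { base_gpr, disp } => write!(f, "[r{}{:+}]", base_gpr, disp),
        }
    }
}

// Fills the `<hole>` with each listed name: one enum variant per instruction,
// all sharing the same `xmm, xmm/m128` operand shape.
macro_rules! xmm_rm_insts {
    ($( $name:ident ),* $(,)?) => {
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub enum XmmInst {
            $( $name { dst: Xmm, src: XmmMem } ),*
        }

        impl XmmInst {
            /// Renders the instruction as text, using the lowercased variant
            /// name as the mnemonic.
            pub fn render(&self) -> String {
                match self {
                    $( XmmInst::$name { dst, src } => format!(
                        "{} {}, {}",
                        stringify!($name).to_lowercase(),
                        dst,
                        src
                    ), )*
                }
            }
        }
    };
}

xmm_rm_insts!(Addps, Mulps, Pxor);

/// Builds one instruction from the generated enum and renders it.
pub fn demo() -> String {
    XmmInst::Addps { dst: Xmm(1), src: XmmMem::Reg(Xmm(2)) }.render()
}
```

Adding another instruction of the same shape is then a one-word change to the macro invocation, which is the kind of ISA-aligned variant generation the comment above suggests.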

Last updated: Apr 17 2025 at 21:03 UTC