abrown opened Issue #2252:
As mentioned by @cfallin in https://github.com/bytecodealliance/wasmtime/pull/2248#discussion_r497853577, there is a potential abstraction leak re:
Inst
. The way I see it,Inst
was originally designed to have variants that abstracted common classes of instructions (e.g.div
) but eventually grew variants that more closely matched x64 encoding formats (e.g.unary_rm_r
). NowInst
contains both kinds--abstract instructions and encoding formats--and this causes confusion (e.g. which format am I supposed to use here? Or is there an instruction that covers this?) and potential bugs (e.g. handing an opcode to a format that should never encode that opcode as @bnjbvr has pointed out). This problem will only be exacerbated by adding other types of encoding formats like VEX and EVEX.I see several possible directions:
- refactor
Inst
to only contain abstract instructions- refactor
Inst
to only contain x64 encoding formats and figure out some way to limit opcodes to certain formats (this is tricky because the same opcode can be used in multiple formats)- refactor
Inst
to contain only x64 instructions, perhaps parameterized; I have mentioned before that I feel that the level of abstraction for lowering is at the machine code instruction (like in v8)Thoughts?
abrown commented on Issue #2252:
cc: @jlb6740, @julian-seward1
bnjbvr commented on Issue #2252:
Yeah, I think a refactoring there is overdue. In my opinion, the way forward would be to have different Inst each time we need to, that is when:
- the Inst format (condition codes, etc.) doesn't match any available in the Inst list
- the given Inst is specialized for a list of given register classes inputs and outputs
- or the given Inst has "meta" semantics (nominal SP adjustment, soon the uninitialized Xmm)
If an Vcode Inst doesn't have meta semantics, and there's another inst with the same format and input/output register classes, then a new Inst is not required.
Re: naming. I followed the initial choices from @julian-seward1 to have the Inst be named after their operands format, but we should rethink about this, and follow aarch64's example here of naming the instructions after what they do, not how they're defined at a low-level.
Re: the possibility of creating impossible inst: the
SseOpcode
enum is useful but too wide; instead we should have a subset ofSseOpcode
s for each Inst ctor, so that it becomes a compile error instead of a dynamic panic (or worse, a silent runtime error) to use the wrong opcode for a given inst. We can either have different enums that convert down toSseOpcode
, or just one enum perInst
that's handled down to the bottom. I think the latter is preferable, since having enums that convert toSseOpcode
would still make it possible to miss cases when emitting the machine instructions.Re: ergonomics. We should go down the road of higher level assembly instructions, like it's been done for Inst::load/store etc, and use this pattern more. Maybe not all the time, but it's immensely useful to do so when the instruction is being constructed many times in many places, in my opinion.
bnjbvr labeled Issue #2252:
As mentioned by @cfallin in https://github.com/bytecodealliance/wasmtime/pull/2248#discussion_r497853577, there is a potential abstraction leak re:
Inst
. The way I see it,Inst
was originally designed to have variants that abstracted common classes of instructions (e.g.div
) but eventually grew variants that more closely matched x64 encoding formats (e.g.unary_rm_r
). NowInst
contains both kinds--abstract instructions and encoding formats--and this causes confusion (e.g. which format am I supposed to use here? Or is there an instruction that covers this?) and potential bugs (e.g. handing an opcode to a format that should never encode that opcode as @bnjbvr has pointed out). This problem will only be exacerbated by adding other types of encoding formats like VEX and EVEX.I see several possible directions:
- refactor
Inst
to only contain abstract instructions- refactor
Inst
to only contain x64 encoding formats and figure out some way to limit opcodes to certain formats (this is tricky because the same opcode can be used in multiple formats)- refactor
Inst
to contain only x64 instructions, perhaps parameterized; I have mentioned before that I feel that the level of abstraction for lowering is at the machine code instruction (like in v8)Thoughts?
bjorn3 commented on Issue #2252:
Was this done?
cfallin commented on Issue #2252:
I think we still could potentially clean up / orthogonalize things -- we haven't directly addressed this yet, AFAIK.
Last updated: Dec 23 2024 at 12:05 UTC