wasmtime / Issue #2252 `Inst` abstraction in x64 backend · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / Issue #2252 `Inst` abstraction in x64 backend

Wasmtime GitHub notifications bot (Oct 01 2020 at 19:05):

As mentioned by @cfallin in https://github.com/bytecodealliance/wasmtime/pull/2248#discussion_r497853577, there is a potential abstraction leak re: Inst. The way I see it, Inst was originally designed to have variants that abstracted common classes of instructions (e.g. div) but eventually grew variants that more closely matched x64 encoding formats (e.g. unary_rm_r). Now Inst contains both kinds--abstract instructions and encoding formats--and this causes confusion (e.g. which format am I supposed to use here? Or is there an instruction that covers this?) and potential bugs (e.g. handing an opcode to a format that should never encode that opcode as @bnjbvr has pointed out). This problem will only be exacerbated by adding other types of encoding formats like VEX and EVEX.

I see several possible directions:

refactor Inst to only contain abstract instructions

refactor Inst to only contain x64 encoding formats and figure out some way to limit opcodes to certain formats (this is tricky because the same opcode can be used in multiple formats)

refactor Inst to contain only x64 instructions, perhaps parameterized; I have mentioned before that I feel that the level of abstraction for lowering is at the machine code instruction (like in v8)

Thoughts?

Wasmtime GitHub notifications bot (Oct 01 2020 at 19:06):

abrown commented on Issue #2252:

cc: @jlb6740, @julian-seward1

Wasmtime GitHub notifications bot (Oct 02 2020 at 09:33):

bnjbvr commented on Issue #2252:

Yeah, I think a refactoring there is overdue. In my opinion, the way forward would be to have different Inst each time we need to, that is when:

the Inst format (condition codes, etc.) doesn't match any available in the Inst list

the given Inst is specialized for a list of given register classes inputs and outputs

or the given Inst has "meta" semantics (nominal SP adjustment, soon the uninitialized Xmm)

If an Vcode Inst doesn't have meta semantics, and there's another inst with the same format and input/output register classes, then a new Inst is not required.

Re: naming. I followed the initial choices from @julian-seward1 to have the Inst be named after their operands format, but we should rethink about this, and follow aarch64's example here of naming the instructions after what they do, not how they're defined at a low-level.

Re: the possibility of creating impossible inst: the SseOpcode enum is useful but too wide; instead we should have a subset of SseOpcodes for each Inst ctor, so that it becomes a compile error instead of a dynamic panic (or worse, a silent runtime error) to use the wrong opcode for a given inst. We can either have different enums that convert down to SseOpcode, or just one enum per Inst that's handled down to the bottom. I think the latter is preferable, since having enums that convert to SseOpcode would still make it possible to miss cases when emitting the machine instructions.

Re: ergonomics. We should go down the road of higher level assembly instructions, like it's been done for Inst::load/store etc, and use this pattern more. Maybe not all the time, but it's immensely useful to do so when the instruction is being constructed many times in many places, in my opinion.

Wasmtime GitHub notifications bot (Oct 02 2020 at 11:25):

bnjbvr labeled Issue #2252:

As mentioned by @cfallin in https://github.com/bytecodealliance/wasmtime/pull/2248#discussion_r497853577, there is a potential abstraction leak re: Inst. The way I see it, Inst was originally designed to have variants that abstracted common classes of instructions (e.g. div) but eventually grew variants that more closely matched x64 encoding formats (e.g. unary_rm_r). Now Inst contains both kinds--abstract instructions and encoding formats--and this causes confusion (e.g. which format am I supposed to use here? Or is there an instruction that covers this?) and potential bugs (e.g. handing an opcode to a format that should never encode that opcode as @bnjbvr has pointed out). This problem will only be exacerbated by adding other types of encoding formats like VEX and EVEX.

I see several possible directions:

refactor Inst to only contain abstract instructions

refactor Inst to only contain x64 encoding formats and figure out some way to limit opcodes to certain formats (this is tricky because the same opcode can be used in multiple formats)

refactor Inst to contain only x64 instructions, perhaps parameterized; I have mentioned before that I feel that the level of abstraction for lowering is at the machine code instruction (like in v8)

Thoughts?

Wasmtime GitHub notifications bot (Mar 20 2021 at 09:21):

bjorn3 commented on Issue #2252:

Was this done?

Wasmtime GitHub notifications bot (Mar 20 2021 at 20:28):

cfallin commented on Issue #2252:

I think we still could potentially clean up / orthogonalize things -- we haven't directly addressed this yet, AFAIK.

Last updated: Jan 09 2026 at 13:15 UTC