Hi!
I'm really struggling to write an ISLE isel pattern for an instruction with a destructive operand. I want one of my input registers to be reused as the output. After wrestling with how to even try to express this in something that compiles, both my attempts fail to tie the input to the register. Has anyone done this before and could point me at an example? cheers!
Hi @Sam Parker -- sorry to hear this is causing difficulties! The system is actually designed to not allow this, in anticipation of SSA-based VCode. The solution is via something we invented called "move mitosis". The basic idea is that all instructions have three-operand forms (aarch64 already almost always has that) with a separate destination, and the ISLE lowering can produce arbitrary destinations. If a particular instruction needs to emit a separate pre-move to turn a separate-dest form into a mutate-in-place form, then the "move mitosis" trait method can emit that move, and edit `self` to have `src2 == dst` or whatnot. Then in emission we assert that. More details in x64/inst/mod.rs (grep `mov_mitosis`), which pervasively uses this pattern.
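For anyone reading along later, here is a minimal, self-contained sketch of the idea described above. It is a toy model, not Cranelift's actual code: `Reg`, `Inst`, `mov_mitosis_sketch`, and `emit` are hypothetical names invented for illustration, and the real trait method in the x64 backend has a different shape. It assumes the instruction's hardware form overwrites its first source, and that the destination is a fresh SSA virtual register (so it never aliases the second source).

```rust
/// Hypothetical virtual register, identified by an index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Reg(u32);

/// Toy instruction set (an assumption for this sketch, not Cranelift's types).
#[derive(Clone, Debug, PartialEq, Eq)]
enum Inst {
    Mov { dst: Reg, src: Reg },
    /// Three-operand form as produced by lowering. The hardware encoding
    /// is destructive: it computes `dst -= src2` and requires `dst == src1`.
    Sub { dst: Reg, src1: Reg, src2: Reg },
}

/// Split one separate-destination instruction into a pre-move plus a
/// mutate-in-place instruction, when the operands are not already tied.
fn mov_mitosis_sketch(inst: Inst) -> Vec<Inst> {
    match inst {
        Inst::Sub { dst, src1, src2 } if dst != src1 => {
            // Assumption: `dst` is a fresh SSA vreg, so the pre-move
            // cannot clobber the second source.
            assert_ne!(dst, src2, "pre-move would overwrite src2");
            vec![
                Inst::Mov { dst, src: src1 },
                // Rewrite the instruction so that src1 == dst.
                Inst::Sub { dst, src1: dst, src2 },
            ]
        }
        // Already tied, or not a destructive instruction: leave as-is.
        other => vec![other],
    }
}

/// Emission asserts the tied-operand invariant instead of fixing it up.
fn emit(inst: &Inst) -> String {
    match inst {
        Inst::Mov { dst, src } => format!("mov r{}, r{}", dst.0, src.0),
        Inst::Sub { dst, src1, src2 } => {
            assert_eq!(dst, src1, "destructive operand must be tied to dst");
            format!("sub r{}, r{}", dst.0, src2.0)
        }
    }
}
```

In this sketch, running `mov_mitosis_sketch` on `Sub { dst: Reg(2), src1: Reg(0), src2: Reg(1) }` yields a `Mov` into `Reg(2)` followed by a `Sub` whose first source is `Reg(2)`, and `emit` accepts both. If the register allocator later assigns both registers of the move to the same physical register, the move becomes a no-op that can be dropped.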
I wrote some docs on this stuff here: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/docs/isle-integration.md#lowering-rules-are-always-pure-use-ssa
let me know if there are things that doesn't answer for you and we can update it :)
Excellent, thanks both. I had assumed SSA was the issue, and that makes sense. Would regalloc2 enable us to express these register constraints more directly?
Okay, 'move mitosis will eventually go away completely once we switch over to regalloc2' has answered that one!
Chris Fallin said:
... aarch64 already almost always has that...
Unfortunately that is not really the case with the SVE instructions - we frequently need to fit 2 5-bit source vector register fields and a 3-bit predicate register field into a 32-bit instruction, and as a result there is no space left for an explicit destination.
SVE introduces a new move instruction variant called MOVPRFX to ameliorate this - compared to other moves, it has the constraint that it must precede an instruction that can be prefixed in that way (usually a destructive operation); on the other hand it provides a very strong microarchitectural hint that it can be fused with the subsequent instruction. It looks like the "move mitosis" mechanism will allow us to generate MOVPRFX instructions as needed.
I think our other option is to treat destructive operations as constructive throughout the backend, and then generate an optional prefix at emission time if the register allocator hasn't picked the same destination as the source.
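As a rough illustration of that second option, the sketch below keeps a constructive (separate-destination) form all the way to emission and only prepends a MOVPRFX when the allocated destination differs from the first source. Everything here is a toy assumption: `ZReg`, `PReg`, `SveAdd`, and `emit_sve_add` are hypothetical names, the output is assembly text rather than machine code, and only a predicated 32-bit-lane `add` is modeled.

```rust
/// Hypothetical SVE vector register (z0..z31).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ZReg(u8);

/// Hypothetical SVE governing predicate register (p0..p7).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PReg(u8);

/// Constructive view of a predicated add: `dst = src1 + src2` in active lanes.
/// The real encoding is destructive, with the destination doubling as src1.
struct SveAdd {
    dst: ZReg,
    pred: PReg,
    src1: ZReg,
    src2: ZReg,
}

/// Emit assembly text, adding a MOVPRFX prefix only when it is needed.
fn emit_sve_add(inst: &SveAdd) -> Vec<String> {
    let mut out = Vec::new();
    if inst.dst != inst.src1 {
        // The prefixed destination must not also be the other source;
        // this sketch asserts that rather than handling the case.
        assert_ne!(inst.dst, inst.src2, "dst aliases src2; cannot prefix");
        out.push(format!("movprfx z{}, z{}", inst.dst.0, inst.src1.0));
    }
    // Destructive form: the destination register is also the first source.
    out.push(format!(
        "add z{d}.s, p{p}/m, z{d}.s, z{s}.s",
        d = inst.dst.0,
        p = inst.pred.0,
        s = inst.src2.0
    ));
    out
}
```

When the allocator happens to pick `dst == src1`, this emits just the `add`; otherwise it emits `movprfx` followed by the `add`. The trade-off is that the register allocator never sees the copy, so it cannot be steered toward assignments that make the prefix unnecessary.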
@Anton Kirilov interesting! Yeah I think taking an approach like the x64 backend's here is the right choice, as opposed to always emitting a prefix: the former ("move mitosis") lets us hint to the regalloc, eventually with regalloc2 at least, that with the right assignments the move could be elided
Last updated: Nov 22 2024 at 17:03 UTC