Stream: cranelift

Topic: destructive operands with isle


Sam Parker (Dec 07 2021 at 16:15):

Hi!
I'm really struggling to write an ISLE isel pattern for an instruction with a destructive operand. I want one of my input registers to be reused as the output. After wrestling with how to even try to express this in something that compiles, both my attempts fail to tie the input to the register. Has anyone done this before and could point me at an example? cheers!

Chris Fallin (Dec 07 2021 at 17:06):

Hi @Sam Parker -- sorry to hear this is causing difficulties! The system is actually designed to not allow this, in anticipation of SSA-based VCode. The solution is something we invented called "move mitosis". The basic idea is that all instructions have three-operand forms (aarch64 already almost always has that) with a separate destination, and the ISLE lowering can produce arbitrary destinations. If a particular instruction needs a separate pre-move to turn its separate-dest form into a mutate-in-place form, then the "move mitosis" trait method can emit that move and edit self to have src2 == dst or whatnot. Then in emission we assert that. More details are in x64/inst/mod.rs (grep mov_mitosis), which uses this pattern pervasively.
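
A minimal sketch of the idea described above, for readers who want to see its shape. Everything here is hypothetical: `Reg`, `Inst`, `emit` and `lower_and_emit` are toy stand-ins, not Cranelift's real types, and the choice to tie `src1` (rather than `src2`) to `dst` is arbitrary. Only the method name `mov_mitosis` is borrowed from the message above; its real signature in x64/inst/mod.rs may differ.

```rust
/// Hypothetical register newtype, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Reg(pub u32);

/// Hypothetical toy instruction set; not Cranelift's real `Inst` type.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Inst {
    /// dst = src
    Mov { dst: Reg, src: Reg },
    /// dst = src1 + src2, as lowering produces it (separate destination).
    /// The toy machine encoding is two-operand, so emission needs src1 == dst.
    Add { dst: Reg, src1: Reg, src2: Reg },
}

impl Inst {
    /// Sketch of the "move mitosis" step: if the instruction is still in
    /// separate-destination form, return the pre-move to place before it
    /// and rewrite `self` into mutate-in-place form.
    pub fn mov_mitosis(&mut self) -> Option<Inst> {
        if let Inst::Add { dst, src1, src2 } = self {
            if *src1 != *dst {
                // Assumption: `dst` is a fresh SSA value, so the pre-move
                // cannot clobber the other source operand.
                assert_ne!(*dst, *src2, "pre-move would overwrite src2");
                let pre_move = Inst::Mov { dst: *dst, src: *src1 };
                *src1 = *dst;
                return Some(pre_move);
            }
        }
        None
    }

    /// Toy emission to assembly-like text. It asserts the tied-operand
    /// invariant instead of trying to repair it.
    pub fn emit(&self) -> String {
        match self {
            Inst::Mov { dst, src } => format!("mov r{}, r{}", dst.0, src.0),
            Inst::Add { dst, src1, src2 } => {
                assert_eq!(dst, src1, "Add must be in mutate-in-place form");
                format!("add r{}, r{}", dst.0, src2.0)
            }
        }
    }
}

/// Run mitosis, then emit the optional pre-move followed by the instruction.
pub fn lower_and_emit(mut inst: Inst) -> Vec<String> {
    let mut out = Vec::new();
    if let Some(pre_move) = inst.mov_mitosis() {
        out.push(pre_move.emit());
    }
    out.push(inst.emit());
    out
}
```

With this toy model, `Add { dst: r2, src1: r0, src2: r1 }` becomes `mov r2, r0` followed by `add r2, r1`, while an `Add` whose `dst` already equals `src1` produces no extra move.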

fitzgen (he/him) (Dec 07 2021 at 22:32):

I wrote some docs on this stuff here: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/docs/isle-integration.md#lowering-rules-are-always-pure-use-ssa

let me know if there are things that doesn't answer for you and we can update it :)

Sam Parker (Dec 08 2021 at 09:16):

Excellent, thanks both. I had assumed SSA was the issue, so that makes sense. Would regalloc2 enable us to express these register constraints more directly?

Sam Parker (Dec 08 2021 at 09:25):

Okay, 'move mitosis will eventually go away completely once we switch over to regalloc2' has answered that one!

Anton Kirilov (Dec 14 2021 at 10:39):

Chris Fallin said:

... aarch64 already almost always has that...

Unfortunately that is not really the case with the SVE instructions: we frequently need to fit two 5-bit source vector register fields and a 3-bit predicate register field into a 32-bit instruction, and as a result there is no space left for an explicit destination.

Anton Kirilov (Dec 14 2021 at 11:00):

SVE introduces a new move instruction variant called MOVPRFX to ameliorate this - compared to other moves, it has the constraint that it must precede an instruction that can be prefixed in that way (usually a destructive operation); on the other hand it provides a very strong microarchitectural hint that it can be fused with the subsequent instruction. It looks like the "move mitosis" mechanism will allow us to generate MOVPRFX instructions as needed.

Anton Kirilov (Dec 14 2021 at 11:02):

I think our other option is to treat destructive operations as constructive throughout the backend, and then generate an optional prefix at emission time if the register allocator hasn't picked the same destination as the source.
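
A rough sketch of that alternative, assuming a toy model of a predicated SVE add. The types `ZReg`, `PReg`, `PredAdd` and the function `emit_pred_add` are hypothetical and emit assembly text rather than machine code. The operand swap for the case where the destination aliases the second source is an addition of this sketch; it relies on add being commutative and on the rule that the MOVPRFX destination must not also appear as another source of the prefixed instruction.

```rust
/// Hypothetical SVE vector register (z0..z31), for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ZReg(pub u8);

/// Hypothetical SVE predicate register, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PReg(pub u8);

/// Constructive (separate-destination) form of a predicated add, kept
/// unchanged through lowering and register allocation in this sketch.
#[derive(Clone, Copy, Debug)]
pub struct PredAdd {
    pub dst: ZReg,
    pub pred: PReg,
    pub src1: ZReg,
    pub src2: ZReg,
}

/// Emit the destructive machine form, adding a MOVPRFX prefix only when
/// the allocator did not assign `dst` to the same register as a source.
pub fn emit_pred_add(inst: &PredAdd) -> Vec<String> {
    let mut out = Vec::new();
    // `other` is the operand that is not tied to the destination.
    let other = if inst.dst == inst.src1 {
        // Already destructive in src1: no prefix needed.
        inst.src2
    } else if inst.dst == inst.src2 {
        // Add is commutative, so tie src2 to the destination instead.
        inst.src1
    } else {
        // Copy src1 into dst with a prefix, then operate in place.
        out.push(format!("movprfx z{}, z{}", inst.dst.0, inst.src1.0));
        inst.src2
    };
    out.push(format!(
        "add z{d}.s, p{p}/m, z{d}.s, z{m}.s",
        d = inst.dst.0,
        p = inst.pred.0,
        m = other.0
    ));
    out
}
```

In this model the prefix appears only when all three vector registers differ, so its cost depends entirely on what the register allocator happened to choose.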

Chris Fallin (Dec 14 2021 at 19:07):

@Anton Kirilov interesting! Yeah, I think taking an approach like the x64 backend's here is the right choice, as opposed to always emitting a prefix: the former ("move mitosis") lets us hint to the regalloc, eventually with regalloc2 at least, that with the right assignments the move could be elided.


Last updated: Nov 22 2024 at 17:03 UTC