It seems to me that the entire literature about SSA IRs is using phi instructions instead of block parameters. Both representations are equivalent to each other, however, I really wonder, given that "the entire rest of the world" is using phi instructions why has Cranelift chosen to use block parameters instead?
When I was taking a look into cranelift I started to feel it might be because translation from Wasm to SSA IR is simpler with block parameters because Wasm already provides block signatures out of the box. For phi instructions it is a bit harder to map since there is no particular ordering as is with block parameters.
Is there some person who might know a bit more about the rationals behind this design decision?
I would really appreciate an answer or some clues. :)
The original idea is that phi instructions aren't really instructions, in some key ways. Their uses don't happen inside the block; they happen on the incoming edges. Block parameters, along with putting the arguments on the jump instructions, mean that the uses of those arguments happen exactly where they appear to.
It's generally easier to understand block params.
IIRC, LLVM folks have said that if they knew about block parameters back then, they would have used them instead of Phis.
Thanks a lot @Dan Gohman and @fitzgen (he/him) for this information! I didn't know that. For a long time I was wondering about pros and cons between phi instructions and block parameters and over the course of some discussions I dug out people seemed to be more in favor of phi instructions but now I start to feel that this might not be the full story. Due to this information I built my own SSA IR upon phi instructions but now I wonder if this was a backwards decision ... :S
Block parameters truly are simpler imo. It is super interesting to know that LLVM folks even would have picked block parameters over phi instructions nowadays.
One of the things we learned along the way is that the way Cranelift does block parameters creates some tricky corner cases around critical edges.
The operands on a conditional branch apply to both destinations, and Cranelift currently requires the number of arguments match between every branch and its destination. So if you add a block parameter to a block, you then add an argument to the incoming branch, and if it's conditional, you then have to add a block parameter to the other destination. And if that edge is also critical, then you have to keep iterating to keep things consistent.
I have seen some tricky code around so-called "side effects" if you are referring to those. I have to out myself as a non-compiler expert so I really learned a ton while just looking dumbly at the super cool Cranelift and Wasmtime codebase.
It's related, yeah. Cranelift early on chose to not allow jump tables to jump to blocks with block parameters.
This means it has to split critical edges on the fly during SSA construction.
They are equivalent, so not a big deal.
Removing params/phis (e.g. because the value is never used) is interesting:
However, there is a dual problem when removing a whole (never jumped to) block:
removing a block means you need to know all phis that reference values from that block and remove those values
from the phi's arguments
block params won't have any jumps to that block, so nothing to fixup
In the graphics shader IR I'm currently designing for the Kazan Vulkan driver, it uses block parameters where every CFG edge has individual arguments for that edge, so a switch instruction has a separate set of parameters for every case
so it doesn't have that issue with critical edges
Here's what I did In my (somewhat Cranelift-inspired) IR:
This works out nicely because it avoids a layer of indirection in the ValueData
to check whether a value refers to an instruction or block parameter. Also finding the source value of a phi is easy since there is only one possible terminator (unconditional jump) allowed in the predecessors.
Currently trying to digest all this amazing information. :yum:
@fitzgen (he/him) What you said about the duality between phi instructions and block parameters sounds super cool.
I wonder if it might make sense to use for example phi instructions during SSA construction and at the end of construction convert those in a single sweep into block parameters? :thinking:
@Jacob Lifshay Do I understand correctly that the problem with block parameters is when there is the same set of block parameters between 2 blocks that could span over multiple edges between those blocks (for example with br_table
) and you are breaking this problem up by "simply" using a set of parameters per real edge?
Is this what people are referring to with "critical edge splitting"? Sorry! Will definitely have to look up stuff. :sweat_smile:
So if I understood correctly the way how Cranelift IR works with multiple branch instructions at the end of a block having only parameters between blocks instead of per edge causes trouble. And if an SSA IR only has a single terminate instruction (without support for br_table
) then there are no problems, right?
@Amanieu I wonder: How are you mapping the ordering between phi instructions and block parameter indices? That's exactly what is missing from the puzzle in my head and what is missing from my SSA IR. I found out about this when inspecting the line starting here https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/wasm/src/code_translator.rs#L400.
/// A `Phi` represents an "argument" to a basic block coming from a
/// predecessor block. The actual values for all `Phi`s in a basic block are
/// passed as a list of values to the `Jump` instructions that point to the
/// block.
///
/// All `Phi` instructions *must* be at the start of the basic block. There
/// cannot be any other instructions mixed with the `Phi` instructions.
///
/// A `Phi` instruction is tied to the basic block that it is located in:
/// it cannot be moved to another block, duplicated or removed without also
/// adjusting all of the `Jump` instructions that point to that block.
Phi { index: usize },
Basically the phi instruction contains its own index and must be at that index in the basic block.
If your IR supports multiple results per instruction then you might be able to get away with just a single phi as the first instruction with N results.
If your IR supports multiple results per instruction then you might be able to get away with just a single phi as the first instruction with N results.
It actually does but I do not understand how this solves the problem of having no mapping between phi instructions and block parameters. Sorry for all my non-expertise with compilers. :see_no_evil:
Oh wait do you mean that I merge all my phi instructions into one monster-phi instruction and therefore no longer need a mapping between phi instructions and block parameters since I simply use the tuple indices of the phi instruction operands? :O
Yes, essentially.
Thank you all for all the super valuable information! It really helped me a lot! :hug:
Hero Bird said:
Jacob Lifshay Do I understand correctly that the problem with block parameters is when there is the same set of block parameters between 2 blocks that could span over multiple edges between those blocks (for example with
br_table
) and you are breaking this problem up by "simply" using a set of parameters per real edge?
Yes, afaict.
Is this what people are referring to with "critical edge splitting"? Sorry! Will definitely have to look up stuff. :sweat_smile:
No, critical edge splitting is where a new empty basic block is inserted in all CFG edges where both the edge's source block has multiple target edges (e.g. br_table) and the edge's target block has multiple source edges.
This process transforms the CFG to a form where every edge has a unique block where instructions can be inserted that are executed when and only when that edge is executed -- basically allowing you to insert instructions in edges.
So if I understood correctly the way how Cranelift IR works with multiple branch instructions at the end of a block having only parameters between blocks instead of per edge causes trouble.
Yup.
And if an SSA IR only has a single terminate instruction (without support for
br_table
) then there are no problems, right?
Problems are caused whenever a block-ending branch instruction has multiple target blocks that need to share block parameters, hence why I designed Kazan's IR to always have separate parameter sets for each target block.
Jacob Lifshay 20:54
In the graphics shader IR I'm currently designing for the Kazan Vulkan driver, it uses block parameters where every CFG edge has individual arguments for that edge, so a switch instruction has a separate set of parameters for every case
I find this design very elegant and wonder if there are any non-obvious (or obvious) drawbacks when having different parameter sets per branch edge instead of per block edge. I am currently thinking about adjusting my own SSA IR towards this style using block parameters instead of phi instructions. It seriously is more expressive than using phi instructions or block parameters that only allow the same set of parameters between blocks which may lead to sometimes requiring fewer blocks. However, maybe it has some hidden downsides? :thinking:
well, I haven't gotten far enough along in writing the shader compiler to notice any drawbacks other than the obvious storing more stuff in the IR increases IR memory usage...
Cranelift early on chose to not allow jump tables to jump to blocks with block parameters.
Do you remember why this was chosen? Would people choose this design again with all the knowledge they gained over time today?
I am currently using this paper (https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf) for the SSA construction. I found it to be intuitive for IRs with phi instructions. When using block parameters where conditional jumps and branch tables allow for different sets of branch arguments per edge (so you have multiple edges between a pair of blocks) I begin to believe that this algorithm is no longer applicable. :upside_down:
That paper is exactly what cranelift_frontend uses for SSA construction. https://github.com/bytecodealliance/wasmtime/blob/e41d88214455987751997b9d6173c5dee93204da/cranelift/frontend/src/ssa.rs#L4 (It does link to a different place, but it is the same paper)
@bjorn3 Yes I was aware of this but I was worried that having multiple edges might break certain assumptions. Fortunately they don't.
I was able to rewrite my SSA IR from phi instructions to block parameters. The new implementation is still very fresh and uses multiple edges between blocks to avoid critical edge splitting.
The PR for anyone who is interested: https://github.com/Robbepop/runwell/pull/11
Thank you all for the help! I would have not been able to pull this off without you. :blush:
Last updated: Dec 23 2024 at 12:05 UTC