Pro tips from an SSA hacker.<p>- Ignore the SSA conversion algorithms that don’t use dominance frontiers. Also ignore the dominator tree generation algorithms that aren’t Langauer-Trajan. Reason: you’ll want to have a dominator tree (ie immediate dominators) anyway for a bunch of SSA optimizations. So, you’ll want Langauer-Tarjan. And if you have that, then using dominance frontiers to generate SSA is just easier.<p>- CPS isn’t really anything like SSA. Don’t believe that hype. Any claim in PL of two languages or forms being equivalent is not falsifiable because any two Turing complete languages are going to have some transformation between them, so really these papers are just saying “hey look I invented a translation that we all knew was going to be possible and I like my translation for aesthetic reasons”. Working with CPS is nothing like working with SSA.<p>- The best way I know of representing Phi in your SSA IR is a Pizlo special. I haven’t written it up except by implementing it (JavaScriptCore uses Pizlo SSA). Here’s the idea. Phi takes no arguments (no block arguments and no value arguments). Each Phi has a shadow variable (in addition to the implicit SSA variable it assigns to). Phi just loads the value out of its shadow variable and assigns it to its implicit variable. Then I have a second instruction called Upsilon that takes two args: an input variable and a Phi. It loads the input variable’s implicit value (just as any SSA use would) and stores it to the Phi’s shadow variable. The result of using this IR is that you have zero coupling between your CFG and the SSA graph. It makes CFG transforms much much easier to write. It also means much less special casing of Phi, in practice.
I worked on compilers at Fortify which does static code analysis, and it's always surprised me SSA does not come up more in context about how to write clean, maintainable, and secure code.<p>SSA makes auditing code for dataflow issues much easier - both for people and programs. If you learn how to transform your code to SSA, you might find yourself choosing to write code this way so that you know, exactly, where that input to whatever sensitive string you are building came from. You can more easily trace the lifetime of that input and see what transformations it has been through already.<p>I feel like I am the only person in the world talking about SSA outside of compilers! It is useful for humans as well, and we all want to write readable and secure code.
I'll add my own way of constructing SSA, which is mostly untested but worked fine in my compiler: <a href="https://paulbiggar.com/research/fit-2009.pdf" rel="nofollow">https://paulbiggar.com/research/fit-2009.pdf</a>
This looks helpful. I would love to find a simple method of generating SSA that can deal with goto statements.<p>A paper that was suggested to me (not mentioned in the linked blog post) is:<p>"Practical Improvements to the Construction and Destruction of Static Single Assignment Form",
by Preston Briggs, Keith D. Cooper, Timothy J. Harvey, I. Taylor Simpson
all at Rice University
SOFTWARE—PRACTICE AND EXPERIENCE, VOL. 28(8), 1–28 (July 1998)
Can anyone suggest simple to implement algorithms for SSA destruction? I find Cytron's destruction easy (paired with copy propagation), but the more recent ones are difficult to implement directly from the papers.
I don't remember the exact details[0] but the AIs were saying Destination Driven Code Generation is a good (and simple) algorithm for lowering to SSA form. Something about it naturally produces a CFG or the dominance frontier or IDK, I filed it away as something to look into later.<p>It does make sense because it can produce reasonably efficient machine code directly from an AST so it is doing a lot of work a naive SSA lowering algorithm leaves for later stages.<p>[0] leave this as an exercise for the reader, we aren't on speaking terms lately.