đź“–Linear scan register allocation on SSA form
. Last modified:
permalink - authors
- Wimmer, Christian and Franz, Michael
- year
- 2010
- p.170 Linear-scan performance is close to linear, so is often used in JIT compilers. On the other hand, graph coloring algorithms produce better results but are slower, so used where compiler can spend extra time.
- p.170 The interference graph of a program in SSA form is chordal (triangulated structure). This simplifies many algorithms (in particular, graph coloring is polynomial on chordal graphs)
- p.171 Register allocation is one of the last global optimization, so moving out of SSA after register allocation is reasonable as SSA is not beneficial after that.
- p.172 If definition and use are in different blocks, this means that definition block dominates the use block. (Because definitions dominate all uses.)
- p.173 The algorithm needs linear block order where (1) all dominators of a block are before this block, and (2) all blocks belonging to the same loop are contiguous. The algorithm can work for other orders but this gives best performance and less gaps in lifetimes.
p.173 lifetime algorithm.
inputs:
- SSA form
- linear block order (described above)
outputs:
- for each virtual register: intervals with holes where it is active (i.e. multiple ranges per register)
state:
liveIn: set of virtual registers per block
algorithm:
- loop through all blocks in reverse order
- initialize
liveInto the union of allliveInsets of all successors of the block. also add inputs of successors’ phi functions that correspond to current block - for all live registers, add range from start of the block to end of the block
- loop through the operations in the block in reverse order:
- for uses: add register to live set, add range from start of the block to the use (unioning/merging with existing range)
- for defs: shorten the first liveness range to start from current position, remove it from live set
- for each phi function at start of the block, remove it from the live set
- if the block is a loop header (has incoming backward edge):
- for each live register: add range from start of current block to end of the last block in the loop (i.e. registers that are live at the start of the header must live throughout the loop)
- initialize
- p.174 the algorithm does not work with irreducible loops (loops with multiple entry points). For irreducible loops, the two options are:
- perform precise analysis of the loop and union all live variables at all loop headers — they must be live throughout the loop
- or, make sure that
liveInset of all loop headers is empty. this can be done by inserting phi functions at each header that “copies” values for use inside the loop
- p.174 the liveness algorithm can be modified to produce interference graph instead (to be used for coloring). To do that, when a definition is encountered, an interference edge must be add to all currently live registers. (Definitions are enough because SSA guarantees that if two registers interfere somewhere, they also interfere at definition of one of the registers.) For loops, register that are live at the loop header, interfere with all registers defined in the loop
- p.175 When the original linear scan algorithm was modified to allow holes, this introduced two bits that make it non-linear (finding intersections of current with other intervals). For SSA form, there is an optimization that allows skipping these intersections unless the current interval has been split (spilled) because splitting destroys SSA.
- p.177 “The elimination of interval intersection checks described in Section 5 does not gain a measurable speedup.”
Backlinks
- 📝 § SSA