Optimising Compilers Past Paper
Basic concepts
basic block
- a maximal sequence of instructions,
- each of which has exactly one predecessor (except possibly for the first) and
- exactly one successor (except possibly for the last);
- a largest-possible piece of straight-line code which contains no interesting control flow, with each instruction being executed in turn.
- useful because they reduce the space and time requirements for analysis algorithms.
- by allowing data flow info to be calculated and stored once per block rather than per instruction.
- and recomputed inside a block when necessary.
- y2006p8q8 (a)
flow graph (node is 3-address code)
- the control flow graph of basic blocks has the same topology as of instructions (modulo straight bits).
- y2006p8q8 (b)
Control-flow
Unreachable code elimination
Reachability, control-flow property: will the control reach here?
Call graph
if simplification
Data-flow analysis framework
Data-flow equations: relationship between values of that property immediately before in() and after out() each node n in a flow graph.
- y2017p9q10 (a)
- General framework, propagating information either forwards (AVAIL) or backwards (LVA) through the flowgraph.
- Operator: either the intersection ∩ (AVAIL) or union ∪ (LVA) is used, so as to under/over-approximate the property.
Let Position posi or posj be in(n) / out(n), the set of values setA(n) / setB(n) involved with the property immediately before / after the instruction. Depending on the property, one of the first or third equation with operator Op=⋃∣∣⋂ is used in the data-flow equations.
Nodes p
| pos = in(n)
Node n : setA(n) <=> setB(n) ...
| pos = out(n)
Nodes s
in(n)=posj(n)=out(n)={}∣∣Opp∈pred(n)out(p)(posi(n)∖setI(n))∪setJ(n), depends on the propertyOps∈succ(n)in(s)
|
∩ |
∪ |
pred |
AVAIL |
RD |
succ |
VBE |
LVA |
AVAILable expression analysis
- all paths to here include computation of it.
Reaching Definition (RD),
Write-write anomaly
Live Variable Analysis (LVA)
- some path from here leads to a use.
Very Busy Expression (VBE), Must-use
Loop invariant expression
Dominators analysis
More: Use-Definition chain
General
Problem in data-flow analysis
Live variable analysis (LVA)
Liveness, data-flow property: will the value be used later?
...
↑ in(n)
Node n : def(n) => ref(n) ...
↑ out(n)
Nodes s
in−live(n)=(out−live(n)∖def(n))∪ref(n)out−live(n)=s∈succ(n)⋃in−live(s)
For a variable x at program point p,
- semantic liveness
- some change to the value of x at p will cause the program to have different I/O execution results.
- not generally computable.
- syntactic liveness
- there is a path in the flowgraph (not necessarily followed in any execution) from p to a statement that reads the value of x.
- a computable over-approximation.
Past papers,
Data-flow anomalies
Dead code elimination
Definition, Reference, Un-definition
Clash or interference graph
Register allocation
- map virtual registers to limited physical registers,
- allocate them optimally, i.e. keep data transfers between registers and memory to a minimum.
- via a graph colouring problem, where
- variables are the nodes, the registers are colours and
- edges between variables indicate that both variables are live at one time during execution.
- limitation for JIT compiler: layered opt. for cold / hot code.
- NP-hard problem and thus slow with high complexity.
Past papers
Live range splitting
Static single assignment (SSA)
Available expression analysis (AVAIL)
Availability, data-flow property: will the value be assigned before?
Nodes p
↓ in(n)
Node n : gen(n) <= kill(n) ...
↓ out(n)
...
in−avail(n)={}∣∣p∈pred(n)⋂out−avail(p)out−avail(n)=(in−avail(n)∖kill(n))∪gen(n)
Redundancy elimination
Common sub-expression elimination (CSE)
Copy propagation
Code hoisting (VBE analysis)
Loop-invariant code motion (LICM)
Partial redundancy elimination (CSE+LICM)
Higher-level opt
Strength reduction
- replaces expensive operations with less expensive operations.
- Examples: replace x2 by x⋅x or 2x by right shift x≫1.
- array access
a[i]
involves a (hidden) multiplication as *(a + 4*i)
, assuming that integers have a size of 4.
- y2004p8q7 (a)
Abstract interpretation
Strictness analysis
- it is safe to pass an argument by value (CBV) when
- the function fails to terminate whenever the argument fails to terminate.
- ∀x.f#(x=⊥)=⊥.
- the abstract domain D#={⊥,⊤}, where ⊥=0 is guaranteed non-termination and ⊤=1 is possible termination.
- vs. neededness: the function will always evaluate its argument.
- neededness ⊆ strictness, since even if the parameter is not evaluated (needed), the termination behaviour of the function remains the same.
Past papers
Turn data-flow equations into strictness
Conversion among abstract interpretation, set-constraint, rule-based analysis
Constraint-based analysis
Set-constraint analysis
0CFA
- application: reordering or parallelisation.
- generate a tree and identify program points by labelling them,
- each program point with label i gets a flow variable αi.
Past papers
Turn data-flow equations into constraints
Conversion among abstract interpretation, set-constraint, rule-based analysis
Inference-based analysis
Rule-based analysis
Effect system
Conversion among abstract interpretation, set-constraint, rule-based analysis
Alias analysis
Points-to analysis (forward)
Anderson's analysis
Turn data-flow equations into constraints
Conversion among abstract interpretation, set-constraint, rule-based analysis
Instruction scheduling
target code level for different architectures
Register allocation vs instruction scheduling
Decompilation
Dominance
Optimisation
General opt
Phase ordering
General optimisations