CSCE 434 Lecture 36

« previous | Wednesday, November 20, 2013 | next »

Lecture Slides

Project 6 Details

T-registers are caller saves (useful for values that won't extend pass a function boundary)
S-registers are callee saves (useful for persistent values)

We are just supposed to use the T-registers

Data Flow Analysis

Vocabulary

basic block: a set of instructions such that if one executes, then they all execute; no way to jump in or out of the block
local optimizations: optimizations that are within a basic block
global optimizations: optimizations that are across basic blocks
interprocedural optimizations: procedural analysis; analyze whole program in terms of procedures / functions
control-flow graphs: graph whose vertices are basic blocks and edges are branches between basic blocks
data flow analysis: static analysis of run-time flow of values

Examples

Liveness: which variables are used later in the program; uses reverse flow (start at bottom/last usage and work backwards through program)
Reaching definitions: which declarations/assignments could be reached at this in a program; uses forward flow (start at top and work forwards through program)
Constant propagation: are there any variables that we can say for sure have a particular value
Meet Over all Paths (MOP): Ideally we use only execution paths that the program might actually take; For our purposes, we will assume all branches and basic blocks are possible execution paths

Data flow is greatly complicated by use of pointers.

someone can change value of variable without us (compiler) knowing
Usual solution is to give up and say that value is always live
That's why we have languages like Java and Decaf: no pointers.

Values

For Liveness analysis, we can use a set: if a variable is included in the set, then it is live
For Reaching definitions, we can use a multimap: map variable names to definitions
For Constant propagation, we can use a map: map variable names to their known values

Usually, we just store values at the start and end of a basic block,

but for this project, we store these two sets at every instruction.
The IN sets are the unions of all parents' OUT sets

Transfer Functions

Calculates effect each function has on our value (used to construct OUT set

Within a basic block, we keep track of two sets: GEN and KILL

GEN: new information produced by instruction
KILL: information invalidated by the instruction

The OUT set is computed by (IN − KILL) + GEN

For example, in Liveness analysis (recall that it is a reverse flow), GEN is any variable that is used, and KILL is any variable that is defined:

{{Note|An instruction that writes to a dead value can be removed}}


=== Combining Paths ===
To compute the '''IN''' set for a basic block, we need a few helper operations.

We use a (semi)lattice: {{HL|bg=t|fg=#999|(realized on '''reaching definitions''' sets)}}
* requires that we can relate some (not all) values using <math><</math> operator {{HL|bg=t|fg=#999|(think "proper subset")}}
* meet operator <math>\wedge</math> returns the largest value that is less than both its operands {{HL|bg=t|fg=#999|(think "intersection")}}
* top element <math>\downvdash</math> such that <math>\downvdash \wedge x = x</math> {{HL|bg=t|fg=#999|(think "universal set")}}
* bottom element <math>\upvdash</math> such that <math>\upvdash \wedge x = \perp</math> {{HL|bg=t|fg=#999|(think "empty set")}}
* we require finite, monotonic semilattices

Value for beginning of block is meet of all values that can reach that block from any path

==== Meet Table for Constant Propagation ====

{| class="wikitable"
! <math>\wedge</math> !! const val <math>d</math> !! nonconst !! unknown
|-
! const val <math>c</math>
| <code>c == d ? c : nonconst</code>
| nonconst
| <math>c</math>
|-
! nonconst
| nonconst
| nonconst
| nonconst
|-
! unknown
| <math>d</math>
| nonconst
| unknown
|}


==== Algorithm ====
{{Star|This will be on the final exam!}}

<source lang="cpp">
/*----------------------------------------------------------
 * Perform data flow analysis
 */
template <typename ValueType, typename FlowType>
void DataFlow<ValueType, FlowType>::analyze()
{
  // Initialize worklist with all instructions (except first)
  // Initialize value for all instructions
  std::list<Instruction*> worklist;
  {
    typename FlowType::iterator p= flow.first();
    df_in[*p]= init();
    df_out[*p]= effect(*p, init());

    while (p != flow.last())
    {
      ++p;
      df_in[*p]= top();
      df_out[*p]= effect(*p, top());
      worklist.push_back( *p );
    }
  }

  while (!worklist.empty())
  {
    Instruction* i= worklist.front();
    worklist.pop_front();

    // Calculate meet of df_out for all incoming edges
    ValueType total_in= top();
    EdgeList& prev_edges= flow.in()[i];
    for (EdgeList::iterator p= prev_edges.begin(); p != prev_edges.end(); ++p)
      total_in= meet(total_in, df_out[*p]);

    // If we changed something, reseed worklist
    if (total_in != df_in[i])
    {
      df_in[i]= total_in;
      df_out[i]= effect(i, total_in);
      EdgeList& next_edges= flow.out()[i];
      for (EdgeList::iterator n= next_edges.begin(); n != next_edges.end(); ++n)
      {
        if (find(worklist.begin(), worklist.end(), *n) == worklist.end())
          worklist.push_back(*n);
      }
    }
  }
}