CSCE 434 Lecture 34
« previous | Friday, November 15, 2013 | next »
Optimization
Instruction Scheduling
Reorder instructions to hide latencies (prevent from waiting; e.g. on store/load to/from memory)
- assumes a fixed program
- changes demand registers
Program behavior remains unchanged; the only difference is that the code runs faster. This fact is provable.
This can be done easily within a basic block.
Tree-matching or instruction matching
Assume we have enough registers or target "important" information into registers
Register Allocation
Ideally, we do not want to spill registers onto the stack.
Assume infinite number of registers at first.
Data Storage
Data bus size determines word length.
On that note, synchronization variables should be the length of one cache line.
Simple Expressions
In following code, newtemp()
should always return a register.
If we run out of registers, it will pick a victim, tell it to get out, spill it to the stack, and proceed to use it. However, it will have to be put back when we are done using the register
int expr(Node *node)
{
int result, t1, t2, t3;
switch (node.type) {
case PLUS:
t1 = expr(node.left);
t2 = expr(node.right);
result = newtemp();
emit(ADD, result, t1, t2);
break;
case ID:
t1 = base(node.val);
t2 = offset(node.val);
t3 = newtemp();
emit(ADD, t3, t1, t2);
break;
case NUM:
result = newtemp();
emit(LOADI, result, node.val);
break;
}
return result;
}
Complex Expressions
Array References
First, agree to storage scheme because memory is linear (i.e. agree on linearization scheme):
- row-major order
- rightmost subscript varies fastest
- A[1,1], A[1,2], A[1,3], A[2,1], A[2,2], A[2,3]
- column-major order
- leftmost subscript varies fastest
- A[1,1], A[2,1], A[1,2], A[2,2], A[1,3], A[2,3]
- indirection vectors
- vector of pointer to pointers to ... to values
- uses much more space
Computing an Address
A[i]
- row-major order: .
- column-major order:
Expensive!