CSCE 312 Lecture 2
« previous | Thursday, September 1, 2011 | next »
Great Realities of Computers
Precision
Ints are not integers, Floats are not Real Numbers
Only 32 bits to represent numbers:
- Large Integers cause wraparound/overflow errors (i.e. is x2 ≥ 0? Not necessarily)
- 40,000 * 40,000 = 1,600,000,000
- 50,000 * 50,000 = ??
- Floats of differing ranges cause truncation errors: (i.e. s (x + y) + z = x + (y + z)? Not for floats)
- (1e20 + -1e20) + 3.14 = 3.14
- 1e20 + (-1e20 + 3.14) = ??
Because of these limitations, computer arithmetic:
- Cannot generate true random values
- Sometimes doesn't follow usual mathematical rules
Assembly Language
You've got to know Assembly (even though you'll never write a program in it)
- Be able to understand performance due to bugs
- Über hackage! :D
- Understand machine-level execution of higher-level code
Aside: Cycles and Scheduling
Using assembly to access PC may not give actual cycle count of program execution:
- I/O done asynchronously through buffers
- Processes handled alternatively by CPU depending on priority
- Scheduling (i.e what process the CPU is currently focusing on) maintained by Kernel (OS-level)
- Throughput
- how many jobs the CPU can finish in 1 second (for example)
- Response Time (Latency)
- how fast a job can be completed
Memory Matters
Memory (unlike in theoretical Turing Machine) is not unbounded; we have a finite supply to allocate and manage.
Referencing bugs are nasty! C and C++ offer no memory protection for out-of-bounds, pointer errors, and memory leaks by default (code can access memory of any process!)
Memory Hierarchy (Top = fast, small, $$$$$; bottom = slow, abundant, $)
- Registers (L0)
- On-Chip Cache (L1; SRAM)
- Note: SRAM chips are made from 6 transistors in circuit and are self-enhancing (no recharge required)
- Off-Chip Cache (L2; SRAM)
- Main Memory (L3; DRAM)
- Local Disks
- Network
Example: Taking Advantage of Memory Structure
Which code snippet is faster?
int key[1000];
|
struct pair {
|
The first code snippet will grab a bunch of keys and store them in the cache, then immediately replace the cached keys with data. | The code with the struct to bind data together keeps the key and data together. When the memory accesses the key element, it will also grab more information around it, including its data counterpart and store it in the cache. |
Performance
There's more to performance than asymptotic complexity
Constant factors matter too!
True optimization requires knowledge of computer's inner workings.
Computer I/O
Computers do more than execute programs
Data Input/Output makes computers interactive and responsive
Network communications allow computers to even talk to each other, but introduces new potential problems:
- concurrent operations by processes
- unreliable communication/transfer media
- platform compatibility
General Terms
Measuring Data
- 1 bit = 0 or 1
- 1 nibble = 4 bits
- 8 bits = 2 nibbles = 1 byte
- 4 bytes = 1 word
- 1 kilobyte (KB) = 1024 (210) bytes
- 1 megabyte (MB) = 1024 KB = 220 bytes
- 1 gigabyte (GB) = 1024 MB = 230 bytes
Measuring Time
- 1 ms = 10−3 sec
- 1 µs = 10−6 sec
- 1 ns = 10−9 sec
- 1 ps = 10−12 sec
Model of a Computer
- CPU: Central Processing Unit
- ALU: Arithmetic/Logic Unit
- PC: Program Counter
- USB: Universal Serial Bus
Buses and Bridges
- Hardware Bus
- comparable to a bus in real-life (not a taxi)
- makes scheduled round-trips
- data knows when to get off
- Hardware Bridge
- where buses meet to transfer data to other routes
Main Memory
- Temporary storage between permanent storage and CPU caches & registers
- Holds both the program instructions and program data
- Made of DRAM chips (each DRAM piece is a single transistor)
- Note: DRAM 1 values must be "recharged" every 80 ms or they will lose capacitance and become 0s
- Linear array of bytes
Processor
- Interprets instructions that are stored in Main Memory
- PC points to next instruction in memory
- Processor does only one thing (at a very fast rate): execute the next instruction pointed to by the PC
- Very simple instruction model defined by instruction set architecture (ISA)
- Sample ISA's: 80x86/Pentium, PowerPC, DEC Alpha, MIPS, SPARC, HP, etc.
- abstract interface between hardware and low-level software