MATH 417 Lecture 1
« previous | Tuesday, January 14, 2014 | next »
Numerical Analysis
Midterm 3/4 6 Quizzes Machine problems
Textbook: find cheapest one. Edition does not matter
Goals
- Programming (check!)
- Understanding
Numbers
e.g. Real ()
In real life, we have finite space
Floating Point Numbers
- follow the IEEE 754 binary standard (floating point arithmetic) to store numbers: 1 bit sign + 52 bits for leading digits + 11 bits exponent
- This allows for numbers between (approx) and
- Numbers in this range have about 16 digits of precision
Hence
Density
- Numbers are very dense around -1, 0, and 1, but get sparser for large numbers.
- There are apart between 0 and 1, but only apart between 100 and 101
Numbers have to go from infinite precision to finite precision:
Approximation
Let be the approximation of
Two ways to approximate numbers (e.g. )
- Rounding:
- Truncation:
Error
How different are and ?
- Absolute error:
- Relative error: (a mile and a foot is not much different than just a mile)
Example: Given , will not always be between and . A better way to implement this is a + 0.5 * (b - a)
, which will always be between and .
First Machine Problem
Approximate :
First attempt:
s = 0
for (int i = 0; i < PARTIAL_MAX; i++) {
s += 1.0 / (i * i)
}
This has a large error because adding small values to larger values does not always work (1020 + 0.01 = 1020).
Second attempt: reverse order
s = 0
for (int i = PARTIAL_MAX-1; i >= 0; i--) {
s += 1.0 / (i * i)
}
This will give better precision
Chapter 2: Root Finding / Solving
Some things are easy ( or ), but how do we solve something like ?
Rolle's Theorem states that for and , where , there exists at least one such that .
Given , we want to find such that
2.1: Bisection Algorithm
A binary search algorithm.
double left = POINT_A;
double right = POINT_B;
double midpt;
for (int i = 0; i < N; i++) {
midpt = left + 0.5 * (right - left);
if (f(midpt) * f(right) < 0) {
left = midpt;
} else {
right = midpt;
}
}
After one step, our absolute error is .
After steps, our absolute error is
Thus our relative error is .
Thus if we want , we choose , so