MATH 417 Lecture 1

From Notes
Jump to navigation Jump to search

« previous | Tuesday, January 14, 2014 | next »


Numerical Analysis

Midterm 3/4 6 Quizzes Machine problems

Textbook: find cheapest one. Edition does not matter

Goals

  • Programming (check!)
  • Understanding


Numbers

e.g. Real ()

In real life, we have finite space

Floating Point Numbers

  • follow the IEEE 754 binary standard (floating point arithmetic) to store numbers: 1 bit sign + 52 bits for leading digits + 11 bits exponent
  • This allows for numbers between (approx) and
  • Numbers in this range have about 16 digits of precision

Hence

, where and


Density

  • Numbers are very dense around -1, 0, and 1, but get sparser for large numbers.
  • There are apart between 0 and 1, but only apart between 100 and 101

Numbers have to go from infinite precision to finite precision:

Approximation

Let be the approximation of

Two ways to approximate numbers (e.g. )

  • Rounding:
  • Truncation:


Error

How different are and ?

  1. Absolute error:
  2. Relative error: (a mile and a foot is not much different than just a mile)

Example: Given , will not always be between and . A better way to implement this is a + 0.5 * (b - a), which will always be between and .

First Machine Problem

Approximate :

First attempt:

s = 0
for (int i = 0; i < PARTIAL_MAX; i++) {
    s += 1.0 / (i * i)
}

This has a large error because adding small values to larger values does not always work (1020 + 0.01 = 1020).

Second attempt: reverse order

s = 0
for (int i = PARTIAL_MAX-1; i >= 0; i--) {
    s += 1.0 / (i * i)
}

This will give better precision


Chapter 2: Root Finding / Solving

Some things are easy ( or ), but how do we solve something like ?

Rolle's Theorem states that for and , where , there exists at least one such that .

Given , we want to find such that

2.1: Bisection Algorithm

A binary search algorithm.

double left = POINT_A;
double right = POINT_B;
double midpt;

for (int i = 0; i < N; i++) {
    midpt = left + 0.5 * (right - left);
    if (f(midpt) * f(right) < 0) {
        left = midpt;
    } else {
        right = midpt;
    }
}

After one step, our absolute error is .

After steps, our absolute error is

Thus our relative error is .

Thus if we want , we choose , so