Tuesday, April 22, 2014
Sections 8.1 and 8.2
Least Squares approximation
We approximate a given function $f$ by a polynomial $P(x) = a_0 + a_1 x + \dots + a_n x^n$, usually of degree 1 (linear) or degree 2 (quadratic). $f$ is given, so we wish to minimize the error $E = \|f - P\|^2$.
In 8.1 (the discrete case), $E = \sum_{i=1}^{m} w_i \bigl(f(x_i) - P(x_i)\bigr)^2$, where $w$ is a probability weight distribution on the sample points.
In 8.2 (the continuous case), $E = \int_a^b w(x) \bigl(f(x) - P(x)\bigr)^2 \, dx$.
Common definitions of the inner product $\langle f, g \rangle$:
- $\langle f, g \rangle = \sum_{i} f(x_i)\, g(x_i)$ (classic definition)
- $\langle f, g \rangle = \int_a^b f(x)\, g(x)\, dx$ on $[a, b]$
- $\langle f, g \rangle = \int_a^b w(x)\, f(x)\, g(x)\, dx$ on $[a, b]$
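As a rough illustration, here is a minimal Sage-style sketch of computing both error functionals for a candidate fit. The function $f$, the candidate $P$, the sample points, the weights, and the interval are all made-up placeholders, not values from class.

# Sketch only: f, P, the sample points, weights, and interval are placeholders.
var('x')
f = exp(x)                    # function to approximate (placeholder)
P = 1 + x                     # a candidate linear fit (placeholder)

# 8.1 (discrete): weighted sum of squared errors at the sample points
pts = [0, 1/2, 1]
wts = [1/4, 1/2, 1/4]         # probability weights (they sum to 1)
E_discrete = sum(wi * (f(x=xi) - P(x=xi))**2 for xi, wi in zip(pts, wts))

# 8.2 (continuous): weighted integral of the squared error over [a, b], here with w(x) = 1
E_continuous = integrate((f - P)**2, x, 0, 1)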
Basic Concept
We compute $E(a_0, a_1, \dots, a_n)$ and find the minimum. To find this minimum, we solve $\frac{\partial E}{\partial a_j} = 0$ for each $j$. This gives a system of normal equations: $\sum_{k=0}^{n} a_k \langle \varphi_k, \varphi_j \rangle = \langle f, \varphi_j \rangle$ for $j = 0, 1, \dots, n$, where the $\varphi_k$ (e.g. $\varphi_k = x^k$) form a basis for the regression space.
The coefficient matrix $\bigl(\langle \varphi_i, \varphi_j \rangle\bigr)$ is positive definite, so it has an inverse.
This system is solvable, but it's a pain to solve.
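For instance, here is a minimal Sage-style sketch of forming and solving the normal equations. The interval $[0, 1]$, the monomial basis, and the placeholder $f$ are assumptions, not the example from class.

# Sketch only: the interval, basis, and f below are placeholders.
var('x')
f = exp(x)
basis = [1, x, x**2]
a, b = 0, 1

# normal equations:  sum_k a_k <phi_k, phi_j> = <f, phi_j>
G   = matrix(SR, [[integrate(p*q, x, a, b) for q in basis] for p in basis])
rhs = vector(SR, [integrate(f*q, x, a, b) for q in basis])
coeffs   = G.solve_right(rhs)                  # solve the normal equations
best_fit = sum(c*p for c, p in zip(coeffs, basis))
print(G)   # the 3x3 Hilbert matrix (ill-conditioned as the degree grows)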
(Figures: the original function and its best quadratic fit.)
For example, find the best quadratic fit to a given function $f$ on an interval $[a, b]$.
Simplifying Normal Equations
What if the matrix $\bigl(\langle \varphi_i, \varphi_j \rangle\bigr)$ were a diagonal matrix? That is, what if $\langle \varphi_i, \varphi_j \rangle = 0$ for $i \neq j$?
Then $\varphi_0$, $\varphi_1$, $\varphi_2$, etc. would be an orthogonal basis for our regression space.
Continuing with our example, let $\{\varphi_0, \varphi_1, \varphi_2\}$ be an orthogonal basis for our regression space.
This gives a much easier (diagonal) system to solve: $a_j \langle \varphi_j, \varphi_j \rangle = \langle f, \varphi_j \rangle$, so $a_j = \frac{\langle f, \varphi_j \rangle}{\langle \varphi_j, \varphi_j \rangle}$.
There are two ways to find an orthogonal basis for the regression space:
- Legendre polynomials
- Gram-Schmidt
Legendre Polynomials
For $[-1, 1]$, we have:
- $P_0(x) = 1$,
- $P_1(x) = x$, and
- $P_2(x) = \frac{1}{2}(3x^2 - 1)$ (any nonzero scalar multiples also work).
These only work on $[-1, 1]$, so we shift and scale to our interval $[a, b]$ by substituting $t = \frac{2x - a - b}{b - a}$ (or equivalently $x = \frac{(b - a)t + a + b}{2}$). Performing this substitution gives our $\varphi$'s above.
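As a quick check, here is a Sage-style sketch that shifts the Legendre polynomials to a sample target interval and verifies orthogonality there; the interval $[0, 1]$ is just an assumption for the sketch.

# Sketch: shift P0, P1, P2 from [-1, 1] to an assumed interval [0, 1] via t = 2x - 1.
var('x')
legendre = [SR(1), x, (3*x**2 - 1)/2]
shifted  = [p.subs(x=2*x - 1) for p in legendre]

for i in range(3):
    for j in range(i + 1, 3):
        print(integrate(shifted[i]*shifted[j], x, 0, 1))   # each integral is 0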
Gram-Schmidt
This will always give an orthonormal basis, so $\langle \varphi_i, \varphi_j \rangle = \delta_{ij}$ and the coefficients are simply $a_j = \langle f, \varphi_j \rangle$:
# assumes a Sage session, where x, integrate, sqrt, and simplify are available
def ip(f, g, a, b):
    # inner product <f, g> = integral of f*g over [a, b]
    return integrate(f*g, x, a, b)

def norm(f, a, b):
    return sqrt(ip(f, f, a, b))

def normalize(f, a, b):
    return f / norm(f, a, b)

def gram_schmidt(basis, a, b):
    on_basis = [normalize(basis[0], a, b)]
    for i in range(1, len(basis)):
        # subtract the projection onto the previously built vectors,
        # then normalize the remainder
        p = sum(ip(basis[i], e, a, b) * e for e in on_basis)
        next_e = normalize(basis[i] - p, a, b)
        on_basis.append(next_e)
    return [simplify(e) for e in on_basis]
Now we have an orthonormal basis for our regression space.
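Putting the pieces together, here is a minimal sketch of computing the least-squares fit from such an orthonormal basis, using the ip and gram_schmidt functions above. The function f, the degree, and the interval $[0, 1]$ are placeholders, not the class example.

# Sketch: with an orthonormal basis, each coefficient is just <f, e_j>.
var('x')
a, b = 0, 1
f = exp(x)                                   # placeholder target function
e_basis = gram_schmidt([1, x, x**2], a, b)
best_fit = sum(ip(f, e, a, b) * e for e in e_basis)
print(best_fit.simplify_full())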
Quiz Discussion: Matrix Norms
"This is horrible." (Bojan Popov)
Find $\|A\|_\infty$, $\|A\|_1$, and $\|A\|_2$ for a given matrix $A$.
$\|A\|_\infty$ is the max absolute row sum.
$\|A\|_1$ is just the max absolute column sum.
$\|A\|_2 = \sqrt{\rho(A^{\mathsf T} A)}$, so we need the largest eigenvalue of $A^{\mathsf T} A$.
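A hedged Sage-style sketch of computing these norms follows; the matrix A below is a made-up example, not the matrix from the quiz.

# Sketch: matrix norms in Sage; A is a placeholder, not the quiz matrix.
A = matrix(RDF, [[1, -2], [3, 4]])
print(A.norm(Infinity))   # max absolute row sum
print(A.norm(1))          # max absolute column sum
print(A.norm(2))          # largest singular value, sqrt(rho(A^T A))
# the infinity norm, computed by hand from its definition:
print(max(sum(abs(e) for e in row) for row in A.rows()))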