# MATH 323 Lecture 23

« previous | Tuesday, November 20, 2012 | next »

## Least Squares Problem

Given an ${\displaystyle m\times n}$ matrix ${\displaystyle A}$ with rank ${\displaystyle n}$, the least squares solution to the equation ${\displaystyle A{\vec {x}}={\vec {b}}}$ satisfies the normal equations

${\displaystyle A^{T}A{\vec {x}}=A^{T}{\vec {b}}}$

and has the unique solution

${\displaystyle {\hat {x}}=(A^{T}A)^{-1}A^{T}{\vec {b}}}$

The vector ${\displaystyle {\vec {p}}=A{\hat {x}}}$ is the element of the range of ${\displaystyle A}$ that is closest to ${\displaystyle {\vec {b}}}$.
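A quick numerical sketch of the normal equations (the matrix and right-hand side below are made-up data, assuming NumPy):

```python
import numpy as np

# Hypothetical overdetermined 4x2 system with rank 2 (illustrative data).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 1.0, 3.0])

# Solve the normal equations A^T A x = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Compare with NumPy's built-in least squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_hat, x_ref)
```

In practice one solves the normal equations (or better, uses a QR factorization) rather than explicitly forming ${\displaystyle (A^{T}A)^{-1}}$.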

Now assume the columns of ${\displaystyle A}$ constitute an orthonormal basis for ${\displaystyle R(A)}$.

### Theorem 5.5.6

If the column vectors of ${\displaystyle A}$ form an orthonormal set of vectors in ${\displaystyle \mathbb {R} ^{m}}$, then ${\displaystyle A^{T}A=I}$ and the solution to the least squares problem is

${\displaystyle {\hat {x}}=A^{T}{\vec {b}}}$

#### Proof

We need to show that ${\displaystyle A^{T}A=I}$.

If ${\displaystyle A^{T}A=(c_{ij})}$, then ${\displaystyle c_{ij}={\vec {a}}_{i}\cdot {\vec {a}}_{j}=\delta _{ij}={\begin{cases}1&i=j\\0&{\text{otherwise}}\end{cases}}}$

Therefore, the diagonal entries for ${\displaystyle i=j}$ will be 1, and all other matrix cells will be 0.

Q.E.D.
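A numerical check of Theorem 5.5.6 (a sketch assuming NumPy; the orthonormal columns come from a QR factorization of random data):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 3))
Q, _ = np.linalg.qr(M)     # Q has orthonormal columns spanning R(M)
b = rng.normal(size=5)

# Orthonormal columns: Q^T Q = I, so the normal equations collapse.
assert np.allclose(Q.T @ Q, np.eye(3))

x_hat = Q.T @ b                            # Theorem 5.5.6
x_ref, *_ = np.linalg.lstsq(Q, b, rcond=None)
assert np.allclose(x_hat, x_ref)
```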

### Theorem 5.5.7

Let ${\displaystyle S}$ be a subspace in ${\displaystyle (V,\langle \cdot ,\cdot \rangle )}$ and let ${\displaystyle x\in V}$.

Let ${\displaystyle \dim {S}=n}$ and ${\displaystyle \{x_{1},\ldots ,x_{n}\}}$ be an orthonormal basis for ${\displaystyle S}$.

If ${\displaystyle p=\sum _{i=1}^{n}c_{i}\,x_{i}}$ where ${\displaystyle c_{i}=\langle x,x_{i}\rangle }$ for all ${\displaystyle i}$, then ${\displaystyle p-x\in S^{\perp }}$.

#### Proof

Show that ${\displaystyle (p-x)\perp x_{i}}$ for any ${\displaystyle i}$:

{\displaystyle {\begin{aligned}\langle p-x,x_{i}\rangle &=\left\langle \sum _{j=1}^{n}c_{j}\,x_{j},x_{i}\right\rangle -\langle x,x_{i}\rangle \\&=\sum _{j=1}^{n}c_{j}\langle x_{j},x_{i}\rangle -\langle x,x_{i}\rangle \\&=\sum _{j=1}^{n}c_{j}\delta _{ij}-\langle x,x_{i}\rangle \\&=c_{i}-c_{i}\\&=0\end{aligned}}}

(The last step uses ${\displaystyle \langle x,x_{i}\rangle =c_{i}}$ by definition of the coefficients.)
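The orthogonality ${\displaystyle p-x\in S^{\perp }}$ can be checked numerically (a sketch assuming NumPy, with a random orthonormal basis playing the role of ${\displaystyle x_{1},\ldots ,x_{n}}$):

```python
import numpy as np

rng = np.random.default_rng(1)
# Columns of Q form an orthonormal basis for a 3-dimensional subspace S of R^6.
Q, _ = np.linalg.qr(rng.normal(size=(6, 3)))
x = rng.normal(size=6)

c = Q.T @ x        # c_i = <x, x_i>
p = Q @ c          # p = sum_i c_i x_i

# p - x is orthogonal to every basis vector of S.
residual = Q.T @ (p - x)
assert np.allclose(residual, 0)
```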

### Theorem 5.5.8

Under the hypotheses of Theorem 5.5.7, ${\displaystyle p}$ is the element of ${\displaystyle S}$ that is closest to ${\displaystyle x}$. That is, ${\displaystyle \|y-x\|>\|p-x\|}$ for all ${\displaystyle y\neq p}$ in ${\displaystyle S}$.

#### Corollary 5.5.9

Let ${\displaystyle S}$ be a nonzero subspace of ${\displaystyle \mathbb {R} ^{m}}$ and let ${\displaystyle {\vec {b}}\in \mathbb {R} ^{m}}$.

If ${\displaystyle \{{\vec {u}}_{1},\ldots ,{\vec {u}}_{k}\}}$ is an orthonormal basis for ${\displaystyle S}$ and ${\displaystyle U=({\vec {u}}_{1},\ldots ,{\vec {u}}_{k})}$, then the projection ${\displaystyle {\vec {p}}}$ of ${\displaystyle {\vec {b}}}$ onto ${\displaystyle S}$ is given by ${\displaystyle {\vec {p}}=UU^{T}{\vec {b}}}$.

##### Proof

Since ${\displaystyle c_{i}={\vec {u}}_{i}^{T}{\vec {b}}}$,

${\displaystyle {\vec {c}}={\begin{pmatrix}{\vec {u}}_{1}^{T}{\vec {b}}\\\vdots \\{\vec {u}}_{k}^{T}{\vec {b}}\end{pmatrix}}=U^{T}{\vec {b}}}$

and therefore

${\displaystyle {\vec {p}}=\sum _{i=1}^{k}c_{i}\,{\vec {u}}_{i}=U{\vec {c}}=UU^{T}{\vec {b}}}$

Let ${\displaystyle P=UU^{T}}$ be the projection matrix for ${\displaystyle S}$.

Going back to the formula for the least squares solution, ${\displaystyle {\hat {x}}=(A^{T}A)^{-1}A^{T}{\vec {b}}}$, the projection ${\displaystyle {\vec {p}}=A{\hat {x}}}$ gives ${\displaystyle P=A(A^{T}A)^{-1}A^{T}}$.

${\displaystyle P}$ is unique to ${\displaystyle S}$ regardless of the basis used. However, ${\displaystyle P=UU^{T}}$ holds only for orthonormal bases; for an arbitrary basis (the columns of ${\displaystyle A}$), one must use ${\displaystyle P=A(A^{T}A)^{-1}A^{T}}$.
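Basis-independence of the projection matrix can be verified directly (a sketch assuming NumPy; ${\displaystyle A}$ holds a random basis and ${\displaystyle U}$ an orthonormal basis for the same subspace):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 2))     # arbitrary basis for S (columns of A)
U, _ = np.linalg.qr(A)          # orthonormal basis for the same S

# General formula A (A^T A)^{-1} A^T, computed via a linear solve.
P_general = A @ np.linalg.solve(A.T @ A, A.T)
# Orthonormal-basis shortcut U U^T.
P_ortho = U @ U.T

# Same projection matrix either way.
assert np.allclose(P_general, P_ortho)
```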

### Example

Find the best least squares approximation of ${\displaystyle e^{x}}$ on ${\displaystyle [0,1]}$ by a linear function (an element of a subspace of ${\displaystyle C[0,1]}$).

• Space: C[0,1]
• Inner product: ${\displaystyle \langle f,g\rangle =\int _{0}^{1}f(x)\,g(x)\,\mathrm {d} x}$
• Subspace: ${\displaystyle S=\mathrm {Span} \{1,x\}}$ (linear functions)

Find ${\displaystyle a}$ such that ${\displaystyle (x-a)\perp 1}$:

${\displaystyle \langle 1,x-a\rangle =\int _{0}^{1}(x-a)\,\mathrm {d} x={\frac {1}{2}}-a=0}$

Therefore ${\displaystyle a={\tfrac {1}{2}}}$, and ${\displaystyle \{1,x-{\tfrac {1}{2}}\}}$ forms an orthogonal basis for ${\displaystyle S}$.

Find orthonormal basis for ${\displaystyle S}$:

{\displaystyle {\begin{aligned}\left\|x-{\frac {1}{2}}\right\|&={\sqrt {\int _{0}^{1}\left(x-{\frac {1}{2}}\right)^{2}\,\mathrm {d} x}}={\frac {1}{\sqrt {12}}}\\\|1\|&=1\\u_{1}&=1\\u_{2}&={\frac {x-{\frac {1}{2}}}{\left\|x-{\frac {1}{2}}\right\|}}={\sqrt {12}}\left(x-{\frac {1}{2}}\right)\end{aligned}}}

Therefore ${\displaystyle \{1,{\sqrt {12}}(x-{\tfrac {1}{2}})\}}$ forms an orthonormal basis for ${\displaystyle S}$

Find least squares approximation:

{\displaystyle {\begin{aligned}p&=c_{1}\,u_{1}+c_{2}\,u_{2}\\c_{1}&=\langle \mathrm {e} ^{x},u_{1}\rangle =\int _{0}^{1}\mathrm {e} ^{x}\,\mathrm {d} x=\mathrm {e} -1\\c_{2}&=\langle \mathrm {e} ^{x},u_{2}\rangle =\int _{0}^{1}\mathrm {e} ^{x}\left(x-{\frac {1}{2}}\right)\mathrm {d} x={\sqrt {3}}(3-\mathrm {e} )\\p&=(\mathrm {e} -1)\cdot 1+{\sqrt {3}}(3-\mathrm {e} )\cdot {\sqrt {12}}\left(x-{\tfrac {1}{2}}\right)\\&=(4\mathrm {e} -10)+6(3-\mathrm {e} )\,x\end{aligned}}}
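The result can be sanity-checked numerically: the residual ${\displaystyle e^{x}-p(x)}$ must be orthogonal to ${\displaystyle \mathrm {Span} \{1,x\}}$ in the ${\displaystyle L^{2}}$ inner product (a sketch assuming NumPy; the quadrature rule is a plain trapezoidal sum):

```python
import numpy as np

def inner(f, g, n=200001):
    """L^2 inner product on [0, 1] via the trapezoidal rule."""
    t = np.linspace(0.0, 1.0, n)
    v = f(t) * g(t)
    return ((v[0] + v[-1]) / 2 + v[1:-1].sum()) * (t[1] - t[0])

e = np.e

def p(t):
    # The least squares approximation derived above.
    return (4 * e - 10) + 6 * (3 - e) * t

def err(t):
    return np.exp(t) - p(t)

# c_1 = <e^x, 1> = e - 1
assert abs(inner(np.exp, np.ones_like) - (e - 1)) < 1e-8

# The residual is orthogonal to both basis functions of Span{1, x}.
assert abs(inner(err, np.ones_like)) < 1e-8
assert abs(inner(err, lambda t: t)) < 1e-8
```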

## Fourier Approximation

Each pair of terms ${\displaystyle a_{k}\cos {kx}+b_{k}\sin {kx}}$ in a trigonometric (Fourier) approximation can be combined into a single sinusoid:

{\displaystyle {\begin{aligned}r_{k}&={\sqrt {a_{k}^{2}+b_{k}^{2}}}\\\theta _{k}&=\arctan {\left({\frac {b_{k}}{a_{k}}}\right)}\\a_{k}\cos {kx}+b_{k}\sin {kx}&=r_{k}\cos {(kx-\theta _{k})}\end{aligned}}}

The last formula is the equation for harmonic motion.
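The amplitude-phase identity is easy to verify numerically (a sketch assuming NumPy; the coefficients below are made-up sample values, and `arctan2` is used so the phase is correct even when ${\displaystyle a_{k}\leq 0}$):

```python
import numpy as np

a_k, b_k, k = 3.0, 4.0, 2          # hypothetical Fourier coefficients
r_k = np.hypot(a_k, b_k)           # sqrt(a_k^2 + b_k^2)
theta_k = np.arctan2(b_k, a_k)     # phase angle

x = np.linspace(0.0, 2 * np.pi, 1000)
lhs = a_k * np.cos(k * x) + b_k * np.sin(k * x)
rhs = r_k * np.cos(k * x - theta_k)
assert np.allclose(lhs, rhs)
```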

## Gram-Schmidt Process

Given an inner product space ${\displaystyle (V,\langle \cdot ,\cdot \rangle )}$ and a basis ${\displaystyle \{x_{1},\ldots ,x_{n}\}}$ for ${\displaystyle V}$,

how do we obtain an orthonormal basis ${\displaystyle \{u_{1},\ldots ,u_{n}\}}$ for ${\displaystyle V}$ such that ${\displaystyle \mathrm {Span} \{x_{1},\ldots ,x_{n}\}=\mathrm {Span} \{u_{1},\ldots ,u_{n}\}}$?

{\displaystyle {\begin{aligned}u_{1}&={\frac {x_{1}}{\|x_{1}\|}}\\\mathrm {Span} \{x_{1}\}&=\mathrm {Span} \{u_{1}\}=V_{1}\\p_{1}&=\mathrm {proj} _{V_{1}}(x_{2})=\langle x_{2},u_{1}\rangle \,u_{1}\\x_{2}-p_{1}&\perp V_{1}\\u_{2}&={\frac {x_{2}-p_{1}}{\|x_{2}-p_{1}\|}}\\\mathrm {Span} \{x_{1},x_{2}\}&=\mathrm {Span} \{u_{1},u_{2}\}=V_{2}\\p_{2}&=\mathrm {proj} _{V_{2}}(x_{3})=\langle x_{3},u_{1}\rangle \,u_{1}+\langle x_{3},u_{2}\rangle \,u_{2}\\x_{3}-p_{2}&\perp V_{2}\\&\;\;\vdots \end{aligned}}}

### Theorem 5.6.1

Let ${\displaystyle \{x_{1},\ldots ,x_{n}\}}$ be a basis for ${\displaystyle V,\langle \cdot ,\cdot \rangle }$. Let ${\displaystyle u_{1}={\tfrac {x_{1}}{\|x_{1}\|}}}$ and define ${\displaystyle u_{2},\ldots ,u_{n}}$ recursively by

${\displaystyle u_{k+1}={\frac {x_{k+1}-p_{k}}{\|x_{k+1}-p_{k}\|}}}$

where ${\displaystyle p_{k}=\langle x_{k+1},u_{1}\rangle u_{1}+\langle x_{k+1},u_{2}\rangle u_{2}+\dots +\langle x_{k+1},u_{k}\rangle u_{k}}$ is the projection of ${\displaystyle x_{k+1}}$ onto ${\displaystyle V_{k}=\mathrm {Span} (u_{1},\ldots ,u_{k})=\mathrm {Span} (x_{1},\ldots ,x_{k})}$.

The set ${\displaystyle \{u_{1},\ldots ,u_{n}\}}$ is an orthonormal basis for ${\displaystyle V}$.
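The recursion of Theorem 5.6.1 can be sketched directly in code (assuming NumPy, the standard dot product as the inner product, and linearly independent columns):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X (assumed linearly independent)
    using the Gram-Schmidt recursion of Theorem 5.6.1."""
    U = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        # p_k = projection of x_{k+1} onto Span(u_1, ..., u_k)
        p = U[:, :k] @ (U[:, :k].T @ X[:, k])
        v = X[:, k] - p
        U[:, k] = v / np.linalg.norm(v)
    return U

rng = np.random.default_rng(3)
X = rng.normal(size=(5, 3))
U = gram_schmidt(X)

assert np.allclose(U.T @ U, np.eye(3))   # orthonormal columns
# Same span: projecting each x_j onto Span(u_1, ..., u_n) reproduces it.
assert np.allclose(U @ (U.T @ X), X)
```

In floating point, the classical recursion can lose orthogonality for ill-conditioned bases; the modified Gram-Schmidt variant (subtracting one projection at a time) is the numerically preferred form.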