Strang Lecture 16: Projection Matrices and Least Squares
- The projection matrix is \(P=A(A^TA)^{-1}A^T\)
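As a quick numerical sanity check (a minimal NumPy sketch; the matrix \(A\) below is the same one used in the least-squares example later in these notes), a projection matrix built this way comes out symmetric and idempotent:

```python
import numpy as np

# A with independent columns (same A as in the least-squares example below).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# P = A (A^T A)^{-1} A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T

# A projection matrix is symmetric (P^T = P) and idempotent (P^2 = P).
print(np.allclose(P.T, P))    # True
print(np.allclose(P @ P, P))  # True
```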
- If you project \(b\) and \(b\) is in the column space, then \(Pb=b\):
\[Pb=A(A^TA)^{-1}A^Tb\]
\[\text{note: } b = Ax \text{ for some } x\]
\[Pb=A(A^TA)^{-1}A^TAx\]
\[Pb=A\left[(A^TA)^{-1}(A^TA)\right]x\]
\[Pb=Ax\]
\[Pb=b\]
- If you project \(b\) and \(b\) is perpendicular to the column space, then \(Pb=0\) (note that \(b\) is in the nullspace of \(A^T\), since it is perpendicular to each column of \(A\)):
\[Pb=A(A^TA)^{-1}A^Tb\]
\[\text{note: } A^Tb = 0\]
\[Pb=A(A^TA)^{-1}\,0\]
\[Pb=0\]
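Both special cases can be checked numerically, continuing the sketch above (`b_in` is a combination of the columns of \(A\); `b_perp` is a vector I picked that is perpendicular to both columns, i.e. it lies in \(N(A^T)\)):

```python
# b in the column space: Pb gives back b.
b_in = A @ np.array([1.0, 1.0])
print(np.allclose(P @ b_in, b_in))   # True

# b perpendicular to both columns (so A^T b = 0): Pb is the zero vector.
b_perp = np.array([1.0, -2.0, 1.0])
print(np.allclose(A.T @ b_perp, 0))  # True: b_perp is in the left nullspace
print(np.allclose(P @ b_perp, 0))    # True
```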
- When we project \(b\) onto the column space of \(A\), the projection \(p=Pb\) plus the error \(e\) gives back \(b\) (\(p+e=b\)). Finding \(e\) amounts to projecting \(b\) onto the left nullspace. The \(I-P\) appears because \(e=b-p=b-Pb=(I-P)b\), and consistently:
\[p+e = Pb + (I-P)b = Pb + Ib - Pb = Ib = b\]
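Continuing the sketch: \(I-P\) is itself a projection matrix, and it projects onto the left nullspace, so \(A^Te=0\).

```python
# I - P is also a projection; it projects onto the left nullspace N(A^T).
I_minus_P = np.eye(3) - P
print(np.allclose(I_minus_P @ I_minus_P, I_minus_P))  # True: idempotent

b = np.array([1.0, 2.0, 2.0])    # the b from the least-squares example below
p = P @ b                        # component in the column space
e = I_minus_P @ b                # component in the left nullspace
print(np.allclose(p + e, b))     # True: p + e = b
print(np.allclose(A.T @ e, 0))   # True: e is perpendicular to every column of A
```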
Least squares
- Find the best straight line \(y=C+Dt\) through the points (1,1), (2,2), and (3,2).
- Each point defines a linear equation:
\[C + D = 1\]
\[C + 2D = 2\]
\[C + 3D = 2\]
- This translates to matrix form:
\[\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}\]
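In code, \(A\) is just a column of ones (for \(C\)) next to the column of \(t\)-values (for \(D\)); a small sketch of setting up the system:

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0])   # t-coordinates of the three points
b = np.array([1.0, 2.0, 2.0])   # y-coordinates

# Each row of A encodes one equation C + D*t_i = b_i.
A = np.column_stack([np.ones_like(t), t])
print(A)
# [[1. 1.]
#  [1. 2.]
#  [1. 3.]]
```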
- What are we minimizing when we look for the best possible projection of \(b\) onto the column space? Minimize \(\lVert Ax-b\rVert^2=\lVert e\rVert^2\). The error measures how far each point lies from the line.
- Because the errors are squared, outliers can unduly influence the least-squares result.
- Pretty good discussion of the projections vs. the original points at 17:15. Essentially we project every data point onto the line, and we choose the line that minimizes the sum of squared errors.
- The goal of least squares: find \(\hat{x} = \begin{bmatrix} \hat{C} \\ \hat{D} \end{bmatrix}\) and the projection \(p\):
\[A^TA\hat{x} = A^Tb\]
\[A^TA = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}\]
\[A^Tb = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 11 \end{bmatrix}\]
- Normal equations:
\[3C + 6D = 5\]
\[6C + 14D = 11\]
- Solution: \(D=\frac{1}{2}\) and \(C=\frac{2}{3}\), so the best line is \(y=\frac{2}{3}+\frac{1}{2}t\).
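Solving the normal equations numerically gives the same numbers (a sketch continuing with the \(A\) and \(b\) above; NumPy's built-in least-squares routine solves the same minimization directly):

```python
# Solve A^T A x_hat = A^T b for x_hat = [C, D].
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)   # approximately [0.6667, 0.5]  ->  C = 2/3, D = 1/2

# np.linalg.lstsq minimizes ||Ax - b||^2 and agrees.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_lstsq))   # True
```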
- 30:00 - good discussion of the error vector and how it relates to the matrix world. He shows how \(b=p+e\) in this example:
\[b = p + e\]
\[\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{7}{6} \\ \frac{10}{6} \\ \frac{13}{6} \end{bmatrix} + \begin{bmatrix} -\frac{1}{6} \\ \frac{2}{6} \\ -\frac{1}{6} \end{bmatrix}\]
- Note that \(e\) is perpendicular to every vector in the column space.
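The projection and error vectors from this example can be reproduced, and both claims (\(b = p + e\), and \(e\) perpendicular to the column space) checked, continuing the sketch above:

```python
p = A @ x_hat                    # projection of b onto the column space: 7/6, 10/6, 13/6
e = b - p                        # error vector: -1/6, 2/6, -1/6
print(np.allclose(p + e, b))     # True: b = p + e
print(np.allclose(A.T @ e, 0))   # True: e is perpendicular to both columns of A
```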
Back to Linear Algebra
- If \(A\) has independent columns, then \(A^TA\) is invertible.
- How to prove it?
\[\text{Suppose } A^TAx=0; \text{ we want to show } x=0.\]
\[x^TA^TAx=0\]
\[(Ax)^T(Ax) = \lVert Ax\rVert^2 = 0\]
\[Ax=0\]
\[\text{Since } A \text{ has independent columns and } Ax=0, \text{ it follows that } x=0.\]
So the nullspace of \(A^TA\) contains only the zero vector, which means the square matrix \(A^TA\) is invertible.
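A numerical illustration of the statement (the matrices below are arbitrary examples, not from the lecture): with independent columns \(A^TA\) has full rank; with dependent columns it is singular.

```python
import numpy as np

# Independent columns: A^T A is invertible (full rank).
A_indep = np.array([[1.0, 1.0],
                    [1.0, 2.0],
                    [1.0, 3.0]])
print(np.linalg.matrix_rank(A_indep.T @ A_indep))   # 2: full rank, invertible

# Dependent columns (second column is 2x the first): A^T A is singular.
A_dep = np.array([[1.0, 2.0],
                  [1.0, 2.0],
                  [1.0, 2.0]])
print(np.linalg.matrix_rank(A_dep.T @ A_dep))       # 1: not invertible
```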
- Columns are definitely independent if they are perpendicular unit vectors (orthonormal vectors).
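For orthonormal columns, \(Q^TQ=I\), so invertibility is immediate; in that case the normal equations collapse to \(\hat{x}=Q^Tb\). A small sketch with a made-up \(Q\):

```python
import numpy as np

# Two orthonormal columns in R^3.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: Q^T Q = I

# With Q^T Q = I, the normal equations Q^T Q x_hat = Q^T b reduce to x_hat = Q^T b.
b = np.array([1.0, 2.0, 2.0])
print(Q.T @ b)                           # [1. 2.]
```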
TODO
- What is the projection matrix for \(b\) into the left nullspace of \(A\)?