Example Problems

Calculation

Given a basis $\mathcal{B} = \{\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n\}$, which spans a subspace $V$, we want to find an orthogonal basis $\mathcal{U} = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ that also spans $V$.

  1. Base case: We let $\vec{u}_1 = \vec{b}_1$. If $V$ is a one-dimensional subspace, then we are done.

  2. Iterative case: We take each vector from $\mathcal{B}$ one by one and project it onto the existing orthogonal basis vectors. For $\vec{b}_2$, since we only have $\vec{u}_1 = \vec{b}_1$ in our orthogonal basis so far, we only do one projection.

    $$\text{proj}_U \vec{b}_2 = \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1$$

    We know that the error vector will be orthogonal to our existing orthogonal basis, so we subtract the projection from the $\vec{b}$ vector.

    $$\vec{u}_2 = \vec{b}_2 - \text{proj}_U \vec{b}_2 = \vec{b}_2 - \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1$$

    For each successive $\vec{b}$ vector, we repeat this algorithm. Therefore,

    $$\vec{u}_3 = \vec{b}_3 - \text{proj}_U \vec{b}_3 = \vec{b}_3 - \frac{\vec{b}_3 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 - \frac{\vec{b}_3 \cdot \vec{u}_2}{\vec{u}_2 \cdot \vec{u}_2}\vec{u}_2$$

    $$\vdots$$

    $$\vec{u}_n = \vec{b}_n - \text{proj}_U \vec{b}_n = \vec{b}_n - \frac{\vec{b}_n \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 - \cdots - \frac{\vec{b}_n \cdot \vec{u}_{n-1}}{\vec{u}_{n-1} \cdot \vec{u}_{n-1}}\vec{u}_{n-1}$$

Notes:

1. Gram-Schmidt turns a set of vectors into an orthogonal basis. Since basis vectors have to be linearly independent, Gram-Schmidt will return $\vec{0}$ for any vector that is linearly dependent on the vectors before it. Intuitively, since the linearly dependent vector is already in the subspace, it is equal to its projection, so the error vector is $\vec{0}$.
2. Order matters. You will get a different basis depending on the order of the $\vec{b}$ vectors on which you perform Gram-Schmidt.
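
The procedure above translates almost line for line into code. Below is a minimal sketch in Python with NumPy; the function name `gram_schmidt` and the tolerance `tol` are our own choices, not anything standard. Per note 1, it discards any vector whose error comes out as (numerically) $\vec{0}$.

```python
import numpy as np

def gram_schmidt(B, tol=1e-12):
    """Return an orthogonal basis for the span of the columns of B.

    A column that is linearly dependent on earlier columns equals its
    projection, so its error vector is ~0 and we discard it (note 1).
    """
    U = []
    for b in B.T:                       # take each b vector in order (note 2)
        u = b.astype(float)
        for v in U:                     # subtract the projection onto each
            u -= (b @ v) / (v @ v) * v  # existing orthogonal basis vector
        if np.linalg.norm(u) > tol:     # keep only nonzero error vectors
            U.append(u)
    return np.column_stack(U)
```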

Example

Find an orthonormal basis for $\mathcal{B} = \left\{\begin{bmatrix}1 \\ 1\end{bmatrix}, \begin{bmatrix}2 \\ 1\end{bmatrix}\right\}$.

Let $\mathcal{U} = \{\vec{u}_1, \vec{u}_2\}$ be the orthogonal basis. Then, $\vec{u}_1 = \vec{b}_1 = \begin{bmatrix}1 \\ 1\end{bmatrix}$.

$$\vec{u}_2 = \vec{b}_2 - \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 = \begin{bmatrix}2 \\ 1\end{bmatrix} - \frac{\begin{bmatrix}2 \\ 1\end{bmatrix} \cdot \begin{bmatrix}1 \\ 1\end{bmatrix}}{\begin{bmatrix}1 \\ 1\end{bmatrix} \cdot \begin{bmatrix}1 \\ 1\end{bmatrix}}\begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}2 \\ 1\end{bmatrix} - \frac{3}{2}\begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}$$

Sanity check:

$$\vec{u}_1 \cdot \vec{u}_2 = \begin{bmatrix}1 \\ 1\end{bmatrix} \cdot \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix} = 0$$

$$\mathcal{U} = \left\{\begin{bmatrix}1 \\ 1\end{bmatrix}, \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}\right\}$$

However, the question asks for an orthonormal basis, so we normalize each of these vectors:

$$\left\{\begin{bmatrix}\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2}\end{bmatrix}, \begin{bmatrix}\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2}\end{bmatrix}\right\}$$
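
As a quick check, running the `gram_schmidt` sketch from earlier on this example reproduces the hand computation:

```python
B = np.array([[1, 2],
              [1, 1]])                 # columns are b1 and b2
U = gram_schmidt(B)                    # columns: [1, 1] and [0.5, -0.5]
Q = U / np.linalg.norm(U, axis=0)      # normalize each column
print(Q)  # columns: [sqrt(2)/2, sqrt(2)/2] and [sqrt(2)/2, -sqrt(2)/2]
```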

System of Equations in Orthogonal Bases

Assume that $\mathcal{U} = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is an orthogonal basis for the vector space $V$. For another vector $\vec{y} \in V$, we want to express $\vec{y}$ as a linear combination of the orthogonal basis vectors and solve $\vec{y} = c_1\vec{u}_1 + c_2\vec{u}_2 + \cdots + c_n\vec{u}_n$.

If we dot both sides with $\vec{u}_i$, $1 \leq i \leq n$, then we get

$$\vec{y} \cdot \vec{u}_i = c_1\vec{u}_1 \cdot \vec{u}_i + c_2\vec{u}_2 \cdot \vec{u}_i + \cdots + c_n\vec{u}_n \cdot \vec{u}_i$$

However, remember that since $\mathcal{U}$ is an orthogonal basis, $\vec{u}_j \cdot \vec{u}_k = 0$ if $j \neq k$. Therefore, the above equation becomes

$$\vec{y} \cdot \vec{u}_i = c_i\vec{u}_i \cdot \vec{u}_i$$

Solving for the coefficient $c_i$,

$$c_i = \frac{\vec{y} \cdot \vec{u}_i}{\vec{u}_i \cdot \vec{u}_i}$$

Therefore, when the basis is orthogonal, we only need to find the projection of the vector $\vec{y}$ onto each basis vector, rather than solving a system of equations.
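
In code, this means the coordinates fall out of $n$ dot products, with no system of equations to solve. A small sketch along the same NumPy lines (the helper name `coords_in_orthogonal_basis` is hypothetical), using the orthogonal basis from the example above:

```python
import numpy as np

def coords_in_orthogonal_basis(U, y):
    """c_i = (y . u_i) / (u_i . u_i) for each column u_i of U."""
    return np.array([(y @ u) / (u @ u) for u in U.T])

U = np.array([[1.0, 0.5],
              [1.0, -0.5]])            # columns are the orthogonal basis vectors
y = np.array([3.0, 1.0])
c = coords_in_orthogonal_basis(U, y)
print(c)                               # [2. 2.]
print(np.allclose(U @ c, y))           # True: y = c1*u1 + c2*u2
```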

QR Factorization

What and why

QR Factorization is essentially using Gram-Schmidt to find an orthonormal basis spanning the column space of a matrix $\textbf{A}$ and rewriting the Gram-Schmidt process in matrix form. QR Factorization produces an orthogonal matrix $\textbf{Q}$ (remember that an orthogonal matrix has orthonormal columns) and an upper triangular matrix $\textbf{R}$, where $\textbf{A} = \textbf{Q}\textbf{R}$.

Calculation

The most straightforward way to calculate $\textbf{Q}$ and $\textbf{R}$ is to first calculate $\textbf{Q}$ using Gram-Schmidt and then normalize the result. To find $\textbf{R}$, we can use the property that $\textbf{Q}$ has orthonormal columns, i.e. $\textbf{Q}^T\textbf{Q} = \textbf{I}$.

$$\textbf{A} = \textbf{Q}\textbf{R}$$

$$\textbf{Q}^T\textbf{A} = \textbf{Q}^T\textbf{Q}\textbf{R}$$

$$\textbf{R} = \textbf{Q}^T\textbf{A}$$

The other way (which will not be discussed here) is to take the Gram-Schmidt equations and rewrite them in matrix form. When we then normalize the columns of $\textbf{Q}$ (i.e. divide each column by its length), we multiply the corresponding row of $\textbf{R}$ by the length of that $\vec{q}$ vector.
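
Putting the two steps together gives a short sketch (reusing the `gram_schmidt` function from earlier; the name `qr_via_gram_schmidt` is our own). It assumes the columns of $\textbf{A}$ are linearly independent so that no vectors are dropped:

```python
def qr_via_gram_schmidt(A):
    """Factor A = QR with Q orthonormal and R upper triangular."""
    U = gram_schmidt(A)                 # orthogonal columns
    Q = U / np.linalg.norm(U, axis=0)   # normalize each column
    R = Q.T @ A                         # upper triangular by construction
    return Q, R
```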

Example

Using the example from above, where $\textbf{A} = \begin{bmatrix}1 & 2 \\ 1 & 1\end{bmatrix}$, we know that $\textbf{Q} = \begin{bmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{bmatrix}$.

To find $\textbf{R}$,

$$\textbf{R} = \textbf{Q}^T\textbf{A} = \begin{bmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{bmatrix}\begin{bmatrix}1 & 2 \\ 1 & 1\end{bmatrix} = \begin{bmatrix}\sqrt{2} & \frac{3\sqrt{2}}{2} \\ 0 & \frac{\sqrt{2}}{2}\end{bmatrix}$$

Note that $\textbf{R}$ is upper triangular. Check for yourself that $\textbf{A} = \textbf{Q}\textbf{R}$.
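
The same check can be done numerically with the sketch above (note that library routines such as `numpy.linalg.qr` may return $\textbf{Q}$ and $\textbf{R}$ with some signs flipped, since the factorization is only unique up to the signs of the columns of $\textbf{Q}$):

```python
A = np.array([[1.0, 2.0],
              [1.0, 1.0]])
Q, R = qr_via_gram_schmidt(A)
print(np.allclose(Q @ R, A))            # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q has orthonormal columns
print(np.allclose(np.tril(R, -1), 0))   # True: R is upper triangular
```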

Least Squares using QR Factorization

QR Factorization is helpful when solving the least squares equation $\textbf{A}^T\textbf{A}\vec{x} = \textbf{A}^T\vec{b}$ because if we can factorize $\textbf{A} = \textbf{Q}\textbf{R}$, we can take advantage of the property that $\textbf{Q}$ has orthonormal columns and that $\textbf{R}$ is invertible (provided the columns of $\textbf{A}$ are linearly independent) and easy to invert because it is upper triangular.

$$\textbf{A}^T\textbf{A}\vec{x} = \textbf{A}^T\vec{b}$$

$$(\textbf{Q}\textbf{R})^T\textbf{Q}\textbf{R}\vec{x} = (\textbf{Q}\textbf{R})^T\vec{b}$$

$$\textbf{R}^T\textbf{Q}^T\textbf{Q}\textbf{R}\vec{x} = \textbf{R}^T\textbf{Q}^T\vec{b}$$

Since $\textbf{Q}^T\textbf{Q} = \textbf{I}$,

$$\textbf{R}^T\textbf{R}\vec{x} = \textbf{R}^T\textbf{Q}^T\vec{b}$$

Multiplying both sides by $(\textbf{R}^T)^{-1}$,

$$\textbf{R}\vec{x} = \textbf{Q}^T\vec{b}$$

$$\vec{x} = \textbf{R}^{-1}\textbf{Q}^T\vec{b}$$
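
In code, the last line should be a back-substitution on the upper triangular $\textbf{R}$ rather than an explicit inverse. A sketch reusing `qr_via_gram_schmidt` (with `scipy.linalg.solve_triangular`, which defaults to upper triangular systems):

```python
from scipy.linalg import solve_triangular

def least_squares_qr(A, b):
    """Solve the least squares problem via A = QR: back-substitute R x = Q^T b."""
    Q, R = qr_via_gram_schmidt(A)
    return solve_triangular(R, Q.T @ b)

A = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [1.0, 3.0]])              # tall A: an overdetermined system
b = np.array([1.0, 2.0, 2.0])
x = least_squares_qr(A, b)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```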
