Example Problems

Calculation

Given a basis $\mathcal{B} = \{\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n\}$, which spans a subspace $V$, we want to find an orthogonal basis $\mathcal{U} = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ that also spans $V$.

  1. Base case: We let $\vec{u}_1 = \vec{b}_1$. If $V$ is a one-dimensional subspace, then we are done.

  2. Iterative case: We take each vector from $\mathcal{B}$ one by one and project it onto the existing orthogonal basis vectors. For $\vec{b}_2$, since we only have $\vec{u}_1 = \vec{b}_1$ in our orthogonal basis so far, we only do one projection.

    $$\text{proj}_U \vec{b}_2 = \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1$$

    We know that the error vector will be orthogonal to our existing orthogonal basis, so we subtract the projection from the $\vec{b}$ vector.

    $$\vec{u}_2 = \vec{b}_2 - \text{proj}_U \vec{b}_2 = \vec{b}_2 - \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1$$

    For each successive $\vec{b}$ vector, we repeat this algorithm. Therefore,

    $$\vec{u}_3 = \vec{b}_3 - \text{proj}_U \vec{b}_3 = \vec{b}_3 - \frac{\vec{b}_3 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 - \frac{\vec{b}_3 \cdot \vec{u}_2}{\vec{u}_2 \cdot \vec{u}_2}\vec{u}_2$$

    $$\vdots$$

    $$\vec{u}_n = \vec{b}_n - \text{proj}_U \vec{b}_n = \vec{b}_n - \frac{\vec{b}_n \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 - \cdots - \frac{\vec{b}_n \cdot \vec{u}_{n-1}}{\vec{u}_{n-1} \cdot \vec{u}_{n-1}}\vec{u}_{n-1}$$

Notes:

1. Gram-Schmidt turns a set of vectors into an orthogonal basis. Since basis vectors have to be linearly independent, Gram-Schmidt will return $\vec{0}$ for any vector that is linearly dependent on the vectors before it. Intuitively, since the linearly dependent vector is already in the subspace, it is equal to its projection, so the error vector is $\vec{0}$.
2. Order matters. You will get a different basis depending on the order of the $\vec{b}$ vectors on which you perform Gram-Schmidt.
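
The procedure above translates almost line for line into code. Below is a minimal sketch in Python with NumPy; the function name `gram_schmidt` and the tolerance `tol` are our own choices, not anything standard. Per note 1, it discards any vector whose error comes out as (numerically) $\vec{0}$.

```python
import numpy as np

def gram_schmidt(B, tol=1e-12):
    """Return an orthogonal basis for the span of the columns of B.

    A column that is linearly dependent on earlier columns equals its
    projection, so its error vector is ~0 and we discard it (note 1).
    """
    U = []
    for b in B.T:                       # take each b vector in order (note 2)
        u = b.astype(float)
        for v in U:                     # subtract the projection onto each
            u -= (b @ v) / (v @ v) * v  # existing orthogonal basis vector
        if np.linalg.norm(u) > tol:     # keep only nonzero error vectors
            U.append(u)
    return np.column_stack(U)
```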

Example

Find an orthonormal basis for $\mathcal{B} = \left\{\begin{bmatrix}1 \\ 1\end{bmatrix}, \begin{bmatrix}2 \\ 1\end{bmatrix}\right\}$.

Let $\mathcal{U} = \{\vec{u}_1, \vec{u}_2\}$ be the orthogonal basis. Then, $\vec{u}_1 = \vec{b}_1 = \begin{bmatrix}1 \\ 1\end{bmatrix}$.

$$\vec{u}_2 = \vec{b}_2 - \frac{\vec{b}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1}\vec{u}_1 = \begin{bmatrix}2 \\ 1\end{bmatrix} - \frac{\begin{bmatrix}2 \\ 1\end{bmatrix} \cdot \begin{bmatrix}1 \\ 1\end{bmatrix}}{\begin{bmatrix}1 \\ 1\end{bmatrix} \cdot \begin{bmatrix}1 \\ 1\end{bmatrix}}\begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}2 \\ 1\end{bmatrix} - \frac{3}{2}\begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}$$

Sanity check:

$$\vec{u}_1 \cdot \vec{u}_2 = \begin{bmatrix}1 \\ 1\end{bmatrix} \cdot \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix} = 0$$

$$\mathcal{U} = \left\{\begin{bmatrix}1 \\ 1\end{bmatrix}, \begin{bmatrix}\frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}\right\}$$

However, the question asks for an orthonormal basis, so we normalize each of these vectors:

$$\left\{\begin{bmatrix}\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2}\end{bmatrix}, \begin{bmatrix}\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2}\end{bmatrix}\right\}$$
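
As a quick check, running the `gram_schmidt` sketch from earlier on this example reproduces the hand computation:

```python
B = np.array([[1, 2],
              [1, 1]])                 # columns are b1 and b2
U = gram_schmidt(B)                    # columns: [1, 1] and [0.5, -0.5]
Q = U / np.linalg.norm(U, axis=0)      # normalize each column
print(Q)  # columns: [sqrt(2)/2, sqrt(2)/2] and [sqrt(2)/2, -sqrt(2)/2]
```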

System of Equations in Orthogonal Bases

Assume that $\mathcal{U} = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is an orthogonal basis for the vector space $V$. For another vector $\vec{y} \in V$, we want to express $\vec{y}$ as a linear combination of the orthogonal basis vectors and solve $\vec{y} = c_1\vec{u}_1 + c_2\vec{u}_2 + \cdots + c_n\vec{u}_n$.

If we dot both sides with $\vec{u}_i$, $1 \leq i \leq n$, then we get

$$\vec{y} \cdot \vec{u}_i = c_1\vec{u}_1 \cdot \vec{u}_i + c_2\vec{u}_2 \cdot \vec{u}_i + \cdots + c_n\vec{u}_n \cdot \vec{u}_i$$

However, remember that since $\mathcal{U}$ is an orthogonal basis, $\vec{u}_j \cdot \vec{u}_k = 0$ if $j \neq k$. Therefore, the above equation becomes

$$\vec{y} \cdot \vec{u}_i = c_i\vec{u}_i \cdot \vec{u}_i$$

Solving for the coefficient $c_i$,

$$c_i = \frac{\vec{y} \cdot \vec{u}_i}{\vec{u}_i \cdot \vec{u}_i}$$

Therefore, when the basis is orthogonal, we only need to find the projection of the vector $\vec{y}$ onto each basis vector, rather than solving a system of equations.
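
In code, this means the coordinates fall out of $n$ dot products, with no system of equations to solve. A small sketch along the same NumPy lines (the helper name `coords_in_orthogonal_basis` is hypothetical), using the orthogonal basis from the example above:

```python
import numpy as np

def coords_in_orthogonal_basis(U, y):
    """c_i = (y . u_i) / (u_i . u_i) for each column u_i of U."""
    return np.array([(y @ u) / (u @ u) for u in U.T])

U = np.array([[1.0, 0.5],
              [1.0, -0.5]])            # columns are the orthogonal basis vectors
y = np.array([3.0, 1.0])
c = coords_in_orthogonal_basis(U, y)
print(c)                               # [2. 2.]
print(np.allclose(U @ c, y))           # True: y = c1*u1 + c2*u2
```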

QR Factorization

What and why

QR Factorization is essentially using Gram-Schmidt to find an orthonormal basis spanning the column space of a matrix $\textbf{A}$ and rewriting the Gram-Schmidt process in matrix form. QR Factorization produces an orthogonal matrix $\textbf{Q}$ (remember that an orthogonal matrix has orthonormal columns) and an upper triangular matrix $\textbf{R}$, where $\textbf{A} = \textbf{Q}\textbf{R}$.

Calculation

The most straightforward way to calculate $\textbf{Q}$ and $\textbf{R}$ is to first calculate $\textbf{Q}$ using Gram-Schmidt and then normalize the result. To find $\textbf{R}$, we can use the property that $\textbf{Q}$ has orthonormal columns, i.e. $\textbf{Q}^T\textbf{Q} = \textbf{I}$.

$$\textbf{A} = \textbf{Q}\textbf{R}$$

$$\textbf{Q}^T\textbf{A} = \textbf{Q}^T\textbf{Q}\textbf{R}$$

$$\textbf{R} = \textbf{Q}^T\textbf{A}$$

The other way (which will not be discussed here) is to take the Gram-Schmidt equations and rewrite them in matrix form. When we then normalize the columns of $\textbf{Q}$ (i.e. divide each column by its length), we multiply the corresponding row of $\textbf{R}$ by the length of that $\vec{q}$ vector.
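
Putting the two steps together gives a short sketch (reusing the `gram_schmidt` function from earlier; the name `qr_via_gram_schmidt` is our own). It assumes the columns of $\textbf{A}$ are linearly independent so that no vectors are dropped:

```python
def qr_via_gram_schmidt(A):
    """Factor A = QR with Q orthonormal and R upper triangular."""
    U = gram_schmidt(A)                 # orthogonal columns
    Q = U / np.linalg.norm(U, axis=0)   # normalize each column
    R = Q.T @ A                         # upper triangular by construction
    return Q, R
```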

Example

Using the example from above, where $\textbf{A} = \begin{bmatrix}1 & 2 \\ 1 & 1\end{bmatrix}$, we know that $\textbf{Q} = \begin{bmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{bmatrix}$.

To find $\textbf{R}$,

$$\textbf{R} = \textbf{Q}^T\textbf{A} = \begin{bmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{bmatrix}\begin{bmatrix}1 & 2 \\ 1 & 1\end{bmatrix} = \begin{bmatrix}\sqrt{2} & \frac{3\sqrt{2}}{2} \\ 0 & \frac{\sqrt{2}}{2}\end{bmatrix}$$

Note that $\textbf{R}$ is upper triangular. Check for yourself that $\textbf{A} = \textbf{Q}\textbf{R}$.
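
The same check can be done numerically with the sketch above (note that library routines such as `numpy.linalg.qr` may return $\textbf{Q}$ and $\textbf{R}$ with some signs flipped, since the factorization is only unique up to the signs of the columns of $\textbf{Q}$):

```python
A = np.array([[1.0, 2.0],
              [1.0, 1.0]])
Q, R = qr_via_gram_schmidt(A)
print(np.allclose(Q @ R, A))            # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q has orthonormal columns
print(np.allclose(np.tril(R, -1), 0))   # True: R is upper triangular
```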

Least Squares using QR Factorization

QR Factorization is helpful when solving the least squares equation $\textbf{A}^T\textbf{A}\vec{x} = \textbf{A}^T\vec{b}$ because if we can factorize $\textbf{A} = \textbf{Q}\textbf{R}$, we can take advantage of the property that $\textbf{Q}$ has orthonormal columns and that $\textbf{R}$ is invertible (provided the columns of $\textbf{A}$ are linearly independent) and easy to invert because it is upper triangular.

$$\textbf{A}^T\textbf{A}\vec{x} = \textbf{A}^T\vec{b}$$

$$(\textbf{Q}\textbf{R})^T\textbf{Q}\textbf{R}\vec{x} = (\textbf{Q}\textbf{R})^T\vec{b}$$

$$\textbf{R}^T\textbf{Q}^T\textbf{Q}\textbf{R}\vec{x} = \textbf{R}^T\textbf{Q}^T\vec{b}$$

Since $\textbf{Q}^T\textbf{Q} = \textbf{I}$,

$$\textbf{R}^T\textbf{R}\vec{x} = \textbf{R}^T\textbf{Q}^T\vec{b}$$

Multiplying both sides by $(\textbf{R}^T)^{-1}$,

$$\textbf{R}\vec{x} = \textbf{Q}^T\vec{b}$$

$$\vec{x} = \textbf{R}^{-1}\textbf{Q}^T\vec{b}$$
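
In code, the last line should be a back-substitution on the upper triangular $\textbf{R}$ rather than an explicit inverse. A sketch reusing `qr_via_gram_schmidt` (with `scipy.linalg.solve_triangular`, which defaults to upper triangular systems):

```python
from scipy.linalg import solve_triangular

def least_squares_qr(A, b):
    """Solve the least squares problem via A = QR: back-substitute R x = Q^T b."""
    Q, R = qr_via_gram_schmidt(A)
    return solve_triangular(R, Q.T @ b)

A = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [1.0, 3.0]])              # tall A: an overdetermined system
b = np.array([1.0, 2.0, 2.0])
x = least_squares_qr(A, b)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```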
