
Calculation

Let us derive a method for solving for the eigenvalues, eigenvectors, and eigenspaces of a given matrix.

We know that for a given eigenvector $\vec{v}$, $\textbf{A}\vec{v} = \lambda\vec{v}$.

However, we have 2 unknowns: $\vec{v}$ and $\lambda$, so we try to eliminate one of them.

We can rewrite the equation as $\textbf{A}\vec{v} = \lambda\textbf{I}\vec{v}$,

where $\textbf{I}$ is the identity matrix. We then subtract $\textbf{A}\vec{v}$ from both sides:

$$\lambda\textbf{I}\vec{v} - \textbf{A}\vec{v} = \vec{0}$$

Factoring out $\vec{v}$, we get

$$(\lambda\textbf{I} - \textbf{A})\vec{v} = \vec{0}$$

We observe that $\vec{v}$ lies in the null space of the matrix $\lambda\textbf{I} - \textbf{A}$. Since an eigenvector $\vec{v}$ is by definition non-zero, the null space of $\lambda\textbf{I} - \textbf{A}$ must be non-trivial as well.

Recall the Invertible Matrix Theorem, which states that if the null space of a matrix is non-trivial, its determinant must be 0. Therefore, to solve for the eigenvalues, we use the equation $\text{det}(\lambda\textbf{I} - \textbf{A}) = 0$.

$\text{det}(\lambda\textbf{I} - \textbf{A})$ is called the characteristic polynomial of $\textbf{A}$ because it is a polynomial in $\lambda$. To find the eigenvalues of a matrix, we find the roots of the characteristic polynomial, i.e. we solve $\text{det}(\lambda\textbf{I} - \textbf{A}) = 0$.
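
As a quick numerical sanity check of this recipe, here is a short numpy sketch of ours (the $2 \times 2$ matrix is an arbitrary example, not one from these notes). `np.poly` on a square matrix returns the coefficients of $\text{det}(\lambda\textbf{I} - \textbf{A})$, and `np.roots` finds the roots of that polynomial.

```python
import numpy as np

# Arbitrary illustrative matrix; its characteristic polynomial is
# lambda^2 - 4*lambda + 3 = (lambda - 3)(lambda - 1).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

coeffs = np.poly(A)            # [1., -4., 3.], highest power first
print(np.roots(coeffs))        # [3. 1.] -- roots of the characteristic polynomial
print(np.linalg.eigvals(A))    # [3. 1.] -- eigenvalues computed directly
```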

Sometimes, an eigenvalue $\lambda_i$ can appear multiple times as a root. We define the algebraic multiplicity of the eigenvalue $\lambda_i$ to be the number of times $\lambda_i$ appears as a root of the characteristic polynomial.

Notes:

  1. Observe what happens if an eigenvalue of $\textbf{A}$ is 0. Then $\text{det}(0\textbf{I} - \textbf{A}) = \text{det}(-\textbf{A}) = (-1)^n\text{det}(\textbf{A}) = 0 \implies \text{det}(\textbf{A}) = 0$. Therefore, a matrix $\textbf{A}$ has an eigenvalue of 0 if and only if $\textbf{A}$ is non-invertible.

  2. If $\textbf{A}$ is a triangular matrix, we can easily find its eigenvalues. Assume that $d_1, d_2, \ldots, d_n$ are the entries along the diagonal of such a matrix. From cofactor expansion, we know that $\text{det}(\textbf{A}) = d_1d_2\cdots d_n$. Since $\lambda\textbf{I} - \textbf{A}$ is also triangular, with diagonal entries $\lambda - d_1, \lambda - d_2, \ldots, \lambda - d_n$, this means that $\text{det}(\lambda\textbf{I} - \textbf{A}) = (\lambda - d_1)(\lambda - d_2)\cdots(\lambda - d_n) = 0$, so $d_1, d_2, \ldots, d_n$ are the roots of the characteristic polynomial, i.e. the eigenvalues. The eigenvalues of a triangular matrix are its diagonal entries (see the sketch after these notes).
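
A short numpy check of both notes; the matrices below are arbitrary examples of ours, not taken from the notes above.

```python
import numpy as np

# Note 1: a non-invertible (singular) matrix has 0 as an eigenvalue.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # rank 1, so det(S) = 0
print(np.linalg.det(S))             # 0.0 (up to floating-point error)
print(np.linalg.eigvals(S))         # one eigenvalue is 0, the other is 5

# Note 2: the eigenvalues of a triangular matrix are its diagonal entries.
T = np.array([[4.0, 7.0, -2.0],
              [0.0, 5.0,  3.0],
              [0.0, 0.0, -1.0]])
print(np.diag(T))                   # [ 4.  5. -1.]
print(np.linalg.eigvals(T))         # [ 4.  5. -1.] (same values)
```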

Once we have all of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of a matrix $\textbf{A}$, we can find the eigenvectors associated with each eigenvalue by using the equation $(\lambda_i\textbf{I} - \textbf{A})\vec{v} = \vec{0}$. In effect, we are finding basis vectors for the null space of $\lambda_i\textbf{I} - \textbf{A}$ for $i = 1, 2, \ldots, n$.

If the dimension of the null space of $\lambda_i\textbf{I} - \textbf{A}$ for an eigenvalue $\lambda_i$ is larger than 1, we expect multiple linearly independent eigenvectors associated with the same eigenvalue $\lambda_i$.

We therefore define the geometric multiplicity of an eigenvalue $\lambda_i$ to be the dimension of the associated null space of $\lambda_i\textbf{I} - \textbf{A}$.
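
To make this definition concrete, here is a small helper of ours (not part of the notes) that computes the geometric multiplicity numerically as the dimension of $\text{Null}(\lambda\textbf{I} - \textbf{A})$, assuming SciPy's `null_space` is available.

```python
import numpy as np
from scipy.linalg import null_space

def geometric_multiplicity(A, lam):
    """Dimension of Null(lam*I - A), i.e. the geometric multiplicity of lam."""
    n = A.shape[0]
    # Columns of N form an orthonormal basis for the null space.
    N = null_space(lam * np.eye(n) - A)
    return N.shape[1]

# For a scaled identity, every non-zero vector is an eigenvector of
# lam = 3, so the geometric multiplicity is the full dimension 2.
A = np.array([[3.0, 0.0],
              [0.0, 3.0]])
print(geometric_multiplicity(A, 3.0))  # 2
```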

Example

Let us try to find the eigenvalues, eigenvectors, and eigenspaces of the matrix $\textbf{A} = \begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}$.

We first find the roots of the characteristic polynomial.

$$\text{det}(\lambda\textbf{I} - \textbf{A}) = \text{det}\left(\begin{bmatrix}\lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda\end{bmatrix} - \begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}\right) = \text{det}\left(\begin{bmatrix}\lambda - 1 & 1 & -2 \\ 0 & \lambda & -2 \\ 0 & 1 & \lambda - 3\end{bmatrix}\right) = 0$$

Using cofactor expansion along the first column, we get

$$(\lambda - 1)(\lambda(\lambda - 3) + 2) = 0$$

$$(\lambda - 1)(\lambda^2 - 3\lambda + 2) = 0$$

$$(\lambda - 1)(\lambda - 1)(\lambda - 2) = (\lambda - 1)^2(\lambda - 2) = 0$$

We see that $\textbf{A}$ has the eigenvalues $\lambda_1 = 1$ with algebraic multiplicity of 2 and $\lambda_2 = 2$ with algebraic multiplicity of 1.

We then find the basis vectors for the null space of $\lambda_i\textbf{I} - \textbf{A}$ for each eigenvalue.

$\lambda_1 = 1$:

$$\text{Null}(\textbf{I} - \textbf{A}) = \text{Null}\left(\begin{bmatrix}0 & 1 & -2 \\ 0 & 1 & -2 \\ 0 & 1 & -2\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}\right\}$$

Therefore, the eigenvectors associated with $\lambda_1 = 1$ are $\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}$ and $\begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}$. Since there are 2 linearly independent eigenvectors associated with $\lambda_1 = 1$, we say that the eigenvalue $\lambda_1 = 1$ has a geometric multiplicity of 2. The eigenspace corresponding to $\lambda_1 = 1$ is $E_1 = \text{span}\left\{\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}\right\}$.

$\lambda_2 = 2$:

$$\text{Null}(2\textbf{I} - \textbf{A}) = \text{Null}\left(\begin{bmatrix}1 & 1 & -2 \\ 0 & 2 & -2 \\ 0 & 1 & -1\end{bmatrix}\right) = \text{Null}\left(\begin{bmatrix}1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}\right\}$$

$\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}$ is the eigenvector associated with $\lambda_2 = 2$, which has a geometric multiplicity of 1. $E_2 = \text{span}\left\{\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}\right\}$.
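
The whole computation can also be checked symbolically. Below is a short SymPy sketch of ours using the same matrix; `Matrix.eigenvects` returns, for each eigenvalue, its algebraic multiplicity and a basis of the eigenspace. Note that SymPy may choose a different (but equivalent) basis from the one we found by hand.

```python
from sympy import Matrix

A = Matrix([[1, -1, 2],
            [0,  0, 2],
            [0, -1, 3]])

# Each entry is (eigenvalue, algebraic multiplicity, eigenspace basis);
# the geometric multiplicity is the number of basis vectors.
for eigenvalue, alg_mult, basis in A.eigenvects():
    print(eigenvalue, alg_mult, len(basis))
# 1 2 2   -> lambda_1 = 1: algebraic and geometric multiplicity 2
# 2 1 1   -> lambda_2 = 2: algebraic and geometric multiplicity 1
```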

Diagonalization

What and why

Imagine that a matrix $\textbf{A}$ represents a state transition matrix and we want to find the steady state. We would have to calculate $\lim_{k \to \infty} \textbf{A}^k$, but matrix multiplication is a very tedious process. If we could express $\textbf{A}$ as a product of 3 matrices, $\textbf{P}\textbf{D}\textbf{P}^{-1}$, where $\textbf{P}$ is an invertible matrix and $\textbf{D}$ is a diagonal matrix, then we could easily find the limit. Assume $\textbf{A} = \textbf{P}\textbf{D}\textbf{P}^{-1}$, that is, $\textbf{A}$ is diagonalizable.

$$\textbf{A}^k = (\textbf{P}\textbf{D}\textbf{P}^{-1})^k = \textbf{P}\textbf{D}\textbf{P}^{-1}\textbf{P}\textbf{D}\textbf{P}^{-1} \cdots \textbf{P}\textbf{D}\textbf{P}^{-1} = \textbf{P}\textbf{D}(\textbf{P}^{-1}\textbf{P})\textbf{D}(\textbf{P}^{-1}\textbf{P}) \cdots (\textbf{P}^{-1}\textbf{P})\textbf{D}\textbf{P}^{-1} = \textbf{P}\textbf{D}^k\textbf{P}^{-1}$$

Calculating $\textbf{D}^k$ is very easy; in fact, for any diagonal matrix $\textbf{D} = \begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$, $\textbf{D}^k = \begin{bmatrix}d_1^k & 0 & \cdots & 0 \\ 0 & d_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n^k\end{bmatrix}$, so we just need to find the $k$-th power of each diagonal entry in $\textbf{D}$.
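
A tiny numpy illustration of this (our own sketch, with arbitrary diagonal entries): powering the matrix and powering the diagonal entries agree.

```python
import numpy as np

d = np.array([1.0, 2.0, 0.5])        # arbitrary diagonal entries
D = np.diag(d)
k = 10

print(np.linalg.matrix_power(D, k))  # repeated matrix multiplication
print(np.diag(d ** k))               # same matrix via elementwise k-th powers
```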

Calculation

Let $\textbf{A} \in \mathbb{R}^{n \times n}$, $\textbf{D} = \begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$, and $\textbf{P} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}$.

$$\textbf{A} = \textbf{P}\textbf{D}\textbf{P}^{-1}$$

$$\textbf{A}\textbf{P} = \textbf{P}\textbf{D}$$

$$\textbf{A}\begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$$

We then get $n$ equations:

$$\textbf{A}\vec{p}_1 = d_1\vec{p}_1$$

$$\textbf{A}\vec{p}_2 = d_2\vec{p}_2$$

$$\vdots$$

$$\textbf{A}\vec{p}_n = d_n\vec{p}_n$$

We recognize these equations as the definition of eigenvalues and eigenvectors. Therefore, the diagonal matrix $\textbf{D}$ is just a matrix with the eigenvalues of $\textbf{A}$ along its diagonal, and the invertible $\textbf{P}$ is just a matrix with the eigenvectors of $\textbf{A}$ as its column vectors.

If $(\lambda_1, \vec{v}_1), (\lambda_2, \vec{v}_2), \ldots, (\lambda_n, \vec{v}_n)$ are the eigenpairs of the matrix $\textbf{A}$, then $\textbf{D} = \begin{bmatrix}\lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}\vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n\end{bmatrix}$.

Note: Order matters! The eigenvector $\vec{v}_i$ in the $i$-th column of $\textbf{P}$ must correspond to the eigenvalue $\lambda_i$ in the $i$-th diagonal entry of $\textbf{D}$.

We would then need to compute $\textbf{P}^{-1}$, the inverse of $\textbf{P}$.
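
In practice, this whole recipe is a single library call. Here is a numpy sketch of ours using the example matrix from earlier; `np.linalg.eig` returns the eigenvalues and a matrix whose $i$-th column is an eigenvector for the $i$-th eigenvalue, which is exactly the ordering requirement in the note above.

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])

# Columns of P are eigenvectors, matched by index to the eigenvalues.
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# A is recovered as P D P^{-1} (up to floating-point error).
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```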

Let's now see what happens if we want to calculate $\textbf{A}^k$. For this, we will use the row vectors of $\textbf{P}^{-1}$, which we will write as $\textbf{P}^{-1} = \begin{bmatrix}\vec{q}_1^T \\ \vec{q}_2^T \\ \vdots \\ \vec{q}_n^T\end{bmatrix}$. We know that

$$\textbf{A}^k = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k\end{bmatrix}\begin{bmatrix}\vec{q}_1^T \\ \vec{q}_2^T \\ \vdots \\ \vec{q}_n^T\end{bmatrix} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k\vec{q}_1^T \\ \lambda_2^k\vec{q}_2^T \\ \vdots \\ \lambda_n^k\vec{q}_n^T\end{bmatrix}$$

$$\textbf{A}^k = \lambda_1^k\vec{p}_1\vec{q}_1^T + \lambda_2^k\vec{p}_2\vec{q}_2^T + \cdots + \lambda_n^k\vec{p}_n\vec{q}_n^T$$

We can thus express $\textbf{A}^k$ as a sum of matrices $\vec{p}_i\vec{q}_i^T$, each scaled by the respective eigenvalue to the $k$-th power, $\lambda_i^k$. If $|\lambda_i| < 1$, then $\lim_{k \to \infty} \lambda_i^k = 0$, so we can essentially drop the matrix associated with $\lambda_i$ from the above equation because its scaling factor $\lambda_i^k$ approaches 0.
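
Here is a short sketch of this rank-1 expansion (our addition, reusing the example matrix): $\textbf{A}^k$ computed as $\sum_i \lambda_i^k \vec{p}_i\vec{q}_i^T$ matches repeated matrix multiplication.

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])
k = 5

eigenvalues, P = np.linalg.eig(A)
Q = np.linalg.inv(P)                 # rows of Q are the q_i^T vectors

# Sum of rank-1 matrices lambda_i^k * p_i * q_i^T.
A_k = sum(eigenvalues[i] ** k * np.outer(P[:, i], Q[i, :])
          for i in range(len(eigenvalues)))

print(np.allclose(A_k, np.linalg.matrix_power(A, k)))  # True
```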

Example

Taking our previous example, we have the following eigenvalues and eigenvectors.

$\lambda_1 = 1: \begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}$

$\lambda_2 = 2: \begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}$

We can therefore diagonalize $\begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}$ as $\textbf{P}\textbf{D}\textbf{P}^{-1}$, where $\textbf{D} = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}1 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 1 & 1\end{bmatrix}$. Finally, we have to calculate $\textbf{P}^{-1}$.

Also note that there are multiple ways to diagonalize the above matrix. We could just as well have diagonalized it with $\textbf{D} = \begin{bmatrix}1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}1 & 1 & 1 \\ 2 & 1 & 0 \\ 1 & 1 & 0\end{bmatrix}$. Just make sure that the eigenvector in the first column of $\textbf{P}$ is associated with the eigenvalue in the first column of $\textbf{D}$, that the eigenvector in the second column of $\textbf{P}$ is associated with the eigenvalue in the second column of $\textbf{D}$, and so on.
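
Both orderings can be verified numerically. The sketch below (our addition) checks that each $(\textbf{P}, \textbf{D})$ pair from above reconstructs $\textbf{A}$.

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])

# First ordering: eigenvalues (1, 1, 2) with matching eigenvector columns.
D1 = np.diag([1.0, 1.0, 2.0])
P1 = np.array([[1.0, 1.0, 1.0],
               [0.0, 2.0, 1.0],
               [0.0, 1.0, 1.0]])

# Second ordering: eigenvalues (1, 2, 1), with columns permuted to match.
D2 = np.diag([1.0, 2.0, 1.0])
P2 = np.array([[1.0, 1.0, 1.0],
               [2.0, 1.0, 0.0],
               [1.0, 1.0, 0.0]])

print(np.allclose(A, P1 @ D1 @ np.linalg.inv(P1)))  # True
print(np.allclose(A, P2 @ D2 @ np.linalg.inv(P2)))  # True
```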

Diagonalizability

We note that a matrix $\textbf{A} \in \mathbb{R}^{n \times n}$ is diagonalizable only if it is square and has $n$ linearly independent eigenvectors. This leads to two conclusions about the diagonalizability of a matrix $\textbf{A}$:

  1. If $\textbf{A}$ has $n$ distinct eigenvalues, then it must have $n$ linearly independent eigenvectors (eigenvectors associated with distinct eigenvalues are linearly independent), so it is automatically diagonalizable.

  2. If the sum of the geometric multiplicities of all eigenvalues of $\textbf{A}$ equals $n$, i.e. $\textbf{A}$ has $n$ linearly independent eigenvectors, then $\textbf{A}$ is diagonalizable.

Note: Even if the sum of the algebraic multiplicities equals $n$, this does not mean that $\textbf{A}$ has $n$ linearly independent eigenvectors. Take $\textbf{A} = \begin{bmatrix}1 & 1 \\ 0 & 1\end{bmatrix}$ for example. The characteristic equation is $(\lambda - 1)^2 = 0$, so $\lambda_1 = 1$ has an algebraic multiplicity of 2. However, $\text{Null}\left(\begin{bmatrix}0 & 1 \\ 0 & 0\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 0\end{bmatrix}\right\}$, so the geometric multiplicity of $\lambda_1 = 1$ is only 1. In general, $$\text{number of columns in } \textbf{A} \geq \text{sum of algebraic multiplicities} \geq \text{sum of geometric multiplicities}$$
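
The shear matrix from this note makes a good test case for a programmatic diagonalizability check; below is a sketch of ours that computes the geometric multiplicity with numpy via the rank-nullity relation $\dim\text{Null}(\lambda\textbf{I} - \textbf{A}) = n - \text{rank}(\lambda\textbf{I} - \textbf{A})$.

```python
import numpy as np

# The shear matrix from the note: (lambda - 1)^2 = 0, so lambda = 1 has
# algebraic multiplicity 2 ...
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
n = A.shape[0]

# ... but the geometric multiplicity, n - rank(I - A), is only 1, so A
# lacks 2 linearly independent eigenvectors and is not diagonalizable.
geo_mult = n - np.linalg.matrix_rank(np.eye(n) - A)
print(geo_mult)  # 1
```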
