Linear Methods of AI

Matrix Diagonalization

Matrix Diagonalization Concept

In matrix theory, we often seek ways to simplify a matrix's form to make it easier to analyze and compute with. Diagonalization is one of the most powerful techniques for achieving this. Imagine transforming a complicated space into an orderly one in which the dimensions do not interfere with one another.

The main goal of diagonalization is to find a special basis in which the linear transformation $y = A \cdot x$ can be represented by a diagonal matrix $B = S^{-1} \cdot A \cdot S$. If this basis is an orthonormal basis, then the transformation matrix has the property $S^{-1} = S^T$.

Definition of Diagonalization

A matrix $A \in \mathbb{K}^{n \times n}$ is called diagonalizable if it is similar to some diagonal matrix $\Lambda \in \mathbb{K}^{n \times n}$, that is, if there exists an invertible matrix $S \in \mathbb{K}^{n \times n}$ such that:

$$\Lambda = S^{-1} \cdot A \cdot S$$

Basic Conditions for Diagonalization

When can a matrix $A \in \mathbb{K}^{n \times n}$ be diagonalized? The answer is: when we can find a basis of $\mathbb{K}^n$ that consists entirely of eigenvectors $v_1, \ldots, v_n \in \mathbb{K}^n$ of $A$ with corresponding eigenvalues $\lambda_1, \ldots, \lambda_n \in \mathbb{K}$.

The diagonal matrix $\Lambda$ is:

$$\Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} = \text{diag}(\lambda_1, \ldots, \lambda_n)$$

and $S$ is the matrix with columns:

$$S = (v_1 \quad \cdots \quad v_n)$$

If $A$ is diagonalizable, then the columns $v_1, \ldots, v_n$ of $S$ form a basis of eigenvectors. From $\Lambda = S^{-1} \cdot A \cdot S$ we obtain $A \cdot S = S \cdot \Lambda$ and thus $A \cdot v_i = \lambda_i \cdot v_i$ for $i = 1, \ldots, n$.

Conversely, if $v_1, \ldots, v_n$ is a basis of eigenvectors, then $S$ is invertible, and from $A \cdot v_i = \lambda_i \cdot v_i$ for $i = 1, \ldots, n$ we obtain $A \cdot S = S \cdot \Lambda$ and thus $\Lambda = S^{-1} \cdot A \cdot S$.

Example of Non-Diagonalizable Case

Consider the matrix:

$$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$$

This matrix has eigenvalue $\lambda = 1$ with algebraic multiplicity $\mu_A(1) = 2$. The eigenspace is the kernel (null space) of $A - 1 \cdot I$:

$$\text{Eig}_A(1) = \text{Kern}(A - 1 \cdot I) = \text{Kern}\begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = \text{Span}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

which has dimension 1. Since there are no further eigenvalues, there is no basis of $\mathbb{K}^2$ consisting of eigenvectors of $A$, so $A$ is not diagonalizable.
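This failure can also be observed numerically. The sketch below (NumPy assumed) uses the matrix from the example and computes the geometric multiplicity as $n - \text{rank}(A - \lambda \cdot I)$:

```python
import numpy as np

# The matrix from the example: eigenvalue 1 with algebraic
# multiplicity 2 but only a one-dimensional eigenspace.
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

print(np.linalg.eigvals(A))  # both eigenvalues equal 1

# dim Eig_A(1) = dim Kern(A - I) = n - rank(A - I)
geometric = 2 - np.linalg.matrix_rank(A - np.eye(2))
print(geometric)  # 1: fewer than n = 2, so no eigenvector basis exists
```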

Requirements for Matrix Diagonalization

If a matrix $A \in \mathbb{K}^{n \times n}$ is diagonalizable, then the characteristic polynomial $\chi_A(t)$ of $A$ over $\mathbb{K}$ factors into linear factors:

$$\chi_A(t) = (\lambda_1 - t) \cdots (\lambda_n - t)$$

where $\lambda_1, \ldots, \lambda_n \in \mathbb{K}$ are the $n$ eigenvalues of $A$, which need not be distinct.

When all eigenvalues are distinct, the situation is simpler. If $A \in \mathbb{K}^{n \times n}$ and the characteristic polynomial $\chi_A(t)$ of $A$ over $\mathbb{K}$ factors into linear factors:

$$\chi_A(t) = (\lambda_1 - t) \cdots (\lambda_n - t)$$

with pairwise distinct eigenvalues $\lambda_i \neq \lambda_j$ for $i \neq j$, $i, j \in \{1, \ldots, n\}$, then $A$ is certainly diagonalizable.
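A minimal numerical sketch of this criterion, using a hypothetical triangular matrix with three distinct eigenvalues (NumPy assumed):

```python
import numpy as np

# Hypothetical triangular matrix: its eigenvalues are the diagonal
# entries 2, 3, 5, which are pairwise distinct.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

w, S = np.linalg.eig(A)

# Eigenvectors for pairwise distinct eigenvalues are linearly
# independent, so S has full rank and diagonalizes A.
print(np.linalg.matrix_rank(S))                           # 3
print(np.allclose(np.linalg.inv(S) @ A @ S, np.diag(w)))  # True
```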

Why is this so? Because eigenvectors corresponding to pairwise distinct eigenvalues of $A$ are always linearly independent, and hence form a basis of $\mathbb{K}^n$.

But what if $A$ has repeated eigenvalues? Then we must check more carefully. Each eigenvalue has an algebraic multiplicity $\mu_A(\lambda_i)$ and a geometric multiplicity $\dim \text{Eig}_A(\lambda_i)$, related by:

$$\dim \text{Eig}_A(\lambda_i) \leq \mu_A(\lambda_i)$$
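Both multiplicities can be computed numerically. The helper below is our own illustration (the function name and tolerance are assumptions, not from the text): the algebraic multiplicity is counted from the computed spectrum, the geometric multiplicity as $n - \text{rank}(A - \lambda \cdot I)$:

```python
import numpy as np

def multiplicities(A, lam, tol=1e-9):
    """Return (algebraic, geometric) multiplicity of eigenvalue lam.

    Illustrative helper: the algebraic count uses the numerically
    computed spectrum, so tol is a pragmatic choice."""
    n = A.shape[0]
    algebraic = int(np.sum(np.abs(np.linalg.eigvals(A) - lam) < tol))
    # dim Eig_A(lam) = dim Kern(A - lam * I) = n - rank(A - lam * I)
    geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    return algebraic, geometric

# The matrix from the example above: strict inequality 1 < 2.
print(multiplicities(np.array([[1.0, 2.0], [0.0, 1.0]]), 1.0))  # (2, 1)

# The identity matrix: equality 2 = 2, hence diagonalizable.
print(multiplicities(np.eye(2), 1.0))  # (2, 2)
```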

Diagonalization Characterization Theorem

For a matrix $A \in \mathbb{K}^{n \times n}$, the following statements are equivalent:

  1. $A$ is diagonalizable.

  2. Both of the following conditions are satisfied. First, the characteristic polynomial of $A$ must factor into linear factors:

    $$\chi_A(t) = (\lambda_1 - t)^{\mu_A(\lambda_1)} \cdots (\lambda_k - t)^{\mu_A(\lambda_k)}$$

    with pairwise distinct eigenvalues $\lambda_1, \ldots, \lambda_k \in \mathbb{K}$ of $A$. Second, for all eigenvalues of $A$, the algebraic multiplicity must equal the geometric multiplicity:

    $$\mu_A(\lambda_i) = \dim \text{Eig}_A(\lambda_i) \quad \text{for } i = 1, \ldots, k$$

  3. The direct sum of all eigenspaces is the entire vector space:

    $$\text{Eig}_A(\lambda_1) \oplus \cdots \oplus \text{Eig}_A(\lambda_k) = \mathbb{K}^n$$

    This means there exists a basis of $\mathbb{K}^n$ consisting of eigenvectors of $A$.

For each $i = 1, \ldots, k$, let $v_1^{(i)}, \ldots, v_{d_i}^{(i)}$ be a basis of the eigenspace $\text{Eig}_A(\lambda_i)$, where $d_i = \dim \text{Eig}_A(\lambda_i)$. Then:

$$v_1^{(1)}, \ldots, v_{d_1}^{(1)}, v_1^{(2)}, \ldots, v_{d_2}^{(2)}, \ldots, v_1^{(k)}, \ldots, v_{d_k}^{(k)}$$

is a basis of $\mathbb{K}^n$ consisting of eigenvectors of $A$. Therefore, $A$ is diagonalizable.
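This construction can be sketched numerically: compute a basis of each eigenspace, concatenate the bases as the columns of $S$, and check that $S$ diagonalizes $A$. The `eigenspace_basis` helper and the 3x3 example below are ours, not from the text (NumPy assumed):

```python
import numpy as np

def eigenspace_basis(A, lam, tol=1e-9):
    """Orthonormal basis of Eig_A(lam) = Kern(A - lam * I), via SVD.

    Illustrative helper: the rows of Vt whose singular value is
    (numerically) zero span the kernel."""
    _, s, Vt = np.linalg.svd(A - lam * np.eye(A.shape[0]))
    return Vt[s < tol].T  # columns span the eigenspace

# Hypothetical example: eigenvalue 2 with algebraic AND geometric
# multiplicity 2, eigenvalue 5 simple -- so A is diagonalizable.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

# Concatenate the eigenspace bases column by column, as in the theorem.
S = np.hstack([eigenspace_basis(A, 2.0), eigenspace_basis(A, 5.0)])

# The combined columns form a basis of K^3, so S diagonalizes A.
Lam = np.linalg.inv(S) @ A @ S
print(np.round(Lam, 10))  # diag(2, 2, 5)
```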