Linear Methods of AI

Matrix Diagonalization

Matrix Diagonalization Concept

In matrix theory, we often seek ways to simplify a matrix's form to make it easier to analyze and compute with. Diagonalization is one of the most powerful techniques for achieving this. Imagine transforming a complicated space into an orderly one in which the dimensions do not interfere with one another.

The main goal of diagonalization is to find a special basis in which the linear transformation $y = A \cdot x$ can be represented by a diagonal matrix $B = S^{-1} \cdot A \cdot S$. If this basis is an orthonormal basis, then the transformation matrix has the property $S^{-1} = S^T$.

Definition of Diagonalization

A matrix $A \in \mathbb{K}^{n \times n}$ is called diagonalizable if it is similar to some diagonal matrix $\Lambda \in \mathbb{K}^{n \times n}$, that is, if there exists an invertible matrix $S \in \mathbb{K}^{n \times n}$ such that:

$$\Lambda = S^{-1} \cdot A \cdot S$$

Basic Conditions for Diagonalization

When can a matrix $A \in \mathbb{K}^{n \times n}$ be diagonalized? The answer is: when we can find a basis of $\mathbb{K}^n$ that consists entirely of eigenvectors $v_1, \ldots, v_n \in \mathbb{K}^n$ of $A$ with corresponding eigenvalues $\lambda_1, \ldots, \lambda_n \in \mathbb{K}$.

The diagonal matrix $\Lambda$ is:

$$\Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} = \text{diag}(\lambda_1, \ldots, \lambda_n)$$

and $S$ is the matrix with columns:

$$S = (v_1 \quad \cdots \quad v_n)$$

If $A$ is diagonalizable, then the columns $v_1, \ldots, v_n$ of $S$ form a basis of eigenvectors. From $\Lambda = S^{-1} \cdot A \cdot S$ we obtain $A \cdot S = S \cdot \Lambda$ and thus $A \cdot v_i = \lambda_i \cdot v_i$ for $i = 1, \ldots, n$.

Conversely, if $v_1, \ldots, v_n$ is a basis of eigenvectors, then $S$ is invertible, and from $A \cdot v_i = \lambda_i \cdot v_i$ for $i = 1, \ldots, n$ we obtain $A \cdot S = S \cdot \Lambda$ and thus $\Lambda = S^{-1} \cdot A \cdot S$.

Example of Non-Diagonalizable Case

Consider the matrix:

$$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$$

This matrix has eigenvalue $\lambda = 1$ with algebraic multiplicity $\mu_A(1) = 2$. The eigenspace is the kernel (null space) of $A - 1 \cdot I$:

$$\text{Eig}_A(1) = \text{Kern}(A - 1 \cdot I) = \text{Kern}\begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = \text{Span}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

which has dimension 1. Since there are no further eigenvalues, there is no basis of $\mathbb{K}^2$ consisting of eigenvectors of $A$, so $A$ is not diagonalizable.
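This failure can also be observed numerically. The sketch below (NumPy assumed) uses the matrix from the example and computes the geometric multiplicity as $n - \text{rank}(A - \lambda \cdot I)$:

```python
import numpy as np

# The matrix from the example: eigenvalue 1 with algebraic
# multiplicity 2 but only a one-dimensional eigenspace.
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

print(np.linalg.eigvals(A))  # both eigenvalues equal 1

# dim Eig_A(1) = dim Kern(A - I) = n - rank(A - I)
geometric = 2 - np.linalg.matrix_rank(A - np.eye(2))
print(geometric)  # 1: fewer than n = 2, so no eigenvector basis exists
```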

Requirements for Matrix Diagonalization

If a matrix $A \in \mathbb{K}^{n \times n}$ is diagonalizable, then the characteristic polynomial $\chi_A(t)$ of $A$ over $\mathbb{K}$ factors into linear factors:

$$\chi_A(t) = (\lambda_1 - t) \cdots (\lambda_n - t)$$

where $\lambda_1, \ldots, \lambda_n \in \mathbb{K}$ are the $n$ eigenvalues of $A$, which need not be distinct.

When all eigenvalues are distinct, the situation is simpler. If $A \in \mathbb{K}^{n \times n}$ and the characteristic polynomial $\chi_A(t)$ of $A$ over $\mathbb{K}$ factors into linear factors:

$$\chi_A(t) = (\lambda_1 - t) \cdots (\lambda_n - t)$$

with pairwise distinct eigenvalues $\lambda_i \neq \lambda_j$ for $i \neq j$, $i, j \in \{1, \ldots, n\}$, then $A$ is certainly diagonalizable.
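A minimal numerical sketch of this criterion, using a hypothetical triangular matrix with three distinct eigenvalues (NumPy assumed):

```python
import numpy as np

# Hypothetical triangular matrix: its eigenvalues are the diagonal
# entries 2, 3, 5, which are pairwise distinct.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

w, S = np.linalg.eig(A)

# Eigenvectors for pairwise distinct eigenvalues are linearly
# independent, so S has full rank and diagonalizes A.
print(np.linalg.matrix_rank(S))                           # 3
print(np.allclose(np.linalg.inv(S) @ A @ S, np.diag(w)))  # True
```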

Why is this so? Because eigenvectors corresponding to pairwise distinct eigenvalues of $A$ are always linearly independent, and hence form a basis of $\mathbb{K}^n$.

But what if $A$ has repeated eigenvalues? Then we must check more carefully. Each eigenvalue has an algebraic multiplicity $\mu_A(\lambda_i)$ and a geometric multiplicity $\dim \text{Eig}_A(\lambda_i)$, related by:

$$\dim \text{Eig}_A(\lambda_i) \leq \mu_A(\lambda_i)$$
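Both multiplicities can be computed numerically. The helper below is our own illustration (the function name and tolerance are assumptions, not from the text): the algebraic multiplicity is counted from the computed spectrum, the geometric multiplicity as $n - \text{rank}(A - \lambda \cdot I)$:

```python
import numpy as np

def multiplicities(A, lam, tol=1e-9):
    """Return (algebraic, geometric) multiplicity of eigenvalue lam.

    Illustrative helper: the algebraic count uses the numerically
    computed spectrum, so tol is a pragmatic choice."""
    n = A.shape[0]
    algebraic = int(np.sum(np.abs(np.linalg.eigvals(A) - lam) < tol))
    # dim Eig_A(lam) = dim Kern(A - lam * I) = n - rank(A - lam * I)
    geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    return algebraic, geometric

# The matrix from the example above: strict inequality 1 < 2.
print(multiplicities(np.array([[1.0, 2.0], [0.0, 1.0]]), 1.0))  # (2, 1)

# The identity matrix: equality 2 = 2, hence diagonalizable.
print(multiplicities(np.eye(2), 1.0))  # (2, 2)
```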

Diagonalization Characterization Theorem

For a matrix $A \in \mathbb{K}^{n \times n}$, the following statements are equivalent:

  1. $A$ is diagonalizable.

  2. Both of the following conditions are satisfied. First, the characteristic polynomial of $A$ must factor into linear factors:

    $$\chi_A(t) = (\lambda_1 - t)^{\mu_A(\lambda_1)} \cdots (\lambda_k - t)^{\mu_A(\lambda_k)}$$

    with pairwise distinct eigenvalues $\lambda_1, \ldots, \lambda_k \in \mathbb{K}$ of $A$. Second, for all eigenvalues of $A$, the algebraic multiplicity must equal the geometric multiplicity:

    $$\mu_A(\lambda_i) = \dim \text{Eig}_A(\lambda_i) \quad \text{for } i = 1, \ldots, k$$

  3. The direct sum of all eigenspaces is the entire vector space:

    $$\text{Eig}_A(\lambda_1) \oplus \cdots \oplus \text{Eig}_A(\lambda_k) = \mathbb{K}^n$$

    This means there exists a basis of $\mathbb{K}^n$ consisting of eigenvectors of $A$.

For each $i = 1, \ldots, k$, let $v_1^{(i)}, \ldots, v_{d_i}^{(i)}$ be a basis of the eigenspace $\text{Eig}_A(\lambda_i)$, where $d_i = \dim \text{Eig}_A(\lambda_i)$. Then:

$$v_1^{(1)}, \ldots, v_{d_1}^{(1)}, v_1^{(2)}, \ldots, v_{d_2}^{(2)}, \ldots, v_1^{(k)}, \ldots, v_{d_k}^{(k)}$$

is a basis of $\mathbb{K}^n$ consisting of eigenvectors of $A$. Therefore, $A$ is diagonalizable.
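This construction can be sketched numerically: compute a basis of each eigenspace, concatenate the bases as the columns of $S$, and check that $S$ diagonalizes $A$. The `eigenspace_basis` helper and the 3x3 example below are ours, not from the text (NumPy assumed):

```python
import numpy as np

def eigenspace_basis(A, lam, tol=1e-9):
    """Orthonormal basis of Eig_A(lam) = Kern(A - lam * I), via SVD.

    Illustrative helper: the rows of Vt whose singular value is
    (numerically) zero span the kernel."""
    _, s, Vt = np.linalg.svd(A - lam * np.eye(A.shape[0]))
    return Vt[s < tol].T  # columns span the eigenspace

# Hypothetical example: eigenvalue 2 with algebraic AND geometric
# multiplicity 2, eigenvalue 5 simple -- so A is diagonalizable.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

# Concatenate the eigenspace bases column by column, as in the theorem.
S = np.hstack([eigenspace_basis(A, 2.0), eigenspace_basis(A, 5.0)])

# The combined columns form a basis of K^3, so S diagonalizes A.
Lam = np.linalg.inv(S) @ A @ S
print(np.round(Lam, 10))  # diag(2, 2, 5)
```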