Linear Methods of AI

Spectral Theorem for Complex Matrices

Orthogonal Diagonalization Characterization

The spectral theorem for complex matrices characterizes exactly when a complex matrix can be diagonalized using an orthonormal basis of eigenvectors. It is a fundamental result in linear algebra because it connects a geometric concept (orthonormal bases) with an algebraic one (normality of the matrix).

For a complex matrix $A \in \mathbb{C}^{n \times n}$, the following conditions are equivalent:

  1. There exists an orthonormal basis of $\mathbb{C}^n$ consisting of eigenvectors of $A$
  2. $A$ is normal, that is, $A \cdot A^H = A^H \cdot A$

This equivalence is profound: it shows that a purely algebraic property (normality) exactly characterizes when orthogonal diagonalization is possible.
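
As a quick numerical illustration of the equivalence, here is a minimal sketch, assuming NumPy and SciPy are available (the matrix `Q` and the random seed are our own choices): a random unitary matrix is normal but generally not Hermitian, and its complex Schur triangularization comes out diagonal, so the Schur vectors form an orthonormal eigenbasis.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)

# A random unitary matrix (QR of a random complex matrix):
# normal by construction, but generally not Hermitian.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5)))
assert np.allclose(Q @ Q.conj().T, Q.conj().T @ Q)   # Q is normal

# Complex Schur decomposition: Q = Z T Z^H with Z unitary, T upper triangular.
T, Z = schur(Q, output='complex')

# For a normal matrix, the triangular factor T is in fact diagonal,
# so the columns of Z are an orthonormal basis of eigenvectors.
print(np.max(np.abs(T - np.diag(np.diag(T)))))  # ~ 1e-15
```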

Proof of the First Direction

Let us prove that if there exists an orthonormal basis of eigenvectors, then the matrix is normal. This is like proving that if a building has a very regular and symmetric structure, then it has special balance properties.

Let $v_1, \ldots, v_n$ be an orthonormal basis consisting of eigenvectors of $A$. For each $j = 1, \ldots, n$ we have $A v_j = \lambda_j v_j$.

Since the basis is orthonormal, i.e. $v_i^H v_j = \delta_{ij}$, we have the following chain of calculations:

$$(A^H v_i)^H v_j = v_i^H (A v_j) = v_i^H (\lambda_j v_j) = \lambda_j v_i^H v_j = \lambda_j \delta_{ij}$$

Expanding $A^H v_i$ in the basis, the coefficient of $v_j$ is $v_j^H A^H v_i = \overline{v_i^H A v_j} = \overline{\lambda_i}\,\delta_{ij}$. This means $A^H v_i = \overline{\lambda_i} v_i$, so that:

$$A A^H v_i = A(\overline{\lambda_i} v_i) = \overline{\lambda_i} \lambda_i v_i = \lambda_i \overline{\lambda_i} v_i = \lambda_i (A^H v_i) = A^H (\lambda_i v_i) = A^H A v_i$$

Since this holds for every vector of a basis, the two matrices agree on all of $\mathbb{C}^n$, so $A A^H = A^H A$, which means $A$ is normal.
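
This direction can be checked numerically. The sketch below (NumPy; the names `V`, `lam`, `A` are our own) builds a matrix from a prescribed orthonormal eigenbasis and arbitrary complex eigenvalues and confirms that it is normal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Orthonormal eigenbasis: the columns of a random unitary matrix V.
V, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

# Arbitrary complex eigenvalues lambda_1, ..., lambda_n.
lam = rng.normal(size=n) + 1j * rng.normal(size=n)

# A is defined by A v_j = lambda_j v_j, i.e. A = V diag(lam) V^H.
A = V @ np.diag(lam) @ V.conj().T

# The proof above predicts that A is normal.
print(np.allclose(A @ A.conj().T, A.conj().T @ A))  # True
```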

Proof from Normality to Diagonalization

Now we prove the opposite direction using mathematical induction. Imagine building a multi-story house: we start from the ground floor and show that each new floor can be built using the results from the floor below.

Mathematical Induction Structure

  1. Base Step: For $n = 0$ (the empty matrix), the statement is trivially true.

  2. Induction Hypothesis: Assume the statement holds for all normal matrices of size $(n-1) \times (n-1)$.

  3. Induction Step: Let $A \in \mathbb{C}^{n \times n}$ be normal. By the fundamental theorem of algebra, the characteristic polynomial of $A$ has a root in $\mathbb{C}$, so there exists an eigenvalue $\lambda_1 \in \mathbb{C}$ of $A$. Let $v_1 \in \mathbb{C}^n$ be a corresponding eigenvector, normalized so that $v_1^H v_1 = 1$.

    We have $A v_1 = \lambda_1 v_1$, and since $A - \lambda_1 I$ is also normal, $\|(A^H - \overline{\lambda_1} I) v_1\| = \|(A - \lambda_1 I) v_1\| = 0$, hence $A^H v_1 = \overline{\lambda_1} v_1$.
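
The fact used in step 3, that an eigenvector of a normal matrix $A$ is also an eigenvector of $A^H$ with the conjugated eigenvalue, can be checked with a small NumPy sketch (the test matrix is constructed as in the previous sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# A normal test matrix with known eigenpairs.
V, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
lam = rng.normal(size=n) + 1j * rng.normal(size=n)
A = V @ np.diag(lam) @ V.conj().T

lam1, v1 = lam[0], V[:, 0]
assert np.allclose(A @ v1, lam1 * v1)            # A v_1 = lambda_1 v_1

# For a normal matrix, v_1 is also an eigenvector of A^H,
# with the conjugated eigenvalue.
print(np.allclose(A.conj().T @ v1, np.conj(lam1) * v1))  # True
```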

Formation of Invariant Subspace

Complete $v_1$ to an orthonormal basis $v_1, v_2, \ldots, v_n$ of $\mathbb{C}^n$ and define:

$$W = \text{span}(v_2, \ldots, v_n)$$

Here the span of the vectors $v_2, \ldots, v_n$ is the set of all linear combinations of those vectors. In other words, $W$ contains all vectors that can be written as $a_2 v_2 + a_3 v_3 + \cdots + a_n v_n$ with $a_2, a_3, \ldots, a_n \in \mathbb{C}$.

For every $w \in W$, we have:

$$(Aw)^H v_1 = w^H (A^H v_1) = w^H (\overline{\lambda_1} v_1) = \overline{\lambda_1}\, w^H v_1 = 0$$

So $Aw$ is orthogonal to $v_1$ and therefore lies in $W$, which means:

$$A(W) = \{Aw \mid w \in W\} \subset W$$

In other words, $W$ is a subspace that is invariant (unchanging) under the transformation $A$. This is like a closed pond: the fish swimming in it never leave it.
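
A small numerical check of this invariance (NumPy; the test matrix is again built from a known eigenbasis): a vector $w$ orthogonal to $v_1$ stays orthogonal to $v_1$ after applying $A$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# Normal test matrix with known orthonormal eigenbasis (columns of V).
V, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
lam = rng.normal(size=n) + 1j * rng.normal(size=n)
A = V @ np.diag(lam) @ V.conj().T
v1 = V[:, 0]

# A random vector in W = span(v_2, ..., v_n):
# project the v_1 component out of a random vector.
w = rng.normal(size=n) + 1j * rng.normal(size=n)
w -= (v1.conj() @ w) * v1
assert abs(v1.conj() @ w) < 1e-12     # w is orthogonal to v_1 ...

print(abs(v1.conj() @ (A @ w)))       # ... and so is Aw: A(W) stays inside W
```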

Unitary Transformation and Preservation of Normality

Let $T \in \mathbb{C}^{n \times n}$ be the unitary matrix with columns $v_1, \ldots, v_n$. Then $T \cdot T^H = T \cdot T^{-1} = I$.

Because $A v_1 = \lambda_1 v_1$ and $A^H v_1 = \overline{\lambda_1} v_1$, this transformation gives an elegant block form:

$$T^H \cdot A \cdot T = T^{-1} \cdot A \cdot T = \begin{pmatrix} \lambda_1 & 0^T \\ 0 & A' \end{pmatrix}$$

with $A' \in \mathbb{C}^{(n-1) \times (n-1)}$.

Unitary transformations preserve normality, like a mirror that preserves the shape of objects. Indeed:

$$\begin{aligned}
(T^H \cdot A \cdot T) \cdot (T^H \cdot A \cdot T)^H &= T^H \cdot A \cdot T \cdot T^H \cdot A^H \cdot T \\
&= T^H \cdot A \cdot A^H \cdot T \\
&= T^H \cdot A^H \cdot A \cdot T \\
&= (T^H \cdot A \cdot T)^H \cdot (T^H \cdot A \cdot T)
\end{aligned}$$

Therefore the block matrix $T^H \cdot A \cdot T$ is also normal, and comparing the lower-right blocks on both sides of this identity shows that $A'$ is also normal.
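
The deflation step can be reproduced numerically. This sketch (NumPy; completing $v_1$ by a QR factorization of $v_1$ together with random columns is one of several possible choices) exhibits the block form of $T^H A T$ and the normality of $A'$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

# Normal test matrix with a known eigenpair (lam[0], v1).
V, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
lam = rng.normal(size=n) + 1j * rng.normal(size=n)
A = V @ np.diag(lam) @ V.conj().T
v1 = V[:, 0]

# Complete v1 to a unitary T: QR of [v1 | random columns].
# The first column of T spans the same line as v1, which is all we need.
M = np.column_stack([v1, rng.normal(size=(n, n - 1)) + 1j * rng.normal(size=(n, n - 1))])
T, _ = np.linalg.qr(M)

B = T.conj().T @ A @ T          # the block form (lambda_1, 0^T; 0, A')
print(np.round(B[0, 1:], 12))   # rest of the first row: zeros
print(np.round(B[1:, 0], 12))   # rest of the first column: zeros

Aprime = B[1:, 1:]              # A' inherits normality
print(np.allclose(Aprime @ Aprime.conj().T, Aprime.conj().T @ Aprime))  # True
```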

Construction of Complete Orthonormal Basis

Now we arrive at the final stage, like assembling a puzzle that is almost complete. By the induction hypothesis, there exists an orthonormal basis $v'_2, \ldots, v'_n$ of eigenvectors of $A'$. Let $S' \in \mathbb{C}^{(n-1) \times (n-1)}$ be the unitary matrix with these columns, so that:

$$S'^H \cdot A' \cdot S' = \begin{pmatrix} \lambda_2 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$$

with the eigenvalues $\lambda_2, \ldots, \lambda_n$ of $A'$.

Now we define:

$$S = T \cdot \begin{pmatrix} 1 & 0^T \\ 0 & S' \end{pmatrix}$$

Then $S \in \mathbb{C}^{n \times n}$ is a unitary matrix, being a product of unitary matrices, and:

$$\begin{aligned}
S^H \cdot A \cdot S &= \begin{pmatrix} 1 & 0^T \\ 0 & S'^H \end{pmatrix} \cdot T^H \cdot A \cdot T \cdot \begin{pmatrix} 1 & 0^T \\ 0 & S' \end{pmatrix} \\
&= \begin{pmatrix} \lambda_1 & 0^T \\ 0 & S'^H A' S' \end{pmatrix} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}
\end{aligned}$$

The columns of $S$ form an orthonormal basis of eigenvectors of the matrix $A$, which completes the induction.
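
The whole induction translates directly into a recursive construction. The sketch below is pedagogical, not how production libraries diagonalize normal matrices (they use specialized routines such as `np.linalg.eigh` for the Hermitian case, or a Schur decomposition); the function name `unitary_diagonalize` and the random basis completion are our own choices.

```python
import numpy as np

def unitary_diagonalize(A, rng):
    """Return unitary S with S^H A S diagonal, assuming A is normal.

    Mirrors the induction: peel off one eigenpair, deflate to an
    (n-1) x (n-1) normal block A', and recurse on A'.
    """
    n = A.shape[0]
    if n == 0:                       # base step: empty matrix
        return A.copy()
    # One eigenpair exists by the fundamental theorem of algebra.
    eigvals, eigvecs = np.linalg.eig(A)
    v1 = eigvecs[:, 0] / np.linalg.norm(eigvecs[:, 0])
    # Complete v1 to a unitary T via QR on random columns.
    M = np.column_stack([v1, rng.normal(size=(n, n - 1)) + 1j * rng.normal(size=(n, n - 1))])
    T, _ = np.linalg.qr(M)
    B = T.conj().T @ A @ T           # block form (lambda_1, 0^T; 0, A')
    Sprime = unitary_diagonalize(B[1:, 1:], rng)   # induction hypothesis
    block = np.eye(n, dtype=complex)
    block[1:, 1:] = Sprime
    return T @ block                 # S = T * (1, 0^T; 0, S')

# Usage: diagonalize a random unitary (hence normal) matrix.
rng = np.random.default_rng(5)
A, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
S = unitary_diagonalize(A, rng)
D = S.conj().T @ A @ S
print(np.allclose(S @ S.conj().T, np.eye(4)))   # S is unitary
print(np.allclose(D, np.diag(np.diag(D))))      # S^H A S is diagonal
```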

Real Matrix Case

For real matrices the situation is slightly different, like the difference between drawing on a flat canvas (real) and drawing in three-dimensional space (complex). A real matrix $A \in \mathbb{R}^{n \times n}$ is called normal if:

$$A \cdot A^T = A^T \cdot A$$

Normal real matrices are a special case of normal complex matrices, namely those with real entries, so the same properties carry over. The most important examples are symmetric and orthogonal matrices.

Symmetric and orthogonal matrices are always normal. For a symmetric matrix ($A^T = A$) it is clear that:

$$A \cdot A^T = A \cdot A = A^T \cdot A$$

For an orthogonal matrix ($A^T = A^{-1}$), we have:

$$A \cdot A^T = A \cdot A^{-1} = I = A^{-1} \cdot A = A^T \cdot A$$

However, not all normal real matrices have real eigenvalues. For example, the rotation matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ is orthogonal, hence normal, but its eigenvalues are $\pm i$. In the real spectral theorem the existence of real eigenvalues is therefore not guaranteed, and orthogonal diagonalization over $\mathbb{R}$ is not always possible.
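
A short NumPy check of these last points: a plane rotation and a symmetric matrix are both normal, but the rotation's eigenvalues are the complex pair $\pm i$, so it admits no real eigenbasis.

```python
import numpy as np

# A plane rotation by 90 degrees: orthogonal, hence normal.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.allclose(R @ R.T, R.T @ R))          # True

# A random symmetric matrix is normal as well.
B = np.random.default_rng(6).normal(size=(3, 3))
Sym = B + B.T
print(np.allclose(Sym @ Sym.T, Sym.T @ Sym))  # True

# But R has no real eigenvalues: they are the complex pair +/- i,
# so R cannot be orthogonally diagonalized over the reals.
print(np.linalg.eigvals(R))                   # [0.+1.j 0.-1.j]
```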