Principal Component Analysis Example
Imagine you have a random vector $X = (X_1, \dots, X_n)^{\top}$ that is normally distributed. This vector has a zero expected value and a positive definite covariance matrix $C$. We can write it as a normal distribution like this:

$$X \sim \mathcal{N}(0, C).$$
Each component $X_i$ represents a characteristic of the process we are observing. In practice, almost all entries of the covariance matrix $C$ are non-zero, which means these characteristics are correlated through the covariances in the off-diagonal elements.
Through principal component analysis, we can determine the main influence factors that affect the process.
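As a minimal NumPy sketch of this setup (the covariance matrix below is made up for illustration), we can draw samples of a zero-mean Gaussian vector whose components are coupled through non-zero off-diagonal covariances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative positive definite covariance matrix; the non-zero
# off-diagonal entries make the components of X correlated.
C = np.array([[4.0, 1.5, 0.5],
              [1.5, 3.0, 1.0],
              [0.5, 1.0, 2.0]])

# Draw samples of the zero-mean normal vector X ~ N(0, C).
X = rng.multivariate_normal(mean=np.zeros(3), cov=C, size=10_000)

# The empirical covariance is close to C, off-diagonal entries included.
print(np.cov(X, rowvar=False).round(2))
```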
Covariance Matrix Diagonalization
To identify the main influence factors, we diagonalize the covariance matrix $C$. Let $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n > 0$ be the eigenvalues of $C$ with corresponding orthonormal eigenvectors $v_1, \dots, v_n$.
Based on the spectral theorem, we can form the diagonal matrix and the eigenvector matrix:

$$D = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \qquad V = (v_1 \mid v_2 \mid \dots \mid v_n).$$
Then the fundamental relationship holds:

$$C = V D V^{\top}, \qquad \text{equivalently} \qquad V^{\top} C V = D.$$
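A short NumPy sketch of this diagonalization, reusing the illustrative covariance matrix from above; `np.linalg.eigh` handles the symmetric eigendecomposition:

```python
import numpy as np

# Illustrative positive definite covariance matrix (same as above).
C = np.array([[4.0, 1.5, 0.5],
              [1.5, 3.0, 1.0],
              [0.5, 1.0, 2.0]])

# eigh is the right tool for symmetric matrices: it returns real
# eigenvalues (in ascending order) and orthonormal eigenvectors.
eigvals, V = np.linalg.eigh(C)

# Reorder so that lambda_1 >= lambda_2 >= ... as in the text.
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

D = np.diag(eigvals)

# The spectral theorem: C = V D V^T, with V orthogonal.
assert np.allclose(C, V @ D @ V.T)
assert np.allclose(V.T @ V, np.eye(3))
```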
Transformation to New Coordinates
With respect to the basis $v_1, \dots, v_n$, new coordinates are defined as $Y = V^{\top} X$. Remarkably, the variables $Y_1, \dots, Y_n$ become independent and normally distributed, each with variance $\lambda_i$:

$$Y_i \sim \mathcal{N}(0, \lambda_i), \qquad i = 1, \dots, n.$$
These variables are called the principal components of $X$. The principal component with the largest variance $\lambda_1$ describes the main influence factor of the observed process.
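The following sketch (again with the illustrative covariance matrix, plus simulated samples) checks this numerically: after the change of coordinates $Y = V^{\top} X$, the empirical covariance of $Y$ is approximately diagonal, with the eigenvalues on its diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative covariance as before, with simulated samples.
C = np.array([[4.0, 1.5, 0.5],
              [1.5, 3.0, 1.0],
              [0.5, 1.0, 2.0]])
X = rng.multivariate_normal(np.zeros(3), C, size=100_000)

# Eigendecomposition with eigenvalues sorted in descending order.
eigvals, V = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# New coordinates Y = V^T X, applied row-wise to the sample matrix.
Y = X @ V

# The covariance of Y is approximately diag(lambda_1, ..., lambda_n):
# the principal components are uncorrelated, and (being jointly
# Gaussian) therefore independent.
print(np.cov(Y, rowvar=False).round(2))
print(eigvals.round(2))
```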
By analogy, think of observing cloud movement in the sky. Many factors affect the clouds' motion, but westerly winds might have the greatest influence. The first principal component is like that dominant wind direction: the single factor that contributes most to the overall movement pattern.
Geometric Visualization
Geometrically, principal component analysis can be understood as finding the optimal directions for representing data. Imagine data scattered like a cloud of points in two-dimensional space: the principal components point in the directions where the data has maximum variability.
In the visualization above, Variable 1 and Variable 2 represent the original coordinates of your data. Meanwhile, Factor 1 and Factor 2 show the new principal component directions. Notice how the factor directions are not aligned with the original axes, but rather follow the actual data distribution pattern.
Factor 1 shows the direction with the greatest variability in the data, while Factor 2 shows the direction with the second greatest variability, perpendicular to Factor 1. This transformation helps us understand the structure of the data, because the principal components capture the variability patterns that are actually present in it.
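As a sketch of this two-dimensional picture (with a made-up covariance), the principal directions of a scattered point cloud can be computed from its sample covariance; the larger eigenvalue identifies Factor 1:

```python
import numpy as np

rng = np.random.default_rng(1)

# A correlated two-dimensional point cloud (illustrative covariance).
C = np.array([[3.0, 1.2],
              [1.2, 1.0]])
pts = rng.multivariate_normal(np.zeros(2), C, size=500)

# Principal directions are the eigenvectors of the sample covariance;
# the eigenvalues rank them by the variance they capture.
eigvals, V = np.linalg.eigh(np.cov(pts, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

print("Factor 1 direction:", V[:, 0].round(3), "variance:", eigvals[0].round(3))
print("Factor 2 direction:", V[:, 1].round(3), "variance:", eigvals[1].round(3))
```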