What is Normal Equation System
Imagine you have a complex optimization problem, but it turns out there's an elegant shortcut. In least squares problems, instead of doing direct optimization, we can transform it into a system of equations that's easier to solve.
When we want to minimize
it turns out the solution can be found by solving the following system of equations
This equation is called the normal equation system because it involves the concept of orthogonality or "normal" (perpendicular) state in vector space.
Fundamental Relationship
There's a very interesting relationship between the minimization problem and this normal equation system. Vector is a solution to the least squares problem if and only if that vector satisfies the normal equation system.
In other words, finding that makes minimal is exactly the same as finding that satisfies .
Why This System Works
To understand why this relationship holds, we need to look at it from a geometric perspective.
When gives the minimum value for , then the error vector must be orthogonal to all vectors in the column space of matrix .
This column space consists of all vectors that can be written as for . The orthogonality condition means
for every vector . Using the properties of inner product, we can write
Since this relationship must hold for all vectors , then
This is what gives us the normal equation system.
Proof Using Pythagorean Theorem
We can also verify this result in a different way. Let be the solution of the normal equation system and be any vector in .
Using the Pythagorean theorem, we can write
Since satisfies the normal equation system, then and the squared norm is always non-negative. Therefore
This inequality proves that indeed gives the minimum value.
When Normal Equation System Can Be Solved
Not all normal equation systems can be solved easily. There are special conditions that must be met.
For matrix with , the symmetric matrix can be inverted if and only if matrix has full rank, that is .
This condition is very important because it determines whether the normal equation system has a unique solution. When can be inverted, the solution can be written explicitly as
Proof of Invertible Condition
To understand when can be inverted, we need to look at the relationship between null space (kernel) and rank.
If can be inverted, then the null space of contains only the zero vector. Since the null space of includes the null space of , then also has only the zero vector in its null space. This means .
Conversely, if , then the equation has only the solution . To see that can be inverted, note that if , then
Since the inner product is only zero when , and we know this only happens when , then is indeed invertible.
Positive Definite Property
More than just being invertible, matrix has a special property. When and , we have and
This shows that is a positive definite matrix. This property guarantees that the normal equation system not only has a unique solution, but is also numerically stable when solved with computational methods. Algorithms like Cholesky decomposition can be safely used to solve this system.