# Nakafa Framework: LLM
URL: https://nakafa.com/en/subject/university/bachelor/ai-ds/linear-methods/normal-equation-solution
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/subject/university/bachelor/ai-ds/linear-methods/normal-equation-solution/en.mdx
Output docs content for large language models.
---
export const metadata = {
  title: "Normal Equation System Solution",
  description: "Implement normal equation solutions using QR decomposition and pseudoinverse. Learn numerical methods, polynomial fitting, and stability comparisons.",
  authors: [{ name: "Nabil Akbarazzima Fatih" }],
  date: "07/15/2025",
  subject: "Linear Methods of AI",
};
import { getColor } from "@repo/design-system/lib/color";
import { LineEquation } from "@repo/design-system/components/contents/line-equation";
## Fundamental Properties
Normal equation systems $A^T A x = A^T b$ have special characteristics that distinguish them from ordinary linear systems. Imagine trying to find the curve that best represents a set of scattered data points: this system provides the mathematical machinery for finding that optimal solution.
For a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$, the normal equation system $A^T A x = A^T b$ always has a solution. More specifically, this system has a unique solution exactly when matrix $A$ has full rank, that is, when $\operatorname{rank}(A) = n$. Under this condition, the solution can be expressed as

$$x = (A^T A)^{-1} A^T b$$

When matrix $A$ does not have full rank, the solution set of the normal equation system takes the form

$$\hat{x} + \ker(A) = \{\hat{x} + z : z \in \ker(A)\}$$

where $\hat{x}$ is any particular solution of the system.
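To make this concrete, here is a minimal TypeScript sketch of the direct approach: form $A^T A$ and $A^T b$ explicitly and solve the resulting square system with Gaussian elimination. All helper functions are hand-rolled for illustration and are not part of any library.

```typescript
// Solve the normal equations A^T A x = A^T b for a small full-rank example.

function transpose(M: number[][]): number[][] {
  return M[0].map((_, j) => M.map((row) => row[j]));
}

function matMul(P: number[][], Q: number[][]): number[][] {
  return P.map((row) =>
    Q[0].map((_, j) => row.reduce((s, v, k) => s + v * Q[k][j], 0))
  );
}

// Gaussian elimination with partial pivoting on the square system M x = y.
function solveSquare(M: number[][], y: number[]): number[] {
  const n = y.length;
  const a = M.map((row, i) => [...row, y[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let p = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(a[r][col]) > Math.abs(a[p][col])) p = r;
    }
    [a[col], a[p]] = [a[p], a[col]];
    for (let r = col + 1; r < n; r++) {
      const f = a[r][col] / a[col][col];
      for (let c = col; c <= n; c++) a[r][c] -= f * a[col][c];
    }
  }
  const x = new Array(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let s = a[i][n];
    for (let j = i + 1; j < n; j++) s -= a[i][j] * x[j];
    x[i] = s / a[i][i];
  }
  return x;
}

// Fit a line y ≈ x1 + x2·t through three points; A has full rank,
// so x = (A^T A)^{-1} A^T b is the unique solution.
const A = [[1, 0], [1, 1], [1, 2]];
const b = [1, 2, 2];
const At = transpose(A);
const AtA = matMul(At, A);
const Atb = matMul(At, b.map((v) => [v])).map((row) => row[0]);
console.log(solveSquare(AtA, Atb)); // ≈ [1.1667, 0.5]
```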
### Why the System is Always Solvable
The fundamental reason why normal equation systems always have a solution lies in the concept of orthogonal projection. The orthogonal projection of vector $b$ onto the column space $\operatorname{im}(A)$ always exists, and any $\hat{x}$ with $A\hat{x}$ equal to that projection solves the linear least squares problem, which automatically makes it a solution of the normal equation system as well.

To understand why all solutions take the form $\hat{x} + \ker(A)$, suppose $\hat{x}$ is one particular solution and $x$ is any other vector. Then $x$ is a solution to the normal equation system if and only if $A^T A x = A^T b = A^T A \hat{x}$, which is equivalent to $A^T A (x - \hat{x}) = 0$. Because $\|A(x - \hat{x})\|_2^2 = (x - \hat{x})^T A^T A (x - \hat{x})$, this is further equivalent to $A(x - \hat{x}) = 0$, or in other words $x - \hat{x} \in \ker(A)$.
## Moore-Penrose Pseudoinverse
For a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$ and $\operatorname{rank}(A) = n$, we can define the Moore-Penrose pseudoinverse as

$$A^+ = (A^T A)^{-1} A^T \in \mathbb{R}^{n \times m}$$

The Moore-Penrose pseudoinverse functions like the "best inverse" of a non-square matrix. It provides an optimal way to "cancel" linear transformations in the least squares context.
The Moore-Penrose pseudoinverse satisfies the four Penrose axioms that uniquely determine it:

$$
\begin{aligned}
A A^+ A &= A \\
A^+ A A^+ &= A^+ \\
(A A^+)^T &= A A^+ \\
(A^+ A)^T &= A^+ A
\end{aligned}
$$

These four properties characterize the pseudoinverse uniquely, meaning if a matrix $B$ satisfies all four axioms, then automatically $B = A^+$. The Moore-Penrose pseudoinverse thus functions as the unique solution operator for linear least squares problems: the least squares solution is simply $\hat{x} = A^+ b$.
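As a quick sanity check, the pseudoinverse of a small full-rank matrix can be computed directly from the definition and tested against the first Penrose axiom. This sketch hard-codes a $2 \times 2$ inverse for brevity; the helper names are ours, not a library API.

```typescript
// Pseudoinverse of a full-rank 3x2 matrix via A^+ = (A^T A)^{-1} A^T.

function transpose(M: number[][]): number[][] {
  return M[0].map((_, j) => M.map((row) => row[j]));
}

function matMul(P: number[][], Q: number[][]): number[][] {
  return P.map((row) =>
    Q[0].map((_, j) => row.reduce((s, v, k) => s + v * Q[k][j], 0))
  );
}

function inverse2x2(M: number[][]): number[][] {
  const [[a, b], [c, d]] = M;
  const det = a * d - b * c; // nonzero because A has full rank
  return [[d / det, -b / det], [-c / det, a / det]];
}

const A = [[1, 0], [1, 1], [1, 2]];
const At = transpose(A);
const Aplus = matMul(inverse2x2(matMul(At, A)), At); // 2x3 pseudoinverse

// First Penrose axiom: A A^+ A reproduces A (up to rounding).
console.log(matMul(matMul(A, Aplus), A));
```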
## Solution Using QR Decomposition
A more numerically stable approach to solving normal equation systems uses QR decomposition. For a matrix $A \in \mathbb{R}^{m \times n}$ with full rank and $m \geq n$, we can use the (thin) QR decomposition $A = QR$, where $Q \in \mathbb{R}^{m \times n}$ has orthonormal columns and $R \in \mathbb{R}^{n \times n}$ is upper triangular.
With this decomposition, the normal equation system $A^T A x = A^T b$ can be solved through

$$A^T A x = R^T Q^T Q R x = R^T R x = R^T Q^T b$$

Since $R$ is an invertible upper triangular matrix (and hence $R^T$ is invertible as well), this equation is equivalent to

$$R x = Q^T b$$

This upper triangular system can be solved using back substitution, providing the solution $\hat{x} = R^{-1} Q^T b$ directly and efficiently.
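The section does not commit to a particular QR algorithm, so as one possibility, here is a compact sketch that computes a thin QR factorization with modified Gram-Schmidt and then solves $Rx = Q^T b$ by back substitution. A production implementation would typically use Householder reflections from an established library instead.

```typescript
// Least squares via thin QR (modified Gram-Schmidt) and back substitution.
// Assumes A (m x n, m >= n) has full rank.

function qrLeastSquares(A: number[][], b: number[]): number[] {
  const m = A.length;
  const n = A[0].length;
  // Q is stored as an array of n column vectors of length m.
  const Q = Array.from({ length: n }, (_, j) => A.map((row) => row[j]));
  const R = Array.from({ length: n }, () => new Array(n).fill(0));

  for (let j = 0; j < n; j++) {
    // Subtract components along the already-orthonormalized columns.
    for (let i = 0; i < j; i++) {
      R[i][j] = Q[i].reduce((s, v, k) => s + v * Q[j][k], 0);
      for (let k = 0; k < m; k++) Q[j][k] -= R[i][j] * Q[i][k];
    }
    // Normalize; R[j][j] > 0 because A has full rank.
    R[j][j] = Math.sqrt(Q[j].reduce((s, v) => s + v * v, 0));
    for (let k = 0; k < m; k++) Q[j][k] /= R[j][j];
  }

  // z = Q^T b
  const z = Q.map((col) => col.reduce((s, v, k) => s + v * b[k], 0));

  // Back substitution on the upper triangular system R x = z.
  const x = new Array(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let s = z[i];
    for (let j = i + 1; j < n; j++) s -= R[i][j] * x[j];
    x[i] = s / R[i][i];
  }
  return x;
}
```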
## Numerical Example
Let's apply this method to a concrete example. Suppose we have experimental data that we want to fit with a quadratic polynomial. We will use the following data

$$A = \begin{pmatrix} 1 & -3 & 9 \\ 1 & -2 & 4 \\ 1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}, \qquad b = \begin{pmatrix} -2.2 \\ -4.2 \\ -4.2 \\ -1.8 \\ 1.8 \\ 8.2 \\ 15.8 \end{pmatrix}$$

Each row in matrix $A$ has the format $(1, t_i, t_i^2)$ for the measurement points $t_i = -3, -2, \ldots, 3$, to find the coefficients of the polynomial $p(t) = x_1 + x_2 t + x_3 t^2$, while vector $b$ contains the corresponding observation values.
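The design matrix can be generated mechanically from the measurement pairs, as in this short snippet (variable names are ours):

```typescript
// Build the quadratic design matrix: one row (1, t, t^2) per measurement.
const data: [number, number][] = [
  [-3, -2.2], [-2, -4.2], [-1, -4.2], [0, -1.8],
  [1, 1.8], [2, 8.2], [3, 15.8],
];
const A = data.map(([t]) => [1, t, t * t]);
const b = data.map(([, y]) => y);
```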
### QR Decomposition Process
The QR decomposition of matrix $A$ yields

$$Q \approx \begin{pmatrix} 0.3780 & -0.5669 & 0.5455 \\ 0.3780 & -0.3780 & 0 \\ 0.3780 & -0.1890 & -0.3273 \\ 0.3780 & 0 & -0.4364 \\ 0.3780 & 0.1890 & -0.3273 \\ 0.3780 & 0.3780 & 0 \\ 0.3780 & 0.5669 & 0.5455 \end{pmatrix}, \qquad R = \begin{pmatrix} \sqrt{7} & 0 & 4\sqrt{7} \\ 0 & 2\sqrt{7} & 0 \\ 0 & 0 & 2\sqrt{21} \end{pmatrix} \approx \begin{pmatrix} 2.6458 & 0 & 10.5830 \\ 0 & 5.2915 & 0 \\ 0 & 0 & 9.1652 \end{pmatrix}$$
### Solution Steps
First, we compute $z = Q^T b$ to obtain

$$z = Q^T b \approx \begin{pmatrix} 5.0647 \\ 16.0257 \\ 8.9906 \end{pmatrix}$$

Next, we solve the upper triangular system $R x = z$ using back substitution. Since $R$ is upper triangular, we start from the bottom equation.

From the third equation, $9.1652\, x_3 = 8.9906$, so $x_3 = 0.98095$.

From the second equation, $5.2915\, x_2 = 16.0257$, so $x_2 = 3.02857$.

From the first equation, $2.6458\, x_1 + 10.5830 \cdot 0.98095 = 5.0647$, so $x_1 = -2.00952$.

Thus, the complete solution is

$$\hat{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -2.00952 \\ 3.02857 \\ 0.98095 \end{pmatrix}$$
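These back substitution steps are easy to check in code; here is a tiny self-contained verification using the rounded $R$ and $z$ values above:

```typescript
// Back substitution on R x = z with the values from this example.
const R = [
  [2.6458, 0, 10.583],
  [0, 5.2915, 0],
  [0, 0, 9.1652],
];
const z = [5.0647, 16.0257, 8.9906];

const x = new Array(3).fill(0);
for (let i = 2; i >= 0; i--) {
  let s = z[i];
  for (let j = i + 1; j < 3; j++) s -= R[i][j] * x[j];
  x[i] = s / R[i][i];
}
console.log(x); // ≈ [-2.0095, 3.0286, 0.9810]
```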
### Fitting Results
Based on the obtained solution, the quadratic polynomial that best fits the data is

$$p(t) = -2.00952 + 3.02857\, t + 0.98095\, t^2$$

The following visualization shows how well this polynomial represents the original data.
<LineEquation
  data={(() => {
    // Original data from matrix A (column t) and vector b
    const originalData = [
      [-3, -2.2],
      [-2, -4.2],
      [-1, -4.2],
      [0, -1.8],
      [1, 1.8],
      [2, 8.2],
      [3, 15.8]
    ];
    
    // Polynomial coefficients from normal equation system solution
    const a = 0.98095;
    const b = 3.02857;
    const c = -2.00952;
    
    return [
      {
        points: Array.from({ length: 50 }, (_, i) => {
          const t = -3 + (i * 6) / 49;
          const y = a * t * t + b * t + c;
          return { x: t, y: y, z: 0 };
        }),
        color: getColor("CYAN"),
        smooth: true,
        showPoints: false
      },
      {
        points: originalData.map(([t, y]) => ({ x: t, y: y, z: 0 })),
        color: getColor("ORANGE"),
        smooth: false,
        showPoints: true
      }
    ];
  })()}
/>
## Method Comparison
There are two main approaches to solving normal equation systems, which can be compared in terms of computational cost and numerical stability.
The Cholesky approach involves explicitly forming the matrix $A^T A$ first, then applying Cholesky decomposition, since this matrix is symmetric positive definite when $A$ has full rank. This method requires approximately $mn^2 + \frac{n^3}{3}$ arithmetic operations. However, forming the product $A^T A$ and decomposing it can become sources of large error propagation, especially when the condition number $\kappa(A)$ is large, because $\kappa(A^T A) = \kappa(A)^2$.
The QR approach, conversely, solves the problem with better numerical stability and comparable computational complexity. The main cost is the roughly $2mn^2 - \frac{2}{3}n^3$ operations for a Householder QR decomposition, which is of the same order as the Cholesky approach. The significant advantage of QR lies in the fact that orthogonal transformations do not worsen the conditioning of the problem, unlike the explicit formation of $A^T A$ in the Cholesky method.
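This conditioning issue can be made tangible with the classic Läuchli example (our choice of illustration, not taken from the text above): when the columns of $A$ differ only at order $\varepsilon$, the entries of $A^T A$ that distinguish them are of order $\varepsilon^2$ and are lost to floating point rounding long before $A$ itself becomes numerically rank deficient.

```typescript
// Läuchli example: A has full numerical rank, but the explicitly
// formed A^T A is exactly singular in double precision.
const eps = 1e-8; // eps is representable, but eps^2 < machine epsilon

const A = [
  [1, 1],
  [eps, 0],
  [0, eps],
];

// Form A^T A explicitly, as the Cholesky approach would.
const AtA = A[0].map((_, i) =>
  A[0].map((_, j) => A.reduce((s, row) => s + row[i] * row[j], 0))
);

// 1 + eps^2 rounds to exactly 1, so the eps^2 information vanishes.
console.log(AtA[0][0] === 1); // true
console.log(AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]); // determinant 0
```

A Householder QR factorization of the same matrix keeps the $\varepsilon$-sized information inside $R$, which is why the QR route remains reliable in this regime.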
The choice of the appropriate method depends on the data characteristics and the level of accuracy required in the specific application.