Command Palette

Search for a command to run...

Linear Methods of AI

System of Linear Equations

Overdetermined Linear Equation Systems

Imagine we are trying to fit a curve to a set of data. In many practical cases, we have more data than parameters we want to find. Such situations create what is called overdetermined linear equation systems.

This system has a special characteristic. The number of equations is greater than the number of unknown variables. Mathematically, if we have mm equations and nn variables, then the condition m>nm > n makes this system "overdetermined".

Real Example with Quadratic Polynomial Model

Let's look at a concrete example. Suppose we have 77 data points that we want to fit with a parabola or quadratic curve.

The data we have is as follows.

ii11223344556677
tit_i3-32-21-100112233
yiy_i2.2-2.24.2-4.24.2-4.21.8-1.81.81.88.28.215.815.8

We want to find a parabola with the following form.

y=a2t2+a1t+a0y = a_2 \cdot t^2 + a_1 \cdot t + a_0

Here we are looking for 33 parameters, namely a2a_2 (quadratic coefficient), a1a_1 (linear coefficient), and a0a_0 (constant).

Setting Up the System of Equations

Now, how do we use this data to find the parabola parameters? The idea is simple. For each data point, we can write one equation. With 77 data points, we will get 77 equations.

a2(3)2+a1(3)+a0=2.2a_2 \cdot (-3)^2 + a_1 \cdot (-3) + a_0 = -2.2
a2(2)2+a1(2)+a0=4.2a_2 \cdot (-2)^2 + a_1 \cdot (-2) + a_0 = -4.2
a2(1)2+a1(1)+a0=4.2a_2 \cdot (-1)^2 + a_1 \cdot (-1) + a_0 = -4.2
a202+a10+a0=1.8a_2 \cdot 0^2 + a_1 \cdot 0 + a_0 = -1.8
a212+a11+a0=1.8a_2 \cdot 1^2 + a_1 \cdot 1 + a_0 = 1.8
a222+a12+a0=8.2a_2 \cdot 2^2 + a_1 \cdot 2 + a_0 = 8.2
a232+a13+a0=15.8a_2 \cdot 3^2 + a_1 \cdot 3 + a_0 = 15.8

Now let's calculate the square values for each tit_i. For example, for t1=3t_1 = -3, we have (3)2=9(-3)^2 = 9. Similarly for the others. After calculating everything, our equations become like this.

9a23a1+a0=2.29a_2 - 3a_1 + a_0 = -2.2
4a22a1+a0=4.24a_2 - 2a_1 + a_0 = -4.2
1a21a1+a0=4.21a_2 - 1a_1 + a_0 = -4.2
0a2+0a1+a0=1.80a_2 + 0a_1 + a_0 = -1.8
1a2+1a1+a0=1.81a_2 + 1a_1 + a_0 = 1.8
4a2+2a1+a0=8.24a_2 + 2a_1 + a_0 = 8.2
9a2+3a1+a0=15.89a_2 + 3a_1 + a_0 = 15.8

Matrix Form

The system of equations above can be written in matrix form Ax=bA \cdot x = b.

(931421111001111421931)(a2a1a0)=(2.24.24.21.81.88.215.8)\begin{pmatrix} 9 & -3 & 1 \\ 4 & -2 & 1 \\ 1 & -1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \\ 4 & 2 & 1 \\ 9 & 3 & 1 \end{pmatrix} \begin{pmatrix} a_2 \\ a_1 \\ a_0 \end{pmatrix} = \begin{pmatrix} -2.2 \\ -4.2 \\ -4.2 \\ -1.8 \\ 1.8 \\ 8.2 \\ 15.8 \end{pmatrix}

In general, for a quadratic polynomial model with mm data points, the matrix form is as follows.

(t12t11tm2tm1)(a2a1a0)=(y1ym)\begin{pmatrix} t_1^2 & t_1 & 1 \\ \vdots & \vdots & \vdots \\ t_m^2 & t_m & 1 \end{pmatrix} \begin{pmatrix} a_2 \\ a_1 \\ a_0 \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}

Why There Is No Exact Solution

Now we face an interesting situation. In our example, matrix AA has size 7×37 \times 3 and vector xx has size 3×13 \times 1. This means we have 77 equations but only 33 unknown variables.

Does this mean the system cannot be solved? Let's examine this more deeply.

The three columns of matrix AA are linearly independent, so the rank of matrix AA is 33. However, when we add vector bb to matrix AA to form the augmented matrix (Ab)(A|b), its rank becomes 44.

This condition tells us something important. This system has no exact solution. In simple terms, there is no single parabola that can pass through all 77 data points perfectly.

Solution with Least Squares

So what should we do? Give up? Of course not!

When an overdetermined linear equation system has no exact solution, we use the least squares approach. The basic idea makes perfect sense. If we cannot find a parabola that passes through all points, let's find the parabola that is "closest" to all points.

Mathematically, this method seeks parameters that minimize the sum of squared differences between predicted values and observed values. Imagine we draw a parabola, then measure the vertical distance from each data point to that parabola. The least squares method finds the parabola that makes the total squared distances as small as possible.

Overdetermined linear equation systems are very common in the real world, especially when we have many measurement data but a relatively simple model.

The least squares approach provides an optimal solution in the sense of minimizing overall error, making it very practical for engineering and scientific applications.