Linear Model

Understanding Linear Models

Imagine we are trying to understand relationships between various things in the real world. Linear models are a very elegant way to describe these relationships in mathematical form.

The basic form is very simple:

y = h(t) \cdot x

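Why is it called "linear"? Because for a fixed input $t$ , the response $y$ depends linearly on $x$ : adding or scaling parameter vectors adds or scales the responses in the same way. When $y$ and $x$ are single numbers, the relationship between them is a straight line through the origin. Here $h(t)$ acts as a "connector": a matrix in $\mathbb{R}^{d \times n}$ that maps the parameters to the response.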

Let's get to know the three main players in this model:

$y \in \mathbb{R}^d$ is the result we observe (model response)
$x \in \mathbb{R}^n$ is the value we want to find (model parameters)
$t \in \mathbb{R}^k$ is the input we provide (independent variables)

What's interesting is that although we call it "linear", the relationship with input $t$ can actually be curved or otherwise nonlinear, because $h(t)$ may be any function of $t$ . Only the relationship with parameter $x$ has to be linear.
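To make the distinction concrete, here is a minimal sketch in plain Python. The quadratic row `h(t) = (1, t, t^2)` and the parameter values are illustrative assumptions, not part of the text above; the point is only that the response adds up over parameter vectors, while doubling `t` does not double `y`.

```python
def h(t):
    """Row vector h(t) for an illustrative quadratic model (d = 1, n = 3)."""
    return [1.0, t, t * t]


def model(t, x):
    """Evaluate y = h(t) . x as a dot product."""
    return sum(h_j * x_j for h_j, x_j in zip(h(t), x))


# Hypothetical parameter vectors and input, chosen only for illustration.
x1 = [1.0, 2.0, 3.0]
x2 = [0.5, -1.0, 4.0]
t = 2.0

# Linear in x: the response to x1 + x2 equals the sum of the two responses.
x_sum = [u + v for u, v in zip(x1, x2)]
lhs = model(t, x_sum)
rhs = model(t, x1) + model(t, x2)

# Not linear in t: doubling the input does not double the response.
y_t = model(t, x1)
y_2t = model(2 * t, x1)
```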

From Data to Model

Now, how do we use this model in real life? The process is actually like playing detective with data.

First, we conduct a series of experiments or measurements:

We choose various values for $t$
For each of these values, we measure and obtain $y$
Our goal is to find $x$ that can explain all this data

Suppose we perform $M$ measurements. For each measurement $i$ , we have:

y_i \approx h(t_i) \cdot x, \quad i = 1, \ldots, M

Why use the "approximately equal" sign instead of "equals"? Because in the real world, no measurement is perfect. There's always noise, instrument errors, or other random factors that affect the results.

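Each measurement contributes $d$ numbers, so in total we collect $m = M \cdot d$ scalar values. Usually this is much larger than the number of parameters $n$ we want to find, so $m \gg n$ . With more equations than unknowns, there is generally no $x$ that satisfies all of them exactly.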

Our challenge now is how to find the best value of $x$ , so that the equation $y_i = h(t_i) \cdot x$ is satisfied as accurately as possible.
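One common way to make "as accurately as possible" precise is the least squares criterion, which picks the $x$ that minimizes the sum of squared deviations over all measurements. The symbol $\hat{x}$ for the resulting estimate is notation introduced here for illustration:

```latex
\hat{x} = \arg\min_{x} \sum_{i=1}^{M} \left\| y_i - h(t_i) \cdot x \right\|_2^2
```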

When we arrange all this data, a system of equations is formed that looks like this:

b = \begin{pmatrix} y_1 \\ \vdots \\ y_M \end{pmatrix} \approx \begin{pmatrix} h(t_1) \\ \vdots \\ h(t_M) \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = A(t) \cdot x

This produces a system with matrix $A(t)$ of size $m \times n$ and vector $b$ of size $m$ .
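The stacking step can be sketched in a few lines of plain Python. The helper name `build_system`, the example function `h_line`, and the data values are hypothetical; the sketch assumes `h(t)` returns a list of $d$ rows with $n$ entries each and that every $y_i$ is a list of length $d$.

```python
def build_system(h, ts, ys):
    """Stack the blocks h(t_i) into A (m x n) and the responses y_i into b (length m)."""
    A, b = [], []
    for t_i, y_i in zip(ts, ys):
        block = h(t_i)  # d rows, each with n entries
        if len(block) != len(y_i):
            raise ValueError("h(t_i) must have one row per component of y_i")
        A.extend(block)
        b.extend(y_i)
    return A, b


def h_line(t):
    """Illustrative model with d = 1 output and n = 2 parameters."""
    return [[1.0, t]]


# Hypothetical data: M = 3 measurements, so m = M * d = 3 rows.
ts = [0.0, 1.0, 2.0]
ys = [[1.0], [3.1], [4.9]]
A, b = build_system(h_line, ts, ys)
```

With these inputs, `A` has $m = 3$ rows and $n = 2$ columns, and `b` has $m = 3$ entries.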

Various Forms of Linear Models

Linear models turn out to be very flexible and can take various forms. Let's look at some examples that appear most frequently:

Simple Straight Line

The most basic form is a straight line:

y = a + b \cdot t = (1 \quad t) \cdot \begin{pmatrix} a \\ b \end{pmatrix} = h(t) \cdot x

Here we look for two parameters: $a$ (intercept) and $b$ (slope). This model is suitable when the data follow a straight line, at least approximately.
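For the straight line, the least squares solution has a well-known closed form, so a fit can be sketched without any linear algebra library. The function name `fit_line` and the data below are made up for illustration; least squares is assumed as the fitting criterion.

```python
def fit_line(ts, ys):
    """Least squares fit of y = a + b * t; returns (a, b)."""
    M = len(ts)
    if M < 2 or M != len(ys):
        raise ValueError("need at least two (t, y) pairs of equal length")
    t_mean = sum(ts) / M
    y_mean = sum(ys) / M
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    if s_tt == 0.0:
        raise ValueError("need at least two distinct values of t")
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    b = s_ty / s_tt          # slope
    a = y_mean - b * t_mean  # intercept
    return a, b


# Hypothetical noisy measurements that lie roughly on y = 1 + 2 * t.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
a, b = fit_line(ts, ys)
```

Note that `b` here is the slope of the line, not the stacked data vector $b$ from the previous section; the clash comes from reusing the letters of the formula above.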

Polynomial Curves

If data forms curved patterns, we can use polynomials:

y = a_0 + a_1 \cdot t + \ldots + a_n \cdot t^n = (1 \quad t \quad \ldots \quad t^n) \cdot \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = h(t) \cdot x

Although $t^n$ looks nonlinear, remember that what we mean by "linear" is the relationship with parameters $a_0, a_1, \ldots, a_n$ . Note that $n$ here denotes the degree of the polynomial, so the model has $n + 1$ parameters.
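As a worked illustration, stacking the rows $h(t_i)$ for $M$ scalar measurements gives a matrix with one row per measurement and one column per power of $t$ (often called a Vandermonde matrix). Its size is $M \times (n + 1)$:

```latex
A(t) = \begin{pmatrix}
1 & t_1 & t_1^2 & \cdots & t_1^n \\
1 & t_2 & t_2^2 & \cdots & t_2^n \\
\vdots & \vdots & \vdots & & \vdots \\
1 & t_M & t_M^2 & \cdots & t_M^n
\end{pmatrix},
\qquad
x = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}
```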

Repeating Patterns with Trigonometry

For data that has repeating or cyclic patterns, we can use sine and cosine functions:

y = a_0 + \sum_{k=1}^{n} a_k \cos(k \cdot t) + \sum_{k=1}^{n} b_k \sin(k \cdot t)

This model is very useful for analyzing data that has seasonal or periodic patterns.
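The trigonometric model fits the same $h(t) \cdot x$ pattern once the parameters are collected into one vector. A minimal sketch, assuming the ordering $x = (a_0, a_1, \ldots, a_n, b_1, \ldots, b_n)$, which is a choice made here and not fixed by the formula; `trig_row` is a hypothetical helper name.

```python
import math


def trig_row(t, n):
    """Row h(t) = (1, cos t, ..., cos nt, sin t, ..., sin nt) with 2n + 1 entries."""
    row = [1.0]
    row += [math.cos(k * t) for k in range(1, n + 1)]
    row += [math.sin(k * t) for k in range(1, n + 1)]
    return row


def trig_model(t, x, n):
    """Evaluate y = h(t) . x for x ordered as (a_0, a_1..a_n, b_1..b_n)."""
    if len(x) != 2 * n + 1:
        raise ValueError("x must have 2n + 1 entries")
    return sum(h_j * x_j for h_j, x_j in zip(trig_row(t, n), x))


# At t = 0 every cosine is 1 and every sine is 0.
row = trig_row(0.0, 2)
y0 = trig_model(0.0, [1.0, 2.0, 3.0, 4.0, 5.0], 2)
```

The model has $2n + 1$ parameters in total, and the row built this way can be stacked into $A(t)$ exactly like any other $h(t)$ .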

Multiple Inputs

If the output depends on several inputs simultaneously, we can combine them. For example with two inputs $t$ and $s$ :

y = a + b \cdot t + c \cdot s + d \cdot t \cdot s = (1 \quad t \quad s \quad t \cdot s) \cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = h(t,s) \cdot x

The term $t \cdot s$ captures the interaction between the two inputs.

Multiple Outputs

Sometimes we want to predict several things simultaneously from the same input:

y = a + b \cdot t

z = c + d \cdot t

This is like having two linear models running simultaneously.
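Written in the general form $y = h(t) \cdot x$ , the two outputs can be stacked into one response vector with $d = 2$ components and $n = 4$ parameters. The block structure below is one natural way to arrange it:

```latex
\begin{pmatrix} y \\ z \end{pmatrix}
= \begin{pmatrix} 1 & t & 0 & 0 \\ 0 & 0 & 1 & t \end{pmatrix}
\cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
= h(t) \cdot x
```

Here $h(t)$ is a $2 \times 4$ matrix, which matches the dimensions $\mathbb{R}^{d \times n}$ from the basic form. The parameter named $d$ inside the vector is the slope of the second line and is unrelated to the output dimension.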

Real Example from Physics

The ideal gas law in physics is:

p = n \cdot R \cdot \frac{T}{V}

Here:

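$p$ is the pressure we measure
$T$ is the temperature we set
$V$ is the volume we set
$n$ is the amount of gas in moles (which we want to determine); it is proportional to the number of gas molecules
$R$ is the universal gas constant, which is known

If we consider $T$ and $V$ as inputs we can control, and $p$ as output we measure, then pressure depends linearly on $n$ , with $h(T, V) = R \cdot T / V$ . This allows us to use a linear model to determine the amount of gas. Note that $n$ here is the single unknown parameter, not the number of parameters as in the earlier sections.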
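With a single unknown, the least squares estimate reduces to a ratio of two sums. The sketch below assumes least squares as the criterion and uses synthetic, noise-free data generated from a made-up true value; the function name `estimate_amount` and all numbers except the gas constant are hypothetical.

```python
R = 8.314462618  # universal gas constant in J/(mol K)


def estimate_amount(Ts, Vs, ps):
    """Least squares estimate of n in p = n * R * T / V from several measurements."""
    hs = [R * T / V for T, V in zip(Ts, Vs)]  # scalar h(T, V) per measurement
    denom = sum(h * h for h in hs)
    if denom == 0.0:
        raise ValueError("all h(T, V) values are zero")
    return sum(h * p for h, p in zip(hs, ps)) / denom


# Synthetic example: temperatures in K, volumes in m^3, pressures in Pa.
n_true = 2.0  # assumed value used only to generate the data
Ts = [280.0, 300.0, 320.0]
Vs = [0.05, 0.05, 0.04]
ps = [n_true * R * T / V for T, V in zip(Ts, Vs)]
n_hat = estimate_amount(Ts, Vs, ps)
```

Because the data contain no noise, `n_hat` reproduces `n_true` up to rounding; with real measurements the estimate would only be approximate.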

When Models Are Initially Nonlinear

Not all real-world problems are directly linear in form. Sometimes we encounter models where parameters appear squared, multiplied with one another, or inside exponential functions:

y = h(t, x)

But don't despair! In many cases, we can "linearize" such models. The method is a tangent line (first-order Taylor) approximation around a chosen point $x_0$ :

y \approx h(t, x_0) + \frac{\partial h}{\partial x}(t, x_0) \cdot (x - x_0)

With this trick, we replace a complex curve with a straight line that approximates it. The result is an equation that is linear with respect to $x$ , so it can be solved with standard linear algebra methods like least squares.

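However, this linearization is only accurate if the nonlinear model is not too "curved" around point $x_0$ , that is, as long as $x$ stays close to $x_0$ . Highly nonlinear models require iterative numerical optimization techniques, which typically repeat the linearization around successively better estimates.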
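Repeating the linearization step around each new estimate is commonly known as the Gauss-Newton method. The following is a minimal sketch for a single scalar parameter, assuming the illustrative model $y = e^{x \cdot t}$ and synthetic noise-free data; the function name, starting point, and data are all hypothetical, and convergence is not guaranteed for poor starting points.

```python
import math


def gauss_newton_exp(ts, ys, x0, iterations=20, tol=1e-12):
    """Fit y = exp(x * t) for scalar x by repeatedly solving the linearized problem."""
    x = x0
    for _ in range(iterations):
        # Residuals of the nonlinear model at the current estimate.
        residuals = [y - math.exp(x * t) for t, y in zip(ts, ys)]
        # Derivative dh/dx at the current estimate, one entry per measurement.
        jac = [t * math.exp(x * t) for t in ts]
        denom = sum(j * j for j in jac)
        if denom == 0.0:
            break
        # Least squares solution of the linearized system jac * step = residuals.
        step = sum(j * r for j, r in zip(jac, residuals)) / denom
        x += step
        if abs(step) < tol:
            break
    return x


# Synthetic data generated from an assumed true parameter of 0.5.
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.5 * t) for t in ts]
x_hat = gauss_newton_exp(ts, ys, x0=0.3)
```

Each pass solves a small linear least squares problem, which is exactly the kind of problem the earlier sections describe.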
