Understanding Linear Models

Imagine we are trying to understand relationships between various things in the real world. Linear models are a very elegant way to describe these relationships in mathematical form.

The basic form is very simple:

$$y = h(t) \cdot x$$

Why is it called "linear"? Because $y$ depends linearly on $x$: if we plot $y$ against $x$, the relationship forms a straight line. Here $h(t)$ acts as a "connector", a matrix in $\mathbb{R}^{d \times n}$ that maps the parameters to the response.

Let's get to know the three main players in this model:

  • $y \in \mathbb{R}^d$ is the result we observe (model response)
  • $x \in \mathbb{R}^n$ is the value we want to find (model parameters)
  • $t \in \mathbb{R}^k$ is the input we provide (independent variables)

What's interesting is that although we call it "linear", the relationship with the input $t$ can actually be complex or curved. Only the relationship with the parameter $x$ is linear.
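To make this distinction concrete, here is a minimal sketch. The model matrix `h` below is a made-up example (it is not taken from the text), chosen so that it is clearly curved in $t$; the check confirms that the output is still linear in $x$.

```python
import numpy as np

def h(t):
    """Hypothetical 1x3 model matrix: curved in t, but it multiplies x linearly."""
    return np.array([[1.0, np.sin(t), t**2]])

t = 0.7
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([-0.5, 0.0, 4.0])
alpha, beta = 2.0, -3.0

# Linearity in x: h(t)(alpha*x1 + beta*x2) equals alpha*h(t)x1 + beta*h(t)x2
lhs = h(t) @ (alpha * x1 + beta * x2)
rhs = alpha * (h(t) @ x1) + beta * (h(t) @ x2)
is_linear_in_x = np.allclose(lhs, rhs)

# Not linear in t: doubling t does not double y
y_t = h(t) @ x1
y_2t = h(2 * t) @ x1
is_linear_in_t = np.allclose(y_2t, 2 * y_t)
```

Here `is_linear_in_x` comes out true and `is_linear_in_t` comes out false, which is exactly the sense in which the model is called linear.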

From Data to Model

Now, how do we use this model in real life? The process is actually like playing detective with data.

First, we conduct a series of experiments or measurements:

  1. We choose various values for $t$
  2. For each of these values, we measure and obtain $y$
  3. Our goal is to find $x$ that can explain all this data

Suppose we perform $M$ measurements. For each measurement $i$, we have:

$$y_i \approx h(t_i) \cdot x, \quad i = 1, \ldots, M$$

Why use the "approximately equal" sign instead of "equals"? Because in the real world, no measurement is perfect. There's always noise, instrument errors, or other random factors that affect the results.

If we count all the data we collect, the total number of scalar values is $m = M \cdot d$. Usually this number is much larger than the number of parameters we want to find ($n$), so $m \gg n$.

Our challenge now is to find the best value of $x$, so that the equations $y_i = h(t_i) \cdot x$ are satisfied as accurately as possible.

When we arrange all this data, a system of equations is formed that looks like this:

$$b = \begin{pmatrix} y_1 \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} h(t_1) \\ \vdots \\ h(t_M) \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = A(t) \cdot x$$

This produces a system with matrix $A(t)$ of size $m \times n$ and vector $b$ of size $m$.
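The stacking step can be sketched in a few lines of NumPy. Everything specific here is an assumption for illustration: the model matrix `h`, the parameter values in `x_true`, and the simulated noise are all made up. The sketch uses `np.linalg.lstsq` to solve the resulting system in the least-squares sense.

```python
import numpy as np

def h(t):
    """Hypothetical model matrix with d = 2 outputs and n = 2 parameters."""
    return np.array([[1.0, t],
                     [1.0, t**2]])

rng = np.random.default_rng(0)           # fixed seed so the sketch is repeatable
x_true = np.array([1.0, 2.0])            # made-up parameters used to simulate data
ts = [0.0, 1.0, 2.0, 3.0]                # M = 4 chosen inputs

# Simulated measurements y_i = h(t_i) x + small noise, each of length d
ys = [h(t) @ x_true + rng.normal(scale=0.01, size=2) for t in ts]

# Stack the blocks: A has m = M * d rows and n columns, b has m entries
A = np.vstack([h(t) for t in ts])
b = np.concatenate(ys)

# With m > n there is generally no exact solution, so solve in the least-squares sense
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
```

With $M = 4$ and $d = 2$, `A` has shape `(8, 2)` and `b` has 8 entries, and `x_hat` should land close to the made-up `x_true`.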

Various Forms of Linear Models

Linear models turn out to be very flexible and can take various forms. Let's look at some examples that appear most frequently:

Simple Straight Line

The most basic form is a straight line:

$$y = a + b \cdot t = (1 \quad t) \cdot \begin{pmatrix} a \\ b \end{pmatrix} = h(t) \cdot x$$

Here we look for two parameters: $a$ (intercept) and $b$ (slope). This model is suitable when the data follow a straight line, or nearly so.
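As a small sketch of fitting such a line, the following uses made-up measurements (not data from the text) and NumPy's `np.linalg.lstsq` to estimate the intercept and slope.

```python
import numpy as np

# Made-up measurements that roughly follow y = 1 + 2 t
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Each row of A is h(t_i) = (1, t_i)
A = np.column_stack([np.ones_like(t), t])

# Least-squares estimate of x = (a, b)
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
```

For these particular numbers the estimates come out at about `a = 1.04` and `b = 1.99`.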

Polynomial Curves

If data forms curved patterns, we can use polynomials:

$$y = a_0 + a_1 \cdot t + \ldots + a_n \cdot t^n = (1 \quad t \quad \ldots \quad t^n) \cdot \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = h(t) \cdot x$$

Although $t^n$ looks nonlinear, remember that what we mean by "linear" is the relationship with the parameters $a_0, a_1, \ldots, a_n$. Note that $n$ here denotes the polynomial degree, so this model has $n + 1$ parameters.
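A minimal sketch of a polynomial fit, assuming a quadratic and noise-free synthetic data with made-up coefficients. `np.vander` with `increasing=True` builds rows of the form $(1 \quad t \quad \ldots \quad t^n)$.

```python
import numpy as np

degree = 2
t = np.linspace(-1.0, 1.0, 7)

# Noise-free synthetic data from y = 1 - 2 t + 0.5 t^2 (made-up coefficients)
coeffs_true = np.array([1.0, -2.0, 0.5])
y = coeffs_true[0] + coeffs_true[1] * t + coeffs_true[2] * t**2

# Rows are h(t_i) = (1, t_i, ..., t_i^degree); the model is still linear in the a_k
A = np.vander(t, N=degree + 1, increasing=True)

coeffs_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```

Because the data are noise-free, `coeffs_hat` reproduces `coeffs_true` up to rounding; with real measurements it would only be the best fit.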

Repeating Patterns with Trigonometry

For data that has repeating or cyclic patterns, we can use sine and cosine functions:

$$y = a_0 + \sum_{k=1}^{n} a_k \cos(k \cdot t) + \sum_{k=1}^{n} b_k \sin(k \cdot t)$$

This model is very useful for analyzing data that has seasonal or periodic patterns.
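The trigonometric model fits the same pattern once the cosine and sine terms are collected into one row. The helper `trig_design_matrix` and the signal below are hypothetical, chosen only to illustrate the idea; the parameter ordering $(a_0, a_1, \ldots, a_n, b_1, \ldots, b_n)$ is one possible convention.

```python
import numpy as np

def trig_design_matrix(t, n):
    """Columns: 1, cos(t), ..., cos(n t), sin(t), ..., sin(n t)."""
    k = np.arange(1, n + 1)
    kt = np.outer(t, k)
    return np.hstack([np.ones((t.size, 1)), np.cos(kt), np.sin(kt)])

t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)

# Noise-free synthetic periodic signal with made-up coefficients
y = 2.0 + 1.5 * np.cos(t) - 0.5 * np.sin(2.0 * t)

A = trig_design_matrix(t, n=2)
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
# x_hat is ordered as (a_0, a_1, a_2, b_1, b_2)
```

For this signal the fit recovers $a_0 = 2$, $a_1 = 1.5$ and $b_2 = -0.5$, with the remaining coefficients at zero up to rounding.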

Multiple Inputs

If the output depends on several inputs simultaneously, we can combine them. For example with two inputs $t$ and $s$:

$$y = a + b \cdot t + c \cdot s + d \cdot t \cdot s = (1 \quad t \quad s \quad t \cdot s) \cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = h(t,s) \cdot x$$

The term $t \cdot s$ captures the interaction between the two inputs.
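A sketch of the two-input case, assuming a small grid of made-up input settings and noise-free synthetic data. Each row of the matrix is $h(t_i, s_i) = (1 \quad t_i \quad s_i \quad t_i \cdot s_i)$.

```python
import numpy as np

# Grid of made-up input settings for the two inputs
t, s = np.meshgrid([0.0, 1.0, 2.0], [0.0, 1.0, 2.0, 3.0])
t, s = t.ravel(), s.ravel()

# Noise-free synthetic data from y = 1 + 2 t - s + 0.5 t s
x_true = np.array([1.0, 2.0, -1.0, 0.5])
y = x_true[0] + x_true[1] * t + x_true[2] * s + x_true[3] * t * s

# Rows are h(t_i, s_i) = (1, t_i, s_i, t_i * s_i)
A = np.column_stack([np.ones_like(t), t, s, t * s])

x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```

The last entry of `x_hat` is the estimated interaction coefficient; dropping the `t * s` column would give a model without interaction.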

Multiple Outputs

Sometimes we want to predict several things simultaneously from the same input:

$$y = a + b \cdot t$$
$$z = c + d \cdot t$$

This is like having two linear models running simultaneously.
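As a sketch, the two equations can also be written in the general form by stacking both outputs into one response vector. The model matrix then has two rows (one per output) and four columns (one per parameter):

$$\begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} 1 & t & 0 & 0 \\ 0 & 0 & 1 & t \end{pmatrix} \cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = h(t) \cdot x$$

The zeros show that the two outputs share the input but not the parameters, so each pair of parameters could also be fitted separately.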

Real Example from Physics

The general gas equation in physics is:

$$p = n \cdot R \cdot \frac{T}{V}$$

Here:

  • $p$ is the pressure we measure
  • $T$ is the temperature we set
  • $V$ is the volume we set
  • $n$ is the amount of gas in moles (which we want to determine)
  • $R$ is a known constant (the gas constant)

If we consider $T$ and $V$ as inputs we can control, and $p$ as the output we measure, then the pressure depends linearly on $n$. This allows us to use a linear model to determine the amount of gas.
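A minimal sketch of this estimate, assuming SI units and entirely made-up experiment settings and measurement errors. Here $h(T, V) = R \cdot T / V$ is a single column and $n$ is the only unknown.

```python
import numpy as np

R = 8.314462618                     # gas constant in J/(mol K)

# Made-up experiment settings: temperatures in K, volumes in m^3
T = np.array([280.0, 300.0, 320.0, 340.0, 360.0])
V = np.array([0.010, 0.012, 0.015, 0.020, 0.025])

# Simulated pressure readings in Pa for n = 2 mol, with small made-up relative errors
n_true = 2.0
errors = np.array([0.01, -0.01, 0.005, -0.005, 0.002])
p = n_true * R * T / V * (1.0 + errors)

# h(T, V) = R T / V is a single column, and the only unknown parameter is n
A = (R * T / V).reshape(-1, 1)
(n_hat,), *_ = np.linalg.lstsq(A, p, rcond=None)
```

Since every simulated reading is off by at most 1%, `n_hat` stays within about 1% of the assumed 2 mol.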

When Models Are Initially Nonlinear

Not all real-world problems are directly linear in form. Sometimes we encounter models where parameters appear in quadratic form, multiplication between parameters, or even in exponential functions:

$$y = h(t, x)$$

But don't despair! In many cases, we can "linearize" such models. The idea is to use a tangent (first-order) approximation around a chosen point $x_0$:

$$y \approx h(t, x_0) + \frac{\partial h}{\partial x}(t, x_0) \cdot (x - x_0)$$

With this trick, we replace a complex curve with a straight line that approximates it. The result is an equation that is linear with respect to $x$, so it can be solved with standard linear algebra methods like least squares.

However, this linearization method only works if the nonlinear model is not too "curved" around the point $x_0$. Highly nonlinear models require numerical optimization techniques.
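One common way to use the linearization is to repeat it: solve the linearized least-squares problem, move to the new point, and linearize again. This iteration is generally known as the Gauss-Newton method. The sketch below assumes a made-up exponential decay model with a single parameter, noise-free synthetic data, and a starting point that is already close to the solution.

```python
import numpy as np

def h(t, x):
    """Hypothetical nonlinear model: exponential decay with rate x[0]."""
    return np.exp(-x[0] * t)

def jacobian(t, x):
    """Partial derivative of h with respect to x, one row per measurement."""
    return (-t * np.exp(-x[0] * t)).reshape(-1, 1)

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x_true = np.array([0.5])
y = h(t, x_true)              # noise-free synthetic data

x = np.array([0.4])           # starting point x_0, assumed close to the solution
for _ in range(6):
    # Linearized model: y ≈ h(t, x) + J (x_new - x), a linear least-squares problem for the step
    J = jacobian(t, x)
    step, *_ = np.linalg.lstsq(J, y - h(t, x), rcond=None)
    x = x + step
```

In this well-behaved case a handful of iterations bring `x` very close to the assumed rate of 0.5. With a poor starting point or a strongly curved model the same loop may fail to converge, which is the limitation described above.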