# Nakafa Learning Content

URL: https://nakafa.com/en/subjects/ai-ds/linear-methods/linear-model
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/ai-ds/linear-methods/linear-model/en.mdx

Learn linear models: fundamental y=h(t)·x relationships, polynomial/trigonometric forms, data-to-model conversion, and linearization techniques for AI.

---

## Understanding Linear Models

Imagine we are trying to understand relationships between various things in the real world.
Linear models are a very elegant way to describe these relationships in mathematical form.

The basic form is very simple:

```math
y = h(t) \cdot x
```

Why is it called "linear"? Because $$y$$ depends linearly on the parameters $$x$$:
if we scale or add parameter vectors, the output scales or adds in the same way.
Here $$h(t)$$ acts as a "connector": for each input $$t$$ it is a matrix in $$\mathbb{R}^{d \times n}$$ that maps $$x$$ to $$y$$.

Let's get to know the three main players in this model:

- $$y \in \mathbb{R}^d$$ is the **result we observe** (model response)
- $$x \in \mathbb{R}^n$$ is the **value we want to find** (model parameters)
- $$t \in \mathbb{R}^k$$ is the **input we provide** (independent variables)

What's interesting is that although we call it "linear", the relationship with input $$t$$
can actually be complex or curved. Only the relationship with parameter $$x$$ is linear.
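
The distinction can be checked numerically. The following is a minimal sketch, assuming a hypothetical connector $$h(t) = (1 \quad t \quad t^2)$$ and made-up parameter vectors `x1` and `x2`; it tests superposition in $$x$$ and shows that the same property fails for $$t$$.

```python
import numpy as np

def h(t):
    """Hypothetical connector h(t) = (1, t, t^2), so d = 1 and n = 3."""
    return np.array([1.0, t, t**2])

# Two made-up parameter vectors and one input value (illustration only).
x1 = np.array([1.0, -2.0, 0.5])
x2 = np.array([0.0, 3.0, 1.0])
t = 1.5

# Linearity in x: h(t) (2 x1 + 3 x2) equals 2 h(t) x1 + 3 h(t) x2.
lhs = h(t) @ (2.0 * x1 + 3.0 * x2)
rhs = 2.0 * (h(t) @ x1) + 3.0 * (h(t) @ x2)
linear_in_x = np.isclose(lhs, rhs)

# No linearity in t: doubling the input does not double the output.
linear_in_t = np.isclose(h(2.0 * t) @ x1, 2.0 * (h(t) @ x1))

print(linear_in_x, linear_in_t)
```

For this choice of $$h$$, the first check succeeds and the second fails, which is exactly what "linear in the parameters" means.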

## From Data to Model

Now, how do we use this model in real life?
The process is actually like playing detective with data.

First, we conduct a series of experiments or measurements:

1. We choose various values for $$t$$
2. For each of these values, we measure and obtain $$y$$
3. Our goal is to find $$x$$ that can explain all this data

Suppose we perform $$M$$ measurements.
For each measurement $$i$$, we have:

```math
y_i \approx h(t_i) \cdot x, \quad i = 1, \ldots, M
```

Why use the "approximately equal" sign instead of "equals"?
Because in the real world, no measurement is perfect.
There's always noise, instrument errors, or other random factors that affect the results.

Each measurement contributes $$d$$ values, so in total we collect $$m = M \cdot d$$ scalar data points.
Usually this number is much larger than the number of parameters we want to find ($$n$$),
so $$m \gg n$$.

Our challenge now is how to find the best value of $$x$$,
so that the equation $$y_i = h(t_i) \cdot x$$ is satisfied as accurately as possible.

When we arrange all this data, a system of equations is formed that looks like this:

```math
b = \begin{pmatrix} y_1 \\ \vdots \\ y_M \end{pmatrix} \approx \begin{pmatrix} h(t_1) \\ \vdots \\ h(t_M) \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = A(t) \cdot x
```

This produces a system with matrix $$A(t)$$ of size $$m \times n$$
and vector $$b$$ of size $$m$$.
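
To make the stacking concrete, here is a minimal sketch in Python with NumPy. It assumes the straight-line connector $$h(t) = (1 \quad t)$$ and a handful of made-up measurements; the data values and variable names are illustrative only. It builds $$A(t)$$ and $$b$$ row by row and then asks `np.linalg.lstsq` for the $$x$$ that satisfies $$A(t) \cdot x \approx b$$ as well as possible in the least-squares sense.

```python
import numpy as np

def h(t):
    """Row vector h(t) = (1, t) for a straight-line model (d = 1, n = 2)."""
    return np.array([1.0, t])

# Made-up measurements: M = 5 chosen inputs t_i and noisy outputs y_i.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Stack the rows h(t_i) into A(t) (shape m x n) and the outputs into b (length m).
A = np.vstack([h(ti) for ti in t])
b = y

# Least-squares solution of A x ≈ b.
x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

print(A.shape)  # (5, 2)
print(x)        # approximately [1.04, 1.99]
```

With $$d = 1$$ the total number of data points is $$m = M = 5$$, which is already larger than $$n = 2$$, so the system cannot in general be solved exactly.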

## Various Forms of Linear Models

Linear models turn out to be very flexible and can take various forms.
Let's look at some examples that appear most frequently:

### Simple Straight Line

The most basic form is a straight line:

```math
y = a + b \cdot t = (1 \quad t) \cdot \begin{pmatrix} a \\ b \end{pmatrix} = h(t) \cdot x
```

Here we look for two parameters: $$a$$ (intercept) and $$b$$ (slope).
This model is suitable when the data follows a straight or nearly straight pattern.

### Polynomial Curves

If data forms curved patterns, we can use polynomials:

```math
y = a_0 + a_1 \cdot t + \ldots + a_n \cdot t^n = (1 \quad t \quad \ldots \quad t^n) \cdot \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = h(t) \cdot x
```

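Although $$t^n$$ looks nonlinear, remember that "linear" refers to
the relationship with the parameters $$a_0, a_1, \ldots, a_n$$, each of which appears only to the first power.
Note that a polynomial of degree $$n$$ has $$n + 1$$ parameters.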
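
As a sketch of how the polynomial rows $$(1 \quad t \quad \ldots \quad t^n)$$ can be assembled in practice, the example below uses NumPy's `np.vander` with `increasing=True`. The helper name `poly_design_matrix`, the degree, and the coefficients are assumptions chosen for illustration, and the data is synthetic and noise-free.

```python
import numpy as np

def poly_design_matrix(t, degree):
    """One row (1, t_i, t_i^2, ..., t_i^degree) per measurement."""
    return np.vander(t, N=degree + 1, increasing=True)

# Synthetic, noise-free data from made-up coefficients a_0, a_1, a_2.
t = np.linspace(-1.0, 1.0, 9)
true_coeffs = np.array([0.5, -1.0, 2.0])
A = poly_design_matrix(t, degree=2)
y = A @ true_coeffs

# The fit is an ordinary linear least-squares problem in the coefficients.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)
```

Because the data contains no noise, the fitted coefficients match the ones used to generate it up to rounding.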

### Repeating Patterns with Trigonometry

For data that has repeating or cyclic patterns, we can use sine and cosine functions:

```math
y = a_0 + \sum_{k=1}^{n} a_k \cos(k \cdot t) + \sum_{k=1}^{n} b_k \sin(k \cdot t)
```

This model is very useful for analyzing data that has seasonal or periodic patterns.
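
The trigonometric model also fits the pattern $$y = h(t) \cdot x$$ if we take $$h(t) = (1 \quad \cos t \quad \ldots \quad \cos(n t) \quad \sin t \quad \ldots \quad \sin(n t))$$ and $$x = (a_0, a_1, \ldots, a_n, b_1, \ldots, b_n)$$. The sketch below assumes exactly this column ordering; the helper name `trig_design_matrix` and the synthetic signal are made up for illustration.

```python
import numpy as np

def trig_design_matrix(t, n):
    """Columns: 1, cos(t), ..., cos(n t), sin(t), ..., sin(n t)."""
    k = np.arange(1, n + 1)
    kt = np.outer(t, k)  # shape (M, n), entry (i, k) is k * t_i
    return np.hstack([np.ones((len(t), 1)), np.cos(kt), np.sin(kt)])

# Synthetic periodic signal sampled at 50 points over one period.
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
y = 1.0 + 2.0 * np.cos(t) - 0.5 * np.sin(2.0 * t)

A = trig_design_matrix(t, n=2)
x, *_ = np.linalg.lstsq(A, y, rcond=None)

# Parameter order: a_0, a_1, a_2, b_1, b_2
print(np.round(x, 6))
```

The recovered parameters are $$a_0 = 1$$, $$a_1 = 2$$, $$b_2 = -0.5$$ and zero for the remaining terms, up to rounding.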

### Multiple Inputs

If the output depends on several inputs simultaneously, we can combine them.
For example with two inputs $$t$$ and $$s$$:

```math
y = a + b \cdot t + c \cdot s + d \cdot t \cdot s = (1 \quad t \quad s \quad t \cdot s) \cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = h(t,s) \cdot x
```

The term $$t \cdot s$$ captures the interaction between the two inputs.
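
A minimal sketch of this two-input model, assuming made-up input pairs $$(t_i, s_i)$$ and made-up parameters: each row of the matrix is $$h(t_i, s_i) = (1 \quad t_i \quad s_i \quad t_i \cdot s_i)$$, and the interaction term is simply one more column.

```python
import numpy as np

# Made-up input pairs (t_i, s_i); the values are for illustration only.
t = np.array([0.0, 1.0, 0.0, 1.0, 2.0, 2.0])
s = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 3.0])

# Each row is h(t_i, s_i) = (1, t_i, s_i, t_i * s_i).
A = np.column_stack([np.ones_like(t), t, s, t * s])

# Synthetic, noise-free outputs from made-up parameters (a, b, c, d).
true_x = np.array([1.0, 2.0, -1.0, 0.5])
y = A @ true_x

x, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x)
```

Dropping the last column would give a model without interaction, which is still linear in its three remaining parameters.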

### Multiple Outputs

Sometimes we want to predict several things simultaneously from the same input:

```math
y = a + b \cdot t
```

```math
z = c + d \cdot t
```

This is like having two linear models running simultaneously.
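
Written in the general form, the two equations can be stacked into one model with a two-dimensional output and four parameters. One possible arrangement, assuming the parameter order $$(a, b, c, d)$$, is:

```math
\begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} 1 & t & 0 & 0 \\ 0 & 0 & 1 & t \end{pmatrix} \cdot \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = h(t) \cdot x
```

Here $$h(t)$$ is a $$2 \times 4$$ matrix. The zero blocks show that the two outputs share the input $$t$$ but no parameters.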

## Real Example from Physics

The ideal gas law from physics reads:

```math
p = n \cdot R \cdot \frac{T}{V}
```

Here:

- $$p$$ is the pressure we measure
- $$T$$ is the temperature we set
- $$V$$ is the volume we set
- $$n$$ is the amount of gas in moles (which we want to determine)
- $$R$$ is the universal gas constant, which is known

If we treat $$T$$ and $$V$$ as inputs we control and $$p$$ as the output we measure,
then pressure depends linearly on $$n$$. In the notation of our model, $$h(T, V) = R \cdot T / V$$ and the single parameter is $$x = n$$.
This allows us to use a linear model to determine the amount of gas from pressure measurements.
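
The gas example can be sketched as a one-parameter least-squares problem. The temperatures, volumes, and the "true" amount of gas below are made-up values in SI units, and the pressures are generated synthetically without noise; only the value of the gas constant is a physical fact.

```python
import numpy as np

R = 8.314462618  # molar gas constant in J/(mol K)

# Made-up experimental settings (SI units).
T = np.array([280.0, 300.0, 320.0, 340.0])  # temperature in K
V = np.array([0.010, 0.010, 0.020, 0.020])  # volume in m^3

# Synthetic, noise-free pressures for an assumed amount of 0.5 mol.
n_true = 0.5
p = n_true * R * T / V  # pressure in Pa

# One row h(T_i, V_i) = R * T_i / V_i per measurement, one unknown n.
A = (R * T / V).reshape(-1, 1)
n_est, *_ = np.linalg.lstsq(A, p, rcond=None)

print(n_est[0])  # close to 0.5
```

With real measurements the pressures would contain noise, and the estimate would be the value of $$n$$ that best explains all of them at once.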

## When Models Are Initially Nonlinear

Not all real-world problems are linear in their original form.
Sometimes we encounter models where parameters appear squared, multiplied with each other,
or inside exponential functions:

```math
y = h(t, x)
```

But don't despair! In many cases, we can "linearize" such models.
The idea is to use a tangent (first-order Taylor) approximation around a chosen point $$x_0$$:

```math
y \approx h(t, x_0) + \frac{\partial h}{\partial x}(t, x_0) \cdot (x - x_0)
```

With this trick, we replace a complex curve with a straight line that approximates it.
The result is an equation that is linear with respect to $$x$$, so it can be solved
with standard linear algebra methods like least squares.
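
As an illustration, the sketch below applies one linearization step to a hypothetical exponential-decay model $$h(t, x) = x_1 \cdot e^{-x_2 t}$$. The model, the starting point `x0`, and the data are assumptions made for this example. The linearized equation is rearranged to $$y - h(t, x_0) \approx J \cdot (x - x_0)$$, where $$J$$ is the matrix of partial derivatives, and then solved with least squares.

```python
import numpy as np

def model(t, x):
    """Hypothetical nonlinear model h(t, x) = x[0] * exp(-x[1] * t)."""
    return x[0] * np.exp(-x[1] * t)

def jacobian(t, x):
    """Partial derivatives of the model with respect to x[0] and x[1]."""
    e = np.exp(-x[1] * t)
    return np.column_stack([e, -x[0] * t * e])

# Synthetic, noise-free data from made-up "true" parameters.
t = np.linspace(0.0, 4.0, 9)
x_true = np.array([2.0, 0.5])
y = model(t, x_true)

# Linearize around a nearby starting guess x0.
x0 = np.array([1.8, 0.6])
J = jacobian(t, x0)
r = y - model(t, x0)

# Solve the linear problem J * dx ≈ r, then update the guess.
dx, *_ = np.linalg.lstsq(J, r, rcond=None)
x1 = x0 + dx

print(np.linalg.norm(x0 - x_true), np.linalg.norm(x1 - x_true))
```

The updated guess `x1` is closer to the generating parameters than `x0`. Repeating the step from `x1` is the basic idea behind the Gauss-Newton method, which is named here as a related technique rather than something the text above prescribes.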

However, this linearization only works well if the nonlinear model is not too "curved"
around the point $$x_0$$ and the solution stays close to $$x_0$$. Highly nonlinear models require iterative numerical optimization techniques.
