# Nakafa Learning Content

> For AI agents: use [llms.txt](https://nakafa.com/llms.txt) for the site index. Markdown versions are available by appending `.md` to content URLs or sending `Accept: text/markdown`.

URL: https://nakafa.com/en/subjects/ai-ds/linear-methods/orthogonal-projection
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/ai-ds/linear-methods/orthogonal-projection/en.mdx

Learn orthogonal projection theory with existence theorems, orthonormal basis formulas, Gram matrices, and best approximation methods in vector spaces.

---

## Existence and Uniqueness Theorem

An important question is whether a best approximation really exists and whether it is unique. The answer to both is yes. Let $$V$$ be a Euclidean vector space and $$S \subset V$$ be a finite-dimensional vector subspace. Then for every $$f \in V$$ there exists a unique best approximation $$g \in S$$ with

```math
\|f - g\| = \min_{\varphi \in S} \|f - \varphi\|
```

This theorem guarantees that the best approximation always exists and is unique. It is like looking for the point on a straight highway that is closest to your location: there is always exactly one such point.

Let $$n$$ be the dimension of $$S$$ and $$\psi_1, \ldots, \psi_n$$ be a basis of $$S$$. Using the Gram-Schmidt process, we can compute an orthonormal basis $$\varphi_1, \ldots, \varphi_n$$ of $$S$$ with $$\langle \varphi_i, \varphi_k \rangle = \delta_{ik}$$.
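
As a rough illustration of this step, the sketch below orthonormalizes a basis of a plane in $$\mathbb{R}^3$$. It assumes the standard dot product as the inner product and uses the modified Gram-Schmidt variant, which subtracts from the running remainder; the helper names `inner` and `gram_schmidt` are illustrative, not part of the lesson.

```python
import math


def inner(u, v):
    """Standard dot product on R^m (an assumed choice of inner product)."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Turn linearly independent psi_1..psi_n into orthonormal phi_1..phi_n."""
    ortho = []
    for psi in basis:
        # Remove the components of psi along the vectors found so far.
        w = list(psi)
        for phi in ortho:
            c = inner(w, phi)
            w = [wi - c * pi for wi, pi in zip(w, phi)]
        norm = math.sqrt(inner(w, w))
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        ortho.append([wi / norm for wi in w])
    return ortho


psi = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]  # arbitrary basis of a plane
phi = gram_schmidt(psi)
```

Up to rounding, the resulting vectors satisfy `inner(phi[i], phi[k]) == 1` for `i == k` and `0` otherwise, which is the condition $$\langle \varphi_i, \varphi_k \rangle = \delta_{ik}$$.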

Every $$g \in S$$ has a unique representation as $$g = \sum_{i=1}^n \alpha_i \varphi_i$$. Then it follows that

```math
\|f - g\|^2 = \langle f - g, f - g \rangle = \left\langle f - \sum_{i=1}^n \alpha_i \varphi_i, f - \sum_{k=1}^n \alpha_k \varphi_k \right\rangle
```

```math
= \langle f, f \rangle - 2 \sum_{i=1}^n \alpha_i \langle f, \varphi_i \rangle + \sum_{i,k=1}^n \alpha_i \alpha_k \langle \varphi_i, \varphi_k \rangle
```

```math
= \|f\|^2 - 2 \sum_{i=1}^n \alpha_i \langle f, \varphi_i \rangle + \sum_{i=1}^n \alpha_i^2
```

Using the identity $$(\alpha_i - \langle f, \varphi_i \rangle)^2 = \alpha_i^2 - 2\alpha_i \langle f, \varphi_i \rangle + \langle f, \varphi_i \rangle^2$$, we obtain

```math
\|f - g\|^2 = \|f\|^2 - \sum_{i=1}^n \langle f, \varphi_i \rangle^2 + \sum_{i=1}^n (\alpha_i - \langle f, \varphi_i \rangle)^2
```

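The first two terms do not depend on the coefficients, and the last sum is nonnegative and vanishes exactly when every $$\alpha_i$$ equals $$\langle f, \varphi_i \rangle$$. Hence $$g \in S$$ is the best approximation of $$f$$ if and only if $$\alpha_i = \langle f, \varphi_i \rangle$$ for $$i = 1, \ldots, n$$, which proves both existence and uniqueness.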
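
The effect of choosing other coefficients can be checked numerically. The sketch below assumes $$V = \mathbb{R}^3$$ with the dot product and an illustrative orthonormal pair `phis`; it compares the squared error for the coefficients $$\langle f, \varphi_i \rangle$$ with the squared error for a shifted choice.

```python
def inner(u, v):
    """Standard dot product on R^m (an assumed choice of inner product)."""
    return sum(a * b for a, b in zip(u, v))


def sq_error(f, coeffs, phis):
    """Return ||f - sum_i coeffs[i] * phis[i]||^2."""
    g = [sum(c * phi[j] for c, phi in zip(coeffs, phis)) for j in range(len(f))]
    r = [fj - gj for fj, gj in zip(f, g)]
    return inner(r, r)


# Orthonormal basis of the x-y plane in R^3 (illustrative choice).
phis = [[0.6, 0.8, 0.0], [-0.8, 0.6, 0.0]]
f = [3.0, -2.0, 5.0]

best = [inner(f, phi) for phi in phis]       # alpha_i = <f, phi_i>
shifted = [best[0] + 0.5, best[1] - 1.0]     # any other choice of coefficients

err_best = sq_error(f, best, phis)
err_shifted = sq_error(f, shifted, phis)
```

Here `err_shifted - err_best` equals $$0.5^2 + 1.0^2 = 1.25$$ up to rounding, which is the last sum in the formula above.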

## Orthonormal Basis Formula

For an orthonormal basis $$\varphi_1, \ldots, \varphi_n$$ of $$S$$, the best approximation is given by

```math
g = \sum_{i=1}^n \langle f, \varphi_i \rangle \varphi_i
```

The best approximation satisfies the distance formula

```math
\|f - g\| = \left( \|f\|^2 - \sum_{i=1}^n \langle f, \varphi_i \rangle^2 \right)^{\frac{1}{2}}
```
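
Both formulas can be tried on a small example. The following is a minimal sketch, assuming $$\mathbb{R}^3$$ with the dot product and an illustrative orthonormal basis `phis` of a plane; it computes the projection and compares the directly measured distance with the distance formula.

```python
import math


def inner(u, v):
    """Standard dot product on R^m (an assumed choice of inner product)."""
    return sum(a * b for a, b in zip(u, v))


def project(f, phis):
    """Orthogonal projection g = sum_i <f, phi_i> phi_i for orthonormal phis."""
    g = [0.0] * len(f)
    for phi in phis:
        c = inner(f, phi)
        g = [gj + c * pj for gj, pj in zip(g, phi)]
    return g


s = 1.0 / math.sqrt(2.0)
phis = [[s, s, 0.0], [0.0, 0.0, 1.0]]  # orthonormal basis of a plane in R^3
f = [2.0, 0.0, 1.0]

g = project(f, phis)
# Distance measured directly from the residual f - g.
direct = math.sqrt(sum((fj - gj) ** 2 for fj, gj in zip(f, g)))
# Distance from the formula ||f||^2 - sum_i <f, phi_i>^2.
formula = math.sqrt(inner(f, f) - sum(inner(f, phi) ** 2 for phi in phis))
```

For this input the projection is $$g = (1, 1, 1)$$ and both distances equal $$\sqrt{2}$$. In floating-point arithmetic the expression under the root can become slightly negative when $$f$$ lies almost in $$S$$, so a practical implementation would likely clamp it at zero.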

The best approximation $$g$$ of $$f$$ in $$S$$ is the orthogonal projection of $$f$$ onto $$S$$. This means

```math
\langle f - g, \varphi \rangle = 0 \text{ for all } \varphi \in S
```
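
A short check, using only the formula for $$g$$ and the orthonormality $$\langle \varphi_i, \varphi_k \rangle = \delta_{ik}$$, shows why this holds for each basis vector $$\varphi_k$$:

```math
\langle f - g, \varphi_k \rangle = \langle f, \varphi_k \rangle - \sum_{i=1}^n \langle f, \varphi_i \rangle \langle \varphi_i, \varphi_k \rangle = \langle f, \varphi_k \rangle - \langle f, \varphi_k \rangle = 0
```

Since every $$\varphi \in S$$ is a linear combination of $$\varphi_1, \ldots, \varphi_n$$, linearity of the inner product extends this to all of $$S$$.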

Geometrically, the vector from $$g$$ to $$f$$ is perpendicular to the subspace $$S$$. Imagine dropping a ball straight down onto the floor: the point where it lands is the orthogonal projection of the ball's position onto the floor.

## Construction with Arbitrary Basis

When an orthonormal basis of $$S$$ is not known, we can use an arbitrary basis $$\psi_1, \ldots, \psi_n$$ of $$S$$. Let $$g = \sum_{i=1}^n \alpha_i \psi_i$$ be the unique representation of $$g$$ with respect to this basis.

Since $$\psi_k \in S$$, the orthogonality condition gives

```math
\left\langle f - \sum_{i=1}^n \alpha_i \psi_i, \psi_k \right\rangle = 0, \quad k = 1, \ldots, n
```

This yields the linear system

```math
\sum_{i=1}^n \alpha_i \langle \psi_i, \psi_k \rangle = \langle f, \psi_k \rangle, \quad k = 1, \ldots, n
```
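
The system can be set up and solved for a small example. The sketch below assumes $$\mathbb{R}^3$$ with the dot product and a non-orthogonal basis `psis` of a plane. For brevity it solves the system with a plain Gaussian elimination helper (`solve`, a hypothetical name); a Cholesky factorization would be a common alternative for matrices of this kind.

```python
def inner(u, v):
    """Standard dot product on R^m (an assumed choice of inner product)."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


psis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]  # non-orthogonal basis of a plane
f = [2.0, 0.0, 1.0]

# Row k of the system: sum_i alpha_i <psi_i, psi_k> = <f, psi_k>.
gram = [[inner(pi, pk) for pi in psis] for pk in psis]
rhs = [inner(f, pk) for pk in psis]
alpha = solve(gram, rhs)

g = [sum(a * p[j] for a, p in zip(alpha, psis)) for j in range(len(f))]
residual = [fj - gj for fj, gj in zip(f, g)]
```

For this input the coefficients are $$\alpha = (1/3, 4/3)$$, and `residual` is orthogonal to both basis vectors up to rounding, as the orthogonality condition requires.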

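The coefficient matrix $$A = (\langle \psi_i, \psi_k \rangle)_{i,k=1,\ldots,n}$$ is called the Gram matrix of the basis $$\psi_1, \ldots, \psi_n$$. This matrix is symmetric because the inner product is symmetric, and it is positive definite because the basis vectors are linearly independent: for every $$\alpha \in \mathbb{R}^n$$ with $$\alpha \neq 0$$ the combination $$\sum_{i=1}^n \alpha_i \psi_i$$ is nonzero, so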

```math
\alpha^T A \alpha = \sum_{i,k=1}^n \langle \psi_i, \psi_k \rangle \alpha_i \alpha_k = \left\| \sum_{i=1}^n \alpha_i \psi_i \right\|^2 > 0
```

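Since $$A$$ is positive definite, it is invertible, so the linear system has exactly one solution. In practice, however, $$A$$ can be very ill-conditioned. For example, for the monomial basis $$1, x, \ldots, x^n$$ the condition number of the Gram matrix grows rapidly with $$n$$, so computing $$g$$ numerically becomes unreliable for large $$n$$.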
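
To get a feeling for this, the sketch below builds the Gram matrix of the monomials exactly. It assumes the inner product $$\langle f, g \rangle = \int_0^1 f(x) g(x) \, dx$$ (the interval is an assumption, the text does not fix one), for which $$\langle x^i, x^k \rangle = 1/(i+k+1)$$; this is the Hilbert matrix. The determinant is used here only as a rough indicator of near-singularity, not as a substitute for the condition number.

```python
from fractions import Fraction


def monomial_gram(m):
    """Gram matrix of 1, x, ..., x^(m-1) for <f, g> = integral_0^1 f g dx."""
    return [[Fraction(1, i + k + 1) for i in range(m)] for k in range(m)]


def determinant(A):
    """Exact determinant via fraction-based Gaussian elimination."""
    M = [row[:] for row in A]
    det = Fraction(1)
    for col in range(len(M)):
        # A positive definite matrix has nonzero pivots, so no row swaps are needed.
        det *= M[col][col]
        for r in range(col + 1, len(M)):
            factor = M[r][col] / M[col][col]
            for c in range(col, len(M)):
                M[r][c] -= factor * M[col][c]
    return det


dets = {m: determinant(monomial_gram(m)) for m in range(1, 7)}
for m, d in dets.items():
    print(m, float(d))
```

The determinants for two, three, and four basis functions are $$1/12$$, $$1/2160$$, and $$1/6048000$$. All entries of the matrix lie between 0 and 1, so a determinant that shrinks this quickly indicates that the matrix is rapidly approaching singularity.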

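With an orthonormal basis of $$S$$, the Gauss approximation has the advantage that the best approximation is easy to compute. For the inner product $$\langle f, \varphi \rangle = \int_a^b f(t) \varphi(t) \, dt$$ it is obtained as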

```math
g(x) = \sum_{k=1}^n \langle f, \varphi_k \rangle \varphi_k(x)
```

```math
= \sum_{k=1}^n \left( \int_a^b f(t) \varphi_k(t) \, dt \right) \varphi_k(x)
```

without needing to solve a linear system. With an orthonormal basis, the projection coefficients can be computed directly, much like reading off coordinates in a coordinate system whose axes are already mutually perpendicular.
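
A minimal sketch of this computation for functions follows. It assumes the interval $$[-1, 1]$$, the orthonormal Legendre polynomials up to degree 2 as the basis, and a composite Simpson rule for the integrals; the names `integrate` and `best_approximation` are illustrative.

```python
import math


def integrate(h, a, b, panels=200):
    """Composite Simpson rule for the integral of h over [a, b] (panels even)."""
    step = (b - a) / panels
    total = h(a) + h(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * h(a + j * step)
    return total * step / 3


# Orthonormal Legendre polynomials on [-1, 1] up to degree 2.
phis = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(0.625) * (3 * x * x - 1),
]


def best_approximation(f, a=-1.0, b=1.0):
    """Return g(x) = sum_k <f, phi_k> phi_k(x) with <f, phi> = integral f*phi."""
    coeffs = [integrate(lambda t: f(t) * phi(t), a, b) for phi in phis]
    return lambda x: sum(c * phi(x) for c, phi in zip(coeffs, phis))


g = best_approximation(lambda x: x ** 3)
```

For $$f(x) = x^3$$ the exact best approximation in the space of polynomials of degree at most 2 is $$g(x) = \tfrac{3}{5} x$$, and the sketch reproduces it up to the quadrature error. No linear system is solved at any point.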