# Nakafa Learning Content

> For AI agents: use [llms.txt](https://nakafa.com/llms.txt) for the site index. Markdown versions are available by appending `.md` to content URLs or sending `Accept: text/markdown`.

URL: https://nakafa.com/en/subjects/ai-ds/linear-methods/qr-decomposition
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/ai-ds/linear-methods/qr-decomposition/en.mdx

Learn QR decomposition using Gram-Schmidt and Householder methods for matrix factorization, orthogonal transformations, and linear systems.

---

## QR Decomposition Existence Theorem

We want to transform a matrix into upper triangular form, but not through elementary row operations, rather through orthogonal transformations that have better conditioning. Imagine rotating and reflecting geometric space to simplify the matrix, without changing its fundamental properties.

Let $$A \in \mathbb{R}^{m \times n}$$ be a rectangular matrix with $$m \geq n$$ and $$\text{Rank}A = n$$. Then there exists an orthogonal matrix $$Q \in \mathbb{R}^{m \times m}$$ with $$Q^T Q = I$$ and an upper triangular matrix $$R \in \mathbb{R}^{m \times n}$$ with diagonal elements $$r_{ii} > 0$$ for $$i = 1, \ldots, n$$, such that:

Visible text: Let be a rectangular matrix with and . Then there exists an orthogonal matrix with and an upper triangular matrix with diagonal elements for , such that:

```math
A = Q \cdot R
```

This representation is called the QR decomposition of $$A$$.

Visible text: This representation is called the QR decomposition of .

### Proof using Gram-Schmidt

The columns $$a_j$$ for $$j = 1, \ldots, n$$ of matrix $$A$$ can be orthonormalized using the Gram-Schmidt process:

Visible text: The columns for of matrix can be orthonormalized using the Gram-Schmidt process:

Component: MathContainer
Children:

```math
\tilde{q}_j := a_j - \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k
```

```math
q_j := \frac{\tilde{q}_j}{\|\tilde{q}_j\|_2}
```

We obtain orthonormal vectors $$q_j$$ for $$j = 1, \ldots, n$$ as columns of the orthogonal matrix $$Q_1 \in \mathbb{R}^{m \times n}$$. Conversely:

Visible text: We obtain orthonormal vectors for as columns of the orthogonal matrix . Conversely:

Component: MathContainer
Children:

```math
a_j = \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k + \tilde{q}_j
```

```math
= \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k + \|\tilde{q}_j\|_2 \cdot q_j
```

```math
= \sum_{k=1}^{j-1} q_k \cdot r_{kj} + q_j \cdot r_{jj}
```

Thus $$A = Q_1 R_1$$ with upper triangular matrix $$R_1 \in \mathbb{R}^{n \times n}$$ whose diagonal elements are $$r_{ii} = \|\tilde{q}_i\|_2 > 0$$.

Visible text: Thus with upper triangular matrix whose diagonal elements are .

If $$Q_1$$ is completed with $$m - n$$ additional columns to become orthogonal matrix $$Q = (Q_1 \quad Q_2) \in \mathbb{R}^{m \times m}$$ and $$R_1$$ becomes $$R = \begin{pmatrix} R_1 \\ 0 \end{pmatrix} \in \mathbb{R}^{m \times n}$$, then $$A = Q_1 R_1 = QR$$.

Visible text: If is completed with additional columns to become orthogonal matrix and becomes , then .

## Full and Economical QR Decomposition

When we have a matrix $$A \in \mathbb{R}^{m \times n}$$ with $$m \geq n$$,
there are two ways to represent the QR decomposition. The difference lies in the size of the matrices used.

Visible text: When we have a matrix with ,
there are two ways to represent the QR decomposition. The difference lies in the size of the matrices used.

### Full QR Decomposition

Full QR decomposition uses full-sized matrices:

```math
A = Q \cdot R
```

with $$Q \in \mathbb{R}^{m \times m}$$ being a full-sized orthogonal matrix and
$$R \in \mathbb{R}^{m \times n}$$ being an upper triangular matrix.

Visible text: with being a full-sized orthogonal matrix and
 being an upper triangular matrix.

### Economical QR Decomposition

Since the lower part of matrix $$R$$ only contains zeros, we can save storage and computation.
Economical QR decomposition only uses the parts that are actually needed:

Visible text: Since the lower part of matrix only contains zeros, we can save storage and computation.
Economical QR decomposition only uses the parts that are actually needed:

```math
A = Q \cdot R = (Q_1 \quad Q_2) \cdot \begin{pmatrix} R_1 \\ 0 \end{pmatrix} = Q_1 \cdot R_1
```

Here $$Q_1 \in \mathbb{R}^{m \times n}$$ only takes the first through $$n$$-th columns
from $$Q$$, and $$R_1 \in \mathbb{R}^{n \times n}$$ is a square upper triangular matrix.

Visible text: Here only takes the first through -th columns
from , and is a square upper triangular matrix.

Why is it called economical? Because we save storage space and computational time.
Instead of storing matrix $$Q$$ of size $$m \times m$$ which can be very large,
we only need $$Q_1$$ of size $$m \times n$$.

Visible text: Why is it called economical? Because we save storage space and computational time.
Instead of storing matrix of size which can be very large,
we only need of size .

The columns of matrix $$Q_2 \in \mathbb{R}^{m \times (m-n)}$$ that we don't use form
an orthonormal basis of $$\text{Kernel}A^T$$:

Visible text: The columns of matrix that we don't use form
an orthonormal basis of :

```math
A^T Q_2 = R_1^T Q_1^T Q_2 = 0
```

### Uniqueness Property

The economical QR decomposition $$A = Q_1 \cdot R_1$$ with condition $$r_{ii} > 0$$
for all $$i = 1, \ldots, n$$ is unique for matrix $$A$$
that has full rank.

Visible text: The economical QR decomposition with condition 
for all is unique for matrix 
that has full rank.

## Householder Method for QR Decomposition

Although the Gram-Schmidt process provides an elegant theoretical way to obtain QR decomposition,
this method is not suitable for practical computation. The main problem is numerical instability due to cancellation,
orthogonality of columns is quickly lost during computation.

To overcome this problem, we need a method that is more numerically stable.
One of the most successful approaches is the Householder procedure,
which uses orthogonal reflection transformations. Another alternative is the Givens procedure with rotation transformations.

### Householder Transformation

For a vector $$v \in \mathbb{R}^m$$ with $$\|v\|_2 = 1$$,
we can define the Householder transformation matrix:

Visible text: For a vector with ,
we can define the Householder transformation matrix:

```math
S = I - 2vv^T \in \mathbb{R}^{m \times m}
```

Note that $$vv^T$$ is a dyadic product, which is the multiplication of column vector
$$v \in \mathbb{R}^{m \times 1}$$ with row vector $$v^T \in \mathbb{R}^{1 \times m}$$.
The result of this multiplication is an $$m \times m$$ matrix with rank $$1$$ for $$v \neq 0$$.
Don't confuse this with scalar multiplication $$v^T v \in \mathbb{R}$$.

Visible text: Note that is a dyadic product, which is the multiplication of column vector
 with row vector .
The result of this multiplication is an matrix with rank for .
Don't confuse this with scalar multiplication .

## Properties of Householder Transformation

Let $$S = I - 2vv^T \in \mathbb{R}^{m \times m}$$ be a Householder transformation matrix
for vector $$v \in \mathbb{R}^m$$ with $$\|v\|_2 = 1$$. Then it holds:

Visible text: Let be a Householder transformation matrix
for vector with . Then it holds:

1. $$S$$ is symmetric: $$S^T = S$$

2. $$S$$ is an orthogonal matrix: $$S^T S = I$$

3. Multiplication $$S \cdot x$$ of $$S$$ from the left with vector $$x \in \mathbb{R}^n$$
   causes reflection of $$x$$ in the subspace $$\text{Span}(v)^{\perp}$$,
   that is, in the hyperplane with normal vector $$v$$

4. $$\text{cond}_2(S) = 1$$

Visible text: 1. is symmetric: 

2. is an orthogonal matrix: 

3. Multiplication of from the left with vector 
 causes reflection of in the subspace ,
 that is, in the hyperplane with normal vector 

4.

## Householder Procedure

Given matrix $$A \in \mathbb{R}^{m \times n}$$ with $$m \geq n$$
and $$\text{Rank}A = n$$. For QR decomposition computation,
matrix $$A$$ is transformed column by column through Householder reflections
into upper triangular form.

Visible text: Given matrix with 
and . For QR decomposition computation,
matrix is transformed column by column through Householder reflections
into upper triangular form.

Start with $$A_1 = A$$ and reflect the first column of $$A_1$$
with respect to a vector using reflection plane with:

Visible text: Start with and reflect the first column of 
with respect to a vector using reflection plane with:

Component: MathContainer
Children:

```math
\tilde{a}_1 \text{ first column of } A_1
```

```math
\pm\|\tilde{a}_1\|_2 \cdot e_1 \in \mathbb{R}^m \text{ target vector}
```

```math
v_1 = u_1/\|u_1\|_2 \text{ with } \text{Span}(v_1)^{\perp}
```

```math
u_1 = \tilde{a}_1 \mp \|\tilde{a}_1\|_2 \cdot e_1
```

### Iterative Transformation Process

Householder transformation matrix $$S_1 = I_m - 2v_1v_1^T \in \mathbb{R}^{m \times m}$$. We obtain:

Visible text: Householder transformation matrix . We obtain:

```math
A_2 = S_1 \cdot A_1 = \begin{pmatrix} r_{11} & * \\ 0 & \tilde{A}_2 \end{pmatrix}
```

with $$r_{11} = \pm\|\tilde{a}_1\|_2$$ and $$\tilde{A}_2 \in \mathbb{R}^{m-1 \times n-1}$$.

Visible text: with and .

Continue with reflection of the first column of the submatrix with reflection plane:

Component: MathContainer
Children:

```math
\tilde{a}_2 \text{ first column of } \tilde{A}_2
```

```math
\pm\|\tilde{a}_2\|_2 \cdot \tilde{e}_1 \in \mathbb{R}^{m-1}
```

```math
v_2 = u_2/\|u_2\|_2
```

```math
u_2 = \tilde{a}_2 \mp \|\tilde{a}_2\|_2 \cdot \tilde{e}_1
```

With transformation matrix:

Component: MathContainer
Children:

```math
\tilde{S}_2 = I_{m-1} - 2v_2v_2^T \in \mathbb{R}^{m-1 \times m-1}
```

```math
S_2 = \begin{pmatrix} I_1 & 0 \\ 0 & \tilde{S}_2 \end{pmatrix} \in \mathbb{R}^{m \times m}
```

We obtain:

```math
A_3 = S_2 \cdot A_2 = \begin{pmatrix} r_{11} & * & * \\ 0 & r_{22} & * \\ 0 & 0 & \tilde{A}_3 \end{pmatrix}
```

with $$r_{22} = \pm\|\tilde{a}_2\|_2$$ and $$\tilde{A}_3 \in \mathbb{R}^{m-2 \times n-2}$$, and so on until:

Visible text: with and , and so on until:

```math
\tilde{A}_n = \begin{pmatrix} \tilde{a}_{nn} \\ \vdots \\ \tilde{a}_{mn} \end{pmatrix} \in \mathbb{R}^{m-(n-1) \times 1}
```

Finally we obtain the upper triangular matrix:

```math
A_{n+1} = R = \begin{pmatrix} r_{11} & & * \\ & \ddots & \\ 0 & & r_{nn} \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} = S_n \cdot \ldots \cdot S_1 \cdot A = Q^T \cdot A
```

Thus we obtain the factorization:

Component: MathContainer
Children:

```math
A = Q \cdot R
```

```math
Q = (S_n \cdot \ldots \cdot S_1)^T = S_1 \cdot \ldots \cdot S_n
```

with $$Q$$ being an orthogonal matrix.

Visible text: with being an orthogonal matrix.

## Householder Implementation Algorithm

Implementation of the Householder procedure:

Start with:

Component: MathContainer
Children:

```math
A_1 := A
```

```math
Q_1 := I
```

For $$i = 1, \ldots, n$$:

Visible text: For :

```math
\tilde{a}_i = \begin{pmatrix} a_{ii} \\ \vdots \\ a_{mi} \end{pmatrix} := \begin{pmatrix} (A_i)_{ii} \\ \vdots \\ (A_i)_{mi} \end{pmatrix} \in \mathbb{R}^{m-i+1}
```

Compute:

Component: MathContainer
Children:

```math
\sigma := \|\tilde{a}_i\|_2
```

```math
u_i := \tilde{a}_i + \begin{pmatrix} \mp\sigma \\ 0 \\ \vdots \\ 0 \end{pmatrix} \in \mathbb{R}^{m-i+1}
```

```math
\hat{u}_i := \begin{pmatrix} 0 \\ \vdots \\ 0 \\ u_i \end{pmatrix} \in \mathbb{R}^m
```

and:

```math
\beta := 1/(\sigma(\sigma + |\tilde{a}_{ii}|))
```

Then:

Component: MathContainer
Children:

```math
v_i = \frac{\tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1}{\|\tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1\|_2} = \frac{u_i}{\|u_i\|_2}
```

```math
\tilde{S}_i = I_{m-i+1} - 2v_i v_i^T = I_{m-i+1} - \beta u_i u_i^T
```

```math
S_i = I_m - \beta \hat{u}_i \hat{u}_i^T
```

So we obtain:

Component: MathContainer
Children:

```math
A_{i+1} := S_i A_i = (I_m - \beta \hat{u}_i \hat{u}_i^T) A_i = A_i - \hat{u}_i (\beta \hat{u}_i^T A_i)
```

```math
Q_{i+1} := Q_i S_i = Q_i (I_m - \beta \hat{u}_i \hat{u}_i^T) = Q_i - (Q_i \hat{u}_i) \beta \hat{u}_i^T
```

Finally we obtain:

Component: MathContainer
Children:

```math
R := A_{n+1}
```

```math
Q := Q_{n+1}
```

## Numerical Aspects and Householder Complexity

### Computational Complexity

The numerical complexity for computing QR decomposition of matrix $$A \in \mathbb{R}^{m \times n}$$
with Householder procedure is essentially:

Visible text: The numerical complexity for computing QR decomposition of matrix 
with Householder procedure is essentially:

Component: MathContainer
Children:

```math
n^2 \cdot m \text{ arithmetic operations for } m \gg n
```

```math
\frac{2}{3}n^3 \text{ arithmetic operations for } m \approx n
```

### Numerical Properties

1. Due to orthogonal transformation, it holds that $$\text{cond}_2(R) = \text{cond}_2(A)$$

2. The diagonal elements of $$R$$ are the numbers $$\pm\|\tilde{a}_i\|_2$$
   from the $$i$$-th step. Choosing the transformation so that all are positive gives the QR decomposition.
   If $$\|\tilde{a}_i\|_2 = 0$$, $$A$$ does not have full rank
   and the algorithm stops.

   For numerical reasons choose the sign so that cancellation is avoided:

   <MathContainer>
   
   
   ```math
   u_i = \tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1 \text{ (general form)}
   ```

   
   
   ```math
   u_i = \tilde{a}_i + \text{sgn}(\tilde{a}_i) \cdot \|\tilde{a}_i\|_2 \cdot e_1 \text{ (optimal sign choice)}
   ```

   </MathContainer>

3. Instead of constructing $$Q$$, in compact storage one can also store
   only the information needed for the transformation:

   
   
   ```math
   \text{Store vector } \tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1
   ```

   in the free space of $$A$$ below the diagonal and diagonal elements in an additional vector.

4. If you only want to compute economical QR decomposition, remove the corresponding columns of $$Q$$ and rows of $$R$$.

Visible text: 1. Due to orthogonal transformation, it holds that 

2. The diagonal elements of are the numbers 
 from the -th step. Choosing the transformation so that all are positive gives the QR decomposition.
 If , does not have full rank
 and the algorithm stops.

 For numerical reasons choose the sign so that cancellation is avoided:

 <MathContainer>
 
 

 
 

 </MathContainer>

3. Instead of constructing , in compact storage one can also store
 only the information needed for the transformation:

 
 

 in the free space of below the diagonal and diagonal elements in an additional vector.

4. If you only want to compute economical QR decomposition, remove the corresponding columns of and rows of .