Linear Methods of AI

QR Decomposition

QR Decomposition Existence Theorem

We want to transform a matrix into upper triangular form, not through elementary row operations but through orthogonal transformations, which have better conditioning. Imagine rotating and reflecting geometric space to simplify the matrix without changing its fundamental properties.

Let $A \in \mathbb{R}^{m \times n}$ be a rectangular matrix with $m \geq n$ and $\text{Rank}\,A = n$. Then there exists an orthogonal matrix $Q \in \mathbb{R}^{m \times m}$ with $Q^T Q = I$ and an upper triangular matrix $R \in \mathbb{R}^{m \times n}$ with diagonal elements $r_{ii} > 0$ for $i = 1, \ldots, n$, such that:

$$A = Q \cdot R$$

This representation is called the QR decomposition of $A$.

Proof using Gram-Schmidt

The columns $a_j$, $j = 1, \ldots, n$, of the matrix $A$ can be orthonormalized using the Gram-Schmidt process:

$$\tilde{q}_j := a_j - \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k, \qquad q_j := \frac{\tilde{q}_j}{\|\tilde{q}_j\|_2}$$

We obtain orthonormal vectors $q_j$ for $j = 1, \ldots, n$ as columns of the matrix $Q_1 \in \mathbb{R}^{m \times n}$ with orthonormal columns. Conversely:

$$a_j = \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k + \tilde{q}_j = \sum_{k=1}^{j-1} \langle a_j, q_k \rangle \cdot q_k + \|\tilde{q}_j\|_2 \cdot q_j = \sum_{k=1}^{j-1} q_k \cdot r_{kj} + q_j \cdot r_{jj}$$

with $r_{kj} := \langle a_j, q_k \rangle$ for $k < j$ and $r_{jj} := \|\tilde{q}_j\|_2$.

Thus $A = Q_1 R_1$ with an upper triangular matrix $R_1 \in \mathbb{R}^{n \times n}$ whose diagonal elements are $r_{ii} = \|\tilde{q}_i\|_2 > 0$.

If $Q_1$ is completed with $m - n$ additional columns to an orthogonal matrix $Q = (Q_1 \quad Q_2) \in \mathbb{R}^{m \times m}$ and $R_1$ is extended to $R = \begin{pmatrix} R_1 \\ 0 \end{pmatrix} \in \mathbb{R}^{m \times n}$, then $A = Q_1 R_1 = QR$.
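The construction used in this proof can be written out directly in code. A minimal NumPy sketch, assuming $A$ has full column rank (the function name `gram_schmidt_qr` is ours, not from any library):

```python
import numpy as np

def gram_schmidt_qr(A):
    """Economical QR via classical Gram-Schmidt.

    Returns Q1 (m x n, orthonormal columns) and R1 (n x n, upper
    triangular with positive diagonal), assuming Rank A = n.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q1 = np.zeros((m, n))
    R1 = np.zeros((n, n))
    for j in range(n):
        q_tilde = A[:, j].copy()
        for k in range(j):
            R1[k, j] = A[:, j] @ Q1[:, k]      # r_kj = <a_j, q_k>
            q_tilde -= R1[k, j] * Q1[:, k]     # subtract projection
        R1[j, j] = np.linalg.norm(q_tilde)     # r_jj = ||q~_j||_2 > 0
        Q1[:, j] = q_tilde / R1[j, j]
    return Q1, R1
```

As discussed later in these notes, this classical variant is numerically fragile; it serves only to illustrate the existence proof.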

Full and Economical QR Decomposition

When we have a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$, there are two ways to represent the QR decomposition. The difference lies in the sizes of the matrices involved.

Full QR Decomposition

Full QR decomposition uses full-sized matrices:

$$A = Q \cdot R$$

with $Q \in \mathbb{R}^{m \times m}$ a full-sized orthogonal matrix and $R \in \mathbb{R}^{m \times n}$ an upper triangular matrix.

Economical QR Decomposition

Since the lower block of the matrix $R$ contains only zeros, we can save storage and computation. The economical QR decomposition uses only the parts that are actually needed:

$$A = Q \cdot R = (Q_1 \quad Q_2) \cdot \begin{pmatrix} R_1 \\ 0 \end{pmatrix} = Q_1 \cdot R_1$$

Here $Q_1 \in \mathbb{R}^{m \times n}$ consists of the first $n$ columns of $Q$, and $R_1 \in \mathbb{R}^{n \times n}$ is a square upper triangular matrix.

Why is it called economical? Because it saves storage space and computation time: instead of storing the matrix $Q$ of size $m \times m$, which can be very large, we only need $Q_1$ of size $m \times n$.

The columns of the unused matrix $Q_2 \in \mathbb{R}^{m \times (m-n)}$ form an orthonormal basis of $\text{Kernel}\,A^T$:

$$A^T Q_2 = R_1^T Q_1^T Q_2 = 0$$
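Both variants are available in NumPy via `np.linalg.qr` and its `mode` parameter. A short sketch that also checks the kernel property of $Q_2$ (note that NumPy does not enforce $r_{ii} > 0$, so its factors may differ from the unique decomposition by signs):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))        # m = 6, n = 3; generically full rank

# Economical: Q1 is 6x3, R1 is 3x3
Q1, R1 = np.linalg.qr(A, mode="reduced")

# Full: Q is 6x6, R is 6x3 with zero rows below row n
Q, R = np.linalg.qr(A, mode="complete")

Q2 = Q[:, 3:]                          # the columns dropped in the economical form
# Columns of Q2 lie in Kernel(A^T): A^T Q2 = R1^T Q1^T Q2 = 0
print(np.allclose(A.T @ Q2, 0))        # True
```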

Uniqueness Property

The economical QR decomposition $A = Q_1 \cdot R_1$ with the condition $r_{ii} > 0$ for all $i = 1, \ldots, n$ is unique for a matrix $A$ of full rank.

Householder Method for QR Decomposition

Although the Gram-Schmidt process provides an elegant theoretical route to the QR decomposition, it is not suitable for practical computation. The main problem is numerical instability due to cancellation: the orthogonality of the computed columns is quickly lost.

To overcome this problem, we need a more numerically stable method. One of the most successful approaches is the Householder procedure, which uses orthogonal reflections. An alternative is the Givens procedure, which uses rotations.

Householder Transformation

For a vector $v \in \mathbb{R}^m$ with $\|v\|_2 = 1$, we can define the Householder transformation matrix:

$$S = I - 2vv^T \in \mathbb{R}^{m \times m}$$

Note that $vv^T$ is a dyadic (outer) product: the column vector $v \in \mathbb{R}^{m \times 1}$ multiplied by the row vector $v^T \in \mathbb{R}^{1 \times m}$. The result is an $m \times m$ matrix of rank 1 for $v \neq 0$. Do not confuse this with the inner product $v^T v \in \mathbb{R}$, which is a scalar.
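The difference is easy to see numerically. A tiny sketch (NumPy's `np.outer` computes the dyadic product; the example vector is arbitrary):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])

outer = np.outer(v, v)                 # v v^T: a 3x3 matrix of rank 1
inner = v @ v                          # v^T v: a scalar

print(outer.shape)                     # (3, 3)
print(np.linalg.matrix_rank(outer))    # 1
print(inner)                           # 9.0
```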

Properties of Householder Transformation

Let $S = I - 2vv^T \in \mathbb{R}^{m \times m}$ be the Householder transformation matrix for a vector $v \in \mathbb{R}^m$ with $\|v\|_2 = 1$. Then it holds:

  1. $S$ is symmetric: $S^T = S$

  2. $S$ is an orthogonal matrix: $S^T S = I$

  3. Multiplying a vector $x \in \mathbb{R}^m$ by $S$ from the left reflects $x$ in the subspace $\text{Span}(v)^{\perp}$, that is, in the hyperplane with normal vector $v$

  4. $\text{cond}_2(S) = 1$
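All four properties can be verified numerically. A small sketch (the dimension 5 and the random vector are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(5)
v /= np.linalg.norm(v)                  # normalize: ||v||_2 = 1

S = np.eye(5) - 2.0 * np.outer(v, v)    # S = I - 2 v v^T

print(np.allclose(S, S.T))              # True: symmetric
print(np.allclose(S @ S, np.eye(5)))    # True: orthogonal (S^T S = I)

x = rng.standard_normal(5)
# Reflection: the component along v flips, the part in Span(v)^perp stays
print(np.allclose(S @ x, x - 2 * (v @ x) * v))   # True
print(np.isclose(np.linalg.cond(S), 1.0))        # True: cond_2(S) = 1
```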

Householder Procedure

Given a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$ and $\text{Rank}\,A = n$. To compute the QR decomposition, $A$ is transformed column by column into upper triangular form by Householder reflections.

Start with $A_1 = A$ and reflect the first column of $A_1$ onto a multiple of the first unit vector, with:

$$\tilde{a}_1 \text{ the first column of } A_1$$
$$\pm\|\tilde{a}_1\|_2 \cdot e_1 \in \mathbb{R}^m \text{ the target vector}$$
$$u_1 = \tilde{a}_1 \mp \|\tilde{a}_1\|_2 \cdot e_1$$
$$v_1 = u_1/\|u_1\|_2 \text{, with reflection hyperplane } \text{Span}(v_1)^{\perp}$$

Iterative Transformation Process

The Householder transformation matrix is $S_1 = I_m - 2v_1v_1^T \in \mathbb{R}^{m \times m}$. We obtain:

$$A_2 = S_1 \cdot A_1 = \begin{pmatrix} r_{11} & * \\ 0 & \tilde{A}_2 \end{pmatrix}$$

with $r_{11} = \pm\|\tilde{a}_1\|_2$ and $\tilde{A}_2 \in \mathbb{R}^{(m-1) \times (n-1)}$.

Continue by reflecting the first column of the submatrix, with:

$$\tilde{a}_2 \text{ the first column of } \tilde{A}_2$$
$$\pm\|\tilde{a}_2\|_2 \cdot \tilde{e}_1 \in \mathbb{R}^{m-1} \text{ the target vector}$$
$$u_2 = \tilde{a}_2 \mp \|\tilde{a}_2\|_2 \cdot \tilde{e}_1$$
$$v_2 = u_2/\|u_2\|_2$$

With transformation matrix:

$$\tilde{S}_2 = I_{m-1} - 2v_2v_2^T \in \mathbb{R}^{(m-1) \times (m-1)}, \qquad S_2 = \begin{pmatrix} I_1 & 0 \\ 0 & \tilde{S}_2 \end{pmatrix} \in \mathbb{R}^{m \times m}$$

We obtain:

$$A_3 = S_2 \cdot A_2 = \begin{pmatrix} r_{11} & * & * \\ 0 & r_{22} & * \\ 0 & 0 & \tilde{A}_3 \end{pmatrix}$$

with $r_{22} = \pm\|\tilde{a}_2\|_2$ and $\tilde{A}_3 \in \mathbb{R}^{(m-2) \times (n-2)}$, and so on, until only a single column remains:

$$\tilde{A}_n = \begin{pmatrix} \tilde{a}_{nn} \\ \vdots \\ \tilde{a}_{mn} \end{pmatrix} \in \mathbb{R}^{(m-n+1) \times 1}$$

Finally we obtain the upper triangular matrix:

$$A_{n+1} = R = \begin{pmatrix} r_{11} & & * \\ & \ddots & \\ 0 & & r_{nn} \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} = S_n \cdots S_1 \cdot A = Q^T \cdot A$$

Thus we obtain the factorization:

$$A = Q \cdot R, \qquad Q = (S_n \cdots S_1)^T = S_1 \cdots S_n$$

with $Q$ an orthogonal matrix (each $S_i$ is symmetric and orthogonal, so $S_i^T = S_i$).

Householder Implementation Algorithm

Implementation of the Householder procedure:

Start with:

$$A_1 := A, \qquad Q_1 := I$$

For $i = 1, \ldots, n$:

$$\tilde{a}_i = \begin{pmatrix} a_{ii} \\ \vdots \\ a_{mi} \end{pmatrix} := \begin{pmatrix} (A_i)_{ii} \\ \vdots \\ (A_i)_{mi} \end{pmatrix} \in \mathbb{R}^{m-i+1}$$

Compute:

$$\sigma := \|\tilde{a}_i\|_2, \qquad u_i := \tilde{a}_i + \begin{pmatrix} \mp\sigma \\ 0 \\ \vdots \\ 0 \end{pmatrix} \in \mathbb{R}^{m-i+1}, \qquad \hat{u}_i := \begin{pmatrix} 0 \\ \vdots \\ 0 \\ u_i \end{pmatrix} \in \mathbb{R}^m$$

and:

$$\beta := 1/(\sigma(\sigma + |\tilde{a}_{ii}|))$$

Then:

$$v_i = \frac{\tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1}{\|\tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1\|_2} = \frac{u_i}{\|u_i\|_2}$$
$$\tilde{S}_i = I_{m-i+1} - 2v_i v_i^T = I_{m-i+1} - \beta u_i u_i^T$$
$$S_i = I_m - \beta \hat{u}_i \hat{u}_i^T$$

So we obtain:

$$A_{i+1} := S_i A_i = (I_m - \beta \hat{u}_i \hat{u}_i^T) A_i = A_i - \hat{u}_i (\beta \hat{u}_i^T A_i)$$
$$Q_{i+1} := Q_i S_i = Q_i (I_m - \beta \hat{u}_i \hat{u}_i^T) = Q_i - (Q_i \hat{u}_i) \beta \hat{u}_i^T$$

Finally we obtain:

$$R := A_{n+1}, \qquad Q := Q_{n+1}$$
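The steps above can be sketched in NumPy as follows. This is a minimal, unoptimized version (the function name `householder_qr` is ours); it takes the sign of $\sigma$ from the pivot entry $\tilde{a}_{ii}$ so that no cancellation occurs, which means the diagonal of $R$ may be negative rather than positive.

```python
import numpy as np

def householder_qr(A):
    """Full QR by Householder reflections, using the rank-1 updates
    A_{i+1} = A_i - u (beta u^T A_i) and Q_{i+1} = Q_i - (Q_i u) beta u^T,
    applied only to the trailing submatrix (u has zeros above row i)."""
    A = np.asarray(A, dtype=float).copy()
    m, n = A.shape
    Q = np.eye(m)
    for i in range(n):
        a = A[i:, i]                       # current column below the diagonal
        sigma = np.linalg.norm(a)
        if sigma == 0.0:                   # A would not have full rank
            continue
        u = a.copy()
        u[0] += np.copysign(sigma, a[0])   # sign chosen to avoid cancellation
        beta = 1.0 / (sigma * (sigma + abs(a[0])))   # beta = 2 / ||u||^2
        A[i:, :] -= np.outer(u, beta * (u @ A[i:, :]))
        Q[:, i:] -= np.outer(Q[:, i:] @ u, beta * u)
    return Q, A                            # A has been overwritten with R
```

Accumulating $Q$ explicitly is done here only for illustration; as noted below, practical codes often store the reflection vectors instead.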

Numerical Aspects and Householder Complexity

Computational Complexity

The numerical cost of computing the QR decomposition of a matrix $A \in \mathbb{R}^{m \times n}$ with the Householder procedure is essentially:

$$n^2 m \text{ arithmetic operations for } m \gg n, \qquad \tfrac{2}{3}n^3 \text{ arithmetic operations for } m \approx n$$

Numerical Properties

  1. Due to the orthogonal transformations, it holds that $\text{cond}_2(R) = \text{cond}_2(A)$

  2. The diagonal elements of $R$ are the numbers $\pm\|\tilde{a}_i\|_2$ from the $i$-th step. Choosing the transformations so that all of them are positive yields the QR decomposition. If $\|\tilde{a}_i\|_2 = 0$, then $A$ does not have full rank and the algorithm stops.

    For numerical reasons, choose the sign so that cancellation is avoided:

    $$u_i = \tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1 \text{ (general form)}$$
    $$u_i = \tilde{a}_i + \text{sgn}(\tilde{a}_{ii}) \cdot \|\tilde{a}_i\|_2 \cdot e_1 \text{ (optimal sign choice)}$$

  3. Instead of constructing $Q$ explicitly, compact storage keeps only the information needed to apply the transformations: store the vector $\tilde{a}_i \mp \|\tilde{a}_i\|_2 \cdot e_1$ in the free space of $A$ below the diagonal, and the diagonal elements in an additional vector.

  4. If only the economical QR decomposition is needed, drop the corresponding columns of $Q$ and rows of $R$.
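Property 1 is easy to check numerically, since $A = Q_1 R_1$ and $R_1$ share the same singular values. A small sketch (random test matrix, generically of full rank):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 4))

Q1, R1 = np.linalg.qr(A)               # economical QR

# Orthogonal transformations preserve the spectral condition number
print(np.isclose(np.linalg.cond(A), np.linalg.cond(R1)))   # True
```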