QR decomposition using Givens rotations

08 Dec 2021 - tsp
Last update 20 Mar 2022
Reading time 7 mins

TL;DR:

QR decomposition of a matrix $A \in \mathfrak{R}^{m \times n}$ produces an orthogonal matrix $Q \in \mathfrak{R}^{m \times m}$ and an upper triangular matrix $R \in \mathfrak{R}^{m \times n}$ such that $A = Q * R$
$Q$ and $R$ are easily invertible even if $A$ is not
Can be done using Householder reflections or Givens rotations
Givens rotations require $\mathcal{O}(\frac{4}{3}n^3)$ multiplications / divisions and $\mathcal{O}(\frac{1}{2} n^2)$ square roots, that’s double the cost as for Householder reflections
Can be embedded in some particular algorithms such as GMRES pretty efficiently when done by Givens rotations
No pivoting required
Introduction
QR decomposition by Givens rotations
Python demonstration

Introduction

The following article is a short summary (recipe) on how to perform QR factorization using Givens rotations. This has been written since I’m in process of writing a short summary on how techniques like the generalized minimal residual method (GMRES) for for numerical solutions of linear equations work since this is an often encountered problem in stuff like numerically approximating solutions for electromagnetic fields (when FEM or FTDM modeling of Maxwell equations), performing calculations solutions such as finite element calculations of stresses, for solving fluid dynamics equations, doing social network simulations, etc. - basically for every problem requiring the solution of systems specified by local differential equations. It also has applications in truss constructions, optimizations in business as well as of course quantum mechanics and computational chemistry. QR factorization can be used to solve the least squares problem. The article is part of a mini series on techniques in linear algebra such as power iterations which are used in similar context (Eigenmode solvers for electromagnetic radiation, economic fraud prevention, social network analysis, etc.). Note that it should not provide a mathematical found and solid formulation but rather provide my view of the concept and the picture I have how this stuff works - my personal physicists view. For a mathematical found and rigid description refer to your favorite math lecture notes.

What is a QR decomposition anyways? Basically it allows one to decompose a matrix $A \in \mathfrak{R}^{m \times n}$ that might not be invertible into a orthogonal matrix $Q \in \mathfrak{R}^{m \times m}$ and a upper tridiagonal matrix $R \in \mathfrak{R}^{m \times n}$ with $A = Q * R$.

Recall that orthogonal means $Q^T Q = Q Q^T = \mathfrak{I}$ and thus $Q^{-1} = Q^T$. The upper triangular matrix is also invertible as long as the diagonal elements are not zero - there are some constructions where it’s obvious. To actually perform the QR decomposition there are many different approaches. The most popular ones are:

Householder reflections. This is by far the most efficient one - but I won’t talk about this method in this article since I’m more interested in …
Using givens rotations. In contrast to householder reflections this produces single zeros in the upper triangular matrix one at a time by performing rotations in orthogonal planes - which makes the calculation of the inverse matrix really easy.
One can also apply Gram-Schmidt orthogonalization - but since this is numerically unstable one usually does not use it in numerics.

QR decomposition by Givens rotations

So what’s the idea of Givens rotations. These are orthogonal rotations that are used to form the upper triangular matrix $R$ by eliminating every non zero element left below the diagonal - one element at a time. The combination of the transformation matrices forming orthogonal rotations is still orthogonal - so this will then be our matrix $Q$. Since $Q$ is orthogonal one then also has determined $Q^{-1} = Q^T$ which is important for stuff like least squares and thus algorithms like GMRES. These also avoid the inversion of $R$.

To illustrate the idea one can look at the two dimensional case ($\mathfrak{R}^2$) at first when trying to apply an orthogonal rotation matrix in a way to rotate a vector $\vec{x}$ into direction of the first unit vector $\hat{e_1}$:

[ \vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \\ \begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix} \\ \begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} r * cos(\phi) \\ r * sin(\phi) \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix} ]

In the last step I’ve used the polar form of the vector $\vec{x}$ described by it’s length $r = \sqrt{x_1^2 + x_2^2} = | \vec{x} |_2$ and it’s polar angle $\phi = arg(\vec{x}) \in [0 ; 2\pi)$. This leads to a set of two linear equations:

[ c * r * cos(\phi) + s * r * sin(\phi) = r \\ -s * r * sin(\phi) + c * r * cos(\phi) = 0 ]

Solving the system of linear equations one simply yields the rotation matrix:

[ c = cos(\phi) = \frac{x_1}{r} \\ s = sin(\phi) = \frac{x_2}{r} \\ \frac{1}{\| \vec{x} \|_2} \begin{pmatrix} x_1 & x_2 \\ -x_2 & x_1 \end{pmatrix} ]

For numerical stability it’s sometimes interesting to express this in a slightly different way:

[ s = \frac{x_2}{r} = \frac{x_2}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{\sqrt{\frac{x_1^2}{x_2^2} + 1}} \\ c = \frac{x_1}{r} = \underbrace{\frac{x_2}{r}}_{s} * \frac{x_1}{x_2} ]

To perform QR factorization the idea is just extended into $n$ dimensional space. The givens rotations in the plane spawned by basis vectors $\hat{e_i}$ and $\hat{e_j}$ can be expressed as:

[ G_{ik} = \begin{pmatrix} 1 & 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & 1 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & c & \ldots & \ldots & \ldots & s & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & 1 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & 1 & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & -s & \ldots & \ldots & \ldots & c & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 1 & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 1 \end{pmatrix} \\ G_{ik} \in \mathfrak{R}^{n \times n} ]

Applying a Givens rotation to an arbitrary vector $\vec{x} \in \mathfrak{R}^{n}$ gives:

[ G_{ik} * \vec{x} = G_{ik} * \begin{pmatrix} x_1 \\ \ldots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \ldots \\ c x_i + s x_k \\ \ldots \\ -s x_i + c x_k \\ \ldots \\ x_n \end{pmatrix} \\ G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \ldots \\ \frac{1}{r} (x_i^2 + x_k^2) \\ \ldots \\ \frac{1}{r} \underbrace{(-x_i x_k + x_i x_k)}_{0} \\ \ldots \\ x_n \end{pmatrix} \\ G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \ldots \\ r \\ \ldots \\ 0 \\ \ldots \\ x_n \end{pmatrix} ]

As one sees this produces a zero in the $k$ row of the vector. To perform the full QR decomposition one now applies one Givens rotation after each other to set all elements below the diagonal to zero:

[ G_{i_N,k_N} * \ldots * G_{i_1,k_1} * A = R ]

The resulting matrix $R$ is already the second part of our QR decomposition. The inverse of the rotations (which is simply formed by transposition for a orthogonal rotation) yields the orthogonal $Q$ part:

[ A = \underbrace{G_{i_1,k_1}^T * \ldots * G_{i_N,k_N}^T}_{Q} * R = Q * R ]

Python demonstration

The following GitHub GIST contains a simple implementation of the described decomposition algorithm. Note that the used matrix class is of course not performant and should not be used for realworld applications - especially not in Python. This implementation only serves demonstration purposes. This notebook is also available as a PDF