08 Dec 2021 - tsp

Last update 02 Jan 2022

**TL;DR:**

- QR decomposition of a matrix $A \in \mathfrak{R}^{m \times n}$ produces an orthogonal matrix $Q \in \mathfrak{R}^{m \times m}$ and an upper triangular matrix $R \in \mathfrak{R}^{m \times n}$ such that $A = Q * R$
- $Q$ and $R$ are easily invertible even if $A$ is not
- Can be done using Householder reflections or Givens rotations
- Givens rotations require $\mathcal{O}(\frac{4}{3}n^3)$ multiplications / divisions and $\mathcal{O}(\frac{1}{2} n^2)$ square roots, roughly twice the cost of Householder reflections
- Can be embedded quite efficiently in some particular algorithms such as GMRES when done using Givens rotations
- No pivoting required

The following article is a short summary (recipe) on how to perform QR factorization
using Givens rotations. I have written it since I'm in the process of writing
a short summary on how techniques like the *generalized minimal residual method* (GMRES)
for numerical solutions of linear equations work. This is an often encountered
problem in areas like numerically approximating solutions for electromagnetic
fields (when doing FEM or FDTD modeling of the Maxwell equations),
performing finite element calculations of stresses,
solving fluid dynamics equations, doing social network simulations, etc. - basically
for every problem requiring the solution of systems specified by local differential
equations. It also has applications in truss constructions, optimization in business
as well as of course quantum mechanics and computational chemistry. QR factorization
can also be used to solve the *least squares problem*. The article is part of a mini series on
techniques in linear algebra such as power iterations
which are used in similar contexts (eigenmode solvers for electromagnetic radiation,
economic fraud prevention, social network analysis, etc.). Note that it does
not aim to provide a mathematically sound and rigorous formulation but rather my view
of the concept and the picture I have of how this stuff works - my personal physicist's
view. For a mathematically rigorous description refer to your favorite math
lecture notes.

What is a QR decomposition anyways? Basically it allows one to decompose a matrix $A \in \mathfrak{R}^{m \times n}$ that might not be invertible into an orthogonal matrix $Q \in \mathfrak{R}^{m \times m}$ and an upper triangular matrix $R \in \mathfrak{R}^{m \times n}$ with $A = Q * R$.
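Before constructing the decomposition by hand it's instructive to check these defining properties numerically. The following snippet is just a sanity check using numpy's built-in QR routine (which internally uses Householder reflections, not Givens rotations); the example matrix is made up for illustration:

```python
# Sanity check of the defining properties of a QR decomposition
# using numpy's built-in routine.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.0, 1.0]])            # a 3x2 (m x n) example matrix

Q, R = np.linalg.qr(A, mode="complete")  # Q is 3x3, R is 3x2

print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal
print(np.allclose(np.tril(R, -1), 0.0))  # R is upper triangular
print(np.allclose(Q @ R, A))             # A = Q * R
```

All three checks print `True`.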

Recall that orthogonal means $Q^T Q = Q Q^T = \mathfrak{I}$ and thus $Q^{-1} = Q^T$. The upper triangular matrix is also invertible as long as the diagonal elements are nonzero - which is easy to check since its determinant is simply the product of the diagonal elements. To actually perform the QR decomposition there are many different approaches. The most popular ones are:

- Householder reflections. This is by far the most efficient one - but I won’t talk about this method in this article since I’m more interested in …
- Using Givens rotations. In contrast to Householder reflections this produces single zeros below the diagonal one at a time by performing rotations in orthogonal planes - which makes the calculation of the inverse matrix really easy.
- One can also apply Gram-Schmidt orthogonalization - but since the classical variant is numerically unstable it is usually not used in numerics.
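Whichever method produces $Q$ and $R$, solving a (square, full-rank) linear system $A \vec{x} = \vec{b}$ then reduces to the triangular system $R \vec{x} = Q^T \vec{b}$, handled by back substitution. A minimal sketch, using numpy's built-in QR for the factorization itself and a hand-written back substitution (the function name is my own):

```python
# Solving A x = b via QR: since Q^{-1} = Q^T, the system A x = b
# becomes R x = Q^T b, which back substitution solves directly.
import numpy as np

def back_substitute(R, y):
    """Solve R x = y for an upper triangular R with nonzero diagonal."""
    n = y.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-known components, divide by the pivot
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

A = np.array([[4.0, 1.0], [2.0, 3.0]])  # made-up example system
b = np.array([1.0, 2.0])

Q, R = np.linalg.qr(A)
x = back_substitute(R, Q.T @ b)
print(np.allclose(A @ x, b))  # True
```

No explicit matrix inverse appears anywhere, which is the whole point.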

So what's the idea behind Givens rotations? These are orthogonal rotations that are used to form the upper triangular matrix $R$ by eliminating every nonzero element below the diagonal - one element at a time. The product of these orthogonal transformation matrices is still orthogonal - its transpose will then be our matrix $Q$. Since $Q$ is orthogonal one has then also already determined $Q^{-1} = Q^T$, which is important for applications like least squares and thus algorithms like GMRES. There $R$ never has to be inverted explicitly either, since the remaining triangular system can be solved by back substitution.
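Jumping ahead a little, the elementary operation - rotating two rows of a matrix so that one below-diagonal entry becomes zero - can be sketched as follows (the choice of the coefficients $c$ and $s$ is derived in the following sections; the function name is my own):

```python
# Elementary Givens step: choose c and s so that rotating rows i and k
# of A zeroes out the element A[k, j].
import numpy as np

def apply_givens(A, i, k, j):
    """Rotate rows i and k of A (in place) so that A[k, j] becomes zero."""
    a, b = A[i, j], A[k, j]
    r = np.hypot(a, b)          # sqrt(a^2 + b^2), computed overflow-safely
    if r == 0.0:
        return                   # nothing to eliminate
    c, s = a / r, b / r
    # the rotation only touches rows i and k
    A[[i, k], :] = np.array([[c, s], [-s, c]]) @ A[[i, k], :]

A = np.array([[3.0, 1.0],
              [4.0, 2.0]])
apply_givens(A, 0, 1, 0)        # eliminate A[1, 0]
print(A[1, 0])                  # ~0 (up to rounding)
print(A[0, 0])                  # ~5.0 = sqrt(3^2 + 4^2)
```

Note that only two rows of the matrix change, no matter how large it is - this locality is what makes Givens rotations attractive inside algorithms like GMRES.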

To illustrate the idea one can first look at the two dimensional case ($\mathfrak{R}^2$): one tries to apply an orthogonal rotation matrix in a way that rotates a vector $\vec{x}$ into the direction of the first unit vector $\hat{e_1}$:

$$
\vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \\
\begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix} \\
\begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} r \cos(\phi) \\ r \sin(\phi) \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix}
$$

In the last step I've used the polar form of the vector $\vec{x}$, described by its length $r = \sqrt{x_1^2 + x_2^2} = \| \vec{x} \|_2$ and its polar angle $\phi = \arg(\vec{x}) \in [0 ; 2\pi)$. This leads to a set of two linear equations:

$$
c r \cos(\phi) + s r \sin(\phi) = r \\
-s r \cos(\phi) + c r \sin(\phi) = 0
$$

Solving this system of linear equations one obtains the rotation matrix:

$$
c = \cos(\phi) = \frac{x_1}{r} \\
s = \sin(\phi) = \frac{x_2}{r} \\
\frac{1}{\| \vec{x} \|_2} \begin{pmatrix} x_1 & x_2 \\ -x_2 & x_1 \end{pmatrix}
$$

For numerical stability it's sometimes interesting to express this in a slightly different way:

$$
s = \frac{x_2}{r} = \frac{x_2}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{\sqrt{\frac{x_1^2}{x_2^2} + 1}} \\
c = \frac{x_1}{r} = \underbrace{\frac{x_2}{r}}_{s} * \frac{x_1}{x_2}
$$

Note that this form assumes $x_2 \neq 0$ - for $x_2 = 0$ the element to be eliminated is already zero and no rotation is needed. To perform QR factorization the idea is just extended into $n$ dimensional space. The Givens rotation in the plane spanned by the basis vectors $\hat{e_i}$ and $\hat{e_k}$ can be expressed as:

$$
G_{ik} = \begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & c & & s & & \\
& & & \ddots & & & \\
& & -s & & c & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix} \in \mathfrak{R}^{n \times n}
$$

This is simply an identity matrix in which the entries $(i,i)$ and $(k,k)$ have been replaced by $c$, the entry $(i,k)$ by $s$ and the entry $(k,i)$ by $-s$. Applying a Givens rotation to an arbitrary vector $\vec{x} \in \mathfrak{R}^{n}$ gives:

$$
G_{ik} * \vec{x} = G_{ik} * \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ c x_i + s x_k \\ \vdots \\ -s x_i + c x_k \\ \vdots \\ x_n \end{pmatrix} \\
G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \vdots \\ \frac{1}{r} (x_i^2 + x_k^2) \\ \vdots \\ \frac{1}{r} \underbrace{(-x_i x_k + x_i x_k)}_{0} \\ \vdots \\ x_n \end{pmatrix} \\
G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \vdots \\ r \\ \vdots \\ 0 \\ \vdots \\ x_n \end{pmatrix}
$$

Here $c = \frac{x_i}{r}$ and $s = \frac{x_k}{r}$ with $r = \sqrt{x_i^2 + x_k^2}$ have been chosen exactly as in the two dimensional case. As one sees this produces a zero in the $k$-th row of the vector. To perform the full QR decomposition one now applies one Givens rotation after another to set all elements below the diagonal to zero:

$$
G_{i_N,k_N} * \ldots * G_{i_1,k_1} * A = R
$$

The resulting matrix $R$ is already the second part of our QR decomposition. The inverse of the rotations (which for an orthogonal rotation is simply the transpose) yields the orthogonal $Q$ part:

$$
A = \underbrace{G_{i_1,k_1}^T * \ldots * G_{i_N,k_N}^T}_{Q} * R = Q * R
$$
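Putting the pieces together, the whole factorization can be sketched in a few lines of Python. This is a compact illustration, not a performance-tuned implementation (a real one would store only the $c, s$ pairs instead of forming $2 \times 2$ matrices); the function name and example matrix are my own:

```python
# Full QR decomposition via Givens rotations: R is built by applying
# one rotation per below-diagonal element, Q by accumulating the
# transposed rotations (Q = G_1^T * ... * G_N^T).
import numpy as np

def givens_qr(A):
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for j in range(n):                    # column by column
        for k in range(m - 1, j, -1):     # zero out entries below the diagonal
            a, b = R[j, j], R[k, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue                  # already zero, nothing to rotate
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])
            R[[j, k], :] = G @ R[[j, k], :]    # left-multiply by G_{jk}
            Q[:, [j, k]] = Q[:, [j, k]] @ G.T  # right-multiply Q by G_{jk}^T
    return Q, R

A = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.0, 1.0]])
Q, R = givens_qr(A)
print(np.allclose(Q @ R, A))             # True
print(np.allclose(Q.T @ Q, np.eye(3)))   # True
print(np.allclose(np.tril(R, -1), 0.0))  # True
```

Note that each rotation only reads and writes two rows of $R$ and two columns of $Q$, which is exactly the locality that makes this approach attractive for incrementally updated factorizations such as the one inside GMRES.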

Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)

This webpage is also available via TOR at http://jugujbrirx3irwyx.onion/