08 Dec 2021 - tsp
Last update 20 Mar 2022
7 mins
TL;DR:
No pivoting required
The following article is a short summary (recipe) on how to perform QR factorization using Givens rotations. This has been written since I’m in process of writing a short summary on how techniques like the generalized minimal residual method (GMRES) for for numerical solutions of linear equations work since this is an often encountered problem in stuff like numerically approximating solutions for electromagnetic fields (when FEM or FTDM modeling of Maxwell equations), performing calculations solutions such as finite element calculations of stresses, for solving fluid dynamics equations, doing social network simulations, etc. - basically for every problem requiring the solution of systems specified by local differential equations. It also has applications in truss constructions, optimizations in business as well as of course quantum mechanics and computational chemistry. QR factorization can be used to solve the least squares problem. The article is part of a mini series on techniques in linear algebra such as power iterations which are used in similar context (Eigenmode solvers for electromagnetic radiation, economic fraud prevention, social network analysis, etc.). Note that it should not provide a mathematical found and solid formulation but rather provide my view of the concept and the picture I have how this stuff works - my personal physicists view. For a mathematical found and rigid description refer to your favorite math lecture notes.
What is a QR decomposition anyways? Basically it allows one to decompose a matrix $A \in \mathfrak{R}^{m \times n}$ that might not be invertible into a orthogonal matrix $Q \in \mathfrak{R}^{m \times m}$ and a upper tridiagonal matrix $R \in \mathfrak{R}^{m \times n}$ with $A = Q * R$.
Recall that orthogonal means $Q^T Q = Q Q^T = \mathfrak{I}$ and thus $Q^{-1} = Q^T$. The upper triangular matrix is also invertible as long as the diagonal elements are not zero - there are some constructions where it’s obvious. To actually perform the QR decomposition there are many different approaches. The most popular ones are:
So what’s the idea of Givens rotations. These are orthogonal rotations that are used to form the upper triangular matrix $R$ by eliminating every non zero element left below the diagonal - one element at a time. The combination of the transformation matrices forming orthogonal rotations is still orthogonal - so this will then be our matrix $Q$. Since $Q$ is orthogonal one then also has determined $Q^{-1} = Q^T$ which is important for stuff like least squares and thus algorithms like GMRES. These also avoid the inversion of $R$.
To illustrate the idea one can look at the two dimensional case ($\mathfrak{R}^2$) at first when trying to apply an orthogonal rotation matrix in a way to rotate a vector $\vec{x}$ into direction of the first unit vector $\hat{e_1}$:
[ \vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \\ \begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix} \\ \begin{pmatrix} c & s \\ -s & c \end{pmatrix} * \begin{pmatrix} r * cos(\phi) \\ r * sin(\phi) \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix} ]In the last step I’ve used the polar form of the vector $\vec{x}$ described by it’s length $r = \sqrt{x_1^2 + x_2^2} = | \vec{x} |_2$ and it’s polar angle $\phi = arg(\vec{x}) \in [0 ; 2\pi)$. This leads to a set of two linear equations:
[ c * r * cos(\phi) + s * r * sin(\phi) = r \\ -s * r * sin(\phi) + c * r * cos(\phi) = 0 ]Solving the system of linear equations one simply yields the rotation matrix:
[ c = cos(\phi) = \frac{x_1}{r} \\ s = sin(\phi) = \frac{x_2}{r} \\ \frac{1}{\| \vec{x} \|_2} \begin{pmatrix} x_1 & x_2 \\ -x_2 & x_1 \end{pmatrix} ]For numerical stability it’s sometimes interesting to express this in a slightly different way:
[ s = \frac{x_2}{r} = \frac{x_2}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{\sqrt{\frac{x_1^2}{x_2^2} + 1}} \\ c = \frac{x_1}{r} = \underbrace{\frac{x_2}{r}}_{s} * \frac{x_1}{x_2} ]To perform QR factorization the idea is just extended into $n$ dimensional space. The givens rotations in the plane spawned by basis vectors $\hat{e_i}$ and $\hat{e_j}$ can be expressed as:
[ G_{ik} = \begin{pmatrix} 1 & 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & 1 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & c & \ldots & \ldots & \ldots & s & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & 1 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & 1 & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & -s & \ldots & \ldots & \ldots & c & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 1 & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 0 \\ 0 & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & 1 \end{pmatrix} \\ G_{ik} \in \mathfrak{R}^{n \times n} ]Applying a Givens rotation to an arbitrary vector $\vec{x} \in \mathfrak{R}^{n}$ gives:
[ G_{ik} * \vec{x} = G_{ik} * \begin{pmatrix} x_1 \\ \ldots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \ldots \\ c x_i + s x_k \\ \ldots \\ -s x_i + c x_k \\ \ldots \\ x_n \end{pmatrix} \\ G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \ldots \\ \frac{1}{r} (x_i^2 + x_k^2) \\ \ldots \\ \frac{1}{r} \underbrace{(-x_i x_k + x_i x_k)}_{0} \\ \ldots \\ x_n \end{pmatrix} \\ G_{ik} * \vec{x} = \begin{pmatrix} x_1 \\ \ldots \\ r \\ \ldots \\ 0 \\ \ldots \\ x_n \end{pmatrix} ]As one sees this produces a zero in the $k$ row of the vector. To perform the full QR decomposition one now applies one Givens rotation after each other to set all elements below the diagonal to zero:
[ G_{i_N,k_N} * \ldots * G_{i_1,k_1} * A = R ]The resulting matrix $R$ is already the second part of our QR decomposition. The inverse of the rotations (which is simply formed by transposition for a orthogonal rotation) yields the orthogonal $Q$ part:
[ A = \underbrace{G_{i_1,k_1}^T * \ldots * G_{i_N,k_N}^T}_{Q} * R = Q * R ]The following GitHub GIST contains a simple implementation of the described decomposition algorithm. Note that the used matrix class is of course not performant and should not be used for realworld applications - especially not in Python. This implementation only serves demonstration purposes. This notebook is also available as a PDF
This article is tagged: Math, Programming, Physics, How stuff works, Machine learning, Computational linear algebra
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/