08 Dec 2021 - tsp
Last update 20 Mar 2022
7 mins
TL;DR:
No pivoting required
The following article is a short summary (recipe) on how to perform QR factorization using Givens rotations. It has been written since I'm in the process of writing a short summary on how techniques like the generalized minimal residual method (GMRES) for numerical solutions of linear equations work. This is an often encountered problem in tasks like numerically approximating electromagnetic fields (when performing FEM or FDTD modeling of the Maxwell equations), performing finite element calculations of stresses, solving fluid dynamics equations, doing social network simulations, etc. - basically every problem requiring the solution of systems specified by local differential equations. It also has applications in truss constructions, optimization problems in business as well as of course quantum mechanics and computational chemistry. QR factorization can also be used to solve the least squares problem. The article is part of a mini series on techniques in linear algebra such as power iteration which are used in similar contexts (eigenmode solvers for electromagnetic radiation, economic fraud prevention, social network analysis, etc.). Note that it does not aim to provide a mathematically sound and rigorous formulation but rather my view of the concept and the picture I have of how this stuff works - my personal physicist's view. For a mathematically rigorous description refer to your favorite math lecture notes.
What is a QR decomposition anyways? Basically it allows one to decompose a matrix $A \in \mathbb{R}^{m \times n}$ that might not be invertible into an orthogonal matrix $Q \in \mathbb{R}^{m \times m}$ and an upper triangular matrix $R \in \mathbb{R}^{m \times n}$ with $A = Q \cdot R$.
Recall that orthogonal means $Q^T Q = Q Q^T = I$ and thus $Q^{-1} = Q^T$. The upper triangular matrix $R$ is also invertible as long as its diagonal elements are non zero - which one can see directly since the determinant of a triangular matrix is just the product of its diagonal elements. To actually perform the QR decomposition there are many different approaches. The most popular ones are Gram-Schmidt orthogonalization, Householder reflections and Givens rotations.
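As a quick sanity check of these defining properties one can - just for illustration, the rest of this article constructs the decomposition by hand - use NumPy's built-in `numpy.linalg.qr` on a small made-up matrix:

```python
import numpy as np

# Decompose a small non-square matrix and verify the defining properties
A = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(A, mode="complete")   # Q is 3x3 orthogonal, R is 3x2 upper triangular

print(np.allclose(Q.T @ Q, np.eye(3)))    # True: Q is orthogonal
print(np.allclose(Q @ R, A))              # True: A = Q * R
print(np.allclose(np.tril(R, -1), 0))     # True: R has zeros below the diagonal
```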
So what's the idea of Givens rotations? These are orthogonal rotations that are used to form the upper triangular matrix $R$ by eliminating every non zero element below the diagonal - one element at a time. The combination of the transformation matrices forming orthogonal rotations is still orthogonal - so this product will then be our matrix $Q$. Since $Q$ is orthogonal one has also determined $Q^{-1} = Q^T$ for free, which is important for stuff like least squares and thus algorithms like GMRES. This also avoids an explicit inversion of $R$ since the remaining triangular system can be solved by simple back substitution.
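To make the least squares remark a bit more concrete, here is a minimal sketch (the helper name `solve_least_squares_qr` and the sample data are made up; `numpy.linalg.qr` stands in for the hand-rolled Givens construction described below). It applies $Q^T$ to the right hand side and then back substitutes through $R$ instead of inverting it:

```python
import numpy as np

def solve_least_squares_qr(A, b):
    # A = Q * R with Q orthogonal => minimizing |A x - b| reduces to
    # solving the triangular system R x = Q^T b by back substitution.
    Q, R = np.linalg.qr(A)          # reduced QR: Q is m x n, R is n x n
    y = Q.T @ b
    n = R.shape[1]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):  # back substitution, no explicit inverse of R
        x[i] = (y[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])
print(solve_least_squares_qr(A, b))        # should match np.linalg.lstsq
print(np.linalg.lstsq(A, b, rcond=None)[0])
```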
To illustrate the idea one can look at the two dimensional case ($\mathbb{R}^2$) first, trying to apply an orthogonal rotation matrix in a way that rotates a vector $\vec{x}$ into the direction of the first unit vector $\hat{e}_1$:
$$\vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$
$$\begin{pmatrix} c & s \\ -s & c \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix}$$
$$\begin{pmatrix} c & s \\ -s & c \end{pmatrix} \cdot \begin{pmatrix} r \cos(\phi) \\ r \sin(\phi) \end{pmatrix} = \begin{pmatrix} r \\ 0 \end{pmatrix}$$

In the last step I've used the polar form of the vector $\vec{x}$ described by its length $r = \sqrt{x_1^2 + x_2^2} = \|\vec{x}\|_2$ and its polar angle $\phi = \arg(\vec{x}) \in [0; 2\pi)$. This leads to a set of two linear equations:
$$c \cdot r \cos(\phi) + s \cdot r \sin(\phi) = r$$
$$-s \cdot r \cos(\phi) + c \cdot r \sin(\phi) = 0$$

Solving this system of linear equations simply yields the rotation matrix:
$$c = \cos(\phi) = \frac{x_1}{r} \quad\quad s = \sin(\phi) = \frac{x_2}{r}$$
$$\frac{1}{\|\vec{x}\|_2} \begin{pmatrix} x_1 & x_2 \\ -x_2 & x_1 \end{pmatrix}$$

For numerical stability it's sometimes interesting to express this in a slightly different way:
$$s = \frac{x_2}{r} = \frac{x_2}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{\sqrt{\frac{x_1^2}{x_2^2} + 1}}$$
$$c = \frac{x_1}{r} = \frac{x_2}{r} \cdot \frac{x_1}{x_2} = s \cdot \frac{x_1}{x_2}$$
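As a small aside, a Python sketch of this 2D building block could look like the following (the function name `givens_cs` is made up; it uses `math.hypot` for a stable computation of $r$ rather than the reformulated expression above):

```python
import math

def givens_cs(x1, x2):
    """Return (c, s, r) such that [[c, s], [-s, c]] @ [x1, x2] = [r, 0]."""
    if x2 == 0.0:
        return 1.0, 0.0, x1           # nothing to rotate
    r = math.hypot(x1, x2)            # stable sqrt(x1^2 + x2^2)
    return x1 / r, x2 / r, r

c, s, r = givens_cs(3.0, 4.0)
print(c, s, r)                                   # 0.6 0.8 5.0
print(c * 3.0 + s * 4.0, -s * 3.0 + c * 4.0)     # approximately (5.0, 0.0)
```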
To perform QR factorization this idea is simply extended into $n$ dimensional space. The Givens rotation in the plane spanned by the basis vectors $\hat{e}_i$ and $\hat{e}_k$ can be expressed as:

$$G_{ik} = \begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots & & \vdots \\
0 & \cdots & c & \cdots & s & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & -s & \cdots & c & \cdots & 0 \\
\vdots & & \vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{pmatrix} \in \mathbb{R}^{n \times n}$$

It is just the identity matrix except for the four entries $G_{ii} = G_{kk} = c$, $G_{ik} = s$ and $G_{ki} = -s$. Applying a Givens rotation to an arbitrary vector $\vec{x} \in \mathbb{R}^n$ gives:
$$G_{ik} \cdot \vec{x} = G_{ik} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_i \\ \vdots \\ x_k \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ c x_i + s x_k \\ \vdots \\ -s x_i + c x_k \\ \vdots \\ x_n \end{pmatrix}$$

Inserting $c = \frac{x_i}{r}$ and $s = \frac{x_k}{r}$ with $r = \sqrt{x_i^2 + x_k^2}$ as above:

$$G_{ik} \cdot \vec{x} = \begin{pmatrix} x_1 \\ \vdots \\ \frac{1}{r} (x_i^2 + x_k^2) \\ \vdots \\ \frac{1}{r} (-x_i x_k + x_i x_k) \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ r \\ \vdots \\ 0 \\ \vdots \\ x_n \end{pmatrix}$$

As one sees this produces a zero in the $k$-th row of the vector while the $i$-th row receives the combined length $r$.
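To see this in action, the following sketch (again just illustrative - real implementations would not build the full matrix but only update the two affected rows) constructs such a $G_{ik}$ explicitly and applies it to a sample vector:

```python
import numpy as np

def givens_matrix(n, i, k, c, s):
    # Identity with the (i,i), (i,k), (k,i), (k,k) entries replaced as in G_ik above
    G = np.eye(n)
    G[i, i] = c
    G[i, k] = s
    G[k, i] = -s
    G[k, k] = c
    return G

x = np.array([1.0, 2.0, 3.0, 4.0])
i, k = 1, 3                           # eliminate x[3] against x[1]
r = np.hypot(x[i], x[k])
G = givens_matrix(4, i, k, x[i] / r, x[k] / r)
print(G @ x)                          # component k becomes 0, component i becomes r
```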
To perform the full QR decomposition one now applies one Givens rotation after the other to set all elements below the diagonal to zero:

$$G_{i_N, k_N} \cdot \ldots \cdot G_{i_1, k_1} \cdot A = R$$

The resulting matrix $R$ is already the second part of our QR decomposition. The inverse of the rotations (which is simply formed by transposition for an orthogonal rotation) yields the orthogonal $Q$ part:
$$A = \underbrace{G_{i_1, k_1}^T \cdot \ldots \cdot G_{i_N, k_N}^T}_{Q} \cdot R = Q \cdot R$$

The following GitHub GIST contains a simple implementation of the described decomposition algorithm. Note that the used matrix class is of course not performant and should not be used for real world applications - especially not in Python. This implementation only serves demonstration purposes. This notebook is also available as a PDF.
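For readers who prefer a compact NumPy version, the following is an illustrative reimplementation of the whole procedure (not the code from the gist):

```python
import numpy as np

def qr_givens(A):
    """QR decomposition via Givens rotations; returns (Q, R) with A = Q @ R."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    R = A.copy()
    Q = np.eye(m)
    for j in range(n):                      # walk through the columns ...
        for i in range(m - 1, j, -1):       # ... and eliminate the entries below the diagonal
            if R[i, j] == 0.0:
                continue
            r = np.hypot(R[j, j], R[i, j])
            c, s = R[j, j] / r, R[i, j] / r
            G = np.eye(m)
            G[j, j] = c; G[j, i] = s
            G[i, j] = -s; G[i, i] = c
            R = G @ R                       # accumulate R = G_N ... G_1 A
            Q = Q @ G.T                     # accumulate Q = G_1^T ... G_N^T
    return Q, R

A = np.random.rand(5, 3)
Q, R = qr_givens(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(5)), np.allclose(np.tril(R, -1), 0))
```

The nested loops eliminate the entries below the diagonal one Givens rotation at a time; $Q$ is accumulated from the transposed rotations exactly as in the formula above.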
This article is tagged: Math, Programming, Physics, How stuff works, Machine learning, Computational linear algebra
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/