Projections and Orthogonality

Dot Products

Dot products are rather simple. Let's say you have a vector v with elements v_1, v_2, v_3, \dots, v_n, and a vector u with elements u_1, u_2, u_3, \dots, u_n. Then the dot product is as follows.

u \cdot v = u_1v_1 + u_2v_2 + \dots + u_nv_n

Here are some properties of dot products.

  • u \cdot v = v \cdot u

  • c(u \cdot v) = (cu) \cdot v = u \cdot (cv)

  • (u+v) \cdot p = u \cdot p + v \cdot p

It's a somewhat uncommon way of writing it, but you can also express the length (or norm) of a vector using dot products.

\|u\| = \sqrt{u \cdot u}
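Here's a quick numerical sanity check of the above. This is just a sketch using NumPy (my own addition, with made-up vectors), showing the dot product computed straight from the definition, the commutativity property, and the norm formula.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 2.0])

# Dot product straight from the definition: u1*v1 + u2*v2 + ... + un*vn
dot_by_hand = sum(ui * vi for ui, vi in zip(u, v))
print(dot_by_hand, np.dot(u, v))                   # 8.0 both ways

# Commutativity: u . v = v . u
print(np.isclose(np.dot(u, v), np.dot(v, u)))      # True

# Length (norm) via the dot product: sqrt(u . u)
print(np.isclose(np.sqrt(np.dot(u, u)), np.linalg.norm(u)))  # True
```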

Orthogonal Vectors

Two vectors are orthogonal if their dot product equals zero.

The above concept is an important one to understand. The fact that a zero dot product means the vectors are perpendicular can be proven with the law of cosines, which I can't be bothered to format in LaTeX, so look it up if you really want the proof.
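As a tiny check of the definition (again just a NumPy sketch with hand-picked numbers):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([-2.0, 1.0])   # picked so that u . v = -2 + 2 = 0

print(np.dot(u, v) == 0)    # True, so u and v are orthogonal
```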

Let's talk about vector spaces W and W^{\perp}.

We're gonna assume that W is a subspace of R^n.

  • W^{\perp} is the vector space made up of every vector v that is orthogonal to every vector in W

  • W^{\perp} is a subspace of R^n.

W^{\perp} is referred to as the orthogonal complement of W.

There are a few important orthogonal complements to take note of, based on the matrix subspaces (row space, column space, and null space) that we have learned about.

  • (Row A)^{\perp} = Nul A

  • (Col A)^{\perp} = Nul A^{\intercal}

In words, the first statement means that every vector in Row A is orthogonal to every vector in Nul A. The second statement says that every vector in Col A is orthogonal to every vector in Nul A^{\intercal}.

The proof for this is simply that the definition of the null space of A is that Ax = 0, meaning that the dot product of each row of A with x is 0, fitting the definition of orthogonal vectors.
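Here's a small NumPy sketch of that argument (the matrix and vector are made up): take a vector x in Nul A and check that every row of A, and therefore everything in Row A, has a zero dot product with it.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

x = np.array([3.0, 0.0, -1.0])       # A @ x = 0, so x is in Nul A
print(np.allclose(A @ x, 0))         # True

# Each row of A (the rows span Row A) is orthogonal to x
for row in A:
    print(np.isclose(np.dot(row, x), 0))   # True, True
```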

Orthogonal Sets

An orthogonal set is a set of vectors in which each vector in the set is orthogonal to every other vector in the set. Note that if all of the vectors in the set are nonzero, then the set is linearly independent.

An orthogonal basis for a subspace W in R^n is exactly what it sounds like it is: a basis for W that is also an orthogonal set.

Here is a theorem about orthogonal sets. Assuming \{u_1, u_2, \dots, u_p\} is an orthogonal basis for W, each vector y in W can be written as the linear combination y = c_1u_1 + c_2u_2 + \dots + c_pu_p, where the weights are c_j = \frac{y\cdot u_j}{u_j\cdot u_j}.
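A short sketch of that theorem with a hand-picked orthogonal basis (NumPy again, not from the original notes): compute each weight as c_j = (y . u_j)/(u_j . u_j) and confirm the linear combination rebuilds y.

```python
import numpy as np

# An orthogonal basis for W = R^2 (check: u1 . u2 = 0)
u1 = np.array([3.0, 1.0])
u2 = np.array([-1.0, 3.0])
y  = np.array([6.0, 2.0])

c1 = np.dot(y, u1) / np.dot(u1, u1)   # weights from the theorem
c2 = np.dot(y, u2) / np.dot(u2, u2)

print(np.allclose(c1 * u1 + c2 * u2, y))   # True: the weights rebuild y
```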

Decomposition and Projection

Let's say we had a vector y in R^n. We can decompose that vector, or break it up into parts, into two vectors, where one is a multiple of some nonzero vector u and the other is orthogonal to it. In other words, observe the diagram.

Here is the notation for that bottom vector: it is written as \hat{y} = \beta u. In accordance with our original goal of decomposing the vector y into the sum of two vectors, we find that y = \hat{y} + z. However, this form is useless insofar as we don't know the constant \beta. Let's look at a more useful form below that doesn't involve calculating the constant at all.

\hat{y} = \frac{y\cdot u}{u\cdot u}u
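A minimal sketch of that formula (made-up vectors): project y onto u and confirm that the leftover z = y - ŷ really is orthogonal to u.

```python
import numpy as np

y = np.array([7.0, 6.0])
u = np.array([4.0, 2.0])

y_hat = (np.dot(y, u) / np.dot(u, u)) * u   # projection of y onto u
z = y - y_hat                               # the part orthogonal to u

print(y_hat)                          # [8. 4.]
print(np.isclose(np.dot(z, u), 0))    # True: z is orthogonal to u
```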

The Orthogonal Decomposition Theorem

Let W be a subspace of R^n. Then each y in R^n can be expressed uniquely in the following form.

y = \hat{y} + z

where \hat{y} is in W and z is in W^{\perp}. Keeping in mind that \{u_1, u_2, \dots, u_p\} is an orthogonal basis for W, the following holds true.

\hat{y} = \frac{y\cdot u_1}{u_1\cdot u_1}u_1 + \frac{y\cdot u_2}{u_2\cdot u_2}u_2 + \dots + \frac{y\cdot u_p}{u_p\cdot u_p}u_p

And of course, z = y - \hat{y}.
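Here's a sketch of the full decomposition (hand-picked numbers): W is the plane in R^3 spanned by an orthogonal basis {u1, u2}, ŷ is the sum of the projections onto each basis vector, and z = y - ŷ lands in W^⊥.

```python
import numpy as np

# Orthogonal basis for a plane W in R^3 (u1 . u2 = 0)
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])
y  = np.array([1.0, 2.0, 3.0])

y_hat = (np.dot(y, u1) / np.dot(u1, u1)) * u1 \
      + (np.dot(y, u2) / np.dot(u2, u2)) * u2
z = y - y_hat

# z is orthogonal to both basis vectors, hence to everything in W
print(np.isclose(np.dot(z, u1), 0), np.isclose(np.dot(z, u2), 0))  # True True
print(np.allclose(y, y_hat + z))                                   # True
```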

The Best Approximation Theorem

Given that W is a subspace of R^n, y is any vector in R^n, and \hat{y} is the orthogonal projection of y onto W, \hat{y} is the closest point in W to y. In other words, \|y - \hat{y}\| \le \|y - v\| for every v in W.
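A quick numerical illustration (a sketch reusing the vectors from the previous block): the projection ŷ is at least as close to y as a handful of randomly chosen points of W.

```python
import numpy as np

rng = np.random.default_rng(0)

u1 = np.array([2.0, 5.0, -1.0])      # orthogonal basis for W
u2 = np.array([-2.0, 1.0, 1.0])
y  = np.array([1.0, 2.0, 3.0])

y_hat = (np.dot(y, u1) / np.dot(u1, u1)) * u1 \
      + (np.dot(y, u2) / np.dot(u2, u2)) * u2

best = np.linalg.norm(y - y_hat)
for _ in range(5):
    a, b = rng.normal(size=2)
    w = a * u1 + b * u2                        # an arbitrary point of W
    print(best <= np.linalg.norm(y - w))       # True every time
```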

Properties of Orthonormal Matrices

An orthonormal matrix is a matrix whose columns form an orthogonal basis for a subspace of R^n and are all unit vectors (meaning that each vector's magnitude is one). Let's look at some theorems about orthonormal bases, given that U is an orthonormal matrix.

U^{\intercal}U = I
UU^{\intercal}y = proj_W\, y
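A sketch of both properties (the matrix U below is made up; its columns are the earlier orthogonal vectors, normalized): U^T U comes out to the identity, and U U^T y matches the projection of y onto W = Col U computed with the earlier formula.

```python
import numpy as np

u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])
U = np.column_stack([u1 / np.linalg.norm(u1), u2 / np.linalg.norm(u2)])

y = np.array([1.0, 2.0, 3.0])

print(np.allclose(U.T @ U, np.eye(2)))   # True: U^T U = I

proj = (np.dot(y, u1) / np.dot(u1, u1)) * u1 \
     + (np.dot(y, u2) / np.dot(u2, u2)) * u2
print(np.allclose(U @ U.T @ y, proj))    # True: U U^T y = proj_W y
```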

The Gram-Schmidt Process

The Gram-Schmidt process is a painful method of finding an orthogonal (or, after normalizing, orthonormal) basis for a subspace. Given any basis \{x_1, x_2, \dots, x_p\} for a nonzero subspace W of R^n, the vectors below form an orthogonal basis for W.

v_1 = x_1
v_2 = x_2 - (\frac{x_2\cdot v_1}{v_1\cdot v_1}v_1)
v_3 = x_3 - (\frac{x_3\cdot v_1}{v_1\cdot v_1}v_1 + \frac{x_3\cdot v_2}{v_2\cdot v_2}v_2)
\vdots
v_p = x_p - (\frac{x_p\cdot v_1}{v_1\cdot v_1}v_1 + \frac{x_p\cdot v_2}{v_2\cdot v_2}v_2 + \dots + \frac{x_p\cdot v_{p-1}}{v_{p-1}\cdot v_{p-1}}v_{p-1})
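Here's a minimal implementation sketch (my own helper, not anything official): for each x_k, subtract its projections onto the v's built so far.

```python
import numpy as np

def gram_schmidt(X):
    """Columns of X are a basis; returns a matrix whose columns are an orthogonal basis."""
    V = []
    for x in X.T:
        v = x.astype(float)
        for w in V:
            v = v - (np.dot(x, w) / np.dot(w, w)) * w   # remove the component along w
        V.append(v)
    return np.column_stack(V)

X = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
V = gram_schmidt(X)

# Off-diagonal entries of V^T V are 0, so the columns of V are orthogonal
print(np.allclose(V.T @ V, np.diag(np.diag(V.T @ V))))   # True
```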

QR Decomposition

A matrix A (with linearly independent columns) can be decomposed into the product of an orthonormal matrix Q and an upper triangular matrix R. The columns of Q can be found by running the Gram-Schmidt process on the columns of A and then turning the results into unit vectors.

A = QR

Well, we know Q is orthonormal, so we can rewrite this. Remember a property mentioned earlier: U^{\intercal}U = I where U is orthonormal. Therefore, we can rewrite the above equation like so.

Q^{\intercal}A = Q^{\intercal}QR

Which simplifies to

Q^{\intercal}A = IR \Longrightarrow Q^{\intercal}A = R
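Putting it together in one sketch (again my own code): build Q by running Gram-Schmidt on the columns of A and normalizing, recover R as Q^T A, and check that QR reproduces A. np.linalg.qr would give the same factorization up to the signs of the columns.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

# Gram-Schmidt on the columns of A, normalizing as we go
Q_cols = []
for x in A.T:
    v = x.astype(float)
    for q in Q_cols:
        v = v - np.dot(x, q) * q          # q is already a unit vector
    Q_cols.append(v / np.linalg.norm(v))
Q = np.column_stack(Q_cols)

R = Q.T @ A                               # R = Q^T A
print(np.allclose(Q @ R, A))              # True: A = QR
print(np.allclose(np.tril(R, -1), 0))     # True: R is upper triangular
```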
