SVD/PCA

SVD

Eigenvectors and eigenvalues are incredibly useful, but the problem is that non-square matrices do not have eigenvalues or eigenvectors. The solution to this is the singular value decomposition (SVD).

Given a matrix $\textbf{M}$, we want to find $\textbf{M}\textbf{V} = \textbf{U}\Sigma$, where $\textbf{V}$ ($n \times n$) and $\textbf{U}$ ($m \times m$) are orthonormal and $\Sigma$ is sort of diagonal, in that the nonzero entries are along a diagonal but the matrix is not necessarily square. SVD is then written as

$$\textbf{M} = \textbf{U}\Sigma\textbf{V}^{\intercal}$$
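As a quick sanity check of those shapes (not from the original notes, just an arbitrary example matrix), here is how this looks with numpy, whose `np.linalg.svd` returns $\textbf{V}^\intercal$ directly:

```python
import numpy as np

# Arbitrary non-square example matrix (3 x 2), just for illustration.
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
m, n = M.shape

# Full SVD: U is m x m, Vt is n x n, s holds the singular values.
U, s, Vt = np.linalg.svd(M, full_matrices=True)

# Rebuild the "sort of diagonal" m x n Sigma from the singular values.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(M, U @ Sigma @ Vt))     # True: M = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(m)))    # True: U is orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(n)))  # True: V is orthonormal
```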

Assuming that we have $p$ nonzero eigenvalues and $q$ eigenvalues that are just zero, we can rewrite the LHS of $\textbf{M}\textbf{V} = \textbf{U}\Sigma$ as follows.

$$\textbf{M} \begin{bmatrix} \vec{v}_1 \dots \vec{v}_p \mid \vec{v}_{p+1} \dots \vec{v}_{p+q} \end{bmatrix}$$

The right block of the matrix that we just split into two is a basis for the null space of $\textbf{M}$.

Now, we need to cheese our equation enough to the point where it can sort of become a scuffed symmetric matrix. We will use the following properties to do so.

$$\begin{gathered} \text{Null}(\textbf{M}^\intercal\textbf{M}) = \text{Null}(\textbf{M}) \\ (\textbf{M}^\intercal\textbf{M})^\intercal = \textbf{M}^\intercal\textbf{M} \\ \text{Eigenvalues of } \textbf{M}^\intercal\textbf{M} \text{ are real and non-negative.} \end{gathered}$$
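A minimal numpy sketch checking these three properties on an arbitrary rank-deficient matrix (my own example, not anything from the notes):

```python
import numpy as np

# Rank-1 example: the second column is twice the first,
# so M has a nontrivial null space (spanned by [2, -1]).
M = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
S = M.T @ M

# (M^T M)^T = M^T M: the product is symmetric.
print(np.allclose(S, S.T))                           # True

# Eigenvalues of M^T M are real and non-negative.
eigvals = np.linalg.eigvalsh(S)                      # eigvalsh: symmetric matrices
print(np.all(eigvals >= -1e-10))                     # True (up to round-off)

# Null(M^T M) = Null(M): a vector killed by M is killed by S, and vice versa.
x = np.array([2.0, -1.0])
print(np.allclose(M @ x, 0), np.allclose(S @ x, 0))  # True True
```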


Because $\textbf{M}^\intercal\textbf{M}$ is square, we choose the columns of our $\textbf{V}$ matrix to be its orthonormalized eigenvectors, corresponding to non-zero and zero eigenvalues as mentioned earlier.

Along with these cheese strats, we claim that the columns of $\textbf{V}_{col}$, where $\textbf{V}_{col}$ is the left block of the matrix we split into two earlier, are an orthonormal basis for the column space of $\textbf{M}^\intercal\textbf{M}$ (equivalently, the row space of $\textbf{M}$). This makes our $\textbf{V}$ matrix useful for calculations, since it splits into a portion that's a basis for that column space (the useful part) and a portion that's a basis for the null space (useless).

As for the $\textbf{U}$ matrix, we have

$$\vec{u}_i = \frac{\textbf{M}\vec{v}_i}{\sqrt{\lambda_i}}$$

for each $\vec{v}_i$ in $\textbf{V}_{col}$.

So far, we have $\textbf{M}\textbf{V} = \textbf{U}\Sigma$, where the LHS can be split into two sections, one part for the column space, $\textbf{M}\textbf{V}_{col}$, and another for the null space, $\textbf{M}\textbf{V}_{null}$, while the RHS is split into the diagonal portion $\textbf{U}\Sigma_\alpha$ and the rest of the matrix, which is just zeroes. The two sections of each side correspond to each other.

Now, we just need to decide the values of $\Sigma$. In order for the equation to hold, we choose $\sigma_i = \sqrt{\lambda_i}$ (where $\lambda_i$ is an eigenvalue of $\textbf{M}^\intercal\textbf{M}$).
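To make the block correspondence concrete, here is a hedged numpy sketch (the example matrix and names like `V_col`, `U_col`, and `Sigma_alpha` are mine) verifying that $\textbf{M}\textbf{V}_{col} = \textbf{U}\Sigma_\alpha$ while $\textbf{M}\textbf{V}_{null} = 0$:

```python
import numpy as np

# Rank-deficient example (rank 2 out of 3 columns): third column = col1 + col2.
M = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

# Orthonormal eigenvectors of M^T M, sorted by descending eigenvalue.
lam, V = np.linalg.eigh(M.T @ M)        # eigh returns ascending order
lam, V = lam[::-1], V[:, ::-1]

nonzero = lam > 1e-10                   # split into V_col (lambda > 0) and V_null
V_col, V_null = V[:, nonzero], V[:, ~nonzero]

# u_i = M v_i / sqrt(lambda_i) for the nonzero eigenvalues.
U_col = (M @ V_col) / np.sqrt(lam[nonzero])
Sigma_alpha = np.diag(np.sqrt(lam[nonzero]))

print(np.allclose(M @ V_col, U_col @ Sigma_alpha))  # True: column-space block
print(np.allclose(M @ V_null, 0))                   # True: null-space block
```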

Procedure of determining SVD

Here is a summary of the above for finding the SVD.

1. Compute $\textbf{S} = \textbf{M}^\intercal\textbf{M}$.
2. Find the orthonormalized eigenvectors of $\textbf{S}$ to be the columns of $\textbf{V}$.
3. Compute $\textbf{U}$, whose columns are $\vec{u}_i = \frac{\textbf{M}\vec{v}_i}{\sqrt{\lambda_i}}$ for $\lambda_i \neq 0$. (If there aren't enough for it to be square, fill in the rest with Gram-Schmidt.)
4. Compute $\Sigma$, where the entries on the diagonal are $\sigma_i = \sqrt{\lambda_i}$.

The result is $\textbf{M} = \textbf{U}\Sigma\textbf{V}^{\intercal}$.

If the matrix is not full rank, your diagonal just gets cut off in your $\Sigma$ matrix and there will be some zeroes along the diagonal.
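Here is a minimal numpy sketch of the whole procedure, using a hypothetical helper name `svd_from_eigs` (not a library routine); it follows the four steps above and just checks that the factors multiply back to $\textbf{M}$:

```python
import numpy as np

def svd_from_eigs(M, tol=1e-10):
    """Build an SVD of M from the eigenvectors of M^T M (illustrative sketch)."""
    m, n = M.shape

    # 1. Compute S = M^T M.
    S = M.T @ M

    # 2. Orthonormal eigenvectors of S become the columns of V (sorted by eigenvalue).
    lam, V = np.linalg.eigh(S)
    lam, V = lam[::-1], V[:, ::-1]
    lam = np.clip(lam, 0.0, None)        # kill tiny negative round-off

    # 3. Columns of U are M v_i / sqrt(lambda_i) for lambda_i != 0 ...
    r = int(np.sum(lam > tol))
    U = (M @ V[:, :r]) / np.sqrt(lam[:r])

    # ... and if that isn't enough columns to be square, fill the rest
    # with Gram-Schmidt against the standard basis.
    cols = list(U.T)
    for e in np.eye(m):
        if len(cols) == m:
            break
        w = e - sum((c @ e) * c for c in cols)
        if np.linalg.norm(w) > tol:
            cols.append(w / np.linalg.norm(w))
    U = np.column_stack(cols)

    # 4. Sigma has sigma_i = sqrt(lambda_i) along its diagonal.
    k = min(m, n)
    Sigma = np.zeros((m, n))
    Sigma[:k, :k] = np.diag(np.sqrt(lam[:k]))

    return U, Sigma, V

M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])
U, Sigma, V = svd_from_eigs(M)
print(np.allclose(M, U @ Sigma @ V.T))           # True: M = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(U.shape[0])))  # True: U is orthonormal
```

In practice you would just call `np.linalg.svd` directly; this construction only mirrors the eigenvector-based derivation from the notes.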

Compact form of SVD

Having all of those zeroes is kind of cringe, so instead of the full SVD we can use a compact form. It basically cuts the SVD down so that, given $r$ is the rank of $\textbf{M}$, we have sizes $\textbf{U}_{m\times r}$, $\Sigma_{r\times r}$, and $\textbf{V}^\intercal_{r\times n}$. Unfortunately, this means that our squareness is gone, which is also cringe.

We can see this "collapse of dimensionality" by looking at $\Sigma$ and the number of singular values there are. Of course, in real-world applications, dimensionally-lost singular values won't be perfectly zero, just really close; i.e., if some of the singular values are really small, don't pay attention to them.
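A short numpy sketch of the compact form (arbitrary rank-deficient example; the `1e-10` threshold for "really small" singular values is my own choice): keep only the first $r$ singular values and the matching columns of $\textbf{U}$ and rows of $\textbf{V}^\intercal$.

```python
import numpy as np

# Rank-2 example: the third row is the sum of the first two.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(M)
r = int(np.sum(s > 1e-10))              # numerical rank: count the "non-small" singular values

# Compact form: U is m x r, Sigma is r x r, V^T is r x n.
U_r, Sigma_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

print(r)                                     # 2
print(np.allclose(M, U_r @ Sigma_r @ Vt_r))  # True: nothing was lost by dropping the zeroes
```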

PCA

"Principal components" are the lower-dimensional structures in 2D data. For example, the principal component of a line in a 2D graph is the underlying 1D structure of the line.

To find the principal component is to find some vector $\vec{w}$ that, when projected onto the data, gives the least error. Note that $\vec{w}$ would be a matrix for multiple principal components.

If $\vec{q}_i$ was one data point, we would be minimizing

$$\sum_{i=1}^n ||\vec{q}_i - \langle\vec{q}_i, \vec{w}\rangle\vec{w}||^2$$

which simplifies to

$$\sum_{i=1}^n ||\vec{q}_i||^2 - \langle\vec{q}_i, \vec{w}\rangle^2$$

which is the same as wanting to maximize $\sum_{i=1}^n \langle\vec{q}_i, \vec{w}\rangle^2$. We further simplify as

$$\max\left[\vec{w}^\intercal\left(\sum_{i=1}^n \vec{q}_i\vec{q}_i^\intercal\right)\vec{w}\right]$$

which can be written in matrix form as

$$\max\left[\vec{w}^\intercal\textbf{Q}\textbf{Q}^\intercal\vec{w}\right]$$

We can then substitute the $\textbf{Q}$ matrices with their SVDs. Doing so, we can simplify and find that:

If we want $i$ principal components along the rows, we pick $\vec{u}_1 \dots \vec{u}_i$.

If we wanted them along the columns, we pick $\vec{v}_1 \dots \vec{v}_i$.
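Finally, a hedged numpy sketch of the PCA recipe above on synthetic 2D data (the variable names are mine, and I also mean-center the points, a standard step the notes don't spell out): stack the data points as the columns of $\textbf{Q}$, take its SVD, and the top $\vec{u}_i$ give the principal components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2D data that is roughly a noisy line: the underlying 1D structure.
t = rng.normal(size=200)
points = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=200)])  # shape (200, 2)

# Stack the (mean-centered) data points as the columns of Q, as in the derivation.
Q = (points - points.mean(axis=0)).T    # shape (2, 200)

# w^T Q Q^T w is maximized by the top left singular vector of Q.
U, s, Vt = np.linalg.svd(Q, full_matrices=False)
w = U[:, 0]                             # first principal component direction
print(w)                                # roughly +/- [1, 2] / sqrt(5)

# Projecting each q_i onto w gives its coordinate along the principal component.
coords = Q.T @ w                                                # <q_i, w> for every data point
reconstruction = np.outer(coords, w) + points.mean(axis=0)      # rank-1 approximation of the data
print(np.mean(np.sum((points - reconstruction) ** 2, axis=1)))  # small residual error
```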