SVD/PCA
SVD
Eigenvectors and eigenvalues are incredibly useful, but the problem is that non-square matrices do not have eigenvalues or eigenvectors. The solution to this is singular value decomposition.
Given an m × n matrix M, we want to find MV = UΣ, where V (n × n) and U (m × m) are orthonormal and Σ (m × n) is sort of diagonal, in that the nonzero entries are along a diagonal but the matrix is not necessarily square. Since V is orthonormal (V⁻¹ = V⊺), the SVD is then written as M = UΣV⊺.
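As a quick numerical sanity check of these shapes, here is a minimal NumPy sketch (not from the original notes; the example matrix is arbitrary):

```python
import numpy as np

# A non-square example matrix; the specific values are arbitrary.
M = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
m, n = M.shape

# full_matrices=True returns the full SVD: U is m x m, Vt is n x n.
U, s, Vt = np.linalg.svd(M, full_matrices=True)

# Sigma is "sort of diagonal": an m x n matrix with the singular
# values along its main diagonal and zeroes everywhere else.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)    # (4, 4) (4, 3) (3, 3)
print(np.allclose(M, U @ Sigma @ Vt))    # True, i.e. M = U Σ V⊺
print(np.allclose(M @ Vt.T, U @ Sigma))  # True, i.e. M V = U Σ
```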
Assuming that we have p nonzero eigenvalues and q eigenvalues that are just zero (these will turn out to be the eigenvalues of M⊺M, with p + q = n), we can split V into two blocks, V = [Vcol | Vnull], where Vcol holds the eigenvectors with nonzero eigenvalues and Vnull the ones with eigenvalue zero. The RHS of the equation then splits the same way: UΣ = [UΣα | 0], where Σα is the p × p diagonal of nonzero singular values and the last q columns are all zeroes.
The right block of the matrix we just split into two, Vnull, is an orthonormal basis for the null space of M (M sends each of its columns to zero).
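In block form, this split looks like the following (a sketch using the Vcol, Vnull, and Σα names from these notes; Σα is the p × p diagonal block of nonzero singular values):

$$
MV = M\begin{bmatrix} V_{\mathrm{col}} & V_{\mathrm{null}} \end{bmatrix}
   = \begin{bmatrix} MV_{\mathrm{col}} & 0 \end{bmatrix}
   = U\Sigma,
\qquad
\Sigma = \begin{bmatrix} \Sigma_{\alpha} & 0 \\ 0 & 0 \end{bmatrix},
\quad
\Sigma_{\alpha} = \operatorname{diag}(\sigma_{1}, \dots, \sigma_{p}).
$$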
Now, we need to cheese our equation enough that we get to work with an honest symmetric matrix. We will use the following properties of M⊺M to do so: it is square (n × n) and symmetric, so by the spectral theorem it has real eigenvalues and an orthonormal set of n eigenvectors; its eigenvalues are all nonnegative; and it has the same null space as M (M⊺Mx = 0 exactly when Mx = 0).
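The nonnegativity of the eigenvalues is a one-liner worth spelling out (not shown in the original notes): for any eigenpair M⊺Mv = λv with ‖v‖ = 1,

$$
\lambda = \lambda\, v^{\top}v = v^{\top}\!\left(M^{\top}M v\right) = (Mv)^{\top}(Mv) = \lVert Mv \rVert^{2} \ge 0 .
$$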
Because M⊺M is square and symmetric, we choose our V matrix to be its orthonormalized eigenvectors, ordered so that the ones with nonzero eigenvalues come first and the ones with zero eigenvalues come last, as mentioned earlier.
Along with these cheese strats, we claim that the columns of Vcol (the left block of the V we split into two earlier) are an orthonormal basis for the row space of M, i.e. the column space of M⊺. This makes our V matrix useful for calculations, since it splits into a portion that's a basis for the row space (the useful part) and a portion that's a basis for the null space (the part M kills, so it's useless here).
As for the U matrix, we take ui = Mvi / √λi for each vi in Vcol (the scaling by 1/√λi makes each ui a unit vector).
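A quick check (a step the notes skip) that these ui really are orthonormal, using the fact that the vi are orthonormal eigenvectors of M⊺M:

$$
u_i^{\top}u_j
= \frac{(Mv_i)^{\top}(Mv_j)}{\sqrt{\lambda_i\lambda_j}}
= \frac{v_i^{\top}M^{\top}M v_j}{\sqrt{\lambda_i\lambda_j}}
= \frac{\lambda_j\, v_i^{\top}v_j}{\sqrt{\lambda_i\lambda_j}}
= \begin{cases} 1 & i = j \\ 0 & i \neq j .\end{cases}
$$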
So far, we have MV = UΣ, where the LHS can be split into two sections, one part for the column space, MVcol, and another for the null space, MVnull (which is all zeroes), while the RHS is split into the diagonal portion UΣα and the rest of the matrix, which is just zeroes. The two sections of each side correspond to each other.
Now, we just need to decide the values of Σ. In order for the equation to hold, we choose σi = √λi (where λi is the corresponding eigenvalue of M⊺M). Indeed, with ui = Mvi/√λi, we get Mvi = √λi ui = σi ui, which is exactly the i-th column of MV = UΣ.
Procedure for determining the SVD
Here is a summary of the above for finding the SVD.
Compute S=M⊺M.
Find the orthonormalized eigenvectors of S to be the columns of V (nonzero eigenvalues first).
Compute U, whose columns are ui = Mvi / √λi for each λi ≠ 0. (If there aren't enough of these for U to be square, fill in the rest with Gram-Schmidt.)
Compute Σ, where the entries on the diagonal are σi = √λi.
The result is M=UΣV⊺.
If the matrix is not full rank, the run of nonzero singular values in Σ just gets cut off early, and the rest of the diagonal is zeroes.
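Here is a minimal NumPy sketch of this procedure (the function name `svd_via_eig` and the example matrix are just for illustration; it assumes a real matrix and uses `np.linalg.eigh` since S = M⊺M is symmetric, with QR standing in for Gram-Schmidt):

```python
import numpy as np

def svd_via_eig(M, tol=1e-6):
    """Compute an SVD M = U @ Sigma @ V.T by eigendecomposing S = M^T M."""
    m, n = M.shape

    # S = M^T M is symmetric, so eigh gives orthonormal eigenvectors.
    S = M.T @ M
    lam, V = np.linalg.eigh(S)

    # Sort eigenvalues in decreasing order so the nonzero ones come first.
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    lam = np.clip(lam, 0.0, None)          # clean up tiny negative round-off

    # Singular values are the square roots of the eigenvalues.
    sigma = np.sqrt(lam)

    # u_i = M v_i / sqrt(lambda_i) for each nonzero lambda_i
    # (singular values below a small relative threshold count as zero).
    r = int(np.sum(sigma > tol * max(sigma[0], 1.0)))
    U_cols = M @ V[:, :r] / sigma[:r]

    # If U isn't square yet, extend its columns to an orthonormal basis
    # of R^m (QR here plays the role of Gram-Schmidt).
    Q, _ = np.linalg.qr(np.hstack([U_cols, np.eye(m)]))
    U = Q[:, :m]
    U[:, :r] = U_cols

    # Assemble the m x n "sort of diagonal" Sigma.
    Sigma = np.zeros((m, n))
    k = min(m, n)
    Sigma[:k, :k] = np.diag(sigma[:k])
    return U, Sigma, V

M = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [2.0, 0.0, 1.0]])            # rank 2, so one singular value is zero
U, Sigma, V = svd_via_eig(M)
print(np.allclose(M, U @ Sigma @ V.T))                  # True
print(np.round(np.diag(Sigma), 6))                      # sqrt(10), 1, 0
print(np.round(np.linalg.svd(M, compute_uv=False), 6))  # same values from NumPy
```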
Compact form of SVD
Having all of those zeroes is kind of cringe, so instead of the full SVD we can use a compact form. It basically cuts the SVD down so that, given r is the rank of M, we have sizes U m×r, Σ r×r, and V n×r (so V⊺ is r×n). Unfortunately, this means that our squareness is gone, which is also cringe: U and V are no longer square, so they're not full orthogonal matrices anymore, even though their columns are still orthonormal.
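For reference, a short NumPy sketch (example values are arbitrary): `full_matrices=False` gives the closely related "thin" SVD with min(m, n) columns kept, and slicing down to the rank r gives the compact form described here:

```python
import numpy as np

M = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])          # 4 x 3, rank 2

# Thin SVD: U is m x k, s has k entries, Vt is k x n, with k = min(m, n).
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Compact SVD: keep only the r columns/rows with nonzero singular values.
r = int(np.sum(s > 1e-10))
U_c, Sigma_c, Vt_c = U[:, :r], np.diag(s[:r]), Vt[:r, :]

print(r)                                     # 2
print(U_c.shape, Sigma_c.shape, Vt_c.shape)  # (4, 2) (2, 2) (2, 3)
print(np.allclose(M, U_c @ Sigma_c @ Vt_c))  # True: compact form still reconstructs M
```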
PCA
"Principal components" are the lower-dimensional structures in 2D data. For example, the principal component of a line in a 2D graph is the underlying 1D structure of the line.
We can see this "collapse of dimensionality" by looking at Σ and the number of singlular values there are. Of course, in real world applications, dimensionally-lost singular values won't perfectly be zero, but really close. I.E., if some of the singlular values are really small, don't pay attention to them.
To find the principal component is to find some unit vector w such that projecting the data onto w gives the least error. (Note that w would be a matrix, with one column per direction, if we want multiple principal components.)
If qi were one data point, we would be minimizing the squared projection error ‖qi − ⟨qi, w⟩w‖², which (using ‖w‖ = 1) simplifies to ‖qi‖² − ⟨qi, w⟩², which is the same as wanting to maximize ⟨qi, w⟩². We further simplify by summing over all of the data points, so we want to maximize Σi ⟨qi, w⟩², which can be written in matrix form (stacking the qi⊺ as the rows of a data matrix Q) as ‖Qw‖² = w⊺Q⊺Qw. We can then substitute Q and Q⊺ with the SVD Q = UΣV⊺. Doing so, we can simplify and find that w⊺Q⊺Qw = w⊺VΣ⊺ΣV⊺w, which (for unit w) is largest when w is v1, the right singular vector with the largest singular value.
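Putting that chain in one place (a sketch, assuming ‖w‖ = 1 and that the data points qi⊺ are stacked as the rows of Q with SVD Q = UΣV⊺):

$$
\min_{\lVert w\rVert = 1}\ \sum_i \bigl\lVert q_i - \langle q_i, w\rangle\, w \bigr\rVert^2
= \min_{\lVert w\rVert = 1}\ \sum_i \left(\lVert q_i\rVert^2 - \langle q_i, w\rangle^2\right)
\;\Longleftrightarrow\;
\max_{\lVert w\rVert = 1}\ \lVert Qw\rVert^2
= \max_{\lVert w\rVert = 1}\ w^{\top} V\,\Sigma^{\top}\Sigma\, V^{\top} w ,
$$

and since Σ⊺Σ is diagonal with entries σi² = λi, the maximum is attained at w = v1.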
If the data points run along the columns of Q and we want i principal components, we pick u1, ..., ui.
If they run along the rows (each row is a data point qi⊺, as above), we pick v1, ..., vi.
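A minimal PCA-via-SVD sketch in NumPy (names like `pca` are just illustrative; it assumes the data points are the rows of Q and centers them first, the usual convention even though centering isn't discussed above):

```python
import numpy as np

def pca(Q, num_components):
    """Top principal components of data whose rows are the data points."""
    Q_centered = Q - Q.mean(axis=0)           # subtract the mean of each feature
    U, s, Vt = np.linalg.svd(Q_centered, full_matrices=False)
    W = Vt[:num_components].T                 # columns are v_1, ..., v_i
    scores = Q_centered @ W                   # coordinates of the data along those directions
    return W, scores, s

# Synthetic 2D data that mostly varies along the direction (1, 1).
rng = np.random.default_rng(1)
t = rng.normal(size=(300, 1))
Q = t @ np.array([[1.0, 1.0]]) + 0.05 * rng.normal(size=(300, 2))

W, scores, s = pca(Q, num_components=1)
print(np.round(W.ravel(), 3))   # roughly ±(0.707, 0.707): the 1D structure of the "line"
print(np.round(s, 2))           # one large singular value, one small one
```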
TODO: PCA