SVD/PCA
Eigenvectors and eigenvalues are incredibly useful, but the problem is that non-square matrices do not have eigenvalues or eigenvectors. The solution to this is singular value decomposition.
Given a matrix $A$ ($m \times n$), we want to find $A = U \Sigma V^T$, where $V$ ($n \times n$) and $U$ ($m \times m$) are orthonormal and $\Sigma$ ($m \times n$) is sort of diagonal, in that the nonzero entries are along a diagonal but the matrix is not necessarily square. SVD is then written as

$$A = U \Sigma V^T$$
Assuming that we have $r$ nonzero eigenvalues and $n - r$ eigenvalues that are just zero, we can rewrite the RHS of the equation as follows.

$$A = U \Sigma \begin{bmatrix} V_1 & V_2 \end{bmatrix}^T$$

Here $V_1$ holds the eigenvectors with nonzero eigenvalues and $V_2$ holds the ones with zero eigenvalues. The right side of the matrix that we just split into two, $V_2$, is a basis for the null space of $A$.
Now, we need to cheese our equation enough to the point where it can sort of become a scuffed symmetric matrix. We will use the following properties to do so.

$$A^T A = (U \Sigma V^T)^T (U \Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T$$

$$A A^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma \Sigma^T U^T$$

Both $A^T A$ and $A A^T$ are symmetric, so each one has a full set of orthonormal eigenvectors.
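As a quick numerical sanity check (my own NumPy illustration, not part of the original notes): $A^T A$ really is symmetric, and its eigenvalues are the squared singular values of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))       # a random non-square matrix

AtA = A.T @ A                          # this is symmetric...
eigvals = np.sort(np.linalg.eigvalsh(AtA))[::-1]

# ...and its eigenvalues are the squared singular values of A
sigmas = np.linalg.svd(A, compute_uv=False)
assert np.allclose(AtA, AtA.T)
assert np.allclose(eigvals, sigmas**2)
```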
Because $A^T A$ is square (and symmetric), we choose our $V$ matrix to be its orthonormalized eigenvectors, corresponding to nonzero and zero eigenvalues as mentioned earlier.
Along with these cheese strats, we claim that the columns of $A V_1$, where $V_1$ is the left side of the matrix we split into two earlier, form an orthogonal basis for the column space of $A$. This makes our $V$ matrix useful for calculations, since it splits things into a portion that's a basis for the column space (the useful part) and a portion that's a basis for the null space (the useless part).
As for the $U$ matrix, we have

$$u_i = \frac{A v_i}{\| A v_i \|}$$

for each $i$ in $\{1, \dots, r\}$.
So far, we have $A V = U \Sigma$, where the LHS can be split into two sections, one part for the column space ($A V_1$) and another for the null space ($A V_2 = 0$), while the RHS is split into the diagonal portion and the rest of the matrix, which is just zeroes. The two sections of each side correspond to each other.
Now, we just need to decide the values of $\sigma_i$. In order for the equation $A v_i = \sigma_i u_i$ to be true, we choose $\sigma_i = \sqrt{\lambda_i}$ (where $\lambda_i$ is an eigenvalue of $A^T A$), since $\| A v_i \| = \sqrt{v_i^T A^T A v_i} = \sqrt{\lambda_i}$. To summarize, here is the procedure for finding the SVD:
1. Compute $A^T A$.
2. Find the orthonormalized eigenvectors of $A^T A$ to be the columns of $V$.
3. Compute $U$, where its columns are $u_i = \frac{A v_i}{\sigma_i}$. (If there aren't enough for it to be square, fill the rest with Gram-Schmidt.)
4. Compute $\Sigma$, where the items on the diagonal are $\sigma_i = \sqrt{\lambda_i}$.
5. The result is $A = U \Sigma V^T$.
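The procedure above can be sketched in NumPy (a minimal illustration of the steps as I read them; `svd_by_hand` is a made-up name, and real code should just call `np.linalg.svd`):

```python
import numpy as np

def svd_by_hand(A):
    """Eigendecompose A^T A, then build V, U, and Sigma from it."""
    m, n = A.shape
    # orthonormal eigenvectors of A^T A become the columns of V
    lam, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]              # sort eigenvalues descending
    lam, V = lam[order], V[:, order]
    sigma = np.sqrt(np.clip(lam, 0.0, None))   # sigma_i = sqrt(lambda_i)

    # u_i = A v_i / sigma_i for each nonzero singular value
    r = int(np.sum(sigma > 1e-10))
    U1 = A @ V[:, :r] / sigma[:r]
    if r < m:
        # not enough columns for U to be square: complete the basis
        # (Gram-Schmidt, done here via QR on [U1 | I])
        Q, _ = np.linalg.qr(np.hstack([U1, np.eye(m)]))
        U = np.hstack([U1, Q[:, r:m]])
    else:
        U = U1

    # Sigma is m x n with the sigma_i along the diagonal
    Sigma = np.zeros((m, n))
    k = min(m, n)
    Sigma[:k, :k] = np.diag(sigma[:k])
    return U, Sigma, V

A = np.array([[3.0, 1.0], [1.0, 3.0], [0.0, 2.0]])
U, S, V = svd_by_hand(A)
assert np.allclose(U @ S @ V.T, A)             # A = U Sigma V^T
assert np.allclose(U.T @ U, np.eye(3))         # U is orthonormal
```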
If the matrix is not full rank, your diagonal just gets cut off in your $\Sigma$ matrix and there will be some zeroes along the diagonal.
Having all of those zeroes is kind of cringe, so instead of the full SVD we can use a compact form. It basically cuts down the SVD so that, given $r$ is the rank of $A$, we have sizes $U: m \times r$, $\Sigma: r \times r$, and $V: n \times r$. Unfortunately, this means that our squareness is gone, which is also cringe.
We can see this "collapse of dimensionality" by looking at $\Sigma$ and counting how many nonzero singular values there are. Of course, in real-world applications, dimensionally-lost singular values won't be perfectly zero, just really close. I.e., if some of the singular values are really small, don't pay attention to them.
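Here's a small NumPy sketch of that idea (my own example, not from the notes): build a nearly rank-2 matrix, then keep only the singular values that aren't tiny.

```python
import numpy as np

rng = np.random.default_rng(1)
# a 6x4 matrix that is rank 2 plus a tiny bit of noise, mimicking real data
B = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
B = B + 1e-8 * rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(B)                # full SVD: U is 6x6, Vt is 4x4
# s has two "real" values and two that are nearly zero; drop the tiny ones
r = int(np.sum(s > 1e-6 * s[0]))
U_r, S_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]   # sizes 6xr, rxr, rx4

assert r == 2
assert np.allclose(U_r @ S_r @ Vt_r, B, atol=1e-6)     # compact form still ~= B
```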
"Principal components" are the lower-dimensional structures hidden in data; for example, the principal component of a noisy line on a 2D graph is the underlying 1D direction of the line. To find the principal component is to find some unit vector $w$ that, when the data is projected onto it, gives the least error. Note that $w$ would be a matrix $W$ for multiple principal components.
If $x$ was one data point, we would be minimizing

$$\| x - w w^T x \|^2,$$

which is the same as wanting to maximize

$$w^T x x^T w.$$
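A quick numerical check of that equivalence (my own illustration): for a unit vector $w$, the squared projection error equals $\|x\|^2 - w^T x x^T w$, so minimizing the first term is the same as maximizing the second.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(5)                  # one data point
w = rng.standard_normal(5)
w = w / np.linalg.norm(w)                   # w must be a unit vector

err = np.linalg.norm(x - w * (w @ x))**2    # squared projection error
assert np.isclose(err, x @ x - (w @ x)**2)  # = ||x||^2 - w^T x x^T w
```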
We can then substitute the data matrix with its SVD. Doing so, we can simplify and find that the best choices of $w$ are the singular vectors with the largest singular values.

If we want principal components along the rows (each row is a data point), we pick the first columns of $V$.

If we wanted them along the columns, we pick the first columns of $U$.
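Putting it together, a minimal PCA sketch in NumPy (my own example, with rows as data points; note the mean-centering step, which is standard practice for PCA even though the derivation above doesn't spell it out):

```python
import numpy as np

rng = np.random.default_rng(2)
# 200 two-dimensional points scattered along the direction (1, 1): a noisy line
t = rng.standard_normal(200)
X = np.column_stack([t, t]) + 0.1 * rng.standard_normal((200, 2))

Xc = X - X.mean(axis=0)                    # center the data first
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# rows are the data points, so principal components are the columns of V
w = Vt[0]                                  # first principal component
scores = Xc @ w                            # data projected onto w: the 1D structure

# w should point along (1, 1), up to sign
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
assert min(np.linalg.norm(w - direction), np.linalg.norm(w + direction)) < 0.1
```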