A Fast Factorization-based Approach to Robust PCA
Robust principal component analysis (RPCA) has been widely used for recovering low-rank matrices in many data mining and machine learning problems. It separates a data matrix into a low-rank part and a sparse part. The convex approach has been well studied in the literature. However, state-of-the-art algorithms for the convex approach usually have relatively high complexity due to the need of solving (partial) singular value decompositions of large matrices. A non-convex approach, AltProj, has also been proposed with lighter complexity and better scalability. Given the true rank r of the underlying low rank matrix, AltProj has a complexity of O(r^2dn), where d× n is the size of data matrix. In this paper, we propose a novel factorization-based model of RPCA, which has a complexity of O(kdn), where k is an upper bound of the true rank. Our method does not need the precise value of the true rank. From extensive experiments, we observe that AltProj can work only when r is precisely known in advance; however, when the needed rank parameter r is specified to a value different from the true rank, AltProj cannot fully separate the two parts while our method succeeds. Even when both work, our method is about 4 times faster than AltProj. Our method can be used as a light-weight, scalable tool for RPCA in the absence of the precise value of the true rank.
READ FULL TEXT