Random Projections for k-means Clustering

11/21/2010
by   Christos Boutsidis, et al.
0

This paper discusses the topic of dimensionality reduction for k-means clustering. We prove that any set of n points in d dimensions (rows in a matrix A ∈^n × d) can be projected into t = Ω(k / ^2) dimensions, for any ∈ (0,1/3), in O(n d ^-2 k/ (d) ) time, such that with constant probability the optimal k-partition of the point set is preserved within a factor of 2+. The projection is done by post-multiplying A with a d × t random matrix R having entries +1/√(t) or -1/√(t) with equal probability. A numerical implementation of our technique and experiments on a large face images dataset verify the speed and the accuracy of our theoretical results.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset