NCVis: Noise Contrastive Approach for Scalable Visualization
Modern methods for data visualization via dimensionality reduction, such as t-SNE, usually have performance issues that prohibit their application to large amounts of high-dimensional data. In this work, we propose NCVis – a high-performance dimensionality reduction method built on a sound statistical basis of noise contrastive estimation. We show that NCVis outperforms state-of-the-art techniques in terms of speed while preserving the representation quality of other methods. In particular, the proposed approach successfully proceeds a large dataset of more than 1 million news headlines in several minutes and presents the underlying structure in a human-readable way. Moreover, it provides results consistent with classical methods like t-SNE on more straightforward datasets like images of hand-written digits. We believe that the broader usage of such software can significantly simplify the large-scale data analysis and lower the entry barrier to this area.
READ FULL TEXT