Poisson Kernel-Based Clustering on the Sphere: Convergence Properties, Identifiability, and a Method of Sampling
Many applications involve data that can be analyzed as unit vectors on a d-dimensional sphere. Examples arise in text mining (in particular, document clustering), biology, astronomy, and medicine, among others. Previous work proposed a clustering method based on mixtures of Poisson kernel-based distributions (PKBD) on the sphere. We prove identifiability of mixtures of this model, establish convergence of the associated EM-type algorithm, and study its operational characteristics. Furthermore, we propose an empirical densities distance plot for estimating the number of clusters in a PKBD model. Finally, we propose a method for simulating data from Poisson kernel-based densities and illustrate our methods on real data sets and in simulation experiments.
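To make the model concrete, the following is a minimal sketch of evaluating a single Poisson kernel-based density. The abstract does not state the formula, so the sketch assumes the commonly used form f(x | mu, rho) = (1 - rho^2) / (omega_d * ||x - rho*mu||^d), where x and mu are unit vectors in R^d, 0 <= rho < 1 is a concentration parameter, and omega_d is the surface area of the unit sphere in R^d. Note that d here is the ambient dimension, which may differ from the abstract's convention. The function name `pkbd_density` and its argument names are hypothetical, chosen only for illustration.

```python
import math


def pkbd_density(x, mu, rho):
    """Evaluate an assumed Poisson kernel-based density at unit vector x.

    x, mu : sequences of floats of equal length d, assumed to have unit norm.
    rho   : concentration parameter, assumed to satisfy 0 <= rho < 1.
    """
    d = len(x)
    if len(mu) != d:
        raise ValueError("x and mu must have the same dimension")
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")

    # Surface area of the unit sphere in R^d: 2 * pi^(d/2) / Gamma(d/2).
    omega_d = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

    # For unit vectors, ||x - rho*mu||^2 = 1 + rho^2 - 2*rho*<x, mu>.
    dot = sum(a * b for a, b in zip(x, mu))
    sq_dist = 1.0 + rho * rho - 2.0 * rho * dot

    return (1.0 - rho * rho) / (omega_d * sq_dist ** (d / 2.0))
```

Under this assumed form, rho = 0 gives the uniform density on the sphere, and the density concentrates around mu as rho approaches 1. A mixture density would then be a weighted sum of such components, each with its own mu and rho.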