Hierarchical Latent Word Clustering

01/20/2016
by Halid Ziya Yerebakan, et al.

This paper presents a new Bayesian non-parametric model that extends the usage of Hierarchical Dirichlet Allocation to extract tree-structured word clusters from text data. The model's inference algorithm groups words into a cluster if they share a similar distribution over documents. In our experiments, we observed meaningful hierarchical structures on the NIPS corpus and on radiology reports collected from public repositories.
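
To make the idea of "words that share a similar distribution over documents" concrete, the following is a minimal illustrative sketch and not the paper's Bayesian inference algorithm. It assumes a toy corpus, represents each word by its normalized counts across documents, and greedily groups words whose Jensen-Shannon divergence falls below a threshold. The function names, the toy documents, and the threshold value are all hypothetical, and the grouping is flat rather than tree-structured.

```python
import math
from collections import Counter


def word_document_distributions(docs):
    """Map each word to its normalized count vector over documents."""
    counts = {}
    for d, doc in enumerate(docs):
        for word, n in Counter(doc).items():
            counts.setdefault(word, [0] * len(docs))[d] = n
    return {w: [c / sum(row) for c in row] for w, row in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cluster_words(docs, threshold):
    """Greedy flat grouping: a word joins the first cluster whose
    representative (its first word) is closer than `threshold`."""
    dists = word_document_distributions(docs)
    clusters = []
    for word in sorted(dists):
        for cluster in clusters:
            if js_divergence(dists[word], dists[cluster[0]]) < threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Toy corpus (assumed for illustration): two documents per topic.
docs = [
    "neuron synapse neuron".split(),
    "synapse neuron".split(),
    "lung nodule lung".split(),
    "nodule lung".split(),
]
dists = word_document_distributions(docs)
clusters = cluster_words(docs, threshold=0.5)
print(clusters)
```

In this toy setting, words that occur in the same documents end up together because their document distributions nearly coincide, while words with disjoint document support have the maximum divergence of 1 and are kept apart. A hierarchical variant would apply the same comparison recursively, which the sketch does not attempt.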
