Spectral Learning for Supervised Topic Models

02/19/2016
by Yong Ren, et al.

Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on variational approximation or Monte Carlo sampling, which often suffer from local optima. Spectral methods have been applied to learn unsupervised topic models, such as latent Dirichlet allocation (LDA), with provable guarantees. This paper investigates the possibility of applying spectral methods to recover the parameters of supervised LDA (sLDA). We first present a two-stage spectral method, which recovers the parameters of LDA and then applies a power update method to recover the regression model parameters. We then present a single-phase spectral algorithm that jointly recovers the topic distribution matrix and the regression weights. Our spectral algorithms are provably correct and computationally efficient. We prove a sample complexity bound for each algorithm and subsequently derive a sufficient condition for the identifiability of sLDA. Thorough experiments on synthetic and real-world datasets verify the theory and demonstrate the practical effectiveness of the spectral algorithms. In fact, our results on a large-scale review rating dataset demonstrate that our single-phase spectral algorithm alone achieves performance comparable to or even better than state-of-the-art methods, while previous work on spectral methods has rarely reported such promising performance.
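As a rough illustration of the kind of power update the abstract refers to, the sketch below runs a generic tensor power iteration with deflation on a synthetic symmetric third-order tensor that is exactly orthogonally decomposable. It is not the authors' algorithm: the function names, the synthetic tensor, and the iteration and restart counts are all assumptions made for illustration, and the moment estimation and whitening steps that a spectral method for LDA or sLDA would need are omitted.

```python
import numpy as np


def tensor_power_iteration(T, n_iter=100, n_restarts=10, seed=0):
    """Find one robust (eigenvalue, eigenvector) pair of a symmetric 3-way tensor.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    k = T.shape[0]
    best_lam, best_v = -np.inf, None
    for _ in range(n_restarts):
        # Random unit-norm starting point.
        v = rng.normal(size=k)
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            # Power update: v <- T(I, v, v) / ||T(I, v, v)||
            v = np.einsum("ijk,j,k->i", T, v, v)
            v /= np.linalg.norm(v)
        # Eigenvalue estimate: T(v, v, v)
        lam = np.einsum("ijk,i,j,k->", T, v, v, v)
        if lam > best_lam:
            best_lam, best_v = lam, v
    return best_lam, best_v


def decompose(T, n_components):
    """Extract n_components pairs by repeated power iteration and deflation."""
    T = T.copy()
    pairs = []
    for _ in range(n_components):
        lam, v = tensor_power_iteration(T)
        pairs.append((lam, v))
        # Deflation: subtract the recovered rank-one term.
        T -= lam * np.einsum("i,j,k->ijk", v, v, v)
    return pairs


# Synthetic tensor (assumed for the demo): sum_i lam_i * v_i (x) v_i (x) v_i
# with orthonormal v_i obtained from a QR factorization of a random matrix.
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
true_lams = np.array([3.0, 2.0, 1.0])
true_vecs = [Q[:, i] for i in range(3)]
T = sum(l * np.einsum("i,j,k->ijk", v, v, v)
        for l, v in zip(true_lams, true_vecs))

pairs = decompose(T, n_components=3)
recovered_lams = sorted(lam for lam, _ in pairs)
```

On this idealized input, `recovered_lams` should match the planted weights up to numerical error. With moments estimated from real data the tensor is only approximately decomposable, so the recovered pairs carry estimation error.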

research
12/10/2013

Guaranteed Model Order Estimation and Sample Complexity Bounds for LDA

The question of how to determine the number of independent latent factor...
research
08/17/2018

Learning Supervised Topic Models for Classification and Regression from Crowds

The growing need to analyze large collections of documents has led to gr...
research
07/07/2015

Rethinking LDA: moment matching for discrete ICA

We consider moment matching techniques for estimation in Latent Dirichle...
research
05/30/2016

Spectral Methods for Correlated Topic Models

In this paper, we propose guaranteed spectral methods for learning a bro...
research
12/09/2020

EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation

As one of the most powerful topic models, Latent Dirichlet Allocation (L...
research
10/26/2014

A provable SVD-based algorithm for learning topics in dominant admixture corpus

Topic models, such as Latent Dirichlet Allocation (LDA), posit that docu...
research
06/06/2018

Spectral Inference Networks: Unifying Spectral Methods With Deep Learning

We present Spectral Inference Networks, a framework for learning eigenfu...
