ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics

by Hamed Rahimi, et al.

As the volume of text generated by humans and machines grows, understanding large corpora and extracting insights from them is becoming more important than ever. Dynamic topic models study how the topics in a collection of documents evolve over time. They are widely used to understand trends, explore public opinion in social networks, and track research progress and discoveries in scientific archives. Since topics are defined as clusters of semantically similar documents, understanding how topics evolve as new knowledge is discovered requires observing how the content or themes of these clusters change. In this paper, we introduce the Aligned Neural Topic Model (ANTM), a dynamic neural topic model that uses document embeddings to compute clusters of semantically similar documents at different periods and aligns these clusters to represent their evolution. This alignment procedure preserves the temporal similarity of document clusters over time and captures the semantic change of words characterized by their context within different periods. Experiments on four different datasets show that ANTM outperforms probabilistic dynamic topic models (e.g. DTM, DETM) and significantly improves topic coherence and diversity over other existing dynamic neural topic models (e.g. BERTopic).
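
The idea of aligning document clusters across periods can be illustrated with a small sketch. This is not the authors' implementation: it assumes that documents have already been embedded and clustered within each period, and it uses centroid cosine similarity with a fixed threshold as a stand-in alignment rule. The function names (`centroid`, `cosine`, `align_periods`), the `threshold` parameter, and the toy vectors are all hypothetical.

```python
from math import sqrt

def centroid(vectors):
    """Mean vector of a cluster of document embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def align_periods(periods, threshold=0.8):
    """Link each cluster to its most similar cluster in the previous period.

    periods: one dict per time period, mapping cluster id -> list of embeddings.
    Returns links of the form ((t - 1, previous_id), (t, current_id)).
    Assumption: a link is kept only if centroid similarity >= threshold;
    a cluster with no link is treated as a newly emerging topic.
    """
    links = []
    prev = {}  # centroids of the previous period
    for t, clusters in enumerate(periods):
        cur = {cid: centroid(vecs) for cid, vecs in clusters.items()}
        for cid, c in cur.items():
            best = max(prev.items(), key=lambda kv: cosine(c, kv[1]), default=None)
            if best is not None and cosine(c, best[1]) >= threshold:
                links.append(((t - 1, best[0]), (t, cid)))
        prev = cur
    return links

# Toy 2-D "embeddings" for two periods (made up for illustration).
periods = [
    {"a": [[1.0, 0.0], [0.9, 0.1]], "b": [[0.0, 1.0], [0.1, 0.9]]},
    {"x": [[0.8, 0.2], [1.0, 0.1]], "y": [[-1.0, 0.0], [-0.9, -0.1]]},
]
links = align_periods(periods)
print(links)
```

In this toy input, cluster "x" in the second period is linked to cluster "a" in the first, while "y" has no sufficiently similar predecessor and so starts a new topic. Several current clusters may link to the same previous cluster, which this sketch would read as a topic splitting.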


