Feature Extraction Using Deep Generative Models for Bangla Text Classification on a New Comprehensive Dataset

08/21/2023
by Md Rafi Ur Rashid, et al.

Feature selection for text classification is a fundamental task in text mining and information retrieval. Despite being the sixth most widely spoken language in the world, Bangla has received little attention due to the scarcity of text datasets. In this research, we collected, annotated, and prepared a comprehensive dataset of 212,184 Bangla documents in seven different categories and made it publicly accessible. We implemented three deep generative models to extract text features: an LSTM variational autoencoder (LSTM VAE), an auxiliary classifier generative adversarial network (AC-GAN), and an adversarial autoencoder (AAE), all of which were originally applied in computer vision. We trained these three models on our dataset and used the resulting feature spaces for document classification. We evaluated the performance of the classifiers and found that the adversarial autoencoder produced the best feature space.
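The extract-then-classify pipeline described in the abstract can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration, not the authors' implementation: a linear autoencoder stands in for the LSTM VAE, AC-GAN, and AAE models, the six-document bag-of-words matrix is invented toy data, and the nearest-centroid classifier and all function names are hypothetical. It only shows the general idea of training a model to reconstruct documents and then classifying in its latent space.

```python
import numpy as np

def train_autoencoder(X, latent_dim=2, lr=0.1, epochs=500, seed=0):
    """Fit a linear autoencoder X -> Z -> X_hat by gradient descent on MSE.

    Stand-in for a deep generative model; returns encoder weights,
    decoder weights, and the per-epoch reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc                      # latent features
        X_hat = Z @ W_dec                  # reconstruction
        diff = X_hat - X
        losses.append(float((diff ** 2).sum() / (2 * n)))
        err = diff / n                     # gradient of the loss w.r.t. X_hat
        grad_dec = Z.T @ err
        grad_enc = X.T @ (err @ W_dec.T)
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses

def fit_centroids(Z, y):
    """Compute one mean latent vector per class label."""
    labels = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def predict(Z, labels, centroids):
    """Assign each latent vector to the class of its nearest centroid."""
    dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dists.argmin(axis=1)]

# Toy bag-of-words counts (assumed data): class 0 uses words 0-2, class 1 uses words 3-5.
X = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

W_enc, W_dec, losses = train_autoencoder(X)
Z = X @ W_enc                              # extracted feature space
labels, centroids = fit_centroids(Z, y)
pred = predict(Z, labels, centroids)
accuracy = float((pred == y).mean())       # training accuracy on the toy data
print("latent shape:", Z.shape)
print("training accuracy:", accuracy)
```

In a setting like the one the abstract describes, the encoder would be replaced by the trained generative model and the classifier would be evaluated on held-out documents rather than on the training set.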

