A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

07/08/2021
by Firoj Alam, et al.

Bangla, ranked as the 6th most widely spoken language in the world (https://www.ethnologue.com/guides/ethnologue200) with 230 million native speakers, is still considered a low-resource language in the natural language processing (NLP) community. Despite three decades of research, Bangla NLP (BNLP) still lags behind, mainly due to the scarcity of resources and the challenges that come with it. Work exists in various areas of BNLP, but it is sparse, and a thorough survey reporting previous work and recent advances has yet to be done. In this study, we first provide a review of Bangla NLP tasks, resources, and tools available to the research community; we then benchmark datasets collected from various platforms for nine NLP tasks using current state-of-the-art algorithms (i.e., transformer-based models). We provide comparative results for the studied NLP tasks by comparing monolingual vs. multilingual models of varying sizes. We report our results using both individual and consolidated datasets and provide data splits for future research. We reviewed a total of 108 papers and conducted 175 sets of experiments. Our results show promising performance using transformer-based models while highlighting the trade-off with computational costs. We hope that such a comprehensive survey will motivate the community to build on and further advance the research on Bangla NLP.


