Capitalization and Punctuation Restoration: a Survey

11/21/2021
by Vasile Pais, et al.

Ensuring proper punctuation and letter casing is a key pre-processing step towards applying complex natural language processing algorithms. This is especially significant for textual sources where punctuation and casing are missing, such as the raw output of automatic speech recognition systems. Additionally, short text messages and micro-blogging platforms offer unreliable and often wrong punctuation and casing. This survey offers an overview of both historical and state-of-the-art techniques for restoring punctuation and correcting word casing. Furthermore, current challenges and research directions are highlighted.

research
07/04/2022

Vietnamese Capitalization and Punctuation Recovery Models

Despite the rise of recent performant methods in Automatic Speech Recogn...
research
02/19/2022

Punctuation Restoration

Given the increasing number of livestreaming videos, automatic speech re...
research
12/21/2022

End-to-End Automatic Speech Recognition model for the Sudanese Dialect

Designing a natural voice interface rely mostly on Speech recognition fo...
research
03/09/2020

Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data

Most state-of-the-art speech systems are using Deep Neural Networks (DNN...
research
07/15/2020

A Survey on Computational Propaganda Detection

Propaganda campaigns aim at influencing people's mindset with the purpos...
research
08/24/2023

Sparks of Large Audio Models: A Survey and Outlook

This survey paper provides a comprehensive overview of the recent advanc...
research
08/10/2023

Optical Script Identification for multi-lingual Indic-script

Script identification and text recognition are some of the major domains...
