Joint Language Identification of Code-Switching Speech using Attention based E2E Network

07/15/2019
by Sreeram Ganji, et al.

Language identification (LID) is relevant to many speech processing applications. For automatic recognition of code-switching speech, conventional approaches often employ an LID system to detect the languages present within an utterance. In existing work, LID on code-switching speech models the underlying languages separately. In this work, we propose an LID system for code-switching speech that models the languages jointly. To this end, an attention-based end-to-end (E2E) network is explored. The proposed approach is developed and evaluated on a recently created Hindi-English code-switching corpus. For comparison, an LID system employing a connectionist temporal classification (CTC)-based E2E network is also developed. Of the two LID systems, the attention-based approach yields better LID accuracy. Plots of the E2E network's attention weights demonstrate that the proposed approach effectively locates code-switching boundaries within the utterance.
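The idea of reading code-switching boundaries off attention weights can be illustrated with a small sketch. This is not the authors' implementation: the function name `switch_frames`, the language tags "H" and "E", and the toy attention matrix are all assumptions made for illustration. The sketch assumes the decoder emits one language tag per output step along with a row of attention weights over the input frames, and it takes the most-attended frame at each step where the tag changes as a rough boundary estimate.

```python
def switch_frames(labels, attention):
    """Return (output_step, frame_index) pairs where the language tag changes.

    labels: one language tag per decoder output step (hypothetical tags).
    attention: one row per output step; each row holds that step's
    attention weights over the encoder input frames.
    """
    if len(labels) != len(attention):
        raise ValueError("need one attention row per output label")
    boundaries = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            row = attention[i]
            # Most-attended input frame for the first label after the switch.
            peak = max(range(len(row)), key=row.__getitem__)
            boundaries.append((i, peak))
    return boundaries


# Toy example (made-up values): 5 output steps attending over 6 input frames.
labels = ["H", "H", "E", "E", "H"]
attention = [
    [0.7, 0.2, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.7, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.6, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.1, 0.3, 0.6],
]
print(switch_frames(labels, attention))  # [(2, 2), (4, 5)]
```

In this toy input the tag switches at output steps 2 and 4, and the attention peaks for those steps fall on frames 2 and 5. A real system would likely need smoothing or a frame-to-time conversion, which this sketch omits.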

research
07/15/2019

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data

End-to-end (E2E) systems are fast replacing the conventional systems in ...
research
03/03/2021

An Attention Based Neural Network for Code Switching Detection: English Roman Urdu

Code-switching is a common phenomenon among people with diverse lingual ...
research
09/24/2018

Hindi-English Code-Switching Speech Corpus

Code-switching refers to the usage of two languages within a sentence or...
research
05/31/2023

Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning

Code-switching, also called code-mixing, is the linguistics phenomenon w...
research
10/20/2022

Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS

Current end-to-end code-switching Text-to-Speech (TTS) can already gener...
research
08/02/2020

Efficient Urdu Caption Generation using Attention based LSTMs

Recent advancements in deep learning has created a lot of opportunities ...
research
05/02/2022

TuGeBiC: A Turkish German Bilingual Code-Switching Corpus

In this paper we describe the process of collection, transcription, and ...
