a unified front-end framework for english text-to-speech synthesis

05/18/2023
by   Zelin Ying, et al.
0

The front-end is a critical component of English text-to-speech (TTS) systems, responsible for extracting linguistic features that are essential for a text-to-speech model to synthesize speech, such as prosodies and phonemes. The English TTS front-end typically consists of a text normalization (TN) module, a prosody word prosody phrase (PWPP) module, and a grapheme-to-phoneme (G2P) module. However, current research on the English TTS front-end focuses solely on individual modules, neglecting the interdependence between them and resulting in sub-optimal performance for each module. Therefore, this paper proposes a unified front-end framework that captures the dependencies among the English TTS front-end modules. Extensive experiments have demonstrated that the proposed method achieves state-of-the-art (SOTA) performance in all modules.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/11/2019

A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis

In Mandarin text-to-speech (TTS) system, the front-end text processing m...
research
10/31/2018

A speech-based driver assisting module for Intelligent Transport System

Aim of this research is to transform images of roadside traffic panels t...
research
10/10/2017

MoNoise: Modeling Noise Using a Modular Normalization System

We propose MoNoise: a normalization model focused on generalizability an...
research
04/03/2019

What is wrong with scene text recognition model comparisons? dataset and model analysis

Many new proposals for scene text recognition (STR) models have been int...
research
04/15/2021

Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

Developing Text Normalization (TN) systems for Text-to-Speech (TTS) on n...
research
09/16/2023

FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework

This paper integrates graph-to-sequence into an end-to-end text-to-speec...
research
06/11/2021

Sprachsynthese – State-of-the-Art in englischer und deutscher Sprache

Reading text aloud is an important feature for modern computer applicati...

Please sign up or login with your details

Forgot password? Click here to reset