Aimilios Chalamandaris

research

∙ 11/29/2022

Controllable speech synthesis by learning discrete phoneme-level prosodic representations

In this paper, we present a novel method for phoneme-level prosody contr...

Nikolaos Ellinas, et al.

research

∙ 11/02/2022

Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis

A large part of the expressive speech synthesis literature focuses on le...

Konstantinos Klapsas, et al.

research

∙ 11/01/2022

Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis

This paper proposes an Expressive Speech Synthesis model that utilizes t...

Karolos Nikitaras, et al.

research

∙ 11/01/2022

Generating Gender-Ambiguous Text-to-Speech Voices

The gender of a voice assistant or any voice user interface is a central...

Konstantinos Markopoulos, et al.

research

∙ 11/01/2022

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features

Current state-of-the-art methods for automatic synthetic speech evaluati...

Alexandra Vioni, et al.

research

∙ 10/31/2022

Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation

This paper presents a method for end-to-end cross-lingual text-to-speech...

Nikolaos Ellinas, et al.

research

∙ 04/11/2022

Fine-grained Noise Control for Multispeaker Speech Synthesis

A text-to-speech (TTS) model typically factorizes speech attributes such...

Karolos Nikitaras, et al.

research

∙ 04/08/2022

Karaoker: Alignment-free singing voice synthesis with speech training data

Existing singing voice synthesis models (SVS) are usually trained on sin...

Panos Kakoulidis, et al.

research

∙ 04/07/2022

Self supervised learning for robust voice cloning

Voice cloning is a difficult task which requires robust and informative ...

Konstantinos Klapsas, et al.

research

∙ 04/06/2022

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis

In this work, we present the SOMOS dataset, the first large-scale mean o...

Georgia Maniati, et al.

research

∙ 11/19/2021

Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis

This paper presents a method for controlling the prosody at the phoneme ...

Alexandra Vioni, et al.

research

∙ 11/19/2021

Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control

This paper presents a method for phoneme-level prosody control of F0 and...

Myrsini Christidou, et al.

research

∙ 11/17/2021

Rapping-Singing Voice Synthesis based on Phoneme-level Prosody Control

In this paper, a text-to-rapping/singing system is introduced, which can...

Konstantinos Markopoulos, et al.

research

∙ 11/17/2021

Cross-lingual Low Resource Speaker Adaptation Using Phonological Features

The idea of using phonological features instead of phonemes as input to ...

Georgia Maniati, et al.

research

∙ 11/17/2021

High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency

This paper presents an end-to-end text-to-speech system with low latency...

Nikolaos Ellinas, et al.

