Automatic methods to predict Mean Opinion Score (MOS) of listeners have ...
Cross-lingual synthesis can be defined as the task of letting a speaker ...
Recent advances in neural TTS have led to models that can produce high-q...
This paper describes the initial steps towards the design of a robotic s...
In this paper, we propose to use deep 3-dimensional convolutional networ...
One of the challenges in Speech Emotion Recognition (SER) "in the wild" ...