Neural text-to-speech systems are often optimized on L1/L2 losses, which...
We present a scalable method to produce high quality emphasis for text-t...
We present eCat, a novel end-to-end multispeaker model capable of: a) ge...
Generating expressive and contextually appropriate prosody remains a cha...
Duration modelling has become an important research problem once more wi...
In this paper, we present CopyCat2 (CC2), a novel model capable of: a) s...
We propose a novel Multi-Scale Spectrogram (MSS) modelling approach to s...
Many factors influence speech yielding different renditions of a given s...
In this paper, we introduce Kathaka, a model trained with a novel two-st...
The objective of this paper is to rectify any monocular image by computi...