We present SPECTRON, a novel approach to adapting pre-trained language m...
We present a noisy channel generative model of two sequences, for exampl...
This work explores the task of synthesizing speech in nonexistent human-...
This paper introduces Parallel Tacotron 2, a non-autoregressive neural t...
We describe a sequence-to-sequence neural network which can directly gen...
Non-saturating generative adversarial network (GAN) training is widely u...
Despite the ability to produce human-level speech for in-domain text, at...
We present a novel generative model that combines state-of-the-art neura...
We present a multispeaker, multilingual text-to-speech (TTS) synthesis m...
Recent work has explored sequence-to-sequence latent variable models for...
Although end-to-end text-to-speech (TTS) models such as Tacotron have sh...
Global Style Tokens (GSTs) are a recently-proposed method to learn laten...
We present an extension to the Tacotron speech synthesis architecture th...
In this work, we propose "global style tokens" (GSTs), a bank of embeddi...
This paper describes Tacotron 2, a neural network architecture for speec...
Prosodic modeling is a core problem in speech synthesis. The key challen...
A text-to-speech synthesis system typically consists of multiple stages,...