A Generative Model for Raw Audio Using Transformer Architectures

06/30/2021
by Prateek Verma, et al.

This paper proposes a novel way of doing audio synthesis at the waveform level using Transformer architectures. We propose a deep neural network for generating waveforms, similar to WaveNet. The model is fully probabilistic, auto-regressive, and causal, i.e. each generated sample depends only on the previously observed samples. Our approach outperforms a widely used WaveNet architecture by up to 9%. Using the attention mechanism, we enable the architecture to learn which audio samples are important for the prediction of the future sample. We show how causal Transformer generative models can be used for raw waveform synthesis. We also show that this performance can be improved by another 2% with samples over a wider context. The flexibility of the current model to synthesize audio from latent representations suggests a large number of potential applications. The novel approach of using generative Transformer architectures for raw audio synthesis is, however, still far from generating any meaningful music without latent codes or metadata to aid the generation process.
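The causal constraint described in the abstract can be illustrated with a minimal sketch of single-head self-attention with a causal mask, written in NumPy. This is not the authors' implementation: the function name, the embedding layout, the random weights, and the sizes T and d are all assumptions chosen for illustration. It only shows how a mask keeps each position from attending to later samples, and how the attention weights indicate which past samples contribute to a prediction.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention with a causal mask (illustrative sketch).

    x: (T, d) array, one embedding per audio sample (hypothetical layout).
    Returns (out, weights); weights[t, s] is how much step t attends to step s.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Mask out the future: position t may only look at positions s <= t.
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax, shifted for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy setup with assumed sizes and random projection matrices.
rng = np.random.default_rng(0)
T, d = 8, 4
x = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = causal_self_attention(x, w_q, w_k, w_v)

# Changing a later sample must leave all earlier outputs untouched.
x_changed = x.copy()
x_changed[5] += 10.0
out_changed, _ = causal_self_attention(x_changed, w_q, w_k, w_v)
```

In this sketch, perturbing the sample at index 5 changes the outputs at positions 5 and later, while the outputs at positions 0 to 4 stay the same. That is the property an auto-regressive generator needs in order to predict each sample from past samples only.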


research
11/06/2018

FloWaveNet : A Generative Flow for Raw Audio

Most of modern text-to-speech architectures use a WaveNet vocoder for sy...
research
11/09/2021

RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

Deep generative models applied to audio have improved by a large margin ...
research
06/16/2022

GoodBye WaveNet – A Language Model for Raw Audio with Context of 1/2 Million Samples

Modeling long-term dependencies for audio signals is a particularly chal...
research
11/26/2019

SchrödingeRNN: Generative Modeling of Raw Audio as a Continuously Observed Quantum State

We introduce SchrödingeRNN, a quantum inspired generative model for raw ...
research
05/24/2023

Sound Design Strategies for Latent Audio Space Explorations Using Deep Learning Architectures

The research in Deep Learning applications in sound and music computing ...
research
09/12/2016

WaveNet: A Generative Model for Raw Audio

This paper introduces WaveNet, a deep neural network for generating raw ...
research
10/12/2022

JukeDrummer: Conditional Beat-aware Audio-domain Drum Accompaniment Generation via Transformer VQ-VA

This paper proposes a model that generates a drum track in the audio dom...
