Blind Source Separation in Polyphonic Music Recordings Using Deep Neural Networks Trained via Policy Gradients

07/09/2021
by   Sören Schulze, et al.

We propose a method for the blind separation of sounds of musical instruments in audio signals. We describe the individual tones via a parametric model and train a dictionary to capture the relative amplitudes of the harmonics. The model parameters are predicted by a U-Net, a type of deep neural network. The network is trained without ground-truth information, based on the difference between the model prediction and the individual time frames of the short-time Fourier transform. Since some of the model parameters do not yield a useful backpropagation gradient, we model them stochastically and employ the policy gradient instead. To provide phase information and account for inaccuracies in the dictionary-based representation, we also let the network output a direct prediction, which we then use to resynthesize the audio signals for the individual instruments. Owing to the flexibility of the neural network, inharmonicity can be incorporated seamlessly, and no preprocessing of the input spectra is required. Our algorithm yields high-quality separation results with particularly low interference on a variety of audio samples, both acoustic and synthetic, provided that the sample contains enough data for training and that the spectral characteristics of the musical instruments are stable enough to be approximated by the dictionary.
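The abstract's key trick is replacing backpropagation with the policy gradient (REINFORCE) for parameters that are not usefully differentiable, by treating them as samples from a learned distribution. The following is a minimal sketch of that idea on a toy problem, not the paper's actual code: a discrete parameter (think of a quantized pitch index) is drawn from a categorical distribution, a reward scores the resulting fit, and the log-probability gradient updates the distribution. All names and the reward function are illustrative assumptions.

```python
# REINFORCE sketch: learn a distribution over a discrete, non-differentiable
# parameter by weighting the log-probability gradient with a reward.
import numpy as np

rng = np.random.default_rng(0)

def reward(pitch_index):
    # Toy stand-in for "how well does this discrete choice fit the spectrum":
    # only index 3 matches the (fictitious) observed fundamental.
    return 1.0 if pitch_index == 3 else 0.0

logits = np.zeros(8)   # unnormalized log-probabilities over 8 candidate pitches
lr = 0.5               # learning rate

for step in range(200):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    k = rng.choice(8, p=probs)        # sampling is the non-differentiable step
    r = reward(k)
    # Gradient of log probs[k] w.r.t. the logits: one_hot(k) - probs
    grad_logp = -probs
    grad_logp[k] += 1.0
    logits += lr * r * grad_logp      # REINFORCE update: reward-weighted ascent

best = int(np.argmax(logits))
```

After training, the policy concentrates its mass on the rewarded index, even though no gradient ever flowed through the sampling step itself. In the paper's setting, the reward would come from the spectral model's fit to an STFT frame rather than a hand-coded indicator.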


Related research

06/14/2020 · Solos: A Dataset for Audio-Visual Music Analysis
In this paper, we present a new dataset of music performance videos whic...

06/01/2018 · Musical Instrument Separation on Shift-Invariant Spectrograms via Stochastic Dictionary Learning
We propose a method for the blind separation of audio signals from music...

05/03/2022 · Few-Shot Musical Source Separation
Deep learning-based approaches to musical source separation are often li...

10/22/2019 · Simultaneous Separation and Transcription of Mixtures with Multiple Polyphonic and Percussive Instruments
We present a single deep learning architecture that can both separate an...

06/08/2018 · Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Models for audio source separation usually operate on the magnitude spec...

10/23/2019 · Bootstrapping deep music separation from primitive auditory grouping principles
Separating an audio scene such as a cocktail party into constituent, mea...

09/20/2021 · Acoustic Echo Cancellation using Residual U-Nets
This paper presents an acoustic echo canceler based on a U-Net convoluti...
