Towards Interpretable Polyphonic Transcription with Invertible Neural Networks

09/04/2019
by Rainer Kelz, et al.

We explore a novel way of conceptualising the task of polyphonic music transcription, using so-called invertible neural networks. Invertible models unify both discriminative and generative aspects in one function, sharing one set of parameters. Introducing invertibility enables the practitioner to directly inspect what the discriminative model has learned, and exactly determine which inputs lead to which outputs. For the task of transcribing polyphonic audio into symbolic form, these models may be especially useful as they allow us to observe, for instance, to what extent the concept of single notes could be learned from a corpus of polyphonic music alone (which has been identified as a serious problem in recent research). This is an entirely new approach to audio transcription, which first of all necessitates some groundwork. In this paper, we begin by looking at the simplest possible invertible transcription model, and then thoroughly investigate its properties. Finally, we will take first steps towards a more sophisticated and capable version. We use the task of piano transcription, and specifically the MAPS dataset, as a basis for these investigations.
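To make the idea of a single function serving both directions concrete, here is a minimal sketch of one common invertible building block, an additive coupling layer. The abstract does not say which architecture the authors use, so this construction is an assumption, and the names `shift_net`, `forward`, `inverse`, and `weights` are hypothetical. The point is only that the forward (discriminative) and inverse (generative) passes share one set of parameters and that the input can be recovered from the output.

```python
import math
import random


def shift_net(x_a, weights):
    """Hypothetical sub-network: one tanh layer that maps x_a to a shift vector."""
    return [math.tanh(sum(w * v for w, v in zip(row, x_a))) for row in weights]


def forward(x, weights):
    """Additive coupling: pass the first half through, shift the second half."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    shift = shift_net(x_a, weights)
    return x_a + [b + s for b, s in zip(x_b, shift)]


def inverse(y, weights):
    """Undo the coupling with the same weights by subtracting the same shift."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    shift = shift_net(y_a, weights)
    return y_a + [b - s for b, s in zip(y_b, shift)]


random.seed(0)
dim = 6  # toy input size, chosen for illustration only
half = dim // 2
weights = [[random.uniform(-1, 1) for _ in range(half)] for _ in range(half)]

x = [random.uniform(0, 1) for _ in range(dim)]  # stand-in for an input frame
y = forward(x, weights)                         # "discriminative" direction
x_rec = inverse(y, weights)                     # "generative" direction

# Largest reconstruction error; only floating-point rounding should remain.
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
```

Because the shift is computed only from the half that passes through unchanged, the inverse never has to invert `shift_net` itself, which is why that sub-network can be arbitrarily complex.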
