RECAP: Retrieval Augmented Music Captioner

12/21/2022
by Zihao He, et al.

With the prevalence of streaming media platforms serving music search and recommendation, interpreting music by understanding audio and lyrics interactively has become an important and challenging task. However, many previous works focus on refining individual components of the encoder-decoder architecture that maps music to caption tokens, ignoring the potential use of the correspondence between audio and lyrics. In this paper, we propose to explicitly learn the multi-modal alignment with retrieval augmentation through contrastive learning. By learning audio-lyrics correspondence, the model is guided to learn better cross-modal attention weights and thus to generate high-quality caption words. We provide both theoretical and empirical results that demonstrate the advantage of the proposed method.
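
The abstract does not specify which contrastive objective is used, so the following is only an illustrative sketch of how audio-lyrics correspondence could be learned, assuming a symmetric InfoNCE-style loss over a batch of paired embeddings. The function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(audio_emb, lyrics_emb, temperature=0.07):
    """Assumed contrastive loss: pair i in audio_emb matches pair i in lyrics_emb.

    Every other item in the batch acts as a negative example.
    """
    audio = [l2_normalize(v) for v in audio_emb]
    lyrics = [l2_normalize(v) for v in lyrics_emb]
    # Cosine similarity between every audio clip and every lyrics passage.
    logits = [
        [sum(a * b for a, b in zip(a_vec, l_vec)) / temperature for l_vec in lyrics]
        for a_vec in audio
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    audio_to_lyrics = cross_entropy_diagonal(logits)
    lyrics_to_audio = cross_entropy_diagonal(logits_transposed)
    return 0.5 * (audio_to_lyrics + lyrics_to_audio)


# Toy batch of two songs (hypothetical embeddings).
audio = [[1.0, 0.0], [0.0, 1.0]]
lyrics_aligned = [[0.9, 0.1], [0.1, 0.9]]   # lyrics i resembles audio i
lyrics_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = symmetric_info_nce(audio, lyrics_aligned)
loss_swapped = symmetric_info_nce(audio, lyrics_swapped)
print(loss_aligned < loss_swapped)
```

Under these assumptions, the loss is lower when each audio embedding is closest to its own lyrics embedding than when the pairs are mismatched, which is the kind of signal that could guide the alignment described in the abstract.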

research
09/21/2023

Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems

Linking sheet music images to audio recordings remains a key problem for...
research
03/14/2023

Improving Music Genre Classification from multi-modal properties of music and genre correlations Perspective

Music genre classification has been widely studied in past few years for...
research
08/25/2022

Contrastive Audio-Language Learning for Music

As one of the most intuitive interfaces known to humans, natural languag...
research
09/21/2023

Towards Robust and Truly Large-Scale Audio-Sheet Music Retrieval

A range of applications of multi-modal music information retrieval is ce...
research
04/30/2021

Cross-Modal Music-Video Recommendation: A Study of Design Choices

In this work, we study music/video cross-modal recommendation, i.e. reco...
research
08/24/2022

Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model

Lyric interpretations can help people understand songs and their lyrics ...
research
01/20/2023

Screen Correspondence: Mapping Interchangeable Elements between UIs

Understanding user interface (UI) functionality is a useful yet challeng...
