Weakly-supervised Visual Instrument-playing Action Detection in Videos

05/05/2018
by   Jen-Yu Liu, et al.

Instrument playing is among the most common scenes in music-related videos, which nowadays represent one of the largest sources of online videos. To understand instrument-playing scenes in videos, it is important to know which instruments are played, when they are played, and where the playing actions occur in the scene. While audio-based recognition of instruments has been widely studied, the visual aspect of musical instrument playing remains largely unaddressed in the literature. One of the main obstacles is the difficulty of collecting annotated data of action locations for training-based methods. To address this issue, we propose a weakly-supervised framework to find when and where instruments are played in videos. We propose to use two auxiliary models, a sound model and an object model, to supervise the training of the instrument-playing action model: the sound model provides temporal supervision, while the object model provides spatial supervision, so that together they supply temporal and spatial supervision simultaneously. The resulting model only needs to analyze the visual part of a music video to deduce which instruments are played, when, and where. We find that the proposed method significantly improves localization accuracy. We evaluate the proposed method both temporally and spatially on a small dataset (5,400 frames in total) that we manually annotated.
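The abstract describes a training scheme in which two frozen auxiliary models generate pseudo-labels for the visual action model: the sound model indicates when each instrument sounds, and the object model indicates where it appears. The sketch below shows one way such a training step could look in a PyTorch-style setup; the model interfaces, tensor shapes, pooling choice, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of one weakly-supervised training step, assuming a
# PyTorch-style setup. Model classes, tensor shapes, and the loss
# weighting are illustrative assumptions, not the paper's actual code.
import torch
import torch.nn.functional as F

def train_step(action_model, sound_model, object_model,
               frames, audio, optimizer, spatial_weight=1.0):
    """frames: (batch, time, channels, height, width) video frames
       audio:  (batch, audio_features) corresponding audio clip
       All models are assumed to output probabilities in [0, 1]."""
    with torch.no_grad():
        # Sound model: which instruments sound at each time step
        # -> temporal pseudo-labels, shape (batch, time, n_instruments)
        temporal_targets = sound_model(audio)
        # Object model: where each instrument appears in each frame
        # -> spatial pseudo-labels, shape (batch, time, n_instruments, H, W)
        spatial_targets = object_model(frames)

    # Action model predicts per-frame, per-location playing activations
    # from the visual stream only: (batch, time, n_instruments, H, W)
    action_maps = action_model(frames)

    # Temporal supervision: spatially pooled activations should match
    # the sound model's per-frame instrument predictions
    temporal_pred = action_maps.amax(dim=(-2, -1))
    loss_t = F.binary_cross_entropy(temporal_pred, temporal_targets)

    # Spatial supervision: activation maps should match the object model
    loss_s = F.binary_cross_entropy(action_maps, spatial_targets)

    loss = loss_t + spatial_weight * loss_s
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Max-pooling the activation maps over the spatial dimensions before comparing them with the sound model's output keeps the temporal supervision weak: it constrains when a playing action is detected without prescribing its exact location, which is guided separately by the object model.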


Related research

11/03/2018
Multitask learning for frame-level instrument recognition
For many music analysis problems, we need to know the presence of instru...

03/28/2023
Adaptive Background Music for a Fighting Game: A Multi-Instrument Volume Modulation Approach
This paper presents our work to enhance the background music (BGM) in Da...

07/07/2023
LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad
Launchpad is a musical instrument that allows users to create and perfor...

10/20/2019
Musical Instrument Playing Technique Detection Based on FCN: Using Chinese Bowed-Stringed Instrument as an Example
Unlike melody extraction and other aspects of music transcription, resea...

07/09/2019
An Attention Mechanism for Musical Instrument Recognition
While the automatic recognition of musical instruments has seen signific...

01/27/2020
Machine Learning for a Music Glove Instrument
A music glove instrument equipped with force sensitive, flex and IMU sen...

05/12/2022
Weakly-Supervised Action Detection Guided by Audio Narration
Videos are more well-organized curated data sources for visual concept l...
