Event-related data conditioning for acoustic event classification

by Yuanbo Hou, et al.

Models based on diverse attention mechanisms have recently excelled in tasks related to acoustic event classification (AEC). Among them, self-attention is often used in audio-only tasks to help the model recognize different acoustic events. Self-attention relies on the similarity between time frames and uses global information from the whole segment to highlight specific features within a frame. In real life, information related to an acoustic event attenuates over time, so the frames immediately surrounding the event deserve more attention than temporally distant global information that may be unrelated to it. This paper shows that self-attention may over-enhance certain segments of audio representations and smooth out the boundaries between event representations and background noise. Hence, this paper proposes event-related data conditioning (EDC) for AEC. EDC works directly on spectrograms. The idea of EDC is to adaptively select the frame-related attention range based on acoustic features and to gather event-related local information to represent each frame. Experiments show that: 1) EDC outperforms spectrogram-based data augmentation methods, trainable feature weighting, and self-attention in both the original-size mode and the augmented mode; 2) EDC effectively gathers event-related local information and enhances the boundaries between events and backgrounds, improving AEC performance.
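The contrast drawn in the abstract, between global self-attention over all frames and an adaptively chosen local range, can be illustrated with a small toy sketch. This is not the paper's EDC algorithm, whose details are not given here. It assumes a spectrogram stored as a list of frame vectors, and it uses a hypothetical rule in which the range around a frame grows while neighbouring frames stay above a cosine-similarity threshold. The names `adaptive_range`, `local_conditioning`, `threshold`, and `max_radius` are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two frame vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def global_self_attention(frames):
    """Plain dot-product self-attention: every output frame mixes ALL frames."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    out = []
    for q in frames:
        scores = [sum(x * y for x, y in zip(q, k)) / scale for k in frames]
        m = max(scores)  # subtract the max for a numerically stable softmax
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([
            sum(w[j] * frames[j][d] for j in range(len(frames))) / z
            for d in range(dim)
        ])
    return out


def adaptive_range(frames, t, threshold=0.9, max_radius=5):
    """Hypothetical range rule (an assumption, not the paper's): extend left and
    right from frame t while the neighbour stays similar to frame t."""
    lo = t
    while (lo > 0 and t - (lo - 1) <= max_radius
           and cosine(frames[t], frames[lo - 1]) >= threshold):
        lo -= 1
    hi = t
    while (hi < len(frames) - 1 and (hi + 1) - t <= max_radius
           and cosine(frames[t], frames[hi + 1]) >= threshold):
        hi += 1
    return lo, hi


def local_conditioning(frames, threshold=0.9, max_radius=5):
    """Represent each frame by the mean of the frames in its adaptive range."""
    dim = len(frames[0])
    out = []
    for t in range(len(frames)):
        lo, hi = adaptive_range(frames, t, threshold, max_radius)
        n = hi - lo + 1
        out.append([
            sum(frames[j][d] for j in range(lo, hi + 1)) / n
            for d in range(dim)
        ])
    return out
```

On a toy input with three background frames, three event frames, and three more background frames, the global variant blends background energy into the event frames, while the local variant keeps each frame's range inside its own segment. That mirrors, in a simplified way, the boundary-smoothing concern raised in the abstract.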


Relation-guided acoustic scene classification aided with event embeddings

In real life, acoustic scenes and audio events are naturally correlated....

Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification

Most existing deep learning-based acoustic scene classification (ASC) ap...

CT-SAT: Contextual Transformer for Sequential Audio Tagging

Sequential audio event tagging can provide not only the type information...

Event Detection on Dynamic Graphs

Event detection is a critical task for timely decision-making in graph a...

CNN-based Discriminative Training for Domain Compensation in Acoustic Event Detection with Frame-wise Classifier

Domain mismatch is a noteworthy issue in acoustic event detection tasks,...

Content-based feature exploration for transparent music recommendation using self-attentive genre classification

Interpretation of retrieved results is an important issue in music recom...

Deep model with built-in self-attention alignment for acoustic echo cancellation

With recent research advances, deep learning models have become an attra...
