Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection

by Hang Zhou, et al.

Learning discriminative features that effectively separate abnormal events from normal ones is crucial for weakly supervised video anomaly detection (WS-VAD). Existing approaches, whether oriented to video-level or segment-level labels, mainly focus on extracting representations for anomaly data while neglecting the implications of normal data. We observe that such a scheme is sub-optimal and may yield a higher false alarm rate: to better distinguish anomalies, one needs to understand what a normal state is. To address this issue, we propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data. Specifically, inspired by the traditional global and local structure of graph convolutional networks, we introduce a Global and Local Multi-Head Self Attention (GL-MHSA) module for the Transformer network to obtain more expressive embeddings for capturing associations in videos. Then, we use two memory banks, one of which is an additional abnormal memory for tackling hard samples, to store and separate abnormal and normal prototypes and to maximize the margin between the two representations. Finally, we propose an uncertainty learning scheme to learn a latent space for normal data that is robust to noise from camera switching, object changing, scene transforming, etc. Extensive experiments on the XD-Violence and UCF-Crime datasets demonstrate that our method outperforms state-of-the-art methods by a sizable margin.
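
To make the dual-memory idea more concrete, the following is a minimal NumPy sketch of how segment features might query two prototype banks. It is not the authors' implementation: the abstract does not specify the read operation, so the softmax dot-product attention, the helper name `memory_read`, the random stand-in data, and all sizes (`T`, `D`, `K`) are assumptions made purely for illustration.

```python
import numpy as np

def memory_read(features, memory):
    """Hypothetical memory read: attend from segment features (T, D)
    to memory prototypes (K, D) with softmax dot-product attention."""
    d = features.shape[1]
    logits = features @ memory.T / np.sqrt(d)        # (T, K) similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)    # each row sums to 1
    return weights @ memory, weights                 # (T, D) read-out, (T, K)

# Illustrative sizes only: 8 video segments, 16-dim features, 4 prototypes per bank.
rng = np.random.default_rng(0)
T, D, K = 8, 16, 4
features = rng.normal(size=(T, D))         # stand-in for segment embeddings
normal_memory = rng.normal(size=(K, D))    # stand-in for learned normal prototypes
abnormal_memory = rng.normal(size=(K, D))  # stand-in for learned abnormal prototypes

normal_read, normal_w = memory_read(features, normal_memory)
abnormal_read, abnormal_w = memory_read(features, abnormal_memory)

# One possible way to combine the read-outs with the original features
# (an assumption, not taken from the abstract).
augmented = np.concatenate([features, normal_read, abnormal_read], axis=1)
```

In a trained model the two banks would be learnable parameters, and a loss would push normal segments toward the normal prototypes and away from the abnormal ones; the sketch only shows the forward read.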




Weakly-supervised Video Anomaly Detection with Contrastive Learning of Long and Short-range Temporal Features

In this paper, we address the problem of weakly-supervised video anomaly...

Towards Open Set Video Anomaly Detection

Open Set Video Anomaly Detection (OpenVAD) aims to identify abnormal eve...

Learning Weakly Supervised Audio-Visual Violence Detection in Hyperbolic Space

In recent years, the task of weakly supervised audio-visual violence det...

Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection

Weakly Supervised Video Anomaly Detection (WSVAD) is challenging because...

MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection

Weakly supervised video anomaly detection (WS-VAD) is to distinguish ano...

Discriminative-Generative Dual Memory Video Anomaly Detection

Recently, people tried to use a few anomalies for video anomaly detectio...

Real-world Video Anomaly Detection by Extracting Salient Features in Videos

We propose a lightweight and accurate method for detecting anomalies in ...
