Making Reconstruction-based Method Great Again for Video Anomaly Detection

by Yizhou Wang, et al.

Anomaly detection in videos is a significant yet challenging problem. Previous approaches based on deep neural networks are either reconstruction-based or prediction-based. Nevertheless, existing reconstruction-based methods 1) rely on old-fashioned convolutional autoencoders and are poor at modeling temporal dependency; and 2) are prone to overfitting the training samples, which leads to indistinguishable reconstruction errors for normal and abnormal frames during inference. To address these issues, we first draw inspiration from the transformer and propose the Spatio-Temporal Auto-Trans-Encoder, dubbed STATE, a new autoencoder model for enhanced consecutive frame reconstruction. STATE is equipped with a specifically designed learnable convolutional attention module for efficient temporal learning and reasoning. Second, we put forward a novel reconstruction-based input perturbation technique, applied during testing, to further differentiate anomalous frames. With the same perturbation magnitude, the test-time reconstruction error of normal frames decreases more than that of abnormal frames, which helps mitigate the overfitting problem of reconstruction. Because frame abnormality is highly relevant to the objects in the frame, we conduct object-level reconstruction using both the raw frame and the corresponding optical flow patches. Finally, the anomaly score combines the raw and motion reconstruction errors computed on perturbed inputs. Extensive experiments on benchmark video anomaly detection datasets demonstrate that our approach outperforms previous reconstruction-based methods by a notable margin and consistently achieves state-of-the-art anomaly detection performance. The code is available at
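The abstract does not give the perturbation formula, so the following is only a minimal sketch of the test-time idea under stated assumptions. It assumes a single signed-gradient step that moves the input in the direction that lowers its own reconstruction error, and it uses a toy linear projection as a stand-in for the trained autoencoder (not STATE itself). All function names, the step size `eps`, and the weights `w_raw` and `w_flow` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reconstruct(x, u):
    """Toy stand-in for a trained autoencoder: project x onto the unit vector u."""
    c = dot(x, u)
    return [c * ui for ui in u]

def recon_error(x, u):
    """Squared L2 reconstruction error of x under the toy model."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(x, u)))

def sign(v):
    return (v > 0) - (v < 0)

def perturb(x, u, eps):
    """Assumed perturbation: one signed-gradient step that lowers the error.

    For this linear projection the gradient of the error with respect to x
    is 2 * (x - reconstruct(x)), so only the sign of the residual is needed.
    """
    residual = [xi - ri for xi, ri in zip(x, reconstruct(x, u))]
    return [xi - eps * sign(r) for xi, r in zip(x, residual)]

def anomaly_score(raw, flow, u_raw, u_flow, eps=0.01, w_raw=0.5, w_flow=0.5):
    """Weighted sum of raw and motion errors, both computed on perturbed inputs.

    The weighted sum and its weights are assumptions, not the paper's formula.
    """
    raw_err = recon_error(perturb(raw, u_raw, eps), u_raw)
    flow_err = recon_error(perturb(flow, u_flow, eps), u_flow)
    return w_raw * raw_err + w_flow * flow_err

# Usage: u spans the "normal" direction learned by the toy model.
u = [1 / math.sqrt(2), 1 / math.sqrt(2)]
normal_patch = [1.0, 0.8]     # close to the normal direction
abnormal_patch = [1.0, -1.0]  # orthogonal to it
normal_score = anomaly_score(normal_patch, normal_patch, u, u)
abnormal_score = anomaly_score(abnormal_patch, abnormal_patch, u, u)
```

With a real network the gradient would come from automatic differentiation rather than a closed form, and the patches would be object-level crops of the frame and its optical flow.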




A Hybrid Video Anomaly Detection Framework via Memory-Augmented Flow Reconstruction and Flow-Guided Frame Prediction

In this paper, we propose $\text{HF}^2$-VAD, a Hybrid framework that int...

Anomaly Detection using Deep Reconstruction and Forecasting for Autonomous Systems

We propose self-supervised deep algorithms to detect anomalies in hetero...

Normal Learning in Videos with Attention Prototype Network

Frame reconstruction (current or future frame) based on Auto-Encoder (AE...

Multi-Contextual Predictions with Vision Transformer for Video Anomaly Detection

Video Anomaly Detection(VAD) has been traditionally tackled in two main ...

Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction

Video anomaly detection is commonly used in many applications such as se...

Towards Optimal Use of Exception Handling Information for Function Detection

Function entry detection is critical for security of binary code. Conven...

Video Abnormal Event Detection by Learning to Complete Visual Cloze Tests

Video abnormal event detection (VAD) is a vital semi-supervised task tha...
