Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Surveillance Videos

by Mamshad Nayeem Rizve, et al.

Activity detection in surveillance videos is a difficult problem due to multiple factors such as a large field of view, the presence of multiple activities, varying scales and viewpoints, and the untrimmed nature of the videos. Existing research in activity detection focuses mainly on datasets such as UCF-101, JHMDB, THUMOS, and AVA, which only partially address these issues. The requirement of processing surveillance videos in real time makes the problem even more challenging. In this work, we propose Gabriella, a real-time online system for activity detection in untrimmed surveillance videos. The proposed method consists of three stages: tubelet extraction, activity classification, and online tubelet merging. For tubelet extraction, we propose a localization network that takes a video clip as input and spatio-temporally detects potential foreground regions at multiple scales to generate action tubelets. We propose a novel Patch-Dice loss to handle large variations in actor size. Processing videos online at the clip level drastically reduces the computation time needed to detect activities. The detected tubelets are assigned activity class scores by the classification network and merged using our proposed Tubelet-Merge Action-Split (TMAS) algorithm to form the final action detections. The TMAS algorithm efficiently connects the tubelets in an online fashion to generate action detections that are robust to activities of varying length. We perform our experiments on the VIRAT and MEVA (Multiview Extended Video with Activities) datasets and demonstrate the effectiveness of the proposed approach in terms of speed (~100 fps) and performance, with state-of-the-art results. The code and models will be made publicly available.
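To make the idea of online tubelet merging more concrete, the following is a minimal sketch of greedy clip-by-clip linking based on box overlap. It is not the TMAS algorithm, whose details are not given in the abstract. The names `Track` and `merge_online`, the single-box representation of a tubelet, and the IoU threshold of 0.5 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed format
Detection = Tuple[Box, float]            # (box, class score) for one tubelet


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    """Hypothetical container for a chain of linked tubelets."""
    start_clip: int
    end_clip: int
    box: Box  # box of the most recently linked tubelet
    scores: List[float] = field(default_factory=list)


def merge_online(clips: Iterable[List[Detection]],
                 iou_threshold: float = 0.5) -> List[Track]:
    """Link detections clip by clip, using only the current and past clips."""
    active: List[Track] = []
    finished: List[Track] = []
    for t, detections in enumerate(clips):
        unmatched = list(detections)
        still_active: List[Track] = []
        for track in active:
            # Greedily pick the unmatched detection that overlaps this track most.
            best = max(unmatched, key=lambda d: iou(track.box, d[0]), default=None)
            if best is not None and iou(track.box, best[0]) >= iou_threshold:
                unmatched.remove(best)
                track.end_clip = t
                track.box = best[0]
                track.scores.append(best[1])
                still_active.append(track)
            else:
                # No continuation in this clip: close the track.
                finished.append(track)
        # Every detection left over starts a new track.
        for box, score in unmatched:
            still_active.append(Track(t, t, box, [score]))
        active = still_active
    return finished + active


if __name__ == "__main__":
    clips = [
        [((0, 0, 10, 10), 0.9)],
        [((1, 1, 11, 11), 0.8)],
        [((50, 50, 60, 60), 0.7)],
    ]
    for track in merge_online(clips):
        print(track.start_clip, track.end_clip, track.scores)
```

Because each clip is consumed once and only the active tracks are kept in memory, a scheme of this kind can run while the video is still streaming. A real system would also need a way to split merged tracks into separate actions, which this sketch leaves out.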




