Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features

by Fumiaki Sato, et al.

This study investigates unsupervised anomaly action recognition, which identifies video-level abnormal human behavior events without using abnormal samples, and simultaneously addresses three limitations of conventional skeleton-based approaches: target domain-dependent DNN training, limited robustness against skeleton errors, and a lack of normal samples. We present a unified, user prompt-guided zero-shot learning framework using a target domain-independent skeleton feature extractor, which is pretrained on a large-scale action recognition dataset. Specifically, during the training phase using normal samples, the method models the distribution of skeleton features of the normal actions while freezing the weights of the DNNs, and in the inference phase it estimates the anomaly score using this distribution. Additionally, to increase robustness against skeleton errors, we introduce a DNN architecture inspired by a point cloud deep learning paradigm, which sparsely propagates the features between joints. Furthermore, to prevent unobserved normal actions from being misidentified as abnormal actions, we incorporate into the anomaly score a similarity score between the user prompt embeddings and the skeleton features, both aligned in a common space, which indirectly supplements the normal actions. We conduct experiments on two publicly available datasets to test the effectiveness of the proposed method with respect to the above-mentioned limitations.
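The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the feature distribution is modeled or how the prompt similarity enters the score. The sketch assumes a single Gaussian with a Mahalanobis distance, a cosine similarity, and an additive combination in which the prompts describe abnormal actions, so higher similarity raises the score. All function names and the `weight` parameter are hypothetical, and the features are assumed to come from a frozen extractor and to share a space with the prompt embeddings.

```python
import numpy as np

def fit_normal_distribution(features, eps=1e-3):
    """Fit a Gaussian to (N, D) features of normal samples (assumed model)."""
    mean = features.mean(axis=0)
    # Regularize the covariance so that it is always invertible.
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(x, mean, cov_inv):
    """Distance of one feature vector from the fitted normal distribution."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

def cosine_similarity(a, b):
    """Cosine similarity between two vectors in the shared embedding space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def anomaly_score(x, mean, cov_inv, prompt_embeddings, weight=1.0):
    """Assumed combination: distance term plus weighted prompt similarity.

    Assumes the prompts describe abnormal actions; flip the sign of
    `weight` if the prompts describe normal actions instead.
    """
    distance = mahalanobis(x, mean, cov_inv)
    similarity = max(cosine_similarity(x, p) for p in prompt_embeddings)
    return distance + weight * similarity
```

In this sketch, training reduces to calling `fit_normal_distribution` once on the features of the normal samples, and inference calls `anomaly_score` per video-level feature. No DNN weights are updated, which mirrors the frozen-extractor setting described above.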



Syntactically Guided Generative Embeddings for Zero-Shot Skeleton Action Recognition

We introduce SynSE, a novel syntactically guided generative approach for...

Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling

This paper simultaneously addresses three limitations associated with co...

Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization

Zero-shot skeleton-based action recognition aims to recognize actions of...

RareAct: A video dataset of unusual interactions

This paper introduces a manually annotated video dataset of unusual acti...

Cross-Domain Video Anomaly Detection without Target Domain Adaptation

Most cross-domain unsupervised Video Anomaly Detection (VAD) works assum...

Action recognition by learning pose representations

Pose detection is one of the fundamental steps for the recognition of hu...

CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

For a long time, anomaly localization has been widely used in industries...
