GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition

by Bin Sun, et al.

Zero-shot action recognition can recognize samples of unseen classes that are unavailable in training by exploring a common latent semantic representation across samples. However, most methods neglect the connotative and extensional relations between action classes, which leads to poor generalization in zero-shot learning. Furthermore, the learned classifier is inclined to assign samples to seen classes, which leads to poor classification performance. To solve these problems, we propose a two-stage deep neural network for zero-shot action recognition, which consists of a feature generation sub-network serving as the sampling stage and a graph attention sub-network serving as the classification stage. In the sampling stage, we use a generative adversarial network (GAN), trained on action features and word vectors of seen classes, to synthesize action features of unseen classes, which balances the training data between seen and unseen classes. In the classification stage, we construct a knowledge graph (KG) based on the relationships between the word vectors of action classes and related objects, and propose an attention-based graph convolution network (GCN) that dynamically updates the relationships between action classes and objects and enhances the generalization ability of zero-shot learning. In both stages, we use word vectors as bridges for feature generation and classifier generalization from seen classes to unseen classes. We compare our method with state-of-the-art methods on the UCF101 and HMDB51 datasets. Experimental results show that our proposed method improves the classification performance of the trained classifier and achieves higher accuracy.
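The two stages described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the networks are untrained toy layers, the function names (`synthesize_features`, `attention_graph_conv`), all dimensions, and the small graph are hypothetical, and the use of each action node's GCN output as a linear classifier weight is an assumption the abstract does not confirm. The sketch only shows how word vectors can condition feature generation and how attention can reweight edges between action and object nodes.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_features(word_vec, n_samples, W, b, noise_dim):
    """Toy conditional generator (assumed form): noise + class word vector -> feature."""
    z = rng.standard_normal((n_samples, noise_dim))
    cond = np.tile(word_vec, (n_samples, 1))
    x = np.concatenate([z, cond], axis=1)
    return np.maximum(x @ W + b, 0.0)  # single ReLU layer

def attention_graph_conv(H, A, W, a):
    """One attention-weighted graph convolution step.

    H: (N, d_in) node embeddings (word vectors of actions and objects)
    A: (N, N) 0/1 adjacency matrix that includes self-loops
    W: (d_in, d_out) projection; a: (2 * d_out,) attention vector
    """
    Z = H @ W
    d_out = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]), split into source and target parts
    e = (Z @ a[:d_out])[:, None] + (Z @ a[d_out:])[None, :]
    e = np.where(e > 0, e, 0.2 * e)
    e = np.where(A > 0, e, -np.inf)          # attend only along graph edges
    e = e - e.max(axis=1, keepdims=True)     # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z, alpha

# Hypothetical sizes: 3 action nodes (index 2 plays the unseen class), 2 object nodes.
word_dim, noise_dim, feat_dim, n_actions, n_nodes = 8, 4, 16, 3, 5
word_vecs = rng.standard_normal((n_nodes, word_dim))

# Sampling stage: synthesize features for the unseen class from its word vector.
W_g = rng.standard_normal((noise_dim + word_dim, feat_dim)) * 0.1
fake = synthesize_features(word_vecs[2], 10, W_g, np.zeros(feat_dim), noise_dim)

# Classification stage: propagate over an action-object graph with attention.
A = np.eye(n_nodes)
for i, j in [(0, 3), (1, 3), (1, 4), (2, 4)]:
    A[i, j] = A[j, i] = 1.0
W_c = rng.standard_normal((word_dim, feat_dim)) * 0.1
a = rng.standard_normal(2 * feat_dim)
node_out, alpha = attention_graph_conv(word_vecs, A, W_c, a)

# Assumption: action-node outputs act as linear classifier weights.
scores = fake @ node_out[:n_actions].T
pred = scores.argmax(axis=1)
```

In a trained system the generator would be learned adversarially on seen-class features, and the synthesized unseen-class features would be mixed with real seen-class features when fitting the graph-based classifier; neither training loop is shown here.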




CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition

Zero-shot action recognition is the task of recognizing action classes w...

Out-of-Distribution Detection for Generalized Zero-Shot Action Recognition

Generalized zero-shot action recognition is a challenging problem, where...

Improving Accuracy of Zero-Shot Action Recognition with Handcrafted Features

With the development of machine learning, datasets for models are gettin...

Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning

Few-shot learning aims to classify unseen classes with a few training ex...

TGG: Transferable Graph Generation for Zero-shot and Few-shot Learning

Zero-shot and few-shot learning aim to improve generalization to unseen ...

Towards Context-aware Interaction Recognition

Recognizing how objects interact with each other is a crucial task in vi...

Segmenting 3D Hybrid Scenes via Zero-Shot Learning

This work is to tackle the problem of point cloud semantic segmentation ...
