Action Recognition with Image Based CNN Features

12/13/2015
by   Mahdyar Ravanbakhsh, et al.

Most human actions consist of complex temporal compositions of simpler actions. Action recognition methods usually rely on complex handcrafted structures as features to represent the human action model. Convolutional Neural Networks (CNNs) have proven to be a powerful tool that eliminates the need to design handcrafted features. Usually, the output of the last layer before the classification layer of a CNN (known as fc7) is used as a generic feature for images. In this paper, we show that fc7 features on their own do not perform well on the task of action recognition when the network is trained only on images. We present a feature structure on top of fc7 features that can capture the temporal variation in a video. To represent the temporal components, which are needed to capture motion information, we introduce a hierarchical structure. The hierarchical model makes it possible to capture sub-actions from a complex action: the higher levels of the hierarchy give a coarse representation of the action sequence, and the lower levels represent fine action elements. Furthermore, we introduce a method for extracting key-frames using a binary coding of each frame in a video, which helps to improve the performance of our hierarchical model. We evaluate our method on several action datasets and show that it achieves superior results compared to other state-of-the-art methods.
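
The abstract does not give the details of the hierarchy or of the binary coding, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each frame is already represented by an fc7-style feature vector, that level k of the hierarchy splits the video into 2**k contiguous segments, that segments are summarized by mean pooling, that a frame's binary code is obtained by thresholding its features, and that a frame is kept as a key-frame when its code differs from the last kept code. All function names and parameters (`temporal_pyramid`, `select_key_frames`, `levels`, `min_bits`, `threshold`) are hypothetical.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def mean_pool(frames: Sequence[Vector]) -> Vector:
    """Average a non-empty run of per-frame feature vectors element-wise."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def temporal_pyramid(frames: Sequence[Vector], levels: int = 3) -> Vector:
    """Concatenate pooled segment features from coarse to fine.

    Assumption: level k splits the video into 2**k contiguous segments and
    each segment is mean-pooled. The abstract specifies neither choice.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    descriptor: Vector = []
    for level in range(levels):
        parts = 2 ** level
        for i in range(parts):
            start = i * n // parts
            # Guarantee at least one frame per segment for very short videos.
            end = max((i + 1) * n // parts, start + 1)
            descriptor.extend(mean_pool(frames[start:end]))
    return descriptor


def binary_code(feature: Vector, threshold: float = 0.0) -> List[int]:
    """Binarize a feature vector by thresholding (an assumed coding scheme)."""
    return [1 if value > threshold else 0 for value in feature]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def select_key_frames(
    frames: Sequence[Vector], min_bits: int = 1, threshold: float = 0.0
) -> List[int]:
    """Return indices of frames whose code differs from the last kept code.

    Assumption: a change of at least `min_bits` bits marks a new key-frame.
    """
    kept: List[int] = []
    last_code: Optional[List[int]] = None
    for index, feature in enumerate(frames):
        code = binary_code(feature, threshold)
        if last_code is None or hamming(code, last_code) >= min_bits:
            kept.append(index)
            last_code = code
    return kept
```

With four toy two-dimensional "frames" such as `[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]`, a two-level pyramid yields one pooled vector for the whole clip followed by one for each half, and the key-frame selector keeps the first frame of each distinct binary pattern. In a real pipeline the key-frame indices could be used to subsample the frames before building the pyramid, though the abstract does not say how the two steps are combined.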


