Pose-Based Two-Stream Relational Networks for Action Recognition in Videos

05/22/2018
by Wei Wang, et al.

Pose-based action recognition has recently attracted growing attention because it outperforms traditional appearance-based methods. Two problems remain, however. First, existing pose-based methods generally recognize actions from captured 3D human poses, which are very difficult to obtain in real scenarios. Second, few pose-based methods model action-related objects, even though objects play an important role in human-object interaction actions. To address both problems, we propose a pose-based two-stream relational network (PSRN) for action recognition. In PSRN, one stream models the temporal dynamics of the target 2D human pose sequences, which are extracted directly from raw videos, and the other stream models the action-related objects in a randomly sampled video frame. Most importantly, instead of fusing the two streams at the class score layer as in previous work, we propose a pose-object relational network that models the relationship between human poses and action-related objects. We evaluate PSRN on two challenging benchmarks, Sub-JHMDB and PennAction. Experimental results show that PSRN achieves state-of-the-art performance on Sub-JHMDB (80.2 [...] door to action recognition by combining 2D human pose extracted from raw video and image appearance.
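
The abstract does not spell out how the pose-object relational network is built. As a rough illustration only, the sketch below shows one generic way a relation module can fuse two feature streams: it forms every (pose, object) feature pair, applies a shared transform to each pair, sums the results, and maps the pooled vector to class scores. Everything here is an assumption made for illustration and not the authors' implementation: the function name `pose_object_relation`, the single-layer transforms `w_g` and `w_f`, the feature dimensions, the number of object features, and the class count are all hypothetical.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def pose_object_relation(pose_feats, object_feats, w_g, w_f):
    """Generic pairwise relation module (illustrative, not the paper's design).

    pose_feats:   (T, Dp) one hypothetical feature vector per pose time step
    object_feats: (N, Do) hypothetical object features from one sampled frame
    w_g:          (Dp + Do, H) shared per-pair transform
    w_f:          (H, C) maps the pooled relation vector to class scores
    """
    num_steps = pose_feats.shape[0]
    num_objects = object_feats.shape[0]

    # Build every (pose, object) pair: shape (T, N, Dp + Do).
    pose_tiled = np.repeat(pose_feats[:, None, :], num_objects, axis=1)
    obj_tiled = np.repeat(object_feats[None, :, :], num_steps, axis=0)
    pairs = np.concatenate([pose_tiled, obj_tiled], axis=-1)

    # Apply the shared transform to each pair, then sum over all pairs.
    relations = relu(pairs @ w_g)          # (T, N, H)
    pooled = relations.sum(axis=(0, 1))    # (H,)

    # Map the pooled relation vector to unnormalized class scores.
    return pooled @ w_f                    # (C,)


# Hypothetical sizes chosen only so that the example runs.
rng = np.random.default_rng(0)
num_classes = 12
pose_feats = rng.normal(size=(8, 16))      # 8 time steps, 16-d pose features
object_feats = rng.normal(size=(5, 32))    # 5 object features, 32-d each
w_g = rng.normal(size=(16 + 32, 64)) * 0.1
w_f = rng.normal(size=(64, num_classes)) * 0.1

scores = pose_object_relation(pose_feats, object_feats, w_g, w_f)
print(scores.shape)
```

Because the pairwise outputs are summed, the scores in this sketch do not depend on the order in which object features are listed. The contrast with fusing at the class score layer is that the two streams interact at the feature level before any class scores are computed.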

research
12/16/2019

Mimetics: Towards Understanding Human Actions Out of Context

Recent methods for video action recognition have reached outstanding per...
research
07/13/2020

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

Most current action recognition methods heavily rely on appearance infor...
research
09/06/2022

Real-Time Cattle Interaction Recognition via Triple-stream Network

In stockbreeding of beef cattle, computer vision-based approaches have b...
research
07/12/2020

Two-Stream Deep Feature Modelling for Automated Video Endoscopy Data Analysis

Automating the analysis of imagery of the Gastrointestinal (GI) tract ca...
research
09/08/2019

Multi-Modal Three-Stream Network for Action Recognition

Human action recognition in video is an active yet challenging research ...
research
09/28/2020

PERF-Net: Pose Empowered RGB-Flow Net

In recent years, many works in the video action recognition literature h...
research
10/29/2018

ActionXPose: A Novel 2D Multi-view Pose-based Algorithm for Real-time Human Action Recognition

We present ActionXPose, a novel 2D pose-based algorithm for posture-leve...
