Action Representation Using Classifier Decision Boundaries

04/06/2017
by   Jue Wang, et al.
0

Most popular deep learning based models for action recognition are designed to generate separate predictions within their short temporal windows, which are often aggregated by heuristic means to assign an action label to the full video segment. Given that not all frames from a video characterize the underlying action, pooling schemes that impose equal importance to all frames might be unfavorable. In an attempt towards tackling this challenge, we propose a novel pooling scheme, dubbed SVM pooling, based on the notion that among the bag of features generated by a CNN on all temporal windows, there is at least one feature that characterizes the action. To this end, we learn a decision hyperplane that separates this unknown yet useful feature from the rest. Applying multiple instance learning in an SVM setup, we use the parameters of this separating hyperplane as a descriptor for the video. Since these parameters are directly related to the support vectors in a max-margin framework, they serve as robust representations for pooling of the CNN features. We devise a joint optimization objective and an efficient solver that learns these hyperplanes per video and the corresponding action classifiers over the hyperplanes. Showcased experiments on the standard HMDB and UCF101 datasets demonstrate state-of-the-art performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/26/2018

Video Representation Learning Using Discriminative Pooling

Popular deep models for action recognition in videos generate independen...
research
09/05/2019

Discriminative Video Representation Learning Using Support Vector Classifiers

Most popular deep models for action recognition in videos generate indep...
research
01/19/2017

Higher-order Pooling of CNN Features via Kernel Linearization for Action Recognition

Most successful deep learning algorithms for action recognition extend m...
research
04/07/2017

Generalized Rank Pooling for Activity Recognition

Most popular deep models for action recognition split video sequences in...
research
05/24/2017

Sequence Summarization Using Order-constrained Kernelized Feature Subspaces

Representations that can compactly and effectively capture temporal evol...
research
03/27/2018

Non-Linear Temporal Subspace Representations for Activity Recognition

Representations that can compactly and effectively capture the temporal ...
research
07/24/2018

Learning Discriminative Video Representations Using Adversarial Perturbations

Adversarial perturbations are noise-like patterns that can subtly change...

Please sign up or login with your details

Forgot password? Click here to reset