Non-Volume Preserving-based Feature Fusion Approach to Group-Level Expression Recognition on Crowd Videos

by Kha Gia Quach, et al.

Group-level emotion recognition (ER) is a growing research area, as the demand for assessing crowds of all sizes is becoming of interest in both the security arena and social media. This work investigates group-level expression recognition on crowd videos, where information is aggregated not only across a variable-length sequence of frames but also over the set of faces within each frame to produce aggregated recognition results. In this paper, we propose an effective deep feature-level fusion mechanism, Non-Volume Preserving fusion (NVPF), to model the spatial-temporal information in crowd videos. Furthermore, we extend the proposed NVP fusion mechanism to a temporal NVP fusion (TNVPF) approach to learn the temporal information between frames. To demonstrate the robustness and effectiveness of each component in the proposed approach, three experiments were conducted: (i) an evaluation on the AffectNet database to benchmark the proposed emoNet for recognizing facial expressions; (ii) an evaluation on EmotiW2018 to benchmark the proposed deep feature-level fusion mechanism, NVPF; and (iii) an evaluation of the proposed TNVPF on a new Group-level Emotion on Crowd Videos (GECV) dataset composed of 627 videos collected from social media. The GECV dataset is a collection of videos, 10 to 20 seconds in duration, of crowds of twenty or more subjects; each video is labeled as positive, negative, or neutral.
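The two-level aggregation described in the abstract (over the faces in each frame, then over the frames of a video) can be illustrated with a minimal sketch. This is not the NVPF or TNVPF mechanism: plain mean pooling is swapped in as a stand-in for the learned fusion, and the names `mean_pool` and `aggregate_video`, as well as the toy 2-D features, are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equal-length feature vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty set of vectors")
    return [fmean(column) for column in zip(*vectors)]


def aggregate_video(frames: Sequence[Sequence[Sequence[float]]]) -> Vector:
    """Pool faces within each frame, then pool the per-frame vectors over time.

    Mean pooling is an assumption used for illustration only; the paper
    proposes a learned fusion instead. Frames without faces are skipped.
    """
    frame_features = [mean_pool(faces) for faces in frames if faces]
    return mean_pool(frame_features)


# Hypothetical face features: two faces, three faces, then a frame with none.
video = [
    [[1.0, 0.0], [3.0, 2.0]],
    [[0.0, 0.0], [0.0, 3.0], [6.0, 3.0]],
    [],
]
print(aggregate_video(video))  # [2.0, 1.5]
```

Because both pooling steps accept any number of inputs, the sketch handles a variable number of faces per frame and a variable number of frames per video, which is the setting the abstract describes.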




Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos

With the widespread of user-generated Internet videos, emotion recogniti...

A Novel Apex-Time Network for Cross-Dataset Micro-Expression Recognition

The automatic recognition of micro-expression has been boosted ever sinc...

Spatiotemporal Modeling for Crowd Counting in Videos

Region of Interest (ROI) crowd counting can be formulated as a regressio...

Human-Centered Emotion Recognition in Animated GIFs

As an intuitive way of expression emotion, the animated Graphical Interc...

An Occam's Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets

This paper presents a light-weight and accurate deep neural model for au...

Group-Level Emotion Recognition Using a Unimodal Privacy-Safe Non-Individual Approach

This article presents our unimodal privacy-safe and non-individual propo...

Improving Word Recognition in Speech Transcriptions by Decision-level Fusion of Stemming and Two-way Phoneme Pruning

We introduce an unsupervised approach for correcting highly imperfect sp...
