WiFE: WiFi and Vision based Intelligent Facial-Gesture Emotion Recognition

by Yu Gu et al.

Emotion is an essential part of Artificial Intelligence (AI) and human mental health. Current emotion recognition research mainly focuses on a single modality (e.g., facial expression), while human emotion expression is multi-modal in nature. In this paper, we propose a hybrid emotion recognition system leveraging two emotion-rich and tightly coupled modalities: facial expression and body gesture. Unbiased and fine-grained facial expression and gesture recognition, however, remains a major challenge. To this end, unlike prior work relying on contact or even invasive sensors, we explore commodity WiFi signals for device-free and contactless gesture recognition, while adopting a vision-based approach for facial expression recognition. This design poses two challenges: how to improve the sensitivity of WiFi signals, and how to process the large-volume, heterogeneous, and non-synchronized data contributed by the two modalities. For the former, we propose a signal sensitivity enhancement method based on Rician K-factor theory; for the latter, we combine a CNN and an RNN to mine high-level features from the bi-modal data and perform score-level fusion for fine-grained recognition. To evaluate the proposed method, we build a first-of-its-kind Vision-CSI Emotion Database (VCED) and conduct extensive experiments. Empirical results show the superiority of the bi-modal approach, which achieves 83.24% recognition accuracy across seven emotions, compared with the gesture-only solution (66.48%) and the facial-only solution. The VCED database download link is https://drive.google.com/open?id=1OdNhCWDS28qT21V8YHdCNRjHLbe042eG. Note: you need to apply for permission after clicking the link; we will grant a week of access once your request is approved.
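The score-level fusion mentioned above can be illustrated with a minimal sketch: each modality's classifier produces per-class scores, which are converted to probabilities and combined by a weighted sum before taking the argmax. The weight, the `softmax` normalization, and the seven-emotion label set here are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

# Illustrative seven-emotion label set (an assumption, not taken from the paper).
EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutral", "sadness", "surprise"]

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_scores(face_logits, gesture_logits, w_face=0.5):
    """Score-level fusion: weighted sum of per-class probabilities
    from the facial-expression and WiFi-gesture classifiers."""
    p_face = softmax(np.asarray(face_logits, dtype=float))
    p_gesture = softmax(np.asarray(gesture_logits, dtype=float))
    fused = w_face * p_face + (1.0 - w_face) * p_gesture
    return EMOTIONS[int(np.argmax(fused))], fused
```

When both modalities agree, fusion simply reinforces the shared prediction; its value comes from cases where one modality is noisy and the other compensates, which the weight `w_face` can be tuned to reflect.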




