Portrait Interpretation and a Benchmark

by Yixuan Fan, et al.

We propose a task that we name Portrait Interpretation and construct a dataset named Portrait250K for it. Current research on portraits, such as human attribute recognition and person re-identification, has achieved many successes, but in general it: 1) may not exploit the interrelationships between the various tasks or the benefits they could bring; 2) designs a deep model specifically for each task, which is inefficient; 3) may be unable to meet the need for a unified model and comprehensive perception in real scenes. In this paper, the proposed portrait interpretation approaches the perception of humans from a new, systematic perspective. We divide the perception of portraits into three aspects, namely Appearance, Posture, and Emotion, and design corresponding sub-tasks for each aspect. Based on the framework of multi-task learning, portrait interpretation requires a comprehensive description of both the static attributes and the dynamic states of portraits. To invigorate research on this new task, we construct a new dataset that contains 250,000 images labeled with identity, gender, age, physique, height, expression, and posture of the whole body and arms. Our dataset is collected from 51 movies, which gives it extensive diversity. Furthermore, we focus on representation learning for portrait interpretation and propose a baseline that reflects our systematic perspective. We also propose an appropriate metric for this task. Our experimental results demonstrate that combining the tasks related to portrait interpretation can yield benefits. Code and dataset will be made public.


