Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People

by Akin Caliskan, et al.

We present a novel method to improve the accuracy of 3D reconstruction of clothed human shape from a single image. Recent work has introduced volumetric, implicit and model-based shape learning frameworks for reconstruction of objects and people from one or more images. However, the accuracy and completeness of reconstruction of clothed people is limited by the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint. This paper introduces two advances to overcome this limitation: first, a new synthetic dataset of realistic clothed people, 3DVH; and second, a novel multiple-view loss function for training monocular volumetric shape estimation, which is demonstrated to significantly improve generalisation and reconstruction accuracy. The 3DVH dataset of realistic clothed 3D human models, rendered with diverse natural backgrounds, is demonstrated to allow transfer to reconstruction from real images of people. Comprehensive comparative performance evaluation on both synthetic and real images of people demonstrates that the proposed method outperforms previous state-of-the-art learning-based single-image 3D human shape estimation approaches, with significant improvements in reconstruction accuracy, completeness, and quality. An ablation study shows that this is due to both the proposed multiple-view training and the new 3DVH dataset. The code and the dataset can be found at the project website:
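
The abstract does not spell out the form of the multiple-view loss, so the following is only a minimal sketch of one way such a training loss could be assembled, not the authors' formulation. It assumes that each training sample provides occupancy predictions from several views of the same person, that those predictions have already been aligned to a shared coordinate frame, and that the loss combines a per-view binary cross-entropy term with a pairwise agreement term. The function name `multi_view_loss`, the flattened voxel lists, and the `weight` parameter are all hypothetical.

```python
import math
from itertools import combinations


def multi_view_loss(view_preds, target, weight=0.5, eps=1e-7):
    """Illustrative multi-view training loss (an assumption, not the paper's exact loss).

    view_preds: list of per-view predictions, each a flat list of occupancy
                probabilities in [0, 1], assumed aligned to a common frame.
    target:     flat list of ground-truth occupancies (0 or 1).
    weight:     hypothetical weight on the cross-view consistency term.
    Returns (total, supervised, consistency).
    """
    n_voxels = len(target)

    # Supervised term: binary cross-entropy of every view against ground truth.
    bce_total = 0.0
    for pred in view_preds:
        for p, t in zip(pred, target):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            bce_total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    supervised = bce_total / (len(view_preds) * n_voxels)

    # Consistency term: mean squared difference between every pair of views.
    pairs = list(combinations(view_preds, 2))
    consistency = 0.0
    if pairs:
        consistency = sum(
            sum((a - b) ** 2 for a, b in zip(pa, pb)) / n_voxels
            for pa, pb in pairs
        ) / len(pairs)

    return supervised + weight * consistency, supervised, consistency


if __name__ == "__main__":
    target = [1, 0, 1, 0]
    views = [[0.9, 0.1, 0.8, 0.2], [0.6, 0.4, 0.8, 0.2]]
    total, supervised, consistency = multi_view_loss(views, target)
    print(f"total={total:.4f} supervised={supervised:.4f} consistency={consistency:.4f}")
```

In this sketch the consistency term is zero when all views predict the same volume and grows as the per-view predictions disagree, which is the intuition behind penalising view-dependent reconstructions during training.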


Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video

We present a novel method to learn temporally consistent 3D reconstructi...

BodyNet: Volumetric Inference of 3D Human Body Shapes

Human shape estimation is an important task for video editing, animation...

Multi-person Implicit Reconstruction from a Single Image

We present a new end-to-end learning framework to obtain detailed and sp...

ReFit: Recurrent Fitting Network for 3D Human Recovery

We present Recurrent Fitting (ReFit), a neural network architecture for ...

Pose Adaptive Dual Mixup for Few-Shot Single-View 3D Reconstruction

We present a pose adaptive few-shot learning procedure and a two-stage d...

Coherent Reconstruction of Multiple Humans from a Single Image

In this work, we address the problem of multi-person 3D pose estimation ...

Putting People in their Place: Monocular Regression of 3D People in Depth

Given an image with multiple people, our goal is to directly regress the...
