Human Motion Prediction via Spatio-Temporal Inpainting

by Alejandro Hernandez Ruiz, et al.

We propose a Generative Adversarial Network (GAN) to forecast 3D human motion given a sequence of observed 3D skeleton poses. While recent GANs have shown promising results, they can only forecast plausible human-like motion over relatively short periods of time, i.e. a few hundred milliseconds, and they typically ignore the absolute position of the skeleton w.r.t. the camera. The GAN scheme we propose can reliably provide long-term predictions of two seconds or more for both the non-rigid body pose and its absolute position, and can be trained in a self-supervised manner. Our approach builds upon three main contributions. First, we consider a data representation based on a spatio-temporal tensor of 3D skeleton coordinates, which allows us to formulate the prediction problem as an inpainting one, for which GANs work particularly well. Second, we design a GAN architecture to learn the joint distribution of body poses and global motion, allowing us to hypothesize large chunks of the input 3D tensor with missing data. Finally, we argue that the L2 metric, which most approaches have relied on so far, fails to capture the actual distribution of long-term human motion. We therefore propose an alternative metric that is more correlated with human perception. Our experiments demonstrate that our approach achieves significant improvements over the state of the art for human motion forecasting, and that it also handles situations in which past observations are corrupted by severe occlusions, noise and consecutive missing frames.
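
To make the inpainting formulation more concrete, here is a minimal sketch of how a motion sequence could be arranged as a spatio-temporal tensor with the future frames masked out. The tensor layout (frames, joints, 3), the function names `make_inpainting_input` and `merge_prediction`, and the zero-fill convention are assumptions made for illustration; the abstract does not specify them, and the GAN generator itself is not shown.

```python
import numpy as np


def make_inpainting_input(motion, n_observed):
    """Mask the future part of a motion sequence.

    motion: array of shape (frames, joints, 3) holding 3D joint coordinates
            (this layout is an assumption, not taken from the paper).
    n_observed: number of leading frames treated as observed.
    Returns (masked, mask), where mask is 1 for observed entries, 0 otherwise.
    """
    motion = np.asarray(motion, dtype=np.float32)
    if motion.ndim != 3 or motion.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")
    n_frames = motion.shape[0]
    if not 0 < n_observed < n_frames:
        raise ValueError("n_observed must lie strictly between 0 and frames")

    mask = np.zeros(motion.shape, dtype=np.float32)
    mask[:n_observed] = 1.0      # observed past frames
    masked = motion * mask       # future frames are zeroed out
    return masked, mask


def merge_prediction(masked, mask, generated):
    """Keep observed entries and take generated values for the masked region."""
    return mask * masked + (1.0 - mask) * generated


# Hypothetical example: 100 frames, 25 joints, first 50 frames observed.
rng = np.random.default_rng(0)
motion = rng.normal(size=(100, 25, 3))
masked, mask = make_inpainting_input(motion, n_observed=50)

# Stand-in for a generator output; a trained model would go here.
generated = np.zeros(motion.shape, dtype=np.float32)
completed = merge_prediction(masked, mask, generated)
```

Because the mask is just a binary tensor, the same construction would also cover corrupted past observations (occluded joints or dropped frames) by zeroing the corresponding entries of `mask` instead of only the trailing frames.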




3D Skeleton-based Human Motion Prediction with Manifold-Aware GAN

In this work we propose a novel solution for 3D skeleton-based human mot...

Skeleton-Graph: Long-Term 3D Motion Prediction From 2D Observations Using Deep Spatio-Temporal Graph CNNs

Several applications such as autonomous driving, augmented reality and v...

Human Motion Prediction via Learning Local Structure Representations and Temporal Dependencies

Human motion prediction from motion capture data is a classical problem ...

Attention, please: A Spatio-temporal Transformer for 3D Human Motion Prediction

In this paper, we propose a novel architecture for the task of 3D human ...

Real-time Locational Marginal Price Forecasting Using Generative Adversarial Network

In this paper, we propose a model-free unsupervised learning approach to...

Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers

We propose to leverage Transformer architectures for non-autoregressive ...

Human Motion Anticipation with Symbolic Label

Anticipating human motion depends on two factors: the past motion and th...
