PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting

10/19/2022
by Thomas Lucas, et al.

We address the problem of action-conditioned generation of human motion sequences. Existing work falls into two categories: forecast models conditioned on observed past motions, or generative models conditioned only on action labels and duration. In contrast, we generate motion conditioned on observations of arbitrary length, including none. To solve this generalized problem, we propose PoseGPT, an auto-regressive transformer-based approach which internally compresses human motion into quantized latent sequences. An auto-encoder first maps human motion to latent index sequences in a discrete space, and vice versa. Inspired by the Generative Pretrained Transformer (GPT), we propose to train a GPT-like model for next-index prediction in that space; this allows PoseGPT to output distributions over possible futures, with or without conditioning on past motion. The discrete and compressed nature of the latent space allows the GPT-like model to focus on long-range signal, as it removes low-level redundancy in the input. Predicting discrete indices also alleviates the common pitfall of predicting averaged poses, a typical failure case when regressing continuous values, as the average of discrete targets is not itself a target. Our experimental results show that our proposed approach achieves state-of-the-art results on HumanAct12, a standard but small-scale dataset, as well as on BABEL, a recent large-scale MoCap dataset, and on GRAB, a human-object interaction dataset.
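The quantization step at the heart of this approach can be illustrated with a minimal sketch: continuous per-frame latents are mapped to the index of their nearest codebook entry, and the downstream sequence model operates only on those discrete indices. This is not the paper's implementation; the sizes, the random codebook, and the function names here are illustrative assumptions (in the real model the codebook is learned jointly with the auto-encoder, and a transformer predicts the next index).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (illustrative, not from the paper): K codebook entries, D-dim latents.
K, D = 8, 4
codebook = rng.normal(size=(K, D))  # in the full model this codebook is learned

def quantize(latents):
    """Map each continuous latent vector to the index of its nearest codebook entry."""
    # latents: (T, D) -> indices: (T,)
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def dequantize(indices):
    """Recover quantized latents from discrete indices (what the decoder consumes)."""
    return codebook[indices]

# A toy latent sequence for T motion frames: the GPT-like model only ever
# sees the discrete index sequence, never the continuous poses directly.
T = 6
z = rng.normal(size=(T, D))
idx = quantize(z)        # discrete sequence used for next-index prediction
z_q = dequantize(idx)

# Predicting a *distribution* over the K indices sidesteps mean-pose regression:
# a sample from that distribution is always a valid codebook entry, never an
# average of two plausible but distinct futures.
```

In the full model, the autoregressive transformer would output a softmax over the K indices at each step, optionally conditioned on the indices of an observed past motion and on an action label.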


Related research

04/12/2021 · Action-Conditioned 3D Human Motion Synthesis with Transformer VAE
We tackle the problem of action-conditioned generation of realistic and ...

11/29/2022 · UDE: A Unified Driving Engine for Human Motion Generation
Generating controllable and editable human motion sequences is a key cha...

07/30/2020 · Action2Motion: Conditioned Generation of 3D Human Motions
Action recognition is a relatively established task, where given an input...

03/15/2022 · ActFormer: A GAN Transformer Framework towards General Action-Conditioned 3D Human Motion Generation
We present a GAN Transformer framework for general action-conditioned 3D...

11/25/2022 · PaCMO: Partner Dependent Human Motion Generation in Dyadic Human Activity using Neural Operators
We address the problem of generating 3D human motions in dyadic activiti...

06/14/2022 · Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis
We consider the problem of synthesizing multi-action human motion sequen...

03/15/2022 · MotionCLIP: Exposing Human Motion Generation to CLIP Space
We introduce MotionCLIP, a 3D human motion auto-encoder featuring a late...
