The KIT Motion-Language Dataset

by Matthias Plappert, et al.

Linking human motion and natural language is of great interest for the generation of semantic representations of human activities as well as for the generation of robot activities based on natural language input. However, while there have been years of research in this area, no standardized and openly available dataset exists to support the development and evaluation of such systems. We therefore propose the KIT Motion-Language Dataset, which is large, open, and extensible. We aggregate data from multiple motion capture databases and include them in our dataset using a unified representation that is independent of the capture system or marker set, making it easy to work with the data regardless of its origin. To obtain motion annotations in natural language, we apply a crowd-sourcing approach and a web-based tool that was specifically built for this purpose, the Motion Annotation Tool. We thoroughly document the annotation process itself and discuss the gamification methods that we used to keep annotators motivated. We further propose a novel method, perplexity-based selection, which systematically selects motions for further annotation that are either under-represented in our dataset or that have erroneous annotations. We show that our method mitigates these two problems and ensures a systematic annotation process. We provide an in-depth analysis of the structure and contents of our resulting dataset, which, as of June 14, 2016, contains 3917 motions with a total duration of 11.26 hours and 5486 annotations in natural language that contain 45779 words. We believe that this makes our dataset an excellent choice that enables more transparent and comparable research in this important area.
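The abstract does not say how perplexity-based selection is implemented, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes an add-one-smoothed unigram language model trained on all existing annotations; this is a stand-in and may differ from the model actually used. Each motion is scored by the highest perplexity among its annotations, so motions whose descriptions look unusual (rare motions or possibly erroneous text) rank first. All function names and the example data are hypothetical.

```python
import math
from collections import Counter


def train_unigram(annotations):
    """Count word frequencies over all annotations (lowercased, split on whitespace)."""
    counts = Counter()
    for text in annotations:
        counts.update(text.lower().split())
    return counts


def perplexity(text, counts, vocab_size, total):
    """Perplexity of one annotation under an add-one-smoothed unigram model (an assumption)."""
    tokens = text.lower().split()
    if not tokens:
        return float("inf")
    log_prob = 0.0
    for tok in tokens:
        # The extra +1 in the denominator reserves probability mass for unseen words.
        p = (counts[tok] + 1) / (total + vocab_size + 1)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(tokens))


def rank_motions(motion_annotations):
    """Return motion ids sorted by their highest annotation perplexity, descending."""
    all_texts = [t for texts in motion_annotations.values() for t in texts]
    counts = train_unigram(all_texts)
    total = sum(counts.values())
    vocab_size = len(counts)
    scores = {
        motion_id: max(perplexity(t, counts, vocab_size, total) for t in texts)
        for motion_id, texts in motion_annotations.items()
        if texts
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical example data: motion id -> list of annotations.
example = {
    "m1": ["a person walks forward", "a person walks forward slowly"],
    "m2": ["a person walks in a circle"],
    "m3": ["xyz qwerty"],
}
ranking = rank_motions(example)
```

In this toy example, the motion with the nonsensical annotation is ranked first and would be the first candidate shown to annotators again. A real system would likely use a stronger language model and hold out the scored annotation from training.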




