Annotation Methodologies for Vision and Language Dataset Creation

07/10/2016 · Gitit Kehat, et al.

Annotated datasets are commonly used to train and evaluate tasks that combine natural language and vision, such as image description generation, action recognition, and visual question answering. However, many existing datasets reflect problems that emerge during data selection and annotation. Here we point out some of the difficulties one confronts when creating and validating annotated vision and language datasets.

Related research

- An Analysis of Action Recognition Datasets for Language and Vision Tasks (04/24/2017)
- Vision and Language: from Visual Perception to Content Creation (12/26/2019)
- Ambiguous Images With Human Judgments for Robust Visual Event Classification (10/06/2022)
- Is GPT-3 all you need for Visual Question Answering in Cultural Heritage? (07/25/2022)
- Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning (06/02/2023)
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments (04/17/2021)
- The KIT Motion-Language Dataset (07/13/2016)
