Generating captions without looking beyond objects

10/12/2016
by Hendrik Heuer, et al.

This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable image caption generation performance by translating from a set of nouns to captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacrificing performance on n-gram precision. The paper also investigates lower and upper bounds on how much the individual word categories in the captions contribute to the final BLEU score, finding large potential improvements for nouns, verbs, and prepositions.
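The setup can be illustrated with a short, self-contained sketch. The snippet below is not the authors' code: it assumes NLTK's part-of-speech tagger to extract the noun set that would serve as the translation input, and NLTK's sentence-level BLEU to score a hypothetical generated caption against a reference.

# Minimal sketch (not the paper's implementation): extract the noun set
# from a reference caption with NLTK POS tagging, then score a generated
# caption against the reference with sentence-level BLEU.
import nltk
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

nltk.download("averaged_perceptron_tagger", quiet=True)

reference = "a man is riding a horse on the beach".split()

# Keep only noun tokens (Penn Treebank tags starting with NN); this noun
# set is the source side of the noun-to-caption translation task.
nouns = [tok for tok, tag in nltk.pos_tag(reference) if tag.startswith("NN")]
print(nouns)  # ['man', 'horse', 'beach']

# A hypothetical caption generated from those nouns alone.
candidate = "a man rides a horse along the beach".split()

# BLEU aggregates modified n-gram precisions; smoothing keeps short
# sentences with missing higher-order n-gram matches from scoring zero.
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")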

Related research

10/10/2022
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning
Image captioning task has been extensively researched by previous work. ...

09/02/2018
Chittron: An Automatic Bangla Image Captioning System
Automatic image caption generation aims to produce an accurate descripti...

05/09/2021
A Hybrid Model for Combining Neural Image Caption and k-Nearest Neighbor Approach for Image Captioning
A hybrid model is proposed that integrates two popular image captioning ...

02/07/2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
This paper presents a Keyword-driven and N-gram Graph based approach for...

07/21/2018
What is not where: the challenge of integrating spatial representations into deep learning architectures
This paper examines to what degree current deep learning architectures f...

04/15/2018
Pragmatically Informative Image Captioning with Character-Level Reference
We combine a neural image captioner with a Rational Speech Acts (RSA) mo...

06/05/2023
Cheap-fake Detection with LLM using Prompt Engineering
The misuse of real photographs with conflicting image captions in news i...
