MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)

02/15/2020
by Chiranjib Sur, et al.

Machine image captioning requires structured learning and a basis for interpretation, but improving it requires understanding and processing multiple contexts in a meaningful way. This research provides a novel concept for context combination and will affect many applications that treat visual features as equivalent to descriptions of objects, activities, and events. Our architecture has three components: Feature Distribution Composition (FDC) Layer Attention, the Multiple Role Representation Crossover (MRRC) Attention Layer, and the Language Decoder. FDC Layer Attention generates weighted attention from R-CNN features; the MRRC Attention Layer processes the intermediate representation and helps generate the attention for the next word; and the Language Decoder estimates the likelihood of the next probable word in the sentence. We demonstrate the effectiveness of FDC, MRRC, regional object feature attention, and reinforcement learning in generating better captions from images. Our model improves on previous results by 35.3% and establishes a new standard and theory for representation generation based on logic, better interpretability, and context.
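The abstract does not give equations, so the following is only a minimal sketch of the general pattern it describes: weighting region features by attention and then estimating a next-word distribution. It uses generic additive soft attention, not the paper's FDC or MRRC layers. All names, shapes, and parameters (`attend_regions`, `next_word_distribution`, `W_v`, `W_h`, `w_a`, `W_out`) are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attend_regions(region_feats, hidden, W_v, W_h, w_a):
    """Generic additive soft attention over region features (assumed form).

    region_feats: (k, d) array, one row per detected region (stand-in for R-CNN output)
    hidden:       (h,) decoder hidden state
    W_v (d, a), W_h (h, a), w_a (a,): hypothetical learned parameters
    Returns the attention weights (k,) and the weighted feature vector (d,).
    """
    # Score each region against the current decoder state.
    scores = np.tanh(region_feats @ W_v + hidden @ W_h) @ w_a  # shape (k,)
    weights = softmax(scores)
    # Weighted sum of region features.
    context = weights @ region_feats  # shape (d,)
    return weights, context


def next_word_distribution(context, hidden, W_out):
    """Map the attended feature and hidden state to word probabilities.

    W_out: (d + h, vocab_size) hypothetical output projection.
    """
    logits = np.concatenate([context, hidden]) @ W_out
    return softmax(logits)
```

In this sketch the attention weights form a probability distribution over regions, so the attended feature is a convex combination of the region features. A real decoder would recompute the weights at every time step from an updated hidden state.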

research
12/12/2016

Text-guided Attention Model for Image Captioning

Visual attention plays an important role to understand images and demons...
research
06/16/2022

Image Captioning based on Feature Refinement and Reflective Decoding

Automatically generating a description of an image in natural language i...
research
01/27/2020

aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption

Region visual features enhance the generative capability of the machines...
research
06/21/2020

Improving Image Captioning with Better Use of Captions

Image captioning is a multimodal problem that has drawn extensive attent...
research
12/17/2018

Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images

Progress in image captioning is gradually getting complex as researchers...
research
11/02/2020

Boost Image Captioning with Knowledge Reasoning

Automatically generating a human-like description for a given image is a...
