Neighborhood Contrastive Transformer for Change Captioning

03/06/2023
by   Yunbin Tu, et al.

Change captioning aims to describe, in natural language, the semantic change between a pair of similar images. It is more challenging than general image captioning because it requires capturing fine-grained change information while remaining immune to irrelevant viewpoint changes, and resolving syntactic ambiguity in change descriptions. In this paper, we propose a neighborhood contrastive transformer to improve the model's ability to perceive various changes under different scenes and to understand complex syntactic structure. Concretely, we first design neighboring feature aggregating, which integrates neighboring context into each feature and helps quickly locate inconspicuous changes under the guidance of conspicuous referents. Then, we devise common feature distilling, which compares the two images at the neighborhood level and extracts common properties from each image, so as to learn effective contrastive information between them. Finally, we introduce explicit dependencies between words to calibrate the transformer decoder, which helps it better understand complex syntactic structure during training. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on three public datasets with different change scenarios. The code is available at https://github.com/tuyunbin/NCT.
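
To make the idea of comparing two images at the neighborhood level concrete, the following is a minimal sketch and not the authors' implementation. It assumes each image is already encoded as a grid of feature vectors, substitutes a fixed-window mean for the learned aggregation described in the abstract, and scores change as one minus cosine similarity. The function names `aggregate_neighbors`, `cosine`, and `change_map` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def aggregate_neighbors(grid, radius=1):
    """Replace each feature vector with the mean over its square window.

    Assumption: a plain average stands in for the learned neighboring
    feature aggregating; windows are clipped at the grid borders.
    """
    h, w, dim = len(grid), len(grid[0]), len(grid[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = [0.0] * dim
            count = 0
            for ni in range(max(0, i - radius), min(h, i + radius + 1)):
                for nj in range(max(0, j - radius), min(w, j + radius + 1)):
                    for k in range(dim):
                        acc[k] += grid[ni][nj][k]
                    count += 1
            row.append([a / count for a in acc])
        out.append(row)
    return out


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def change_map(before, after, radius=1):
    """Per-location change score: 1 - cosine of the aggregated features."""
    a = aggregate_neighbors(before, radius)
    b = aggregate_neighbors(after, radius)
    return [
        [1.0 - cosine(a[i][j], b[i][j]) for j in range(len(a[0]))]
        for i in range(len(a))
    ]


# Toy example: a 5x5 grid of 2-d features where only the center differs.
before = [[[1.0, 0.0] for _ in range(5)] for _ in range(5)]
after = [[[1.0, 0.0] for _ in range(5)] for _ in range(5)]
after[2][2] = [0.0, 1.0]

scores = change_map(before, after, radius=1)
```

In this toy setup, setting `radius=0` confines the change score to the single altered cell, while `radius=1` spreads a weaker signal to the cells around it. That loosely mirrors the abstract's point that neighboring context can help locate an inconspicuous change, although the actual model learns this aggregation inside a transformer rather than using a fixed average.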


research
10/20/2021

R^3Net:Relation-embedded Representation Reconstruction Network for Change Captioning

Change captioning is to use a natural language sentence to describe the ...
research
12/25/2019

Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection

Self-attention based Transformer has demonstrated the state-of-the-art p...
research
02/09/2022

Image Difference Captioning with Pre-training and Contrastive Learning

The Image Difference Captioning (IDC) task aims to describe the visual d...
research
09/30/2020

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

Change Captioning is a task that aims to describe the difference between...
research
02/10/2023

Exploiting Neighborhood Structural Features for Change Detection

In this letter, a novel method for change detection is proposed using ne...
research
05/30/2023

Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection

Change detection is a widely adopted technique in remote sense imagery (...
research
03/25/2021

Describing and Localizing Multiple Changes with Transformers

Change captioning tasks aim to detect changes in image pairs observed be...
