Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling

05/19/2023
by Shengqiong Wu, et al.

Existing research on multimodal relation extraction (MRE) faces two co-existing challenges: internal-information over-utilization and external-information under-exploitation. To address both, we propose a novel framework that simultaneously performs internal-information screening and external-information exploitation. First, we represent the fine-grained semantic structures of the input image and text as visual and textual scene graphs, which are then fused into a unified cross-modal graph (CMG). Based on the CMG, we refine the graph structure under the guidance of the graph information bottleneck principle, actively denoising less-informative features. Next, we perform topic modeling over the input image and text, incorporating latent multimodal topic features to enrich the context. On the benchmark MRE dataset, our system significantly outperforms the current best model. Further in-depth analyses reveal the great potential of our method for the MRE task. Our code is available at https://github.com/ChocoWu/MRE-ISE.
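
As a rough illustration of the fusion step described above, the following minimal sketch merges a textual and a visual scene graph into a single cross-modal graph. It is not taken from the authors' code: the function names `cosine` and `fuse_scene_graphs`, the toy two-dimensional embeddings, and the rule of adding a cross-modal edge when cosine similarity reaches a `threshold` of 0.8 are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fuse_scene_graphs(text_nodes, text_edges, image_nodes, image_edges, threshold=0.8):
    """Merge two scene graphs into one cross-modal graph (hypothetical helper).

    text_nodes / image_nodes: dict mapping node name -> embedding (list of floats).
    text_edges / image_edges: lists of (name, name) pairs within one modality.
    Assumption: a cross-modal edge links a text node and an image node whenever
    their cosine similarity is at least `threshold`.
    """
    # Tag every node with its modality so names cannot collide.
    nodes = {("text", name): emb for name, emb in text_nodes.items()}
    nodes.update({("image", name): emb for name, emb in image_nodes.items()})

    # Keep the intra-modal edges of both input graphs.
    edges = {(("text", a), ("text", b)) for a, b in text_edges}
    edges |= {(("image", a), ("image", b)) for a, b in image_edges}

    # Add cross-modal edges between sufficiently similar node pairs.
    for t_name, t_emb in text_nodes.items():
        for i_name, i_emb in image_nodes.items():
            if cosine(t_emb, i_emb) >= threshold:
                edges.add((("text", t_name), ("image", i_name)))
    return nodes, edges


# Toy inputs with made-up embeddings, for illustration only.
text_nodes = {"player": [1.0, 0.0], "ball": [0.0, 1.0]}
image_nodes = {"person": [0.9, 0.1], "grass": [-1.0, 0.0]}
nodes, edges = fuse_scene_graphs(
    text_nodes, [("player", "ball")], image_nodes, [("person", "grass")]
)
```

In this toy case only the "player" and "person" nodes are similar enough to be linked across modalities. The structure refinement guided by the graph information bottleneck would then operate on the resulting graph; that step is not sketched here.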


