Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation

by Likang Wu, et al.

Zero-Shot Learning (ZSL), which aims to recognize unseen objects automatically, is a promising learning paradigm that lets machines continuously absorb new real-world knowledge. Recently, the Knowledge Graph (KG) has proven to be an effective scheme for handling zero-shot tasks with large-scale, non-attribute data. Prior studies typically embed the relationships between seen and unseen objects from existing knowledge graphs into visual information to improve recognition of unseen data. However, real-world knowledge is naturally formed by multimodal facts. Compared with ordinary structural knowledge from a graph perspective, a multimodal KG can provide cognitive systems with fine-grained knowledge. For example, a text description and visual content can depict more critical details of a fact than knowledge triplets alone. Unfortunately, this multimodal fine-grained knowledge remains largely unexploited because of the bottleneck of feature alignment between different modalities. To that end, we propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings via a designed dense attention module and a self-calibration loss. This lets the semantic transfer process of our ZSL framework learn more differentiated knowledge between entities. Our model also avoids the performance limitation of relying only on coarse global features. We conduct extensive experiments and evaluate our model on large-scale real-world data. The experimental results demonstrate the effectiveness of the proposed model on standard zero-shot classification tasks.
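
To make the idea of matching image regions with semantic embeddings more concrete, here is a minimal sketch of generic region-level attention scoring. It is not the paper's dense attention module, and it omits the self-calibration loss, whose form the abstract does not specify. The function names, the toy vectors, and the assumption that region features and class embeddings already live in a shared space are all illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def region_attention_scores(regions, class_embeddings):
    """Score each class by attending over image regions.

    regions: list of region feature vectors (hypothetical, pre-projected).
    class_embeddings: dict mapping class name -> semantic embedding.
    For each class, regions that align with the embedding get higher
    attention weight, and the score is the attention-weighted similarity.
    """
    scores = {}
    for name, emb in class_embeddings.items():
        sims = [dot(r, emb) for r in regions]
        weights = softmax(sims)
        scores[name] = sum(w * s for w, s in zip(weights, sims))
    return scores


# Toy data (assumed values, not from the paper): three regions, two classes.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
class_embeddings = {"zebra": [2.0, 0.0], "owl": [0.0, 0.5]}

scores = region_attention_scores(regions, class_embeddings)
predicted = max(scores, key=scores.get)
```

Because unseen classes only need a semantic embedding to be scored, a scheme of this kind can rank classes that never appeared in training, which is the property zero-shot classification relies on. Scoring per region rather than on a single pooled vector is what distinguishes it from using global features only.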

