Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation

by Shuting He, et al.

We study universal zero-shot segmentation in this work to achieve panoptic, instance, and semantic segmentation for novel categories without any training samples. Such zero-shot segmentation ability relies on inter-class relationships in semantic space to transfer the visual knowledge learned from seen categories to unseen ones. It is therefore desirable to bridge the semantic and visual spaces well and to apply these semantic relationships to visual feature learning. We introduce a generative model that synthesizes features for unseen categories, which links the semantic and visual spaces and addresses the lack of training data for unseen categories. Furthermore, to mitigate the domain gap between the semantic and visual spaces, we first enhance the vanilla generator with learned primitives, each of which contains fine-grained, category-related attributes, and synthesize unseen features by selectively assembling these primitives. Second, we propose to disentangle the visual feature into a semantic-related part and a semantic-unrelated part; the latter carries useful visual classification cues but is less relevant to the semantic representation. The inter-class relationships of the semantic-related visual features are then required to align with those in semantic space, thereby transferring semantic knowledge to visual feature learning. The proposed approach achieves state-of-the-art performance on zero-shot panoptic segmentation, instance segmentation, and semantic segmentation. Code is available at
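The two ideas in the abstract, assembling unseen-class features from learned primitives and aligning inter-class relations of the semantic-related features with semantic space, can be illustrated with a minimal NumPy sketch. All shapes, the softmax-weighted primitive assembly, and the half/half feature split below are simplifying assumptions for illustration, not the paper's actual architecture (which learns these components end to end with a generator).

```python
import numpy as np

rng = np.random.default_rng(0)
n_primitives, d_sem, d_vis, n_classes = 16, 8, 8, 5

# Hypothetical learned primitive bank: each row is a fine-grained
# attribute primitive in visual feature space.
primitives = rng.normal(size=(n_primitives, d_vis))
# Class semantic embeddings (e.g. word vectors for category names).
sem = rng.normal(size=(n_classes, d_sem))
# Projection scoring how relevant each primitive is to each class.
W = rng.normal(size=(d_sem, n_primitives))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Primitive generation: synthesize a class feature by selectively
# assembling primitives, weighted by semantic relevance.
weights = softmax(sem @ W)            # (n_classes, n_primitives)
synth = weights @ primitives          # (n_classes, d_vis)

# Toy disentanglement: split each feature into a semantic-related and a
# semantic-unrelated half (the paper learns this decomposition instead).
sem_related, sem_unrelated = np.split(synth, 2, axis=1)

def relation(X):
    # Cosine inter-class relation matrix.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T

# Semantic-related alignment: penalize mismatch between the inter-class
# relations of semantic-related visual features and those in semantic space.
align_loss = float(np.mean((relation(sem_related) - relation(sem)) ** 2))
print(align_loss >= 0.0, synth.shape)
```

Minimizing such an alignment term with respect to the primitives and projection is what pushes the visual feature space to inherit the semantic inter-class structure, which is the mechanism the abstract relies on for transfer to unseen categories.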

