Visual and Textual Prior Guided Mask Assemble for Few-Shot Segmentation and Beyond

08/15/2023
by Chen Shuai, et al.

Few-shot segmentation (FSS) aims to segment novel classes given only a few annotated images. Because CLIP aligns visual and textual information, integrating it can enhance the generalization ability of FSS models. However, existing CLIP-based FSS methods still suffer from predictions biased towards base classes, which is caused by class-specific feature-level interactions. To solve this issue, we propose a visual and textual Prior Guided Mask Assemble Network (PGMA-Net). It employs a class-agnostic mask assembly process to alleviate the bias, and formulates diverse tasks in a unified manner by assembling the prior through affinity. Specifically, the class-relevant textual and visual features are first transformed into a class-agnostic prior in the form of a probability map. Then, a Prior-Guided Mask Assemble Module (PGMAM) comprising multiple General Assemble Units (GAUs) is introduced. It considers diverse and plug-and-play interactions, such as visual-textual, inter- and intra-image, training-free, and high-order ones. Lastly, to ensure the class-agnostic ability, a Hierarchical Decoder with Channel-Drop Mechanism (HDCDM) is proposed to flexibly exploit the assembled masks and low-level features without relying on any class-specific information. PGMA-Net achieves new state-of-the-art results on the FSS task, with mIoU of 77.6 on PASCAL-5^i and 59.4 on COCO-20^i in the 1-shot scenario. Beyond this, we show that, without extra re-training, the proposed PGMA-Net can solve bbox-level and cross-domain FSS, co-segmentation, and zero-shot segmentation (ZSS) tasks, leading to an any-shot segmentation framework.
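
The idea of turning class-relevant features into a class-agnostic probability map and then assembling that prior through affinity can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`textual_prior`, `assemble_mask`), the use of cosine similarity with min-max scaling, and the softmax affinity with a `temperature` parameter are all assumptions made for illustration. The point it shows is that the assembled output is a weighted average of prior values only, so no class-specific feature channels are passed downstream.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def textual_prior(pixel_feats, text_embedding, eps=1e-8):
    """Hypothetical visual-textual prior.

    pixel_feats:    (N, C) per-pixel visual features of one image.
    text_embedding: (C,) embedding of the class name.
    Returns an (N,) map in [0, 1]: cosine similarity, min-max scaled.
    """
    sim = l2_normalize(pixel_feats) @ l2_normalize(text_embedding)
    return (sim - sim.min()) / (sim.max() - sim.min() + eps)


def assemble_mask(query_feats, support_feats, support_prior, temperature=0.1):
    """Hypothetical affinity-based assembly.

    query_feats:   (Nq, C) query pixel features.
    support_feats: (Ns, C) support pixel features.
    support_prior: (Ns,) prior or mask values in [0, 1] on the support pixels.
    Returns an (Nq,) map: each query pixel receives a softmax-weighted
    average of the support prior values.
    """
    logits = l2_normalize(query_feats) @ l2_normalize(support_feats).T
    logits = logits / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    affinity = np.exp(logits)
    affinity = affinity / affinity.sum(axis=1, keepdims=True)
    return affinity @ support_prior
```

Because each row of the affinity matrix sums to one, the assembled map stays within the range of the prior it was built from. Under these assumptions, the same function covers inter-image assembly (support to query) and intra-image assembly (pass the query features as both arguments with a prior from `textual_prior`).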


