Diversity-Aware Meta Visual Prompting

by Qidong Huang, et al.

We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and effective prompting method for transferring pre-trained models to downstream tasks with a frozen backbone. A challenging issue in visual prompting is that image datasets can be highly diverse, and a single generic prompt per dataset can hardly handle the complex distribution shift from the original pretraining data. To address this issue, we propose a dataset Diversity-Aware prompting strategy whose initialization is realized by a Meta-prompt. Specifically, we cluster the downstream dataset into small homogeneous subsets in a diversity-adaptive way, with each subset having its own separately optimized prompt. This divide-and-conquer design greatly reduces the optimization difficulty and significantly boosts prompting performance. Furthermore, all the prompts are initialized with a meta-prompt that is learned across several datasets. This is a bootstrapped paradigm, built on the key observation that the prompting knowledge learned from previous datasets can help a prompt converge faster and perform better on a new dataset. During inference, we dynamically select a proper prompt for each input based on the feature distance between the input and each subset. In extensive experiments, DAM-VP demonstrates superior efficiency and effectiveness, clearly surpassing previous prompting methods on a series of downstream datasets for different pretraining models. Our code is available at: <https://github.com/shikiw/DAM-VP>.
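The inference-time prompt selection and the meta-prompt initialization described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each subset is summarized by a single prototype feature vector (for example, the mean of its members' features) and that "feature distance" means Euclidean distance. The function names `select_prompt` and `init_prompts_from_meta`, and all the toy values, are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_prompts_from_meta(meta_prompt, num_subsets):
    """Give every subset its own copy of the shared meta-prompt (assumed init scheme)."""
    return [list(meta_prompt) for _ in range(num_subsets)]

def select_prompt(feature, prototypes, prompts):
    """Return (index, prompt) for the subset whose prototype is nearest to `feature`."""
    distances = [euclidean(feature, p) for p in prototypes]
    idx = min(range(len(distances)), key=distances.__getitem__)
    return idx, prompts[idx]

# Toy setup: three subsets, each summarized by a 2-D prototype (hypothetical values).
prototypes = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
meta_prompt = [0.1, 0.1, 0.1]
prompts = init_prompts_from_meta(meta_prompt, len(prototypes))

# Stand-in for per-subset optimization: only subset 1's prompt is changed.
prompts[1][0] = 0.9

# An input whose frozen-backbone feature lies closest to prototype 1.
idx, prompt = select_prompt([4.0, 6.0], prototypes, prompts)
print(idx, prompt)
```

Because each subset starts from an independent copy of the meta-prompt, tuning one subset's prompt leaves the others untouched, which is the property the divide-and-conquer design relies on.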



Distribution-Aware Prompt Tuning for Vision-Language Models

Pre-trained vision-language models (VLMs) have shown impressive performa...

Unleashing the Power of Visual Prompting At the Pixel Level

This paper presents a simple and effective visual prompting method for a...

A Task-guided, Implicitly-searched and Meta-initialized Deep Model for Image Fusion

Image fusion plays a key role in a variety of multi-sensor-based vision ...

Meta-learning for downstream aware and agnostic pretraining

Neural network pretraining is gaining attention due to its outstanding p...

Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm

Given the large-scale data and the high annotation cost, pretraining-fin...

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

Human-centric perceptions include a variety of vision tasks, which have ...

Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization

The proliferation of pretrained models, as a result of advancements in p...
