Amortized Prompt: Lightweight Fine-Tuning for CLIP in Domain Generalization

11/25/2021
by Xin Zhang, et al.

Domain generalization (DG) is a difficult transfer learning problem that aims to learn a model which generalizes to unseen domains. Recent massive pre-trained models such as CLIP and GPT-3, i.e., foundation models (FMs), have been shown to be robust to many distribution shifts and therefore should lead to substantial improvements in DG. In this work, we study generic ways to apply CLIP to DG problems in image classification, evaluating both a naive zero-shot setting and a full DG learning setting. For the latter, we propose AP (Amortized Prompt), a novel approach that performs domain inference in the form of prompt generation. On several standard domain generalization benchmarks, namely PACS, VLCS, OfficeHome, and TerraIncognita, CLIP provides comparable performance without fine-tuning any parameters, suggesting the applicability and importance of FMs in DG. In addition, we show that combining domain prompt inference with CLIP enables AP to outperform both strong baselines and the naive CLIP baseline by a large margin, raising accuracy from 71.3% to 79.3%. We hope the simplicity and success of our approach emphasize the importance of foundation models and lead to their wider adoption and analysis in the field of domain generalization.
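To make the setting concrete, below is a minimal sketch of the CLIP zero-shot baseline with a domain-conditioned text prompt, using the open-source clip package. The class list (here, the PACS classes), the domain string, the prompt template, and the image path are illustrative assumptions, not the paper's specification; in AP itself, the domain portion of the prompt is inferred from the input image rather than hard-coded.

    # Sketch: CLIP zero-shot classification with a domain-conditioned prompt.
    # Assumptions: PACS-style class names, a hand-picked domain string, and a
    # local image file. AP would instead *infer* the domain token per image.
    import torch
    import clip
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    classes = ["dog", "elephant", "giraffe", "guitar", "horse", "house", "person"]
    domain = "sketch"  # hard-coded here for illustration only

    # One text prompt per class, conditioned on the (assumed) domain.
    prompts = clip.tokenize([f"a {domain} of a {c}" for c in classes]).to(device)

    image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(prompts)
        # Normalize, then score each class by cosine similarity to the image.
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

    print("Predicted class:", classes[probs.argmax(dim=-1).item()])

Note that no CLIP parameters are updated anywhere in this sketch; the only lever is the text prompt, which is what makes a prompt-generation approach like AP a lightweight alternative to full fine-tuning.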
