Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts

by Tao Zhong, et al.

In this paper, we tackle the problem of domain shift. Most existing methods train a single model on multiple source domains and apply that same model to all unseen target domains. Such solutions are sub-optimal: each target domain has its own characteristics, to which the model is never adapted. Moreover, expecting a single model to absorb extensive knowledge from multiple source domains is counterintuitive; the model tends to learn only domain-invariant features, which may result in negative knowledge transfer. In this work, we propose a novel framework for unsupervised test-time adaptation, formulated as a knowledge distillation process to address domain shift. Specifically, we employ a Mixture-of-Experts (MoE) as the teacher, where each expert is trained separately on a different source domain to maximize its speciality. Given a test-time target domain, a small set of unlabeled data is sampled to query knowledge from the MoE. Since the source domains are correlated with the target domain, a transformer-based aggregator then combines the domain-specific knowledge by examining the interconnections among them. Its output serves as a supervision signal to adapt a student prediction network toward the target domain. We further employ meta-learning to enforce that the aggregator distills positive knowledge and that the student network achieves fast adaptation. Extensive experiments demonstrate that the proposed method outperforms the state of the art and validate the effectiveness of each proposed component. Our code is available at
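The test-time adaptation described above can be sketched in miniature. The code below is an illustrative NumPy toy, not the paper's implementation: it assumes linear experts in place of trained expert networks, a single-query softmax attention in place of the transformer aggregator, and plain gradient descent for the distillation step, with the meta-learning outer loop omitted. All names and dimensions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, H = 8, 3, 8  # feature dim, number of source-domain experts, output dim

# Mixture-of-Experts teachers: one toy linear expert per source domain
# (stand-ins for expert networks trained separately on each domain).
experts = [rng.standard_normal((D, H)) * 0.1 for _ in range(K)]

def query_experts(x):
    """Query every expert with an unlabeled target batch x -> (N, K, H)."""
    return np.stack([x @ W for W in experts], axis=1)

def aggregate(z, q):
    """Attention-style aggregator: weight each expert's output by its
    similarity to a query vector q (a simplified stand-in for the
    paper's transformer aggregator)."""
    scores = z @ q                                         # (N, K)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                      # softmax over experts
    return (w[..., None] * z).sum(axis=1)                  # (N, H)

# Student prediction network: a single linear layer adapted at test time.
W_student = rng.standard_normal((D, H)) * 0.1
q = rng.standard_normal(H)

x_target = rng.standard_normal((16, D))           # small unlabeled target batch
teacher = aggregate(query_experts(x_target), q)   # aggregated supervision signal

def mse(W):
    return float(np.mean((x_target @ W - teacher) ** 2))

mse_before = mse(W_student)

# Knowledge distillation: gradient steps move the student's predictions
# toward the aggregated teacher output on the unlabeled target batch.
lr = 0.1
for _ in range(50):
    grad = x_target.T @ (x_target @ W_student - teacher) / len(x_target)
    W_student -= lr * grad

mse_after = mse(W_student)
print(mse_before, mse_after)
```

Because the attention weights vary per sample, the teacher signal is not exactly linear in the input, so the student's distillation loss shrinks substantially but need not reach zero — mirroring the idea that the student only approximates the aggregated expert knowledge on the target domain.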

