Preservation of the Global Knowledge by Not-True Self Knowledge Distillation in Federated Learning

by   Gihun Lee, et al.

In Federated Learning (FL), a strong global model is collaboratively learned by aggregating the clients' locally trained models. Although this allows no need to access clients' data directly, the global model's convergence often suffers from data heterogeneity. This paper suggests that forgetting could be the bottleneck of global convergence. We observe that fitting on biased local distribution shifts the feature on global distribution and results in forgetting of global knowledge. We consider this phenomenon as an analogy to Continual Learning, which also faces catastrophic forgetting when fitted on the new task distribution. Based on our findings, we hypothesize that tackling down the forgetting in local training relives the data heterogeneity problem. To this end, we propose a simple yet effective framework Federated Local Self-Distillation (FedLSD), which utilizes the global knowledge on locally available data. By following the global perspective on local data, FedLSD encourages the learned features to preserve global knowledge and have consistent views across local models, thus improving convergence without compromising data privacy. Under our framework, we further extend FedLSD to FedLS-NTD, which only considers the not-true class signals to compensate noisy prediction of the global model. We validate that both FedLSD and FedLS-NTD significantly improve the performance in standard FL benchmarks in various setups, especially in the extreme data heterogeneity cases.


page 8

page 18

page 19

page 20


Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

Federated Learning (FL) is an emerging distributed learning paradigm und...

Acceleration of Federated Learning with Alleviated Forgetting in Local Training

Federated learning (FL) enables distributed optimization of machine lear...

Don't Memorize; Mimic The Past: Federated Class Incremental Learning Without Episodic Memory

Deep learning models are prone to forgetting information learned in the ...

No One Left Behind: Real-World Federated Class-Incremental Learning

Federated learning (FL) is a hot collaborative training framework via ag...

Learning Federated Visual Prompt in Null Space for MRI Reconstruction

Federated Magnetic Resonance Imaging (MRI) reconstruction enables multip...

CD^2-pFed: Cyclic Distillation-guided Channel Decoupling for Model Personalization in Federated Learning

Federated learning (FL) is a distributed learning paradigm that enables ...

Multi-Level Branched Regularization for Federated Learning

A critical challenge of federated learning is data heterogeneity and imb...

Please sign up or login with your details

Forgot password? Click here to reset