Federated Split BERT for Heterogeneous Text Classification

by   Zhengyang Li, et al.

Pre-trained BERT models have achieved impressive performance in many natural language processing (NLP) tasks. However, in many real-world situations, textual data are usually decentralized over many clients and unable to be uploaded to a central server due to privacy protection and regulations. Federated learning (FL) enables multiple clients collaboratively to train a global model while keeping the local data privacy. A few researches have investigated BERT in federated learning setting, but the problem of performance loss caused by heterogeneous (e.g., non-IID) data over clients remain under-explored. To address this issue, we propose a framework, FedSplitBERT, which handles heterogeneous data and decreases the communication cost by splitting the BERT encoder layers into local part and global part. The local part parameters are trained by the local client only while the global part parameters are trained by aggregating gradients of multiple clients. Due to the sheer size of BERT, we explore a quantization method to further reduce the communication cost with minimal performance loss. Our framework is ready-to-use and compatible to many existing federated learning algorithms, including FedAvg, FedProx and FedAdam. Our experiments verify the effectiveness of the proposed framework, which outperforms baseline methods by a significant margin, while FedSplitBERT with quantization can reduce the communication cost by 11.9×.


No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices

Federated learning (FL) is an important paradigm for training global mod...

QuPeL: Quantized Personalization with Applications to Federated Learning

Traditionally, federated learning (FL) aims to train a single global mod...

When BERT Meets Quantum Temporal Convolution Learning for Text Classification in Heterogeneous Computing

The rapid development of quantum computing has demonstrated many unique ...

Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization

Federated learning (FL) enables geographically dispersed edge devices (i...

Collaborating Heterogeneous Natural Language Processing Tasks via Federated Learning

The increasing privacy concerns on personal private text data promote th...

FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

Massively multi-task learning with large language models has recently ma...

HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images

Multiple medical institutions collaboratively training a model using fed...

Please sign up or login with your details

Forgot password? Click here to reset