Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression

by Xiaoyun Li, et al.

In federated learning (FL) systems, e.g., wireless networks, the communication cost between the clients and the central server can often be a bottleneck. To reduce the communication cost, the paradigm of communication compression has become a popular strategy in the literature. In this paper, we focus on biased gradient compression techniques in non-convex FL problems. In the classical setting of distributed learning, the method of error feedback (EF) is a common technique to remedy the downsides of biased gradient compression. In this work, we study a compressed FL scheme equipped with error feedback, named Fed-EF. We further propose two variants: Fed-EF-SGD and Fed-EF-AMS, depending on the choice of the global model optimizer. We provide a generic theoretical analysis, which shows that directly applying biased compression in FL leads to a non-vanishing bias in the convergence rate. The proposed Fed-EF is able to match the convergence rate of the full-precision FL counterparts under data heterogeneity with a linear speedup. Moreover, we develop a new analysis of the EF under partial client participation, which is an important scenario in FL. We prove that under partial participation, the convergence rate of Fed-EF exhibits an extra slow-down factor due to a so-called “stale error compensation” effect. A numerical study is conducted to justify the intuitive impact of stale error accumulation on the norm convergence of Fed-EF under partial participation. Finally, we also demonstrate that incorporating the two-way compression in Fed-EF does not change the convergence results. In summary, our work conducts a thorough analysis of the error feedback in federated non-convex optimization. Our analysis with partial client participation also provides insights on a theoretical limitation of the error feedback mechanism, and possible directions for improvements.
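To make the error feedback idea concrete, here is a minimal sketch of the generic mechanism on a single client, assuming top-k sparsification as the biased compressor. It is not the paper's Fed-EF algorithm: the function names (`top_k`, `ef_step`, `run`), the constant toy update vector, and the choice of compressor are all illustrative assumptions. The sketch compares what is transmitted with and without error feedback when the same update is compressed over many rounds.

```python
def top_k(vec, k):
    """Biased compressor (assumed for illustration): keep the k
    largest-magnitude entries of vec and zero out the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def ef_step(update, error, k):
    """One error-feedback step: add the carried-over error to the update,
    compress the sum, and keep whatever was dropped as the new error."""
    corrected = [u + e for u, e in zip(update, error)]
    message = top_k(corrected, k)
    new_error = [c - m for c, m in zip(corrected, message)]
    return message, new_error


def run(update, rounds, k):
    """Send the same toy update for `rounds` rounds.
    Returns (total sent with EF, total sent without EF, final EF residual)."""
    error = [0.0] * len(update)
    ef_total = [0.0] * len(update)
    naive_total = [0.0] * len(update)
    for _ in range(rounds):
        message, error = ef_step(update, error, k)
        ef_total = [t + m for t, m in zip(ef_total, message)]
        # Without error feedback, the dropped coordinates are simply lost.
        naive_total = [t + m for t, m in zip(naive_total, top_k(update, k))]
    return ef_total, naive_total, error


ef_total, naive_total, error = run([1.0, 0.25], rounds=20, k=1)
print("with EF:   ", ef_total, "residual:", error)
print("without EF:", naive_total)
```

In this toy setting the plain compressor never transmits the smaller coordinate, which is one simple way to picture a bias that does not vanish. With error feedback, the dropped mass accumulates in `error` until it is large enough to be selected, so the total sent plus the residual always equals the total of the true updates. The residual held on a client is also the quantity that would become stale if that client sat out several rounds, which is the intuition behind the "stale error compensation" effect mentioned in the abstract.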




Adaptive Control of Client Selection and Gradient Compression for Efficient Federated Learning

Federated learning (FL) allows multiple clients cooperatively train mode...

CFedAvg: Achieving Efficient Communication and Fast Convergence in Non-IID Federated Learning

Federated learning (FL) is a prevailing distributed learning paradigm, w...

Resource Allocation for Compression-aided Federated Learning with High Distortion Rate

Recently, a considerable amount of works have been made to tackle the co...

Client Selection in Nonconvex Federated Learning: Improved Convergence Analysis for Optimal Unbiased Sampling Strategy

Federated learning (FL) is a distributed machine learning paradigm that ...

Optimal Rate Adaption in Federated Learning with Compressed Communications

Federated Learning (FL) incurs high communication overhead, which can be...

Stochastic Controlled Averaging for Federated Learning with Communication Compression

Communication compression, a technique aiming to reduce the information ...

Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data

We propose Compressed Vertical Federated Learning (C-VFL) for communicat...
