A Robust Classification Framework for Byzantine-Resilient Stochastic Gradient Descent

01/16/2023
by Shashank Reddy Chirra, et al.
IIIT Bangalore

This paper proposes a Robust Gradient Classification Framework (RGCF) for Byzantine fault tolerance in distributed stochastic gradient descent. The framework consists of a pattern recognition filter that we train to classify individual gradients as Byzantine using their direction alone. This filter is robust to an arbitrary number of Byzantine workers in both convex and non-convex optimisation settings, a significant improvement on prior work, which is robust to Byzantine faults only when up to 50% of the workers are Byzantine. The solution does not require an estimate of the number of Byzantine workers, its running time does not depend on the number of workers, and it scales to training instances with a large number of workers without a loss in performance. We validate our solution by training convolutional neural networks on the MNIST dataset in the presence of Byzantine workers.
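
The idea of judging each gradient by its direction alone can be illustrated with a minimal sketch. This is not the paper's trained filter: the abstract does not describe the classifier, so a simple cosine-similarity test against a hypothetical trusted reference direction stands in for it here. The names `unit_direction`, `make_cosine_filter`, `filtered_mean`, and the `threshold` parameter are all assumptions made for illustration. What the sketch does show is why a per-gradient decision needs no estimate of the number of Byzantine workers and can still work when they are the majority.

```python
import math
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def unit_direction(g: Sequence[float]) -> Vector:
    """Scale g to unit length so only its direction remains (zero stays zero)."""
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return [0.0 for _ in g]
    return [x / norm for x in g]


def make_cosine_filter(
    reference: Sequence[float], threshold: float = 0.0
) -> Callable[[Sequence[float]], bool]:
    """Build a stand-in classifier (ASSUMPTION: not the paper's trained filter).

    A gradient is accepted when the cosine similarity between its direction
    and a hypothetical trusted reference direction exceeds `threshold`.
    """
    ref = unit_direction(reference)

    def is_honest(g: Sequence[float]) -> bool:
        d = unit_direction(g)
        cosine = sum(a * b for a, b in zip(d, ref))
        return cosine > threshold

    return is_honest


def filtered_mean(
    gradients: Sequence[Sequence[float]],
    is_honest: Callable[[Sequence[float]], bool],
) -> Optional[Vector]:
    """Average only the gradients the classifier accepts.

    Each gradient is classified independently of the others, so no bound on
    the number of Byzantine workers is needed. Returns None if all are rejected.
    """
    accepted = [g for g in gradients if is_honest(g)]
    if not accepted:
        return None
    dim = len(accepted[0])
    return [sum(g[i] for g in accepted) / len(accepted) for i in range(dim)]


if __name__ == "__main__":
    # Three of five workers send adversarial gradients (a Byzantine majority).
    worker_gradients = [
        [1.0, 0.9],
        [0.8, 1.1],
        [-5.0, -5.0],
        [-100.0, -90.0],
        [-3.0, -2.0],
    ]
    classifier = make_cosine_filter(reference=[1.0, 1.0])
    print(filtered_mean(worker_gradients, classifier))
```

Because the decision uses only the normalised direction, rescaling a gradient does not change whether it is accepted, and the work done per gradient does not grow with the number of workers.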


03/17/2019

Zeno++: robust asynchronous SGD with arbitrary number of Byzantine workers

We propose Zeno++, a new robust asynchronous Stochastic Grad...
06/24/2020

Befriending The Byzantines Through Reputation Scores

We propose two novel stochastic gradient descent algorithms, ByGARS and ...
03/08/2023

Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient

Distributed optimization with open collaboration is a popular field sinc...
06/16/2020

Byzantine-Robust Learning on Heterogeneous Datasets via Resampling

In Byzantine robust distributed optimization, a central server wants to ...
04/26/2018

Securing Distributed Machine Learning in High Dimensions

We consider securing a distributed machine learning system wherein the d...
02/03/2022

Byzantine-Robust Decentralized Learning via Self-Centered Clipping

In this paper, we study the challenging task of Byzantine-robust decentr...
07/27/2021

Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning

Stragglers, Byzantine workers, and data privacy are the main bottlenecks...