Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning

07/27/2021
by Tingting Tang, et al.

Stragglers, Byzantine workers, and data privacy are the main bottlenecks in distributed cloud computing. Several prior works proposed coded computing strategies to jointly address all three challenges, but they require either a large number of workers, significant communication cost, or significant computational complexity to tolerate malicious workers. Much of the overhead in these schemes comes from the fact that they tightly couple the coding for all three problems into a single framework. In this work, we propose the Verifiable Coded Computing (VCC) framework, which decouples the Byzantine node detection challenge from straggler tolerance. VCC leverages coded computing solely for handling stragglers and privacy, and then uses an orthogonal approach, verifiable computing, to tackle Byzantine nodes. Furthermore, VCC dynamically adapts its coding scheme to trade off straggler tolerance against Byzantine protection and vice versa. We evaluate VCC on a compute-intensive distributed logistic regression application. Our experiments show that VCC speeds up the conventional uncoded implementation of distributed logistic regression by 3.2×-6.9×, and also improves the test accuracy by up to 12.6%.
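To make the decoupling concrete, here is a minimal NumPy sketch of the idea, using hypothetical names and toy parameters that are not from the paper: a polynomial (Vandermonde) code over data blocks provides straggler tolerance, while a Freivalds-style randomized check screens each worker's reply for Byzantine corruption before decoding. The paper's actual construction, including its privacy guarantees and adaptive coding, is more involved; this only illustrates how verification can be layered orthogonally on top of the code.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_blocks(blocks, eval_points):
    """Polynomial (Vandermonde) code: coded block j = sum_i blocks[i] * x_j**i.
    Any k of the n coded blocks suffice to decode, giving straggler tolerance."""
    k = len(blocks)
    return [sum(blocks[i] * (x ** i) for i in range(k)) for x in eval_points]

def freivalds_verify(A_coded, B, Y, trials=3):
    """Randomized check that Y == A_coded @ B in O(n^2) per trial rather than
    recomputing the O(n^3) product (classic Freivalds draws a 0/1 vector)."""
    for _ in range(trials):
        r = rng.standard_normal((B.shape[1], 1))
        if not np.allclose(A_coded @ (B @ r), Y @ r):
            return False
    return True

def decode(results, eval_points, k):
    """Invert the Vandermonde system to recover the k uncoded products
    from any k verified coded results."""
    V = np.vander(np.asarray(eval_points, dtype=float), N=k, increasing=True)
    C = np.linalg.inv(V)
    return [sum(C[i, j] * results[j] for j in range(k)) for i in range(k)]

# Toy run: k = 2 data blocks encoded across n = 4 workers.
k, xs = 2, [1.0, 2.0, 3.0, 4.0]
A = rng.standard_normal((4, 6))
B = rng.standard_normal((6, 3))
blocks = np.split(A, k)
coded = encode_blocks(blocks, xs)

# Worker 2 is Byzantine (returns garbage); worker 3 straggles (no reply).
replies = {0: coded[0] @ B, 1: coded[1] @ B,
           2: rng.standard_normal((coded[2].shape[0], B.shape[1]))}

verified = [(xs[j], Y) for j, Y in replies.items()
            if freivalds_verify(coded[j], B, Y)]
pts, results = zip(*verified[:k])          # any k verified replies suffice
recovered = decode(list(results), list(pts), k)
assert np.allclose(np.vstack(recovered), A @ B)
```

The decoder needs only k verified replies, so each additional worker can be spent on either straggler slack or Byzantine screening, which is the tradeoff the abstract describes.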


