A Practical Algorithm Design and Evaluation for Heterogeneous Elastic Computing with Stragglers

07/18/2021
by   Nicholas Woolsey, et al.
0

Our extensive real measurements over Amazon EC2 show that the virtual instances often have different computing speeds even if they share the same configurations. This motivates us to study heterogeneous Coded Storage Elastic Computing (CSEC) systems where machines, with different computing speeds, join and leave the network arbitrarily over different computing steps. In CSEC systems, a Maximum Distance Separable (MDS) code is used for coded storage such that the file placement does not have to be redefined with each elastic event. Computation assignment algorithms are used to minimize the computation time given computation speeds of different machines. While previous studies of heterogeneous CSEC do not include stragglers-the slow machines during the computation, we develop a new framework in heterogeneous CSEC that introduces straggler tolerance. Based on this framework, we design a novel algorithm using our previously proposed approach for heterogeneous CSEC such that the system can handle any subset of stragglers of a specified size while minimizing the computation time. Furthermore, we establish a trade-off in computation time and straggler tolerance. Another major limitation of existing CSEC designs is the lack of practical evaluations using real applications. In this paper, we evaluate the performance of our designs on Amazon EC2 for applications of the power iteration and linear regression. Evaluation results show that the proposed heterogeneous CSEC algorithms outperform the state-of-the-art designs by more than 30

READ FULL TEXT
research
01/12/2020

Heterogeneous Computation Assignments in Coded Elastic Computing

We study the optimal design of a heterogeneous coded elastic computing (...
research
08/12/2020

Coded Elastic Computing on Machines with Heterogeneous Storage and Computation Speed

We study the optimal design of heterogeneous Coded Elastic Computing (CE...
research
07/20/2021

A New Design Framework for Heterogeneous Uncoded Storage Elastic Computing

Elasticity is one important feature in modern cloud computing systems an...
research
12/16/2018

Coded Elastic Computing

Cloud providers have recently introduced low-priority machines to reduce...
research
06/19/2022

Hierarchical coded elastic computing

Elasticity is offered by cloud service providers to exploit under-utiliz...
research
12/29/2019

On Batch-Processing Based Coded Computing for Heterogeneous Distributed Computing Systems

In recent years, coded distributed computing (CDC) has attracted signifi...
research
12/05/2021

Design Trade-offs for a Robust Dynamic Hybrid Hash Join (Extended Version)

The Join operator, as one of the most expensive and commonly used operat...

Please sign up or login with your details

Forgot password? Click here to reset