Coded Elastic Computing

12/16/2018
by Yaoqing Yang, et al.

Cloud providers have recently introduced low-priority machines to reduce the cost of computations. Exploiting this opportunity for machine learning tasks is challenging because low-priority machines can elastically leave (through preemption) and join the computation at any time. In this paper, we design a new technique called coded elastic computing that enables distributed machine learning computations over elastic resources. The proposed technique allows machines to transparently leave the computation without sacrificing algorithm-level performance and, at the same time, flexibly reduces the workload at existing machines when new machines join the computation. Thanks to the redundancy provided by encoding, our approach achieves a computational cost similar to that of the original (uncoded) method when all machines are present; the cost increases gracefully when machines are preempted and decreases when machines join. We test the performance of the proposed technique on two mini-benchmark experiments, namely elastic matrix multiplication and linear regression. Our preliminary experimental results show improvements over several existing techniques.
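
To make the role of encoding concrete, the following is a minimal sketch of the redundancy idea only, not the paper's scheme. It assumes a Vandermonde generator matrix over exact fractions, a data matrix split into k = 2 row blocks, and n = 4 machines; the helper names (encode, matvec, solve, decode) are hypothetical. It shows that a coded matrix-vector product can be recovered from any k surviving machines, and it does not attempt the workload reallocation that the abstract describes for machines joining or leaving.

```python
from fractions import Fraction
from itertools import combinations

def encode(blocks, n):
    """Encode k data blocks into n coded blocks (assumed Vandermonde code)."""
    k = len(blocks)
    # Row i of G is [1, t, t^2, ...] with t = i + 1; any k rows are invertible.
    G = [[Fraction(i + 1) ** j for j in range(k)] for i in range(n)]
    rows, cols = len(blocks[0]), len(blocks[0][0])
    coded = []
    for g in G:
        # Each coded block is an entrywise linear combination of the data blocks.
        coded.append([[sum(g[j] * blocks[j][r][c] for j in range(k))
                       for c in range(cols)] for r in range(rows)])
    return G, coded

def matvec(block, x):
    """Local work on one machine: coded block times the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in block]

def solve(M, rhs):
    """Solve M Z = rhs exactly by Gauss-Jordan elimination over Fractions."""
    k = len(M)
    aug = [list(M[i]) + list(rhs[i]) for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [v * inv for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def decode(G, results, alive):
    """Recover [A_1 x, ..., A_k x] from any k machines that are still alive."""
    k = len(G[0])
    chosen = sorted(alive)[:k]
    return solve([G[i] for i in chosen], [results[i] for i in chosen])

# Toy data: A has 4 rows, split into k = 2 blocks; n = 4 machines.
A_blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
x = [1, 1]
G, coded = encode(A_blocks, n=4)
results = [matvec(block, x) for block in coded]  # one result per machine

# Uncoded reference: A_j x for each block.
expected = [matvec(block, x) for block in A_blocks]

# Any 2 of the 4 machines may be preempted; the rest still suffice.
for alive in combinations(range(4), 2):
    assert decode(G, results, alive) == expected
```

In this toy setting each machine stores one coded block of the same size as an uncoded block, so the per-machine work when nothing fails matches the uncoded split across k machines; the extra n - k machines are what tolerate preemption. Exact fractions are used here only to avoid floating-point conditioning issues with Vandermonde matrices.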
