Benchmarking Resource Usage for Efficient Distributed Deep Learning

by Nathan C. Frey, et al.

Deep learning (DL) workflows demand an ever-increasing budget of compute and energy to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototyping consume immense resources, which can prevent resource-constrained researchers from experimenting with large models and which carry considerable environmental impact. It is therefore essential to understand how the training of different deep neural networks (DNNs) leverages increasing compute and energy resources, especially for specialized, computationally intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains and tasks (natural language processing, computer vision, and chemistry) on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms, such as limits on power utilization and GPU clock rate, to capture and illustrate the trade-offs and scaling behaviors each representative model exhibits under various resource-constrained and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization by selectively reducing energy consumption for different deep learning tasks and workflows with minimal impact on training.
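
As a rough illustration of what fitting a power law to training time can look like, the sketch below assumes a form T = a * N^(-b), where N is the number of GPUs and T is the training time, and estimates a and b by ordinary least squares in log-log space. The functional form, the function name `fit_power_law`, and the data points are all assumptions made for illustration; the data are synthetic and are not results from the paper, and the paper's own fitting procedure may differ.

```python
import math


def fit_power_law(n_gpus, train_time):
    """Fit train_time ~= a * n_gpus**(-b) by least squares on log-transformed data.

    Hypothetical helper for illustration; returns the tuple (a, b).
    """
    log_n = [math.log(n) for n in n_gpus]
    log_t = [math.log(t) for t in train_time]
    count = len(log_n)
    mean_n = sum(log_n) / count
    mean_t = sum(log_t) / count
    # Slope and intercept of the best-fit line log(T) = intercept + slope * log(N).
    s_nn = sum((x - mean_n) ** 2 for x in log_n)
    s_nt = sum((x - mean_n) * (y - mean_t) for x, y in zip(log_n, log_t))
    slope = s_nt / s_nn
    intercept = mean_t - slope * mean_n
    return math.exp(intercept), -slope


# Synthetic data generated from a known power law (NOT measurements from the paper).
gpus = [2, 4, 8, 16, 32, 64]
hours = [100.0 * n ** -0.8 for n in gpus]

a, b = fit_power_law(gpus, hours)
print(f"a = {a:.3f}, b = {b:.3f}")
```

In this parameterization, b = 1 would correspond to ideal linear speedup with added GPUs, and smaller values of b indicate diminishing returns. The same log-log fit could in principle be applied with a power cap or clock-rate limit as the independent variable, again as an assumption rather than the paper's stated method.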



The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study

Over the past years, great progress has been made in improving the compu...

Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models

The energy requirements of current natural language processing models co...

NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision

Mobile vision systems such as smartphones, drones, and augmented-reality...

DNNAbacus: Toward Accurate Computational Cost Prediction for Deep Neural Networks

Deep learning is attracting interest across a variety of domains, includ...

Measuring what Really Matters: Optimizing Neural Networks for TinyML

With the surge of inexpensive computational and memory resources, neural...

Serving MoE Models on Resource-constrained Edge Devices via Dynamic Expert Swapping

Mixture of experts (MoE) is a popular technique in deep learning that im...

DecisiveNets: Training Deep Associative Memories to Solve Complex Machine Learning Problems

Learning deep representations to solve complex machine learning tasks ha...
