Benchmarking Neural Network Training Algorithms

by   George E. Dahl, et al.

Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. In this work, using concrete experiments, we argue that real progress in speeding up training requires new benchmarks that resolve three basic challenges faced by empirical comparisons of training algorithms: (1) how to decide when training is complete and precisely measure training time, (2) how to handle the sensitivity of measurements to exact workload details, and (3) how to fairly compare algorithms that require hyperparameter tuning. In order to address these challenges, we introduce a new, competitive, time-to-result benchmark using multiple workloads running on fixed hardware, the AlgoPerf: Training Algorithms benchmark. Our benchmark includes a set of workload variants that make it possible to detect benchmark submissions that are more robust to workload changes than current widely-used methods. Finally, we evaluate baseline submissions constructed using various optimizers that represent current practice, as well as other optimizers that have recently received attention in the literature. These baseline results collectively demonstrate the feasibility of our benchmark, show that non-trivial gaps between methods exist, and set a provisional state-of-the-art for future benchmark submissions to try and surpass.


page 1

page 2

page 3

page 4


BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite

Several fundamental changes in technology indicate domain-specific hardw...

How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers

Many optimizers have been proposed for training deep neural networks, an...

BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite

As architecture, system, data management, and machine learning communiti...

Autonomic Architecture for Big Data Performance Optimization

The big data software stack based on Apache Spark and Hadoop has become ...

The acute:chronic workload ratio: challenges and prospects for improvement

Injuries occur when an athlete performs a greater amount of activity (wo...

AI Matrix - Synthetic Benchmarks for DNN

Deep neural network (DNN) architectures, such as convolutional neural ne...

Benchmarking TinyML Systems: Challenges and Direction

Recent advancements in ultra-low-power machine learning (TinyML) hardwar...

Please sign up or login with your details

Forgot password? Click here to reset