Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

by Maurizio Capra, et al.

Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications, ranging from computer vision for medicine to autonomous driving, as well as in sectors such as security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks that require significant computational power during both training and inference. A single inference of a DL model may require billions of multiply-and-accumulate (MAC) operations, making DL extremely compute- and energy-hungry. In scenarios where several sophisticated algorithms must be executed with limited energy and low latency, cost-effective hardware platforms capable of energy-efficient DL execution are needed. This paper first introduces the key properties of two brain-inspired models, the Deep Neural Network (DNN) and the Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance designs. It summarizes and compares work on four leading platforms for executing these algorithms, namely CPU, GPU, FPGA, and ASIC, and describes the main state-of-the-art solutions, giving particular prominence to the last two since they offer greater design flexibility and bear the potential for high energy efficiency, especially for inference. In addition to hardware solutions, this paper discusses some of the important security issues that DNN and SNN models may face during execution, and offers a comprehensive section on benchmarking that explains how to assess the quality of different networks and the hardware systems designed for them.
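To give a feel for the "billions of multiply-and-accumulate operations" figure, the following is a minimal sketch that counts MACs for a single convolutional layer and a single fully connected layer. The helper names (`conv2d_macs`, `dense_macs`) and the layer shapes are illustrative assumptions, not taken from the paper, and the count ignores bias, padding effects, and grouped convolutions.

```python
def conv2d_macs(out_h, out_w, out_channels, in_channels, kernel_h, kernel_w):
    """MACs for one standard 2D convolution (no bias, no grouping).

    Each output element needs one MAC per input channel per kernel position.
    """
    return out_h * out_w * out_channels * in_channels * kernel_h * kernel_w


def dense_macs(in_features, out_features):
    """MACs for one fully connected layer (no bias)."""
    return in_features * out_features


# Hypothetical layer shapes, chosen only for illustration:
# a 3x3 convolution mapping 64 -> 64 channels on a 224x224 output map,
# and a 4096 -> 1000 fully connected layer.
conv = conv2d_macs(224, 224, 64, 64, 3, 3)
fc = dense_macs(4096, 1000)

print(f"conv layer:  {conv:,} MACs")   # 1,849,688,064
print(f"dense layer: {fc:,} MACs")     # 4,096,000
```

Under these assumed shapes, one convolutional layer alone already needs about 1.85 billion MACs, which suggests why a full deep network can be so compute- and energy-hungry at inference time.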




