Binarized Convolutional Neural Networks for Efficient Inference on GPUs

08/01/2018
by   Mir Khan, et al.
0

Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive,which can make their feasible mplementation on embedded and low-power devices difficult. In this paper convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well-suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speed up of 7. 4X with only 4.4 implementation.

READ FULL TEXT
research
09/24/2018

No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

For successful deployment of deep neural networks on highly--resource-co...
research
07/28/2020

Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU

Binary convolutional networks have lower computational load and lower me...
research
08/22/2018

An Overview of Datatype Quantization Techniques for Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are becoming increasingly popular d...
research
02/23/2020

PoET-BiN: Power Efficient Tiny Binary Neurons

The success of neural networks in image classification has inspired vari...
research
09/29/2022

Tuning of Mixture-of-Experts Mixed-Precision Neural Networks

Deep learning has become a useful data analysis method, however mainstre...
research
05/28/2018

Convolutional neural network compression for natural language processing

Convolutional neural networks are modern models that are very efficient ...
research
09/06/2017

Embedded Binarized Neural Networks

We study embedded Binarized Neural Networks (eBNNs) with the aim of allo...

Please sign up or login with your details

Forgot password? Click here to reset