Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions

03/27/2018
by   Zheng Qin, et al.

Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and multiply-adds. However, training depthwise convolution layers on GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize GPU capacity. To address this problem, in this paper we present an efficient method, called diagonalwise refactorization, for accelerating the training of depthwise convolution layers. Our key idea is to rearrange the weight vectors of a depthwise convolution into a large diagonal weight matrix, converting the depthwise convolution into a single standard convolution, which is well supported by the cuDNN library and highly optimized for GPU computation. We have implemented our training method in five popular deep learning frameworks. Evaluation results show that our proposed method gains a 15.4× training speedup on Darknet, 8.4× on Caffe, 5.4× on PyTorch, 3.5× on MXNet, and 1.4× on TensorFlow, compared to their original implementations of depthwise convolutions.
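The rearrangement described above can be illustrated with a minimal NumPy sketch (not the paper's implementation; the function names and the naive convolution loops are purely illustrative). Each per-channel depthwise kernel is placed on the diagonal of a standard-convolution weight tensor, with all cross-channel entries set to zero, so a standard convolution with the refactored weights reproduces the depthwise result:

```python
import numpy as np

def depthwise_conv(x, w):
    """Naive depthwise convolution, 'valid' padding.
    x: (C, H, W) input; w: (C, k, k) one kernel per channel."""
    C, H, W = x.shape
    k = w.shape[1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * w[c])
    return out

def standard_conv(x, w):
    """Naive standard convolution, 'valid' padding.
    x: (Cin, H, W); w: (Cout, Cin, k, k)."""
    Cout, Cin, k, _ = w.shape
    _, H, W = x.shape
    out = np.zeros((Cout, H - k + 1, W - k + 1))
    for o in range(Cout):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[o])
    return out

def diagonalwise_refactor(w_dw):
    """Rearrange per-channel kernels (C, k, k) into a diagonal
    standard-conv weight tensor (C, C, k, k): channel c's kernel
    sits at position [c, c]; all off-diagonal entries are zero."""
    C, k, _ = w_dw.shape
    w_std = np.zeros((C, C, k, k))
    for c in range(C):
        w_std[c, c] = w_dw[c]
    return w_std

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))   # 4 channels, 8x8 spatial
w = rng.standard_normal((4, 3, 3))   # one 3x3 kernel per channel

y_dw = depthwise_conv(x, w)
y_std = standard_conv(x, diagonalwise_refactor(w))
assert np.allclose(y_dw, y_std)      # the two paths agree
```

In a real framework, the single standard convolution on the refactored weights can then be dispatched to cuDNN's heavily optimized kernels, at the cost of the extra zero multiplications off the diagonal.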


