Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

by   Rod Burns, et al.

Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies and researchers, coupled with the huge amounts of compute power available in modern accelerators such as GPUs, FPGAs and ASICs. There are a number of approaches available to developers for utilizing GPGPU technologies such as SYCL, OpenCL and CUDA, however many applications require the same low level mathematical routines. Libraries dedicated to accelerating these common routines allow developers to easily make full use of the available hardware without requiring low level knowledge of the hardware themselves, however such libraries are often provided by hardware manufacturers for specific hardware such as cuDNN for Nvidia hardware or MIOpen for AMD hardware. SYCL-DNN is a new open-source library dedicated to providing accelerated routines for neural network operations which are hardware and vendor agnostic. Built on top of the SYCL open standard and written entirely in standard C++, SYCL-DNN allows a user to easily accelerate neural network code for a wide range of hardware using a modern C++ interface. The library is tested on AMD's OpenCL for GPU, Intel's OpenCL for CPU and GPU, ARM's OpenCL for Mali GPUs as well as ComputeAorta's OpenCL for R-Car CV engine and host CPU. In this talk we will present performance figures for SYCL-DNN on this range of hardware, and discuss how high performance was achieved on such a varied set of accelerators with such different hardware features.


page 1

page 2

page 3

page 4


A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications

In recent years, deep neural networks (DNNs), have yielded strong result...

EvoJAX: Hardware-Accelerated Neuroevolution

Evolutionary computation has been shown to be a highly effective method ...

SMaLL: A Software Framework for portable Machine Learning Libraries

Interest in deploying Deep Neural Network (DNN) inference on edge device...

cltorch: a Hardware-Agnostic Backend for the Torch Deep Neural Network Library, Based on OpenCL

This paper presents cltorch, a hardware-agnostic backend for the Torch n...

De-specializing an HLS library for Deep Neural Networks: improvements upon hls4ml

Custom hardware accelerators for Deep Neural Networks are increasingly p...

High-performance evaluation of high angular momentum 4-center Gaussian integrals on modern accelerated processors

We present a high-performance evaluation method for 4-center 2-particle ...

TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU

Energy efficiency of hardware accelerators of deep neural networks (DNN)...

Please sign up or login with your details

Forgot password? Click here to reset