The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding

10/19/2021
by Pratik Fegade, et al.

There is often variation in the shape and size of input data used for deep learning. In many cases, such data can be represented using tensors with non-uniform shapes, or ragged tensors. Due to limited and non-portable support for efficient execution on ragged tensors, current deep learning frameworks generally use techniques such as padding and masking to make the data shapes uniform and then offload the computations to optimized kernels for dense tensor algebra. Such techniques can, however, lead to significant wasted computation and, therefore, a loss in performance. This paper presents CoRa, a tensor compiler that allows users to easily generate efficient code for ragged tensor operators targeting a wide range of CPUs and GPUs. Evaluating CoRa on a variety of operators on ragged tensors as well as on an encoder layer of the transformer model, we find that CoRa (i) performs competitively with hand-optimized implementations of the operators and the transformer encoder and (ii) achieves, over PyTorch, a 1.6X geomean speedup for the encoder on an Nvidia GPU and a 1.86X geomean speedup for the multi-head attention module used in transformers on an ARM CPU.
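To make the padding-and-masking baseline concrete, below is a minimal sketch of how a dense-kernel framework typically handles a ragged batch of variable-length sequences. It uses PyTorch's standard pad_sequence utility and is purely illustrative of the approach the abstract contrasts with; it is not CoRa's API, and the variable names and shapes are chosen only for the example.

```python
# Illustrative sketch (not CoRa's API): pad a ragged batch of
# variable-length sequences into a dense tensor plus a boolean mask,
# as dense-kernel frameworks such as PyTorch typically require.
import torch
from torch.nn.utils.rnn import pad_sequence

# A ragged "batch": three sequences with different lengths, same feature dim.
seqs = [torch.randn(5, 8), torch.randn(2, 8), torch.randn(9, 8)]

# Pad to the longest sequence so the batch becomes a dense (3, 9, 8) tensor.
padded = pad_sequence(seqs, batch_first=True)

# Mask marking real (non-padded) positions; padded positions are wasted work
# for any dense kernel applied to `padded`.
lengths = torch.tensor([s.shape[0] for s in seqs])                  # tensor([5, 2, 9])
mask = torch.arange(padded.shape[1])[None, :] < lengths[:, None]    # shape (3, 9)

# Fraction of the padded batch that is pure padding.
waste = 1.0 - mask.float().mean().item()
print(f"padded shape: {tuple(padded.shape)}, wasted fraction: {waste:.2f}")
```

In this toy batch roughly 40% of the padded tensor is padding, which is the kind of wasted computation a compiler that operates directly on ragged tensors aims to avoid.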

