High-Performance Tensor Contraction without Transposition

07/01/2016
by Devin A. Matthews, et al.

Tensor computations, in particular tensor contraction (TC), are important kernels in many scientific computing applications. Due to the fundamental similarity of TC to matrix multiplication (MM) and to the availability of optimized implementations such as the BLAS, tensor operations have traditionally been implemented in terms of BLAS operations, incurring both a performance and a storage overhead. Instead, we implement TC using the flexible BLIS framework, which allows for transposition (reshaping) of the tensor to be fused with internal partitioning and packing operations, requiring no explicit transposition operations or additional workspace. This implementation, TBLIS, achieves performance approaching that of MM, and in some cases considerably higher than that of traditional TC. Our implementation supports multithreading using an approach identical to that used for MM in BLIS, with similar performance characteristics. The complexity of managing tensor-to-matrix transformations is also handled automatically in our approach, greatly simplifying its use in scientific applications.
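To make the contrast in the abstract concrete, the following is a minimal sketch in plain Python, not TBLIS code. It computes the contraction C[a,b] = sum over c,d of A[c,a,d] * B[d,b,c] in two ways. The first is the traditional route, which copies each tensor into a matrix-shaped workspace and then multiplies. The second reads the original flat buffers through precomputed offset tables, so no transposed copy is ever made. The example contraction, the function names, and the offset-table device are assumptions chosen for illustration; the abstract does not specify how TBLIS performs this fusion internally.

```python
from itertools import product


def strides_for(shape):
    """Row-major strides for a dense tensor stored in a flat list."""
    strides, step = [], 1
    for extent in reversed(shape):
        strides.insert(0, step)
        step *= extent
    return strides


def offsets(shape, strides, picked):
    """Flat offsets for every index combination of the picked axes."""
    ranges = [range(shape[ax]) for ax in picked]
    return [sum(i * strides[ax] for i, ax in zip(idx, picked))
            for idx in product(*ranges)]


def contract_ttgt(A, B, na, nb, nc, nd):
    """Traditional route: transpose into matrices, then multiply."""
    # Explicit copies: workspace as large as A and B themselves.
    A_mat = [[A[(c * na + a) * nd + d] for c in range(nc) for d in range(nd)]
             for a in range(na)]
    B_mat = [[B[(d * nb + b) * nc + c] for b in range(nb)]
             for c in range(nc) for d in range(nd)]
    return [[sum(A_mat[a][k] * B_mat[k][b] for k in range(nc * nd))
             for b in range(nb)] for a in range(na)]


def contract_direct(A, B, na, nb, nc, nd):
    """Same result, reading A and B in place through offset tables."""
    a_shape, b_shape = (nc, na, nd), (nd, nb, nc)
    sa, sb = strides_for(a_shape), strides_for(b_shape)
    a_rows = offsets(a_shape, sa, [1])      # free index a
    a_cols = offsets(a_shape, sa, [0, 2])   # contracted (c, d)
    b_rows = offsets(b_shape, sb, [2, 0])   # contracted (c, d), same order
    b_cols = offsets(b_shape, sb, [1])      # free index b
    return [[sum(A[ra + ka] * B[kb + cb] for ka, kb in zip(a_cols, b_rows))
             for cb in b_cols] for ra in a_rows]


# Small illustrative sizes and data (arbitrary values).
na, nb, nc, nd = 2, 3, 4, 5
A = [float(i % 7) for i in range(nc * na * nd)]   # shape (nc, na, nd)
B = [float(i % 5) for i in range(nd * nb * nc)]   # shape (nd, nb, nc)
C_ttgt = contract_ttgt(A, B, na, nb, nc, nd)
C_direct = contract_direct(A, B, na, nb, nc, nd)
```

Both functions return the same na-by-nb result. The difference is that `contract_ttgt` allocates `A_mat` and `B_mat`, while `contract_direct` only builds index tables whose size grows with the sum of the matrix dimensions rather than their product. In a high-performance setting the in-place reads would be folded into the packing step of the matrix-multiplication kernel, which is the fusion the abstract describes.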


