Differentiating a Tensor Language

08/25/2020
by Gilbert Bernstein, et al.

How does one compile derivatives of tensor programs such that the resulting code is purely functional (hence easier to optimize and parallelize) and provably efficient relative to the original program? We show that naively differentiating tensor code, as done in popular systems like TensorFlow and PyTorch, can cause asymptotic slowdowns in pathological cases, violating the Cheap Gradients Principle. However, all existing automatic differentiation methods that guarantee this principle (for variable-size data) do so by relying on += mutation through aliases/pointers, which complicates downstream optimization. We provide the first purely functional, provably efficient, adjoint/reverse-mode derivatives of array/tensor code by explicitly accounting for sparsity. We do this by focusing on the indicator function from Iverson's APL. We also introduce a new "Tensor SSA" normal form and a new derivation of reverse-mode automatic differentiation based on the universal property of inner products.
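As a rough illustration of why sparsity matters here, consider the adjoint of a single array read, y = x[i]. The sketch below is not the paper's algorithm; the function names and the dictionary-based sparse representation are assumptions made purely for illustration. It contrasts a dense adjoint, which materializes a full length-n vector for one read, with a representation that keeps only the single nonzero entry selected by the indicator [i == j].

```python
# Illustrative sketch only: names and representations are hypothetical,
# not taken from the paper.

def iverson(cond):
    """Iverson bracket [cond]: 1.0 if cond holds, else 0.0."""
    return 1.0 if cond else 0.0

def dense_adjoint_of_index(n, i, dy):
    """Adjoint of y = x[i] as a dense vector: dx[j] = [i == j] * dy.

    Costs O(n) work even though the original read was O(1).
    """
    return [iverson(i == j) * dy for j in range(n)]

def sparse_adjoint_of_index(i, dy):
    """Same adjoint, storing only the one nonzero entry: O(1) work."""
    return {i: dy}

def to_dense(n, sparse):
    """Expand the sparse form to a dense list, for comparison only."""
    return [sparse.get(j, 0.0) for j in range(n)]

n, i, dy = 5, 2, 3.0
dense = dense_adjoint_of_index(n, i, dy)
sparse = sparse_adjoint_of_index(i, dy)
```

Both forms describe the same gradient, but a program that performs n such reads would do O(n^2) work with the dense form, compared with O(n) for the original computation. That gap is the kind of asymptotic slowdown the abstract refers to.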

research
12/19/2022

Denotationally Correct, Purely Functional, Efficient Reverse-mode Automatic Differentiation

Reverse-mode differentiation is used for optimization, but it introduces...
research
07/07/2023

Efficient CHAD

We show how the basic Combinatory Homomorphic Automatic Differentiation ...
research
10/01/2021

CHAD for Expressive Total Languages

We show how to apply forward and reverse mode Combinatory Homomorphic Au...
research
10/07/2020

A Simple and Efficient Tensor Calculus for Machine Learning

Computing derivatives of tensor expressions, also known as tensor calcul...
research
07/10/2020

Reverse AD at Higher Types: Pure, Principled and Denotationally Correct

We show how to define source-code transformations for forward- and rever...
research
09/23/2015

The Stan Math Library: Reverse-Mode Automatic Differentiation in C++

As computational challenges in optimization and statistical inference gr...
research
02/21/2022

AD for an Array Language with Nested Parallelism

We present a technique for applying (forward and) reverse-mode automatic...
