Piecewise-Linear Activations or Analytic Activation Functions: Which Produce More Expressive Neural Networks?

04/24/2022
by   Anastasis Kratsios, et al.

Many currently available universal approximation theorems affirm that deep feedforward networks defined using any suitable activation function can approximate any integrable function locally in the L^1-norm. Though different approximation rates are available for deep neural networks defined using other classes of activation functions, there is little explanation for the empirically confirmed advantage that ReLU networks exhibit over their classical (e.g. sigmoidal) counterparts. Our main result demonstrates that deep networks with piecewise-linear activation functions (e.g. ReLU or PReLU) are fundamentally more expressive than deep feedforward networks with analytic activation functions (e.g. sigmoid, Swish, GeLU, or Softplus). More specifically, we construct a strict refinement of the topology on the space L^1_loc(ℝ^d, ℝ^D) of locally Lebesgue-integrable functions in which the set of deep ReLU networks with (bilinear) pooling, NN^ReLU+Pool, is dense (i.e. universal), whereas the set of deep feedforward networks defined using any combination of analytic activation functions, with or without pooling layers, NN^ω+Pool, is not dense (i.e. not universal). We further explain this "separation phenomenon" between the networks in NN^ReLU+Pool and those in NN^ω+Pool quantitatively, by showing that the networks in NN^ReLU can approximate any compactly supported Lipschitz function while simultaneously approximating its essential support, whereas the networks in NN^ω+Pool cannot.
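A minimal numerical sketch of the mechanism behind this separation, assuming only the standard ReLU and tanh functions (the hat network below and the chosen interval [0, 1] are illustrative, not taken from the paper): a two-layer ReLU network can be exactly zero outside a compact set, so it can match the essential support of a target function, whereas any non-zero real-analytic function cannot vanish on an open interval without vanishing identically.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # Two-layer ReLU network: equals x on [0, 0.5], 1 - x on [0.5, 1],
    # and is *exactly* zero outside [0, 1], so its essential support is [0, 1].
    return relu(x) - 2.0 * relu(x - 0.5) + relu(x - 1.0)

xs = np.linspace(-2.0, 3.0, 11)
print(hat(xs))  # zero outside [0, 1], piecewise linear inside

# By contrast, an affine combination of analytic activations such as tanh
# remains real-analytic, so it can only be approximately zero outside [0, 1];
# it can never vanish on an open set unless it is identically zero.

Evaluating hat on the grid above returns exact zeros at every point outside [0, 1], which is the property that no network built purely from analytic activations can reproduce.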


