Searching for Activation Functions

10/16/2017
by Prajit Ramachandran et al.

The choice of activation functions in deep networks has a significant effect on the training dynamics and task performance. Currently, the most successful and widely-used activation function is the Rectified Linear Unit (ReLU). Although various hand-designed alternatives to ReLU have been proposed, none have managed to replace it due to inconsistent gains. In this work, we propose to leverage automatic search techniques to discover new activation functions. Using a combination of exhaustive and reinforcement learning-based search, we discover multiple novel activation functions. We verify the effectiveness of the searches by conducting an empirical evaluation with the best discovered activation function. Our experiments show that the best discovered activation function, f(x) = x · sigmoid(βx), which we name Swish, tends to work better than ReLU on deeper models across a number of challenging datasets. For example, simply replacing ReLUs with Swish units improves top-1 classification accuracy on ImageNet by 0.9% for Mobile NASNet-A and 0.6% for Inception-ResNet-v2. The simplicity of Swish and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network.
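For concreteness, here is a minimal NumPy sketch of the Swish function defined in the abstract; the function name swish and the beta keyword are illustrative choices (the paper treats β as either a fixed constant or a trainable parameter).

    import numpy as np

    def swish(x, beta=1.0):
        """Swish activation: f(x) = x * sigmoid(beta * x).

        With beta = 1.0 this reduces to the fixed form sometimes
        called SiLU; the paper also considers a trainable beta.
        """
        return x * (1.0 / (1.0 + np.exp(-beta * x)))

    # Example: unlike ReLU, Swish is smooth and non-monotonic
    # (it dips slightly below zero for negative inputs).
    x = np.array([-2.0, 0.0, 2.0])
    print(swish(x))  # approx. [-0.2384, 0.0, 1.7616]

With β = 1 the curve tracks ReLU closely for large positive inputs while remaining differentiable everywhere, which is why swapping ReLUs for Swish units requires no other architectural changes.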


Related research

08/23/2019
Mish: A Self Regularized Non-Monotonic Neural Activation Function
The concept of non-linearity in a Neural Network is introduced by an act...

04/08/2021
Learning specialized activation functions with the Piecewise Linear Unit
The choice of activation functions is crucial for modern deep neural net...

01/09/2019
Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks
Activation functions play a crucial role in neural networks because they...

05/05/2015
Empirical Evaluation of Rectified Activations in Convolutional Network
In this paper we investigate the performance of different types of recti...

03/05/2023
Swim: A General-Purpose, High-Performing, and Efficient Activation Function for Locomotion Control Tasks
Activation functions play a significant role in the performance of deep ...

03/01/2020
Soft-Root-Sign Activation Function
The choice of activation function in deep networks has a significant eff...

05/18/2023
Learning Activation Functions for Sparse Neural Networks
Sparse Neural Networks (SNNs) can potentially demonstrate similar perfor...
