A Methodology for Automatic Selection of Activation Functions to Design Hybrid Deep Neural Networks

by Alberto Marchisio, et al.

Activation functions influence the behavior and performance of DNNs. Nonlinear activation functions, such as Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform their linear counterparts. However, selecting an appropriate activation function is challenging, as it affects both the accuracy and the complexity of the given DNN. In this paper, we propose a novel methodology to automatically select the best-possible activation function for each layer of a given DNN, such that the overall accuracy improves over using a single type of activation function for the whole DNN. Exploring all the different configurations of activation functions, however, would be time- and resource-consuming. To address this, our methodology identifies Evaluation Points during learning, at which it evaluates the accuracy at an intermediate step of training and performs early termination by checking the accuracy gradient of the learning curve. This significantly reduces the exploration time during training. Moreover, our methodology selects, for each layer, the dropout rate that optimizes the accuracy. Experiments show that we are able to achieve on average 7 CIFAR-10 and CIFAR-100 benchmarks, with limited performance and power penalty on GPUs.
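The early-termination idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the accuracy gradient is computed or which threshold is used, so the slope window, the `min_gradient` value, the single evaluation point, and the brute-force loop over configurations are all assumptions made for illustration. `train_one_epoch` is a hypothetical callback that returns the validation accuracy after one more epoch of training a given per-layer configuration.

```python
from itertools import product


def accuracy_gradient(curve, window=3):
    """Average per-epoch accuracy change over the last `window` epochs."""
    if len(curve) <= window:
        raise ValueError("need more than `window` accuracy samples")
    return (curve[-1] - curve[-1 - window]) / window


def should_terminate(curve, best_so_far, min_gradient=0.002, window=3):
    """Assumed rule: stop a candidate that trails the best accuracy seen
    so far AND whose learning curve has flattened out."""
    if len(curve) <= window:
        return False  # not enough history to estimate a slope
    return (curve[-1] < best_so_far
            and accuracy_gradient(curve, window) < min_gradient)


def explore(num_layers, candidates, train_one_epoch,
            eval_point=5, max_epochs=20):
    """Try every per-layer assignment of activation functions
    (brute force, for illustration only) and keep the most accurate one."""
    best_cfg, best_acc = None, 0.0
    for cfg in product(candidates, repeat=num_layers):
        curve = []
        for epoch in range(1, max_epochs + 1):
            curve.append(train_one_epoch(cfg, epoch))
            # Check the learning curve only at the evaluation point.
            if epoch == eval_point and should_terminate(curve, best_acc):
                break
        if curve[-1] > best_acc:
            best_cfg, best_acc = cfg, curve[-1]
    return best_cfg, best_acc
```

Calling `explore(2, ["relu", "elu", "selu"], train_one_epoch)` with a real training callback would return the best configuration found and its accuracy. The same loop could be extended with a per-layer dropout rate as a further search dimension, which the abstract mentions but this sketch omits.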




