Stability of Accuracy for the Training of DNNs Via the Uniform Doubling Condition

by Yitzchak Shmalo, et al.

We study the stability of accuracy for the training of deep neural networks. Here the training of a DNN is performed via the minimization of a cross-entropy loss function, and the performance metric is the accuracy (the proportion of objects classified correctly). While training amounts to the decrease of loss, the accuracy does not necessarily increase during training. A recent result by Berlyand, Jabin and Safsten introduces a doubling condition on the training data which ensures the stability of accuracy during training for DNNs with the absolute value activation function. For training data in ℝ^n, this doubling condition is formulated using slabs in ℝ^n, and it depends on the choice of the slabs. The goal of this paper is twofold. First, to make the doubling condition uniform, that is, independent of the choice of slabs, leading to sufficient conditions for stability in terms of the training data only. Second, to extend the original stability results for the absolute value activation function to a broader class of piecewise linear activation functions with finitely many critical points, such as the popular Leaky ReLU.
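To illustrate the class of activation functions the abstract refers to, here is a minimal sketch (function names and the alpha parameter are illustrative, not taken from the paper) of the absolute value activation and Leaky ReLU, both piecewise linear with a single critical point at zero:

```python
import numpy as np

def abs_act(x):
    # Absolute value activation: piecewise linear,
    # slope -1 for x < 0 and slope 1 for x > 0 (one critical point at 0).
    return np.abs(x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: piecewise linear, slope alpha for x < 0 and
    # slope 1 for x >= 0 (again a single critical point at 0).
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(abs_act(x))
print(leaky_relu(x))
```

Both functions are continuous and affine on each of finitely many pieces, which is the structural property the paper's extended stability result relies on.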




