Activation functions are not needed: the ratio net
The function approximator that maps features to labels is a core component of a deep neural network for classification tasks. The main difficulty in designing such an approximator is handling nonlinearity. The usual approach relies on nonlinear activation functions or nonlinear kernel functions and yields classical networks such as the feed-forward neural network (MLP) and the radial basis function network (RBF). Although classical networks such as the MLP are robust on most classification tasks, they are not the most efficient: they require a large number of parameters and long training times. Moreover, the choice of activation function has a non-negligible influence on the effectiveness and efficiency of the network. In this paper, we propose a new network that efficiently finds the function mapping features to labels. Instead of a nonlinear activation function, the proposed network uses a fractional form to overcome the nonlinearity; for convenience, we name it the ratio net. We compare the effectiveness and efficiency of the ratio net against classical networks such as the MLP and the RBF on classification tasks using the MNIST database of handwritten digits and the IMDb dataset, a binary sentiment analysis benchmark. The results show that the ratio net outperforms both the MLP and the RBF.
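To make the idea concrete, below is a minimal PyTorch sketch of what a "fractional form" layer could look like. The abstract does not specify the exact functional form, so the element-wise quotient of two affine maps used here (with an epsilon to keep the denominator away from zero) is an illustrative assumption, not the paper's actual formulation; the class name RatioLayer and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn


class RatioLayer(nn.Module):
    """Hypothetical ratio layer: element-wise quotient of two affine maps.

    Illustrative sketch only; the abstract states that a fractional form
    replaces the nonlinear activation but does not give the exact form.
    The absolute value and eps keep the denominator bounded away from zero.
    """

    def __init__(self, in_features: int, out_features: int, eps: float = 1e-3):
        super().__init__()
        self.numerator = nn.Linear(in_features, out_features)
        self.denominator = nn.Linear(in_features, out_features)
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No activation function: the nonlinearity comes from the division.
        return self.numerator(x) / (self.denominator(x).abs() + self.eps)


# Toy classifier for MNIST-sized inputs (28x28 images, 10 classes).
model = nn.Sequential(
    nn.Flatten(),
    RatioLayer(28 * 28, 128),
    RatioLayer(128, 10),  # logits; pair with nn.CrossEntropyLoss
)

x = torch.randn(32, 1, 28, 28)  # a batch of random fake images
logits = model(x)
print(logits.shape)  # torch.Size([32, 10])
```

Under this reading, the division plays the role that ReLU or a Gaussian kernel plays in the MLP and RBF respectively: it is the sole source of nonlinearity between the linear maps.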