Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with ℓ_1 and ℓ_2 Regularization

11/19/2017
by Zhifeng Kong, et al.

In this paper, we extend the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output. We consider two popular regularization terms, the ℓ_1 and ℓ_2 norms of the parameter vector w, and add them to the square loss function with coefficient λ/2. We prove that when λ is small, the weight vector w converges to the optimal solution ŵ (with respect to the new loss function) with probability ≥ (1-ε)(1-A_d)/2 under random initialization in a sphere centered at the origin, where ε is a small value and A_d is a constant. Numerical experiments, including phase diagrams and repeated simulations, verify our theory.
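To make the setup concrete, below is a minimal sketch (not the authors' code) of gradient descent on the ℓ_2 variant of the regularized loss for a one-ReLU, bias-free network. The teacher vector w_star, the Gaussian input distribution, and the values of λ, the step size, and the sample size are all illustrative assumptions; the population loss is approximated by a Monte-Carlo average.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 10        # input dimension (assumed)
lam = 0.01    # regularization coefficient λ (assumed small, per the theorem)
lr = 0.05     # gradient-descent step size (assumed)
n = 50_000    # Monte-Carlo sample size approximating the population loss

relu = lambda z: np.maximum(z, 0.0)

# Hypothetical teacher vector generating the labels.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

X = rng.standard_normal((n, d))   # x ~ N(0, I); bias-free inputs
y = relu(X @ w_star)              # teacher output σ(w*·x)

def loss_grad(w):
    """Gradient of (1/2n) Σ (σ(w·x) - y)^2 + (λ/2)‖w‖_2^2 (ℓ2 variant)."""
    z = X @ w
    resid = (relu(z) - y) * (z > 0)   # ReLU derivative is the indicator z > 0
    return X.T @ resid / n + lam * w

# Random initialization inside a sphere centered at the origin,
# as in the paper's initialization scheme.
w = rng.standard_normal(d)
w *= rng.uniform(0.0, 1.0) / np.linalg.norm(w)

for _ in range(2000):
    w -= lr * loss_grad(w)

# Near the optimum ŵ of the regularized loss, the gradient norm should be
# small; with λ > 0, ŵ generally differs slightly from the teacher w*.
print("gradient norm at convergence:", np.linalg.norm(loss_grad(w)))
print("distance to teacher w*:     ", np.linalg.norm(w - w_star))
```

Note that the iterate converges to ŵ, the minimizer of the regularized loss, rather than to the teacher w* itself; the gap between the two shrinks as λ → 0.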
