Structure and Gradient Dynamics Near Global Minima of Two-layer Neural Networks

09/01/2023
by   Leyang Zhang, et al.

Under mild assumptions, we investigate the structure of the loss landscape of two-layer neural networks near global minima, determine the set of parameters that achieves perfect generalization, and fully characterize the gradient flows around it. Using novel techniques, our work uncovers simple aspects of the complicated loss landscape and reveals how the model, target function, samples, and initialization each affect the training dynamics differently. Based on these results, we also explain why (overparametrized) neural networks can generalize well.
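To make the setting concrete, the following is a minimal illustrative sketch (not the paper's construction) of the objects the abstract refers to: a two-layer network f(x) = Σ_k a_k tanh(w_k·x) in an overparametrized regime (width m much larger than sample count n), trained by plain gradient descent on the squared loss so that the parameters approach a global minimum of the empirical loss. The architecture, activation, target, and all hyperparameters below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparametrized regime: hidden width m >> number of samples n.
n, d, m = 10, 3, 200
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d))          # an arbitrary smooth target

W = rng.normal(size=(m, d)) / np.sqrt(d)    # inner-layer weights w_k
a = rng.normal(size=m) / np.sqrt(m)         # outer-layer weights a_k

def loss(W, a):
    """Mean squared loss 0.5/n * ||f(X) - y||^2."""
    return 0.5 * np.mean((np.tanh(X @ W.T) @ a - y) ** 2)

loss0 = loss(W, a)
lr = 0.01
for _ in range(3000):                       # discretized gradient flow
    H = np.tanh(X @ W.T)                    # (n, m) hidden activations
    r = H @ a - y                           # residuals f(x_i) - y_i
    grad_a = H.T @ r / n                    # dL/da_k = (1/n) Σ_i r_i H_ik
    grad_W = ((np.outer(r, a) * (1 - H**2)).T @ X) / n
    a -= lr * grad_a
    W -= lr * grad_W

final = loss(W, a)
```

With this width the empirical loss is typically driven close to zero (the network interpolates the samples), which is the neighborhood of global minima whose geometry the paper analyzes.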
