We consider optimizing two-layer neural networks in the mean-field regim...
Due to its simplicity and outstanding ability to generalize, stochastic
...
Adaptive gradient methods have attracted much attention of machine learn...
Second-order optimization methods have desirable convergence properties....
Recent advances in language modeling such as word2vec motivate a number ...