A faster and simpler algorithm for learning shallow networks

07/24/2023
by Sitan Chen, et al.

We revisit the well-studied problem of learning a linear combination of k ReLU activations given labeled examples drawn from the standard d-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in poly(d,1/ε) time when k = O(1), where ε is the target error. More precisely, their algorithm runs in time (d/ε)^quasipoly(k) and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only (d/ε)^O(k^2).
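To make the learning problem concrete, the following is a minimal sketch of the setup only, not of the algorithm from the paper. It evaluates a target of the form f(x) = sum_i c_i * ReLU(<w_i, x>), draws labeled examples with x sampled from the standard d-dimensional Gaussian, and estimates the error of a candidate hypothesis. All names (`evaluate`, `gaussian_example`, `empirical_squared_error`) are hypothetical, and measuring error as mean squared error is an assumption, since the abstract only refers to a target error ε.

```python
import random


def relu(t):
    """ReLU activation: max(t, 0)."""
    return t if t > 0.0 else 0.0


def evaluate(weights, coeffs, x):
    """Evaluate f(x) = sum_i coeffs[i] * relu(<weights[i], x>).

    weights is a list of k weight vectors (each of length d),
    coeffs is a list of k real coefficients.
    """
    return sum(
        c * relu(sum(wj * xj for wj, xj in zip(w, x)))
        for w, c in zip(weights, coeffs)
    )


def gaussian_example(weights, coeffs, rng):
    """Draw x from the standard d-dimensional Gaussian and label it with f(x)."""
    d = len(weights[0])
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return x, evaluate(weights, coeffs, x)


def empirical_squared_error(weights, coeffs, hypothesis, n, rng):
    """Estimate E[(hypothesis(x) - f(x))^2] from n fresh Gaussian examples.

    Squared error is an assumption made for this illustration.
    """
    total = 0.0
    for _ in range(n):
        x, y = gaussian_example(weights, coeffs, rng)
        total += (hypothesis(x) - y) ** 2
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    # A toy target with k = 2 ReLUs in d = 2 dimensions.
    weights = [[1.0, 0.0], [0.0, 1.0]]
    coeffs = [2.0, -1.0]
    # The all-zero hypothesis serves as a trivial baseline.
    baseline = empirical_squared_error(weights, coeffs, lambda x: 0.0, 1000, rng)
    print("squared error of the zero hypothesis:", baseline)
```

In this framing, a learner receives only the labeled examples and must output a hypothesis whose error is at most ε. The runtimes quoted above describe how the cost of doing so scales with d, 1/ε, and k.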
