Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?

04/20/2022
by Kaiqi Zhang et al.

We study the theory of neural networks (NNs) through the lens of classical nonparametric regression, focusing on an NN's ability to adaptively estimate functions with heterogeneous smoothness, a property of functions in Besov or Bounded Variation (BV) classes. Existing work on this problem requires tuning the NN architecture to the function space and the sample size. We consider a "Parallel NN" variant of deep ReLU networks and show that standard weight decay is equivalent to promoting ℓ_p-sparsity (0<p<1) of the coefficient vector of an end-to-end learned basis of functions, i.e., a dictionary. Using this equivalence, we further establish that by tuning only the weight decay, such a Parallel NN achieves an estimation error arbitrarily close to the minimax rates for both the Besov and BV classes. Notably, the gap to the minimax-optimal rate shrinks exponentially as the NN gets deeper. Our research sheds new light on why depth matters and how NNs are more powerful than kernel methods.
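Some intuition for the weight-decay/sparsity equivalence: for a single depth-L multiplicative path with weights w_1, …, w_L whose product is c, the AM-GM inequality gives min{Σ_l w_l² : Π_l w_l = c} = L·|c|^(2/L), so an ℓ_2 penalty on the weights behaves like an ℓ_{2/L} penalty on the end-to-end coefficient, and the effective p moves toward 0 as depth grows. Below is a minimal PyTorch sketch of the setup the abstract describes: a parallel sum of narrow deep ReLU subnetworks trained with ordinary weight decay. The class name, widths, depths, target function, and hyperparameters are illustrative assumptions, not the authors' exact architecture or experiment.

```python
import torch
import torch.nn as nn

class ParallelNN(nn.Module):
    """Sum of M narrow deep ReLU subnetworks (an illustrative sketch,
    not necessarily the paper's exact architecture)."""
    def __init__(self, in_dim=1, width=4, depth=4, num_subnets=64):
        super().__init__()
        def make_subnet():
            layers = [nn.Linear(in_dim, width), nn.ReLU()]
            for _ in range(depth - 2):
                layers += [nn.Linear(width, width), nn.ReLU()]
            layers.append(nn.Linear(width, 1))
            return nn.Sequential(*layers)
        self.subnets = nn.ModuleList(make_subnet() for _ in range(num_subnets))

    def forward(self, x):
        # Each subnetwork contributes one learned "dictionary atom";
        # the network output is their sum.
        return sum(net(x) for net in self.subnets)

model = ParallelNN()
# Standard weight decay, i.e. a squared ell_2 penalty on all weights.
# Per the paper's equivalence, on this parallel architecture it acts
# like an ell_p (0 < p < 1) sparsity penalty on the coefficients of
# the learned dictionary, and only this one knob needs tuning.
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

# A toy target with heterogeneous smoothness: wiggly on [0, 0.5),
# nearly flat on [0.5, 1].
x = torch.linspace(0, 1, 128).unsqueeze(1)
y = torch.where(x < 0.5, torch.sin(20 * x), 0.1 * x)

for _ in range(1000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```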


Related research

Factor Augmented Sparse Throughput Deep ReLU Neural Networks for High Dimensional Regression (10/05/2022)
This paper introduces a Factor Augmented Sparse Throughput (FAST) model ...

Benign overfitting and adaptive nonparametric regression (06/27/2022)
In the nonparametric regression setting, we construct an estimator which...

Rates of Uniform Consistency for k-NN Regression (07/19/2017)
We derive high-probability finite-sample uniform rates of consistency fo...

Nonparametric regression with modified ReLU networks (07/17/2022)
We consider regression estimation with modified ReLU neural networks in ...

What Kinds of Functions do Deep Neural Networks Learn? Insights from Variational Spline Theory (05/07/2021)
We develop a variational framework to understand the properties of funct...

Minimum discrepancy principle strategy for choosing k in k-NN regression (08/20/2020)
This paper presents a novel data-driven strategy to choose the hyperpara...

Variance Based Samples Weighting for Supervised Deep Learning (01/19/2021)
In the context of supervised learning of a function by a Neural Network ...
