Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

06/08/2021
by Fan Bao, et al.

Recently, the (gradient-based) bilevel programming framework has been widely used in hyperparameter optimization and has achieved excellent empirical performance. Previous theoretical work mainly focuses on its optimization properties, while leaving the analysis of generalization largely open. This paper attempts to address the issue by presenting an expectation bound w.r.t. the validation set based on uniform stability. Our results can explain some mysterious behaviours of bilevel programming in practice, for instance, overfitting to the validation set. We also present an expectation bound for the classical cross-validation algorithm. Our results suggest that, from a theoretical perspective, gradient-based algorithms can be better than cross-validation under certain conditions. Furthermore, we prove that regularization terms at both the outer and inner levels can relieve the overfitting problem in gradient-based algorithms. Experiments on feature learning and data reweighting for noisy labels corroborate our theoretical findings.
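To make the setup concrete, below is a minimal sketch (not the authors' code) of gradient-based bilevel hyperparameter optimization: the inner level fits ridge-regression parameters on the training set, and the outer level tunes the inner regularization strength lambda by gradient descent on the validation loss, with an illustrative outer-level penalty mu * lambda^2 mirroring the paper's two-level regularization. The hypergradient is obtained by implicit differentiation of the closed-form inner solution; all data, constants, and names here are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a train/validation split (all sizes illustrative).
d, n_tr, n_val = 20, 50, 50
w_true = rng.normal(size=d)
X_tr = rng.normal(size=(n_tr, d))
X_val = rng.normal(size=(n_val, d))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=n_tr)
y_val = X_val @ w_true + 0.5 * rng.normal(size=n_val)

def inner_solve(lam):
    """Inner level: ridge regression; lam is the inner regularizer."""
    A = X_tr.T @ X_tr / n_tr + lam * np.eye(d)
    w = np.linalg.solve(A, X_tr.T @ y_tr / n_tr)
    return w, A

mu = 1e-3            # outer-level regularizer on the hyperparameter
rho = np.log(1e-2)   # optimize log(lam) so lam stays positive
lr = 0.1

for _ in range(200):
    lam = np.exp(rho)
    w, A = inner_solve(lam)
    resid = X_val @ w - y_val
    # Implicit differentiation of A(lam) w = b gives dw/dlam = -A^{-1} w.
    dw_dlam = -np.linalg.solve(A, w)
    # Outer objective: validation MSE + mu * lam^2; chain rule for its gradient.
    dL_dw = 2.0 * X_val.T @ resid / n_val
    dL_dlam = dL_dw @ dw_dlam + 2.0 * mu * lam
    rho -= lr * dL_dlam * lam   # d(lam)/d(rho) = lam

lam = np.exp(rho)
w, _ = inner_solve(lam)
print(f"tuned lambda = {lam:.4g}, val MSE = {np.mean((X_val @ w - y_val) ** 2):.4g}")
```

The same hypergradient can also be estimated by unrolling a few inner gradient steps and differentiating through them, which is closer to how the framework is applied to neural networks in practice; the closed-form inner solution above is used only to keep the sketch short and exact.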


research
05/18/2018

Optimizing for Generalization in Machine Learning with Cross-Validation Gradients

Cross-validation is the workhorse of modern applied statistics and machi...
research
12/18/2017

A Bridge Between Hyperparameter Optimization and Learning-to-learn

We consider a class of nested optimization problems involving inner an...
research
09/05/2018

Deep Bilevel Learning

We present a novel regularization approach to train neural networks that...
research
06/13/2018

Bilevel Programming for Hyperparameter Optimization and Meta-Learning

We introduce a framework based on bilevel programming that unifies gradi...
research
01/12/2023

Toward Theoretical Guidance for Two Common Questions in Practical Cross-Validation based Hyperparameter Selection

We show, to our knowledge, the first theoretical treatments of two commo...
research
04/16/2021

Overfitting in Bayesian Optimization: an empirical study and early-stopping solution

Bayesian Optimization (BO) is a successful methodology to tune the hyper...
research
06/11/2020

Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously

Generalization is a central problem in Machine Learning. Most prediction...
