PAC-Bayes Analysis Beyond the Usual Bounds
We focus on a stochastic learning model where the learner observes a finite set of training examples and the output of the learning process is a data-dependent distribution over a space of hypotheses. The learned data-dependent distribution is then used to make randomized predictions, and the high-level theme addressed here is guaranteeing the quality of predictions on examples that were not seen during training, i.e., generalization. In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds. Specifically, we present a basic PAC-Bayes inequality for stochastic kernels, from which one may derive extensions of various known PAC-Bayes bounds as well as novel bounds. We clarify the role of the requirement of fixed 'data-free' priors and illustrate the use of data-dependent priors. We also present a simple bound that is valid for a loss function with unbounded range. Our analysis clarifies that the two usual requirements, a data-free prior and a bounded loss, were used only to upper-bound an exponential moment term, while the basic PAC-Bayes inequality remains valid with those restrictions removed.
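For orientation, the sketch below evaluates one of the "usual" bounds that the abstract contrasts itself with: the classical McAllester-style bound in Maurer's form, which holds under exactly the two restrictions discussed above (a prior fixed before seeing the data, and a loss with range in [0, 1]). It is not the basic inequality of this paper. The function names `kl_diag_gaussians` and `mcallester_bound` are hypothetical, and the diagonal-Gaussian prior and posterior are an assumption chosen only because their KL divergence has a closed form.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(Q || P) between diagonal Gaussians.

    Each argument is a sequence of per-coordinate means or standard
    deviations. Q plays the role of the posterior, P of the prior.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += (
            math.log(sp / sq)
            + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2)
            - 0.5
        )
    return total


def mcallester_bound(emp_risk, kl, n, delta):
    """Classical PAC-Bayes upper bound on the expected risk.

    Assumptions (the 'usual' ones): i.i.d. sample of size n, loss in
    [0, 1], and a data-free prior. With probability at least 1 - delta,
        risk <= emp_risk + sqrt((kl + ln(2 sqrt(n) / delta)) / (2 n)).
    Maurer's form is stated for n >= 8, so smaller n is rejected here.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if n < 8:
        raise ValueError("this form of the bound is stated for n >= 8")
    if kl < 0.0:
        raise ValueError("KL divergence cannot be negative")
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return emp_risk + math.sqrt(complexity)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    kl = kl_diag_gaussians([1.0, 0.5], [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
    print(mcallester_bound(emp_risk=0.1, kl=kl, n=10_000, delta=0.05))
```

In this classical form the KL term is computed against a prior that cannot depend on the sample, and the square-root expression relies on the loss being bounded; these are the two restrictions that the abstract says can be dropped at the level of the basic inequality.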