Risk and Regret of Hierarchical Bayesian Learners

by Jonathan H. Huggins, et al.

Common statistical practice has shown that the full power of Bayesian methods is not realized until hierarchical priors are used, as these allow for greater "robustness" and the ability to "share statistical strength." Yet it is an ongoing challenge to provide a learning-theoretically sound formalism of such notions that: offers practical guidance concerning when and how best to utilize hierarchical models; provides insights into what makes for a good hierarchical prior; and, when the form of the prior has been chosen, can guide the choice of hyperparameter settings. We present a set of analytical tools for understanding hierarchical priors in both the online and batch learning settings. We provide regret bounds under log-loss, which show how certain hierarchical models compare, in retrospect, to the best single model in the model class. We also show how to convert a Bayesian log-loss regret bound into a Bayesian risk bound for any bounded loss, a result which may be of independent interest. Risk and regret bounds for Student's t and hierarchical Gaussian priors allow us to formalize the concepts of "robustness" and "sharing statistical strength." Priors for feature selection are investigated as well. Our results suggest that the learning-theoretic benefits of using hierarchical priors can often come at little cost on practical problems.
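As a rough illustration of what a log-loss regret bound "against the best single model" means, the sketch below uses a deliberately simple setting that is not taken from the paper: a finite grid of Bernoulli models with a uniform prior (the names `log_loss_regret`, `thetas`, and `prior` are hypothetical). It relies only on the standard fact that the cumulative log-loss of a Bayesian mixture exceeds that of any single model in the class by at most the negative log of that model's prior mass.

```python
import math

def log_loss_regret(data, thetas, prior):
    """Return the Bayesian mixture's cumulative log-loss and each single model's loss."""
    # Cumulative log-loss of each fixed Bernoulli(theta) model on the full sequence.
    single = [
        -sum(math.log(t if x == 1 else 1.0 - t) for x in data)
        for t in thetas
    ]
    # The online Bayesian predictor's cumulative log-loss equals the negative log
    # of the marginal likelihood: sum_k prior[k] * p(data | theta_k).
    marginal = sum(p * math.exp(-loss) for p, loss in zip(prior, single))
    bayes = -math.log(marginal)
    return bayes, single

# Illustrative assumptions: a toy binary sequence, a 5-point grid, a uniform prior.
data = [1, 1, 0, 1, 1, 1, 0, 1]
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = [1.0 / len(thetas)] * len(thetas)

bayes_loss, single_losses = log_loss_regret(data, thetas, prior)
best = min(range(len(thetas)), key=lambda k: single_losses[k])
regret = bayes_loss - single_losses[best]   # regret in retrospect vs. the best model
bound = -math.log(prior[best])              # regret <= log(1 / prior mass of best model)
print(f"regret = {regret:.4f}, bound = {bound:.4f}")
```

In this toy setting, the prior mass placed on the best model controls the bound. A hierarchical prior changes how that mass is distributed over the model class, which is the kind of trade-off the abstract describes.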




Sharp regret bounds for empirical Bayes and compound decision problems

We consider the classical problems of estimating the mean of an n-dimens...

Assessing Statistical Disclosure Risk for Differentially Private, Hierarchical Count Data, with Application to the 2020 U.S. Decennial Census

We propose Bayesian methods to assess the statistical disclosure risk of...

Empirical Bayesian Selection for Value Maximization

We study the common problem of selecting the best m units from a set of ...

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

We present a novel notion of complexity that interpolates between and ge...

Bayesian Analysis of Generalized Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features

Bayesian nonparametric hierarchical priors provide flexible models for s...

On the Prior Sensitivity of Thompson Sampling

The empirically successful Thompson Sampling algorithm for stochastic ba...
