Output-weighted optimal sampling for Bayesian regression and rare event statistics using few samples
For many important problems the quantity of interest (or output) is an unknown function of the parameters (or input), which form a random vector with known statistics. Since the dependence of the output on this random vector is unknown, the challenge is to identify the statistics of the output using the minimum number of function evaluations. This problem can be seen in the context of active learning or optimal experimental design. We employ Bayesian regression to represent the model uncertainty that arises from having only a small, finite number of input-output pairs. In this context we evaluate existing methods for optimal sample selection, such as model error minimization and mutual information maximization. We show that the criteria commonly employed in the literature do not take into account the output values of the existing input-output pairs. To overcome this deficiency we introduce a new criterion that explicitly takes into account the output values of the existing samples and adaptively selects inputs from regions or dimensions of the parameter space that contribute significantly to the output. The new method allows for application to a large number of input variables, paving the way for optimal experimental design in very high dimensions.
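The observation that common selection criteria ignore the observed output values can be illustrated with a minimal sketch. It assumes a Bayesian linear model y = w0 + w1*x with a zero-mean Gaussian prior on the weights and a known noise variance; the function names, data, and hyperparameters (`alpha`, `sigma2`) are hypothetical choices made for illustration and are not taken from the paper. Under these assumptions the posterior covariance of the weights depends only on the sampled inputs, so a criterion based on model variance would select the same next input whatever outputs were measured.

```python
# Sketch under stated assumptions: Bayesian linear regression
# y = w0 + w1*x + noise, prior w ~ N(0, alpha*I), known noise variance sigma2.
# This is an illustration, not the criterion proposed in the paper.

def posterior(xs, ys, alpha=1.0, sigma2=0.1):
    """Return the posterior mean (2 entries) and covariance (2x2) of the weights."""
    n = len(xs)
    # Posterior precision A = I/alpha + Phi^T Phi / sigma2, with phi(x) = [1, x].
    a00 = 1.0 / alpha + n / sigma2
    a01 = sum(xs) / sigma2
    a11 = 1.0 / alpha + sum(x * x for x in xs) / sigma2
    det = a00 * a11 - a01 * a01
    # Closed-form inverse of the symmetric 2x2 precision matrix.
    cov = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    # b = Phi^T y / sigma2; the outputs enter only here, through the mean.
    b0 = sum(ys) / sigma2
    b1 = sum(x * y for x, y in zip(xs, ys)) / sigma2
    mean = [cov[0][0] * b0 + cov[0][1] * b1,
            cov[1][0] * b0 + cov[1][1] * b1]
    return mean, cov

def model_variance(cov, x):
    """Variance of the predicted mean at input x: phi(x)^T cov phi(x)."""
    return cov[0][0] + 2.0 * x * cov[0][1] + x * x * cov[1][1]

# Same inputs, two very different sets of observed outputs.
xs = [-1.0, 0.0, 0.5]
ys_a = [0.1, 0.0, 0.2]
ys_b = [5.0, -3.0, 40.0]

mean_a, cov_a = posterior(xs, ys_a)
mean_b, cov_b = posterior(xs, ys_b)

# Candidate locations for the next sample.
candidates = [-2.0, -1.0, 0.0, 1.0, 2.0]
var_a = [model_variance(cov_a, x) for x in candidates]
var_b = [model_variance(cov_b, x) for x in candidates]

# A maximum-variance rule picks the same candidate in both cases.
next_a = max(candidates, key=lambda x: model_variance(cov_a, x))
next_b = max(candidates, key=lambda x: model_variance(cov_b, x))
print(var_a == var_b, next_a == next_b)
```

In this sketch the posterior means differ between the two data sets while the variances, and hence the selected input, are identical. An output-weighted criterion of the kind described above would instead let the observed outputs influence where to sample next.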