Scalable Bayesian Transformed Gaussian Processes
The Bayesian transformed Gaussian process (BTG) model, proposed by Kedem and Oliveira, is a fully Bayesian counterpart to the warped Gaussian process (WGP) and marginalizes out a joint prior over the warping and kernel hyperparameters. This fully Bayesian treatment of hyperparameters often provides more accurate regression estimates and superior uncertainty propagation, but is prohibitively expensive. The BTG posterior predictive distribution, itself estimated through high-dimensional integration, must be inverted in order to produce point predictions. To make the Bayesian approach practical and comparable in speed to maximum-likelihood estimation (MLE), we propose principled and fast techniques for computing with BTG. Our framework uses doubly sparse quadrature rules, tight quantile bounds, and rank-one matrix algebra to enable both fast model prediction and model selection. These scalable methods allow us to regress over higher-dimensional datasets and apply BTG with layered transformations that greatly improve its expressiveness. We demonstrate that BTG achieves superior empirical performance over MLE-based models.
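To make the prediction pipeline concrete, below is a minimal, self-contained sketch of the BTG idea: a Box-Cox output warping, a small tensor-product grid of hyperparameter nodes standing in for the paper's sparse quadrature, a likelihood-weighted mixture of per-node Gaussian predictives, and numeric inversion of the mixture CDF to recover the predictive median. All names here (box_cox, btg_predict, the grid choice) are illustrative assumptions, not the paper's implementation; in particular, the paper replaces brute-force root-finding with tight quantile bounds and accelerates the linear algebra with rank-one updates.

```python
# A minimal sketch of BTG prediction, assuming a Box-Cox warping and an
# RBF kernel. The dense (lam, ell) grid is a stand-in for the paper's
# doubly sparse quadrature; names and defaults are illustrative.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def box_cox(y, lam):
    # Box-Cox warping; lam -> 0 recovers log(y). Requires y > 0.
    return np.log(y) if abs(lam) < 1e-8 else (y**lam - 1.0) / lam

def rbf(X1, X2, ell):
    # Squared-exponential kernel with unit amplitude.
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(X, z, x_star, ell, noise=1e-4):
    # GP posterior mean/std at x_star and the Gaussian log marginal
    # likelihood of the warped data z.
    K = rbf(X, X, ell) + noise * np.eye(len(X))
    k = rbf(X, np.array([x_star]), ell)[:, 0]
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z))
    v = np.linalg.solve(L, k)
    mu, var = k @ alpha, 1.0 - v @ v + noise
    logml = (-0.5 * z @ alpha - np.log(np.diag(L)).sum()
             - 0.5 * len(X) * np.log(2 * np.pi))
    return mu, np.sqrt(max(var, 1e-12)), logml

def btg_predict(X, y, x_star, lams, ells):
    # Mixture over hyperparameter nodes; each weight is the marginal
    # likelihood of the warped data times the warping Jacobian.
    comps = []
    for lam in lams:
        z = box_cox(y, lam)
        jac = (lam - 1.0) * np.log(y).sum()  # log|dg/dy| for Box-Cox
        for ell in ells:
            mu, sd, logml = gp_posterior(X, z, x_star, ell)
            comps.append((lam, mu, sd, logml + jac))
    logw = np.array([c[3] for c in comps])
    w = np.exp(logw - logw.max())
    w /= w.sum()

    def cdf_minus_half(y_star):
        # Mixture CDF of Y* minus 1/2; its root is the predictive median.
        return sum(wi * norm.cdf((box_cox(y_star, lam) - mu) / sd)
                   for wi, (lam, mu, sd, _) in zip(w, comps)) - 0.5

    return brentq(cdf_minus_half, 1e-6, 1e6)  # quantile inversion

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, 20)
y = np.exp(np.sin(X)) + 0.5 + 0.05 * rng.standard_normal(20)
print(btg_predict(X, y, x_star=1.5, lams=[0.0, 0.5, 1.0], ells=[0.3, 1.0]))
```

Note the structure the abstract refers to: the expensive steps are the per-node Cholesky factorizations (which the paper amortizes with rank-one matrix algebra) and the repeated CDF evaluations inside the root-finder (which the paper narrows using tight quantile bounds before inverting).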