Laplace approximation and the natural gradient for Gaussian process regression with the heteroscedastic Student-t model
This paper uses the Laplace method to derive approximate inference for Gaussian process (GP) regression on the location and scale parameters of the Student-t probabilistic model. This allows both the mean and the variance of the data to vary as functions of the covariates, while retaining the attractive feature that the Student-t model is widely used for robust data analysis. The challenge in approximate inference for GP regression with the Student-t probabilistic model lies in the analytical intractability of the posterior distribution and the lack of concavity of the log-likelihood function. We present a natural gradient adaptation for the estimation process, which relies primarily on the property that the Student-t model has a naturally orthogonal parametrization with respect to the location and scale parameters. Owing to this property of the model, we also introduce an alternative Laplace approximation that uses the Fisher information matrix in place of the Hessian matrix of the negative log-likelihood function. In our experiments, this alternative approximation provides posterior approximations and predictive performance very similar to those of the traditional Laplace approximation. We also compare both Laplace approximations with the Markov chain Monte Carlo (MCMC) method. Moreover, we compare our heteroscedastic Student-t model with GP regression under the heteroscedastic Gaussian model. Finally, we discuss how our approach can improve the inference algorithm in cases where the probabilistic model assumed for the data is not log-concave.
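The orthogonality property mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes the scale enters the density directly as sigma (the paper's own parametrization may differ), the function names are hypothetical, and the Fisher information is obtained by plain trapezoidal quadrature of the score outer product. For the Student-t location-scale family with nu degrees of freedom, the standard closed-form entries are (nu + 1) / ((nu + 3) sigma^2) for the location, 2 nu / ((nu + 3) sigma^2) for the scale, and zero for the cross term.

```python
import math

def student_t_logpdf(y, mu, sigma, nu):
    """Log density of the Student-t model with location mu and scale sigma."""
    r = (y - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1) * math.log1p(r * r / nu))

def score(y, mu, sigma, nu):
    """Partial derivatives of the log density with respect to mu and sigma."""
    r = y - mu
    denom = nu * sigma ** 2 + r ** 2
    d_mu = (nu + 1) * r / denom
    d_sigma = -1.0 / sigma + (nu + 1) * r ** 2 / (sigma * denom)
    return d_mu, d_sigma

def fisher_information(mu, sigma, nu, half_width=200.0, n=200_000):
    """Approximate E[score score^T] with the trapezoidal rule.

    The grid is symmetric around mu and spans half_width scale units on
    each side (an assumption chosen so that the truncated tail mass is
    negligible for moderate nu).
    """
    a = mu - half_width * sigma
    h = 2.0 * half_width * sigma / n
    i_mm = i_ms = i_ss = 0.0
    for k in range(n + 1):
        y = a + k * h
        w = h * (0.5 if k in (0, n) else 1.0)  # trapezoid end-point weights
        p = math.exp(student_t_logpdf(y, mu, sigma, nu))
        d_mu, d_sigma = score(y, mu, sigma, nu)
        i_mm += w * p * d_mu * d_mu
        i_ms += w * p * d_mu * d_sigma
        i_ss += w * p * d_sigma * d_sigma
    return i_mm, i_ms, i_ss

# Illustrative values only; they are not taken from the paper.
mu, sigma, nu = 0.3, 1.5, 4.0
i_mm, i_ms, i_ss = fisher_information(mu, sigma, nu)
closed_mm = (nu + 1) / ((nu + 3) * sigma ** 2)
closed_ss = 2 * nu / ((nu + 3) * sigma ** 2)
print("location:", i_mm, "closed form:", closed_mm)
print("cross term:", i_ms)
print("scale:", i_ss, "closed form:", closed_ss)
```

Because the cross term vanishes, the Fisher information matrix is diagonal and positive definite for every observation, whereas the Hessian of the negative log-likelihood can be indefinite when the likelihood is not log-concave. This is presumably what makes the Fisher matrix a convenient substitute in the Laplace approximation described above.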