Dimensionality Detection and Integration of Multiple Data Sources via the GP-LVM

07/01/2013
by James Barrett, et al.

The Gaussian Process Latent Variable Model (GP-LVM) is a non-linear probabilistic method for embedding a high-dimensional dataset in terms of low-dimensional `latent' variables. In this paper we show that maximum a posteriori (MAP) estimation of the latent variables and hyperparameters can be used for model selection, and hence to determine the optimal number of latent variables and the most appropriate model. This offers an alternative to the recently developed variational approaches and may be useful when we wish to use a non-Gaussian prior, or kernel functions that lack automatic relevance determination (ARD) parameters. Using a second-order expansion of the latent variable posterior we can marginalise the latent variables and obtain an estimate of the hyperparameter posterior. Secondly, we use the GP-LVM to integrate multiple data sources by embedding them simultaneously in terms of common latent variables. We present results on synthetic data illustrating the successful detection and retrieval of low-dimensional structure from high-dimensional data, and demonstrate that integrating multiple data sources leads to more robust performance. Finally, we show that when the data are used for binary classification tasks, the low-dimensional representation yields a significant gain in prediction accuracy.
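To make the MAP formulation concrete, the sketch below illustrates the basic GP-LVM objective described in the abstract: given observations Y, the latent positions X are found by maximising the GP marginal likelihood of Y plus a standard Gaussian prior on X. This is a minimal illustration, not the authors' implementation: the RBF kernel, fixed hyperparameter values, the toy circular dataset, and the L-BFGS-B optimiser are all assumptions chosen for brevity (in particular, a plain RBF kernel without ARD parameters, matching the setting the paper targets).

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-2):
    """RBF (squared-exponential) kernel with jitter/noise on the diagonal."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(X))

def neg_log_posterior(x_flat, Y, Q):
    """Negative unnormalised log posterior over latents X (hyperparameters fixed)."""
    N, D = Y.shape
    X = x_flat.reshape(N, Q)
    K = rbf_kernel(X)
    _, logdet = np.linalg.slogdet(K)
    Kinv_Y = np.linalg.solve(K, Y)
    log_lik = -0.5 * D * logdet - 0.5 * np.sum(Y * Kinv_Y)  # GP marginal likelihood
    log_prior = -0.5 * np.sum(X**2)                         # standard normal prior on X
    return -(log_lik + log_prior)

# Toy high-dimensional data with genuinely 1-D latent structure (hypothetical example).
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 30)
Y = np.column_stack([np.cos(t), np.sin(t), np.cos(2 * t), np.sin(2 * t)])
Y += 0.05 * rng.normal(size=Y.shape)

Q = 1  # number of latent dimensions to try; model selection compares such fits
X0 = 0.1 * rng.normal(size=(len(Y), Q))
res = minimize(neg_log_posterior, X0.ravel(), args=(Y, Q), method="L-BFGS-B")
X_map = res.x.reshape(-1, Q)
```

Integrating multiple data sources, as the abstract describes, amounts to sharing one X across several such likelihood terms: the objective becomes the sum of the GP marginal likelihoods of each dataset (each with its own kernel), plus the single prior on the common latent variables.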


