Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions

02/27/2015
by Parthan Kasarapu, et al.

Mixture modelling involves explaining some observed evidence using a combination of probability distributions. The crux of the problem is the inference of an optimal number of mixture components and their corresponding parameters. This paper discusses unsupervised learning of mixture models using the Bayesian Minimum Message Length (MML) criterion. To demonstrate the effectiveness of search and inference of mixture parameters using the proposed approach, we select two key probability distributions, each handling a fundamentally different type of data: the multivariate Gaussian distribution to address mixture modelling of data distributed in Euclidean space, and the multivariate von Mises-Fisher (vMF) distribution to address mixture modelling of directional data distributed on a unit hypersphere. The key contributions of this paper, in addition to the general search and inference methodology, include the derivation of MML expressions for encoding the data using multivariate Gaussian and von Mises-Fisher distributions, and the analytical derivation of the MML estimates of the parameters of the two distributions. Our approach is tested on simulated and real-world data sets. For instance, we infer vMF mixtures that concisely explain experimentally determined three-dimensional protein conformations, providing an effective null model description of protein structures that is central to many inference problems in structural bioinformatics. The experimental results demonstrate that our proposed search and inference method, together with the encoding schemes, improves on state-of-the-art mixture modelling techniques.
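
As a rough illustration of the directional half of the setting described above, the following sketch evaluates a von Mises-Fisher mixture on the unit sphere in three dimensions, using the standard 3D density kappa / (4 pi sinh(kappa)) * exp(kappa * mu . x). It computes only the negative log-likelihood, the kind of data-fit term a two-part message length builds on; the parameter-encoding cost that MML adds, and the paper's search procedure, are not shown. The function names `vmf3_logpdf` and `vmf3_mixture_nll` are hypothetical and are not taken from the paper.

```python
import numpy as np


def vmf3_logpdf(x, mu, kappa):
    """Log density of a 3D von Mises-Fisher distribution.

    x: array of shape (n, 3) of unit vectors; mu: unit mean direction of
    shape (3,); kappa: concentration, assumed strictly positive.
    """
    # log(sinh(kappa)) written as kappa + log(1 - exp(-2 kappa)) - log(2)
    # so that large kappa does not overflow.
    log_sinh = kappa + np.log1p(-np.exp(-2.0 * kappa)) - np.log(2.0)
    # Normalising constant for three dimensions: kappa / (4 pi sinh(kappa)).
    log_norm = np.log(kappa) - np.log(4.0 * np.pi) - log_sinh
    return log_norm + kappa * (x @ mu)


def vmf3_mixture_nll(x, weights, mus, kappas):
    """Negative log-likelihood (in nats) of points x under a vMF mixture."""
    # One column per component, holding log(weight) + log density.
    log_terms = np.stack(
        [np.log(w) + vmf3_logpdf(x, m, k)
         for w, m, k in zip(weights, mus, kappas)],
        axis=1,
    )
    # Log-sum-exp across components, shifted by the row maximum for stability.
    top = log_terms.max(axis=1, keepdims=True)
    log_mix = top[:, 0] + np.log(np.exp(log_terms - top).sum(axis=1))
    return -log_mix.sum()
```

Comparing this quantity alone across different numbers of components would favour ever larger mixtures, which is why a criterion such as MML also charges for stating the parameters.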

research
07/05/2016

Mixtures of Bivariate von Mises Distributions with Applications to Modelling of Protein Dihedral Angles

The modelling of empirically observed data is commonly done using mixtur...
research
06/26/2015

Modelling of directional data using Kent distributions

The modelling of data on a spherical surface requires the consideration ...
research
03/01/2023

Mixture of regressions with multivariate responses for discovering subtypes in Alzheimer's biomarkers with detection limits

There is no gold standard for the diagnosis of Alzheimer's disease (AD),...
research
04/14/2020

Universal Approximation on the Hypersphere

It is well known that any continuous probability density function on R^m...
research
09/22/2020

Finite mixture modeling of censored and missing data using the multivariate skew-normal distribution

Finite mixture models have been widely used to model and analyze data fr...
research
12/30/2022

Mixture of von Mises-Fisher distribution with sparse prototypes

Mixtures of von Mises-Fisher distributions can be used to cluster data o...
research
05/24/2023

Simultaneous identification of models and parameters of scientific simulators

Many scientific models are composed of multiple discrete components, and...
