AI Giving Back to Statistics? Discovery of the Coordinate System of Univariate Distributions by Beta Variational Autoencoder

by Alex Glushkovsky, et al.

Distributions are fundamental statistical objects that play essential theoretical and practical roles. This article discusses experience with training neural networks to classify univariate empirical distributions and to represent them in a two-dimensional latent space, with disentanglement enforced, using cumulative distribution functions (CDFs) as inputs. The latent-space representation is obtained with an unsupervised beta variational autoencoder (beta-VAE). It separates distributions of different shapes while overlapping similar ones, and it empirically reproduces relationships between distributions that are known theoretically. A synthetic experiment with generated univariate continuous and discrete (Bernoulli) distributions of varying sample sizes and parameters supports the study. The representation in the two-dimensional latent coordinate system can be seen as additional metadata for real-world data that disentangles important distribution characteristics, such as the shape of the CDF, the classification probabilities of the underlying theoretical distributions and their parameters, information entropy, and skewness. Entropy changes, which provide an "arrow of time", determine dynamic trajectories along the representations of distributions in the latent space. In addition, a post-beta-VAE unsupervised segmentation of the latent space, based on the weight of evidence (WOE) of the posterior versus standard isotropic two-dimensional normal densities, has been applied to detect the presence of assignable causes that distinguish exceptional CDF inputs.
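
As a rough illustration of the two ingredients the abstract names, the sketch below builds a fixed-length empirical CDF vector from a sample and evaluates the standard beta-VAE objective (reconstruction error plus beta times the KL divergence to a standard normal prior). The grid size, the min-max scaling, the squared-error reconstruction term, the value of beta and all function names are assumptions made for this example; the abstract does not specify them, so this should not be read as the article's implementation.

```python
import numpy as np

def ecdf_on_grid(sample, n_points=50):
    """Empirical CDF of a sample, evaluated on a fixed grid over [0, 1]."""
    x = np.asarray(sample, dtype=float)
    lo, hi = x.min(), x.max()
    # Assumption: min-max scaling, so samples of any range share one grid.
    z = (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
    grid = np.linspace(0.0, 1.0, n_points)
    # Fraction of observations less than or equal to each grid point.
    return np.searchsorted(np.sort(z), grid, side="right") / z.size

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Squared-error reconstruction term plus beta-weighted KL term."""
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return recon + beta * kl_to_standard_normal(mu, log_var)

# Usage: a CDF input vector for one synthetic sample, and the loss for a
# hypothetical two-dimensional encoder output (mu, log_var).
rng = np.random.default_rng(0)
cdf_vector = ecdf_on_grid(rng.exponential(size=200))
loss = beta_vae_loss(cdf_vector, cdf_vector, mu=[1.0, 0.0], log_var=[0.0, 0.0])
```

With beta larger than 1, the KL term is weighted more heavily than in a plain VAE, which is the usual mechanism by which a beta-VAE encourages disentangled latent coordinates.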




AI Discovering a Coordinate System of Chemical Elements: Dual Representation by Variational Autoencoders

The periodic table is a fundamental representation of chemical elements ...

Designing Complex Experiments by Applying Unsupervised Machine Learning

Design of experiments (DOE) is playing an essential role in learning and...

AEVB-Comm: An Intelligent Communication System based on AEVBs

In recent years, applying Deep Learning (DL) techniques emerged as a com...

Learning Latent Representations of Bank Customers With The Variational Autoencoder

Learning data representations that reflect the customers' creditworthine...

CQ-VAE: Coordinate Quantized VAE for Uncertainty Estimation with Application to Disk Shape Analysis from Lumbar Spine MRI Images

Ambiguity is inevitable in medical images, which often results in differ...

Dataset Size Dependence of Rate-Distortion Curve and Threshold of Posterior Collapse in Linear VAE

In the Variational Autoencoder (VAE), the variational posterior often al...

Chest X-Rays Image Classification from beta-Variational Autoencoders Latent Features

Chest X-Ray (CXR) is one of the most common diagnostic techniques used i...
