Estimating the Mutual Information between two Discrete, Asymmetric Variables with Limited Samples

by   Damián G. Hernández, et al.

Determining the strength of non-linear statistical dependencies between two variables is a crucial matter in many research fields. The established measure for quantifying such relations is the mutual information. However, estimating mutual information from limited samples is a challenging task. Since the mutual information is the difference of two entropies, existing Bayesian estimators of entropy may be used to estimate information. This procedure, however, remains biased in the severely under-sampled regime. Here we propose an alternative estimator that is applicable to those cases in which the marginal distribution of one of the two variables---the one with minimal entropy---is well sampled. The other variable, as well as the joint and conditional distributions, can be severely undersampled. We obtain an estimator that exhibits very low bias, outperforming previous methods even when the sampled data contain few coincidences. As with other Bayesian estimators, our proposal focuses on the strength of the interaction between two discrete variables, without seeking to model the specific way in which the variables are related. A distinctive property of our method is that the main data statistic determining the amount of mutual information is the inhomogeneity of the conditional distribution of the low-entropy variable in those states (typically few) in which the large-entropy variable registers coincidences.
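To illustrate the problem the abstract describes, the following sketch implements the naive plug-in (maximum-likelihood) estimator of mutual information, which the Bayesian approaches discussed here aim to improve upon. This is not the authors' estimator; it is the standard baseline, shown only to make the under-sampling bias concrete. The function name and the toy data are illustrative assumptions.

```python
import numpy as np
from collections import Counter

def plugin_mutual_information(x, y):
    """Naive plug-in estimate of I(X;Y) in nats from paired samples.

    This maximum-likelihood estimator is known to be biased upward
    when the joint distribution is under-sampled, which is the regime
    the paper targets.
    """
    n = len(x)
    pxy = Counter(zip(x, y))   # joint counts
    px = Counter(x)            # marginal counts of X
    py = Counter(y)            # marginal counts of Y
    mi = 0.0
    for (xi, yi), c in pxy.items():
        # p(x,y) * log[ p(x,y) / (p(x) p(y)) ], with probabilities
        # replaced by empirical frequencies
        mi += (c / n) * np.log(c * n / (px[xi] * py[yi]))
    return mi

# Toy illustration of the bias: X and Y are independent, so the true
# mutual information is zero, but with a large X alphabet and only 50
# samples the plug-in estimate comes out spuriously positive.
rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=50)  # high-entropy variable, undersampled
y = rng.integers(0, 2, size=50)    # low-entropy variable, well sampled
print(plugin_mutual_information(list(x), list(y)))
```

The setup mirrors the asymmetric situation in the abstract: the low-entropy variable (here binary `y`) is well sampled, while the high-entropy variable `x` and the joint distribution are not.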




