Information-theoretical analysis of the statistical dependencies among three variables: Applications to written language

by Damián G. Hernández, et al.

We develop the information-theoretical concepts required to study the statistical dependencies among three variables. Some of these dependencies are pure triple interactions, in the sense that they cannot be explained in terms of a combination of pairwise correlations. We derive bounds for triple dependencies, and characterize the shape of the joint probability distribution of three binary variables with high triple interaction. The analysis also allows us to quantify the amount of redundancy in the mutual information between pairs of variables, and to assess whether the information between two variables is or is not mediated by a third variable. These concepts are applied to the analysis of written texts. We find that the probability that a given word is found in a particular location within the text is modulated not only by the presence or absence of other nearby words, but also by the presence or absence of nearby pairs of words. We identify the words enclosing the key semantic concepts of the text, the triplets of words with high pairwise and triple interactions, and the words that mediate the pairwise interactions between other words.
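As a rough illustration of a triple dependency that pairwise correlations cannot explain, the sketch below computes a textbook quantity, the interaction information I(X;Y|Z) - I(X;Y), for three binary variables. This is an assumption made for illustration: the abstract does not state the paper's own definitions or sign convention, and the function names and the two example distributions (`p_xor`, `p_copy`) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))

def mutual_information(pxy):
    """I(X;Y) in bits from a 2-D joint distribution p[x, y]."""
    return entropy(pxy.sum(axis=1)) + entropy(pxy.sum(axis=0)) - entropy(pxy)

def conditional_mutual_information(pxyz):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) for p[x, y, z]."""
    return (entropy(pxyz.sum(axis=1)) + entropy(pxyz.sum(axis=0))
            - entropy(pxyz) - entropy(pxyz.sum(axis=(0, 1))))

def interaction_information(pxyz):
    """I(X;Y|Z) - I(X;Y). Under this (assumed) sign convention,
    positive values indicate synergy and negative values redundancy."""
    return conditional_mutual_information(pxyz) - mutual_information(pxyz.sum(axis=2))

# Hypothetical example 1: Z = X XOR Y with X, Y independent fair bits.
p_xor = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p_xor[x, y, x ^ y] = 0.25

# Hypothetical example 2: X = Y = Z, a single fair bit copied three times.
p_copy = np.zeros((2, 2, 2))
p_copy[0, 0, 0] = p_copy[1, 1, 1] = 0.5

xor_pairwise = mutual_information(p_xor.sum(axis=2))   # I(X;Y) for XOR
xor_triple = interaction_information(p_xor)
copy_triple = interaction_information(p_copy)
print(xor_pairwise, xor_triple, copy_triple)
```

In the XOR case every pair of variables is independent, yet conditioning on the third variable makes the other two fully dependent, so the dependency is purely triple. In the copied-bit case the pairwise information is entirely redundant, and conditioning on the third variable removes it.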




