Modelling Correlated Bernoulli Data Part I: Theory and Run Lengths

11/30/2022
by Louise Kimpton, et al.

Binary data are very common in many applications and are typically simulated independently via a Bernoulli distribution with a single probability of success. However, this is not always physically realistic, and the probability of a success can depend on the outcomes of past events. Presented here is a novel approach for simulating binary data where, for a chain of events, successes (1) and failures (0) cluster together according to a distance correlation. The structure is derived from de Bruijn graphs: directed graphs where, given a set of symbols, V, and a 'word' length, m, the nodes consist of all possible sequences of length m over V. De Bruijn graphs are a generalisation of Markov chains, where the 'word' length controls the number of previous states that each state depends on, which extends the correlation over a wider range. To quantify how clustered a sequence generated from a de Bruijn process is, the run lengths of letters are observed along with run length properties.
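The ideas in the abstract can be illustrated with a minimal Python sketch. It enumerates the nodes and edges of a de Bruijn graph over the symbols {0, 1}, simulates a binary sequence in which the next letter depends on the current word of length m, and extracts run lengths. The function names and the transition probabilities in `p_one` are hypothetical choices made for illustration only; the abstract does not state the paper's actual parameterisation, and the uniform choice of starting word is likewise an assumption.

```python
import itertools
import random


def de_bruijn_nodes(symbols, m):
    """All words of length m over the symbol set (the nodes of the graph)."""
    return ["".join(w) for w in itertools.product(symbols, repeat=m)]


def de_bruijn_edges(symbols, m):
    """Directed edges: drop the first letter of a word and append a new one."""
    return [(w, w[1:] + s) for w in de_bruijn_nodes(symbols, m) for s in symbols]


def simulate(p_one, n, rng):
    """Simulate n letters; p_one[word] is P(next letter is '1' | current word).

    Assumption: the starting word is drawn uniformly from the nodes.
    """
    word = rng.choice(sorted(p_one))
    seq = list(word)
    while len(seq) < n:
        letter = "1" if rng.random() < p_one[word] else "0"
        seq.append(letter)
        word = word[1:] + letter  # move along an edge of the de Bruijn graph
    return "".join(seq[:n])


def run_lengths(seq):
    """Consecutive runs as (letter, length) pairs."""
    return [(k, len(list(g))) for k, g in itertools.groupby(seq)]


m = 2
nodes = de_bruijn_nodes("01", m)
edges = de_bruijn_edges("01", m)

# Hypothetical transition probabilities: the more 1s in the current word,
# the more likely the next letter is a 1, which encourages clustering.
p_one = {w: 0.1 + 0.8 * w.count("1") / m for w in nodes}

sequence = simulate(p_one, 50, random.Random(0))
runs = run_lengths(sequence)
print(sequence)
print(runs)
```

With m = 1 the sketch reduces to an ordinary first-order Markov chain on {0, 1}; increasing m lets each letter depend on a longer history, which is the sense in which the word length widens the range of correlation.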
