Same but Different: distance correlations between topological summaries

03/04/2019
by Katharine Turner, et al.

Persistent homology allows us to create topological summaries of complex data. To analyse these statistically, we need to choose a topological summary and a metric space in which these summaries lie. Although different representations of persistent homology may contain the same information (as they come from the same persistence module), they can lead to different statistical conclusions because the metric spaces they lie in are different. The best choice for analysis will be application specific. In this paper we discuss distance correlation, a non-parametric tool for comparing data sets that can lie in completely different metric spaces. In particular, we can calculate the distance correlation between different choices of topological summary (e.g. persistence diagrams under the bottleneck distance vs persistence landscapes under the L^2 function distance). For a variety of random models, we compare different topological summaries via the distance correlation between the samples. We also give examples of computing the distance correlation between a topological summary and another measurement of interest, such as a paired random variable or a parameter of the random model used to generate the persistent homology. This article is expository in style, so we include the definitions of standard statistical quantities to make the paper accessible to non-statisticians.
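
To make the central quantity concrete, here is a minimal sketch of the usual sample distance correlation, computed from two pairwise distance matrices over the same n paired samples. It is an illustration only and not necessarily the estimator used in the paper: the function names (`double_center`, `distance_correlation`, `pairwise_abs`) are hypothetical, NumPy is assumed to be available, and the clamp at zero is an added safeguard for metric spaces in which the sample distance covariance can come out negative. Because only distance matrices are needed, one matrix could hold bottleneck distances between persistence diagrams and the other L^2 distances between persistence landscapes.

```python
import numpy as np


def double_center(d):
    """Subtract row and column means from a distance matrix and add back the grand mean."""
    d = np.asarray(d, dtype=float)
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_correlation(dx, dy):
    """Sample distance correlation from two n x n pairwise distance matrices.

    dx[i, j] is the distance between samples i and j in the first metric
    space, dy[i, j] the distance between the same pair in the second.
    """
    a = double_center(dx)
    b = double_center(dy)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance (first space)
    dvar_y = (b * b).mean()  # squared sample distance variance (second space)
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # convention when one sample is degenerate
    # Assumption: clamp at zero, since dcov2 can be negative outside
    # well-behaved (e.g. Euclidean) metric spaces.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def pairwise_abs(x):
    """Pairwise absolute differences for a one-dimensional sample."""
    x = np.asarray(x, dtype=float)
    return np.abs(x[:, None] - x[None, :])


# Toy usage: y is an affine function of x, so all distances are rescaled by
# the same factor and the distance correlation should be close to 1.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
dcor_xy = distance_correlation(pairwise_abs(x), pairwise_abs(y))
```

Working from precomputed distance matrices keeps the sketch agnostic about where the samples live, which is the property that lets summaries in different metric spaces be compared.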
