Pearson Distance is not a Distance

08/15/2019
by Victor Solo, et al.

The Pearson distance between a pair of random variables X, Y with correlation ρ_xy, namely 1-ρ_xy, has gained widespread use, particularly for clustering, in areas such as gene expression analysis, brain imaging and cyber security. In all these applications it is implicitly assumed or required that the distance measure be a metric, and thus satisfy the triangle inequality. We show, however, that Pearson distance is not a metric. We go on to show that this can be repaired by recalling the result (well known in other literature) that √(1-ρ_xy) is a metric. We similarly show that a related measure of interest, 1-|ρ_xy|, which is invariant to the sign of ρ_xy, is not a metric, but that √(1-ρ_xy^2) is. We also give generalizations of these results.
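The claims in the abstract can be checked numerically on a single triple of variables. The sketch below is an illustration, not the paper's own construction: it assumes X and Z are uncorrelated with unit variance and sets Y = X + Z, which gives ρ_xy = ρ_yz = 1/√2 and ρ_xz = 0. The names `triangle_holds`, `measures` and `results` are hypothetical. Passing the check on one triple does not prove that a measure is a metric; failing it is enough to show that it is not.

```python
import math

# Assumed example (not taken from the paper): X, Z uncorrelated with unit
# variance and Y = X + Z, so corr(X, Y) = corr(Y, Z) = 1/sqrt(2), corr(X, Z) = 0.
rho_xz = 0.0
rho_xy = 1.0 / math.sqrt(2.0)
rho_yz = 1.0 / math.sqrt(2.0)


def triangle_holds(d, tol=1e-12):
    """Check d(x, z) <= d(x, y) + d(y, z) for the three correlations above."""
    return d(rho_xz) <= d(rho_xy) + d(rho_yz) + tol


# The four measures discussed in the abstract, as functions of the correlation.
measures = {
    "1 - rho": lambda r: 1.0 - r,
    "sqrt(1 - rho)": lambda r: math.sqrt(1.0 - r),
    "1 - |rho|": lambda r: 1.0 - abs(r),
    "sqrt(1 - rho^2)": lambda r: math.sqrt(1.0 - r * r),
}

results = {name: triangle_holds(d) for name, d in measures.items()}
for name, ok in results.items():
    status = "holds" if ok else "fails"
    print(f"{name}: triangle inequality {status} on this triple")
```

On this triple, 1-ρ gives d(X,Z) = 1 while d(X,Y) + d(Y,Z) = 2(1 - 1/√2) ≈ 0.586, so the triangle inequality is violated; the same numbers apply to 1-|ρ|. The two square-root forms satisfy the inequality here, in line with the abstract's statement that they are metrics.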
