Measuring group-separability in geometrical space for evaluation of pattern recognition and embedding algorithms

by A. Acevedo, et al.

Evaluating data separation in a geometrical space is fundamental for pattern recognition. A plethora of dimensionality reduction (DR) algorithms have been developed to reveal geometrical patterns in a low-dimensional, visualizable representation space, in which similarities between high-dimensional samples are approximated by geometrical distances. However, statistical measures that evaluate, directly in the low-dimensional geometrical space, the sample group separability attained by these DR algorithms are missing. Such separability measures could be used both to compare algorithm performance and to tune algorithm parameters. Here, we propose three statistical measures (named PSI-ROC, PSI-PR, and PSI-P) that originate from the Projection Separability (PS) rationale introduced in this study, which is expressly designed to assess the group separability of data samples in a geometrical space. Traditional cluster validity indices (CVIs) might be applied in this context, but they show limitations because they are not specifically tailored for DR. Our PS measures are compared to six baseline cluster validity indices, using five non-linear datasets and six different DR algorithms. The results provide clear evidence that statistical measures based on the PS rationale are more accurate than CVIs and can be adopted to control the tuning of parameter-dependent DR algorithms.
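The abstract does not spell out how the PS measures are computed, so the following is only an illustrative sketch of one way a projection-based, ROC-style separability score could be evaluated directly in an embedding. It assumes (this is not stated above) that each pair of groups is projected onto the line joining the two group centroids and that the area under the ROC curve of the projected positions is averaged over all pairs. The function names `project_on_centroid_line`, `auc_roc`, and `pairwise_separability` are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def project_on_centroid_line(points, c_a, c_b):
    """Scalar position of each point along the direction c_a -> c_b.

    Assumption: projecting onto the centroid-to-centroid line is one
    plausible reading of a 'projection separability' idea, not a
    confirmed description of the authors' procedure.
    """
    direction = [b - a for a, b in zip(c_a, c_b)]
    return [
        sum((x - a) * d for x, a, d in zip(p, c_a, direction))
        for p in points
    ]


def auc_roc(scores_a, scores_b):
    """Probability that a random b-score exceeds a random a-score.

    Ties count as one half (the Mann-Whitney formulation of AUC).
    """
    wins = sum(
        (sb > sa) + 0.5 * (sb == sa)
        for sa in scores_a
        for sb in scores_b
    )
    return wins / (len(scores_a) * len(scores_b))


def pairwise_separability(embedding, labels):
    """Mean AUC over all pairs of groups; 0.5 = no separation, 1.0 = perfect."""
    groups = {}
    for point, label in zip(embedding, labels):
        groups.setdefault(label, []).append(point)

    aucs = []
    for g_a, g_b in combinations(sorted(groups), 2):
        c_a, c_b = centroid(groups[g_a]), centroid(groups[g_b])
        s_a = project_on_centroid_line(groups[g_a], c_a, c_b)
        s_b = project_on_centroid_line(groups[g_b], c_a, c_b)
        auc = auc_roc(s_a, s_b)
        # Group order is arbitrary, so keep the orientation-free value.
        aucs.append(max(auc, 1.0 - auc))
    return sum(aucs) / len(aucs)


# Toy 2D embedding with two well-separated groups (made-up coordinates).
toy_embedding = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
toy_labels = ["a", "a", "a", "b", "b", "b"]
toy_score = pairwise_separability(toy_embedding, toy_labels)
```

Under these assumptions, a score of this kind could be computed for each parameter setting of a DR algorithm and the setting with the highest score retained, which is the tuning use case the abstract mentions.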




