A new measure for assessment of clustering based on kernel density estimation

01/06/2022
by   Soumita Modak, et al.

A new clustering accuracy measure is proposed to determine the unknown number of clusters and to assess the quality of clustering of a data set of any dimension. Our validity index applies the classical nonparametric univariate kernel density estimation method to the interpoint distances computed between the members of the data set. Because it is based on interpoint distances, it is free of the curse of dimensionality and can therefore be computed efficiently in high-dimensional settings where the number of study variables can exceed the sample size. The proposed measure is compatible with any clustering algorithm and with any kind of data set for which an interpoint distance measure with a density function can be defined. A simulation study shows its superiority over widely used cluster validity indices such as the average silhouette width and the Dunn index, and its applicability is demonstrated on a high-dimensional biostatistical study of the Alon data set and a large astrostatistical application to time series of light curves of new variable stars.
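
The central ingredient described above, a univariate kernel density estimate of interpoint distances, can be sketched as follows. This is only an illustration of that ingredient and not the proposed validity index, whose definition is not given in the abstract. The Euclidean distance, the Gaussian kernel, the normal-reference bandwidth rule, the simulated data, and all function names are assumptions made for the example.

```python
import math
import random


def interpoint_distances(points):
    """Euclidean distances between all unordered pairs of points (assumed metric)."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


def gaussian_kde(sample, bandwidth=None):
    """Return a univariate Gaussian kernel density estimate of `sample`.

    If no bandwidth is given, the normal-reference rule of thumb
    h = 1.06 * sd * n^(-1/5) is used (an assumption for this sketch).
    """
    n = len(sample)
    if bandwidth is None:
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        bandwidth = 1.06 * sd * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

    return density


# Simulated data: 20 observations in 100 dimensions (more variables than
# observations), drawn from two groups with shifted means.
rng = random.Random(0)
dim = 100
group_a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(10)]
group_b = [[rng.gauss(3.0, 1.0) for _ in range(dim)] for _ in range(10)]
points = group_a + group_b

# The 100-dimensional data reduce to a one-dimensional sample of distances.
dists = interpoint_distances(points)
density = gaussian_kde(dists)

# Evaluate the estimated density on a coarse grid of distance values.
for x in (10, 15, 25, 35):
    print(x, density(x))
```

Whatever the dimension of the original observations, the density estimate is always computed on a one-dimensional sample of n(n-1)/2 distances, which is why the dimension of the data does not enter the estimation step.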

research
10/01/2022

A new nonparametric interpoint distance-based measure for assessment of clustering

A new interpoint distance-based measure is proposed to identify the opti...
research
04/28/2023

A new interpoint distance-based clustering algorithm using kernel density estimation

A novel nonparametric clustering algorithm is proposed using the interpo...
research
05/11/2021

An internal validity index based on density-involved distance

It is crucial to evaluate the quality of clustering results in cluster a...
research
10/18/2019

Clustering by Optimizing the Average Silhouette Width

In this paper, we propose a unified clustering approach that can estimat...
research
10/09/2009

Scaling Analysis of Affinity Propagation

We analyze and exploit some scaling properties of the Affinity Propagati...
research
08/29/2023

Diffusion-based kernel density estimation improves the assessment of carbon isotope modelling

Comparing differently sized data sets is one main task in model assessme...
research
10/27/2020

Nonparametric estimation of highest density regions for COVID-19

Highest density regions refer to level sets containing points of relativ...
