Index-based Solutions for Efficient Density Peaks Clustering

02/08/2020
by Zafaryab Rasool, et al.

Density Peaks Clustering (DPC), a novel density-based clustering approach, has received considerable attention from the research community, primarily because of its simplicity and its small number of parameters. However, the clusters obtained with DPC are sensitive to its cutoff-distance parameter dc, whose appropriate value depends on the data distribution and on the requirements of different users. In addition, the original DPC algorithm has to visit a large number of objects, which makes it slow. To address these issues, this paper investigates index-based solutions for DPC. Specifically, we propose two list-based index methods: (i) a simple List Index and (ii) an advanced Cumulative Histogram Index. We propose efficient query algorithms for these indices that avoid a large share of irrelevant comparisons at the cost of additional space. To remedy this space cost for memory-constrained systems, we further introduce an approximate solution for the above indices that substantially reduces space consumption, provided slight inaccuracies are admissible. Furthermore, because existing tree-based index structures have considerably lower memory requirements, we also present effective pruning techniques and efficient query algorithms that support DPC using the popular Quadtree and R-tree indices. Finally, we evaluate all of the above indices in practice and present the findings of an extensive set of experiments on six synthetic and real datasets. The experimental insights help guide the selection of a suitable index.
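To make the two quantities that DPC computes concrete, here is a minimal sketch of how a list-based index might serve them. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not describe the layout of the List Index, so the per-object sorted neighbor list below, and the function names `build_list_index`, `local_density`, and `dependent_distance`, are hypothetical. The sketch uses the standard DPC definitions: the local density of an object is the number of objects closer than dc, and its dependent distance is the distance to the nearest object of higher density.

```python
import bisect
import math


def build_list_index(points):
    """Hypothetical list index: for each object, (distance, id) pairs sorted by distance."""
    index = []
    for i, p in enumerate(points):
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        index.append(neighbors)
    return index


def local_density(index, dc):
    """rho_i: number of objects strictly closer than dc, found by binary search."""
    # (dc, -1) sorts before every stored (dc, j), so only distances < dc are counted.
    return [bisect.bisect_left(nbrs, (dc, -1)) for nbrs in index]


def dependent_distance(index, rho):
    """delta_i: distance to the nearest object whose density is higher than rho_i."""
    delta = []
    for i, nbrs in enumerate(index):
        # Neighbors are sorted by distance, so the first denser one is the nearest.
        d_i = next((d for d, j in nbrs if rho[j] > rho[i]), None)
        if d_i is None:
            # No denser object exists (a density peak): fall back to the largest distance.
            d_i = nbrs[-1][0] if nbrs else 0.0
        delta.append(d_i)
    return delta


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
index = build_list_index(points)      # built once, independent of dc
rho = local_density(index, dc=1.2)    # can be re-run cheaply for another dc
delta = dependent_distance(index, rho)
```

Because the index does not depend on dc, a user can try several values of dc without recomputing any distances, which is the kind of saving the abstract attributes to index-based solutions. The price is quadratic space in the number of objects, consistent with the space cost the abstract mentions.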

