Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity

12/27/2019
by C. Duran, et al.

Developing algorithms for unsupervised pattern recognition by nonlinear clustering remains a notable problem in data science. Markov clustering (MCL) is a well-known algorithm that simulates stochastic flows on a network of sample similarities to detect the cluster structure of the data, but it has never been generalized to handle nonlinear data. Minimum Curvilinearity (MC) is a principle that approximates nonlinear distances between samples in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over the samples' minimum spanning tree and then stored in a kernel. Here we propose MC-MCL, the first nonlinear kernel extension of MCL, which exploits Minimum Curvilinearity to improve the performance of MCL on real and synthetic data with underlying nonlinear patterns. We compare MC-MCL with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework for estimating nonlinear distances even when its kernel is combined with MCL. Indeed, MC-MCL outperforms classical MCL, and even the baseline clustering algorithms, on various nonlinear datasets.
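
To make the two ingredients concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The function names (`mc_distances`, `markov_cluster`, `mc_mcl`) are hypothetical. The conversion from distance to similarity, `exp(-d)`, is an assumption, since the abstract does not specify the kernel or any sparsification step. The MCL loop is the textbook expansion and inflation scheme with a fixed iteration count.

```python
import math


def mc_distances(points):
    """Pairwise path lengths over the Euclidean minimum spanning tree (MST)."""
    n = len(points)
    # Prim's algorithm on the complete Euclidean graph.
    in_tree = [False] * n
    best = [math.inf] * n          # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    adj = [[] for _ in range(n)]   # adjacency list of the MST
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # The curvilinear distance is the sum of edge weights on the unique tree path.
    dist = [[0.0] * n for _ in range(n)]
    for s in range(n):
        stack = [(s, -1)]
        while stack:
            u, prev = stack.pop()
            for v, w in adj[u]:
                if v != prev:
                    dist[s][v] = dist[s][u] + w
                    stack.append((v, u))
    return dist


def markov_cluster(sim, inflation=2.0, iters=50):
    """Textbook MCL on a dense similarity matrix; returns one attractor index per node."""
    n = len(sim)

    def normalize(m):
        # Make every column sum to 1 (column-stochastic matrix).
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

    m = normalize(sim)
    for _ in range(iters):
        # Expansion: square the matrix to spread flow along longer walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power, then renormalize, to sharpen strong flows.
        m = normalize([[x ** inflation for x in row] for row in m])
    # Nodes whose columns concentrate on the same row share a cluster.
    return [max(range(n), key=lambda i: m[i][j]) for j in range(n)]


def mc_mcl(points, inflation=2.0):
    """Feed an MC-based similarity into MCL (the exp(-d) kernel is an assumption)."""
    dist = mc_distances(points)
    sim = [[math.exp(-d) for d in row] for row in dist]
    return markov_cluster(sim, inflation)
```

As a usage example, take the seven points `(0,2), (0,1), (0,0), (1,0), (2,0), (2,1), (2,2)`, which trace a U shape. The two endpoints are 2 apart in Euclidean distance, but `mc_distances` returns 6 for that pair, because the path over the minimum spanning tree follows the curve. This is the sense in which the curvilinear distance captures nonlinear structure that a straight-line distance misses.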


