Automatic selection of clustering algorithms using supervised graph embedding

11/16/2020
by Noy Cohen-Shapira, et al.

The widespread adoption of machine learning (ML) techniques and the extensive expertise required to apply them have led to increased interest in automated ML solutions that reduce the need for human intervention. One of the main challenges in applying ML to previously unseen problems is algorithm selection - the identification of high-performing algorithm(s) for a given dataset, task, and evaluation measure. This study addresses the algorithm selection challenge for data clustering, a fundamental task in data mining that is aimed at grouping similar objects. We present MARCO-GE, a novel meta-learning approach for the automated recommendation of clustering algorithms. MARCO-GE first transforms datasets into graphs and then utilizes a graph convolutional neural network technique to extract their latent representation. Using the embedding representations obtained, MARCO-GE trains a ranking meta-model capable of accurately recommending top-performing algorithms for a new dataset and clustering evaluation measure. Extensive evaluation on 210 datasets, 13 clustering algorithms, and 10 clustering measures demonstrates the effectiveness of our approach and its dominance in terms of predictive and generalization performance over state-of-the-art clustering meta-learning approaches.
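The first two stages described above (dataset to graph, then graph convolution to a latent representation) can be illustrated with a minimal sketch. The abstract does not specify how MARCO-GE builds its graphs or configures its network, so everything here is an assumption made for illustration: the k-nearest-neighbour graph over instances, the single untrained convolution layer with the standard symmetric normalization, the mean pooling, and the hypothetical names `dataset_to_graph`, `gcn_layer`, and `embed_dataset`. The ranking meta-model that would consume these embeddings is not shown.

```python
import numpy as np


def dataset_to_graph(X, k=3):
    """Build a symmetric k-nearest-neighbour adjacency matrix over instances.

    Assumption: the abstract does not say how the graph is constructed;
    a kNN graph is used here only as a common, simple choice.
    """
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)             # never pick a node as its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest instances
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)                  # symmetrise the directed kNN edges


def gcn_layer(A, H, W):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)


def embed_dataset(X, dim=8, k=3, seed=0):
    """Map a dataset to one fixed-length vector (untrained weights, illustration only)."""
    rng = np.random.default_rng(seed)
    A = dataset_to_graph(X, k)
    W = rng.normal(size=(X.shape[1], dim))
    H = gcn_layer(A, X, W)                     # node (instance) embeddings
    return H.mean(axis=0)                      # mean-pool into a dataset embedding


X = np.random.default_rng(1).normal(size=(20, 4))
embedding = embed_dataset(X)
```

In a full pipeline, one such vector per dataset would serve as the meta-feature input to a ranking model trained on the observed performance of each clustering algorithm; here the weights are random, so the vector carries no learned meaning.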


research
05/20/2022

FIND: Explainable Framework for Meta-learning

Meta-learning is used to efficiently enable the automatic selection of m...
research
10/31/2019

RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines

The explosion of digital data has created multiple opportunities for org...
research
06/07/2021

Evaluating Meta-Feature Selection for the Algorithm Recommendation Problem

With the popularity of Machine Learning (ML) solutions, algorithms and d...
research
05/23/2023

Clustering Indices based Automatic Classification Model Selection

Classification model selection is a process of identifying a suitable mo...
research
10/04/2012

Learning Heterogeneous Similarity Measures for Hybrid-Recommendations in Meta-Mining

The notion of meta-mining has appeared recently and extends the traditio...
research
11/21/2022

Explainable Model-specific Algorithm Selection for Multi-Label Classification

Multi-label classification (MLC) is an ML task of predictive modeling in...
research
06/18/2022

AutoGML: Fast Automatic Model Selection for Graph Machine Learning

Given a graph learning task, such as link prediction, on a new graph dat...
