RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines
The explosion of digital data has created many opportunities for organizations and individuals to leverage machine learning (ML) to transform the way they operate. However, the shortage of machine learning experts (data scientists) often hinders the adoption of ML. To alleviate this shortage, multiple approaches for automating machine learning have been proposed in recent years. While these approaches are effective, they often require a great deal of time and computing resources. In this study we propose RankML, a meta-learning-based approach for predicting the performance of whole machine learning pipelines. Given a previously unseen dataset, a performance metric, and a set of candidate pipelines, RankML immediately produces a ranked list of all pipelines based on their predicted performance. Extensive evaluation on 193 datasets, covering both regression and classification tasks, shows that our approach achieves results on par with those of state-of-the-art, computationally heavy approaches.
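To make the input/output contract above concrete, the following is a minimal sketch of pre-ranking with a meta-model. It is not the RankML model itself: the abstract does not describe the meta-features or the learner, so this sketch substitutes a simple nearest-neighbor predictor over past results. The names `History`, `predict_score`, `rank_pipelines`, the two-dimensional meta-feature vectors, the pipeline labels, and all scores are hypothetical, and the sketch assumes a metric where higher is better.

```python
from math import dist
from typing import List, Sequence, Tuple

# Hypothetical meta-knowledge base: one record per (past dataset, pipeline) run,
# stored as (dataset meta-features, pipeline name, observed score).
History = List[Tuple[Sequence[float], str, float]]


def predict_score(meta: Sequence[float], pipeline: str,
                  history: History, k: int = 2) -> float:
    """Estimate a pipeline's score on a new dataset as the mean score it
    achieved on the k past datasets with the closest meta-features."""
    past = [(dist(meta, m), s) for m, p, s in history if p == pipeline]
    if not past:
        return float("-inf")  # no evidence for this pipeline: rank it last
    nearest = sorted(past)[:k]
    return sum(s for _, s in nearest) / len(nearest)


def rank_pipelines(meta: Sequence[float], candidates: Sequence[str],
                   history: History, k: int = 2) -> List[Tuple[str, float]]:
    """Return candidates sorted by predicted score, best first.
    No candidate pipeline is trained or executed on the new dataset."""
    scored = [(p, predict_score(meta, p, history, k)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative (made-up) history for two pipelines on three past datasets.
history: History = [
    ([0.1, 0.9], "tree", 0.80),
    ([0.2, 0.8], "tree", 0.70),
    ([0.9, 0.1], "tree", 0.40),
    ([0.1, 0.9], "linear", 0.50),
    ([0.2, 0.8], "linear", 0.60),
    ([0.9, 0.1], "linear", 0.90),
]

# Meta-features of a previously unseen dataset (also made up).
ranking = rank_pipelines([0.15, 0.85], ["linear", "tree", "svm"], history)
for name, score in ranking:
    print(name, score)
```

The point of the sketch is the cost profile: ranking requires only a lookup against previously collected results, which is why a pre-ranking step can return immediately, whereas search-based automation has to train and evaluate candidate pipelines on the new dataset.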