A Level-wise Taxonomic Perspective on Automated Machine Learning to Date and Beyond: Challenges and Opportunities

Automated machine learning (AutoML) is essentially the automation of the process of applying machine learning to real-world problems. The primary goals of AutoML tools are to make machine learning available to people who are not machine learning experts (domain experts), to improve the efficiency of machine learning, and to accelerate machine learning research. Although automation and efficiency are among AutoML's main selling points, the process still requires a surprising level of human involvement. A number of vital steps of the machine learning pipeline, including understanding the attributes of domain-specific data, defining prediction problems, and creating a suitable training data set, still tend to be done manually by a data scientist on an ad hoc basis. Often, this requires a lot of back-and-forth between the data scientist and domain experts, making the whole process more difficult and inefficient. Altogether, AutoML systems are still far from being a "real automatic system". In this review article, we present a level-wise taxonomic perspective on AutoML systems to date and beyond; that is, we introduce a new classification system with seven levels to distinguish AutoML systems based on their level of autonomy. We start with a discussion of what an end-to-end machine learning pipeline actually looks like and which of its sub-tasks have been automated so far. Next, we highlight the sub-tasks that are still done manually by a data scientist in most cases and how that limits a domain expert's access to machine learning. Then, we introduce the novel level-based taxonomy of AutoML systems and define each level according to its scope of automation support. Finally, we provide a road map of future research endeavors in the area of AutoML and discuss some important challenges in achieving this ambitious goal.



Automated Machine Learning: State-of-The-Art and Open Challenges

With the continuous and vast increase in the amount of data in our digit...

Towards "all-inclusive" Data Preparation to ensure Data Quality

Data preparation, especially data cleaning, is very important to ensure ...

(Re)Defining Expertise in Machine Learning Development

Domain experts are often engaged in the development of machine learning ...

Building Domain-Specific Machine Learning Workflows: A Conceptual Framework for the State-of-the-Practice

Domain experts are increasingly employing machine learning to solve thei...

Human experts vs. machines in taxa recognition

Biomonitoring of waterbodies is vital as the number of anthropogenic str...

Scaling Systematic Literature Reviews with Machine Learning Pipelines

Systematic reviews, which entail the extraction of data from large numbe...

Data Curation with Deep Learning [Vision]: Towards Self Driving Data Curation

Past. Data curation - the process of discovering, integrating, and clean...
