Modular Resource Centric Learning for Workflow Performance Prediction

11/15/2017
by   Alok Singh, et al.

Workflows provide an expressive programming model for fine-grained control of large-scale applications in distributed computing environments. Accurate estimates of the execution metrics of complex workflows on large-scale machines matter because the performance of scheduling algorithms that rely on those estimates degrades as prediction accuracy decreases. This in-progress paper presents a technique being developed to improve the accuracy of predicted performance metrics for large-scale workflows on distributed platforms. The central idea is to train resource-centric machine learning agents to capture the complex relationships between a set of program instructions and their performance metrics when executed on a specific resource. This resource-centric view of a workflow exploits the fact that predicting the execution times of a workflow's sub-modules requires monitoring and modeling only a few dynamic and static features. We transform the input workflow, which is essentially a directed acyclic graph of actions, into a Physical Resource Execution Plan (PREP). This transformation enables us to model an arbitrarily complex workflow as a set of simpler programs running on physical nodes. We assign a machine learning model to each resource type to capture its performance metrics as it executes different program instructions under varying degrees of resource contention. Our algorithm takes the predicted metrics from each resource agent and composes the overall workflow performance metrics by using the structure of the corresponding Physical Resource Execution Plan.
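
The composition step can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the composition rule, so the sketch assumes the overall metric is the makespan, computed as the longest path through the plan's dependency graph. The names `agents`, `plan`, and `predict_makespan`, the plan layout, and the linear stand-in models are all hypothetical; in the paper each agent would be a trained machine learning model.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stand-ins for trained per-resource agents: each maps
# (amount of work, contention level) to a predicted runtime in seconds.
agents = {
    "cpu": lambda work, contention: work * (1.0 + contention),
    "disk": lambda work, contention: 2.0 * work * (1.0 + contention),
}

# Hypothetical execution plan: step -> (resource type, work, contention, dependencies).
plan = {
    "read": ("disk", 2.0, 0.5, set()),
    "compute_a": ("cpu", 4.0, 0.0, {"read"}),
    "compute_b": ("cpu", 3.0, 1.0, {"read"}),
    "write": ("disk", 1.0, 0.0, {"compute_a", "compute_b"}),
}


def predict_makespan(plan, agents):
    """Compose per-step predictions into a workflow-level makespan.

    Assumption: a step starts as soon as all of its dependencies finish,
    so the makespan is the longest path through the dependency graph.
    """
    deps = {step: spec[3] for step, spec in plan.items()}
    finish = {}
    # static_order() yields every step after all of its dependencies.
    for step in TopologicalSorter(deps).static_order():
        resource, work, contention, step_deps = plan[step]
        start = max((finish[d] for d in step_deps), default=0.0)
        finish[step] = start + agents[resource](work, contention)
    return max(finish.values(), default=0.0)


makespan = predict_makespan(plan, agents)
```

Keeping the agents separate from the plan structure mirrors the modular idea in the abstract: a model for one resource type could be retrained or swapped without touching the composition logic.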


