POAS: A high-performance scheduling framework for exploiting Accelerator Level Parallelism

by Pablo Antonio Martínez, et al.

Heterogeneous computing is becoming mainstream across all areas of computing. This new era in computer architecture brings a new paradigm called Accelerator Level Parallelism (ALP), in which accelerators are used concurrently to provide unprecedented levels of performance and energy efficiency. Reaching that goal requires solving many problems, one of the most challenging being co-execution. This paper develops POAS, a scheduling framework that provides a general method for bringing co-execution to generic applications. Unlike other scheduling approaches, POAS does not directly schedule applications. Instead, it is a generic model that transforms any application to make it suitable for co-execution in ALP environments. Our proposal consists of four distinct steps: predict, optimize, adapt and schedule. During these phases, the application is modified so that it can run in ALP environments. In this work we also apply our framework to a matrix multiplication case study, outlining the most critical steps to port the application with POAS. We evaluate our POAS-based implementation of matrix multiplication in a CPU/GPU/XPU environment using CPU cores, CUDA cores and tensor cores (XPU). Our experiments show that co-execution in the studied scenario can benefit from ALP, yielding speedups of up to 45 [...]. The proven flexibility and potential of POAS make it an excellent candidate to reach ALP in future computer systems.
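To make the four phases concrete, the following is a minimal sketch of co-executing a matrix multiplication across several devices. It is not the POAS implementation: the abstract does not describe the framework's actual prediction model or optimizer, so the per-device throughput figures, the proportional row split, and the function names (`split_rows`, `matmul_rows`, `co_execute`) are all assumptions made for illustration, and Python threads stand in for the CPU, GPU and XPU.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(m, throughputs):
    """Predict + optimize (hypothetical model): give each device a share of
    the m rows proportional to its assumed throughput, so that all devices
    would finish at roughly the same time."""
    total = sum(throughputs.values())
    shares = {dev: int(m * t / total) for dev, t in throughputs.items()}
    # Rows lost to rounding go to the fastest device.
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += m - sum(shares.values())
    return shares


def matmul_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B (plain Python)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def co_execute(a, b, throughputs):
    """Adapt + schedule: cut A into contiguous row blocks, one per device,
    and run the blocks concurrently. Threads are stand-ins for devices."""
    shares = split_rows(len(a), throughputs)
    blocks, start = [], 0
    for dev, n in shares.items():
        blocks.append(a[start:start + n])
        start += n
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        parts = list(pool.map(lambda blk: matmul_rows(blk, b), blocks))
    # Blocks were cut in order, so concatenating them restores C = A * B.
    return [row for part in parts for row in part]


if __name__ == "__main__":
    # Assumed relative throughputs; real values would come from profiling.
    devices = {"cpu": 1.0, "gpu": 3.0, "xpu": 6.0}
    a = [[1, 2], [3, 4], [5, 6]]
    identity = [[1, 0], [0, 1]]
    print(split_rows(10, devices))
    print(co_execute(a, identity, devices))
```

Splitting by rows keeps each block independent, so no synchronization is needed beyond collecting the partial results. A real port would replace the fixed throughput table with measured or modeled execution and transfer times.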


