Asynchronous Execution of Python Code on Task Based Runtime Systems

by R. Tohid et al.

Despite advances in parallel and distributed computing, the complexity of programming High Performance Computing (HPC) resources has deterred many domain experts, especially in machine learning and artificial intelligence (AI), from exploiting the performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenience of programming in low-level languages and the cost of acquiring the skills that such programming requires. In recent years, Python, supported by linear algebra libraries like NumPy, has gained popularity despite limitations that prevent such code from running in a distributed setting. Here we present a solution that preserves both high-level programming abstractions and parallel and distributed efficiency. Phylanx is an asynchronous array processing toolkit that transforms Python and NumPy operations into code that can be executed in parallel on HPC resources. It does so by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. Phylanx additionally provides introspection and visualization capabilities for debugging and performance analysis. We have tested the foundations of our approach by comparing our implementation of widely used machine learning algorithms to accepted NumPy standards.
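The idea of mapping an expression into a dependency tree whose nodes run as asynchronous tasks can be illustrated with a small sketch. This is not Phylanx or HPX code: it uses only Python's standard `ast` and `concurrent.futures` modules, plain lists stand in for NumPy arrays, and the names `submit_tree`, `run_async`, and `_OPS` are hypothetical helpers introduced here for illustration.

```python
import ast
import operator
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping from Python AST operators to element-wise kernels.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _elementwise(op, left, right):
    """Apply op element-wise to two equal-length lists (a NumPy stand-in)."""
    return [op(x, y) for x, y in zip(left, right)]


def submit_tree(node, env, pool):
    """Walk an expression AST and return a future for the node's value.

    Children are submitted before their parent, so each task only waits on
    futures queued earlier (this relies on the executor's FIFO work queue).
    """
    if isinstance(node, ast.Name):
        # Leaf: a variable lookup becomes a trivial task.
        return pool.submit(env.__getitem__, node.id)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = submit_tree(node.left, env, pool)
        right = submit_tree(node.right, env, pool)
        op = _OPS[type(node.op)]
        # The parent task blocks only on its own two dependencies.
        return pool.submit(
            lambda: _elementwise(op, left.result(), right.result())
        )
    raise NotImplementedError(ast.dump(node))


def run_async(expression, env, workers=4):
    """Parse a Python expression and evaluate its dependency tree as tasks."""
    tree = ast.parse(expression, mode="eval").body
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return submit_tree(tree, env, pool).result()


result = run_async(
    "a * b + c", {"a": [1, 2, 3], "b": [4, 5, 6], "c": [10, 20, 30]}
)
print(result)  # [14, 30, 48]
```

In this sketch the two operands of each binary operation are independent tasks, so a runtime is free to execute them concurrently; a task-based system such as HPX applies the same principle at much finer granularity and across nodes.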

