Quantifying Overheads in Charm++ and HPX using Task Bench

by   Nanmiao Wu, et al.

Asynchronous Many-Task (AMT) runtime systems take advantage of multi-core architectures with light-weight threads, asynchronous executions, and smart scheduling. In this paper, we present the comparison of the AMT systems Charm++ and HPX with the main stream MPI, OpenMP, and MPI+OpenMP libraries using the Task Bench benchmarks. Charm++ is a parallel programming language based on C++, supporting stackless tasks as well as light-weight threads asynchronously along with an adaptive runtime system. HPX is a C++ library for concurrency and parallelism, exposing C++ standards conforming API. First, we analyze the commonalities, differences, and advantageous scenarios of Charm++ and HPX in detail. Further, to investigate the potential overheads introduced by the tasking systems of Charm++ and HPX, we utilize an existing parameterized benchmark, Task Bench, wherein 15 different programming systems were implemented, e.g., MPI, OpenMP, MPI + OpenMP, and extend Task Bench by adding HPX implementations. We quantify the overheads of Charm++, HPX, and the main stream libraries in different scenarios where a single task and multi-task are assigned to each core, respectively. We also investigate each system's scalability and the ability to hide the communication latency.


page 1

page 2

page 3

page 4


Supporting OpenMP 5.0 Tasks in hpxMP – A study of an OpenMP implementation within Task Based Runtime Systems

OpenMP has been the de facto standard for single node parallelism for mo...

Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance

We present Task Bench, a parameterized benchmark designed to explore the...

TaskTorrent: a Lightweight Distributed Task-Based Runtime System in C++

We present TaskTorrent, a lightweight distributed task-based runtime in ...

Fibers are not (P)Threads: The Case for Loose Coupling of Asynchronous Programming Models and MPI Through Continuations

Asynchronous programming models (APM) are gaining more and more traction...

Code modernization strategies for short-range non-bonded molecular dynamics simulations

As modern HPC systems increasingly rely on greater core counts and wider...

An Introduction to hpxMP -- A Modern OpenMP Implementation Leveraging Asynchronous Many-Tasking System

Asynchronous Many-task (AMT) runtime systems have gained increasing acce...

Improving the performance of classical linear algebra iterative methods via hybrid parallelism

We propose fork-join and task-based hybrid implementations of four class...

Please sign up or login with your details

Forgot password? Click here to reset