Portability for GPU-accelerated molecular docking applications for cloud and HPC: can portable compiler directives provide performance across all platforms?

High-throughput structure-based screening of drug-like molecules has become a common tool in biomedical research. Recently, acceleration with graphics processing units (GPUs) has provided a large performance boost for molecular docking programs. Both cloud and high-performance computing (HPC) resources have been used for large screens with molecular docking programs. While NVIDIA GPUs have dominated cloud and HPC resources, new vendors such as AMD and Intel are now entering the field, creating the problem of software portability across different GPUs. Ideally, software productivity could be maximized with portable programming models that maintain high performance across architectures. Compiler directives have often been used as an easy way to offload parallel regions of a CPU-based program to a GPU accelerator, but they may also be an attractive programming model for providing portability across different GPU vendors. In that case the porting process may proceed in the reverse direction: from low-level, architecture-specific code to higher-level directive-based abstractions. MiniMDock is a new mini-application (miniapp) designed to capture the essential computational kernels found in molecular docking calculations, such as those used in pharmaceutical drug discovery efforts, in order to test different solutions for porting across GPU architectures. Here we extend MiniMDock to GPU offloading with OpenMP directives and compare its performance with that of kernels written in CUDA and HIP on both NVIDIA and AMD GPUs, as well as across different compilers, exploring performance bottlenecks. We document this reverse-porting process, from highly optimized device code to a higher-level version using directives, compare code structure, and describe barriers that were overcome in this effort.
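To make the directive-based approach concrete, the following is a minimal sketch in C of how an OpenMP `target` directive can offload a pairwise loop to a GPU. It is not taken from MiniMDock: the function name `toy_score`, the scoring term 1/(1 + r²), and the flat coordinate layout are assumptions chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical toy scoring function (NOT MiniMDock's scoring):
 * sums 1 / (1 + r^2) over all ligand-receptor atom pairs.
 * Coordinates are stored as flat x,y,z triples. */
double toy_score(const float *lig, int n_lig,
                 const float *rec, int n_rec)
{
    double total = 0.0;

    /* Offload the doubly nested loop to the default device.
     * map(to: ...) copies the inputs to device memory; the reduction
     * clause combines per-thread partial sums into `total`.
     * If no device is available, or the compiler is invoked without
     * OpenMP support, the same loop simply runs on the host. */
    #pragma omp target teams distribute parallel for collapse(2) \
        reduction(+:total) \
        map(to: lig[0:3*n_lig], rec[0:3*n_rec]) map(tofrom: total)
    for (int i = 0; i < n_lig; ++i) {
        for (int j = 0; j < n_rec; ++j) {
            float dx = lig[3*i]     - rec[3*j];
            float dy = lig[3*i + 1] - rec[3*j + 1];
            float dz = lig[3*i + 2] - rec[3*j + 2];
            float r2 = dx*dx + dy*dy + dz*dz;
            total += 1.0 / (1.0 + (double)r2);
        }
    }
    return total;
}
```

The same source can in principle be built for NVIDIA or AMD GPUs by changing only the compiler and its offload flags, which is the portability argument made above. Whether performance matches hand-written CUDA or HIP kernels is the question the paper investigates.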


research
01/18/2019

Exploiting OpenMP & OpenACC to Accelerate a Molecular Docking Mini-App in Heterogeneous HPC Nodes

In drug discovery, molecular docking is the task in charge of estimating...
research
04/24/2019

Exploring Memory Persistency Models for GPUs

Given its high integration density, high speed, byte addressability, and...
research
01/27/2022

IMEXLBM 1.0: A Proxy Application based on the Lattice Boltzmann Method for solving Computational Fluid Dynamic problems on GPUs

The US Department of Energy launched the Exascale Computing Project (ECP...
research
07/07/2020

On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters

The predominance of Kohn-Sham density functional theory (KS-DFT) for the...
research
08/15/2018

libhclooc: Software Library Facilitating Out-of-core Implementations of Accelerator Kernels on Hybrid Computing Platforms

Hardware accelerators such as Graphics Processing Units (GPUs), Intel Xe...
research
09/09/2023

Towards Accelerating High-Order Stencils on Modern GPUs and Emerging Architectures with a Portable Framework

PDE discretization schemes yielding stencil-like computing patterns are ...
research
01/18/2019

Tunable Approximations to Control Time-to-Solution in an HPC Molecular Docking Mini-App

The drug discovery process involves several tasks to be performed in viv...
