Advising OpenMP Parallelization via a Graph-Based Approach with Transformers

by Tal Kadosh, et al.

There is an ever-present need for shared-memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translating serial code to parallel code. However, recent studies have shown that these compilers are impractical in many scenarios. In this work, we combine the latest advancements in AI and natural language processing (NLP) with the vast amount of open-source code to address the problem of automatic parallelization. Specifically, we propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code, given its serial version. OMPify is a Transformer-based model that leverages a graph-based representation of source code, exploiting the inherent structure of code. We evaluated our tool by predicting the parallelization pragmas and attributes of a large corpus (Open-OMP-Plus) of over 54,000 serial code snippets written in C and C++. Our results demonstrate that OMPify outperforms existing approaches, namely the general-purpose and popular ChatGPT and the targeted PragFormer model, in terms of F1 score and accuracy. Specifically, OMPify achieves up to 90% accuracy on benchmarks such as PolyBench. Additionally, we performed an ablation study to assess the impact of different model components and present interesting insights derived from the study. Lastly, we also explored the potential of using data augmentation and curriculum learning techniques to improve the model's robustness and generalization capabilities.



