Advising OpenMP Parallelization via a Graph-Based Approach with Transformers
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | There is an ever-present need for shared-memory parallelization schemes to
exploit the full potential of multi-core architectures. The most common
parallelization API addressing this need today is OpenMP. Nevertheless, writing
parallel code manually is complex and effort-intensive. Thus, many
deterministic source-to-source (S2S) compilers have emerged, intending to
automate the process of translating serial to parallel code. However, recent
studies have shown that these compilers are impractical in many scenarios. In
this work, we combine the latest advancements in the field of AI and natural
language processing (NLP) with the vast amount of open-source code to address
the problem of automatic parallelization. Specifically, we propose a novel
approach, called OMPify, to detect and predict the OpenMP pragmas and
shared-memory attributes in parallel code, given its serial version. OMPify is
built on a Transformer model that leverages a graph-based representation
of source code, exploiting the code's inherent structure. We evaluated our
tool by predicting the parallelization pragmas and attributes of a large corpus
of over 54,000 snippets of serial code written in C and C++
(Open-OMP-Plus). Our results demonstrate that OMPify outperforms existing
approaches, namely the popular general-purpose ChatGPT and the targeted PragFormer
models, in terms of F1 score and accuracy. Specifically, OMPify achieves up to
90% accuracy on commonly used OpenMP benchmark tests such as NAS, SPEC, and
PolyBench. Additionally, we performed an ablation study to assess the impact of
different model components and present interesting insights derived from the
study. Lastly, we also explored the potential of using data augmentation and
curriculum learning techniques to improve the model's robustness and
generalization capabilities. |
---|---|
DOI: | 10.48550/arxiv.2305.11999 |
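
To make the prediction task in the summary concrete, the sketch below pairs a serial C loop with a hand-annotated OpenMP version. It is an illustration only and is not taken from the paper or from Open-OMP-Plus: the function names `dot_serial` and `dot_parallel` are hypothetical, and the directive shown is simply one a programmer might write by hand. The `#pragma omp parallel for` line stands for the "pragma" part of the task, and the `private` and `reduction` clauses stand for the "shared-memory attributes" part.

```c
/* Serial version: the kind of snippet given as input to the task.
 * (Hypothetical example, not from the paper's dataset.) */
double dot_serial(const double *a, const double *b, int n) {
    double sum = 0.0;
    double tmp;
    for (int i = 0; i < n; i++) {
        tmp = a[i] * b[i];
        sum += tmp;
    }
    return sum;
}

/* Hand-annotated parallel version of the same loop.
 * - "parallel for" splits the loop iterations across threads.
 * - private(tmp) gives each thread its own scratch copy of tmp.
 * - reduction(+:sum) combines per-thread partial sums at the end.
 * Without an OpenMP-enabled compiler flag the pragma is ignored
 * and the loop simply runs serially. */
double dot_parallel(const double *a, const double *b, int n) {
    double sum = 0.0;
    double tmp;
    #pragma omp parallel for private(tmp) reduction(+:sum)
    for (int i = 0; i < n; i++) {
        tmp = a[i] * b[i];
        sum += tmp;
    }
    return sum;
}
```

With GCC or Clang, the directive takes effect when the file is compiled with `-fopenmp`. Omitting either clause would be a bug: `tmp` would be shared and raced on, and `sum` would be updated concurrently without synchronization.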