A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | This paper presents a methodology for using LLVM-based tools to tune the
DCA++ (dynamical cluster approximation) application that targets the new ARM
A64FX processor. The goal is to describe the changes required for the new
architecture and generate efficient single instruction/multiple data (SIMD)
instructions that target the new Scalable Vector Extension instruction set.
During manual tuning, the authors used the LLVM tools to improve code
parallelization by using OpenMP SIMD, refactored the code and applied
transformations that enabled SIMD optimizations, and ensured that the correct
libraries were used to achieve optimal performance. By applying these code
changes, code speed was increased by 1.98x and 78 GFlops were achieved on the
A64FX processor. The authors aim to automate parts of these efforts in the
OpenMP Advisor tool, which is built on top of existing and newly introduced LLVM
tooling. |
---|---|
DOI: | 10.48550/arxiv.2106.14332 |
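
The summary mentions annotating loops with OpenMP SIMD so that the compiler can emit vector instructions. The following is a minimal sketch of that general technique, not code from DCA++ or from the paper: the `axpy` and `dot` kernels and their names are hypothetical, chosen only for illustration. The pragmas take effect when the compiler is invoked with OpenMP SIMD support enabled (for example `-fopenmp-simd` in Clang or GCC) and are ignored otherwise, so the code still runs correctly without them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel (not from DCA++): y[i] += a * x[i].
// Raw pointers give the compiler a simple loop with no container calls
// inside the body, which makes vectorization easier to apply.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
  const std::size_t n = x.size() < y.size() ? x.size() : y.size();
  const double* xp = x.data();
  double* yp = y.data();
  // Ask the compiler to vectorize this loop; iterations are independent.
#pragma omp simd
  for (std::size_t i = 0; i < n; ++i) {
    yp[i] += a * xp[i];
  }
}

// Hypothetical kernel (not from DCA++): dot product of x and y.
// The reduction clause tells the compiler that partial sums may be kept
// per vector lane and combined at the end of the loop.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
  const std::size_t n = x.size() < y.size() ? x.size() : y.size();
  const double* xp = x.data();
  const double* yp = y.data();
  double sum = 0.0;
#pragma omp simd reduction(+ : sum)
  for (std::size_t i = 0; i < n; ++i) {
    sum += xp[i] * yp[i];
  }
  return sum;
}
```

Whether a given loop is actually vectorized for the Scalable Vector Extension depends on the compiler version and target flags; Clang's `-Rpass=loop-vectorize` and `-Rpass-missed=loop-vectorize` remarks are one way to check.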