Design of Neko—A Scalable High‐Fidelity Simulation Framework With Extensive Accelerator Support
Published in: | Concurrency and computation 2025-01, Vol.37 (2), p.n/a |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | ABSTRACT
Recent trends and advancements in including more diverse and heterogeneous hardware in High‐Performance Computing (HPC) are challenging scientific software developers in their pursuit of efficient numerical methods with sustained performance across a diverse set of platforms. As a result, researchers are today forced to re‐factor their codes to leverage these powerful new heterogeneous systems. We present our design considerations of Neko—a portable framework for high‐fidelity spectral element flow simulations. Unlike prior works, Neko adopts a modern object‐oriented Fortran 2008 approach, allowing multi‐tier abstractions of the solver stack and facilitating various hardware backends ranging from general‐purpose processors, accelerators down to exotic vector processors and Field‐Programmable Gate Arrays (FPGAs). Focusing on the performance and portability of Neko, we describe the framework's device abstraction layer managing device memory, data transfer and kernel launches from Fortran, allowing for a solver written in a hardware‐neutral yet performant way. Accelerator‐specific optimizations are also discussed, with auto‐tuning of key kernels and various communication strategies using device‐aware MPI. Finally, we present performance measurements on a wide range of computing platforms, including the EuroHPC pre‐exascale system LUMI, where Neko achieves excellent parallel efficiency for a large direct numerical simulation (DNS) of turbulent fluid flow using up to 80% of the entire LUMI supercomputer. |
---|---|
ISSN: | 1532-0626, 1532-0634 |
DOI: | 10.1002/cpe.8340 |