Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Sparse linear algebra routines are fundamental building blocks of a large
variety of scientific applications. Direct solvers, which are methods for
solving linear systems via the factorization of matrices into products of
triangular matrices, are commonly used in many contexts. The Cholesky
factorization is the fastest direct method for symmetric positive definite
matrices. This paper presents selective nesting, a method to determine the
optimal task granularity for the parallel Cholesky factorization based on the
structure of sparse matrices. We propose the OPT-D-COST algorithm, which
automatically and dynamically applies selective nesting. OPT-D-COST leverages
matrix sparsity to drive complex task-based parallel workloads in the context
of direct solvers. We run an extensive evaluation campaign considering a
heterogeneous set of 60 sparse matrices and a parallel machine featuring the
A64FX processor. OPT-D-COST delivers an average speedup of
1.46$\times$ over the best state-of-the-art parallel method for running
direct solvers. |
---|---|
DOI: | 10.48550/arxiv.2202.09288 |
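
The abstract describes direct solvers as methods that factor a matrix into triangular factors and then solve with those factors. As a rough illustration of that idea only, the sketch below factors a small dense symmetric positive definite matrix as A = L L^T and solves A x = b by forward and backward substitution. It is sequential and dense, so it does not exploit sparsity or task parallelism, and it is not the paper's OPT-D-COST algorithm or selective nesting. The function names `cholesky` and `solve` and the 3x3 example matrix are assumptions chosen for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that a = L L^T.

    Assumes `a` is a square, symmetric positive definite matrix
    given as a list of lists (dense storage, illustration only).
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry of column j.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve(a, b):
    """Solve a x = b using the factorization a = L L^T."""
    l = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Backward substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(l[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / l[i][i]
    return x


if __name__ == "__main__":
    A = [[4.0, 12.0, -16.0],
         [12.0, 37.0, -43.0],
         [-16.0, -43.0, 98.0]]
    b = [-20.0, -43.0, 192.0]
    print(cholesky(A))
    print(solve(A, b))
```

In a sparse solver, many entries of L are structurally zero, so the column updates above split into tasks of very different sizes. That variation is the task-granularity question the abstract refers to.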