Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX

Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which are methods for solving linear systems via the factorization of matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factoriz...

Detailed description

Bibliographic details
Published in:	arXiv.org 2022-02
Main authors:	Valentin Le Fèvre; Usui, Tetsuzo; Casas, Marc
Format:	Article
Language:	English
Subjects:	Algorithms; Cholesky factorization; Factorization; Linear algebra; Linear systems; Mathematical analysis; Microprocessors; Nesting; Optimization; Solvers; Sparse matrices; Sparsity
Online access:	Full text

container_title	arXiv.org
creator	Valentin Le Fèvre; Usui, Tetsuzo; Casas, Marc
description	Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which solve linear systems by factorizing matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factorization is the fastest direct method for symmetric positive definite matrices. This paper presents selective nesting, a method to determine the optimal task granularity for the parallel Cholesky factorization based on the structure of sparse matrices. We propose the OPT-D-COST algorithm, which automatically and dynamically applies selective nesting. OPT-D-COST leverages matrix sparsity to drive complex task-based parallel workloads in the context of direct solvers. We run an extensive evaluation campaign considering a heterogeneous set of 60 sparse matrices and a parallel machine featuring the A64FX processor. OPT-D-COST delivers an average performance speedup of 1.46$\times$ with respect to the best state-of-the-art parallel method for running direct solvers.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2631381097</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2631381097</sourcerecordid><originalsourceid>FETCH-proquest_journals_26313810973</originalsourceid><addsrcrecordid>eNqNiz0LwjAUAIMgWLT_4YFzIU365SjFIoI42MGtBPtKU2tfTdJBf70OujvdcHcz5gkpwyCLhFgw39qOcy6SVMSx9NjhNDp91y_lNA1ADbgW4TwqYxGOU-90ULYGVY015C31aG9PKNTVkfk9DRnYJlFxWbF5o3qL_pdLti52Zb4PRkOPCa2rOprM8FGVSGQos5BvUvlf9QabtDwy</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2631381097</pqid></control><display><type>article</type><title>Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX</title><source>Free E- Journals</source><creator>Valentin Le Fèvre ; Usui, Tetsuzo ; Casas, Marc</creator><creatorcontrib>Valentin Le Fèvre ; Usui, Tetsuzo ; Casas, Marc</creatorcontrib><description>Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which are methods for solving linear systems via the factorization of matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factorization is the fastest direct method for symmetric and definite positive matrices. This paper presents selective nesting, a method to determine the optimal task granularity for the parallel Cholesky factorization based on the structure of sparse matrices. We propose the OPT-D-COST algorithm, which automatically and dynamically applies selective nesting. OPT-D-COST leverages matrix sparsity to drive complex task-based parallel workloads in the context of direct solvers. We run an extensive evaluation campaign considering a heterogeneous set of 60 sparse matrices and a parallel machine featuring the A64FX processor. 
OPT-D-COST delivers an average performance speedup of 1.46$\times$ with respect to the best state-of-the-art parallel method to run direct solvers.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Cholesky factorization ; Factorization ; Linear algebra ; Linear systems ; Mathematical analysis ; Microprocessors ; Nesting ; Optimization ; Solvers ; Sparse matrices ; Sparsity</subject><ispartof>arXiv.org, 2022-02</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Valentin Le Fèvre</creatorcontrib><creatorcontrib>Usui, Tetsuzo</creatorcontrib><creatorcontrib>Casas, Marc</creatorcontrib><title>Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX</title><title>arXiv.org</title><description>Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which are methods for solving linear systems via the factorization of matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factorization is the fastest direct method for symmetric and definite positive matrices. This paper presents selective nesting, a method to determine the optimal task granularity for the parallel Cholesky factorization based on the structure of sparse matrices. 
We propose the OPT-D-COST algorithm, which automatically and dynamically applies selective nesting. OPT-D-COST leverages matrix sparsity to drive complex task-based parallel workloads in the context of direct solvers. We run an extensive evaluation campaign considering a heterogeneous set of 60 sparse matrices and a parallel machine featuring the A64FX processor. OPT-D-COST delivers an average performance speedup of 1.46$\times$ with respect to the best state-of-the-art parallel method to run direct solvers.</description><subject>Algorithms</subject><subject>Cholesky factorization</subject><subject>Factorization</subject><subject>Linear algebra</subject><subject>Linear systems</subject><subject>Mathematical analysis</subject><subject>Microprocessors</subject><subject>Nesting</subject><subject>Optimization</subject><subject>Solvers</subject><subject>Sparse matrices</subject><subject>Sparsity</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNiz0LwjAUAIMgWLT_4YFzIU365SjFIoI42MGtBPtKU2tfTdJBf70OujvdcHcz5gkpwyCLhFgw39qOcy6SVMSx9NjhNDp91y_lNA1ADbgW4TwqYxGOU-90ULYGVY015C31aG9PKNTVkfk9DRnYJlFxWbF5o3qL_pdLti52Zb4PRkOPCa2rOprM8FGVSGQos5BvUvlf9QabtDwy</recordid><startdate>20220218</startdate><enddate>20220218</enddate><creator>Valentin Le Fèvre</creator><creator>Usui, Tetsuzo</creator><creator>Casas, Marc</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220218</creationdate><title>Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX</title><author>Valentin Le Fèvre ; Usui, Tetsuzo ; Casas, Marc</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26313810973</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Cholesky factorization</topic><topic>Factorization</topic><topic>Linear algebra</topic><topic>Linear systems</topic><topic>Mathematical analysis</topic><topic>Microprocessors</topic><topic>Nesting</topic><topic>Optimization</topic><topic>Solvers</topic><topic>Sparse matrices</topic><topic>Sparsity</topic><toplevel>online_resources</toplevel><creatorcontrib>Valentin Le Fèvre</creatorcontrib><creatorcontrib>Usui, Tetsuzo</creatorcontrib><creatorcontrib>Casas, Marc</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering 
Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Valentin Le Fèvre</au><au>Usui, Tetsuzo</au><au>Casas, Marc</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX</atitle><jtitle>arXiv.org</jtitle><date>2022-02-18</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which are methods for solving linear systems via the factorization of matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factorization is the fastest direct method for symmetric and definite positive matrices. This paper presents selective nesting, a method to determine the optimal task granularity for the parallel Cholesky factorization based on the structure of sparse matrices. We propose the OPT-D-COST algorithm, which automatically and dynamically applies selective nesting. OPT-D-COST leverages matrix sparsity to drive complex task-based parallel workloads in the context of direct solvers. We run an extensive evaluation campaign considering a heterogeneous set of 60 sparse matrices and a parallel machine featuring the A64FX processor. OPT-D-COST delivers an average performance speedup of 1.46$\times$ with respect to the best state-of-the-art parallel method to run direct solvers.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-02
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2631381097
source	Free E- Journals
subjects	Algorithms; Cholesky factorization; Factorization; Linear algebra; Linear systems; Mathematical analysis; Microprocessors; Nesting; Optimization; Solvers; Sparse matrices; Sparsity
title	Optimization of the Sparse Multi-Threaded Cholesky Factorization for A64FX
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T11%3A17%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Optimization%20of%20the%20Sparse%20Multi-Threaded%20Cholesky%20Factorization%20for%20A64FX&rft.jtitle=arXiv.org&rft.au=Valentin%20Le%20F%C3%A8vre&rft.date=2022-02-18&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2631381097%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2631381097&rft_id=info:pmid/&rfr_iscdi=true