Coarsening and parallelism with reduction multigrids for hyperbolic Boltzmann transport

Reduction multigrids have recently shown good performance in hyperbolic problems without the need for Gauss-Seidel smoothers. When applied to the hyperbolic limit of the Boltzmann Transport Equation (BTE), these methods result in very close to O(n) growth in work with problem size on...

Detailed description

Saved in:
Bibliographic Details
Published in: The international journal of high performance computing applications 2024-12
Main authors: Dargaville, Steven, Smedley-Stevenson, Richard, Smith, Paul, Pain, Christopher C
Format: Article
Language: eng
Online access: Full text
container_title The international journal of high performance computing applications
creator Dargaville, Steven
Smedley-Stevenson, Richard
Smith, Paul
Pain, Christopher C
description Reduction multigrids have recently shown good performance in hyperbolic problems without the need for Gauss-Seidel smoothers. When applied to the hyperbolic limit of the Boltzmann Transport Equation (BTE), these methods result in very close to O(n) growth in work with problem size on unstructured grids. This scalability relies on the CF splitting producing an A_ff block that is easy to invert. We introduce a parallel two-pass CF splitting designed to give a diagonally dominant A_ff. The first pass computes a maximal independent set in the symmetrized strong connections. The second pass converts F-points to C-points based on the row-wise diagonal dominance of A_ff. We find this two-pass CF splitting outperforms common CF splittings available in hypre. Furthermore, parallelisation of reduction multigrids in hyperbolic problems is difficult, as we require both long-range grid-transfer operators and slow coarsenings (with rates of ∼1/2 in both 2D and 3D). We find that good parallel performance in both the setup and the solve depends on several factors: repartitioning the coarse grids, reducing the number of active MPI ranks as we coarsen, truncating the multigrid hierarchy and applying a GMRES polynomial as a coarse-grid solver. We compare the performance of two different reduction multigrids: AIRG (which we developed previously) and the hypre implementation of ℓAIR. In the streaming limit with AIRG, we demonstrate 81% weak-scaling efficiency in the solve from 2 to 64 nodes (256 to 8192 cores) with only 8.8k unknowns per core, with solve times up to 5.9× smaller than the ℓAIR implementation in hypre.
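The two-pass CF splitting described above can be illustrated with a minimal serial sketch. This is not the authors' parallel implementation (their first pass is a parallel MIS; a greedy serial MIS stands in for it here), the strength-of-connection threshold is an assumption, and the function names are hypothetical:

```python
import numpy as np

def strong_connections(A, theta=0.5):
    """Boolean strength matrix: (i, j) is strong if |A_ij| is at least
    theta times the largest off-diagonal magnitude in row i."""
    n = A.shape[0]
    S = np.zeros((n, n), dtype=bool)
    for i in range(n):
        off = np.abs(A[i].copy())
        off[i] = 0.0                     # ignore the diagonal
        m = off.max()
        if m > 0:
            S[i] = off >= theta * m
    return S

def two_pass_cf_splitting(A, theta=0.5):
    """Sketch of a two-pass CF splitting (serial stand-in).
    Pass 1: greedy maximal independent set on the symmetrized strong
    connections -> C-points; all other points start as F-points.
    Pass 2: one sweep converting any F-point whose A_ff row is not
    (weakly) diagonally dominant into a C-point."""
    n = A.shape[0]
    S_sym = strong_connections(A, theta)
    S_sym = S_sym | S_sym.T              # symmetrise the strong connections
    np.fill_diagonal(S_sym, False)

    # Pass 1: greedy MIS -> C-points.
    is_C = np.zeros(n, dtype=bool)
    blocked = np.zeros(n, dtype=bool)
    for i in range(n):
        if not blocked[i]:
            is_C[i] = True
            blocked[S_sym[i]] = True     # neighbours cannot join the MIS
            blocked[i] = True

    # Pass 2: demand row-wise diagonal dominance of A_ff.
    for i in np.where(~is_C)[0]:
        f_mask = ~is_C
        f_mask[i] = False                # exclude the diagonal entry
        off_sum = np.abs(A[i, f_mask]).sum()
        if np.abs(A[i, i]) < off_sum:
            is_C[i] = True               # convert F-point to C-point
    return is_C
```

By construction, every F-row of the resulting A_ff is weakly diagonally dominant, which is the property the paper relies on to make A_ff easy to invert; the dense-matrix loops here would of course be sparse and distributed in practice.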
doi_str_mv 10.1177/10943420241304759
format Article
fulltext fulltext
identifier ISSN: 1094-3420
ispartof The international journal of high performance computing applications, 2024-12
issn 1094-3420
eissn 1741-2846
language eng
recordid cdi_crossref_primary_10_1177_10943420241304759
source SAGE Journals; Alma/SFX Local Collection
title Coarsening and parallelism with reduction multigrids for hyperbolic Boltzmann transport
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T23%3A53%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Coarsening%20and%20parallelism%20with%20reduction%20multigrids%20for%20hyperbolic%20Boltzmann%20transport&rft.jtitle=The%20international%20journal%20of%20high%20performance%20computing%20applications&rft.au=Dargaville,%20Steven&rft.date=2024-12-13&rft.issn=1094-3420&rft.eissn=1741-2846&rft_id=info:doi/10.1177/10943420241304759&rft_dat=%3Ccrossref%3E10_1177_10943420241304759%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
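The abstract above lists a GMRES polynomial coarse-grid solver among the factors behind good parallel performance. As an illustration of the idea only (the paper's construction is not reproduced here; the function names are hypothetical, and this power-basis least-squares formulation is only stable for the low polynomial degrees typical of coarse-grid solvers):

```python
import numpy as np

def gmres_poly_coeffs(A, b, degree):
    """Coefficients y of p(A) = sum_k y[k] A^k minimising ||b - A p(A) b||,
    i.e. the polynomial implicitly built by `degree+1` steps of GMRES
    started from b. Uses a power-basis least-squares fit (simple but
    ill-conditioned for large degrees)."""
    K = np.empty((b.size, degree + 1))
    K[:, 0] = b
    for k in range(1, degree + 1):
        K[:, k] = A @ K[:, k - 1]        # Krylov columns b, Ab, ..., A^d b
    y, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
    return y

def apply_gmres_poly(A, y, v):
    """Apply p(A) v via Horner's rule: an approximate solve A x ~ v,
    needing only matrix-vector products (attractive in parallel, since
    it avoids the reductions of a full Krylov solve at apply time)."""
    x = y[-1] * v
    for c in reversed(y[:-1]):
        x = A @ x + c * v
    return x
```

Once the coefficients are fixed in the setup phase, each coarse-grid solve is just `degree` matrix-vector products, which is one plausible reason such polynomials help at scale.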