Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations

Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance compared to those designed for more general systems of equations. This performance advantage is even more pronounced for...

Detailed description

Bibliographic details
Published in:	Journal of parallel and distributed computing, 2013-02, Vol.73 (2), p.273-280
Main authors:	Seal, Sudip K.; Perumalla, Kalyan S.; Hirshman, Steven P.
Format:	Article
Language:	eng
Subjects:	Applied sciences; Block tridiagonal matrix; Computer science, control theory, systems; Computer systems and distributed systems. User interface; Cyclic reduction; Exact sciences and technology; Parallel solver; Prefix computation; Software
Online access:	Full text

container_end_page	280
container_issue	2
container_start_page	273
container_title	Journal of parallel and distributed computing
container_volume	73
creator	Seal, Sudip K.; Perumalla, Kalyan S.; Hirshman, Steven P.
description	Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance compared to those designed for more general systems of equations. This performance advantage is even more pronounced for block tridiagonal systems. In this paper, we re-examine the performance of these two algorithms, taking the effects of block size into account. Depending on the block size, the parameter space spanned by the number of block rows, the size of the blocks, and the processor count is shown to favor one or the other of the two algorithms. A critical block size that separates these two regions is shown to emerge, and its dependence both on problem-dependent parameters and on machine-specific constants is established. Empirical verification of these analytical findings is carried out on up to 2048 cores of a Cray XT4 system. ► Studies the effects of block size on the performance of block tridiagonal solvers. ► Establishes the existence of a critical block size. ► Establishes dependence of critical block size on N, P and machine-dependent constants. ► Studies the effect of block size on the weak and strong scalability. ► Presents empirical results to support analytical findings.
doi_str_mv	10.1016/j.jpdc.2012.10.003
format	Article
fullrecord	<record><control><sourceid>elsevier_osti_</sourceid><recordid>TN_cdi_osti_scitechconnect_1060245</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0743731512002535</els_id><sourcerecordid>S0743731512002535</sourcerecordid><originalsourceid>FETCH-LOGICAL-c357t-2ebcc24589536e666eff7589db524a57b845239c8bb4513327743d20e8cdab4c3</originalsourceid><addsrcrecordid>eNp9UE1r3DAQFaGBbJP8gZxEoUe7-rS90EsJbVJYCITkbOTReKON1nIkJWT_fWU2tLeehpl57828R8gVZzVnvPm2q3ezhVowLsqgZkyekBVn66Zineo-kRVrlaxayfUZ-ZzSjjHOddutSL7HN5dcdtOWziYa79FTOIB3QCPaV8guTNRM9t92jji692owCS01fhuiy0_7RMcQ6eADPNMcnXVmGybjaTqkjGUbRoovr2aRSxfkdDQ-4eVHPSePv34-XN9Wm7ub39c_NhVI3eZK4AAglO7WWjbYNA2OY1s6O2ihjG6HTmkh19ANg9JcStEWk1Yw7MCaQYE8J1-OuiFl1ydwGeEJwjQh5J6zhhXxAhJHEMSQUvHWz9HtTTwURL-E2-_6Jdx-CXeZlXAL6euRNJsExo_RTODSX6ZouVZCqoL7fsRhsfnmMC5f4ARoXVyesMH978wf0KaR2w</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Seal, Sudip K. ; Perumalla, Kalyan S. ; Hirshman, Steven P.</creator><creatorcontrib>Seal, Sudip K. ; Perumalla, Kalyan S. ; Hirshman, Steven P. ; Center for Computational Sciences ; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)</creatorcontrib><description>Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance compared to those designed for more general systems of equations. This performance advantage is even more pronounced for block tridiagonal systems. In this paper, we re-examine the performances of these two algorithms taking the effects of block size into account. 
Depending on the block size, the parameter space spanned by the number of block rows, size of the blocks and the processor count is shown to favor one or the other of the two algorithms. A critical block size that separates these two regions is shown to emerge and its dependence both on problem dependent parameters and on machine-specific constants is established. Empirical verification of these analytical findings is carried out on up to 2048 cores of a Cray XT4 system. ► Studies the effects of block size on the performance of block tridiagonal solvers. ► Establishes the existence of a critical block size. ► Establishes dependence of critical block size on N, P and machine-dependent constants. ► Studies the effect of block size on the weak and strong scalability. ► Presents empirical results to support analytical findings.</description><identifier>ISSN: 0743-7315</identifier><identifier>EISSN: 1096-0848</identifier><identifier>DOI: 10.1016/j.jpdc.2012.10.003</identifier><language>eng</language><publisher>Amsterdam: Elsevier Inc</publisher><subject>Applied sciences ; Block tridiagonal matrix ; Computer science; control theory; systems ; Computer systems and distributed systems. 
User interface ; Cyclic reduction ; Exact sciences and technology ; Parallel solver ; Prefix computation ; Software</subject><ispartof>Journal of parallel and distributed computing, 2013-02, Vol.73 (2), p.273-280</ispartof><rights>2012 Elsevier Inc.</rights><rights>2014 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c357t-2ebcc24589536e666eff7589db524a57b845239c8bb4513327743d20e8cdab4c3</citedby><cites>FETCH-LOGICAL-c357t-2ebcc24589536e666eff7589db524a57b845239c8bb4513327743d20e8cdab4c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.jpdc.2012.10.003$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,777,781,882,3537,27905,27906,45976</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=27154234$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/biblio/1060245$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Seal, Sudip K.</creatorcontrib><creatorcontrib>Perumalla, Kalyan S.</creatorcontrib><creatorcontrib>Hirshman, Steven P.</creatorcontrib><creatorcontrib>Center for Computational Sciences</creatorcontrib><creatorcontrib>Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). 
Oak Ridge Leadership Computing Facility (OLCF)</creatorcontrib><title>Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations</title><title>Journal of parallel and distributed computing</title><description>Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance compared to those designed for more general systems of equations. This performance advantage is even more pronounced for block tridiagonal systems. In this paper, we re-examine the performances of these two algorithms taking the effects of block size into account. Depending on the block size, the parameter space spanned by the number of block rows, size of the blocks and the processor count is shown to favor one or the other of the two algorithms. A critical block size that separates these two regions is shown to emerge and its dependence both on problem dependent parameters and on machine-specific constants is established. Empirical verification of these analytical findings is carried out on up to 2048 cores of a Cray XT4 system. ► Studies the effects of block size on the performance of block tridiagonal solvers. ► Establishes the existence of a critical block size. ► Establishes dependence of critical block size on N, P and machine-dependent constants. ► Studies the effect of block size on the weak and strong scalability. ► Presents empirical results to support analytical findings.</description><subject>Applied sciences</subject><subject>Block tridiagonal matrix</subject><subject>Computer science; control theory; systems</subject><subject>Computer systems and distributed systems. 
User interface</subject><subject>Cyclic reduction</subject><subject>Exact sciences and technology</subject><subject>Parallel solver</subject><subject>Prefix computation</subject><subject>Software</subject><issn>0743-7315</issn><issn>1096-0848</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNp9UE1r3DAQFaGBbJP8gZxEoUe7-rS90EsJbVJYCITkbOTReKON1nIkJWT_fWU2tLeehpl57828R8gVZzVnvPm2q3ezhVowLsqgZkyekBVn66Zineo-kRVrlaxayfUZ-ZzSjjHOddutSL7HN5dcdtOWziYa79FTOIB3QCPaV8guTNRM9t92jji692owCS01fhuiy0_7RMcQ6eADPNMcnXVmGybjaTqkjGUbRoovr2aRSxfkdDQ-4eVHPSePv34-XN9Wm7ub39c_NhVI3eZK4AAglO7WWjbYNA2OY1s6O2ihjG6HTmkh19ANg9JcStEWk1Yw7MCaQYE8J1-OuiFl1ydwGeEJwjQh5J6zhhXxAhJHEMSQUvHWz9HtTTwURL-E2-_6Jdx-CXeZlXAL6euRNJsExo_RTODSX6ZouVZCqoL7fsRhsfnmMC5f4ARoXVyesMH978wf0KaR2w</recordid><startdate>20130201</startdate><enddate>20130201</enddate><creator>Seal, Sudip K.</creator><creator>Perumalla, Kalyan S.</creator><creator>Hirshman, Steven P.</creator><general>Elsevier Inc</general><general>Elsevier</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>OTOTI</scope></search><sort><creationdate>20130201</creationdate><title>Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations</title><author>Seal, Sudip K. ; Perumalla, Kalyan S. ; Hirshman, Steven P.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c357t-2ebcc24589536e666eff7589db524a57b845239c8bb4513327743d20e8cdab4c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Applied sciences</topic><topic>Block tridiagonal matrix</topic><topic>Computer science; control theory; systems</topic><topic>Computer systems and distributed systems. 
User interface</topic><topic>Cyclic reduction</topic><topic>Exact sciences and technology</topic><topic>Parallel solver</topic><topic>Prefix computation</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Seal, Sudip K.</creatorcontrib><creatorcontrib>Perumalla, Kalyan S.</creatorcontrib><creatorcontrib>Hirshman, Steven P.</creatorcontrib><creatorcontrib>Center for Computational Sciences</creatorcontrib><creatorcontrib>Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>OSTI.GOV</collection><jtitle>Journal of parallel and distributed computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Seal, Sudip K.</au><au>Perumalla, Kalyan S.</au><au>Hirshman, Steven P.</au><aucorp>Center for Computational Sciences</aucorp><aucorp>Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations</atitle><jtitle>Journal of parallel and distributed computing</jtitle><date>2013-02-01</date><risdate>2013</risdate><volume>73</volume><issue>2</issue><spage>273</spage><epage>280</epage><pages>273-280</pages><issn>0743-7315</issn><eissn>1096-0848</eissn><abstract>Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance compared to those designed for more general systems of equations. This performance advantage is even more pronounced for block tridiagonal systems. 
In this paper, we re-examine the performances of these two algorithms taking the effects of block size into account. Depending on the block size, the parameter space spanned by the number of block rows, size of the blocks and the processor count is shown to favor one or the other of the two algorithms. A critical block size that separates these two regions is shown to emerge and its dependence both on problem dependent parameters and on machine-specific constants is established. Empirical verification of these analytical findings is carried out on up to 2048 cores of a Cray XT4 system. ► Studies the effects of block size on the performance of block tridiagonal solvers. ► Establishes the existence of a critical block size. ► Establishes dependence of critical block size on N, P and machine-dependent constants. ► Studies the effect of block size on the weak and strong scalability. ► Presents empirical results to support analytical findings.</abstract><cop>Amsterdam</cop><pub>Elsevier Inc</pub><doi>10.1016/j.jpdc.2012.10.003</doi><tpages>8</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0743-7315
ispartof	Journal of parallel and distributed computing, 2013-02, Vol.73 (2), p.273-280
issn	0743-7315 1096-0848
language	eng
recordid	cdi_osti_scitechconnect_1060245
source	Elsevier ScienceDirect Journals Complete
subjects	Applied sciences; Block tridiagonal matrix; Computer science, control theory, systems; Computer systems and distributed systems. User interface; Cyclic reduction; Exact sciences and technology; Parallel solver; Prefix computation; Software
title	Revisiting parallel cyclic reduction and parallel prefix-based algorithms for block tridiagonal systems of equations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T14%3A56%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_osti_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Revisiting%20parallel%20cyclic%20reduction%20and%20parallel%20prefix-based%20algorithms%20for%20block%20tridiagonal%20systems%20of%20equations&rft.jtitle=Journal%20of%20parallel%20and%20distributed%20computing&rft.au=Seal,%20Sudip%20K.&rft.aucorp=Center%20for%20Computational%20Sciences&rft.date=2013-02-01&rft.volume=73&rft.issue=2&rft.spage=273&rft.epage=280&rft.pages=273-280&rft.issn=0743-7315&rft.eissn=1096-0848&rft_id=info:doi/10.1016/j.jpdc.2012.10.003&rft_dat=%3Celsevier_osti_%3ES0743731512002535%3C/elsevier_osti_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_els_id=S0743731512002535&rfr_iscdi=true