A vector-parallel FFT with a user-specifiable data distribution scheme

We propose a 1-dimensional FFT routine for distributed-memory vector-parallel machines which provides the user with both high performance and flexibility in data distribution. Our routine inputs/outputs data using block cyclic data distribution, and the block sizes for input and output can be specif...
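The block cyclic data distribution named in the abstract can be illustrated with a short sketch. This is not code from the paper: it assumes the standard block-cyclic layout, in which consecutive blocks of `block_size` elements are dealt out to processors in round-robin order. The function names `block_cyclic_owner` and `distribute` and their parameters are hypothetical, chosen for illustration only.

```python
def block_cyclic_owner(i, block_size, num_procs):
    """Map global index i to (processor rank, local index).

    Assumes the standard block-cyclic layout (an assumption, not
    something stated in the record): block k of the global array
    goes to processor k mod num_procs.
    """
    block = i // block_size            # which global block holds i
    rank = block % num_procs           # blocks are dealt round-robin
    # Earlier blocks on the same processor, plus the offset inside the block.
    local = (block // num_procs) * block_size + i % block_size
    return rank, local


def distribute(n, block_size, num_procs):
    """Return, for each processor, the global indices it holds, in local order."""
    layout = [[] for _ in range(num_procs)]
    for i in range(n):
        rank, local = block_cyclic_owner(i, block_size, num_procs)
        # Local indices are filled in increasing order as i increases.
        assert local == len(layout[rank])
        layout[rank].append(i)
    return layout


if __name__ == "__main__":
    # 16 elements on 4 processors with three different block sizes.
    for b in (1, 2, 4):
        print(f"block size {b}: {distribute(16, b, 4)}")
```

With 16 elements on 4 processors, block size 1 gives a purely cyclic layout (processor 0 holds indices 0, 4, 8, 12) and block size 4 gives a purely blocked layout (processor 0 holds indices 0 to 3). Intermediate block sizes interpolate between the two, which is the kind of choice the abstract says the user can make independently for input and output.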

Bibliographic details
Main authors:	Yamamoto, Yusaku; Igai, Mitsuyoshi; Naono, Ken
Format:	Conference proceeding
Language:	eng
Subjects:	Applied sciences; Computer science; control theory; systems; Computer systems and distributed systems. User interface; Exact sciences and technology; Software
Online access:	Full text

container_end_page	374
container_issue
container_start_page	362
container_title
container_volume	2745
creator	Yamamoto, Yusaku ; Igai, Mitsuyoshi ; Naono, Ken
description	We propose a 1-dimensional FFT routine for distributed-memory vector-parallel machines which provides the user with both high performance and flexibility in data distribution. Our routine inputs/outputs data using block cyclic data distribution, and the block sizes for input and output can be specified independently by the user. This flexibility is realized with the same amount of inter-processor communication as the widely used transpose algorithm, and no additional overhead for data redistribution is necessary. We implemented our method on the Hitachi SR2201, a distributed-memory parallel machine with pseudovector processing nodes, and obtained 45% of the peak performance on 16 nodes when the problem size is N = 2^24. This performance was unchanged for a wide range of block sizes from 1 to 16.
doi_str_mv	10.5555/1761566.1761613
format	Conference Proceeding
fullrecord	<record><control><sourceid>proquest_pasca</sourceid><recordid>TN_cdi_pascalfrancis_primary_15618871</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>EBC3071750_42_373</sourcerecordid><originalsourceid>FETCH-LOGICAL-a231t-93d1140c0056bf5f71c6dd2e5cbd652dd6877ef757b0e18e1e68a7a6256cb5af3</originalsourceid><addsrcrecordid>eNqNkc9P6zAMx8ODhxg_zlx7QeLSkcRN0h4RYoA0iQucIzd1tfC6tSQZiP-eTNsf8HyxbH9s2f4ydi34XGW7E0YLpfV857WAI3YOquKQo6b6w2Y5J0qAqjneFyquJMgTNuPAZdmYCk7ZrAElG14bc8auYvzg2aQwjeQztrgvvsilMZQTBhwGGorF4q349mlVYLGNFMo4kfO9x3agosOERedjCr7dJj9uiuhWtKZL9rfHIdLVwV-w98Xj28NzuXx9enm4X5YoQaSygU6IijvOlW571RvhdNdJUq7ttJJdp_OS1BtlWk6iJkG6RoNaKu1ahT1csJv93Amjw6EPuHE-2in4NYYfmz8l6tqIzMGBC-PnlmKy1I7jP0eblK90K5wShWiBG2EUt5W0YCB33e670K3tjo9WcLuTwR5ksAcZMjr_T9S2wVMPv_eRf1M</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>EBC3071750_42_373</pqid></control><display><type>conference_proceeding</type><title>A vector-parallel FFT with a user-specifiable data distribution scheme</title><source>Springer Books</source><creator>Yamamoto, Yusaku ; Igai, Mitsuyoshi ; Naono, Ken</creator><contributor>Yang, Laurence Tianruo ; Guo, Minyi ; Guo, Minyi ; Yang, Laurence Tianruo</contributor><creatorcontrib>Yamamoto, Yusaku ; Igai, Mitsuyoshi ; Naono, Ken ; Yang, Laurence Tianruo ; Guo, Minyi ; Guo, Minyi ; Yang, Laurence Tianruo</creatorcontrib><description>We propose a 1-dimensional FFT routine for distributed-memory vector-parallel machines which provides the user with both high performance and flexibility in data distribution. Our routine inputs/outputs data using block cyclic data distribution, and the block sizes for input and output can be specified independently by the user. This flexibility is realized with the same amount of inter-processor communication as the widely used transpose algorithm and no additional overhead for data redistribution is necessary. 
We implemented our method on the Hitachi SR2201, a distributed-memory parallel machine with pseudovector processing nodes, and obtained 45% of the peak performance on 16 nodes when the problem size is N = 224. This performance was unchanged for a wide range of block sizes from 1 to 16.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 3540405232</identifier><identifier>ISBN: 9783540405238</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 3540376194</identifier><identifier>EISBN: 9783540376194</identifier><identifier>DOI: 10.5555/1761566.1761613</identifier><identifier>OCLC: 935290877</identifier><identifier>LCCallNum: P301-301.5</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer-Verlag</publisher><subject>Applied sciences ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; Exact sciences and technology ; Software</subject><ispartof>Lecture notes in computer science, 2003, Vol.2745, p.362-374</ispartof><rights>2004 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttps://ebookcentral.proquest.com/covers/3071750-l.jpg</thumbnail><link.rule.ids>309,310,779,780,784,789,790,793,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15618871$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Yang, Laurence Tianruo</contributor><contributor>Guo, Minyi</contributor><contributor>Guo, Minyi</contributor><contributor>Yang, Laurence Tianruo</contributor><creatorcontrib>Yamamoto, Yusaku</creatorcontrib><creatorcontrib>Igai, Mitsuyoshi</creatorcontrib><creatorcontrib>Naono, Ken</creatorcontrib><title>A vector-parallel FFT with a user-specifiable data distribution 
scheme</title><title>Lecture notes in computer science</title><description>We propose a 1-dimensional FFT routine for distributed-memory vector-parallel machines which provides the user with both high performance and flexibility in data distribution. Our routine inputs/outputs data using block cyclic data distribution, and the block sizes for input and output can be specified independently by the user. This flexibility is realized with the same amount of inter-processor communication as the widely used transpose algorithm and no additional overhead for data redistribution is necessary. We implemented our method on the Hitachi SR2201, a distributed-memory parallel machine with pseudovector processing nodes, and obtained 45% of the peak performance on 16 nodes when the problem size is N = 224. This performance was unchanged for a wide range of block sizes from 1 to 16.</description><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Computer systems and distributed systems. 
User interface</subject><subject>Exact sciences and technology</subject><subject>Software</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>3540405232</isbn><isbn>9783540405238</isbn><isbn>3540376194</isbn><isbn>9783540376194</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2003</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNqNkc9P6zAMx8ODhxg_zlx7QeLSkcRN0h4RYoA0iQucIzd1tfC6tSQZiP-eTNsf8HyxbH9s2f4ydi34XGW7E0YLpfV857WAI3YOquKQo6b6w2Y5J0qAqjneFyquJMgTNuPAZdmYCk7ZrAElG14bc8auYvzg2aQwjeQztrgvvsilMZQTBhwGGorF4q349mlVYLGNFMo4kfO9x3agosOERedjCr7dJj9uiuhWtKZL9rfHIdLVwV-w98Xj28NzuXx9enm4X5YoQaSygU6IijvOlW571RvhdNdJUq7ttJJdp_OS1BtlWk6iJkG6RoNaKu1ahT1csJv93Amjw6EPuHE-2in4NYYfmz8l6tqIzMGBC-PnlmKy1I7jP0eblK90K5wShWiBG2EUt5W0YCB33e670K3tjo9WcLuTwR5ksAcZMjr_T9S2wVMPv_eRf1M</recordid><startdate>20030101</startdate><enddate>20030101</enddate><creator>Yamamoto, Yusaku</creator><creator>Igai, Mitsuyoshi</creator><creator>Naono, Ken</creator><general>Springer-Verlag</general><general>Springer Berlin / Heidelberg</general><general>Springer</general><scope>FFUUA</scope><scope>IQODW</scope></search><sort><creationdate>20030101</creationdate><title>A vector-parallel FFT with a user-specifiable data distribution scheme</title><author>Yamamoto, Yusaku ; Igai, Mitsuyoshi ; Naono, Ken</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a231t-93d1140c0056bf5f71c6dd2e5cbd652dd6877ef757b0e18e1e68a7a6256cb5af3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Computer systems and distributed systems. 
User interface</topic><topic>Exact sciences and technology</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yamamoto, Yusaku</creatorcontrib><creatorcontrib>Igai, Mitsuyoshi</creatorcontrib><creatorcontrib>Naono, Ken</creatorcontrib><collection>ProQuest Ebook Central - Book Chapters - Demo use only</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yamamoto, Yusaku</au><au>Igai, Mitsuyoshi</au><au>Naono, Ken</au><au>Yang, Laurence Tianruo</au><au>Guo, Minyi</au><au>Guo, Minyi</au><au>Yang, Laurence Tianruo</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A vector-parallel FFT with a user-specifiable data distribution scheme</atitle><btitle>Lecture notes in computer science</btitle><date>2003-01-01</date><risdate>2003</risdate><volume>2745</volume><spage>362</spage><epage>374</epage><pages>362-374</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>3540405232</isbn><isbn>9783540405238</isbn><eisbn>3540376194</eisbn><eisbn>9783540376194</eisbn><abstract>We propose a 1-dimensional FFT routine for distributed-memory vector-parallel machines which provides the user with both high performance and flexibility in data distribution. Our routine inputs/outputs data using block cyclic data distribution, and the block sizes for input and output can be specified independently by the user. This flexibility is realized with the same amount of inter-processor communication as the widely used transpose algorithm and no additional overhead for data redistribution is necessary. We implemented our method on the Hitachi SR2201, a distributed-memory parallel machine with pseudovector processing nodes, and obtained 45% of the peak performance on 16 nodes when the problem size is N = 224. 
This performance was unchanged for a wide range of block sizes from 1 to 16.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer-Verlag</pub><doi>10.5555/1761566.1761613</doi><oclcid>935290877</oclcid><tpages>13</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0302-9743
ispartof	Lecture notes in computer science, 2003, Vol.2745, p.362-374
issn	0302-9743 1611-3349
language	eng
recordid	cdi_pascalfrancis_primary_15618871
source	Springer Books
subjects	Applied sciences ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; Exact sciences and technology ; Software
title	A vector-parallel FFT with a user-specifiable data distribution scheme
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T18%3A47%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20vector-parallel%20FFT%20with%20a%20user-specifiable%20data%20distribution%20scheme&rft.btitle=Lecture%20notes%20in%20computer%20science&rft.au=Yamamoto,%20Yusaku&rft.date=2003-01-01&rft.volume=2745&rft.spage=362&rft.epage=374&rft.pages=362-374&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=3540405232&rft.isbn_list=9783540405238&rft_id=info:doi/10.5555/1761566.1761613&rft_dat=%3Cproquest_pasca%3EEBC3071750_42_373%3C/proquest_pasca%3E%3Curl%3E%3C/url%3E&rft.eisbn=3540376194&rft.eisbn_list=9783540376194&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=EBC3071750_42_373&rft_id=info:pmid/&rfr_iscdi=true