Optimal semi-oblique tiling

For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine th...

Detailed description

Bibliographic details
Published in: IEEE transactions on parallel and distributed systems, 2003-09, Vol.14 (9), p.944-960
Main authors: Andonov, R., Balev, S., Rajopadhye, S., Yanev, N.
Format: Article
Language: English
Subjects:
container_end_page 960
container_issue 9
container_start_page 944
container_title IEEE transactions on parallel and distributed systems
container_volume 14
creator Andonov, R.
Balev, S.
Rajopadhye, S.
Yanev, N.
description For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a closed-form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our results are based on the BSP model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to similar sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a factor of 2.5 over orthogonal tiling. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three times slower. Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine!
doi_str_mv 10.1109/TPDS.2003.1233716
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_922418324</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1233716</ieee_id><sourcerecordid>28564695</sourcerecordid><originalsourceid>FETCH-LOGICAL-c321t-167b3dfa77e61f280589c4470447265a6570118c7a697a9b7e71f7d042d9e6d53</originalsourceid><addsrcrecordid>eNpdkEtLw0AQxxdRsFY_gPRSPHhLndln9ii1PqBQwXpeNslGtuRRs-nBb--GBAQPwwzMbx7_PyG3CCtE0A_796ePFQVgK6SMKZRnZIZCpAnFlJ3HGrhINEV9Sa5COAAgF8BnZLE79r621TK42idtVvnvk1v2vvLN1zW5KG0V3M2U5-TzebNfvybb3cvb-nGb5Ixin6BUGStKq5STWNIURKpzzhXEoFJYKRQgprmyUiurM-UUlqoATgvtZCHYnNyPe49dG6-H3tQ-5K6qbOPaUzA0FZJLPYB3_8BDe-qa-JvRlPKolPII4QjlXRtC50pz7KLC7scgmMErM3hlBq_M5FWcWYwz3jn3x0_dX2fqYdE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>922418324</pqid></control><display><type>article</type><title>Optimal semi-oblique tiling</title><source>IEEE Electronic Library (IEL)</source><creator>Andonov, R. ; Balev, S. ; Rajopadhye, S. ; Yanev, N.</creator><creatorcontrib>Andonov, R. ; Balev, S. ; Rajopadhye, S. ; Yanev, N.</creatorcontrib><description>For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a closed form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our results are based on the BSP model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to similar sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a factor of 2.5 over orthogonal tiling. 
Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three times slower. Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine!.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2003.1233716</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Biological information theory ; Biological system modeling ; Closed-form solution ; Computer Society ; Concurrent computing ; K-band ; Parallel machines ; Programming profession ; Studies</subject><ispartof>IEEE transactions on parallel and distributed systems, 2003-09, Vol.14 (9), p.944-960</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2003</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c321t-167b3dfa77e61f280589c4470447265a6570118c7a697a9b7e71f7d042d9e6d53</citedby><cites>FETCH-LOGICAL-c321t-167b3dfa77e61f280589c4470447265a6570118c7a697a9b7e71f7d042d9e6d53</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1233716$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1233716$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Andonov, R.</creatorcontrib><creatorcontrib>Balev, S.</creatorcontrib><creatorcontrib>Rajopadhye, S.</creatorcontrib><creatorcontrib>Yanev, N.</creatorcontrib><title>Optimal semi-oblique tiling</title><title>IEEE transactions on parallel and distributed 
systems</title><addtitle>TPDS</addtitle><description>For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a closed form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our results are based on the BSP model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to similar sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a factor of 2.5 over orthogonal tiling. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three times slower. 
Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine!.</description><subject>Algorithms</subject><subject>Biological information theory</subject><subject>Biological system modeling</subject><subject>Closed-form solution</subject><subject>Computer Society</subject><subject>Concurrent computing</subject><subject>K-band</subject><subject>Parallel machines</subject><subject>Programming profession</subject><subject>Studies</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtLw0AQxxdRsFY_gPRSPHhLndln9ii1PqBQwXpeNslGtuRRs-nBb--GBAQPwwzMbx7_PyG3CCtE0A_796ePFQVgK6SMKZRnZIZCpAnFlJ3HGrhINEV9Sa5COAAgF8BnZLE79r621TK42idtVvnvk1v2vvLN1zW5KG0V3M2U5-TzebNfvybb3cvb-nGb5Ixin6BUGStKq5STWNIURKpzzhXEoFJYKRQgprmyUiurM-UUlqoATgvtZCHYnNyPe49dG6-H3tQ-5K6qbOPaUzA0FZJLPYB3_8BDe-qa-JvRlPKolPII4QjlXRtC50pz7KLC7scgmMErM3hlBq_M5FWcWYwz3jn3x0_dX2fqYdE</recordid><startdate>200309</startdate><enddate>200309</enddate><creator>Andonov, R.</creator><creator>Balev, S.</creator><creator>Rajopadhye, S.</creator><creator>Yanev, N.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>200309</creationdate><title>Optimal semi-oblique tiling</title><author>Andonov, R. ; Balev, S. ; Rajopadhye, S. 
; Yanev, N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c321t-167b3dfa77e61f280589c4470447265a6570118c7a697a9b7e71f7d042d9e6d53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Algorithms</topic><topic>Biological information theory</topic><topic>Biological system modeling</topic><topic>Closed-form solution</topic><topic>Computer Society</topic><topic>Concurrent computing</topic><topic>K-band</topic><topic>Parallel machines</topic><topic>Programming profession</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Andonov, R.</creatorcontrib><creatorcontrib>Balev, S.</creatorcontrib><creatorcontrib>Rajopadhye, S.</creatorcontrib><creatorcontrib>Yanev, N.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Andonov, R.</au><au>Balev, S.</au><au>Rajopadhye, S.</au><au>Yanev, N.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimal semi-oblique tiling</atitle><jtitle>IEEE transactions on parallel and distributed 
systems</jtitle><stitle>TPDS</stitle><date>2003-09</date><risdate>2003</risdate><volume>14</volume><issue>9</issue><spage>944</spage><epage>960</epage><pages>944-960</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a closed form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our results are based on the BSP model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to similar sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a factor of 2.5 over orthogonal tiling. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three times slower. Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine!.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2003.1233716</doi><tpages>17</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1045-9219
ispartof IEEE transactions on parallel and distributed systems, 2003-09, Vol.14 (9), p.944-960
issn 1045-9219
1558-2183
language eng
recordid cdi_proquest_journals_922418324
source IEEE Electronic Library (IEL)
subjects Algorithms
Biological information theory
Biological system modeling
Closed-form solution
Computer Society
Concurrent computing
K-band
Parallel machines
Programming profession
Studies
title Optimal semi-oblique tiling
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T03%3A55%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimal%20semi-oblique%20tiling&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Andonov,%20R.&rft.date=2003-09&rft.volume=14&rft.issue=9&rft.spage=944&rft.epage=960&rft.pages=944-960&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2003.1233716&rft_dat=%3Cproquest_RIE%3E28564695%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=922418324&rft_id=info:pmid/&rft_ieee_id=1233716&rfr_iscdi=true
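
The description field of this record contrasts a block-cyclic distribution of tiles to processors with a plain block distribution. The sketch below is only a rough illustration of that distinction and is not taken from the article: the function names, the one-dimensional tile numbering, and the parameters `num_tiles`, `num_procs`, and `block` are all assumptions made for this example.

```python
def block_owner(tile: int, num_tiles: int, num_procs: int) -> int:
    """Block distribution: each processor gets one contiguous chunk of tiles."""
    chunk = -(-num_tiles // num_procs)  # ceiling division
    return tile // chunk


def block_cyclic_owner(tile: int, block: int, num_procs: int) -> int:
    """Block-cyclic distribution: blocks of `block` tiles are dealt round-robin."""
    return (tile // block) % num_procs


if __name__ == "__main__":
    # Hypothetical sizes chosen only to make the two patterns visible.
    num_tiles, num_procs, block = 12, 3, 2
    print([block_owner(t, num_tiles, num_procs) for t in range(num_tiles)])
    print([block_cyclic_owner(t, block, num_procs) for t in range(num_tiles)])
```

With a block distribution each processor owns a single contiguous run of tiles, whereas the block-cyclic mapping returns to each processor repeatedly, which is the kind of assignment the description says the optimal solution needs.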