Optimal semi-oblique tiling
For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine th...
Published in: | IEEE transactions on parallel and distributed systems 2003-09, Vol.14 (9), p.944-960 |
---|---|
Main authors: | Andonov, R. ; Balev, S. ; Rajopadhye, S. ; Yanev, N. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 960 |
---|---|
container_issue | 9 |
container_start_page | 944 |
container_title | IEEE transactions on parallel and distributed systems |
container_volume | 14 |
creator | Andonov, R. ; Balev, S. ; Rajopadhye, S. ; Yanev, N. |
description | For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a closed-form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our results are based on the BSP model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to similar sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a factor of 2.5 over orthogonal tiling. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three times slower. Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine! |
doi_str_mv | 10.1109/TPDS.2003.1233716 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1045-9219 |
ispartof | IEEE transactions on parallel and distributed systems, 2003-09, Vol.14 (9), p.944-960 |
issn | ISSN 1045-9219 ; EISSN 1558-2183 |
language | eng |
recordid | cdi_proquest_journals_922418324 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Biological information theory ; Biological system modeling ; Closed-form solution ; Computer Society ; Concurrent computing ; K-band ; Parallel machines ; Programming profession ; Studies |
title | Optimal semi-oblique tiling |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T03%3A55%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimal%20semi-oblique%20tiling&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Andonov,%20R.&rft.date=2003-09&rft.volume=14&rft.issue=9&rft.spage=944&rft.epage=960&rft.pages=944-960&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2003.1233716&rft_dat=%3Cproquest_RIE%3E28564695%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=922418324&rft_id=info:pmid/&rft_ieee_id=1233716&rfr_iscdi=true |