Lessons learned at 208K: Towards debugging millions of cores
Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself...
Saved in:
Main Authors: | Lee, G.L.; Ahn, D.H.; Arnold, D.C.; de Supinski, B.R.; Legendre, M.; Miller, B.P.; Schulz, M.; Liblit, B. |
---|---|
Format: | Conference Proceeding |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
container_end_page | 9 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis |
container_volume | |
creator | Lee, G.L. Ahn, D.H. Arnold, D.C. de Supinski, B.R. Legendre, M. Miller, B.P. Schulz, M. Liblit, B. |
description | Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large parallel application - already, debugging the full Blue-Gene/L (BG/L) installation at the Lawrence Livermore National Laboratory requires employing 1664 tool daemons. To reach such sizes and beyond, tools must use a scalable communication infrastructure and manage their own tool processes efficiently. Some system resources, such as the file system, may also become tool bottlenecks. In this paper, we present challenges to petascale tool development, using the stack trace analysis tool (STAT) as a case study. STAT is a lightweight tool that gathers and merges stack traces from a parallel application to identify process equivalence classes. We use results gathered at thousands of tasks on an Infiniband cluster and results up to 208 K processes on BG/L to identify current scalability issues as well as challenges that will be faced at the petascale. We then present implemented solutions to these challenges and show the resulting performance improvements. We also discuss future plans to meet the debugging demands of petascale machines. |
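The core idea the abstract describes — gathering per-process stack traces and merging them to identify process equivalence classes — can be illustrated with a minimal sketch. This is not STAT's actual implementation or data model; the function names, the tuple-of-frame-names trace representation, and the example traces are all hypothetical, chosen only to show how identical call stacks group thousands of tasks into a few classes.

```python
from collections import defaultdict


def merge_traces(traces):
    """Group task ranks whose call stacks are identical.

    `traces` maps a task rank to its stack trace, represented here as a
    tuple of frame names, outermost frame first (a hypothetical model,
    not STAT's). Returns one set of ranks per distinct trace.
    """
    classes = defaultdict(set)
    for rank, trace in traces.items():
        classes[trace].add(rank)
    return dict(classes)


def build_prefix_tree(traces):
    """Merge all traces into a single call-prefix tree.

    Each node records which ranks pass through that frame, so a tool
    can display one compact tree instead of N separate stacks.
    """
    root = {}
    for rank, trace in traces.items():
        node = root
        for frame in trace:
            child = node.setdefault(frame, {"ranks": set(), "children": {}})
            child["ranks"].add(rank)
            node = child["children"]
    return root


# Example: ranks 0 and 2 are blocked in MPI_Recv, rank 1 is still computing.
traces = {
    0: ("main", "solver", "MPI_Recv"),
    1: ("main", "solver", "compute"),
    2: ("main", "solver", "MPI_Recv"),
}
classes = merge_traces(traces)   # two equivalence classes: {0, 2} and {1}
tree = build_prefix_tree(traces)
```

At 208K processes the paper's point is that even this cheap per-trace work must be distributed: the merge happens hierarchically over a tree of tool daemons rather than at a single front end, so no one node ever sees all traces at once.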
doi_str_mv | 10.1109/SC.2008.5218557 |
format | Conference Proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2167-4329; EISSN: 2167-4337; ISBN: 9781424428342; ISBN: 1424428343; EISBN: 1424428351; EISBN: 9781424428359 |
ispartof | 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, 2008, p.1-9 |
issn | 2167-4329 2167-4337 |
language | eng |
recordid | cdi_ieee_primary_5218557 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis; Application software; Data analysis; Data structures; Debugging; File systems; Laboratories; Large-scale systems; Scalability; System software |
title | Lessons learned at 208K: Towards debugging millions of cores |