Lessons learned at 208K: Towards debugging millions of cores

Bibliographic Details
Main authors: Lee, G.L., Ahn, D.H., Arnold, D.C., de Supinski, B.R., Legendre, M., Miller, B.P., Schulz, M., Liblit, B.
Format: Conference Proceeding
Language: English
Subjects:
Online access: Request full text
container_start_page 1
container_end_page 9
container_title 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
creator Lee, G.L.
Ahn, D.H.
Arnold, D.C.
de Supinski, B.R.
Legendre, M.
Miller, B.P.
Schulz, M.
Liblit, B.
description Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large parallel application - already, debugging the full BlueGene/L (BG/L) installation at the Lawrence Livermore National Laboratory requires employing 1664 tool daemons. To reach such sizes and beyond, tools must use a scalable communication infrastructure and manage their own tool processes efficiently. Some system resources, such as the file system, may also become tool bottlenecks. In this paper, we present challenges to petascale tool development, using the stack trace analysis tool (STAT) as a case study. STAT is a lightweight tool that gathers and merges stack traces from a parallel application to identify process equivalence classes. We use results gathered at thousands of tasks on an Infiniband cluster and results up to 208K processes on BG/L to identify current scalability issues as well as challenges that will be faced at the petascale. We then present implemented solutions to these challenges and show the resulting performance improvements. We also discuss future plans to meet the debugging demands of petascale machines.
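
The description above explains STAT's core operation: it collects one stack trace per task and merges them so that tasks with identical call paths fall into the same equivalence class. The following minimal Python sketch only illustrates that merging idea; it is not STAT's actual MRNet-based implementation, and the data layout and function names (merge_traces, equivalence_classes) are invented for this example.

# Hypothetical sketch: merge per-task stack traces into a prefix tree and
# group tasks (MPI ranks) whose traces are identical. Not STAT's real code.

from collections import defaultdict

def merge_traces(traces):
    """traces maps task rank -> list of call frames, outermost first.
    Returns a nested dict: frame -> (set of ranks reaching it, children)."""
    root = {}
    for rank, frames in traces.items():
        node = root
        for frame in frames:
            ranks, children = node.setdefault(frame, (set(), {}))
            ranks.add(rank)
            node = children
    return root

def equivalence_classes(traces):
    """Group task ranks that share an identical stack trace."""
    classes = defaultdict(set)
    for rank, frames in traces.items():
        classes[tuple(frames)].add(rank)
    return classes

if __name__ == "__main__":
    # Toy input: three tasks, two of which are blocked in the same barrier.
    traces = {
        0: ["main", "MPI_Barrier", "poll"],
        1: ["main", "MPI_Barrier", "poll"],
        2: ["main", "compute", "dgemm"],
    }
    tree = merge_traces(traces)
    print(sorted(tree["main"][0]))  # all ranks reached main -> [0, 1, 2]
    for trace, ranks in equivalence_classes(traces).items():
        print(sorted(ranks), "->", " > ".join(trace))

In the scenario the abstract describes, this merge would run hierarchically over a tree of tool daemons connected by a scalable communication infrastructure rather than in a single process, which is what allows the analysis to reach hundreds of thousands of tasks.
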
doi_str_mv 10.1109/SC.2008.5218557
format Conference Proceeding
identifier ISSN: 2167-4329
ispartof 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, 2008, p.1-9
issn 2167-4329
2167-4337
language eng
recordid cdi_ieee_primary_5218557
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Algorithm design and analysis
Application software
Data analysis
Data structures
Debugging
File systems
Laboratories
Large-scale systems
Scalability
System software
title Lessons learned at 208K: Towards debugging millions of cores
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T04%3A37%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Lessons%20learned%20at%20208K:%20Towards%20debugging%20millions%20of%20cores&rft.btitle=2008%20SC%20-%20International%20Conference%20for%20High%20Performance%20Computing,%20Networking,%20Storage%20and%20Analysis&rft.au=Lee,%20G.L.&rft.date=2008-11&rft.spage=1&rft.epage=9&rft.pages=1-9&rft.issn=2167-4329&rft.eissn=2167-4337&rft.isbn=9781424428342&rft.isbn_list=1424428343&rft_id=info:doi/10.1109/SC.2008.5218557&rft_dat=%3Cieee_6IE%3E5218557%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424428351&rft.eisbn_list=9781424428359&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5218557&rfr_iscdi=true