Analysis of system overhead on parallel computers

Ever-increasing demand for computing capability is driving the construction of ever-larger computer clusters, typically comprising commodity compute nodes, ranging in size up to thousands of processors, with each node hosting an instance of the operating system (OS). Recent studies [E. Hendriks (200...

Bibliographic details
Main authors: Gioiosa, R., Petrini, F., Davis, K., Lebaillif-Delamare, F.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 390
container_issue
container_start_page 387
container_title
container_volume
creator Gioiosa, R.
Petrini, F.
Davis, K.
Lebaillif-Delamare, F.
description Ever-increasing demand for computing capability is driving the construction of ever-larger computer clusters, typically comprising commodity compute nodes, ranging in size up to thousands of processors, with each node hosting an instance of the operating system (OS). Recent studies [E. Hendriks (2002), F. Petrini et al. (2003)] have shown that even minimal intrusion by the OS on user applications, e.g. a slowdown of user processes of less than 1.0% on each OS instance, can result in a dramatic performance degradation (50% or more) when the user applications are executed on thousands of processors. The contribution of this paper is the explication and demonstration, by way of a case study, of a methodology for analyzing and evaluating the impact of system activity (all software and hardware other than user applications) on application performance. Our methodology has three major components: 1) a set of simple benchmarks to quickly measure and identify the impact of intrusive system events; 2) a kernel-level profiling tool, OProfile, to characterize all relevant events and their sources; and 3) a kernel module that provides timing information for in-depth modeling of the frequency and duration of each relevant event and determines which sources have the greatest impact on performance (and are therefore the most important to eliminate). The paper provides a collection of experimental results obtained on a state-of-the-art dual AMD Opteron cluster running GNU/Linux 2.6.5. While our work has been performed on this specific OS, we argue that our contribution readily generalizes to other open source and commercial operating systems.
doi_str_mv 10.1109/ISSPIT.2004.1433800
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1433800</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1433800</ieee_id><sourcerecordid>1433800</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-d03d47554dcd8e5eabf0c9e8c19b307e67950e4078349c4835261a3880ead1b33</originalsourceid><addsrcrecordid>eNotj8tqwzAURAWlkDbNF2SjH7B75StZ0jKEPgyBFpKugyxfExc5NpJb8N_X0MxmdmfOMLYVkAsB9rk6Hj-rU14AyFxIRANwxx5BG0BTGlus2Calb1iCtlSID0zsri7MqUt8aHma00Q9H34pXsg1fLjy0UUXAgXuh378mSimJ3bfupBoc-s1-3p9Oe3fs8PHW7XfHbJOaDVlDWAjtVKy8Y0hRa5uwVsyXtgaQVOprQKSixtK66VBVZTCoTGwLIsacc22_9yOiM5j7HoX5_PtFf4B23FCpg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Analysis of system overhead on parallel computers</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Gioiosa, R. ; Petrini, F. ; Davis, K. ; Lebaillif-Delamare, F.</creator><creatorcontrib>Gioiosa, R. ; Petrini, F. ; Davis, K. ; Lebaillif-Delamare, F.</creatorcontrib><description>Ever-increasing demand for computing capability is driving the construction of ever-larger computer clusters, typically comprising commodity compute nodes, ranging in size up to thousands of processors, with each node hosting an instance of the operating system (OS). Recent studies [E. Hendriks (2002), F. Petrini et al. (2003)] have shown that even minimal intrusion by the OS on user applications, e.g. a slowdown of user processes of less than 1.0% on each OS instance, can result in a dramatic performance degradation - 50% or more - when the user applications are executed on thousands of processors. 
The contribution of this paper is the explication and demonstration by way of a case study, of a methodology for analyzing and evaluating the impact of the system (all software and hardware other than user applications) activity on application performance. Our methodology has three major components: 1) a set of simple benchmarks to quickly measure and identify the impact of intrusive system events; 2) a kernel-level profiling tool Oprofile to characterize all relevant events and their sources; and, 3) a kernel module that provides timing information for in-depth modeling of the frequency and duration of each relevant event and determines which sources have the greatest impact on performance (and are therefore the most important to eliminate). The paper provides a collection of experimental results conducted on a state-of-the-art dual AMD Opteron cluster running GNU/Linux 2.6.5. While our work has been performed on this specific OS, we argue that our contribution readily generalizes to other open source and commercial operating systems.</description><identifier>ISBN: 0780386892</identifier><identifier>ISBN: 9780780386891</identifier><identifier>DOI: 10.1109/ISSPIT.2004.1433800</identifier><language>eng</language><publisher>IEEE</publisher><subject>Application software ; Concurrent computing ; Degradation ; Frequency measurement ; Hardware ; Kernel ; Operating systems ; Performance analysis ; Software performance ; Software systems</subject><ispartof>Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004, 2004, 
p.387-390</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1433800$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1433800$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Gioiosa, R.</creatorcontrib><creatorcontrib>Petrini, F.</creatorcontrib><creatorcontrib>Davis, K.</creatorcontrib><creatorcontrib>Lebaillif-Delamare, F.</creatorcontrib><title>Analysis of system overhead on parallel computers</title><title>Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004</title><addtitle>ISSPIT</addtitle><description>Ever-increasing demand for computing capability is driving the construction of ever-larger computer clusters, typically comprising commodity compute nodes, ranging in size up to thousands of processors, with each node hosting an instance of the operating system (OS). Recent studies [E. Hendriks (2002), F. Petrini et al. (2003)] have shown that even minimal intrusion by the OS on user applications, e.g. a slowdown of user processes of less than 1.0% on each OS instance, can result in a dramatic performance degradation - 50% or more - when the user applications are executed on thousands of processors. The contribution of this paper is the explication and demonstration by way of a case study, of a methodology for analyzing and evaluating the impact of the system (all software and hardware other than user applications) activity on application performance. 
Our methodology has three major components: 1) a set of simple benchmarks to quickly measure and identify the impact of intrusive system events; 2) a kernel-level profiling tool Oprofile to characterize all relevant events and their sources; and, 3) a kernel module that provides timing information for in-depth modeling of the frequency and duration of each relevant event and determines which sources have the greatest impact on performance (and are therefore the most important to eliminate). The paper provides a collection of experimental results conducted on a state-of-the-art dual AMD Opteron cluster running GNU/Linux 2.6.5. While our work has been performed on this specific OS, we argue that our contribution readily generalizes to other open source and commercial operating systems.</description><subject>Application software</subject><subject>Concurrent computing</subject><subject>Degradation</subject><subject>Frequency measurement</subject><subject>Hardware</subject><subject>Kernel</subject><subject>Operating systems</subject><subject>Performance analysis</subject><subject>Software performance</subject><subject>Software systems</subject><isbn>0780386892</isbn><isbn>9780780386891</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotj8tqwzAURAWlkDbNF2SjH7B75StZ0jKEPgyBFpKugyxfExc5NpJb8N_X0MxmdmfOMLYVkAsB9rk6Hj-rU14AyFxIRANwxx5BG0BTGlus2Calb1iCtlSID0zsri7MqUt8aHma00Q9H34pXsg1fLjy0UUXAgXuh378mSimJ3bfupBoc-s1-3p9Oe3fs8PHW7XfHbJOaDVlDWAjtVKy8Y0hRa5uwVsyXtgaQVOprQKSixtK66VBVZTCoTGwLIsacc22_9yOiM5j7HoX5_PtFf4B23FCpg</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Gioiosa, R.</creator><creator>Petrini, F.</creator><creator>Davis, K.</creator><creator>Lebaillif-Delamare, 
F.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>2004</creationdate><title>Analysis of system overhead on parallel computers</title><author>Gioiosa, R. ; Petrini, F. ; Davis, K. ; Lebaillif-Delamare, F.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-d03d47554dcd8e5eabf0c9e8c19b307e67950e4078349c4835261a3880ead1b33</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Application software</topic><topic>Concurrent computing</topic><topic>Degradation</topic><topic>Frequency measurement</topic><topic>Hardware</topic><topic>Kernel</topic><topic>Operating systems</topic><topic>Performance analysis</topic><topic>Software performance</topic><topic>Software systems</topic><toplevel>online_resources</toplevel><creatorcontrib>Gioiosa, R.</creatorcontrib><creatorcontrib>Petrini, F.</creatorcontrib><creatorcontrib>Davis, K.</creatorcontrib><creatorcontrib>Lebaillif-Delamare, F.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gioiosa, R.</au><au>Petrini, F.</au><au>Davis, K.</au><au>Lebaillif-Delamare, F.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Analysis of system overhead on parallel computers</atitle><btitle>Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 
2004</btitle><stitle>ISSPIT</stitle><date>2004</date><risdate>2004</risdate><spage>387</spage><epage>390</epage><pages>387-390</pages><isbn>0780386892</isbn><isbn>9780780386891</isbn><abstract>Ever-increasing demand for computing capability is driving the construction of ever-larger computer clusters, typically comprising commodity compute nodes, ranging in size up to thousands of processors, with each node hosting an instance of the operating system (OS). Recent studies [E. Hendriks (2002), F. Petrini et al. (2003)] have shown that even minimal intrusion by the OS on user applications, e.g. a slowdown of user processes of less than 1.0% on each OS instance, can result in a dramatic performance degradation - 50% or more - when the user applications are executed on thousands of processors. The contribution of this paper is the explication and demonstration by way of a case study, of a methodology for analyzing and evaluating the impact of the system (all software and hardware other than user applications) activity on application performance. Our methodology has three major components: 1) a set of simple benchmarks to quickly measure and identify the impact of intrusive system events; 2) a kernel-level profiling tool Oprofile to characterize all relevant events and their sources; and, 3) a kernel module that provides timing information for in-depth modeling of the frequency and duration of each relevant event and determines which sources have the greatest impact on performance (and are therefore the most important to eliminate). The paper provides a collection of experimental results conducted on a state-of-the-art dual AMD Opteron cluster running GNU/Linux 2.6.5. While our work has been performed on this specific OS, we argue that our contribution readily generalizes to other open source and commercial operating systems.</abstract><pub>IEEE</pub><doi>10.1109/ISSPIT.2004.1433800</doi><tpages>4</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 0780386892
ispartof Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004, 2004, p.387-390
issn
language eng
recordid cdi_ieee_primary_1433800
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Application software
Concurrent computing
Degradation
Frequency measurement
Hardware
Kernel
Operating systems
Performance analysis
Software performance
Software systems
title Analysis of system overhead on parallel computers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T03%3A21%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Analysis%20of%20system%20overhead%20on%20parallel%20computers&rft.btitle=Proceedings%20of%20the%20Fourth%20IEEE%20International%20Symposium%20on%20Signal%20Processing%20and%20Information%20Technology,%202004&rft.au=Gioiosa,%20R.&rft.date=2004&rft.spage=387&rft.epage=390&rft.pages=387-390&rft.isbn=0780386892&rft.isbn_list=9780780386891&rft_id=info:doi/10.1109/ISSPIT.2004.1433800&rft_dat=%3Cieee_6IE%3E1433800%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1433800&rfr_iscdi=true
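
The abstract's claim that a per-node slowdown below 1% can grow into a large slowdown on thousands of processors can be illustrated with a small back-of-the-envelope model. The sketch below is not the paper's methodology and does not use its measurements. It assumes a bulk-synchronous application in which every process computes for a fixed granularity and then synchronizes, and in which each process is independently delayed by one system event of fixed duration with a small probability per step. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
# Toy model of system-overhead amplification (illustrative assumptions only).
# Assumption: all processes synchronize after every compute step, so a step
# takes as long as its slowest process.


def per_node_overhead(granularity_s, noise_duration_s, hit_prob):
    """Average fractional slowdown seen by a single, isolated process."""
    return hit_prob * noise_duration_s / granularity_s


def expected_slowdown(n_procs, granularity_s, noise_duration_s, hit_prob):
    """Expected step time relative to a noise-free step for n_procs processes.

    A step is delayed by noise_duration_s if at least one process is hit,
    which happens with probability 1 - (1 - hit_prob) ** n_procs.
    """
    p_any = 1.0 - (1.0 - hit_prob) ** n_procs
    return (granularity_s + noise_duration_s * p_any) / granularity_s


if __name__ == "__main__":
    g, d, p = 1e-3, 5e-3, 0.001  # hypothetical: 1 ms steps, 5 ms events
    print(f"per-node overhead: {per_node_overhead(g, d, p):.2%}")
    for n in (1, 64, 1024, 4096):
        print(f"{n:5d} processes: slowdown x{expected_slowdown(n, g, d, p):.2f}")
```

Under these assumptions the single-process overhead is hit_prob * noise_duration_s / granularity_s, while the slowdown at scale approaches 1 + noise_duration_s / granularity_s as the number of processes grows, because almost every step then contains at least one delayed process.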