Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison
Precise event sampling is a profiling feature in commodity processors that can sample hardware events and accurately locate the instructions that trigger the events. This feature has been used in a large number of tools to detect application performance issues. Although precise event sampling is rea...
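Published in: | IEEE transactions on parallel and distributed systems 2023-05, Vol.34 (5), p.1594-1608 |
---|---|
Main authors: | Sasongko, Muhammad Aditya; Chabbi, Milind; Kelly, Paul H J; Unat, Didem |
Format: | Article |
Language: | eng |
Subjects: | Computer architecture; Hardware; Qualitative analysis; Quantitative analysis; Sampling; Stability |
Online access: | Full text |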
container_end_page | 1608 |
---|---|
container_issue | 5 |
container_start_page | 1594 |
container_title | IEEE transactions on parallel and distributed systems |
container_volume | 34 |
creator | Sasongko, Muhammad Aditya; Chabbi, Milind; Kelly, Paul H J; Unat, Didem |
description | Precise event sampling is a profiling feature in commodity processors that can sample hardware events and accurately locate the instructions that trigger the events. This feature has been used in a large number of tools to detect application performance issues. Although precise event sampling is readily supported in modern multicore architectures, vendor implementations differ greatly in ways that affect their accuracy, stability, overhead, and functionality. This work presents the most comprehensive study to date on benchmarking the event sampling features of Intel PEBS and AMD IBS and performs an in-depth analysis of key differences through a series of microbenchmarks. Our qualitative and quantitative analysis shows that PEBS allows finer-grained and more accurate sampling of hardware events, while IBS offers a richer set of information at each sample, though it suffers from lower accuracy and stability. Moreover, OS signal delivery, a common method used by profiling software, adds significant time overhead on top of the overhead incurred by the hardware mechanisms in both PEBS and IBS. We also found that both PEBS and IBS are biased when sampling events across multiple locations in a program. Lastly, we demonstrate how our findings on microbenchmarks under different thread counts hold for a full-fledged profiling tool that runs on state-of-the-art Intel and AMD machines. Overall, our detailed comparisons serve as a reference and provide valuable information for hardware designers and profiling tool developers. |
doi_str_mv | 10.1109/TPDS.2023.3257105 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2790132400</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2790132400</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-12f47bcced4b2fda7ba8be806af7bb5ae61932fcd9a89959e34b4e2b5432f643</originalsourceid><addsrcrecordid>eNo1kF9LwzAUxYMoOKcfwLeAz5352za-jW3qYOJkw9eQtLfS0aU1SQd-e1ucT_fcw-Ec-CF0T8mMUqIe99vlbsYI4zPOZEaJvEATKmWeMJrzy0ETIRPFqLpGNyEcCKFCEjFBu62Hog6AVydwEe_MsWtq94Vbh-dvS_wJPvQBr12E5gl_9MbFOppYnwAbV45G8_8v2mNnfB1ad4uuKtMEuDvfKdo_r_aL12Tz_rJezDdJwWkaE8oqkdmigFJYVpUmsya3kJPUVJm10kBKFWdVUSqTKyUVcGEFMCvF4KaCT9HDX23n2-8eQtSHtvduWNQsU4RyJggZUvQvVfg2BA-V7nx9NP5HU6JHdHpEp0d0-oyO_wJXrmIV</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2790132400</pqid></control><display><type>article</type><title>Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison</title><source>IEEE Electronic Library (IEL)</source><creator>Sasongko, Muhammad Aditya ; Chabbi, Milind ; Kelly, Paul H J ; Unat, Didem</creator><creatorcontrib>Sasongko, Muhammad Aditya ; Chabbi, Milind ; Kelly, Paul H J ; Unat, Didem</creatorcontrib><description>Precise event sampling is a profiling feature in commodity processors that can sample hardware events and accurately locate the instructions that trigger the events. This feature has been used in a large number of tools to detect application performance issues. Although precise event sampling is readily supported in modern multicore architectures, vendor supports exhibit great differences that affect their accuracy, stability, overhead, and functionality. This work presents the most comprehensive study to date on benchmarking the event sampling features of Intel PEBS and AMD IBS and performs in-depth analysis on key differences through series of microbenchmarks. 
Our qualitative and quantitative analysis shows that PEBS allows finer-grained and more accurate sampling of hardware events, while IBS offers richer set of information at each sample though it suffers from lower accuracy and stability. Moreover, OS signal delivery, which is a common method used by the profiling software, introduces significant time overhead to the original overhead incurred by the hardware mechanisms in both PEBS and IBS. We also found that both PEBS and IBS have bias in sampling events across multiple different locations in a code. Lastly, we demonstrate how our findings on microbenchmarks under different thread counts hold for a full-fledged profiling tool that runs on the state-of-the-art Intel and AMD machines. Overall our detailed comparisons serve as a great reference and provide invaluable information for hardware designers and profiling tool developers.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2023.3257105</identifier><language>eng</language><publisher>New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</publisher><subject>Computer architecture ; Hardware ; Qualitative analysis ; Quantitative analysis ; Sampling ; Stability</subject><ispartof>IEEE transactions on parallel and distributed systems, 2023-05, Vol.34 (5), p.1594-1608</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-12f47bcced4b2fda7ba8be806af7bb5ae61932fcd9a89959e34b4e2b5432f643</citedby><cites>FETCH-LOGICAL-c316t-12f47bcced4b2fda7ba8be806af7bb5ae61932fcd9a89959e34b4e2b5432f643</cites><orcidid>0000-0002-2351-0770 ; 0000-0002-6166-4252 ; 0000-0001-5905-1804</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids></links><search><creatorcontrib>Sasongko, Muhammad Aditya</creatorcontrib><creatorcontrib>Chabbi, Milind</creatorcontrib><creatorcontrib>Kelly, Paul H J</creatorcontrib><creatorcontrib>Unat, Didem</creatorcontrib><title>Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison</title><title>IEEE transactions on parallel and distributed systems</title><description>Precise event sampling is a profiling feature in commodity processors that can sample hardware events and accurately locate the instructions that trigger the events. This feature has been used in a large number of tools to detect application performance issues. Although precise event sampling is readily supported in modern multicore architectures, vendor supports exhibit great differences that affect their accuracy, stability, overhead, and functionality. This work presents the most comprehensive study to date on benchmarking the event sampling features of Intel PEBS and AMD IBS and performs in-depth analysis on key differences through series of microbenchmarks. Our qualitative and quantitative analysis shows that PEBS allows finer-grained and more accurate sampling of hardware events, while IBS offers richer set of information at each sample though it suffers from lower accuracy and stability. 
Moreover, OS signal delivery, which is a common method used by the profiling software, introduces significant time overhead to the original overhead incurred by the hardware mechanisms in both PEBS and IBS. We also found that both PEBS and IBS have bias in sampling events across multiple different locations in a code. Lastly, we demonstrate how our findings on microbenchmarks under different thread counts hold for a full-fledged profiling tool that runs on the state-of-the-art Intel and AMD machines. Overall our detailed comparisons serve as a great reference and provide invaluable information for hardware designers and profiling tool developers.</description><subject>Computer architecture</subject><subject>Hardware</subject><subject>Qualitative analysis</subject><subject>Quantitative analysis</subject><subject>Sampling</subject><subject>Stability</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNo1kF9LwzAUxYMoOKcfwLeAz5352za-jW3qYOJkw9eQtLfS0aU1SQd-e1ucT_fcw-Ec-CF0T8mMUqIe99vlbsYI4zPOZEaJvEATKmWeMJrzy0ETIRPFqLpGNyEcCKFCEjFBu62Hog6AVydwEe_MsWtq94Vbh-dvS_wJPvQBr12E5gl_9MbFOppYnwAbV45G8_8v2mNnfB1ad4uuKtMEuDvfKdo_r_aL12Tz_rJezDdJwWkaE8oqkdmigFJYVpUmsya3kJPUVJm10kBKFWdVUSqTKyUVcGEFMCvF4KaCT9HDX23n2-8eQtSHtvduWNQsU4RyJggZUvQvVfg2BA-V7nx9NP5HU6JHdHpEp0d0-oyO_wJXrmIV</recordid><startdate>20230501</startdate><enddate>20230501</enddate><creator>Sasongko, Muhammad Aditya</creator><creator>Chabbi, Milind</creator><creator>Kelly, Paul H J</creator><creator>Unat, Didem</creator><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2351-0770</orcidid><orcidid>https://orcid.org/0000-0002-6166-4252</orcidid><orcidid>https://orcid.org/0000-0001-5905-1804</orcidid></search><sort><creationdate>20230501</creationdate><title>Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison</title><author>Sasongko, Muhammad Aditya ; Chabbi, Milind ; Kelly, Paul H J ; Unat, Didem</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-12f47bcced4b2fda7ba8be806af7bb5ae61932fcd9a89959e34b4e2b5432f643</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer architecture</topic><topic>Hardware</topic><topic>Qualitative analysis</topic><topic>Quantitative analysis</topic><topic>Sampling</topic><topic>Stability</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sasongko, Muhammad Aditya</creatorcontrib><creatorcontrib>Chabbi, Milind</creatorcontrib><creatorcontrib>Kelly, Paul H J</creatorcontrib><creatorcontrib>Unat, Didem</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sasongko, Muhammad Aditya</au><au>Chabbi, Milind</au><au>Kelly, Paul H J</au><au>Unat, Didem</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison</atitle><jtitle>IEEE transactions on parallel and distributed systems</jtitle><date>2023-05-01</date><risdate>2023</risdate><volume>34</volume><issue>5</issue><spage>1594</spage><epage>1608</epage><pages>1594-1608</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><abstract>Precise event sampling is a profiling feature in commodity processors that can sample hardware events and accurately locate the instructions that trigger the events. This feature has been used in a large number of tools to detect application performance issues. Although precise event sampling is readily supported in modern multicore architectures, vendor supports exhibit great differences that affect their accuracy, stability, overhead, and functionality. This work presents the most comprehensive study to date on benchmarking the event sampling features of Intel PEBS and AMD IBS and performs in-depth analysis on key differences through series of microbenchmarks. Our qualitative and quantitative analysis shows that PEBS allows finer-grained and more accurate sampling of hardware events, while IBS offers richer set of information at each sample though it suffers from lower accuracy and stability. Moreover, OS signal delivery, which is a common method used by the profiling software, introduces significant time overhead to the original overhead incurred by the hardware mechanisms in both PEBS and IBS. We also found that both PEBS and IBS have bias in sampling events across multiple different locations in a code. 
Lastly, we demonstrate how our findings on microbenchmarks under different thread counts hold for a full-fledged profiling tool that runs on the state-of-the-art Intel and AMD machines. Overall our detailed comparisons serve as a great reference and provide invaluable information for hardware designers and profiling tool developers.</abstract><cop>New York</cop><pub>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</pub><doi>10.1109/TPDS.2023.3257105</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0002-2351-0770</orcidid><orcidid>https://orcid.org/0000-0002-6166-4252</orcidid><orcidid>https://orcid.org/0000-0001-5905-1804</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1045-9219 |
ispartof | IEEE transactions on parallel and distributed systems, 2023-05, Vol.34 (5), p.1594-1608 |
issn | 1045-9219; 1558-2183 |
language | eng |
recordid | cdi_proquest_journals_2790132400 |
source | IEEE Electronic Library (IEL) |
subjects | Computer architecture; Hardware; Qualitative analysis; Quantitative analysis; Sampling; Stability |
title | Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T14%3A44%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Precise%20Event%20Sampling%20on%20AMD%20Versus%20Intel:%20Quantitative%20and%20Qualitative%20Comparison&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Sasongko,%20Muhammad%20Aditya&rft.date=2023-05-01&rft.volume=34&rft.issue=5&rft.spage=1594&rft.epage=1608&rft.pages=1594-1608&rft.issn=1045-9219&rft.eissn=1558-2183&rft_id=info:doi/10.1109/TPDS.2023.3257105&rft_dat=%3Cproquest_cross%3E2790132400%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2790132400&rft_id=info:pmid/&rfr_iscdi=true |