Autotuning PolyBench benchmarks with LLVM Clang/Polly loop optimization pragmas using Bayesian optimization

We develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter search space, and we compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness. We select six of the most complex PolyBench benchmarks and apply the newly developed LLVM Clang/Polly loop optimization pragmas to the benchmarks to optimize them. We then use the autotuning framework to optimize the pragma parameters to improve their performance. The experimental results show that our autotuning approach outperforms the other compiling methods, providing the smallest execution time for the benchmarks syr2k, 3mm, heat-3d, lu, and covariance with two large datasets, using 200 code evaluations to effectively search parameter spaces with up to 170,368 different configurations. We find that the Floyd–Warshall benchmark did not benefit from autotuning; to cope with this issue, we provide compiler-option solutions to improve its performance. We then present loop autotuning without a user's knowledge, using a simple mctree autotuning framework, to further improve the performance of the Floyd–Warshall benchmark. We also extend the ytopt autotuning framework to tune a deep learning application.
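To make the approach concrete, below is a minimal, self-contained sketch of pragma autotuning driven by Bayesian optimization. It is not the paper's ytopt code: scikit-optimize's gp_minimize stands in for ytopt's Bayesian optimizer, a toy matrix-multiply kernel stands in for the PolyBench benchmarks, and the placeholder names #P0/#P1, the tunable value sets, and the compiler flags are illustrative assumptions. The paper's experimental Clang/Polly transformation pragmas (loop tiling, interchange, array packing) use an extended syntax not reproduced here; this sketch uses only mainline Clang loop pragmas.

# A minimal sketch, assuming scikit-optimize is installed and clang is in PATH.
import os
import subprocess
import tempfile
import time

from skopt import gp_minimize
from skopt.space import Categorical

# C source template: #P0 and #P1 are hypothetical placeholders that the
# tuner replaces with concrete pragma parameters before each compilation,
# mimicking ytopt-style text substitution into the benchmark source.
KERNEL_TEMPLATE = r"""
#include <stdio.h>
#define N 512
double A[N][N], B[N][N], C[N][N];
int main(void) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++) { A[i][j] = i + j; B[i][j] = i - j; }
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++) {
      double acc = 0.0;
#pragma clang loop unroll_count(#P0) vectorize_width(#P1)
      for (int k = 0; k < N; k++) acc += A[i][k] * B[k][j];
      C[i][j] = acc;
    }
  printf("%f\n", C[N / 2][N / 2]); /* keep the computation observable */
  return 0;
}
"""

# Hypothetical search space; the paper's real spaces reach up to
# 170,368 configurations per benchmark.
space = [Categorical([2, 4, 8, 16], name="unroll_count"),
         Categorical([2, 4, 8], name="vectorize_width")]

def evaluate(params):
    """Compile one pragma configuration and return its wall-clock runtime."""
    unroll, vec = params
    src = KERNEL_TEMPLATE.replace("#P0", str(unroll)).replace("#P1", str(vec))
    with tempfile.TemporaryDirectory() as tmp:
        c_file = os.path.join(tmp, "kernel.c")
        exe = os.path.join(tmp, "kernel")
        with open(c_file, "w") as f:
            f.write(src)
        # Add "-mllvm", "-polly" here if your clang was built with Polly.
        subprocess.run(["clang", "-O3", c_file, "-o", exe], check=True)
        start = time.perf_counter()
        subprocess.run([exe], check=True, stdout=subprocess.DEVNULL)
        return time.perf_counter() - start

# Gaussian-process surrogate; the paper budgets 200 code evaluations.
result = gp_minimize(evaluate, space, n_calls=20, random_state=0)
print("best (unroll_count, vectorize_width):", result.x)
print("best runtime (s):", result.fun)

Each call to evaluate is one "code evaluation" in the paper's sense; swapping the surrogate model (for example, skopt.forest_minimize for a random-forest surrogate) loosely corresponds to the paper's comparison of supervised learning methods within Bayesian optimization.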

Bibliographic Details
Published in: Concurrency and computation 2022-09, Vol. 34 (20), p. n/a
Main Authors: Wu, Xingfu, Kruse, Michael, Balaprakash, Prasanna, Finkel, Hal, Hovland, Paul, Taylor, Valerie, Hall, Mary
Format: Article
Language: English
Subjects:
Online Access: Full text
container_end_page n/a
container_issue 20
container_start_page
container_title Concurrency and computation
container_volume 34
creator Wu, Xingfu
Kruse, Michael
Balaprakash, Prasanna
Finkel, Hal
Hovland, Paul
Taylor, Valerie
Hall, Mary
description We develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter search space, and we compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness. We select six of the most complex PolyBench benchmarks and apply the newly developed LLVM Clang/Polly loop optimization pragmas to the benchmarks to optimize them. We then use the autotuning framework to optimize the pragma parameters to improve their performance. The experimental results show that our autotuning approach outperforms the other compiling methods, providing the smallest execution time for the benchmarks syr2k, 3mm, heat-3d, lu, and covariance with two large datasets, using 200 code evaluations to effectively search parameter spaces with up to 170,368 different configurations. We find that the Floyd–Warshall benchmark did not benefit from autotuning; to cope with this issue, we provide compiler-option solutions to improve its performance. We then present loop autotuning without a user's knowledge, using a simple mctree autotuning framework, to further improve the performance of the Floyd–Warshall benchmark. We also extend the ytopt autotuning framework to tune a deep learning application.
doi_str_mv 10.1002/cpe.6683
format Article
fulltext fulltext
identifier ISSN: 1532-0626
ispartof Concurrency and computation, 2022-09, Vol.34 (20), p.n/a
issn 1532-0626
1532-0634
language eng
recordid cdi_proquest_journals_2699544312
source Wiley Online Library - AutoHoldings Journals
subjects autotuning
Bayesian analysis
Benchmarks
Clang
Deep learning
loop transformation
Machine learning
Optimization
Parameters
Performance enhancement
Polly
PolyBench benchmarks
title Autotuning PolyBench benchmarks with LLVM Clang/Polly loop optimization pragmas using Bayesian optimization
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T21%3A56%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Autotuning%20PolyBench%20benchmarks%20with%20LLVM%20Clang/Polly%20loop%20optimization%20pragmas%20using%20Bayesian%20optimization&rft.jtitle=Concurrency%20and%20computation&rft.au=Wu,%20Xingfu&rft.date=2022-09-10&rft.volume=34&rft.issue=20&rft.epage=n/a&rft.issn=1532-0626&rft.eissn=1532-0634&rft_id=info:doi/10.1002/cpe.6683&rft_dat=%3Cproquest_cross%3E2699544312%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2699544312&rft_id=info:pmid/&rfr_iscdi=true