TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling

Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but it raises concerns about application security. We reveal that the translation lookaside buffer (TLB) attack, one of the common attacks on CPUs, can happen on GPUs when multiple GPU kernels co-run...

Bibliographic Details
Published in:	ACM transactions on architecture and code optimization 2022-03, Vol.19 (1), p.1-23
Main authors:	Di, Bang, Hu, Daokun, Xie, Zhen, Sun, Jianhua, Chen, Hao, Ren, Jinkui, Li, Dong
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	23
container_issue	1
container_start_page	1
container_title	ACM transactions on architecture and code optimization
container_volume	19
creator	Di, Bang; Hu, Daokun; Xie, Zhen; Sun, Jianhua; Chen, Hao; Ren, Jinkui; Li, Dong
description	Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but it raises concerns about application security. We reveal that the translation lookaside buffer (TLB) attack, one of the common attacks on CPUs, can happen on GPUs when multiple GPU kernels co-run. We investigate the conditions or principles under which a TLB attack can take effect, including awareness of the GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of the GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering the hardware isolation of last-level TLBs and the application’s resource requirements. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The results show that, under TLB attack, TLB-pilot provides on average 56.2% and 60.6% improvement in average normalized turnaround time and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. Under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround time and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficient scheduling of thread blocks.
doi_str_mv	10.1145/3491218
format	Article
fullrecord	<record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3491218</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_1145_3491218</sourcerecordid><originalsourceid>FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</originalsourceid><addsrcrecordid>eNo1kE1LxDAYhIMouK7iX8jNUzQfTdN4q0VXoaLg7slDSdM32-jaLkmWxX9vxfU0D8MwDIPQJaPXjGXyRmSacVYcoRmTWUaEVuL4n2Wen6KzGD8o5ZpTOkPvy_qObP1mTLf42Se_NskPazy5uBqHBEPy44DLlIz9xBMtXlcR733qp7QNowm29wls2gUg5d4EwG-2h263mVrO0YkzmwgXB52j1cP9snok9cviqSprYjmnifCOt4o76BjPlVFUCOpsa6Rk1DLHlAIGWhiRgcgLKGir88KAlp3RhYNWizm6-uudBsUYwDXb4L9M-G4YbX4_aQ6fiB8T9FNK</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><source>ACM Digital Library Complete</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</creator><creatorcontrib>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</creatorcontrib><description>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. 
In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</description><identifier>ISSN: 1544-3566</identifier><identifier>EISSN: 1544-3973</identifier><identifier>DOI: 10.1145/3491218</identifier><language>eng</language><ispartof>ACM transactions on architecture and code optimization, 2022-03, Vol.19 (1), p.1-23</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Di, Bang</creatorcontrib><creatorcontrib>Hu, Daokun</creatorcontrib><creatorcontrib>Xie, 
Zhen</creatorcontrib><creatorcontrib>Sun, Jianhua</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Ren, Jinkui</creatorcontrib><creatorcontrib>Li, Dong</creatorcontrib><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><title>ACM transactions on architecture and code optimization</title><description>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. 
When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</description><issn>1544-3566</issn><issn>1544-3973</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNo1kE1LxDAYhIMouK7iX8jNUzQfTdN4q0VXoaLg7slDSdM32-jaLkmWxX9vxfU0D8MwDIPQJaPXjGXyRmSacVYcoRmTWUaEVuL4n2Wen6KzGD8o5ZpTOkPvy_qObP1mTLf42Se_NskPazy5uBqHBEPy44DLlIz9xBMtXlcR733qp7QNowm29wls2gUg5d4EwG-2h263mVrO0YkzmwgXB52j1cP9snok9cviqSprYjmnifCOt4o76BjPlVFUCOpsa6Rk1DLHlAIGWhiRgcgLKGir88KAlp3RhYNWizm6-uudBsUYwDXb4L9M-G4YbX4_aQ6fiB8T9FNK</recordid><startdate>20220301</startdate><enddate>20220301</enddate><creator>Di, Bang</creator><creator>Hu, Daokun</creator><creator>Xie, Zhen</creator><creator>Sun, Jianhua</creator><creator>Chen, Hao</creator><creator>Ren, Jinkui</creator><creator>Li, Dong</creator><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20220301</creationdate><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><author>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Di, Bang</creatorcontrib><creatorcontrib>Hu, Daokun</creatorcontrib><creatorcontrib>Xie, Zhen</creatorcontrib><creatorcontrib>Sun, Jianhua</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Ren, Jinkui</creatorcontrib><creatorcontrib>Li, 
Dong</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on architecture and code optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Di, Bang</au><au>Hu, Daokun</au><au>Xie, Zhen</au><au>Sun, Jianhua</au><au>Chen, Hao</au><au>Ren, Jinkui</au><au>Li, Dong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</atitle><jtitle>ACM transactions on architecture and code optimization</jtitle><date>2022-03-01</date><risdate>2022</risdate><volume>19</volume><issue>1</issue><spage>1</spage><epage>23</epage><pages>1-23</pages><issn>1544-3566</issn><eissn>1544-3973</eissn><abstract>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. 
By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</abstract><doi>10.1145/3491218</doi><tpages>23</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1544-3566
ispartof	ACM transactions on architecture and code optimization, 2022-03, Vol.19 (1), p.1-23
issn	1544-3566 1544-3973
language	eng
recordid	cdi_crossref_primary_10_1145_3491218
source	ACM Digital Library Complete; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
title	TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T04%3A27%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TLB-pilot:%20Mitigating%20TLB%20Contention%20Attack%20on%20GPUs%20with%20Microarchitecture-Aware%20Scheduling&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Di,%20Bang&rft.date=2022-03-01&rft.volume=19&rft.issue=1&rft.spage=1&rft.epage=23&rft.pages=1-23&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/3491218&rft_dat=%3Ccrossref%3E10_1145_3491218%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true