TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling

Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run...

Detailed description

Bibliographic details
Published in: ACM Transactions on Architecture and Code Optimization, 2022-03, Vol. 19 (1), p. 1-23
Main authors: Di, Bang; Hu, Daokun; Xie, Zhen; Sun, Jianhua; Chen, Hao; Ren, Jinkui; Li, Dong
Format: Article
Language: English
Online access: Full text
container_end_page 23
container_issue 1
container_start_page 1
container_title ACM transactions on architecture and code optimization
container_volume 19
creator Di, Bang; Hu, Daokun; Xie, Zhen; Sun, Jianhua; Chen, Hao; Ren, Jinkui; Li, Dong
description Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.
doi_str_mv 10.1145/3491218
format Article
fullrecord <record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3491218</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_1145_3491218</sourcerecordid><originalsourceid>FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</originalsourceid><addsrcrecordid>eNo1kE1LxDAYhIMouK7iX8jNUzQfTdN4q0VXoaLg7slDSdM32-jaLkmWxX9vxfU0D8MwDIPQJaPXjGXyRmSacVYcoRmTWUaEVuL4n2Wen6KzGD8o5ZpTOkPvy_qObP1mTLf42Se_NskPazy5uBqHBEPy44DLlIz9xBMtXlcR733qp7QNowm29wls2gUg5d4EwG-2h263mVrO0YkzmwgXB52j1cP9snok9cviqSprYjmnifCOt4o76BjPlVFUCOpsa6Rk1DLHlAIGWhiRgcgLKGir88KAlp3RhYNWizm6-uudBsUYwDXb4L9M-G4YbX4_aQ6fiB8T9FNK</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><source>ACM Digital Library Complete</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</creator><creatorcontrib>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</creatorcontrib><description>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. 
In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</description><identifier>ISSN: 1544-3566</identifier><identifier>EISSN: 1544-3973</identifier><identifier>DOI: 10.1145/3491218</identifier><language>eng</language><ispartof>ACM transactions on architecture and code optimization, 2022-03, Vol.19 (1), p.1-23</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Di, Bang</creatorcontrib><creatorcontrib>Hu, Daokun</creatorcontrib><creatorcontrib>Xie, 
Zhen</creatorcontrib><creatorcontrib>Sun, Jianhua</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Ren, Jinkui</creatorcontrib><creatorcontrib>Li, Dong</creatorcontrib><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><title>ACM transactions on architecture and code optimization</title><description>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. 
When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</description><issn>1544-3566</issn><issn>1544-3973</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNo1kE1LxDAYhIMouK7iX8jNUzQfTdN4q0VXoaLg7slDSdM32-jaLkmWxX9vxfU0D8MwDIPQJaPXjGXyRmSacVYcoRmTWUaEVuL4n2Wen6KzGD8o5ZpTOkPvy_qObP1mTLf42Se_NskPazy5uBqHBEPy44DLlIz9xBMtXlcR733qp7QNowm29wls2gUg5d4EwG-2h263mVrO0YkzmwgXB52j1cP9snok9cviqSprYjmnifCOt4o76BjPlVFUCOpsa6Rk1DLHlAIGWhiRgcgLKGir88KAlp3RhYNWizm6-uudBsUYwDXb4L9M-G4YbX4_aQ6fiB8T9FNK</recordid><startdate>20220301</startdate><enddate>20220301</enddate><creator>Di, Bang</creator><creator>Hu, Daokun</creator><creator>Xie, Zhen</creator><creator>Sun, Jianhua</creator><creator>Chen, Hao</creator><creator>Ren, Jinkui</creator><creator>Li, Dong</creator><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20220301</creationdate><title>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</title><author>Di, Bang ; Hu, Daokun ; Xie, Zhen ; Sun, Jianhua ; Chen, Hao ; Ren, Jinkui ; Li, Dong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c220t-2d2b72fed1267a70330fcba5510c1f177e1e93a34e368e80b968ae95da98feb93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Di, Bang</creatorcontrib><creatorcontrib>Hu, Daokun</creatorcontrib><creatorcontrib>Xie, Zhen</creatorcontrib><creatorcontrib>Sun, Jianhua</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Ren, Jinkui</creatorcontrib><creatorcontrib>Li, 
Dong</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on architecture and code optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Di, Bang</au><au>Hu, Daokun</au><au>Xie, Zhen</au><au>Sun, Jianhua</au><au>Chen, Hao</au><au>Ren, Jinkui</au><au>Li, Dong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling</atitle><jtitle>ACM transactions on architecture and code optimization</jtitle><date>2022-03-01</date><risdate>2022</risdate><volume>19</volume><issue>1</issue><spage>1</spage><epage>23</epage><pages>1-23</pages><issn>1544-3566</issn><eissn>1544-3973</eissn><abstract>Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns on application security. We reveal that translation lookaside buffer (TLB) attack, one of the common attacks on CPU, can happen on GPU when multiple GPU kernels co-run. We investigate conditions or principles under which a TLB attack can take effect, including the awareness of GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirement. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. 
By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The result shows that when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficiently scheduling of thread blocks.</abstract><doi>10.1145/3491218</doi><tpages>23</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1544-3566
ispartof ACM transactions on architecture and code optimization, 2022-03, Vol.19 (1), p.1-23
issn 1544-3566; 1544-3973 (EISSN)
language eng
recordid cdi_crossref_primary_10_1145_3491218
source ACM Digital Library Complete; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
title TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T04%3A27%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TLB-pilot:%20Mitigating%20TLB%20Contention%20Attack%20on%20GPUs%20with%20Microarchitecture-Aware%20Scheduling&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Di,%20Bang&rft.date=2022-03-01&rft.volume=19&rft.issue=1&rft.spage=1&rft.epage=23&rft.pages=1-23&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/3491218&rft_dat=%3Ccrossref%3E10_1145_3491218%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true