Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs

The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for...

Detailed description

Bibliographic details
Published in: Proceedings of the ACM on measurement and analysis of computing systems 2023-05, Vol.7 (2), p.1-23, Article 36
Main authors: Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 23
container_issue 2
container_start_page 1
container_title Proceedings of the ACM on measurement and analysis of computing systems
container_volume 7
creator Lin, Jiaxin
Ji, Tao
Hao, Xiangpeng
Cha, Hokeun
Le, Yanfang
Yu, Xiangyao
Akella, Aditya
description The wide adoption of emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which relieves host CPUs and improves performance. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the shuffle process of data-intensive applications by offloading various computation tasks to the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture so that sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime so that resources on both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency and lowers job completion time. SmartShuffle outperforms Spark and Spark RDMA by up to 40% on TPC-H.
doi_str_mv 10.1145/3589980
format Article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3589980</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3589980</sourcerecordid><originalsourceid>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</originalsourceid><addsrcrecordid>eNpNkM1Lw0AUxBdRsNTi3dPeeoruZzc5hqg1UPyg7Tm87L5oJE3CblT8701pFU_zYH5vGIaQS86uOVf6Ruo4SWJ2QiZCmUXEhUpO_93nZBbCO2OMx5rpRE7Iy6b7Au8CTa3FBj0MdftKb2EAmrcDtqH-RJr2fVPb0eraeaDrt4-qapA--85iCHQb9i_rHfjhMc_CBTmroAk4O-qUbO_vNtlDtHpa5lm6ikAYM0RaOmXGXpyhZEoLV3KItS15IiWWTFhrjF5wyXVpnNaYyFhKUCgBHTgt5JTMD7nWdyF4rIre12OJ74KzYj9GcRxjJK8OJNjdH_Rr_gDos1hl</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><source>Access via ACM Digital Library</source><creator>Lin, Jiaxin ; Ji, Tao ; Hao, Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</creator><creatorcontrib>Lin, Jiaxin ; Ji, Tao ; Hao, Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</creatorcontrib><description>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. 
SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</description><identifier>ISSN: 2476-1249</identifier><identifier>EISSN: 2476-1249</identifier><identifier>DOI: 10.1145/3589980</identifier><language>eng</language><publisher>New York, NY, USA: ACM</publisher><subject>In-network processing ; Network services ; Networks</subject><ispartof>Proceedings of the ACM on measurement and analysis of computing systems, 2023-05, Vol.7 (2), p.1-23, Article 36</ispartof><rights>Owner/Author</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</citedby><cites>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</cites><orcidid>0009-0008-4428-2771 ; 0009-0007-4305-5721 ; 0009-0001-9554-9084 ; 0000-0002-5680-9015 ; 0009-0001-0785-2519 ; 0000-0002-5920-170X ; 
0009-0002-0494-5768</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3589980$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76228</link.rule.ids></links><search><creatorcontrib>Lin, Jiaxin</creatorcontrib><creatorcontrib>Ji, Tao</creatorcontrib><creatorcontrib>Hao, Xiangpeng</creatorcontrib><creatorcontrib>Cha, Hokeun</creatorcontrib><creatorcontrib>Le, Yanfang</creatorcontrib><creatorcontrib>Yu, Xiangyao</creatorcontrib><creatorcontrib>Akella, Aditya</creatorcontrib><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><title>Proceedings of the ACM on measurement and analysis of computing systems</title><addtitle>ACM POMACS</addtitle><description>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. 
We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</description><subject>In-network processing</subject><subject>Network services</subject><subject>Networks</subject><issn>2476-1249</issn><issn>2476-1249</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpNkM1Lw0AUxBdRsNTi3dPeeoruZzc5hqg1UPyg7Tm87L5oJE3CblT8701pFU_zYH5vGIaQS86uOVf6Ruo4SWJ2QiZCmUXEhUpO_93nZBbCO2OMx5rpRE7Iy6b7Au8CTa3FBj0MdftKb2EAmrcDtqH-RJr2fVPb0eraeaDrt4-qapA--85iCHQb9i_rHfjhMc_CBTmroAk4O-qUbO_vNtlDtHpa5lm6ikAYM0RaOmXGXpyhZEoLV3KItS15IiWWTFhrjF5wyXVpnNaYyFhKUCgBHTgt5JTMD7nWdyF4rIre12OJ74KzYj9GcRxjJK8OJNjdH_Rr_gDos1hl</recordid><startdate>20230522</startdate><enddate>20230522</enddate><creator>Lin, Jiaxin</creator><creator>Ji, Tao</creator><creator>Hao, Xiangpeng</creator><creator>Cha, Hokeun</creator><creator>Le, Yanfang</creator><creator>Yu, Xiangyao</creator><creator>Akella, Aditya</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0009-0008-4428-2771</orcidid><orcidid>https://orcid.org/0009-0007-4305-5721</orcidid><orcidid>https://orcid.org/0009-0001-9554-9084</orcidid><orcidid>https://orcid.org/0000-0002-5680-9015</orcidid><orcidid>https://orcid.org/0009-0001-0785-2519</orcidid><orcidid>https://orcid.org/0000-0002-5920-170X</orcidid><orcidid>https://orcid.org/0009-0002-0494-5768</orcidid></search><sort><creationdate>20230522</creationdate><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><author>Lin, Jiaxin ; Ji, Tao ; Hao, 
Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>In-network processing</topic><topic>Network services</topic><topic>Networks</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lin, Jiaxin</creatorcontrib><creatorcontrib>Ji, Tao</creatorcontrib><creatorcontrib>Hao, Xiangpeng</creatorcontrib><creatorcontrib>Cha, Hokeun</creatorcontrib><creatorcontrib>Le, Yanfang</creatorcontrib><creatorcontrib>Yu, Xiangyao</creatorcontrib><creatorcontrib>Akella, Aditya</creatorcontrib><collection>CrossRef</collection><jtitle>Proceedings of the ACM on measurement and analysis of computing systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lin, Jiaxin</au><au>Ji, Tao</au><au>Hao, Xiangpeng</au><au>Cha, Hokeun</au><au>Le, Yanfang</au><au>Yu, Xiangyao</au><au>Akella, Aditya</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</atitle><jtitle>Proceedings of the ACM on measurement and analysis of computing systems</jtitle><stitle>ACM POMACS</stitle><date>2023-05-22</date><risdate>2023</risdate><volume>7</volume><issue>2</issue><spage>1</spage><epage>23</epage><pages>1-23</pages><artnum>36</artnum><issn>2476-1249</issn><eissn>2476-1249</eissn><abstract>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. 
Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</abstract><cop>New York, NY, USA</cop><pub>ACM</pub><doi>10.1145/3589980</doi><tpages>23</tpages><orcidid>https://orcid.org/0009-0008-4428-2771</orcidid><orcidid>https://orcid.org/0009-0007-4305-5721</orcidid><orcidid>https://orcid.org/0009-0001-9554-9084</orcidid><orcidid>https://orcid.org/0000-0002-5680-9015</orcidid><orcidid>https://orcid.org/0009-0001-0785-2519</orcidid><orcidid>https://orcid.org/0000-0002-5920-170X</orcidid><orcidid>https://orcid.org/0009-0002-0494-5768</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2476-1249
ispartof Proceedings of the ACM on measurement and analysis of computing systems, 2023-05, Vol.7 (2), p.1-23, Article 36
issn 2476-1249
2476-1249
language eng
recordid cdi_crossref_primary_10_1145_3589980
source Access via ACM Digital Library
subjects In-network processing
Network services
Networks
title Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T20%3A55%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Accelerating%20Data%20Intensive%20Application's%20Shuffle%20Process%20Using%20SmartNICs&rft.jtitle=Proceedings%20of%20the%20ACM%20on%20measurement%20and%20analysis%20of%20computing%20systems&rft.au=Lin,%20Jiaxin&rft.date=2023-05-22&rft.volume=7&rft.issue=2&rft.spage=1&rft.epage=23&rft.pages=1-23&rft.artnum=36&rft.issn=2476-1249&rft.eissn=2476-1249&rft_id=info:doi/10.1145/3589980&rft_dat=%3Cacm_cross%3E3589980%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true