Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs

The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for...

Bibliographic Details
Published in:	Proceedings of the ACM on measurement and analysis of computing systems, 2023-05, Vol.7 (2), p.1-23, Article 36
Main Authors:	Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya
Format:	Article
Language:	eng
Subjects:	In-network processing; Network services; Networks
Online Access:	Full text

container_end_page	23
container_issue	2
container_start_page	1
container_title	Proceedings of the ACM on measurement and analysis of computing systems
container_volume	7
creator	Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya
description	The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.
doi_str_mv	10.1145/3589980
format	Article
fullrecord	<record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3589980</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3589980</sourcerecordid><originalsourceid>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</originalsourceid><addsrcrecordid>eNpNkM1Lw0AUxBdRsNTi3dPeeoruZzc5hqg1UPyg7Tm87L5oJE3CblT8701pFU_zYH5vGIaQS86uOVf6Ruo4SWJ2QiZCmUXEhUpO_93nZBbCO2OMx5rpRE7Iy6b7Au8CTa3FBj0MdftKb2EAmrcDtqH-RJr2fVPb0eraeaDrt4-qapA--85iCHQb9i_rHfjhMc_CBTmroAk4O-qUbO_vNtlDtHpa5lm6ikAYM0RaOmXGXpyhZEoLV3KItS15IiWWTFhrjF5wyXVpnNaYyFhKUCgBHTgt5JTMD7nWdyF4rIre12OJ74KzYj9GcRxjJK8OJNjdH_Rr_gDos1hl</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><source>Access via ACM Digital Library</source><creator>Lin, Jiaxin ; Ji, Tao ; Hao, Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</creator><creatorcontrib>Lin, Jiaxin ; Ji, Tao ; Hao, Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</creatorcontrib><description>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. 
SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</description><identifier>ISSN: 2476-1249</identifier><identifier>EISSN: 2476-1249</identifier><identifier>DOI: 10.1145/3589980</identifier><language>eng</language><publisher>New York, NY, USA: ACM</publisher><subject>In-network processing ; Network services ; Networks</subject><ispartof>Proceedings of the ACM on measurement and analysis of computing systems, 2023-05, Vol.7 (2), p.1-23, Article 36</ispartof><rights>Owner/Author</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</citedby><cites>FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</cites><orcidid>0009-0008-4428-2771 ; 0009-0007-4305-5721 ; 0009-0001-9554-9084 ; 0000-0002-5680-9015 ; 0009-0001-0785-2519 ; 0000-0002-5920-170X ; 
0009-0002-0494-5768</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3589980$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76228</link.rule.ids></links><search><creatorcontrib>Lin, Jiaxin</creatorcontrib><creatorcontrib>Ji, Tao</creatorcontrib><creatorcontrib>Hao, Xiangpeng</creatorcontrib><creatorcontrib>Cha, Hokeun</creatorcontrib><creatorcontrib>Le, Yanfang</creatorcontrib><creatorcontrib>Yu, Xiangyao</creatorcontrib><creatorcontrib>Akella, Aditya</creatorcontrib><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><title>Proceedings of the ACM on measurement and analysis of computing systems</title><addtitle>ACM POMACS</addtitle><description>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. 
We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</description><subject>In-network processing</subject><subject>Network services</subject><subject>Networks</subject><issn>2476-1249</issn><issn>2476-1249</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpNkM1Lw0AUxBdRsNTi3dPeeoruZzc5hqg1UPyg7Tm87L5oJE3CblT8701pFU_zYH5vGIaQS86uOVf6Ruo4SWJ2QiZCmUXEhUpO_93nZBbCO2OMx5rpRE7Iy6b7Au8CTa3FBj0MdftKb2EAmrcDtqH-RJr2fVPb0eraeaDrt4-qapA--85iCHQb9i_rHfjhMc_CBTmroAk4O-qUbO_vNtlDtHpa5lm6ikAYM0RaOmXGXpyhZEoLV3KItS15IiWWTFhrjF5wyXVpnNaYyFhKUCgBHTgt5JTMD7nWdyF4rIre12OJ74KzYj9GcRxjJK8OJNjdH_Rr_gDos1hl</recordid><startdate>20230522</startdate><enddate>20230522</enddate><creator>Lin, Jiaxin</creator><creator>Ji, Tao</creator><creator>Hao, Xiangpeng</creator><creator>Cha, Hokeun</creator><creator>Le, Yanfang</creator><creator>Yu, Xiangyao</creator><creator>Akella, Aditya</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0009-0008-4428-2771</orcidid><orcidid>https://orcid.org/0009-0007-4305-5721</orcidid><orcidid>https://orcid.org/0009-0001-9554-9084</orcidid><orcidid>https://orcid.org/0000-0002-5680-9015</orcidid><orcidid>https://orcid.org/0009-0001-0785-2519</orcidid><orcidid>https://orcid.org/0000-0002-5920-170X</orcidid><orcidid>https://orcid.org/0009-0002-0494-5768</orcidid></search><sort><creationdate>20230522</creationdate><title>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</title><author>Lin, Jiaxin ; Ji, Tao ; Hao, 
Xiangpeng ; Cha, Hokeun ; Le, Yanfang ; Yu, Xiangyao ; Akella, Aditya</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a277t-53d4747610e30452db1a85cb1933eb02cc77561315b7d55e93833a4e3aedad523</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>In-network processing</topic><topic>Network services</topic><topic>Networks</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lin, Jiaxin</creatorcontrib><creatorcontrib>Ji, Tao</creatorcontrib><creatorcontrib>Hao, Xiangpeng</creatorcontrib><creatorcontrib>Cha, Hokeun</creatorcontrib><creatorcontrib>Le, Yanfang</creatorcontrib><creatorcontrib>Yu, Xiangyao</creatorcontrib><creatorcontrib>Akella, Aditya</creatorcontrib><collection>CrossRef</collection><jtitle>Proceedings of the ACM on measurement and analysis of computing systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lin, Jiaxin</au><au>Ji, Tao</au><au>Hao, Xiangpeng</au><au>Cha, Hokeun</au><au>Le, Yanfang</au><au>Yu, Xiangyao</au><au>Akella, Aditya</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs</atitle><jtitle>Proceedings of the ACM on measurement and analysis of computing systems</jtitle><stitle>ACM POMACS</stitle><date>2023-05-22</date><risdate>2023</risdate><volume>7</volume><issue>2</issue><spage>1</spage><epage>23</epage><pages>1-23</pages><artnum>36</artnum><issn>2476-1249</issn><eissn>2476-1249</eissn><abstract>The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. 
Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.</abstract><cop>New York, NY, USA</cop><pub>ACM</pub><doi>10.1145/3589980</doi><tpages>23</tpages><orcidid>https://orcid.org/0009-0008-4428-2771</orcidid><orcidid>https://orcid.org/0009-0007-4305-5721</orcidid><orcidid>https://orcid.org/0009-0001-9554-9084</orcidid><orcidid>https://orcid.org/0000-0002-5680-9015</orcidid><orcidid>https://orcid.org/0009-0001-0785-2519</orcidid><orcidid>https://orcid.org/0000-0002-5920-170X</orcidid><orcidid>https://orcid.org/0009-0002-0494-5768</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2476-1249
ispartof	Proceedings of the ACM on measurement and analysis of computing systems, 2023-05, Vol.7 (2), p.1-23, Article 36
issn	2476-1249 2476-1249
language	eng
recordid	cdi_crossref_primary_10_1145_3589980
source	Access via ACM Digital Library
subjects	In-network processing; Network services; Networks
title	Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T20%3A55%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Accelerating%20Data%20Intensive%20Application's%20Shuffle%20Process%20Using%20SmartNICs&rft.jtitle=Proceedings%20of%20the%20ACM%20on%20measurement%20and%20analysis%20of%20computing%20systems&rft.au=Lin,%20Jiaxin&rft.date=2023-05-22&rft.volume=7&rft.issue=2&rft.spage=1&rft.epage=23&rft.pages=1-23&rft.artnum=36&rft.issn=2476-1249&rft.eissn=2476-1249&rft_id=info:doi/10.1145/3589980&rft_dat=%3Cacm_cross%3E3589980%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true