Automatic Parallelization of Software Network Functions

Software network functions (NFs) trade off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking...

Full description

Bibliographic details
Main authors:	Pereira, Francisco; Ramos, Fernando M. V; Pedrosa, Luis
Format:	Article
Language:	eng
Subjects:	Computer Science - Networking and Internet Architecture
Online access:	Order full text

creator	Pereira, Francisco; Ramos, Fernando M. V; Pedrosa, Luis
description	Software network functions (NFs) trade off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale up linearly until bottlenecked by PCIe when using small packets or by the 100 Gbps line rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.
doi_str_mv	10.48550/arxiv.2307.14791
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2307_14791</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2307_14791</sourcerecordid><originalsourceid>FETCH-LOGICAL-a671-5cd78d57a8541d26f2f9f2fab0cda1956266cf8a17cf5d29f0416e9f1c0c210a3</originalsourceid><addsrcrecordid>eNotj81qwzAQhHXpISR9gJ6qF7CjlS3JOgaTtIHQBJK72UpaMHGiIjtN26dv_g7DMAwM8zH2AiIvK6XEFNNP-53LQpgcSmNhxMzsNMQDDq3jG0zYdaFr_y4xHnkkvo00nDEF_hGGc0x7vjgd3bXsJ-yJsOvD88PHbLeY7-r3bLV-W9azVYbaQKacN5VXBitVgpeaJNmL8FM4j2CVllo7qhCMI-WlJVGCDpbACSdBYDFmr_fZ2_PmK7UHTL_NlaC5ERT_vfdBVg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Automatic Parallelization of Software Network Functions</title><source>arXiv.org</source><creator>Pereira, Francisco ; Ramos, Fernando M. V ; Pedrosa, Luis</creator><creatorcontrib>Pereira, Francisco ; Ramos, Fernando M. V ; Pedrosa, Luis</creatorcontrib><description>Software network functions (NFs) trade-off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. 
We parallelized 8 software NFs and show that they generally scale-up linearly until bottlenecked by PCIe when using small packets or by 100Gbps line-rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.</description><identifier>DOI: 10.48550/arxiv.2307.14791</identifier><language>eng</language><subject>Computer Science - Networking and Internet Architecture</subject><creationdate>2023-07</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,781,886</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2307.14791$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2307.14791$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Pereira, Francisco</creatorcontrib><creatorcontrib>Ramos, Fernando M. V</creatorcontrib><creatorcontrib>Pedrosa, Luis</creatorcontrib><title>Automatic Parallelization of Software Network Functions</title><description>Software network functions (NFs) trade-off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. 
When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale-up linearly until bottlenecked by PCIe when using small packets or by 100Gbps line-rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.</description><subject>Computer Science - Networking and Internet Architecture</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81qwzAQhHXpISR9gJ6qF7CjlS3JOgaTtIHQBJK72UpaMHGiIjtN26dv_g7DMAwM8zH2AiIvK6XEFNNP-53LQpgcSmNhxMzsNMQDDq3jG0zYdaFr_y4xHnkkvo00nDEF_hGGc0x7vjgd3bXsJ-yJsOvD88PHbLeY7-r3bLV-W9azVYbaQKacN5VXBitVgpeaJNmL8FM4j2CVllo7qhCMI-WlJVGCDpbACSdBYDFmr_fZ2_PmK7UHTL_NlaC5ERT_vfdBVg</recordid><startdate>20230727</startdate><enddate>20230727</enddate><creator>Pereira, Francisco</creator><creator>Ramos, Fernando M. V</creator><creator>Pedrosa, Luis</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230727</creationdate><title>Automatic Parallelization of Software Network Functions</title><author>Pereira, Francisco ; Ramos, Fernando M. V ; Pedrosa, Luis</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a671-5cd78d57a8541d26f2f9f2fab0cda1956266cf8a17cf5d29f0416e9f1c0c210a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Networking and Internet Architecture</topic><toplevel>online_resources</toplevel><creatorcontrib>Pereira, Francisco</creatorcontrib><creatorcontrib>Ramos, Fernando M. 
V</creatorcontrib><creatorcontrib>Pedrosa, Luis</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Pereira, Francisco</au><au>Ramos, Fernando M. V</au><au>Pedrosa, Luis</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Automatic Parallelization of Software Network Functions</atitle><date>2023-07-27</date><risdate>2023</risdate><abstract>Software network functions (NFs) trade-off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale-up linearly until bottlenecked by PCIe when using small packets or by 100Gbps line-rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.</abstract><doi>10.48550/arxiv.2307.14791</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2307.14791
language	eng
recordid	cdi_arxiv_primary_2307_14791
source	arXiv.org
subjects	Computer Science - Networking and Internet Architecture
title	Automatic Parallelization of Software Network Functions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T11%3A33%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20Parallelization%20of%20Software%20Network%20Functions&rft.au=Pereira,%20Francisco&rft.date=2023-07-27&rft_id=info:doi/10.48550/arxiv.2307.14791&rft_dat=%3Carxiv_GOX%3E2307_14791%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true