Automatic Parallelization of Software Network Functions

Software network functions (NFs) trade performance for flexibility and ease of deployment. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking...

Bibliographic details
Main authors: Pereira, Francisco; Ramos, Fernando M. V; Pedrosa, Luis
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Pereira, Francisco
Ramos, Fernando M. V
Pedrosa, Luis
description Software network functions (NFs) trade performance for flexibility and ease of deployment. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale up linearly until bottlenecked by PCIe when using small packets or by the 100 Gbps line rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.
doi_str_mv 10.48550/arxiv.2307.14791
format Article
fullrecord <record>
  <control>
    <sourceid>arxiv_GOX</sourceid>
    <recordid>TN_cdi_arxiv_primary_2307_14791</recordid>
    <sourceformat>XML</sourceformat>
    <sourcesystem>PC</sourcesystem>
    <sourcerecordid>2307_14791</sourcerecordid>
    <sourcetype>Open Access Repository</sourcetype>
    <iscdi>true</iscdi>
    <recordtype>article</recordtype>
  </control>
  <display>
    <type>article</type>
    <title>Automatic Parallelization of Software Network Functions</title>
    <source>arXiv.org</source>
    <creator>Pereira, Francisco ; Ramos, Fernando M. V ; Pedrosa, Luis</creator>
    <identifier>DOI: 10.48550/arxiv.2307.14791</identifier>
    <language>eng</language>
    <subject>Computer Science - Networking and Internet Architecture</subject>
    <creationdate>2023-07</creationdate>
    <rights>http://creativecommons.org/licenses/by/4.0</rights>
    <oa>free_for_read</oa>
  </display>
  <links>
    <linktorsrc>$$Uhttps://arxiv.org/abs/2307.14791$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc>
    <backlink>$$Uhttps://doi.org/10.48550/arXiv.2307.14791$$DView paper in arXiv$$Hfree_for_read</backlink>
  </links>
  <addata>
    <au>Pereira, Francisco</au>
    <au>Ramos, Fernando M. V</au>
    <au>Pedrosa, Luis</au>
    <format>journal</format>
    <genre>article</genre>
    <ristype>JOUR</ristype>
    <atitle>Automatic Parallelization of Software Network Functions</atitle>
    <date>2023-07-27</date>
    <risdate>2023</risdate>
    <doi>10.48550/arxiv.2307.14791</doi>
    <oa>free_for_read</oa>
  </addata>
</record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2307.14791
ispartof
issn
language eng
recordid cdi_arxiv_primary_2307_14791
source arXiv.org
subjects Computer Science - Networking and Internet Architecture
title Automatic Parallelization of Software Network Functions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T11%3A33%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20Parallelization%20of%20Software%20Network%20Functions&rft.au=Pereira,%20Francisco&rft.date=2023-07-27&rft_id=info:doi/10.48550/arxiv.2307.14791&rft_dat=%3Carxiv_GOX%3E2307_14791%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
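
The description in this record says that traffic is distributed across cores by configuring the NIC's Receive Side Scaling (RSS) mechanism, and that in the shared-nothing case each core works without coordinating with the others. The following is a minimal sketch of that general idea, not Maestro's implementation: it computes a Toeplitz hash (the hash commonly used by RSS) over an IPv4 4-tuple in software and maps it to a core. Every name here (`SYMMETRIC_KEY`, `toeplitz_hash`, `flow_tuple`, `core_for_packet`) is hypothetical. The key, a repeated 16-bit pattern, is an assumption chosen so that both directions of a connection hash to the same value. The final `hash % n_cores` step is a simplification of the indirection table a real NIC would use.

```python
import ipaddress
import struct

# Assumption: a 40-byte key built from one repeated 16-bit pattern.
# Because the key has a 16-bit period, swapping source and destination
# fields (which sit at offsets that differ by a multiple of 16 bits)
# leaves the hash unchanged. The record does not say which key Maestro uses.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if key_bits < 32 + len(data) * 8:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                # XOR in the 32-bit window of the key that starts at the
                # position of this input bit (counting from the MSB).
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_tuple(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack an IPv4 4-tuple as src IP, dst IP, src port, dst port."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


def core_for_packet(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                    n_cores: int) -> int:
    """Pick a core for a packet (simplified: no indirection table)."""
    h = toeplitz_hash(SYMMETRIC_KEY, flow_tuple(src_ip, dst_ip, src_port, dst_port))
    return h % n_cores


if __name__ == "__main__":
    fwd = core_for_packet("10.0.0.1", "192.168.1.7", 12345, 80, n_cores=8)
    rev = core_for_packet("192.168.1.7", "10.0.0.1", 80, 12345, n_cores=8)
    print("forward and reverse directions on the same core:", fwd == rev)
```

With a key like this, packets of one connection reach the same core in both directions, so per-connection state could stay core-local. Which header fields and key a given NF actually needs is exactly the kind of decision the description attributes to the tool's analysis.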