Scalable Adaptive NUMA-Aware Lock
Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of co...
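Published in: | IEEE transactions on parallel and distributed systems 2017-06, Vol.28 (6), p.1754-1769 |
---|---|
Main authors: | Mingzhe Zhang, Haibo Chen, Luwei Cheng, Francis C. M. Lau, Cho-Li Wang |
Format: | Article |
Language: | eng |
Subjects: | adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization |
Online access: | Order full text |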
container_end_page | 1769 |
---|---|
container_issue | 6 |
container_start_page | 1754 |
container_title | IEEE transactions on parallel and distributed systems |
container_volume | 28 |
creator | Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang |
description | Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex. |
doi_str_mv | 10.1109/TPDS.2016.2630695 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPDS_2016_2630695</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7748539</ieee_id><sourcerecordid>2174469414</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</originalsourceid><addsrcrecordid>eNo9kE1LxDAQhoMouK7-APFS8dw6k0za5ljWT6gfsLvnkNQUulZb013Ff29KF0_zMjzvDDyMnSMkiKCuV683y4QDpglPBaRKHrAZSpnHHHNxGDKQjBVHdcxOhmEDgCSBZuxyWZnW2NZFxZvpt823i57XT0Vc_BjvorKr3k_ZUW3awZ3t55yt725Xi4e4fLl_XBRlXHEltrGRVCtHiArQqrxWkJnMKqgNAFmb1Si4EDaskHOilAsJHCRSikiVtGLOrqa7ve--dm7Y6k2385_hpeaYhYYipEDhRFW-Gwbvat375sP4X42gRxN6NKFHE3pvInQupk7jnPvns4xyKZT4A9Q2VVw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174469414</pqid></control><display><type>article</type><title>Scalable Adaptive NUMA-Aware Lock</title><source>IEEE Electronic Library (IEL)</source><creator>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</creator><creatorcontrib>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</creatorcontrib><description>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. 
To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2016.2630695</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization</subject><ispartof>IEEE transactions on parallel and distributed systems, 2017-06, Vol.28 (6), p.1754-1769</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2017</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</citedby><cites>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7748539$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7748539$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mingzhe Zhang</creatorcontrib><creatorcontrib>Haibo Chen</creatorcontrib><creatorcontrib>Luwei Cheng</creatorcontrib><creatorcontrib>Lau, Francis C. M.</creatorcontrib><creatorcontrib>Cho-Li Wang</creatorcontrib><title>Scalable Adaptive NUMA-Aware Lock</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. 
To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</description><subject>adaptive synchronization</subject><subject>Delegation lock</subject><subject>Hardware</subject><subject>Instruction sets</subject><subject>Locking</subject><subject>Locks</subject><subject>Message systems</subject><subject>Servers</subject><subject>Switches</subject><subject>Synchronization</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LxDAQhoMouK7-APFS8dw6k0za5ljWT6gfsLvnkNQUulZb013Ff29KF0_zMjzvDDyMnSMkiKCuV683y4QDpglPBaRKHrAZSpnHHHNxGDKQjBVHdcxOhmEDgCSBZuxyWZnW2NZFxZvpt823i57XT0Vc_BjvorKr3k_ZUW3awZ3t55yt725Xi4e4fLl_XBRlXHEltrGRVCtHiArQqrxWkJnMKqgNAFmb1Si4EDaskHOilAsJHCRSikiVtGLOrqa7ve--dm7Y6k2385_hpeaYhYYipEDhRFW-Gwbvat375sP4X42gRxN6NKFHE3pvInQupk7jnPvns4xyKZT4A9Q2VVw</recordid><startdate>20170601</startdate><enddate>20170601</enddate><creator>Mingzhe Zhang</creator><creator>Haibo Chen</creator><creator>Luwei Cheng</creator><creator>Lau, Francis C. M.</creator><creator>Cho-Li Wang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20170601</creationdate><title>Scalable Adaptive NUMA-Aware Lock</title><author>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>adaptive synchronization</topic><topic>Delegation lock</topic><topic>Hardware</topic><topic>Instruction sets</topic><topic>Locking</topic><topic>Locks</topic><topic>Message systems</topic><topic>Servers</topic><topic>Switches</topic><topic>Synchronization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mingzhe Zhang</creatorcontrib><creatorcontrib>Haibo Chen</creatorcontrib><creatorcontrib>Luwei Cheng</creatorcontrib><creatorcontrib>Lau, Francis C. 
M.</creatorcontrib><creatorcontrib>Cho-Li Wang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mingzhe Zhang</au><au>Haibo Chen</au><au>Luwei Cheng</au><au>Lau, Francis C. M.</au><au>Cho-Li Wang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Adaptive NUMA-Aware Lock</atitle><jtitle>IEEE transactions on parallel and distributed systems</jtitle><stitle>TPDS</stitle><date>2017-06-01</date><risdate>2017</risdate><volume>28</volume><issue>6</issue><spage>1754</spage><epage>1769</epage><pages>1754-1769</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. 
Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2016.2630695</doi><tpages>16</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1045-9219 |
ispartof | IEEE transactions on parallel and distributed systems, 2017-06, Vol.28 (6), p.1754-1769 |
issn | 1045-9219 ; 1558-2183 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TPDS_2016_2630695 |
source | IEEE Electronic Library (IEL) |
subjects | adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization |
title | Scalable Adaptive NUMA-Aware Lock |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T01%3A41%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Adaptive%20NUMA-Aware%20Lock&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Mingzhe%20Zhang&rft.date=2017-06-01&rft.volume=28&rft.issue=6&rft.spage=1754&rft.epage=1769&rft.pages=1754-1769&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2016.2630695&rft_dat=%3Cproquest_RIE%3E2174469414%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174469414&rft_id=info:pmid/&rft_ieee_id=7748539&rfr_iscdi=true |
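
The description in this record says SANL's NUMA policy "jointly considers node distances and server utilization when choosing lock servers", but the record does not give the formula. The following is only a minimal sketch of what such a joint policy could look like, not the authors' method. Everything in it is an assumption made for illustration: the `LockServer` type, the `choose_server` function, the linear weighting `alpha`, and the distance matrix (written in the style of the node distances that `numactl --hardware` reports).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LockServer:
    """Hypothetical description of one delegation-lock server thread."""
    node: int            # NUMA node the server thread runs on
    utilization: float   # fraction of time the server is busy, 0.0 to 1.0


def choose_server(
    client_node: int,
    servers: Sequence[LockServer],
    distance: Sequence[Sequence[int]],
    alpha: float = 0.5,
) -> LockServer:
    """Pick the server with the lowest combined cost.

    The cost is an assumed linear blend, not taken from the paper:
        alpha * normalized_distance + (1 - alpha) * utilization
    Raises ValueError if `servers` is empty.
    """
    # Normalize distances to 0..1 so they are comparable with utilization.
    max_dist = max(max(row) for row in distance)

    def cost(server: LockServer) -> float:
        norm_dist = distance[client_node][server.node] / max_dist
        return alpha * norm_dist + (1.0 - alpha) * server.utilization

    return min(servers, key=cost)


if __name__ == "__main__":
    # Made-up two-node machine: local distance 10, remote distance 21.
    dist = [[10, 21], [21, 10]]
    candidates = [LockServer(node=0, utilization=0.9),
                  LockServer(node=1, utilization=0.1)]
    print(choose_server(0, candidates, dist))             # equal weighting
    print(choose_server(0, candidates, dist, alpha=0.9))  # distance-heavy
```

With these made-up numbers and equal weighting, a client on node 0 is sent to the lightly loaded server on node 1 even though it is farther away; raising `alpha` to 0.9 makes distance dominate and the local server on node 0 is chosen instead. A distance-only or utilization-only policy is the special case `alpha = 1.0` or `alpha = 0.0`.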