Scalable Adaptive NUMA-Aware Lock

Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of co...

Bibliographic details
Published in:	IEEE transactions on parallel and distributed systems 2017-06, Vol.28 (6), p.1754-1769
Main authors:	Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang
Format:	Article
Language:	eng
Subjects:	adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization
Online access:	Order full text

container_end_page	1769
container_issue	6
container_start_page	1754
container_title	IEEE transactions on parallel and distributed systems
container_volume	28
creator	Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang
description	Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.
doi_str_mv	10.1109/TPDS.2016.2630695
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPDS_2016_2630695</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7748539</ieee_id><sourcerecordid>2174469414</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</originalsourceid><addsrcrecordid>eNo9kE1LxDAQhoMouK7-APFS8dw6k0za5ljWT6gfsLvnkNQUulZb013Ff29KF0_zMjzvDDyMnSMkiKCuV683y4QDpglPBaRKHrAZSpnHHHNxGDKQjBVHdcxOhmEDgCSBZuxyWZnW2NZFxZvpt823i57XT0Vc_BjvorKr3k_ZUW3awZ3t55yt725Xi4e4fLl_XBRlXHEltrGRVCtHiArQqrxWkJnMKqgNAFmb1Si4EDaskHOilAsJHCRSikiVtGLOrqa7ve--dm7Y6k2385_hpeaYhYYipEDhRFW-Gwbvat375sP4X42gRxN6NKFHE3pvInQupk7jnPvns4xyKZT4A9Q2VVw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174469414</pqid></control><display><type>article</type><title>Scalable Adaptive NUMA-Aware Lock</title><source>IEEE Electronic Library (IEL)</source><creator>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</creator><creatorcontrib>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</creatorcontrib><description>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. 
To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2016.2630695</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization</subject><ispartof>IEEE transactions on parallel and distributed systems, 2017-06, Vol.28 (6), p.1754-1769</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2017</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</citedby><cites>FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7748539$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7748539$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mingzhe Zhang</creatorcontrib><creatorcontrib>Haibo Chen</creatorcontrib><creatorcontrib>Luwei Cheng</creatorcontrib><creatorcontrib>Lau, Francis C. M.</creatorcontrib><creatorcontrib>Cho-Li Wang</creatorcontrib><title>Scalable Adaptive NUMA-Aware Lock</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. 
To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</description><subject>adaptive synchronization</subject><subject>Delegation lock</subject><subject>Hardware</subject><subject>Instruction sets</subject><subject>Locking</subject><subject>Locks</subject><subject>Message systems</subject><subject>Servers</subject><subject>Switches</subject><subject>Synchronization</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LxDAQhoMouK7-APFS8dw6k0za5ljWT6gfsLvnkNQUulZb013Ff29KF0_zMjzvDDyMnSMkiKCuV683y4QDpglPBaRKHrAZSpnHHHNxGDKQjBVHdcxOhmEDgCSBZuxyWZnW2NZFxZvpt823i57XT0Vc_BjvorKr3k_ZUW3awZ3t55yt725Xi4e4fLl_XBRlXHEltrGRVCtHiArQqrxWkJnMKqgNAFmb1Si4EDaskHOilAsJHCRSikiVtGLOrqa7ve--dm7Y6k2385_hpeaYhYYipEDhRFW-Gwbvat375sP4X42gRxN6NKFHE3pvInQupk7jnPvns4xyKZT4A9Q2VVw</recordid><startdate>20170601</startdate><enddate>20170601</enddate><creator>Mingzhe Zhang</creator><creator>Haibo Chen</creator><creator>Luwei Cheng</creator><creator>Lau, Francis C. M.</creator><creator>Cho-Li Wang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20170601</creationdate><title>Scalable Adaptive NUMA-Aware Lock</title><author>Mingzhe Zhang ; Haibo Chen ; Luwei Cheng ; Lau, Francis C. M. ; Cho-Li Wang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-a54f9e411901b98f907a7b90fa004bb7f13233b7b91224462350205146114c5b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>adaptive synchronization</topic><topic>Delegation lock</topic><topic>Hardware</topic><topic>Instruction sets</topic><topic>Locking</topic><topic>Locks</topic><topic>Message systems</topic><topic>Servers</topic><topic>Switches</topic><topic>Synchronization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mingzhe Zhang</creatorcontrib><creatorcontrib>Haibo Chen</creatorcontrib><creatorcontrib>Luwei Cheng</creatorcontrib><creatorcontrib>Lau, Francis C. 
M.</creatorcontrib><creatorcontrib>Cho-Li Wang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mingzhe Zhang</au><au>Haibo Chen</au><au>Luwei Cheng</au><au>Lau, Francis C. M.</au><au>Cho-Li Wang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Adaptive NUMA-Aware Lock</atitle><jtitle>IEEE transactions on parallel and distributed systems</jtitle><stitle>TPDS</stitle><date>2017-06-01</date><risdate>2017</risdate><volume>28</volume><issue>6</issue><spage>1754</spage><epage>1769</epage><pages>1754-1769</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. 
Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2016.2630695</doi><tpages>16</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1045-9219
ispartof	IEEE transactions on parallel and distributed systems, 2017-06, Vol.28 (6), p.1754-1769
issn	1045-9219 1558-2183
language	eng
recordid	cdi_crossref_primary_10_1109_TPDS_2016_2630695
source	IEEE Electronic Library (IEL)
subjects	adaptive synchronization ; Delegation lock ; Hardware ; Instruction sets ; Locking ; Locks ; Message systems ; Servers ; Switches ; Synchronization
title	Scalable Adaptive NUMA-Aware Lock
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T01%3A41%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Adaptive%20NUMA-Aware%20Lock&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Mingzhe%20Zhang&rft.date=2017-06-01&rft.volume=28&rft.issue=6&rft.spage=1754&rft.epage=1769&rft.pages=1754-1769&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2016.2630695&rft_dat=%3Cproquest_RIE%3E2174469414%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174469414&rft_id=info:pmid/&rft_ieee_id=7748539&rfr_iscdi=true