A Fast Learned Key-Value Store for Concurrent and Distributed Systems

Efficient key-value (KV) store becomes important for concurrent and distributed systems to deliver high performance. The promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited sc...

Bibliographic details
Published in:	IEEE transactions on knowledge and data engineering 2024-06, Vol.36 (6), p.2301-2315
Main authors:	Li, Pengfei ; Hua, Yu ; Jia, Jingnan ; Zuo, Pengfei
Format:	Article
Language:	eng
Subjects:	computer architecture ; Computer networks ; Computers and information processing ; Concurrent computing ; Consistency ; Data models ; Data storage ; Data structures ; distributed computing ; Distributed databases ; Indexes ; Performance enhancement ; Performance indices ; Predictive models ; Scalability ; Training
Online access:	Order full text

container_end_page	2315
container_issue	6
container_start_page	2301
container_title	IEEE transactions on knowledge and data engineering
container_volume	36
creator	Li, Pengfei ; Hua, Yu ; Jia, Jingnan ; Zuo, Pengfei
description	Efficient key-value (KV) store becomes important for concurrent and distributed systems to deliver high performance. The promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited scalability in concurrent systems due to containing high dependency among data. The practical system performance decreases when inserting a large amount of new data due to triggering frequent and inefficient retraining operations. Moreover, existing learned indexes become inefficient in distributed systems, since different machines incur high overheads to guarantee the data consistency when the index structures dynamically change. To address these problems in concurrent and distributed systems, we propose a fine-grained learned index scheme with high scalability, called FineStore, which constructs independent models with a flattened data structure under the trained data array to concurrently process the requests with low overheads. FineStore processes the new requests in-place with the support of non-blocking retraining, hence adapting to the new distributions without blocking the systems. In the distributed systems, different machines efficiently leverage the extended RCU barrier to guarantee the data consistency. We evaluate FineStore via YCSB and real-world datasets, and extensive experimental results demonstrate that FineStore improves the performance respectively by up to 1.8× and 2.5× than state-of-the-art XIndex and Masstree. We have released the open-source codes of FineStore for public use in GitHub.
doi_str_mv	10.1109/TKDE.2023.3327009
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_3044655081</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10292900</ieee_id><sourcerecordid>3044655081</sourcerecordid><originalsourceid>FETCH-LOGICAL-c246t-b139efaa4152b914c2e85202c141747dd275037bd149657b426402f9b1556f383</originalsourceid><addsrcrecordid>eNpNkLtOwzAUhi0EEqXwAEgMlphTzvEljseqF0CtxNDCajmJI6Vqk2I7Q98eV-3AdM7w_efyEfKMMEEE_bZdzRcTBoxPOGcKQN-QEUpZZAw13qYeBGaCC3VPHkLYAUChChyRxZQubYh07azvXE1X7pT92P3g6Cb23tGm93TWd9XgvesitV1N522Ivi2HmPDNKUR3CI_krrH74J6udUy-l4vt7CNbf71_zqbrrGIij1mJXLvGWoGSlRpFxVwh09EVClRC1TVTErgqaxQ6l6oULBfAGl2mT_KGF3xMXi9zj77_HVyIZtcPvksrDQchcimhwEThhap8H4J3jTn69mD9ySCYsy1ztmXOtszVVsq8XDKtc-4fzzTTAPwPhIpjSw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3044655081</pqid></control><display><type>article</type><title>A Fast Learned Key-Value Store for Concurrent and Distributed Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Li, Pengfei ; Hua, Yu ; Jia, Jingnan ; Zuo, Pengfei</creator><creatorcontrib>Li, Pengfei ; Hua, Yu ; Jia, Jingnan ; Zuo, Pengfei</creatorcontrib><description>Efficient key-value (KV) store becomes important for concurrent and distributed systems to deliver high performance. The promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited scalability in concurrent systems due to containing high dependency among data. The practical system performance decreases when inserting a large amount of new data due to triggering frequent and inefficient retraining operations. Moreover, existing learned indexes become inefficient in distributed systems, since different machines incur high overheads to guarantee the data consistency when the index structures dynamically change. 
To address these problems in concurrent and distributed systems, we propose a fine-grained learned index scheme with high scalability, called FineStore, which constructs independent models with a flattened data structure under the trained data array to concurrently process the requests with low overheads. FineStore processes the new requests in-place with the support of non-blocking retraining, hence adapting to the new distributions without blocking the systems. In the distributed systems, different machines efficiently leverage the extended RCU barrier to guarantee the data consistency. We evaluate FineStore via YCSB and real-world datasets, and extensive experimental results demonstrate that FineStore improves the performance respectively by up to 1.8× and 2.5× than state-of-the-art XIndex and Masstree. We have released the open-source codes of FineStore for public use in GitHub.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2023.3327009</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>computer architecture ; Computer networks ; Computers and information processing ; Concurrent computing ; Consistency ; Data models ; Data storage ; Data structures ; distributed computing ; Distributed databases ; Indexes ; Performance enhancement ; Performance indices ; Predictive models ; Scalability ; Training</subject><ispartof>IEEE transactions on knowledge and data engineering, 2024-06, Vol.36 (6), p.2301-2315</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c246t-b139efaa4152b914c2e85202c141747dd275037bd149657b426402f9b1556f383</cites><orcidid>0000-0001-6793-0964 ; 0009-0003-8633-2603 ; 0000-0001-7730-3796 ; 0000-0001-9982-5130</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10292900$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10292900$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Pengfei</creatorcontrib><creatorcontrib>Hua, Yu</creatorcontrib><creatorcontrib>Jia, Jingnan</creatorcontrib><creatorcontrib>Zuo, Pengfei</creatorcontrib><title>A Fast Learned Key-Value Store for Concurrent and Distributed Systems</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Efficient key-value (KV) store becomes important for concurrent and distributed systems to deliver high performance. The promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited scalability in concurrent systems due to containing high dependency among data. The practical system performance decreases when inserting a large amount of new data due to triggering frequent and inefficient retraining operations. Moreover, existing learned indexes become inefficient in distributed systems, since different machines incur high overheads to guarantee the data consistency when the index structures dynamically change. 
To address these problems in concurrent and distributed systems, we propose a fine-grained learned index scheme with high scalability, called FineStore, which constructs independent models with a flattened data structure under the trained data array to concurrently process the requests with low overheads. FineStore processes the new requests in-place with the support of non-blocking retraining, hence adapting to the new distributions without blocking the systems. In the distributed systems, different machines efficiently leverage the extended RCU barrier to guarantee the data consistency. We evaluate FineStore via YCSB and real-world datasets, and extensive experimental results demonstrate that FineStore improves the performance respectively by up to 1.8× and 2.5× than state-of-the-art XIndex and Masstree. We have released the open-source codes of FineStore for public use in GitHub.</description><subject>computer architecture</subject><subject>Computer networks</subject><subject>Computers and information processing</subject><subject>Concurrent computing</subject><subject>Consistency</subject><subject>Data models</subject><subject>Data storage</subject><subject>Data structures</subject><subject>distributed computing</subject><subject>Distributed databases</subject><subject>Indexes</subject><subject>Performance enhancement</subject><subject>Performance indices</subject><subject>Predictive 
models</subject><subject>Scalability</subject><subject>Training</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkLtOwzAUhi0EEqXwAEgMlphTzvEljseqF0CtxNDCajmJI6Vqk2I7Q98eV-3AdM7w_efyEfKMMEEE_bZdzRcTBoxPOGcKQN-QEUpZZAw13qYeBGaCC3VPHkLYAUChChyRxZQubYh07azvXE1X7pT92P3g6Cb23tGm93TWd9XgvesitV1N522Ivi2HmPDNKUR3CI_krrH74J6udUy-l4vt7CNbf71_zqbrrGIij1mJXLvGWoGSlRpFxVwh09EVClRC1TVTErgqaxQ6l6oULBfAGl2mT_KGF3xMXi9zj77_HVyIZtcPvksrDQchcimhwEThhap8H4J3jTn69mD9ySCYsy1ztmXOtszVVsq8XDKtc-4fzzTTAPwPhIpjSw</recordid><startdate>20240601</startdate><enddate>20240601</enddate><creator>Li, Pengfei</creator><creator>Hua, Yu</creator><creator>Jia, Jingnan</creator><creator>Zuo, Pengfei</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-6793-0964</orcidid><orcidid>https://orcid.org/0009-0003-8633-2603</orcidid><orcidid>https://orcid.org/0000-0001-7730-3796</orcidid><orcidid>https://orcid.org/0000-0001-9982-5130</orcidid></search><sort><creationdate>20240601</creationdate><title>A Fast Learned Key-Value Store for Concurrent and Distributed Systems</title><author>Li, Pengfei ; Hua, Yu ; Jia, Jingnan ; Zuo, Pengfei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c246t-b139efaa4152b914c2e85202c141747dd275037bd149657b426402f9b1556f383</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>computer architecture</topic><topic>Computer networks</topic><topic>Computers and information 
processing</topic><topic>Concurrent computing</topic><topic>Consistency</topic><topic>Data models</topic><topic>Data storage</topic><topic>Data structures</topic><topic>distributed computing</topic><topic>Distributed databases</topic><topic>Indexes</topic><topic>Performance enhancement</topic><topic>Performance indices</topic><topic>Predictive models</topic><topic>Scalability</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Pengfei</creatorcontrib><creatorcontrib>Hua, Yu</creatorcontrib><creatorcontrib>Jia, Jingnan</creatorcontrib><creatorcontrib>Zuo, Pengfei</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Pengfei</au><au>Hua, Yu</au><au>Jia, Jingnan</au><au>Zuo, Pengfei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Fast Learned Key-Value Store for Concurrent and Distributed Systems</atitle><jtitle>IEEE transactions on knowledge and data 
engineering</jtitle><stitle>TKDE</stitle><date>2024-06-01</date><risdate>2024</risdate><volume>36</volume><issue>6</issue><spage>2301</spage><epage>2315</epage><pages>2301-2315</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Efficient key-value (KV) store becomes important for concurrent and distributed systems to deliver high performance. The promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited scalability in concurrent systems due to containing high dependency among data. The practical system performance decreases when inserting a large amount of new data due to triggering frequent and inefficient retraining operations. Moreover, existing learned indexes become inefficient in distributed systems, since different machines incur high overheads to guarantee the data consistency when the index structures dynamically change. To address these problems in concurrent and distributed systems, we propose a fine-grained learned index scheme with high scalability, called FineStore, which constructs independent models with a flattened data structure under the trained data array to concurrently process the requests with low overheads. FineStore processes the new requests in-place with the support of non-blocking retraining, hence adapting to the new distributions without blocking the systems. In the distributed systems, different machines efficiently leverage the extended RCU barrier to guarantee the data consistency. We evaluate FineStore via YCSB and real-world datasets, and extensive experimental results demonstrate that FineStore improves the performance respectively by up to 1.8× and 2.5× than state-of-the-art XIndex and Masstree. 
We have released the open-source codes of FineStore for public use in GitHub.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2023.3327009</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-6793-0964</orcidid><orcidid>https://orcid.org/0009-0003-8633-2603</orcidid><orcidid>https://orcid.org/0000-0001-7730-3796</orcidid><orcidid>https://orcid.org/0000-0001-9982-5130</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1041-4347
ispartof	IEEE transactions on knowledge and data engineering, 2024-06, Vol.36 (6), p.2301-2315
issn	1041-4347 ; 1558-2191
language	eng
recordid	cdi_proquest_journals_3044655081
source	IEEE Electronic Library (IEL)
subjects	computer architecture ; Computer networks ; Computers and information processing ; Concurrent computing ; Consistency ; Data models ; Data storage ; Data structures ; distributed computing ; Distributed databases ; Indexes ; Performance enhancement ; Performance indices ; Predictive models ; Scalability ; Training
title	A Fast Learned Key-Value Store for Concurrent and Distributed Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T07%3A57%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Fast%20Learned%20Key-Value%20Store%20for%20Concurrent%20and%20Distributed%20Systems&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Li,%20Pengfei&rft.date=2024-06-01&rft.volume=36&rft.issue=6&rft.spage=2301&rft.epage=2315&rft.pages=2301-2315&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2023.3327009&rft_dat=%3Cproquest_RIE%3E3044655081%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3044655081&rft_id=info:pmid/&rft_ieee_id=10292900&rfr_iscdi=true
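
Illustrative note on the description field: the abstract refers to learned indexes, in which a model predicts where a key sits in a sorted array. The following is a minimal Python sketch of that general idea only, assuming a single least-squares linear model and distinct, non-empty numeric keys. The class name `LinearLearnedIndex` and its methods are hypothetical and are not taken from FineStore, XIndex, or Masstree. The sketch does not cover concurrency, retraining, or the RCU barrier.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index over a sorted key array (illustrative, not FineStore)."""

    def __init__(self, keys):
        # Assumption: keys is a non-empty collection of distinct numbers.
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~= slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Record the worst prediction error so lookups can bound their search.
        self.max_err = max(
            abs(i - self._predict(k)) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # Predicted array position for a key, rounded to the nearest integer.
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the array position of key, or None if it is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the search window to the array and keep lo <= hi.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search only inside the error-bounded window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 57, 91])
position = index.lookup(23)   # position of a stored key
missing = index.lookup(4)     # None, because 4 is not stored
```

Each stored key lies within `max_err` positions of its prediction, so the search window always contains it. Keeping that window small is what lets a learned index avoid a full binary search.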