A Scalable Learned Index Scheme in Storage Systems

Index structures are important for efficient data access and have been widely used to improve performance in many in-memory systems. Because of their high in-memory overheads, traditional index structures struggle to keep up with the explosive growth of data, let alone provide low latency and high...

Bibliographic details
Main authors: Li, Pengfei; Hua, Yu; Zuo, Pengfei; Jia, Jingnan
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Li, Pengfei
Hua, Yu
Zuo, Pengfei
Jia, Jingnan
description Index structures are important for efficient data access and have been widely used to improve performance in many in-memory systems. Because of their high in-memory overheads, traditional index structures struggle to keep up with the explosive growth of data, let alone provide low latency and high throughput with limited system resources. Learned indexes are a promising alternative: they leverage deep-learning models to complement existing index structures and obtain significant memory savings. However, learned indexes fail to scale because of heavy inter-model dependency and expensive retraining. To address these problems, we propose AIDEL, a scalable learned index scheme that constructs different linear regression models according to the data distribution. The models are independent of one another, which reduces the complexity of retraining and makes it easy to partition the data and store it in different pages, blocks or distributed systems. Our experimental results show that, compared with state-of-the-art schemes, AIDEL improves insertion performance by about 2$\times$ and provides comparable lookup performance, while efficiently supporting scalability.
doi_str_mv 10.48550/arxiv.1905.06256
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1905_06256</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1905_06256</sourcerecordid><originalsourceid>FETCH-LOGICAL-a676-69eec523ab3127940a5542e34cbd7e1dd76d17dbdf05aeee8f61ce9b409f17e33</originalsourceid><addsrcrecordid>eNotzr1uwjAYhWEvDBX0AjrVN5DU_8YjQv1BitQh7NHn-KRESkLloAruvkA7Hekdjh7GnqQozdpa8UL53P-UMghbCqese2Bqw-uWBooDeAXKExLfTQnnaz5gBO8nXp-Omb7A68t8wjiv2KKjYcbj_y7Z_u11v_0oqs_33XZTFeS8K1wAWqs0RS2VD0aQtUZBmzYmD5mSd0n6FFMnLAFYd062CNGI0EkPrZfs-e_2jm6-cz9SvjQ3fHPH619JIT5o</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A Scalable Learned Index Scheme in Storage Systems</title><source>arXiv.org</source><creator>Li, Pengfei ; Hua, Yu ; Zuo, Pengfei ; Jia, Jingnan</creator><creatorcontrib>Li, Pengfei ; Hua, Yu ; Zuo, Pengfei ; Jia, Jingnan</creatorcontrib><description>Index structures are important for efficient data access, which have been widely used to improve the performance in many in-memory systems. Due to high in-memory overheads, traditional index structures become difficult to process the explosive growth of data, let alone providing low latency and high throughput performance with limited system resources. The promising learned indexes leverage deep-learning models to complement existing index structures and obtain significant memory savings. However, the learned indexes fail to become scalable due to the heavy inter-model dependency and expensive retraining. To address these problems, we propose a scalable learned index scheme to construct different linear regression models according to the data distribution. Moreover, the used models are independent so as to reduce the complexity of retraining and become easy to partition and store the data into different pages, blocks or distributed systems. 
Our experimental results show that compared with state-of-the-art schemes, AIDEL improves the insertion performance by about 2$\times$ and provides comparable lookup performance, while efficiently supporting scalability.</description><identifier>DOI: 10.48550/arxiv.1905.06256</identifier><language>eng</language><subject>Computer Science - Databases ; Computer Science - Learning ; Statistics - Machine Learning</subject><creationdate>2019-05</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1905.06256$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1905.06256$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Li, Pengfei</creatorcontrib><creatorcontrib>Hua, Yu</creatorcontrib><creatorcontrib>Zuo, Pengfei</creatorcontrib><creatorcontrib>Jia, Jingnan</creatorcontrib><title>A Scalable Learned Index Scheme in Storage Systems</title><description>Index structures are important for efficient data access, which have been widely used to improve the performance in many in-memory systems. Due to high in-memory overheads, traditional index structures become difficult to process the explosive growth of data, let alone providing low latency and high throughput performance with limited system resources. The promising learned indexes leverage deep-learning models to complement existing index structures and obtain significant memory savings. However, the learned indexes fail to become scalable due to the heavy inter-model dependency and expensive retraining. 
To address these problems, we propose a scalable learned index scheme to construct different linear regression models according to the data distribution. Moreover, the used models are independent so as to reduce the complexity of retraining and become easy to partition and store the data into different pages, blocks or distributed systems. Our experimental results show that compared with state-of-the-art schemes, AIDEL improves the insertion performance by about 2$\times$ and provides comparable lookup performance, while efficiently supporting scalability.</description><subject>Computer Science - Databases</subject><subject>Computer Science - Learning</subject><subject>Statistics - Machine Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzr1uwjAYhWEvDBX0AjrVN5DU_8YjQv1BitQh7NHn-KRESkLloAruvkA7Hekdjh7GnqQozdpa8UL53P-UMghbCqese2Bqw-uWBooDeAXKExLfTQnnaz5gBO8nXp-Omb7A68t8wjiv2KKjYcbj_y7Z_u11v_0oqs_33XZTFeS8K1wAWqs0RS2VD0aQtUZBmzYmD5mSd0n6FFMnLAFYd062CNGI0EkPrZfs-e_2jm6-cz9SvjQ3fHPH619JIT5o</recordid><startdate>20190508</startdate><enddate>20190508</enddate><creator>Li, Pengfei</creator><creator>Hua, Yu</creator><creator>Zuo, Pengfei</creator><creator>Jia, Jingnan</creator><scope>AKY</scope><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20190508</creationdate><title>A Scalable Learned Index Scheme in Storage Systems</title><author>Li, Pengfei ; Hua, Yu ; Zuo, Pengfei ; Jia, Jingnan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a676-69eec523ab3127940a5542e34cbd7e1dd76d17dbdf05aeee8f61ce9b409f17e33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Computer Science - Databases</topic><topic>Computer Science - Learning</topic><topic>Statistics - Machine 
Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Li, Pengfei</creatorcontrib><creatorcontrib>Hua, Yu</creatorcontrib><creatorcontrib>Zuo, Pengfei</creatorcontrib><creatorcontrib>Jia, Jingnan</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Pengfei</au><au>Hua, Yu</au><au>Zuo, Pengfei</au><au>Jia, Jingnan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Scalable Learned Index Scheme in Storage Systems</atitle><date>2019-05-08</date><risdate>2019</risdate><abstract>Index structures are important for efficient data access, which have been widely used to improve the performance in many in-memory systems. Due to high in-memory overheads, traditional index structures become difficult to process the explosive growth of data, let alone providing low latency and high throughput performance with limited system resources. The promising learned indexes leverage deep-learning models to complement existing index structures and obtain significant memory savings. However, the learned indexes fail to become scalable due to the heavy inter-model dependency and expensive retraining. To address these problems, we propose a scalable learned index scheme to construct different linear regression models according to the data distribution. Moreover, the used models are independent so as to reduce the complexity of retraining and become easy to partition and store the data into different pages, blocks or distributed systems. 
Our experimental results show that compared with state-of-the-art schemes, AIDEL improves the insertion performance by about 2$\times$ and provides comparable lookup performance, while efficiently supporting scalability.</abstract><doi>10.48550/arxiv.1905.06256</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1905.06256
ispartof
issn
language eng
recordid cdi_arxiv_primary_1905_06256
source arXiv.org
subjects Computer Science - Databases
Computer Science - Learning
Statistics - Machine Learning
title A Scalable Learned Index Scheme in Storage Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T09%3A21%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Scalable%20Learned%20Index%20Scheme%20in%20Storage%20Systems&rft.au=Li,%20Pengfei&rft.date=2019-05-08&rft_id=info:doi/10.48550/arxiv.1905.06256&rft_dat=%3Carxiv_GOX%3E1905_06256%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
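
The description above mentions building independent linear regression models according to the data distribution. The following is a minimal sketch of that general idea, not the authors' AIDEL implementation, whose details are not given in this record. It assumes a sorted list of distinct numeric keys and swaps in a simple greedy segmentation: a segment grows while a least-squares line predicts every key's position within `max_err` slots. The names `fit_line`, `build_segments`, `lookup` and `max_err` are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def fit_line(keys, start, end):
    """Least-squares fit of position ~ slope * key + intercept on keys[start:end]."""
    n = end - start
    xs = keys[start:end]
    mean_x = sum(xs) / n
    mean_y = start + (n - 1) / 2  # mean of the positions start .. end-1
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:  # a single key: fall back to a constant model
        return 0.0, mean_y
    cov = sum((x - mean_x) * (start + i - mean_y) for i, x in enumerate(xs))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def max_error(keys, start, end, slope, intercept):
    """Largest gap between predicted and true position inside the segment."""
    return max(abs(slope * keys[i] + intercept - i) for i in range(start, end))


def build_segments(keys, max_err=4):
    """Greedy segmentation (an assumption of this sketch): extend a segment
    while the refitted line still stays within max_err for all of its keys."""
    segments = []  # tuples: (first_key, start, end, slope, intercept)
    start = 0
    while start < len(keys):
        end = start + 1
        slope, intercept = fit_line(keys, start, end)
        while end < len(keys):
            s, b = fit_line(keys, start, end + 1)
            if max_error(keys, start, end + 1, s, b) > max_err:
                break
            end += 1
            slope, intercept = s, b
        segments.append((keys[start], start, end, slope, intercept))
        start = end
    return segments


def lookup(keys, segments, key, max_err=4):
    """Return the position of key in keys, or None if it is absent."""
    # Find the last segment whose first_key <= key; the probe tuple sorts
    # after any segment that shares the same first_key.
    idx = bisect_right(segments, (key, float("inf"))) - 1
    if idx < 0:
        return None
    _, start, end, slope, intercept = segments[idx]
    guess = int(round(slope * key + intercept))
    # Search only a small window around the prediction, clamped to the
    # segment; one extra slot on each side absorbs rounding effects.
    lo = max(start, min(end, guess - max_err - 1))
    hi = max(lo, min(end, guess + max_err + 2))
    pos = bisect_left(keys, key, lo, hi)
    if pos < hi and keys[pos] == key:
        return pos
    return None
```

Because each segment carries its own model and covers a disjoint key range, refitting one segment in this sketch would not touch the others, which is the property the description attributes to independent models. The repeated refitting in `build_segments` is quadratic per segment and is kept only for readability.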