Relational Memory: Native In-Memory Accesses on Rows and Columns

Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at runtime is expensive, hence, many analytica...

Bibliographic details
Main authors: Roozkhosh, Shahin; Hoornaert, Denis; Mun, Ju Hyoung; Papon, Tarikul Islam; Sanaullah, Ahmed; Drepper, Ulrich; Mancuso, Renato; Athanassoulis, Manos
Format: Article
Language: English
Subjects: Computer Science - Databases; Computer Science - Hardware Architecture
creator Roozkhosh, Shahin
Hoornaert, Denis
Mun, Ju Hyoung
Papon, Tarikul Islam
Sanaullah, Ahmed
Drepper, Ulrich
Mancuso, Renato
Athanassoulis, Manos
description Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at runtime is expensive, hence, many analytical systems ingest data in row-first form and transform it in the background to columns to facilitate future analytical queries. How will this design change if we can always efficiently access only the desired set of columns? To address this question, we present a radically new approach to data transformation from rows to columns. We build upon recent advancements in embedded platforms with re-programmable logic to design native in-memory access on rows and columns. Our approach, termed Relational Memory, relies on an FPGA-based accelerator that sits between the CPU and main memory and transparently transforms base data to any group of columns with minimal overhead at runtime. This design allows accessing any group of columns as if it already exists in memory. We implement and deploy Relational Memory in real hardware, and we show that we can access the desired columns up to 1.63x faster than accessing them from their row-wise counterpart, while matching the performance of a pure columnar access for low projectivity, and outperforming it by up to 1.87x as projectivity (and tuple re-construction cost) increases. Moreover, our approach can be easily extended to support offloading of a number of operations to hardware, e.g., selection, group by, aggregation, and joins, having the potential to vastly simplify the software logic and accelerate the query execution.
doi_str_mv 10.48550/arxiv.2109.14349
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2109_14349</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2109_14349</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-edfd7601dc1605452df706ff1ac0d4502ceef32020b0bea42a7f103b3e2096153</originalsourceid><addsrcrecordid>eNotz81Kw0AUhuHZdCHVC3DVuYHEM79pXLUErYWqULoPJzNnIJDMSEarvXtr6-qDd_HBw9i9gFIvjYEHnH76YykF1KXQStc3bLWnAT_7FHHgrzSm6fTI387hSHwbi2vha-coZ8o8Rb5P35lj9LxJw9cY8y2bBRwy3f3vnB2enw7NS7F732yb9a5AW9UF-eArC8I7YcFoI32owIYg0IHXBqQjCkqChA46Qi2xCgJUp0hCbYVRc7a43l4I7cfUjzid2j9Ke6GoX7YoQxc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Relational Memory: Native In-Memory Accesses on Rows and Columns</title><source>arXiv.org</source><creator>Roozkhosh, Shahin ; Hoornaert, Denis ; Mun, Ju Hyoung ; Papon, Tarikul Islam ; Sanaullah, Ahmed ; Drepper, Ulrich ; Mancuso, Renato ; Athanassoulis, Manos</creator><creatorcontrib>Roozkhosh, Shahin ; Hoornaert, Denis ; Mun, Ju Hyoung ; Papon, Tarikul Islam ; Sanaullah, Ahmed ; Drepper, Ulrich ; Mancuso, Renato ; Athanassoulis, Manos</creatorcontrib><description>Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at runtime is expensive, hence, many analytical systems ingest data in row-first form and transform it in the background to columns to facilitate future analytical queries. How will this design change if we can always efficiently access only the desired set of columns? To address this question, we present a radically new approach to data transformation from rows to columns. 
We build upon recent advancements in embedded platforms with re-programmable logic to design native in-memory access on rows and columns. Our approach, termed Relational Memory, relies on an FPGA- based accelerator that sits between the CPU and main memory and transparently transforms base data to any group of columns with minimal overhead at runtime. This design allows accessing any group of columns as if it already exists in memory. We implement and deploy Relational Memory in real hardware, and we show that we can access the desired columns up to 1.63x faster than accessing them from their row-wise counterpart, while matching the performance of a pure columnar access for low projectivity, and outperforming it by up to 1.87x as projectivity (and tuple re-construction cost) increases. Moreover, our approach can be easily extended to support offloading of a number of operations to hardware, e.g., selection, group by, aggregation, and joins, having the potential to vastly simplify the software logic and accelerate the query execution.</description><identifier>DOI: 10.48550/arxiv.2109.14349</identifier><language>eng</language><subject>Computer Science - Databases ; Computer Science - Hardware Architecture</subject><creationdate>2021-09</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2109.14349$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2109.14349$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Roozkhosh, Shahin</creatorcontrib><creatorcontrib>Hoornaert, 
Denis</creatorcontrib><creatorcontrib>Mun, Ju Hyoung</creatorcontrib><creatorcontrib>Papon, Tarikul Islam</creatorcontrib><creatorcontrib>Sanaullah, Ahmed</creatorcontrib><creatorcontrib>Drepper, Ulrich</creatorcontrib><creatorcontrib>Mancuso, Renato</creatorcontrib><creatorcontrib>Athanassoulis, Manos</creatorcontrib><title>Relational Memory: Native In-Memory Accesses on Rows and Columns</title><description>Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at runtime is expensive, hence, many analytical systems ingest data in row-first form and transform it in the background to columns to facilitate future analytical queries. How will this design change if we can always efficiently access only the desired set of columns? To address this question, we present a radically new approach to data transformation from rows to columns. We build upon recent advancements in embedded platforms with re-programmable logic to design native in-memory access on rows and columns. Our approach, termed Relational Memory, relies on an FPGA- based accelerator that sits between the CPU and main memory and transparently transforms base data to any group of columns with minimal overhead at runtime. This design allows accessing any group of columns as if it already exists in memory. We implement and deploy Relational Memory in real hardware, and we show that we can access the desired columns up to 1.63x faster than accessing them from their row-wise counterpart, while matching the performance of a pure columnar access for low projectivity, and outperforming it by up to 1.87x as projectivity (and tuple re-construction cost) increases. 
Moreover, our approach can be easily extended to support offloading of a number of operations to hardware, e.g., selection, group by, aggregation, and joins, having the potential to vastly simplify the software logic and accelerate the query execution.</description><subject>Computer Science - Databases</subject><subject>Computer Science - Hardware Architecture</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz81Kw0AUhuHZdCHVC3DVuYHEM79pXLUErYWqULoPJzNnIJDMSEarvXtr6-qDd_HBw9i9gFIvjYEHnH76YykF1KXQStc3bLWnAT_7FHHgrzSm6fTI387hSHwbi2vha-coZ8o8Rb5P35lj9LxJw9cY8y2bBRwy3f3vnB2enw7NS7F732yb9a5AW9UF-eArC8I7YcFoI32owIYg0IHXBqQjCkqChA46Qi2xCgJUp0hCbYVRc7a43l4I7cfUjzid2j9Ke6GoX7YoQxc</recordid><startdate>20210929</startdate><enddate>20210929</enddate><creator>Roozkhosh, Shahin</creator><creator>Hoornaert, Denis</creator><creator>Mun, Ju Hyoung</creator><creator>Papon, Tarikul Islam</creator><creator>Sanaullah, Ahmed</creator><creator>Drepper, Ulrich</creator><creator>Mancuso, Renato</creator><creator>Athanassoulis, Manos</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20210929</creationdate><title>Relational Memory: Native In-Memory Accesses on Rows and Columns</title><author>Roozkhosh, Shahin ; Hoornaert, Denis ; Mun, Ju Hyoung ; Papon, Tarikul Islam ; Sanaullah, Ahmed ; Drepper, Ulrich ; Mancuso, Renato ; Athanassoulis, Manos</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-edfd7601dc1605452df706ff1ac0d4502ceef32020b0bea42a7f103b3e2096153</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer Science - Databases</topic><topic>Computer Science - Hardware Architecture</topic><toplevel>online_resources</toplevel><creatorcontrib>Roozkhosh, Shahin</creatorcontrib><creatorcontrib>Hoornaert, 
Denis</creatorcontrib><creatorcontrib>Mun, Ju Hyoung</creatorcontrib><creatorcontrib>Papon, Tarikul Islam</creatorcontrib><creatorcontrib>Sanaullah, Ahmed</creatorcontrib><creatorcontrib>Drepper, Ulrich</creatorcontrib><creatorcontrib>Mancuso, Renato</creatorcontrib><creatorcontrib>Athanassoulis, Manos</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Roozkhosh, Shahin</au><au>Hoornaert, Denis</au><au>Mun, Ju Hyoung</au><au>Papon, Tarikul Islam</au><au>Sanaullah, Ahmed</au><au>Drepper, Ulrich</au><au>Mancuso, Renato</au><au>Athanassoulis, Manos</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Relational Memory: Native In-Memory Accesses on Rows and Columns</atitle><date>2021-09-29</date><risdate>2021</risdate><abstract>Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at runtime is expensive, hence, many analytical systems ingest data in row-first form and transform it in the background to columns to facilitate future analytical queries. How will this design change if we can always efficiently access only the desired set of columns? To address this question, we present a radically new approach to data transformation from rows to columns. We build upon recent advancements in embedded platforms with re-programmable logic to design native in-memory access on rows and columns. Our approach, termed Relational Memory, relies on an FPGA- based accelerator that sits between the CPU and main memory and transparently transforms base data to any group of columns with minimal overhead at runtime. 
This design allows accessing any group of columns as if it already exists in memory. We implement and deploy Relational Memory in real hardware, and we show that we can access the desired columns up to 1.63x faster than accessing them from their row-wise counterpart, while matching the performance of a pure columnar access for low projectivity, and outperforming it by up to 1.87x as projectivity (and tuple re-construction cost) increases. Moreover, our approach can be easily extended to support offloading of a number of operations to hardware, e.g., selection, group by, aggregation, and joins, having the potential to vastly simplify the software logic and accelerate the query execution.</abstract><doi>10.48550/arxiv.2109.14349</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2109.14349
language eng
recordid cdi_arxiv_primary_2109_14349
source arXiv.org
subjects Computer Science - Databases
Computer Science - Hardware Architecture
title Relational Memory: Native In-Memory Accesses on Rows and Columns
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T08%3A22%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Relational%20Memory:%20Native%20In-Memory%20Accesses%20on%20Rows%20and%20Columns&rft.au=Roozkhosh,%20Shahin&rft.date=2021-09-29&rft_id=info:doi/10.48550/arxiv.2109.14349&rft_dat=%3Carxiv_GOX%3E2109_14349%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true