Attribute value reordering for efficient hybrid OLAP

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 ×...
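
The full abstract in this record states that, when dimensions are nearly statistically independent, dimension-wise attribute frequency sorting is an optimal normalization. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the cube is given as a list of coordinate tuples for its allocated (non-empty) cells, and that values are ranked most-frequent-first. The function name `frequency_sort_normalization`, the input representation, and the tie-breaking rule are all illustrative assumptions.

```python
from collections import Counter


def frequency_sort_normalization(cells, d):
    """Reorder attribute values of each dimension by frequency.

    cells: iterable of d-tuples, one per allocated (non-empty) cell.
    Returns (mappings, reordered), where mappings[k] maps each attribute
    value seen in dimension k to its new rank, and reordered holds the
    cells rewritten with those ranks.
    Assumption: most frequent value gets rank 0; ties are broken by value.
    """
    cells = list(cells)
    mappings = []
    for dim in range(d):
        # Count how many allocated cells use each value of this dimension.
        counts = Counter(cell[dim] for cell in cells)
        # Descending frequency, then ascending value for determinism.
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        mappings.append({value: rank for rank, value in enumerate(ordered)})
    reordered = [tuple(mappings[k][cell[k]] for k in range(d)) for cell in cells]
    return mappings, reordered


# Hypothetical 2-dimensional example with five allocated cells.
cells = [("a", "x"), ("b", "x"), ("b", "y"), ("b", "z"), ("c", "x")]
mappings, reordered = frequency_sort_normalization(cells, 2)
print(mappings[0])  # {'b': 0, 'a': 1, 'c': 2}
print(reordered)    # [(1, 0), (0, 0), (0, 1), (0, 2), (2, 0)]
```

Attribute values that appear in no allocated cell get no rank in this sketch; a fuller version would append them after the ranked values. Each dimension is handled independently, so the cost is dominated by one counting pass and one sort per dimension.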

Bibliographic details
Published in: Information sciences 2006-08, Vol.176 (16), p.2304-2336
Main authors: Kaser, Owen; Lemire, Daniel
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2336
container_issue 16
container_start_page 2304
container_title Information sciences
container_volume 176
creator Kaser, Owen
Lemire, Daniel
description The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 × 3 chunks, although we find an exact algorithm for 1 × 2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(dn log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19–30% more efficient than ROLAP, but normalization can improve it further by 9–13% for a total gain of 29–44% over ROLAP.
doi_str_mv 10.1016/j.ins.2005.09.005
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_28126941</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S002002550500280X</els_id><sourcerecordid>28126941</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-678632cff6678ff55f85db21553643f6ccf1d94ef538bdf6feacf0aa495a204c3</originalsourceid><addsrcrecordid>eNp9kD1PwzAURS0EEqXwA9gysSU8O7GbiKmq-JIqlQFmy7HfA1dpUmynEv-eVGVmum-450n3MHbLoeDA1f228H0sBIAsoCmmOGMzXi9ErkTDz9kMQEAOQspLdhXjFgCqhVIzVi1TCr4dE2YH042YBRyCw-D7z4yGkCGRtx77lH39tMG7bLNevl2zCzJdxJu_nLOPp8f31Uu-3jy_rpbr3JaiTrla1KoUlkhNF5GUVEvXCi5lqaqSlLXEXVMhybJuHSlCYwmMqRppBFS2nLO70999GL5HjEnvfLTYdabHYYxa1FyopuJTkZ-KNgwxBiS9D35nwo_moI9-9FZPfvTRj4ZGTzExDycGpwUHj0HH41CLzge0SbvB_0P_AncnbZA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>28126941</pqid></control><display><type>article</type><title>Attribute value reordering for efficient hybrid OLAP</title><source>Elsevier ScienceDirect Journals</source><creator>Kaser, Owen ; Lemire, Daniel</creator><creatorcontrib>Kaser, Owen ; Lemire, Daniel</creatorcontrib><description>The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 × 3 chunks, although we find an exact algorithm for 1 × 2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O( dn log( n)) for data cubes of size n d . When dimensions are not independent, we propose and evaluate a several heuristics. 
The hybrid OLAP (HOLAP) storage mechanism is already 19–30% more efficient than ROLAP, but normalization can improve it further by 9–13% for a total gain of 29–44% over ROLAP.</description><identifier>ISSN: 0020-0255</identifier><identifier>EISSN: 1872-6291</identifier><identifier>DOI: 10.1016/j.ins.2005.09.005</identifier><language>eng</language><publisher>Elsevier Inc</publisher><subject>Chunking ; Data cubes ; MOLAP ; Multidimensional binary arrays ; Normalization</subject><ispartof>Information sciences, 2006-08, Vol.176 (16), p.2304-2336</ispartof><rights>2005 Elsevier Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-678632cff6678ff55f85db21553643f6ccf1d94ef538bdf6feacf0aa495a204c3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S002002550500280X$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3536,27903,27904,65309</link.rule.ids></links><search><creatorcontrib>Kaser, Owen</creatorcontrib><creatorcontrib>Lemire, Daniel</creatorcontrib><title>Attribute value reordering for efficient hybrid OLAP</title><title>Information sciences</title><description>The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 × 3 chunks, although we find an exact algorithm for 1 × 2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O( dn log( n)) for data cubes of size n d . 
When dimensions are not independent, we propose and evaluate a several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19–30% more efficient than ROLAP, but normalization can improve it further by 9–13% for a total gain of 29–44% over ROLAP.</description><subject>Chunking</subject><subject>Data cubes</subject><subject>MOLAP</subject><subject>Multidimensional binary arrays</subject><subject>Normalization</subject><issn>0020-0255</issn><issn>1872-6291</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><recordid>eNp9kD1PwzAURS0EEqXwA9gysSU8O7GbiKmq-JIqlQFmy7HfA1dpUmynEv-eVGVmum-450n3MHbLoeDA1f228H0sBIAsoCmmOGMzXi9ErkTDz9kMQEAOQspLdhXjFgCqhVIzVi1TCr4dE2YH042YBRyCw-D7z4yGkCGRtx77lH39tMG7bLNevl2zCzJdxJu_nLOPp8f31Uu-3jy_rpbr3JaiTrla1KoUlkhNF5GUVEvXCi5lqaqSlLXEXVMhybJuHSlCYwmMqRppBFS2nLO70999GL5HjEnvfLTYdabHYYxa1FyopuJTkZ-KNgwxBiS9D35nwo_moI9-9FZPfvTRj4ZGTzExDycGpwUHj0HH41CLzge0SbvB_0P_AncnbZA</recordid><startdate>20060822</startdate><enddate>20060822</enddate><creator>Kaser, Owen</creator><creator>Lemire, Daniel</creator><general>Elsevier Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20060822</creationdate><title>Attribute value reordering for efficient hybrid OLAP</title><author>Kaser, Owen ; Lemire, Daniel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-678632cff6678ff55f85db21553643f6ccf1d94ef538bdf6feacf0aa495a204c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Chunking</topic><topic>Data cubes</topic><topic>MOLAP</topic><topic>Multidimensional binary arrays</topic><topic>Normalization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kaser, 
Owen</creatorcontrib><creatorcontrib>Lemire, Daniel</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Information sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kaser, Owen</au><au>Lemire, Daniel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Attribute value reordering for efficient hybrid OLAP</atitle><jtitle>Information sciences</jtitle><date>2006-08-22</date><risdate>2006</risdate><volume>176</volume><issue>16</issue><spage>2304</spage><epage>2336</epage><pages>2304-2336</pages><issn>0020-0255</issn><eissn>1872-6291</eissn><abstract>The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 × 3 chunks, although we find an exact algorithm for 1 × 2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O( dn log( n)) for data cubes of size n d . When dimensions are not independent, we propose and evaluate a several heuristics. 
The hybrid OLAP (HOLAP) storage mechanism is already 19–30% more efficient than ROLAP, but normalization can improve it further by 9–13% for a total gain of 29–44% over ROLAP.</abstract><pub>Elsevier Inc</pub><doi>10.1016/j.ins.2005.09.005</doi><tpages>33</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0020-0255
ispartof Information sciences, 2006-08, Vol.176 (16), p.2304-2336
issn 0020-0255
1872-6291
language eng
recordid cdi_proquest_miscellaneous_28126941
source Elsevier ScienceDirect Journals
subjects Chunking
Data cubes
MOLAP
Multidimensional binary arrays
Normalization
title Attribute value reordering for efficient hybrid OLAP
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T10%3A32%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Attribute%20value%20reordering%20for%20efficient%20hybrid%20OLAP&rft.jtitle=Information%20sciences&rft.au=Kaser,%20Owen&rft.date=2006-08-22&rft.volume=176&rft.issue=16&rft.spage=2304&rft.epage=2336&rft.pages=2304-2336&rft.issn=0020-0255&rft.eissn=1872-6291&rft_id=info:doi/10.1016/j.ins.2005.09.005&rft_dat=%3Cproquest_cross%3E28126941%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=28126941&rft_id=info:pmid/&rft_els_id=S002002550500280X&rfr_iscdi=true