Optimized Data Loading for a Multi-Terabyte Sky Survey Repository
Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescope...
Main authors: | Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 42 |
---|---|
container_issue | |
container_start_page | 42 |
container_title | |
container_volume | |
creator | Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J. |
description | Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescopes pose enormous challenges to state-of-the-art database systems and data-loading techniques. In this paper we present SkyLoader, our novel framework for data loading that is being used to populate a multi-table, multi-terabyte database repository for the Palomar-Quest sky survey. SkyLoader consists of an efficient algorithm for bulk loading, an effective data structure to support data integrity, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques, with load time for a 40-gigabyte data set reduced from over 20 hours to less than 3 hours. Our framework offers a promising approach for loading other large and complex scientific databases. |
doi_str_mv | 10.1109/SC.2005.50 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>proquest_6IE</sourceid><recordid>TN_cdi_ieee_primary_1559994</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1559994</ieee_id><sourcerecordid>31486394</sourcerecordid><originalsourceid>FETCH-LOGICAL-a2890-294dc97f53a93020765fde7ea258d1934abe8cc98cf1624ede5c62938b339d1f3</originalsourceid><addsrcrecordid>eNqFkEtLw0AUhQdEUGs3bt0MCIKL1Hlk0txliU-oFExdD5PkRsYmnTiTCPHX21r3rs7i-zgcDiEXnM04Z3CbZzPBmJopdkTOuAIFkiVcnJBpCB-MMQ4JcBWfksWq621rv7Gid6Y3dOlMZbfvtHaeGvoyNL2N1uhNMfZI881I88F_4UhfsXPB9s6P5-S4Nk3A6V9OyNvD_Tp7iparx-dssYyMSIFFAuKqhHmtpNltEWyeqLrCORqh0oqDjE2BaVlCWtY8ETFWqMpEgEwLKaHitZyQ60Nv593ngKHXrQ0lNo3ZohuCljxOEwnxTrw8iBYRdedta_youVIAv_TqQE3Z6sK5TdCc6f1pOs_0_jSt2M66-d_ShbdYyx8DNGqF</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>31486394</pqid></control><display><type>conference_proceeding</type><title>Optimized Data Loading for a Multi-Terabyte Sky Survey Repository</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J.</creator><creatorcontrib>Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J.</creatorcontrib><description>Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescopes pose enormous challenges to state-of-the-art database systems and data-loading techniques. In this paper we present SkyLoader, our novel framework for data loading that is being used to populate a multi-table, multi-terabyte database repository for the Palomar-Quest sky survey.
SkyLoader consists of an efficient algorithm for bulk loading, an effective data structure to support data integrity, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques, with load time for a 40-gigabyte data set reduced from over 20 hours to less than 3 hours. Our framework offers a promising approach for loading other large and complex scientific databases.</description><identifier>ISBN: 1595930612</identifier><identifier>ISBN: 9781595930613</identifier><identifier>DOI: 10.1109/SC.2005.50</identifier><language>eng</language><publisher>Washington, DC, USA: IEEE Computer Society</publisher><subject>Astronomy ; Buildings ; Data structures ; Database systems ; Guidelines ; Human-centered computing -- Visualization -- Visualization application domains -- Scientific visualization ; Information systems -- Information retrieval ; Information systems -- Information storage systems ; Information systems -- Information systems applications ; Instruments ; Parallel processing ; Permission ; Software and its engineering -- Software notations and tools -- General programming languages -- Language features -- Frameworks ; Space technology ; Telescopes</subject><ispartof>ACM/IEEE SC 2005 Conference (SC'05), 2005, p.42-42</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1559994$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2051,4035,4036,27904,54898</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1559994$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Cai, Y. 
Dora</creatorcontrib><creatorcontrib>Aydt, Ruth</creatorcontrib><creatorcontrib>Brunner, Robert J.</creatorcontrib><title>Optimized Data Loading for a Multi-Terabyte Sky Survey Repository</title><title>ACM/IEEE SC 2005 Conference (SC'05)</title><addtitle>SUPERC</addtitle><description>Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescopes pose enormous challenges to state-of-the-art database systems and data-loading techniques. In this paper we present SkyLoader, our novel framework for data loading that is being used to populate a multi-table, multi-terabyte database repository for the Palomar-Quest sky survey. SkyLoader consists of an efficient algorithm for bulk loading, an effective data structure to support data integrity, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques, with load time for a 40-gigabyte data set reduced from over 20 hours to less than 3 hours.
Our framework offers a promising approach for loading other large and complex scientific databases.</description><subject>Astronomy</subject><subject>Buildings</subject><subject>Data structures</subject><subject>Database systems</subject><subject>Guidelines</subject><subject>Human-centered computing -- Visualization -- Visualization application domains -- Scientific visualization</subject><subject>Information systems -- Information retrieval</subject><subject>Information systems -- Information storage systems</subject><subject>Information systems -- Information systems applications</subject><subject>Instruments</subject><subject>Parallel processing</subject><subject>Permission</subject><subject>Software and its engineering -- Software notations and tools -- General programming languages -- Language features -- Frameworks</subject><subject>Space technology</subject><subject>Telescopes</subject><isbn>1595930612</isbn><isbn>9781595930613</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2005</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNqFkEtLw0AUhQdEUGs3bt0MCIKL1Hlk0txliU-oFExdD5PkRsYmnTiTCPHX21r3rs7i-zgcDiEXnM04Z3CbZzPBmJopdkTOuAIFkiVcnJBpCB-MMQ4JcBWfksWq621rv7Gid6Y3dOlMZbfvtHaeGvoyNL2N1uhNMfZI881I88F_4UhfsXPB9s6P5-S4Nk3A6V9OyNvD_Tp7iparx-dssYyMSIFFAuKqhHmtpNltEWyeqLrCORqh0oqDjE2BaVlCWtY8ETFWqMpEgEwLKaHitZyQ60Nv593ngKHXrQ0lNo3ZohuCljxOEwnxTrw8iBYRdedta_youVIAv_TqQE3Z6sK5TdCc6f1pOs_0_jSt2M66-d_ShbdYyx8DNGqF</recordid><startdate>2005</startdate><enddate>2005</enddate><creator>Cai, Y. 
Dora</creator><creator>Aydt, Ruth</creator><creator>Brunner, Robert J.</creator><general>IEEE Computer Society</general><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2005</creationdate><title>Optimized Data Loading for a Multi-Terabyte Sky Survey Repository</title><author>Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a2890-294dc97f53a93020765fde7ea258d1934abe8cc98cf1624ede5c62938b339d1f3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Astronomy</topic><topic>Buildings</topic><topic>Data structures</topic><topic>Database systems</topic><topic>Guidelines</topic><topic>Human-centered computing -- Visualization -- Visualization application domains -- Scientific visualization</topic><topic>Information systems -- Information retrieval</topic><topic>Information systems -- Information storage systems</topic><topic>Information systems -- Information systems applications</topic><topic>Instruments</topic><topic>Parallel processing</topic><topic>Permission</topic><topic>Software and its engineering -- Software notations and tools -- General programming languages -- Language features -- Frameworks</topic><topic>Space technology</topic><topic>Telescopes</topic><toplevel>online_resources</toplevel><creatorcontrib>Cai, Y. 
Dora</creatorcontrib><creatorcontrib>Aydt, Ruth</creatorcontrib><creatorcontrib>Brunner, Robert J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Cai, Y. Dora</au><au>Aydt, Ruth</au><au>Brunner, Robert J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Optimized Data Loading for a Multi-Terabyte Sky Survey Repository</atitle><btitle>ACM/IEEE SC 2005 Conference (SC'05)</btitle><stitle>SUPERC</stitle><date>2005</date><risdate>2005</risdate><spage>42</spage><epage>42</epage><pages>42-42</pages><isbn>1595930612</isbn><isbn>9781595930613</isbn><abstract>Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescopes pose enormous challenges to state-of-the-art database systems and data-loading techniques.
In this paper we present SkyLoader, our novel framework for data loading that is being used to populate a multi-table, multi-terabyte database repository for the Palomar-Quest sky survey. SkyLoader consists of an efficient algorithm for bulk loading, an effective data structure to support data integrity, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques, with load time for a 40-gigabyte data set reduced from over 20 hours to less than 3 hours. Our framework offers a promising approach for loading other large and complex scientific databases.</abstract><cop>Washington, DC, USA</cop><pub>IEEE Computer Society</pub><doi>10.1109/SC.2005.50</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 1595930612 |
ispartof | ACM/IEEE SC 2005 Conference (SC'05), 2005, p.42-42 |
issn | |
language | eng |
recordid | cdi_ieee_primary_1559994 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Astronomy ; Buildings ; Data structures ; Database systems ; Guidelines ; Human-centered computing -- Visualization -- Visualization application domains -- Scientific visualization ; Information systems -- Information retrieval ; Information systems -- Information storage systems ; Information systems -- Information systems applications ; Instruments ; Parallel processing ; Permission ; Software and its engineering -- Software notations and tools -- General programming languages -- Language features -- Frameworks ; Space technology ; Telescopes |
title | Optimized Data Loading for a Multi-Terabyte Sky Survey Repository |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T16%3A38%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Optimized%20Data%20Loading%20for%20a%20Multi-Terabyte%20Sky%20Survey%20Repository&rft.btitle=ACM/IEEE%20SC%202005%20Conference%20(SC'05)&rft.au=Cai,%20Y.%20Dora&rft.date=2005&rft.spage=42&rft.epage=42&rft.pages=42-42&rft.isbn=1595930612&rft.isbn_list=9781595930613&rft_id=info:doi/10.1109/SC.2005.50&rft_dat=%3Cproquest_6IE%3E31486394%3C/proquest_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=31486394&rft_id=info:pmid/&rft_ieee_id=1559994&rfr_iscdi=true |
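
The description field reports only bounds on the load time: "over 20 hours" before tuning and "less than 3 hours" after, for a 40-gigabyte data set. As a rough illustration of what those bounds imply, the following Python sketch derives the corresponding limits on load rate and speedup. The function name `load_rate_bounds` and its parameters are hypothetical and are not taken from the paper or from SkyLoader. The sketch only restates the abstract's figures arithmetically.

```python
def load_rate_bounds(size_gb: float, before_hours: float, after_hours: float) -> dict:
    """Bounds implied by a load that took MORE than `before_hours`
    and, after tuning, LESS than `after_hours` for `size_gb` gigabytes.

    Hypothetical helper for illustration; not part of SkyLoader.
    """
    if size_gb <= 0 or before_hours <= 0 or after_hours <= 0:
        raise ValueError("all inputs must be positive")
    return {
        # The original load took longer than before_hours,
        # so its average rate was below this value.
        "max_rate_before_gb_per_h": size_gb / before_hours,
        # The tuned load finished sooner than after_hours,
        # so its average rate was above this value.
        "min_rate_after_gb_per_h": size_gb / after_hours,
        # Ratio of the two time bounds: the speedup is at least this large.
        "min_speedup": before_hours / after_hours,
    }


# Figures quoted in the description field of this record.
bounds = load_rate_bounds(size_gb=40, before_hours=20, after_hours=3)
for name, value in bounds.items():
    print(f"{name}: {value:.2f}")
```

Read this way, the abstract's figures correspond to an average rate below 2 GB/h before tuning, above roughly 13.3 GB/h afterwards, and a speedup of more than about 6.7 times. The exact values depend on the actual load times, which the record does not state.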