SETL: A programmable semantic extract-transform-load framework for semantic data warehouses

•This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools.
•SETL provides a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks.
•SETL supports semantic and traditional data sources, se...

Detailed description

Bibliographic details
Published in: Information systems (Oxford) 2017-08, Vol.68, p.17-43
Main authors: Deb Nath, Rudra Pratap; Hose, Katja; Pedersen, Torben Bach; Romero, Oscar
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 43
container_issue
container_start_page 17
container_title Information systems (Oxford)
container_volume 68
creator Deb Nath, Rudra Pratap
Hose, Katja
Pedersen, Torben Bach
Romero, Oscar
description •This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools.
•SETL provides a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks.
•SETL supports semantic and traditional data sources, semantic integration, and creating or publishing a multidimensional (MD) semantic DW.
•Using SETL, we perform a comprehensive experimental evaluation by producing an MD semantic DW that integrates semantic and non-semantic data sources.
•The evaluation shows that SETL improves considerably over the competing solutions/tools in terms of productivity, knowledge base (KB) quality, and performance.
To make better decisions in business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to their (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this “open world scenario” because they do not consider semantic issues in the integration process. Current ETL tools neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools and supports developers by offering a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. Thus it supports semantic data sources in addition to traditional data sources, semantic integration, and creating or publishing a semantic (multidimensional) DW in terms of a knowledge base. A comprehensive experimental evaluation comparing SETL to a solution made with traditional tools (requiring much more hand-coding) on a concrete use case shows that SETL provides better programmer productivity, knowledge base quality, and performance.
doi_str_mv 10.1016/j.is.2017.01.005
format Article
publisher Elsevier Ltd
rights 2017 Elsevier Ltd; Attribution-NonCommercial-NoDerivs 3.0 Spain (http://creativecommons.org/licenses/by-nc-nd/3.0/es/); info:eu-repo/semantics/openAccess
oa free_for_read
peer_reviewed yes
tpages 27
els_id S0306437916302101
linktohtml https://www.sciencedirect.com/science/article/pii/S0306437916302101
fulltext fulltext
identifier ISSN: 0306-4379
ispartof Information systems (Oxford), 2017-08, Vol.68, p.17-43
issn 0306-4379
1873-6076
language eng
recordid cdi_csuc_recercat_oai_recercat_cat_2072_305211
source Recercat; Elsevier ScienceDirect Journals
subjects Data warehouse
Data warehousing
ETL
Expert systems (Computer science)
Gestor de dades
Informàtica
Knowledge base
RDF
Semantic computing
Semantic integration
Semantic-aware
Sistemes d'informació
Sistemes experts (Informàtica)
Àrees temàtiques de la UPC
title SETL: A programmable semantic extract-transform-load framework for semantic data warehouses
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T21%3A56%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SETL:%20A%20programmable%20semantic%20extract-transform-load%20framework%20for%20semantic%20data%20warehouses&rft.jtitle=Information%20systems%20(Oxford)&rft.au=Deb%20Nath,%20Rudra%20Pratap&rft.date=2017-08-01&rft.volume=68&rft.spage=17&rft.epage=43&rft.pages=17-43&rft.issn=0306-4379&rft.eissn=1873-6076&rft_id=info:doi/10.1016/j.is.2017.01.005&rft_dat=%3Ccsuc_cross%3Eoai_recercat_cat_2072_305211%3C/csuc_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_els_id=S0306437916302101&rfr_iscdi=true