Computational methods and optimizations for containment and complementarity in web data cubes

• Definitions of full containment, partial containment and complementarity for RDF data cubes.
• Presentation of baseline quadratic method for computation of the defined relationships.
• Presentation of three alternative methods for efficient computation of containment and complementarity relationships.
• ...

Bibliographic details
Published in:	Information systems (Oxford) 2018-06, Vol.75, p.56-74
Main authors:	Meimaris, Marios ; Papastefanatos, George ; Vassiliadis, Panos ; Anagnostopoulos, Ioannis
Format:	Article
Language:	eng
Subjects:	Analytics ; Containment ; Cubes ; Datasets ; elsarticle.cls ; Elsevier ; Hierarchies ; Information sharing ; Information systems ; LaTeX ; Multidimensional data ; Optimization ; Semantic web ; Template ; Vocabularies & taxonomies
Online access:	Full text

container_end_page	74
container_issue
container_start_page	56
container_title	Information systems (Oxford)
container_volume	75
creator	Meimaris, Marios ; Papastefanatos, George ; Vassiliadis, Panos ; Anagnostopoulos, Ioannis
description	• Definitions of full containment, partial containment and complementarity for RDF data cubes. • Presentation of baseline quadratic method for computation of the defined relationships. • Presentation of three alternative methods for efficient computation of containment and complementarity relationships. • Experimental evaluation of efficiency and scalability on real world and synthetic data. The increasing availability of diverse multidimensional data on the web has led to the creation and adoption of common vocabularies and practices that facilitate sharing, aggregating and reusing data from remote origins. One prominent example in the Web of Data is the RDF Data Cube vocabulary, which has recently attracted great attention from the industrial, government and academic sectors as the de facto representational model for publishing open multidimensional data. As a result, different datasets share terms from common code lists and hierarchies, this way creating an implicit relatedness between independent sources. Identifying and analyzing relationships between disparate data sources is a major prerequisite for enabling traditional business analytics at the web scale. However, discovery of instance-level relationships between datasets becomes a computationally costly procedure, as typically all pairs of records must be compared. In this paper, we define three types of relationships between multidimensional observations, namely full containment, partial containment and complementarity, and we propose four methods for efficient and scalable computation of these relationships. We conduct an extensive experimental evaluation over both real and synthetic datasets, comparing with traditional query-based and inference-based alternatives, and we show how our methods provide efficient and scalable solutions.
doi_str_mv	10.1016/j.is.2018.02.010
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2073135531</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S030643791730488X</els_id><sourcerecordid>2073135531</sourcerecordid><originalsourceid>FETCH-LOGICAL-c375t-6817df48d046d490c4e1500f20fe826a4b30ca8f292c4c9574cf36a6cbba7dc13</originalsourceid><addsrcrecordid>eNp1kE1LxDAQhoMouK7ePQY8t06SNmm9yeIXLHjRo4Q0STFl29QkVfTX2-569TQM8zzDzIvQJYGcAOHXXe5iToFUOdAcCByhFakEyzgIfoxWwIBnBRP1KTqLsQMAWtb1Cr1tfD9OSSXnB7XDvU3v3kSsBoP9mFzvfvajiFsfsPZDUm7o7ZD2hJ7dnV1aFVz6xm7AX7bBRiWF9dTYeI5OWrWL9uKvrtHr_d3L5jHbPj88bW63mWaiTBmviDBtURkouClq0IUlJUBLobUV5apoGGhVtbSmutB1KQrdMq64bholjCZsja4Oe8fgPyYbk-z8FOaHoqQgGGFlyRYKDpQOPsZgWzkG16vwLQnIJUTZSbcYpJJA5RzirNwcFDtf_-lskFE7O2hrXLA6SePd__Iv8W16lA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2073135531</pqid></control><display><type>article</type><title>Computational methods and optimizations for containment and complementarity in web data cubes</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Meimaris, Marios ; Papastefanatos, George ; Vassiliadis, Panos ; Anagnostopoulos, Ioannis</creator><creatorcontrib>Meimaris, Marios ; Papastefanatos, George ; Vassiliadis, Panos ; Anagnostopoulos, Ioannis</creatorcontrib><description>•Definitions of full containment, partial containment and complementarity for RDF data cubes.•Presentation of baseline quadratic method for computation of the defined relationships.•Presentation of three alternative methods for efficient computation of containment and complementarity relationships.•Experimental evaluation of efficiency and scalability on real world and synthetic data. 
The increasing availability of diverse multidimensional data on the web has led to the creation and adoption of common vocabularies and practices that facilitate sharing, aggregating and reusing data from remote origins. One prominent example in the Web of Data is the RDF Data Cube vocabulary, which has recently attracted great attention from the industrial, government and academic sectors as the de facto representational model for publishing open multidimensional data. As a result, different datasets share terms from common code lists and hierarchies, this way creating an implicit relatedness between independent sources. Identifying and analyzing relationships between disparate data sources is a major prerequisite for enabling traditional business analytics at the web scale. However, discovery of instance-level relationships between datasets becomes a computationally costly procedure, as typically all pairs of records must be compared. In this paper, we define three types of relationships between multidimensional observations, namely full containment, partial containment and complementarity, and we propose four methods for efficient and scalable computation of these relationships. 
We conduct an extensive experimental evaluation over both real and synthetic datasets, comparing with traditional query-based and inference-based alternatives, and we show how our methods provide efficient and scalable solutions.</description><identifier>ISSN: 0306-4379</identifier><identifier>EISSN: 1873-6076</identifier><identifier>DOI: 10.1016/j.is.2018.02.010</identifier><language>eng</language><publisher>Oxford: Elsevier Ltd</publisher><subject>Analytics ; Containment ; Cubes ; Datasets ; elsarticle.cls ; Elsevier ; Hierarchies ; Information sharing ; Information systems ; LaTeX ; Multidimensional data ; Optimization ; Semantic web ; Template ; Vocabularies & taxonomies</subject><ispartof>Information systems (Oxford), 2018-06, Vol.75, p.56-74</ispartof><rights>2018 Elsevier Ltd</rights><rights>Copyright Elsevier Science Ltd. Jun 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c375t-6817df48d046d490c4e1500f20fe826a4b30ca8f292c4c9574cf36a6cbba7dc13</citedby><cites>FETCH-LOGICAL-c375t-6817df48d046d490c4e1500f20fe826a4b30ca8f292c4c9574cf36a6cbba7dc13</cites><orcidid>0000-0002-9273-9843 ; 0000-0003-0085-6776</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.is.2018.02.010$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,778,782,3539,27911,27912,45982</link.rule.ids></links><search><creatorcontrib>Meimaris, Marios</creatorcontrib><creatorcontrib>Papastefanatos, George</creatorcontrib><creatorcontrib>Vassiliadis, Panos</creatorcontrib><creatorcontrib>Anagnostopoulos, Ioannis</creatorcontrib><title>Computational methods and optimizations for containment and complementarity in web data cubes</title><title>Information systems (Oxford)</title><description>•Definitions of full containment, partial containment and 
complementarity for RDF data cubes.•Presentation of baseline quadratic method for computation of the defined relationships.•Presentation of three alternative methods for efficient computation of containment and complementarity relationships.•Experimental evaluation of efficiency and scalability on real world and synthetic data. The increasing availability of diverse multidimensional data on the web has led to the creation and adoption of common vocabularies and practices that facilitate sharing, aggregating and reusing data from remote origins. One prominent example in the Web of Data is the RDF Data Cube vocabulary, which has recently attracted great attention from the industrial, government and academic sectors as the de facto representational model for publishing open multidimensional data. As a result, different datasets share terms from common code lists and hierarchies, this way creating an implicit relatedness between independent sources. Identifying and analyzing relationships between disparate data sources is a major prerequisite for enabling traditional business analytics at the web scale. However, discovery of instance-level relationships between datasets becomes a computationally costly procedure, as typically all pairs of records must be compared. In this paper, we define three types of relationships between multidimensional observations, namely full containment, partial containment and complementarity, and we propose four methods for efficient and scalable computation of these relationships. 
We conduct an extensive experimental evaluation over both real and synthetic datasets, comparing with traditional query-based and inference-based alternatives, and we show how our methods provide efficient and scalable solutions.</description><subject>Analytics</subject><subject>Containment</subject><subject>Cubes</subject><subject>Datasets</subject><subject>elsarticle.cls</subject><subject>Elsevier</subject><subject>Hierarchies</subject><subject>Information sharing</subject><subject>Information systems</subject><subject>LaTeX</subject><subject>Multidimensional data</subject><subject>Optimization</subject><subject>Semantic web</subject><subject>Template</subject><subject>Vocabularies & taxonomies</subject><issn>0306-4379</issn><issn>1873-6076</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp1kE1LxDAQhoMouK7ePQY8t06SNmm9yeIXLHjRo4Q0STFl29QkVfTX2-569TQM8zzDzIvQJYGcAOHXXe5iToFUOdAcCByhFakEyzgIfoxWwIBnBRP1KTqLsQMAWtb1Cr1tfD9OSSXnB7XDvU3v3kSsBoP9mFzvfvajiFsfsPZDUm7o7ZD2hJ7dnV1aFVz6xm7AX7bBRiWF9dTYeI5OWrWL9uKvrtHr_d3L5jHbPj88bW63mWaiTBmviDBtURkouClq0IUlJUBLobUV5apoGGhVtbSmutB1KQrdMq64bholjCZsja4Oe8fgPyYbk-z8FOaHoqQgGGFlyRYKDpQOPsZgWzkG16vwLQnIJUTZSbcYpJJA5RzirNwcFDtf_-lskFE7O2hrXLA6SePd__Iv8W16lA</recordid><startdate>20180601</startdate><enddate>20180601</enddate><creator>Meimaris, Marios</creator><creator>Papastefanatos, George</creator><creator>Vassiliadis, Panos</creator><creator>Anagnostopoulos, Ioannis</creator><general>Elsevier Ltd</general><general>Elsevier Science Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-9273-9843</orcidid><orcidid>https://orcid.org/0000-0003-0085-6776</orcidid></search><sort><creationdate>20180601</creationdate><title>Computational methods and optimizations for 
containment and complementarity in web data cubes</title><author>Meimaris, Marios ; Papastefanatos, George ; Vassiliadis, Panos ; Anagnostopoulos, Ioannis</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c375t-6817df48d046d490c4e1500f20fe826a4b30ca8f292c4c9574cf36a6cbba7dc13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Analytics</topic><topic>Containment</topic><topic>Cubes</topic><topic>Datasets</topic><topic>elsarticle.cls</topic><topic>Elsevier</topic><topic>Hierarchies</topic><topic>Information sharing</topic><topic>Information systems</topic><topic>LaTeX</topic><topic>Multidimensional data</topic><topic>Optimization</topic><topic>Semantic web</topic><topic>Template</topic><topic>Vocabularies & taxonomies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Meimaris, Marios</creatorcontrib><creatorcontrib>Papastefanatos, George</creatorcontrib><creatorcontrib>Vassiliadis, Panos</creatorcontrib><creatorcontrib>Anagnostopoulos, Ioannis</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Information systems (Oxford)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Meimaris, Marios</au><au>Papastefanatos, George</au><au>Vassiliadis, Panos</au><au>Anagnostopoulos, 
Ioannis</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Computational methods and optimizations for containment and complementarity in web data cubes</atitle><jtitle>Information systems (Oxford)</jtitle><date>2018-06-01</date><risdate>2018</risdate><volume>75</volume><spage>56</spage><epage>74</epage><pages>56-74</pages><issn>0306-4379</issn><eissn>1873-6076</eissn><abstract>•Definitions of full containment, partial containment and complementarity for RDF data cubes.•Presentation of baseline quadratic method for computation of the defined relationships.•Presentation of three alternative methods for efficient computation of containment and complementarity relationships.•Experimental evaluation of efficiency and scalability on real world and synthetic data. The increasing availability of diverse multidimensional data on the web has led to the creation and adoption of common vocabularies and practices that facilitate sharing, aggregating and reusing data from remote origins. One prominent example in the Web of Data is the RDF Data Cube vocabulary, which has recently attracted great attention from the industrial, government and academic sectors as the de facto representational model for publishing open multidimensional data. As a result, different datasets share terms from common code lists and hierarchies, this way creating an implicit relatedness between independent sources. Identifying and analyzing relationships between disparate data sources is a major prerequisite for enabling traditional business analytics at the web scale. However, discovery of instance-level relationships between datasets becomes a computationally costly procedure, as typically all pairs of records must be compared. In this paper, we define three types of relationships between multidimensional observations, namely full containment, partial containment and complementarity, and we propose four methods for efficient and scalable computation of these relationships. 
We conduct an extensive experimental evaluation over both real and synthetic datasets, comparing with traditional query-based and inference-based alternatives, and we show how our methods provide efficient and scalable solutions.</abstract><cop>Oxford</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.is.2018.02.010</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0002-9273-9843</orcidid><orcidid>https://orcid.org/0000-0003-0085-6776</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0306-4379
ispartof	Information systems (Oxford), 2018-06, Vol.75, p.56-74
issn	0306-4379 1873-6076
language	eng
recordid	cdi_proquest_journals_2073135531
source	ScienceDirect Journals (5 years ago - present)
subjects	Analytics ; Containment ; Cubes ; Datasets ; elsarticle.cls ; Elsevier ; Hierarchies ; Information sharing ; Information systems ; LaTeX ; Multidimensional data ; Optimization ; Semantic web ; Template ; Vocabularies & taxonomies
title	Computational methods and optimizations for containment and complementarity in web data cubes
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T15%3A12%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Computational%20methods%20and%20optimizations%20for%20containment%20and%20complementarity%20in%20web%20data%20cubes&rft.jtitle=Information%20systems%20(Oxford)&rft.au=Meimaris,%20Marios&rft.date=2018-06-01&rft.volume=75&rft.spage=56&rft.epage=74&rft.pages=56-74&rft.issn=0306-4379&rft.eissn=1873-6076&rft_id=info:doi/10.1016/j.is.2018.02.010&rft_dat=%3Cproquest_cross%3E2073135531%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2073135531&rft_id=info:pmid/&rft_els_id=S030643791730488X&rfr_iscdi=true