Incremental discovery of denial constraints

We investigate the problem of incremental denial constraint (DC) discovery, aiming at discovering DCs in response to a set ▵r of tuple insertions to a given relational instance r and the known set Σ of DCs holding on r. The need for the study is evident since real-life data are often frequently up...

Detailed description

Bibliographic details
Published in:	The VLDB journal 2023-11, Vol.32 (6), p.1289-1313
Main authors:	Qian, Chaoqin; Li, Menglu; Tan, Zijing; Ran, Ai; Ma, Shuai
Format:	Article
Language:	eng
Subjects:	Algorithms; Computer Science; Database Management; Indexing; Regular Paper
Online access:	Full text

container_end_page	1313
container_issue	6
container_start_page	1289
container_title	The VLDB journal
container_volume	32
creator	Qian, Chaoqin; Li, Menglu; Tan, Zijing; Ran, Ai; Ma, Shuai
description	We investigate the problem of incremental denial constraint (DC) discovery: discovering DCs in response to a set ▵r of tuple insertions into a given relational instance r, given the known set Σ of DCs holding on r. The need for this study is evident, since real-life data are frequently updated and it is often prohibitively expensive to perform DC discovery from scratch for every update. We tackle the problem in two steps. First, we employ indexing techniques to efficiently identify the incremental evidences caused by ▵r. We present algorithms to build indexes for Σ and r in a pre-processing step, and to visit and update the indexes in response to ▵r. In particular, we propose a novel indexing technique for two inequality comparisons, possibly across the attributes of r. By leveraging the indexes, we can identify all the tuple pairs incurred by ▵r that simultaneously satisfy the two comparisons, with a cost dependent on log(|r|). Second, we compute the changes ▵Σ to Σ based on the incremental evidences, such that Σ ⊕ ▵Σ is the set of DCs holding on r + ▵r. ▵Σ may contain new DCs to be added to Σ and obsolete DCs to be removed from Σ. Our experimental evaluations show that our incremental approach is faster by orders of magnitude than the two state-of-the-art batch DC discovery approaches that compute from scratch on r + ▵r, even when ▵r is up to 30% of r.
doi_str_mv	10.1007/s00778-023-00788-y
format	Article
publisher	Berlin/Heidelberg: Springer Berlin Heidelberg
eissn	0949-877X
tpages	25
orcid	https://orcid.org/0000-0001-6332-780X
rights	The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
fulltext	fulltext
identifier	ISSN: 1066-8888
ispartof	The VLDB journal, 2023-11, Vol.32 (6), p.1289-1313
issn	1066-8888; 0949-877X
language	eng
recordid	cdi_proquest_journals_2879581498
source	ACM Digital Library; SpringerLink (Online service)
subjects	Algorithms; Computer Science; Database Management; Indexing; Regular Paper
title	Incremental discovery of denial constraints
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T16%3A53%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Incremental%20discovery%20of%20denial%20constraints&rft.jtitle=The%20VLDB%20journal&rft.au=Qian,%20Chaoqin&rft.date=2023-11-01&rft.volume=32&rft.issue=6&rft.spage=1289&rft.epage=1313&rft.pages=1289-1313&rft.issn=1066-8888&rft.eissn=0949-877X&rft_id=info:doi/10.1007/s00778-023-00788-y&rft_dat=%3Cproquest_cross%3E2879581498%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2879581498&rft_id=info:pmid/&rfr_iscdi=true
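
The two-step idea in the description (collect the evidences caused by ▵r, then revise Σ) can be illustrated with a small Python sketch. This is not the authors' algorithm: it enumerates tuple pairs by brute force instead of using the indexes the article proposes, and it only reports the obsolete DCs, not the new DCs that would replace them. The predicate encoding, the function names, and the salary/tax data are all assumptions made for illustration.

```python
import operator

# Assumed encoding: a predicate is (attribute of t, operator, attribute of s),
# evaluated on an ordered tuple pair (t, s).
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt, ">": operator.gt}


def evidence(t, s, predicates):
    """Return the set of predicates satisfied by the ordered pair (t, s)."""
    return frozenset(p for p in predicates if OPS[p[1]](t[p[0]], s[p[2]]))


def incremental_evidences(r, delta_r, predicates):
    """Evidences of all ordered pairs that involve at least one tuple of delta_r.

    Brute force over |delta_r| * (|r| + |delta_r|) pairs; the article replaces
    this scan with index lookups.
    """
    found = set()
    for i, t in enumerate(delta_r):
        for s in r:
            found.add(evidence(t, s, predicates))
            found.add(evidence(s, t, predicates))
        for j, s in enumerate(delta_r):
            if i != j:
                found.add(evidence(t, s, predicates))
    return found


def obsolete_dcs(sigma, evidences):
    """A DC (a set of predicates that must not all hold) is violated as soon
    as one evidence contains every one of its predicates."""
    return {dc for dc in sigma if any(dc <= e for e in evidences)}


# Hypothetical predicate space and data.
P1 = ("salary", ">", "salary")
P2 = ("tax", "<", "tax")
P3 = ("salary", "=", "salary")
P4 = ("tax", "!=", "tax")
predicates = [P1, P2, P3, P4]

r = [{"salary": 1000, "tax": 100}, {"salary": 2000, "tax": 200}]
delta_r = [{"salary": 3000, "tax": 150}]

dc1 = frozenset({P1, P2})  # not (t.salary > s.salary and t.tax < s.tax)
dc2 = frozenset({P3, P4})  # not (t.salary = s.salary and t.tax != s.tax)
sigma = {dc1, dc2}

inc = incremental_evidences(r, delta_r, predicates)
obsolete = obsolete_dcs(sigma, inc)
print(obsolete == {dc1})
```

In this toy instance the inserted tuple earns more than the second tuple of r but pays less tax, so dc1 no longer holds on r + ▵r and is reported as obsolete, while dc2 is unaffected. Only pairs touching ▵r are examined, which is what makes the evidences incremental: pairs within r were already covered when Σ was discovered.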