Independence in Infinite Probabilistic Databases
Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to vi...
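Published in: | Journal of the ACM 2022-10, Vol.69 (5), p.1-42 |
---|---|
Main authors: | Grohe, Martin ; Lindner, Peter |
Format: | Article |
Language: | eng |
Subjects: | Algorithms ; Real numbers ; Relational data bases ; Semantics ; Statistical analysis |
Online access: | Full text |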
container_end_page | 42 |
---|---|
container_issue | 5 |
container_start_page | 1 |
container_title | Journal of the ACM |
container_volume | 69 |
creator | Grohe, Martin ; Lindner, Peter |
description | Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this article, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces.
We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs.
Moreover, for countable PDBs we propose an approximate query answering algorithm. |
doi_str_mv | 10.1145/3549525 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2729962425</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2729962425</sourcerecordid><originalsourceid>FETCH-LOGICAL-c145t-262c11627c8710a31873a89bd417d7516fed1b73e8bc22959d0959c3e89d94813</originalsourceid><addsrcrecordid>eNotUEtLxDAYDKJgXcW_UPDgqZovjyY5yvoqLOhBwVvIq5BlTWvSPfjvjexeZhgYZoZB6BrwHQDj95QzxQk_QQ1wLjpB-dcpajDGrOMM4BxdlLKtEhMsGoSH5MMcKiQX2pjaIY0xxSW073myxsZdLEt07aNZjDUllEt0NppdCVdHXqHP56eP9Wu3eXsZ1g-bztURS0d64gB6IpwUgA0FKaiRynoGwgsO_Rg8WEGDtI4QxZXHFVzVyismga7QzSF3ztPPPpRFb6d9TrVSE0GU6gkjvLpuDy6Xp1JyGPWc47fJvxqw_r9DH--gf1KgTsM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2729962425</pqid></control><display><type>article</type><title>Independence in Infinite Probabilistic Databases</title><source>ACM Digital Library</source><creator>Grohe, Martin ; Lindner, Peter</creator><creatorcontrib>Grohe, Martin ; Lindner, Peter</creatorcontrib><description>Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this article, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces.
We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs.
Moreover, for countable PDBs we propose an approximate query answering algorithm.</description><identifier>ISSN: 0004-5411</identifier><identifier>EISSN: 1557-735X</identifier><identifier>DOI: 10.1145/3549525</identifier><language>eng</language><publisher>New York: Association for Computing Machinery</publisher><subject>Algorithms ; Real numbers ; Relational data bases ; Semantics ; Statistical analysis</subject><ispartof>Journal of the ACM, 2022-10, Vol.69 (5), p.1-42</ispartof><rights>Copyright Association for Computing Machinery Oct 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c145t-262c11627c8710a31873a89bd417d7516fed1b73e8bc22959d0959c3e89d94813</cites><orcidid>0000-0002-0292-9142 ; 0000-0003-2041-7201</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Grohe, Martin</creatorcontrib><creatorcontrib>Lindner, Peter</creatorcontrib><title>Independence in Infinite Probabilistic Databases</title><title>Journal of the ACM</title><description>Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this article, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. 
While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces.
We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs.
Moreover, for countable PDBs we propose an approximate query answering algorithm.</description><subject>Algorithms</subject><subject>Real numbers</subject><subject>Relational data bases</subject><subject>Semantics</subject><subject>Statistical analysis</subject><issn>0004-5411</issn><issn>1557-735X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNotUEtLxDAYDKJgXcW_UPDgqZovjyY5yvoqLOhBwVvIq5BlTWvSPfjvjexeZhgYZoZB6BrwHQDj95QzxQk_QQ1wLjpB-dcpajDGrOMM4BxdlLKtEhMsGoSH5MMcKiQX2pjaIY0xxSW073myxsZdLEt07aNZjDUllEt0NppdCVdHXqHP56eP9Wu3eXsZ1g-bztURS0d64gB6IpwUgA0FKaiRynoGwgsO_Rg8WEGDtI4QxZXHFVzVyismga7QzSF3ztPPPpRFb6d9TrVSE0GU6gkjvLpuDy6Xp1JyGPWc47fJvxqw_r9DH--gf1KgTsM</recordid><startdate>20221001</startdate><enddate>20221001</enddate><creator>Grohe, Martin</creator><creator>Lindner, Peter</creator><general>Association for Computing Machinery</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-0292-9142</orcidid><orcidid>https://orcid.org/0000-0003-2041-7201</orcidid></search><sort><creationdate>20221001</creationdate><title>Independence in Infinite Probabilistic Databases</title><author>Grohe, Martin ; Lindner, Peter</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c145t-262c11627c8710a31873a89bd417d7516fed1b73e8bc22959d0959c3e89d94813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Real numbers</topic><topic>Relational data bases</topic><topic>Semantics</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Grohe, Martin</creatorcontrib><creatorcontrib>Lindner, 
Peter</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of the ACM</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Grohe, Martin</au><au>Lindner, Peter</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Independence in Infinite Probabilistic Databases</atitle><jtitle>Journal of the ACM</jtitle><date>2022-10-01</date><risdate>2022</risdate><volume>69</volume><issue>5</issue><spage>1</spage><epage>42</epage><pages>1-42</pages><issn>0004-5411</issn><eissn>1557-735X</eissn><abstract>Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this article, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces.
We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs.
Moreover, for countable PDBs we propose an approximate query answering algorithm.</abstract><cop>New York</cop><pub>Association for Computing Machinery</pub><doi>10.1145/3549525</doi><tpages>42</tpages><orcidid>https://orcid.org/0000-0002-0292-9142</orcidid><orcidid>https://orcid.org/0000-0003-2041-7201</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0004-5411 |
ispartof | Journal of the ACM, 2022-10, Vol.69 (5), p.1-42 |
issn | 0004-5411 ; 1557-735X |
language | eng |
recordid | cdi_proquest_journals_2729962425 |
source | ACM Digital Library |
subjects | Algorithms ; Real numbers ; Relational data bases ; Semantics ; Statistical analysis |
title | Independence in Infinite Probabilistic Databases |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T21%3A01%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Independence%20in%20Infinite%20Probabilistic%20Databases&rft.jtitle=Journal%20of%20the%20ACM&rft.au=Grohe,%20Martin&rft.date=2022-10-01&rft.volume=69&rft.issue=5&rft.spage=1&rft.epage=42&rft.pages=1-42&rft.issn=0004-5411&rft.eissn=1557-735X&rft_id=info:doi/10.1145/3549525&rft_dat=%3Cproquest_cross%3E2729962425%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2729962425&rft_id=info:pmid/&rfr_iscdi=true |