Computing expected multiplicities for bag-TIDBs with bounded multiplicities
In this work, we study the problem of computing a tuple's expected multiplicity over probabilistic databases with bag semantics (where each tuple is associated with a multiplicity) exactly and approximately. We consider bag-TIDBs where we have a bound $c$ on the maximum multiplicity of each tup...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In this work, we study the problem of computing a tuple's expected
multiplicity over probabilistic databases with bag semantics (where each tuple
is associated with a multiplicity) exactly and approximately. We consider
bag-TIDBs where we have a bound $c$ on the maximum multiplicity of each tuple
and tuples are independent probabilistic events (we refer to such databases as
c-TIDBs. We are specifically interested in the fine-grained complexity of
computing expected multiplicities and how it compares to the complexity of
deterministic query evaluation algorithms -- if these complexities are
comparable, it opens the door to practical deployment of probabilistic
databases. Unfortunately, our results imply that computing expected
multiplicities for c-TIDBs based on the results produced by such query
evaluation algorithms introduces super-linear overhead (under parameterized
complexity hardness assumptions/conjectures). We proceed to study approximation
of expected result tuple multiplicities for positive relational algebra queries
($RA^+$) over c-TIDBs and for a non-trivial subclass of block-independent
databases (BIDBs). We develop a sampling algorithm that computes a
1$\pm\epsilon$ approximation of the expected multiplicity of an output tuple in
time linear in the runtime of the corresponding deterministic query for any
$RA^+$ query. |
---|---|
DOI: | 10.48550/arxiv.2204.02758 |