You say 'what', i hear 'where' and 'why': (mis-)interpreting SQL to derive fine-grained provenance

SQL declaratively specifies what the desired output of a query is. This work shows that a non-standard interpretation of the SQL semantics can, instead, disclose where a piece of the output originated in the input and why that piece found its way into the result. We derive such data provenance for v...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Proceedings of the VLDB Endowment 2018-07, Vol.11 (11), p.1536-1549
Hauptverfasser: Müller, Tobias, Dietrich, Benjamin, Grust, Torsten
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:SQL declaratively specifies what the desired output of a query is. This work shows that a non-standard interpretation of the SQL semantics can, instead, disclose where a piece of the output originated in the input and why that piece found its way into the result. We derive such data provenance for very rich SQL dialects---including recursion, windowed aggregates, and user-defined functions---at the fine-grained level of individual table cells. The approach is non-invasive and implemented as a compositional source-level SQL rewrite: an input SQL query is transformed into its own interpreter that wields data dependencies instead of regular values. We deliberately design this transformation to preserve the shape of both data and query, which allows provenance derivation to scale to complex queries without overwhelming the underlying database system.
ISSN:2150-8097
2150-8097
DOI:10.14778/3236187.3236204