Summary graphs for relational database schemas

Bibliographic details
Published in: Proceedings of the VLDB Endowment 2011-08, Vol.4 (11), p.899-910
Main authors: Yang, Xiaoyan; Procopiuc, Cecilia M.; Srivastava, Divesh
Format: Article
Language: English
Online access: Full text
Description
Summary: Increasingly complex databases need ever more sophisticated tools to help users understand their schemas and interact with the data. Existing tools fall short of either providing the "big picture," or of presenting useful connectivity information. In this paper we define summary graphs, a novel approach for summarizing schemas. Given a set of user-specified query tables, the summary graph automatically computes the most relevant tables and joins for that query set. The output preserves the most informative join paths between the query tables, while meeting size constraints. In the process, we define a novel information-theoretic measure over join edges. Unlike most subgraph extraction work, we allow metaedges (i.e., edges in the transitive closure) to help reduce output complexity. We prove that the problem is NP-Hard, and solve it as an integer program. Our extensive experimental study shows that our method returns high-quality summaries under independent quality measures.
ISSN: 2150-8097
DOI: 10.14778/3402707.3402728