MLPPI Wizard: An Automated Multi-level Partitioning Tool on Analytical Workloads

An important technique used by database administrators (DBAs) is to improve performance in decision-support workloads associated with a Star schema is multi-level partitioning. Queries will then benefit from performance improvements via partition elimination, due to constraints on queries expressed...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:KSII transactions on Internet and information systems 2018, 12(4), , pp.1693-1713
Hauptverfasser: Suh, Young-Kyoon, Crolotte, Alain, Kostamaa, Pekka
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:An important technique used by database administrators (DBAs) is to improve performance in decision-support workloads associated with a Star schema is multi-level partitioning. Queries will then benefit from performance improvements via partition elimination, due to constraints on queries expressed on the dimension tables. As the task of multi-level partitioning can be overwhelming for a DBA we are proposing a wizard that facilitates the task by calculating a partitioning scheme for a particular workload. The system resides completely on a client and interacts with the costing estimation subsystem of the query optimizer via an API over the network, thereby eliminating any need to make changes to the optimizer. In addition, since only cost estimates are needed the wizard overhead is very low. By using a greedy algorithm for search space enumeration over the query predicates in the workload the wizard is efficient with worst-case polynomial complexity. The technology proposed can be applied to any clustering or partitioning scheme in any database management system that provides an interface to the query optimizer. Applied to the Teradata database the technology provides recommendations that outperform a human expert's solution as measured by the total execution time of the workload. We also demonstrate the scalability of our approach when the fact table (and workload) size increases. Keywords: Data warehousing, information systems, star schema, fact table, OLAP, analytical workloads, multi-level partitioning
ISSN:1976-7277
1976-7277
DOI:10.3837/tiis.2018.04.016