Tempura: a general cost-based optimizer framework for incremental data processing (Journal Version)

Bibliographic Details
Published in: The VLDB Journal 2023-11, Vol.32 (6), p.1315-1342
Main Authors: Wang, Zuozhi; Zeng, Kai; Huang, Botong; Chen, Wei; Cui, Xiaozong; Wang, Bo; Liu, Ji; Fan, Liya; Qu, Dachuan; Hou, Zhenyu; Guan, Tao; Li, Chen; Zhou, Jingren
Format: Article
Language: English
Online Access: Full text
Description
Summary: Incremental processing is widely adopted in many applications, ranging from incremental view maintenance and stream computing to the recently emerging progressive data warehouse and intermittent query processing. Despite the many algorithms developed on this topic, none of them can produce an incremental plan that always achieves the best performance, since the optimal plan is data dependent. In this paper, we develop a novel cost-based optimizer framework, called Tempura, for optimizing incremental data processing. We propose an incremental query planning model called TIP, based on the concept of time-varying relations, which can formally model incremental processing in its most general form. We give a full specification of Tempura, which not only unifies various existing techniques to generate an optimal incremental plan, but also allows developers to add their own rewrite rules. We study how to explore the plan space and search for an optimal incremental plan. We evaluate Tempura in various incremental processing scenarios to show its effectiveness and efficiency.
ISSN: 1066-8888, 0949-877X
DOI: 10.1007/s00778-023-00785-1