WindFlow: High-Speed Continuous Stream Processing With Parallel Building Blocks

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2021-11, Vol. 32 (11), p. 2748-2763
Main authors: Mencagli, Gabriele; Torquati, Massimo; Cardaci, Andrea; Fais, Alessandra; Rinaldi, Luca; Danelutto, Marco
Format: Article
Language: English
Description
Summary: Nowadays, we are witnessing the diffusion of Stream Processing Systems (SPSs) able to analyze data streams in near real time. Traditional SPSs like Storm and Flink target distributed clusters and adopt the continuous streaming model, where inputs are processed as soon as they are available while outputs are continuously emitted. Recently, there has been a great focus on SPSs for scale-up machines. Some of them (e.g., BriskStream) still use the continuous model to achieve low latency. Others optimize throughput with batching approaches that are, however, often inadequate to minimize latency for live-streaming applications. Our contribution is to show a novel software engineering approach to design the runtime system of SPSs targeting multicores, with the aim of providing a uniform solution able to optimize throughput and latency. The approach has a formal nature based on the assembly of components called building blocks, whose composition allows optimizations to be easily expressed in a compositional manner. We use this methodology to build a new SPS called WindFlow. Our evaluation showcases the benefits of WindFlow: it provides lower latency than SPSs for continuous streaming, and can be configured to optimize throughput, performing similarly to, and even better than, batch-based scale-up SPSs.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2021.3073970