Scalable distributed data streaming computations across multiple data processing clusters

Bibliographic details
Main authors: Florissi, Patricia Gomes Soares; Masad, Ofri
Format: Patent
Language: English
Description
Summary: An apparatus in one embodiment comprises at least one processing device having a processor coupled to a memory. The processing device is configured to initiate distributed data streaming computations across data processing clusters associated with respective data zones and, in each of the data processing clusters, to separate a data stream provided by a data source of the corresponding data zone into a plurality of data batches and to process the data batches to generate respective result batches. Multiple ones of the data batches across the data processing clusters are associated with a global data batch data structure, and multiple ones of the result batches across the data processing clusters are associated with a global result batch data structure based at least in part on the global data batch data structure. The result batches are processed in accordance with the global result batch data structure to generate one or more global result streams.
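
The flow described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. All names (split_into_batches, run_cluster, global_result_stream), the fixed batch size, and the choice to align batches across zones by a shared batch index are assumptions made only for illustration; the record does not say how the global data batch data structure associates batches.

```python
from collections import defaultdict
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def split_into_batches(stream: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Separate one data zone's stream into fixed-size data batches (assumed policy)."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run_cluster(stream: Iterable[float], batch_size: int,
                local_fn: Callable[[List[float]], float]) -> List[float]:
    """Local computation in one cluster: data batch i yields result batch i."""
    return [local_fn(batch) for batch in split_into_batches(stream, batch_size)]


def global_result_stream(zone_streams: Dict[str, Iterable[float]], batch_size: int,
                         local_fn: Callable[[List[float]], float],
                         global_fn: Callable[[List[float]], float]
                         ) -> Iterator[Tuple[int, float]]:
    """Combine per-zone result batches into one global result stream."""
    # Hypothetical global result batch structure: batch index -> {zone: result batch}.
    # It mirrors the global data batch grouping, which this sketch assumes is
    # simply "batches with the same index in every zone belong together".
    global_results: Dict[int, Dict[str, float]] = defaultdict(dict)
    for zone, stream in zone_streams.items():
        for index, result in enumerate(run_cluster(stream, batch_size, local_fn)):
            global_results[index][zone] = result
    # Emit one global result per global batch, in batch order.
    for index in sorted(global_results):
        yield index, global_fn(list(global_results[index].values()))


if __name__ == "__main__":
    zones = {"zone-a": [1, 2, 3, 4], "zone-b": [10, 20, 30, 40]}
    # Local sums per batch, then a global sum per batch index.
    print(list(global_result_stream(zones, 2, sum, sum)))  # [(0, 33), (1, 77)]
```

In this sketch only the per-batch results cross zone boundaries, not the raw stream data, which matches the separation between local processing and global combination that the summary describes.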