Aggregation operations in a distributed database


Bibliographic details
Main authors: Jayakumari, Ambareesh Sreekumaran Nair; Anand, Ashok; Gaur, Prateek; Donjerkovic, Donko
Format: Patent
Language: eng
Description
Summary: Query planning in a distributed database that includes a table partitioned into shards according to a sharding criterion and distributed to database instances includes receiving a data-query. The data-query includes a "distinct count" clause on a first column and a "group by" clause on at least a second column. A query plan is formulated to include respective instructions for converting, at at least some of the database instances, distinct values of the first column grouped by values of the second column into a count of the distinct values grouped by the values of the second column to obtain respective intermediate results; instructions for receiving the respective intermediate results from at least a subset of the at least some of the database instances; and instructions for concatenating the respective intermediate results using a summing operation to obtain the first "distinct count" of the first column grouped by the second column.
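
The two-stage plan in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the sharding criterion is the first column, so every distinct value of that column lives on exactly one shard; without that property, summing per-shard distinct counts would overcount. The column names (user_id, country), the CRC32 routing rule, and all function names are hypothetical and chosen only for illustration.

```python
import zlib
from collections import Counter, defaultdict


def shard_for(user_id: str, num_shards: int) -> int:
    # Hypothetical sharding criterion: route each row by its first-column value,
    # so a given user_id is always stored on the same shard.
    return zlib.crc32(user_id.encode("utf-8")) % num_shards


def partition(rows, num_shards):
    # Distribute (user_id, country) rows across the shards.
    shards = [[] for _ in range(num_shards)]
    for user_id, country in rows:
        shards[shard_for(user_id, num_shards)].append((user_id, country))
    return shards


def local_distinct_count(shard_rows):
    # Runs on one database instance: collect the distinct first-column values
    # per group, then convert each set into a count (the intermediate result).
    distinct = defaultdict(set)
    for user_id, country in shard_rows:
        distinct[country].add(user_id)
    return {country: len(ids) for country, ids in distinct.items()}


def merge_by_sum(intermediate_results):
    # Runs on the coordinator: combine per-shard counts by summing per group.
    # Valid only under the stated assumption that no first-column value
    # appears on more than one shard.
    total = Counter()
    for partial in intermediate_results:
        total.update(partial)
    return dict(total)


ROWS = [
    ("u1", "DE"), ("u2", "DE"), ("u1", "DE"),
    ("u3", "US"), ("u2", "US"), ("u4", "US"), ("u4", "US"),
]

shards = partition(ROWS, num_shards=3)
result = merge_by_sum(local_distinct_count(s) for s in shards)
print(result)
```

Under these assumptions each instance ships only one integer per group to the coordinator rather than the full set of distinct values, which is the practical appeal of pushing the count down to the shards.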