SYSTEM AND METHOD FOR HANDLING TOP COUNT QUERIES FOR ARBITRARY, SELECTABLE INTERVALS RELATING TO A LARGE, STREAMED DATA SET

Bibliographic details
Main author: Murphy, Frank P
Format: Patent
Language: English
Description
Summary: A system and method are provided for enabling querying of a large data set, including accessing a data structure associated with a metadata parameter and configured to store partial information associated with the data set in a plurality of bins. Each bin, associated with a unique time interval, is configured to store a plurality of entries associated with identified respective members of the metadata parameter that have a detection time included in the bin's time interval. Each entry has at least one of an updated maximum and minimum possible count value determined using a probabilistic algorithm. The method includes receiving a query having a requested time interval, selecting two or more bins of the data structure that in combination describe the requested time interval, selecting k entries from a combination of the entries in the selected bins based on at least one of an updated maximum and minimum possible count value associated with entries of the selected bins, and determining top-k data, the top-k data including identification of the selected k entries.
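
The binning and query steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the summary only says "a probabilistic algorithm", and this sketch assumes a Space-Saving style counter per bin. The names `Bin`, `BinnedCounter`, `observe`, and `top_k`, the fixed bin width, and the integer timestamps are all hypothetical choices made for illustration.

```python
class Bin:
    """Bounded summary for one time interval (Space-Saving style, an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # member -> (max_count, error); min_count = max_count - error

    def add(self, member):
        if member in self.entries:
            count, err = self.entries[member]
            self.entries[member] = (count + 1, err)
        elif len(self.entries) < self.capacity:
            self.entries[member] = (1, 0)
        else:
            # Evict the entry with the smallest count; the newcomer inherits
            # that count as its possible overestimate.
            victim = min(self.entries, key=lambda m: self.entries[m][0])
            floor, _ = self.entries.pop(victim)
            self.entries[member] = (floor + 1, floor)

    def floor(self):
        """Upper bound on the count of any member this bin does not track."""
        if len(self.entries) < self.capacity:
            return 0
        return min(count for count, _ in self.entries.values())


class BinnedCounter:
    """One Bin per fixed-width time interval (hypothetical layout)."""

    def __init__(self, width, capacity):
        self.width = width
        self.capacity = capacity
        self.bins = {}  # bin index -> Bin

    def observe(self, timestamp, member):
        index = timestamp // self.width
        if index not in self.bins:
            self.bins[index] = Bin(self.capacity)
        self.bins[index].add(member)

    def top_k(self, start, end, k):
        """Return up to k (member, min_count, max_count) tuples for [start, end)."""
        first, last = start // self.width, (end - 1) // self.width
        selected = [self.bins[i] for i in range(first, last + 1) if i in self.bins]
        members = set().union(*(b.entries for b in selected))
        merged = []
        for m in members:
            # Max bound: tracked count, or the bin's floor when untracked.
            hi = sum(b.entries[m][0] if m in b.entries else b.floor() for b in selected)
            # Min bound: guaranteed part of each tracked count.
            lo = sum(b.entries[m][0] - b.entries[m][1] for b in selected if m in b.entries)
            merged.append((m, lo, hi))
        merged.sort(key=lambda e: (-e[2], -e[1], e[0]))
        return merged[:k]


counter = BinnedCounter(width=10, capacity=3)
for t, member in [(1, "a"), (2, "a"), (3, "b"), (12, "a"), (13, "c"), (14, "c")]:
    counter.observe(t, member)
print(counter.top_k(0, 20, 2))  # [('a', 3, 3), ('c', 2, 2)]
```

Ranking by the maximum possible count and breaking ties on the minimum is one of several plausible selection rules; the summary only requires that selection use at least one of the two bounds.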