STRATIFIED SAMPLING OF LOG RECORDS FOR APPROXIMATE FULL-TEXT SEARCH

Bibliographic details
Main author: GUKAL SREENIVAS
Format: Patent
Language: English
Description
Summary: A log record received from a host machine node includes an invariant string and a term. A template identifier is selected, from among the template identifiers in a template repository, for the template string that matches the invariant string. A sampling count threshold is then selected from a set of sampling count thresholds based on the template identifier and the term. A template-term count is obtained based on the number of earlier log records that were received since the count was last reset and whose template identifier and term match those of the log record. If the template-term count satisfies the sampling count threshold, an index entry for the log record is generated in a sampled log records index and the template-term count is reset to a defined value. If the template-term count does not satisfy the sampling count threshold, the count is incremented.
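
The counting scheme in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: templates are modeled as regular expressions with a named group `term`, "satisfying" the threshold is read as the count being greater than or equal to it, and the names `StratifiedLogSampler`, `default_threshold`, and `reset_value` are hypothetical.

```python
import re
from collections import defaultdict


class StratifiedLogSampler:
    """Illustrative per-(template, term) sampler; names are hypothetical."""

    def __init__(self, templates, thresholds, default_threshold=10, reset_value=0):
        # Template repository: template identifier -> compiled pattern.
        # Assumption: each pattern captures the variable part as group "term".
        self.templates = {tid: re.compile(p) for tid, p in templates.items()}
        # Set of sampling count thresholds keyed by (template_id, term).
        self.thresholds = thresholds
        self.default_threshold = default_threshold
        self.reset_value = reset_value
        # Template-term counts, starting at the defined reset value.
        self.counts = defaultdict(lambda: reset_value)
        # Stand-in for the sampled log records index.
        self.index = []

    def match(self, record):
        """Return (template_id, term) for the first matching template, else None."""
        for tid, pattern in self.templates.items():
            m = pattern.fullmatch(record)
            if m:
                return tid, m.group("term")
        return None

    def process(self, record):
        """Index the record if its count satisfies the threshold; return True if indexed."""
        key = self.match(record)
        if key is None:
            return False  # no template matched; this sketch simply skips the record
        threshold = self.thresholds.get(key, self.default_threshold)
        if self.counts[key] >= threshold:  # assumed meaning of "satisfies"
            self.index.append((key[0], key[1], record))
            self.counts[key] = self.reset_value
            return True
        self.counts[key] += 1
        return False


sampler = StratifiedLogSampler(
    templates={"T1": r"user (?P<term>\w+) logged in"},
    thresholds={("T1", "alice"): 2},
)
results = [sampler.process("user alice logged in") for _ in range(6)]
```

Under these assumptions, a threshold of 2 with a reset value of 0 indexes every third record for a given template and term, so frequent template-term pairs are thinned while each pair keeps its own counter. Giving rare pairs a lower threshold would make them more likely to appear in the sampled index.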