IDENTIFYING HIGH VALUE SEGMENTS IN CATEGORICAL DATA

Bibliographic details
Main authors: Maneriker, Pranav Ravindra; Singal, Dhruv; Sinha, Ritwik; Sinha, Atanu R
Format: Patent
Language: eng
Description
Summary: Systems and techniques for identifying segments in categorical data include receiving multiple transaction ID (TID) lists with univariate values that satisfy a thresholding metric, where each TID list represents an occurrence of a single attribute in a set of transactions. The TID lists are stored in a data structure together with the univariate values that satisfy the thresholding metric. In a loop, candidate itemsets are formed from combinations of TID lists, using only the combinations that satisfy categorical constraints. Within the same loop, both the thresholding metric and a similarity metric are applied to the candidate itemsets that satisfy the categorical constraints. Final itemsets are formed only from the candidate itemsets that satisfy both the thresholding metric and the similarity metric.
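The loop described in the summary can be illustrated with a minimal Python sketch. The summary does not name the metrics or the constraints, so everything specific here is an assumption made for illustration: the thresholding metric is taken to be a minimum support count, the categorical constraint is taken to mean that an itemset may hold at most one value per attribute, and the similarity metric is taken to be the Jaccard similarity of the two TID lists being combined. The names find_segments, jaccard, min_support, and min_similarity are hypothetical and do not come from the patent.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two TID sets (assumed similarity metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_segments(tid_lists, min_support, min_similarity):
    """Level-wise itemset search over TID lists.

    tid_lists maps an (attribute, value) pair to the set of transaction
    IDs in which it occurs. Returns a dict mapping each multi-attribute
    itemset to the TIDs it covers.
    """
    # Start from the univariate TID lists that meet the assumed threshold.
    current = {
        frozenset([item]): frozenset(tids)
        for item, tids in tid_lists.items()
        if len(tids) >= min_support
    }
    final = {}
    while current:
        next_level = {}
        for (items_a, tids_a), (items_b, tids_b) in combinations(current.items(), 2):
            candidate = items_a | items_b
            # Only join itemsets that differ in exactly one item.
            if len(candidate) != len(items_a) + 1 or candidate in next_level:
                continue
            # Assumed categorical constraint: one value per attribute.
            attributes = [attribute for attribute, _ in candidate]
            if len(set(attributes)) != len(attributes):
                continue
            # The candidate's TID list is the intersection of its parents'.
            tids = tids_a & tids_b
            if len(tids) >= min_support and jaccard(tids_a, tids_b) >= min_similarity:
                next_level[candidate] = tids
        final.update(next_level)
        current = next_level
    return final
```

Checking the categorical constraint before intersecting the TID lists means that combinations such as two different values of the same attribute are discarded without computing their support, which matches the order of operations in the summary. How the patent actually defines and applies its similarity metric may differ from this sketch.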