IDENTIFYING HIGH VALUE SEGMENTS IN CATEGORICAL DATA
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | Systems and techniques for identifying segments in categorical data include receiving multiple transaction ID (TID) lists with univariate values that satisfy a thresholding metric, with each TID list representing an occurrence of a single attribute in a set of transactions. The TID lists are stored in a data structure together with the univariate values that satisfy the thresholding metric. In a loop, candidate itemsets are determined from combinations of TID lists, using only the combinations that satisfy categorical constraints. Within the loop, both the thresholding metric and a similarity metric are applied to the candidate itemsets that satisfy the categorical constraints. Final itemsets are formed from only the candidate itemsets that satisfy both the thresholding metric and the similarity metric. |
---|
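
The loop described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the record does not define the metrics, so the thresholding metric is assumed here to be a minimum support count, the categorical constraint is assumed to mean "at most one value per attribute in an itemset", and the similarity metric is assumed to be a Jaccard comparison that rejects a candidate whose TID set is nearly identical to that of one of its parents. The names `mine_segments`, `min_support`, `max_similarity`, and `max_size` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mine_segments(tid_lists, min_support, max_similarity, max_size=3):
    """Sketch of the loop in the summary, with assumed metric definitions.

    tid_lists maps (attribute, value) -> set of transaction IDs.
    Returns a dict mapping each final itemset (a frozenset of
    (attribute, value) pairs) to its frozenset of transaction IDs.
    """
    # Store only the univariate TID lists that meet the (assumed) threshold.
    level = {
        frozenset([item]): frozenset(tids)
        for item, tids in tid_lists.items()
        if len(tids) >= min_support
    }
    final = {}
    size = 1
    while level and size < max_size:
        next_level = {}
        for (a, tids_a), (b, tids_b) in combinations(level.items(), 2):
            candidate = a | b
            if len(candidate) != size + 1 or candidate in next_level:
                continue
            # Assumed categorical constraint: one value per attribute.
            attributes = [attr for attr, _ in candidate]
            if len(set(attributes)) != len(attributes):
                continue
            # Intersect TID lists only for candidates that pass the constraint.
            tids = tids_a & tids_b
            # Assumed thresholding metric: minimum support count.
            if len(tids) < min_support:
                continue
            # Assumed similarity metric: drop near-duplicates of a parent.
            if max(jaccard(tids, tids_a), jaccard(tids, tids_b)) > max_similarity:
                continue
            next_level[candidate] = tids
        final.update(next_level)
        level = next_level
        size += 1
    return final


# Hypothetical toy data: two attributes, eight transactions.
tid_lists = {
    ("country", "US"): {1, 2, 3, 4, 5, 6},
    ("country", "DE"): {7, 8},
    ("device", "mobile"): {1, 2, 3, 7},
    ("device", "desktop"): {4, 5, 6, 8},
}
segments = mine_segments(tid_lists, min_support=2, max_similarity=0.9)
```

Checking the categorical constraint before intersecting TID lists means that combinations such as two values of the same attribute never incur the cost of an intersection, which appears to be the point of restricting candidates in the loop.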