Parallel frequent closed sequence mining method based on vertical resolution

Bibliographic details
Main authors: BI TIANCHI, ZHAO YUHAI, LI CHENGUANG, WANG GUOREN, YIN YING
Format: Patent
Language: Chinese; English
Description
Abstract: The invention provides a parallel frequent closed sequence mining method based on vertical resolution and belongs to the field of data mining. The method first intersects sequences to reduce their lengths, that is, the original sequences are divided into shorter sequences in the vertical direction. It then selects from the intersection result the K sequences with the highest degrees of difference, so that the column numbers of the selected sequences differ relatively widely. Together, these two steps shorten the mining time. The invention also proposes compressing frequent patterns, which narrows the enumeration range of frequent closed patterns, shortens the mining time, and lowers the time complexity of the algorithm. The frequent closed sequence mining algorithm is implemented on Hadoop, currently the most popular parallel framework. The concurrency of Hadoop is fully exploited; massive data are stored in nodes in a cluster i