IDHUP: Incremental Discovery of High Utility Pattern

Bibliographic details
Published in: Wangji Wanglu Jishu Xuekan = Journal of Internet Technology, 2023-01, Vol. 24 (1), pp. 135-147
Main authors: Lele Yu, Wensheng Gan, Zhixiong Chen, Yining Liu
Format: Article
Language: English
Description
Abstract: As a sub-problem of pattern discovery, utility-oriented pattern mining has recently attracted considerable research attention and offers broad application prospects. Because input databases are dynamic, incremental utility mining methods have been proposed to discover implicit information or patterns whose importance (utility) is not less than a user-specified threshold from incremental databases. However, due to the explosive growth of the search space, most existing methods perform unsatisfactorily under low utility thresholds, leaving room for improvement in running efficiency and pruning capacity. Motivated by this, we propose an effective and efficient method called IDHUP, which uses an indexed partitioned utility list structure and four pruning strategies. With the proposed data structure, IDHUP can not only dynamically update the utility values of patterns but also avoid visiting patterns that do not occur. Moreover, to further exclude ineligible patterns and avoid unnecessary exploration, we put forward the remaining utility reducing strategy and three other revised pruning strategies. Experiments on various datasets demonstrate that IDHUP achieves the best running time among the compared state-of-the-art algorithms.
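To make the threshold condition in the abstract concrete, the following is a minimal brute-force sketch of high-utility pattern discovery on a toy database before and after an increment. It is not the IDHUP algorithm: it uses no utility lists and no pruning, and the data, the function names (`itemset_utility`, `high_utility_itemsets`), and the convention that each transaction maps an item to its already-computed utility are all assumptions made for illustration.

```python
from itertools import combinations


def itemset_utility(itemset, database):
    """Sum the utility of `itemset` over the transactions containing all its items.

    Assumption: each transaction is a dict mapping item -> utility of that
    item in the transaction (e.g. quantity * unit profit, already multiplied).
    """
    items = set(itemset)
    total = 0
    for transaction in database:
        if items.issubset(transaction):  # iterating a dict yields its keys
            total += sum(transaction[item] for item in items)
    return total


def high_utility_itemsets(database, min_util):
    """Enumerate every itemset and keep those with utility >= min_util.

    Exhaustive enumeration is exponential in the number of items; it is
    shown only to illustrate the definition, not as a practical method.
    """
    all_items = sorted({item for transaction in database for item in transaction})
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            utility = itemset_utility(candidate, database)
            if utility >= min_util:
                result[candidate] = utility
    return result


# Hypothetical toy data: an original database and one inserted transaction.
original = [
    {"a": 5, "b": 2},
    {"a": 10, "c": 1},
]
increment = [
    {"a": 5, "b": 4, "c": 3},
]

before = high_utility_itemsets(original, min_util=15)
after = high_utility_itemsets(original + increment, min_util=15)
print(before)  # {('a',): 15}
print(after)   # {('a',): 20, ('a', 'b'): 16, ('a', 'c'): 19}
```

In this toy example the itemsets ('a', 'b') and ('a', 'c') fall below the threshold in the original database but reach it once the new transaction is inserted, which is the kind of change an incremental method has to track. The sketch simply re-mines the merged database from scratch each time.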
ISSN: 1607-9264, 2079-4029
DOI: 10.53106/160792642023012401013