MCHT: A maximal clique and hash table-based maximal prevalent co-location pattern mining algorithm

Bibliographic details
Published in:	Expert systems with applications 2021-08, Vol.175, p.114830, Article 114830
Main authors:	Tran, Vanha; Wang, Lizhen; Chen, Hongmei; Xiao, Qing
Format:	Article
Language:	English
Keywords:	Algorithms; Boolean algebra; Computing time; Data mining; Datasets; Hash table; Maximal clique; Maximal co-location pattern; Model testing; Pattern analysis; Spatial data; Spatial data mining
Online access:	Full text

Description
Abstract:	• A novel maximal prevalent co-location pattern mining framework is presented. • Time and space costs are reduced by using maximal cliques and hash tables. • Maximal clique enumeration is accelerated by bit string operations. • The performance of the proposed method is demonstrated by comparative experiments. Co-location patterns are subsets of Boolean spatial features whose instances frequently appear near one another in geographic space. Maximal co-location patterns are a compact representation of these patterns that makes it easier for users to absorb the results and draw meaningful inferences. Current algorithms for maximal co-location pattern mining are based on a generate-and-test candidate model. Most of this model's execution time is spent collecting the co-location instances of candidates, so discovering maximal co-location patterns remains very challenging when the data are large and/or dense. To address this challenge, this study develops a novel maximal co-location pattern mining framework based on maximal cliques and hash tables (MCHT). First, all maximal cliques, which compactly represent the neighbor relationships between instances of a spatial data set, are enumerated; bit string operations are used to speed up this enumeration. Next, a participating instance hash table is constructed from these maximal cliques. Information about the co-location instances of maximal patterns can then be queried and collected efficiently from the hash table. After that, the participation index of each pattern is calculated to measure its prevalence, and the maximal prevalent co-location patterns are selected. Finally, a series of experiments on both synthetic and real facility data sets demonstrates that the proposed algorithm reduces both computational time and memory consumption compared with existing algorithms.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2021.114830
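
Illustrative sketch (not from the article): the pipeline summarized in the abstract can be tried on a toy data set roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation. The neighbor rule (Euclidean distance at most `d`, only between instances of different features), the layout of the hash table, the brute-force candidate enumeration, and all function names are assumptions made for illustration. The participation index follows the standard definition: the minimum, over the features in a pattern, of the fraction of that feature's instances that take part in at least one co-location instance of the pattern.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import dist

def bits(mask):
    """Yield the indices of the set bits of an integer bit string."""
    while mask:
        low = mask & -mask
        yield low.bit_length() - 1
        mask ^= low

def neighbor_bitsets(points, d):
    """points: list of (feature, x, y). Bit j of nbrs[i] is set when i and j
    have different features and lie within distance d (assumed neighbor rule)."""
    nbrs = [0] * len(points)
    for i, j in combinations(range(len(points)), 2):
        if points[i][0] != points[j][0] and dist(points[i][1:], points[j][1:]) <= d:
            nbrs[i] |= 1 << j
            nbrs[j] |= 1 << i
    return nbrs

def maximal_cliques(nbrs):
    """Bron-Kerbosch with pivoting; vertex sets are integer bit strings."""
    cliques = []
    def expand(r, p, x):
        if p == 0:
            if x == 0:
                cliques.append(r)  # nothing can extend r, so r is maximal
            return
        pivot = max(bits(p | x), key=lambda v: bin(p & nbrs[v]).count("1"))
        for v in bits(p & ~nbrs[pivot]):
            expand(r | (1 << v), p & nbrs[v], x & nbrs[v])
            p &= ~(1 << v)
            x |= 1 << v
    expand(0, (1 << len(nbrs)) - 1, 0)
    return cliques

def build_hash_table(points, cliques):
    """Assumed layout: feature set of a clique -> feature -> instance ids."""
    table = defaultdict(lambda: defaultdict(set))
    for c in cliques:
        members = list(bits(c))
        key = frozenset(points[i][0] for i in members)
        for i in members:
            table[key][points[i][0]].add(i)
    return table

def participation_index(pattern, table, feature_counts):
    """Minimum participation ratio over the features of the pattern."""
    pattern = frozenset(pattern)
    participating = {f: set() for f in pattern}
    for key, by_feature in table.items():
        if pattern <= key:  # cliques covering the pattern hold its instances
            for f in pattern:
                participating[f] |= by_feature[f]
    return min(len(participating[f]) / feature_counts[f] for f in pattern)

def maximal_prevalent(table, feature_counts, min_prev):
    """Brute-force candidate enumeration; only suitable for tiny examples."""
    candidates = {frozenset(s) for key in table
                  for k in range(2, len(key) + 1)
                  for s in combinations(sorted(key), k)}
    result = []
    for c in sorted(candidates, key=lambda s: (-len(s), sorted(s))):
        if any(c <= m for m in result):
            continue  # contained in a larger prevalent pattern: not maximal
        if participation_index(c, table, feature_counts) >= min_prev:
            result.append(c)
    return result

# Toy data: two instances each of the hypothetical features A, B and C.
points = [("A", 0, 0), ("B", 1, 0), ("C", 0, 1),
          ("A", 10, 0), ("B", 11, 0), ("C", 20, 20)]
counts = Counter(f for f, *_ in points)
cliques = maximal_cliques(neighbor_bitsets(points, d=1.5))
table = build_hash_table(points, cliques)
print([sorted(p) for p in maximal_prevalent(table, counts, 0.6)])  # [['A', 'B']]
```

Representing each vertex set as a single Python integer lets the clique search intersect candidate sets with one `&` operation, which is the kind of saving the abstract attributes to bit string operations. Looking patterns up by the feature sets of maximal cliques avoids materializing every co-location instance of every candidate.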