Stat-DSM: Statistically Discriminative Sub-Trajectory Mining With Multiple Testing Correction

Bibliographic details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2022-03, Vol. 34 (3), p. 1477-1488
Main authors: Le Duy, Vo Nguyen, Sakuma, Takuto, Ishiyama, Taiju, Toda, Hiroki, Arai, Kazuya, Karasuyama, Masayuki, Okubo, Yuta, Sunaga, Masayuki, Hanada, Hiroyuki, Tabei, Yasuo, Takeuchi, Ichiro
Format: Article
Language: English
Description
Summary: We propose a novel statistical approach to evaluate the statistical significance (reliability) of the results from discriminative sub-trajectory mining, which we call Statistically Discriminative Sub-trajectory Mining (Stat-DSM). Given two groups of trajectories, the goal of Stat-DSM is to extract moving patterns in the form of sub-trajectories that occur statistically significantly more often in one group than in the other. An advantage of the proposed method is that the statistical significance of the extracted sub-trajectories is properly controlled, in the sense that the probability of finding a falsely discriminative sub-trajectory is smaller than a specified significance threshold α (e.g., 0.05). This is crucial when the method is used in scientific or social science studies in noisy environments. Finding such statistically discriminative sub-trajectories in a massive trajectory dataset is both computationally and statistically challenging. In the Stat-DSM method, we address these difficulties by introducing a tree representation of sub-trajectories and applying an efficient permutation-based statistical inference method to the tree. To the best of our knowledge, Stat-DSM is the first method that provides a statistical approach to quantify the reliability of discriminative sub-trajectory mining results. We illustrate the effectiveness and scalability of the Stat-DSM method by applying it to a real-world dataset containing 1,000,000 trajectories.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2020.2994344
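
Illustrative sketch (not part of the catalog record or the article): the summary describes controlling the probability of reporting any falsely discriminative sub-trajectory at level α by permutation-based inference. The Python sketch below shows one common way such control can be obtained, under several assumptions that the summary does not confirm: trajectories are simplified to sequences of discrete symbols (e.g., grid cells), each contiguous sub-sequence is scored with a one-sided Fisher exact test, and the correction is a Westfall-Young style min-p permutation procedure. It enumerates patterns by brute force and does not use the tree representation mentioned in the summary. All function names are hypothetical.

```python
import math
import random


def fisher_upper_p(k, x, n1, n):
    """One-sided Fisher exact p-value: P(X >= k) for a hypergeometric X.

    n  = total trajectories, n1 = trajectories in group 1,
    x  = trajectories containing the pattern, k = of those, how many are in group 1.
    """
    total = math.comb(n, x)
    tail = sum(math.comb(n1, i) * math.comb(n - n1, x - i)
               for i in range(k, min(x, n1) + 1))
    return tail / total


def occurrence_sets(trajectories, max_len):
    """Map each contiguous sub-sequence (length <= max_len) to the indices of
    the trajectories that contain it. Brute force, for illustration only."""
    occ = {}
    for idx, traj in enumerate(trajectories):
        for start in range(len(traj)):
            stop = min(start + max_len, len(traj))
            for end in range(start + 1, stop + 1):
                occ.setdefault(tuple(traj[start:end]), set()).add(idx)
    return occ


def pattern_p_values(occ, labels):
    """p-value of every pattern for a given 0/1 labeling of the trajectories."""
    n, n1 = len(labels), sum(labels)
    return {
        pattern: fisher_upper_p(sum(labels[i] for i in idxs), len(idxs), n1, n)
        for pattern, idxs in occ.items()
    }


def significant_patterns(trajectories, labels, alpha=0.05, max_len=3,
                         n_perm=200, seed=0):
    """Return patterns whose p-value beats a permutation-calibrated threshold.

    Assumed procedure (Westfall-Young style): shuffle the group labels,
    record the smallest p-value over all patterns, and use a low order
    statistic of these minima as the corrected significance threshold.
    """
    rng = random.Random(seed)
    occ = occurrence_sets(trajectories, max_len)

    permuted = list(labels)
    null_minima = []
    for _ in range(n_perm):
        rng.shuffle(permuted)
        null_minima.append(min(pattern_p_values(occ, permuted).values()))
    null_minima.sort()

    # At most floor(alpha * n_perm) permutations have a minimum strictly
    # below this value, so rejecting with "p < threshold" keeps the
    # estimated family-wise error rate at or below alpha.
    threshold = null_minima[int(math.floor(alpha * n_perm))]

    observed = pattern_p_values(occ, labels)
    return {p: v for p, v in observed.items() if v < threshold}
```

For example, calling `significant_patterns(trajectories, labels)` with `labels` as a list of 0/1 group indicators returns a dictionary from sub-sequences (as tuples) to their uncorrected p-values, restricted to those that pass the permutation-calibrated threshold. Calibrating one threshold against the minimum p-value over all patterns is what makes the guarantee apply to the whole set of reported patterns rather than to each pattern separately.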