TEAGS: time-aware text embedding approach to generate subgraphs

Bibliographic details
Published in: Data mining and knowledge discovery 2020-07, Vol. 34 (4), p. 1136-1174
Main authors: Hosseini, Saeid, Najafipour, Saeed, Cheung, Ngai-Man, Yin, Hongzhi, Kangavari, Mohammad Reza, Zhou, Xiaofang
Format: Article
Language: English
Description
Summary: Contagions (e.g., viruses and gossip) spread over the nodes of propagation graphs. The temporal-textual contents of nodes can be used to compute edge weights and generate subgraphs of highly relevant nodes, which benefits many applications. Yet challenges abound. First, the propagation pattern between each pair of nodes may change over time. Second, the same contagion does not always propagate. Hence, current text mining approaches, including topic modeling, cannot effectively compute the edge weights. Third, since propagation is affected by time, word–word co-occurrence patterns may differ across temporal dimensions, which adversely impacts the performance of word embedding approaches. We argue that multi-aspect temporal dimensions (hour, day, etc.) should be considered to better calculate the correlation weights between nodes. In this work, we devise a novel framework that, on the one hand, integrates a time-aware word embedding component to construct word vectors through multiple temporal facets and, on the other hand, uses a time-only multi-facet generative model to compute the weights. Subsequently, we propose a Max-Heap Graph cutting algorithm to generate subgraphs. We validate our model through experiments on real-world datasets. The results show that our model generates subgraphs more effectively than rival methods and that temporal dynamics must be respected when modeling dynamical processes.
ISSN: 1384-5810, 1573-756X
DOI: 10.1007/s10618-020-00688-7
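
Illustrative sketch: the summary mentions generating subgraphs from weighted edges with a Max-Heap Graph cutting algorithm but gives no details of it. The Python snippet below is therefore not the authors' algorithm. It is a minimal sketch of one generic way a max-heap can drive graph cutting, under these assumptions: edge weights are already computed (here they are made-up numbers), edges are visited from heaviest to lightest, every edge below a hypothetical `min_weight` threshold is cut, and the connected groups that remain are returned as subgraphs. The function name `heap_cut_subgraphs` and its parameters are invented for illustration.

```python
import heapq
from collections import defaultdict


def heap_cut_subgraphs(edges, min_weight):
    """Return node groups left after cutting edges lighter than min_weight.

    edges: iterable of (u, v, weight) tuples (weights assumed precomputed).
    min_weight: hypothetical cut threshold, not taken from the paper.
    """
    # heapq is a min-heap, so weights are negated to pop the heaviest edge
    # first. The index i breaks ties so node labels are never compared.
    heap = [(-w, i, u, v) for i, (u, v, w) in enumerate(edges)]
    heapq.heapify(heap)

    parent = {}  # union-find forest over the nodes

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Register every node so that fully cut-off nodes appear as singletons.
    for _, _, u, v in heap:
        find(u)
        find(v)

    while heap:
        neg_w, _, u, v = heapq.heappop(heap)
        if -neg_w < min_weight:
            break  # every remaining edge is lighter, so all of them are cut
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # keep this edge: merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Made-up example weights, for illustration only.
example_edges = [("a", "b", 0.9), ("b", "c", 0.8), ("c", "d", 0.2), ("d", "e", 0.7)]
print(heap_cut_subgraphs(example_edges, min_weight=0.5))
# prints [['a', 'b', 'c'], ['d', 'e']]
```

In this sketch the heap only fixes the order in which edges are examined. The actual framework described in the summary derives its weights from time-aware word embeddings and a multi-facet generative model, neither of which is modeled here.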