Deep-profiling: a deep neural network model for scholarly Web user profiling
Saved in:
Published in: Cluster computing 2023-06, Vol. 26 (3), p. 1753-1766
Authors: , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Scholarly big data refer to the rapidly growing body of scholarly information, including a large number of authors, papers, and massive-scale scholarly networks. Extracting profile attributes for Web users is an important step in Web user analysis. For scholarly Web users, profile attribute extraction should integrate multi-source, heterogeneous information resources. However, traditional extraction models have two main drawbacks: (1) they require manual feature selection based on specific domain knowledge; (2) they cannot adapt to the diversity of scholarly Web pages and cannot discover relationships between target entities that lie far apart in different domains. To address these issues, we propose a profile attribute extraction model, PAE-NN, based on a Bi-LSTM-CRF neural network. The model automatically learns the characteristics and contextual representations of each extracted entity through a recurrent neural network with end-to-end training. It exploits the long-memory sequence characteristics of the LSTM network to effectively capture long-term dependencies among extracted entities. Our experimental results on published datasets from the SMPCUP2017 Open Academic Competition and Aminer demonstrate that the proposed PAE-NN model outperforms existing models in extraction precision, recall, and F1-score with large-scale training data.
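To make the Bi-LSTM-CRF idea concrete: the Bi-LSTM layer produces a per-token score for each candidate tag, and the CRF layer adds tag-to-tag transition scores so that the decoded sequence respects label structure (e.g., an I tag cannot follow O in a BIO scheme). The paper itself does not publish code; the snippet below is a minimal, hypothetical sketch of only the CRF decoding step (Viterbi), with hard-coded emission scores standing in for Bi-LSTM outputs and an invented 3-tag scheme (O, B-ATTR, I-ATTR).

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   (T, K) per-token tag scores; in a Bi-LSTM-CRF these come
                 from the Bi-LSTM layer (here they are made-up numbers).
    transitions: (K, K) scores where transitions[i, j] scores tag i -> tag j.
    Returns the best tag-index sequence of length T.
    """
    T, K = emissions.shape
    score = emissions[0].copy()            # best score ending in each tag
    backptr = np.zeros((T, K), dtype=int)  # argmax predecessors
    for t in range(1, T):
        # total[i, j] = score[i] + transitions[i, j] + emissions[t, j]
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Trace the best path backwards from the best final tag
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

# Hypothetical BIO tag set: O=0, B-ATTR=1, I-ATTR=2
emissions = np.array([[2.0, 0.5, 0.1],
                      [0.2, 3.0, 0.4],
                      [0.1, 0.3, 2.5],
                      [1.8, 0.2, 0.6]])
transitions = np.array([[ 0.5,  0.2, -5.0],   # O -> O/B/I (O->I penalized)
                        [ 0.1,  0.0,  1.0],   # B -> O/B/I
                        [ 0.3, -0.2,  0.8]])  # I -> O/B/I
path = viterbi_decode(emissions, transitions)
print(path)  # [0, 1, 2, 0] i.e. O, B-ATTR, I-ATTR, O
```

Note how the large negative O→I transition score keeps the decoder from emitting an I-ATTR tag without a preceding B-ATTR, which per-token classification alone cannot guarantee.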
ISSN: 1386-7857, 1573-7543
DOI: 10.1007/s10586-021-03315-2