Deep-profiling: a deep neural network model for scholarly Web user profiling
Saved in:
Published in: Cluster computing 2023-06, Vol. 26 (3), p. 1753-1766
Authors: , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Scholarly big data refer to the rapidly growing body of scholarly information, including a large number of authors, papers, and massive-scale scholarly networks. Extracting profile attributes for Web users is an important step in Web user analysis. For scholarly Web users, profile attribute extraction should integrate multi-source, heterogeneous information resources. However, traditional extraction models have two main drawbacks: (1) they require manual feature selection based on specific domain knowledge; (2) they cannot adapt to the diversity of scholarly Web pages and cannot discover relationships between target entities that lie far apart in different domains. To address these issues, we propose a profile attribute extraction model, PAE-NN, based on a Bi-LSTM-CRF neural network. The model automatically learns the characteristics and contextual representations of each extracted entity through a recurrent neural network with end-to-end training. It exploits the long-memory sequence characteristics of the LSTM network to effectively capture long-term dependencies among extracted entities. Our experimental results on published datasets from the SMPCUP2017 Open Academic Competition and Aminer demonstrate that the proposed PAE-NN model outperforms existing models in extraction precision, recall, and F1-score with large-scale training data.
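To make the Bi-LSTM-CRF idea concrete: the Bi-LSTM layer produces a per-token score for each candidate tag, and the CRF layer adds tag-to-tag transition scores so that the decoded sequence respects label structure (e.g., an I tag cannot follow O in a BIO scheme). The paper itself does not publish code; the snippet below is a minimal, hypothetical sketch of only the CRF decoding step (Viterbi), with hard-coded emission scores standing in for Bi-LSTM outputs and an invented 3-tag scheme (O, B-ATTR, I-ATTR).

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   (T, K) per-token tag scores; in a Bi-LSTM-CRF these come
                 from the Bi-LSTM layer (here they are made-up numbers).
    transitions: (K, K) scores where transitions[i, j] scores tag i -> tag j.
    Returns the best tag-index sequence of length T.
    """
    T, K = emissions.shape
    score = emissions[0].copy()            # best score ending in each tag
    backptr = np.zeros((T, K), dtype=int)  # argmax predecessors
    for t in range(1, T):
        # total[i, j] = score[i] + transitions[i, j] + emissions[t, j]
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Trace the best path backwards from the best final tag
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

# Hypothetical BIO tag set: O=0, B-ATTR=1, I-ATTR=2
emissions = np.array([[2.0, 0.5, 0.1],
                      [0.2, 3.0, 0.4],
                      [0.1, 0.3, 2.5],
                      [1.8, 0.2, 0.6]])
transitions = np.array([[ 0.5,  0.2, -5.0],   # O -> O/B/I (O->I penalized)
                        [ 0.1,  0.0,  1.0],   # B -> O/B/I
                        [ 0.3, -0.2,  0.8]])  # I -> O/B/I
path = viterbi_decode(emissions, transitions)
print(path)  # [0, 1, 2, 0] i.e. O, B-ATTR, I-ATTR, O
```

Note how the large negative O→I transition score keeps the decoder from emitting an I-ATTR tag without a preceding B-ATTR, which per-token classification alone cannot guarantee.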
ISSN: 1386-7857, 1573-7543
DOI: 10.1007/s10586-021-03315-2