Measuring Quality of Wikipedia Articles by Feature Fusion‐based Stack Learning
Published in: | Proceedings of the ASIST Annual Meeting 2021, Vol.58 (1), p.206-217 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Online open‐source knowledge repositories such as Wikipedia have become an increasingly important source for users to access knowledge. However, because of Wikipedia's large volume, it is challenging to evaluate article quality manually. To fill this gap, we propose a novel approach named “feature fusion‐based stack learning” to assess the quality of Wikipedia articles. Pre‐trained language models, including BERT (Bidirectional Encoder Representations from Transformers) and ELMo (Embeddings from Language Models), are applied to extract semantic information from Wikipedia content. A feature fusion framework consisting of semantic and statistical features is built and fed into an out‐of‐sample (OOS) stacking model, which includes both machine learning and deep learning models. We compare the performance of the proposed model extensively against existing models on several metrics, and conduct ablation studies to demonstrate the effectiveness of our framework and of OOS stacking. Overall, the experiments show that our method outperforms state‐of‐the‐art models. |
---|---|
ISSN: | 2373-9231, 1550-8390 |
DOI: | 10.1002/pra2.449 |
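
The abstract describes two steps: fusing semantic and statistical features, then feeding them into an out-of-sample (OOS) stacking model. The record gives no implementation details, so the following is only a minimal sketch of the general technique under stated assumptions. Fusion is assumed to be plain concatenation, the two base learners are toy stand-ins (not BERT, ELMo, or the paper's models), and every name here (`fuse`, `centroid_trainer`, `stump_trainer`, `out_of_sample_scores`, `fit_stack`) is hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Model = Callable[[Vector], float]                      # features -> score in [0, 1]
Trainer = Callable[[List[Vector], List[int]], Model]   # (X, y) -> fitted model


def fuse(semantic: Sequence[float], statistical: Sequence[float]) -> Vector:
    """Fuse two feature groups by concatenation (an assumed fusion rule)."""
    return list(semantic) + list(statistical)


def centroid_trainer(X: List[Vector], y: List[int]) -> Model:
    """Toy base learner: relative squared distance to the two class centroids."""
    def centroid(label: int) -> Vector:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def predict(x: Vector) -> float:
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

    return predict


def stump_trainer(j: int) -> Trainer:
    """Toy base learner: threshold feature j midway between the class means."""
    def train(X: List[Vector], y: List[int]) -> Model:
        def mean(label: int) -> float:
            vals = [x[j] for x, t in zip(X, y) if t == label]
            return sum(vals) / len(vals)

        m0, m1 = mean(0), mean(1)
        threshold = (m0 + m1) / 2
        sign = 1.0 if m1 >= m0 else -1.0
        return lambda x: 1.0 if sign * (x[j] - threshold) > 0 else 0.0

    return train


def out_of_sample_scores(trainers: List[Trainer], X: List[Vector],
                         y: List[int], k: int = 3) -> List[Vector]:
    """Score each sample with base models that never saw it during training."""
    order = sorted(range(len(X)), key=lambda i: y[i])   # group indices by label
    folds = [order[f::k] for f in range(k)]             # round-robin, roughly stratified
    meta = [[0.0] * len(trainers) for _ in X]
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for m, trainer in enumerate(trainers):
            model = trainer(X_tr, y_tr)
            for i in held_out:
                meta[i][m] = model(X[i])
    return meta


def fit_stack(trainers: List[Trainer], X: List[Vector],
              y: List[int], k: int = 3) -> Callable[[Vector], int]:
    """Train a meta-learner on OOS scores, then refit base models on all data."""
    meta_model = centroid_trainer(out_of_sample_scores(trainers, X, y, k), y)
    base_models = [t(X, y) for t in trainers]

    def predict(x: Vector) -> int:
        scores = [m(x) for m in base_models]
        return 1 if meta_model(scores) >= 0.5 else 0

    return predict


# Synthetic data: 2 "semantic" dimensions plus 1 "statistical" feature per article.
X = ([fuse([0.1 * i, 0.05 * i], [1.0 + 0.1 * i]) for i in range(6)]
     + [fuse([2.0 + 0.1 * i, 2.0 + 0.05 * i], [5.0 + 0.1 * i]) for i in range(6)])
y = [0] * 6 + [1] * 6

trainers = [centroid_trainer, stump_trainer(2)]
meta = out_of_sample_scores(trainers, X, y, k=3)
predict = fit_stack(trainers, X, y, k=3)
print(predict(fuse([0.2, 0.1], [1.2])), predict(fuse([2.2, 2.1], [5.2])))
```

The point of the out-of-sample step is that the meta-learner is trained only on scores the base models produced for articles they were not fitted on, which keeps it from simply trusting overfitted base predictions.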