A lead-lag analysis of the topic evolution patterns for preprints and publications


Bibliographic details
Published in: Journal of the Association for Information Science and Technology 2015-12, Vol. 66 (12), p. 2643-2656
Main authors: Hu, Beibei; Dong, Xianlei; Zhang, Chenwei; Bowman, Timothy D.; Ding, Ying; Milojević, Staša; Ni, Chaoqun; Yan, Erjia; Larivière, Vincent
Format: Article
Language: eng
Description
Summary: This study applied latent Dirichlet allocation (LDA) and regression analysis to conduct a lead-lag analysis identifying different topic evolution patterns between preprints from arXiv and papers from the Web of Science (WoS) in astrophysics over a 20-year period (1992–2011). Fifty topics in arXiv and WoS were generated using an LDA algorithm, and regression models were then used to explain four types of topic growth patterns. Based on the slopes of the fitted curves, the paper redefines topic trends and popularity. Results show that arXiv and WoS share similar topics in a given domain but differ in their evolution trends. Topics in WoS lose their popularity much earlier than those in arXiv, and their durations of popularity are shorter. This work demonstrates that open access preprints have a stronger growth tendency than traditional printed publications.
ISSN: 2330-1635, 2330-1643
DOI: 10.1002/asi.23347