Algorithmic identification of Ph.D. thesis-related publications: a proof-of-concept study
In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included pub...
Gespeichert in:
Veröffentlicht in: | Scientometrics 2022-10, Vol.127 (10), p.5863-5877 |
---|---|
1. Verfasser: | |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included publications, which we match with records in the Scopus database. We then test supervised classification methods on the task of identifying the correct associated publications among high numbers of potential candidates using features of the thesis and publication records. The results indicate that this approach results in good match quality in general and with the best results attained by the “random forest” classification algorithm. |
---|---|
ISSN: | 0138-9130 1588-2861 |
DOI: | 10.1007/s11192-022-04480-w |