biblioverlap: an R package for document matching across bibliographic datasets
Published in: | Scientometrics 2024, Vol. 129 (7), p. 4513-4527 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Bibliographic databases have long been a cornerstone of scientometrics research, and new information sources have prompted several comparative studies between them. Such studies often employ document-level matching procedures to identify overlaps in the corpus of each database and assess their coverage. However, despite being increasingly relevant in comparative studies, this type of analysis still lacks an open-source tool to automate it. To fill this gap, we have developed an R package called biblioverlap, which implements a hybrid matching approach using a unique identifier and a selection of ubiquitous bibliographic fields to establish document co-occurrence. It supports data analysis from a broad range of secondary sources and can be used for comparing databases and assessing document overlap in virtually any bibliographic dataset, which can be insightful for various research questions. This paper presents the biblioverlap tool, details the matching procedure’s implementation, and uses an example dataset containing records from the Federal University of Rio de Janeiro to illustrate the package’s built-in functionality. |
---|---|
ISSN: | 0138-9130, 1588-2861 |
DOI: | 10.1007/s11192-024-05065-5 |
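
The summary describes a hybrid matching approach that combines a unique identifier with common bibliographic fields, but this record does not spell out the rules. The following is only a minimal Python sketch of that general idea, not the biblioverlap implementation or its API. It assumes the identifier is a DOI and that the fallback fields are title and publication year compared after normalization; the names `normalize` and `find_overlap` and the record keys `doi`, `title`, and `year` are hypothetical.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def clean_doi(record):
    """Return the record's DOI in lowercase, or an empty string if missing."""
    return (record.get("doi") or "").strip().lower()


def find_overlap(set_a, set_b):
    """Pair records from two datasets (hypothetical helper).

    Pass 1 matches on the identifier; pass 2 matches the remaining
    records on normalized title plus year. Returns (index_a, index_b) pairs.
    """
    pairs, matched_a, matched_b = [], set(), set()

    # Pass 1: exact match on the unique identifier (assumed to be a DOI).
    doi_index = {}
    for j, rec in enumerate(set_b):
        doi = clean_doi(rec)
        if doi:
            doi_index.setdefault(doi, j)
    for i, rec in enumerate(set_a):
        j = doi_index.get(clean_doi(rec))
        if j is not None and j not in matched_b:
            pairs.append((i, j))
            matched_a.add(i)
            matched_b.add(j)

    # Pass 2: field-based fallback for records still unmatched.
    field_index = {}
    for j, rec in enumerate(set_b):
        title = normalize(rec.get("title"))
        if j not in matched_b and title:
            field_index.setdefault((title, rec.get("year")), j)
    for i, rec in enumerate(set_a):
        title = normalize(rec.get("title"))
        if i in matched_a or not title:
            continue
        j = field_index.get((title, rec.get("year")))
        if j is None or j in matched_b:
            continue
        # Two records that both carry (different) DOIs are not paired.
        if clean_doi(rec) and clean_doi(set_b[j]):
            continue
        pairs.append((i, j))
        matched_a.add(i)
        matched_b.add(j)
    return pairs


db_a = [
    {"doi": "10.1/ABC", "title": "A Study", "year": 2020},
    {"doi": None, "title": "Café Networks: A Review", "year": 2021},
    {"title": "Only in A", "year": 2019},
]
db_b = [
    {"doi": "10.1/abc", "title": "A study (preprint)", "year": 2020},
    {"doi": "10.2/xyz", "title": "Cafe networks - a review", "year": 2021},
    {"title": "Only in B", "year": 2019},
]
print(find_overlap(db_a, db_b))
```

In this toy data the first pair is found through the identifier despite differing titles, and the second through the title and year fallback because one side has no DOI. Exact comparison of normalized fields is a simplification chosen for the sketch; a real tool may well use approximate string comparison instead.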