biblioverlap: an R package for document matching across bibliographic datasets

Bibliographic details
Published in:	Scientometrics, 2024, Vol. 129 (7), pp. 4513-4527
Main authors:	Vieira, Gabriel Alves; Leta, Jacqueline
Format:	Article
Language:	English
Subjects:	Bibliographies; Comparative studies; Computer Science; Data analysis; Datasets; Information sources; Information Storage and Retrieval; Library Science; Matching; Scientometrics
Online access:	Full text

Description
Abstract:	Bibliographic databases have long been a cornerstone of scientometrics research, and new information sources have prompted several comparative studies between them. Such studies often employ document-level matching procedures to identify overlaps in the corpus of each database and assess their coverage. However, despite its growing relevance in comparative studies, this type of analysis still lacks an open-source tool to automate it. To fill this gap, we have developed an R package called biblioverlap, which implements a hybrid matching approach using a unique identifier and a selection of ubiquitous bibliographic fields to establish document co-occurrence. It supports data analysis from a broad range of secondary sources and can be used for comparing databases and assessing document overlap in virtually any bibliographic dataset, which can be insightful for various research questions. This paper presents the biblioverlap tool, details the matching procedure’s implementation, and uses an example dataset containing records from the Federal University of Rio de Janeiro to illustrate the package’s built-in functionality.
ISSN:	0138-9130 1588-2861
DOI:	10.1007/s11192-024-05065-5
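
Illustration: The abstract describes a hybrid matching approach that combines a unique identifier with common bibliographic fields. The Python sketch below shows the general idea only and is not the biblioverlap API. It assumes the identifier is a DOI and that the fallback fields are title and year, compared exactly after normalization. The function names (`normalize`, `clean_doi`, `field_key`, `find_overlap`) and the record layout are hypothetical, and the package's own field selection and scoring may differ.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def clean_doi(record):
    """Return a lowercase DOI without resolver prefix, or "" if absent."""
    doi = (record.get("doi") or "").strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)


def field_key(record):
    """Fallback key built from assumed fields: normalized title and year."""
    return (normalize(record.get("title")), str(record.get("year") or ""))


def find_overlap(dataset_a, dataset_b):
    """Return (index_a, index_b, method) tuples for records present in both."""
    matches, used_a, used_b = [], set(), set()

    # Pass 1: match on the unique identifier.
    doi_index = {}
    for j, rec in enumerate(dataset_b):
        doi = clean_doi(rec)
        if doi and doi not in doi_index:
            doi_index[doi] = j
    for i, rec in enumerate(dataset_a):
        j = doi_index.get(clean_doi(rec))
        if j is not None and j not in used_b:
            matches.append((i, j, "identifier"))
            used_a.add(i)
            used_b.add(j)

    # Pass 2: match the remaining records on bibliographic fields.
    field_index = {}
    for j, rec in enumerate(dataset_b):
        key = field_key(rec)
        if j not in used_b and key[0]:
            field_index.setdefault(key, []).append(j)
    for i, rec in enumerate(dataset_a):
        if i in used_a:
            continue
        for j in field_index.get(field_key(rec), []):
            doi_a, doi_b = clean_doi(rec), clean_doi(dataset_b[j])
            if j in used_b or (doi_a and doi_b and doi_a != doi_b):
                continue  # already taken, or the identifiers disagree
            matches.append((i, j, "fields"))
            used_b.add(j)
            break
    return matches


# Made-up records for illustration only.
source_a = [
    {"doi": "10.1/ABC", "title": "First paper", "year": 2020},
    {"title": "Análise de Dados: um estudo", "year": 2021},
]
source_b = [
    {"doi": "10.2/xyz", "title": "Analise de dados - um estudo", "year": 2021},
    {"doi": "https://doi.org/10.1/abc", "title": "First paper (preprint)", "year": 2020},
]
print(find_overlap(source_a, source_b))
# [(0, 1, 'identifier'), (1, 0, 'fields')]
```

Running the identifier pass first keeps the field comparison as a fallback, so it only decides cases where an identifier is missing on at least one side. Exact comparison of normalized fields is the simplest possible choice here; a similarity threshold would tolerate small spelling differences at the cost of possible false matches.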