Similarity measures for document mapping: A comparative study on the level of an individual scientist

Bibliographic Details
Published in: Scientometrics 2009, Vol. 78 (1), p. 113-130
Main authors: Sternitzke, Christian; Bergmann, Isumo
Format: Article
Language: English
Description
Summary: This paper investigates the utility of the Inclusion Index, the Jaccard Index and the Cosine Index for calculating similarities of documents, as used for mapping science and technology. It is shown that, provided that the same content is searched across various documents, the Inclusion Index generally delivers more exact results, in particular when computing the degree of similarity based on citation data. In addition, various methodologies such as co-word analysis, Subject-Action-Object (SAO) structures, bibliographic coupling, co-citation analysis, and self-citation links are compared. We find that the first two tend to describe semantic similarities, which differ from the knowledge flows expressed by the citation-based methodologies.
ISSN: 0138-9130, 1588-2861
DOI: 10.1007/s11192-007-1961-z
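
The three indices named in the summary can be illustrated with a minimal Python sketch. It assumes the common set-based definitions (Jaccard as intersection over union, Cosine as intersection over the geometric mean of the set sizes, Inclusion as intersection over the smaller set size); the exact formulations used in the article may differ. The function names and the sample reference lists are hypothetical and are not taken from the article.

```python
from math import sqrt


def jaccard_index(a, b):
    """Shared items divided by all distinct items in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_index(a, b):
    """Shared items divided by the geometric mean of the two set sizes."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def inclusion_index(a, b):
    """Shared items divided by the size of the smaller set."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical reference lists of two documents (assumed data for illustration).
doc_a = {"ref1", "ref2", "ref3", "ref4"}
doc_b = {"ref1", "ref2"}

for name, measure in [("Jaccard", jaccard_index),
                      ("Cosine", cosine_index),
                      ("Inclusion", inclusion_index)]:
    print(f"{name}: {measure(doc_a, doc_b):.3f}")
```

In this made-up case the shorter reference list is fully contained in the longer one, so the Inclusion Index reaches 1.0 while the Jaccard and Cosine values are lowered by the difference in list length.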