Comparing Two Partitions of Non-Equal Sets of Units

Metodološki zvezki - Advances in Methodology and Statistics, Vol. 15, No. 1, 2018, 1-21. Available at http://ibmi.mf.uni-lj.si/mz/2018/no-1/Cugmas2018.pdf Rand (1971) proposed what has since become a well-known index for comparing two partitions obtained on the same set of units. The index takes...

Detailed description

Bibliographic details
Main authors: Cugmas, Marjan; Ferligoj, Anuška
Format: Article
Language: English
Description
Summary: Metodološki zvezki - Advances in Methodology and Statistics, Vol. 15, No. 1, 2018, 1-21. Available at http://ibmi.mf.uni-lj.si/mz/2018/no-1/Cugmas2018.pdf Rand (1971) proposed what has since become a well-known index for comparing two partitions obtained on the same set of units. The index takes a value in the interval from 0 to 1, where a higher value indicates more similar partitions. The Rand index is symmetric in the sense that both the splitting and the merging of clusters lower its value. Sometimes, e.g. when the units are observed in two time periods, the splitting and merging of clusters should be treated differently, depending on how the stability of clusters is operationalized. In such a non-symmetric case, one of the Wallace indices (Wallace, 1983) can be used. Further, there are several cases in which one wants to compare two partitions obtained on different sets of units whose intersection is non-empty. In this instance, units that are new and units that leave the clusters of the first partition can be treated as a factor lowering the value of the index. A modified Rand index is therefore presented. Because the splitting and merging of clusters have to be treated differently in some situations, an asymmetric modified Wallace index is also proposed. For all presented indices, a correction for chance is described, which allows different values of a selected index to be compared.
DOI:10.48550/arxiv.1805.07996
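
To make the contrast between the symmetric Rand index and the asymmetric Wallace index in the summary concrete, here is a minimal Python sketch of the classical pair-counting definitions for two partitions of the same set of units. It does not implement the modified indices or the correction for chance proposed in the article, whose formulas are not given in this record. The function names and the representation of a partition as a dict from unit to cluster label are assumptions made for illustration.

```python
from itertools import combinations


def pair_counts(u, v):
    """Classify every pair of units by co-membership in partitions u and v.

    u and v are dicts mapping unit -> cluster label (an illustrative
    representation, not taken from the article). Returns (a, b, c, d):
      a = pairs together in both partitions
      b = pairs together in u only
      c = pairs together in v only
      d = pairs separated in both partitions
    """
    if set(u) != set(v):
        raise ValueError("classical indices need the same set of units")
    a = b = c = d = 0
    for x, y in combinations(list(u), 2):
        same_u = u[x] == u[y]
        same_v = v[x] == v[y]
        if same_u and same_v:
            a += 1
        elif same_u:
            b += 1
        elif same_v:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(u, v):
    """Share of unit pairs on which the two partitions agree."""
    a, b, c, d = pair_counts(u, v)
    total = a + b + c + d
    if total == 0:
        raise ValueError("at least two units are required")
    return (a + d) / total


def wallace_index(u, v):
    """Share of pairs together in u that are still together in v."""
    a, b, _, _ = pair_counts(u, v)
    if a + b == 0:
        raise ValueError("u has no pair of units in the same cluster")
    return a / (a + b)


# Cluster "A" of the first partition splits into "X" and "Y" in the second.
first = {1: "A", 2: "A", 3: "A", 4: "B"}
second = {1: "X", 2: "X", 3: "Y", 4: "Z"}

print(rand_index(first, second))     # symmetric: same value with arguments swapped
print(wallace_index(first, second))  # lowered by the split of "A"
print(wallace_index(second, first))  # read in this direction the change is a merge
```

In this toy example the split lowers the Rand index in either argument order, whereas the Wallace index is lowered only in the direction from the first partition to the second, which is the kind of asymmetry the summary refers to.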