Globally Homogenous Mixture Components and Local Heterogeneity of Rank Data

Bibliographic details
Published in:	arXiv.org 2016-08
Main author:	Choulakian, V
Format:	Article
Language:	English
Subjects:	Coding; Data analysis; Homogeneity; Masking; Outliers (statistics); Taxicabs; Voters
Online access:	Full text

Description
Summary:	Traditional methods for finding the mixture components of rank data are mostly based on distance and latent class models. These models may mask small groups, probably because of the spherical nature of rank data. Our approach diverges from the traditional methods: it is directional and uses a logical principle, the law of contradiction. We discuss the concept of a mixture for rank data essentially in terms of the global homogeneity of its group components. Local heterogeneities may appear once the group components of the mixture have been discovered. This is done via exploratory analysis of rank data by taxicab correspondence analysis with the nega coding: if the first factor is an affine function of the Borda count, then we say that the rank data are globally homogenous, and local heterogeneities may appear on the subsequent factors; otherwise, the rank data are either globally homogenous with outliers or a mixture of globally homogenous groups. We also introduce a new coefficient of global homogeneity, GHC. GHC is based on the first taxicab dispersion measure and takes values between 0 and 100%, so it is easily interpretable. GHC measures the extent to which voters' scores cross between the blocks of a two- or three-block seriation of the items, where the Borda count statistic provides a consensus ordering of the items on the first axis. Examples are provided. Key words: preferences; rankings; Borda count; global homogeneity coefficient; nega coding; law of contradiction; mixture; outliers; taxicab correspondence analysis; masking.
ISSN:	2331-8422
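
Illustration: the summary relies on the Borda count as the consensus ordering of the items. The following is a minimal Python sketch of the textbook Borda count only, not of the article's taxicab correspondence analysis, nega coding, or GHC. It assumes each voter gives a complete ranking of the same d items, and that the most preferred item receives d - 1 points and the least preferred 0; this scoring convention, the function names, and the sample ballots are illustrative assumptions and do not come from the record.

```python
from typing import Dict, List, Sequence


def borda_scores(rankings: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Sum Borda points per item.

    Each ranking lists the items from most to least preferred.
    Assumed convention: with d items, position 0 earns d - 1 points
    and the last position earns 0 points.
    """
    scores: Dict[str, int] = {}
    for ranking in rankings:
        d = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (d - 1 - position)
    return scores


def consensus_order(scores: Dict[str, int]) -> List[str]:
    """Order items by decreasing Borda score; ties are broken by item name."""
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical ballots from three voters over items A, B, C.
ballots = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "A", "C"],
]
scores = borda_scores(ballots)
order = consensus_order(scores)
print(scores)
print(order)
```

In the summary's terms, rank data would be called globally homogenous when the first factor of the analysis is an affine function of scores like these; the sketch computes only the scores and does not test that property.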