Rank-based approach for estimating correlations in mixed ordinal data

Bibliographic details
Published in: arXiv.org, 2018-09
Main authors: Quan, Xiaoyun; Booth, James G; Wells, Martin T
Format: Article
Language: English
Online access: Full text
Description
Summary: High-dimensional mixed data, combining continuous and ordinal variables, are common in many research areas, such as genomic studies and survey data analysis. Estimating the underlying correlation among mixed data is therefore crucial for inferring dependence structure. We propose a semiparametric latent Gaussian copula model for this problem. We start by estimating the association among ternary-continuous mixed data via a rank-based approach and then generalize the methodology to p-level-ordinal and continuous mixed data. The concentration rate of the estimator is also provided and proved. Finally, we demonstrate the performance of the proposed estimator through extensive simulations and two real-data case studies: algorithmic risk score evaluation and cancer patient survival data.
ISSN: 2331-8422