Log-Likelihood-Based Pseudo-R² in Logistic Regression: Deriving Sample-Sensitive Benchmarks

Bibliographic Details
Published in: Sociological Methods & Research 2018-08, Vol. 47 (3), p. 507
Main authors: Hemmert, Giselmar A. J.; Schons, Laura M.; Wieseke, Jan; Schimmelpfennig, Heiko
Format: Article
Language: English
Description
Abstract: The literature proposes numerous so-called pseudo-R² measures for evaluating "goodness of fit" in regression models with categorical dependent variables. Unlike the ordinary least squares R², log-likelihood-based pseudo-R²s do not represent the proportion of explained variance but rather the improvement in model likelihood over a null model. The multitude of available pseudo-R² measures and the absence of benchmarks often lead to confusing interpretations and unclear reporting. Drawing on a meta-analysis of 274 published logistic regression models as well as simulated data, this study investigates fundamental differences among distinct pseudo-R² measures, focusing on their dependence on basic study design characteristics. Results indicate that almost all pseudo-R²s are influenced to some extent by sample size, number of predictor variables, and number of categories of the dependent variable and its distribution asymmetry. Hence, an interpretation by goodness-of-fit benchmark values must explicitly consider these characteristics. The authors derive a set of goodness-of-fit benchmark values with respect to ranges of sample size and distribution of observations for this measure. This study raises awareness of fundamental differences in characteristics of pseudo-R²s and the need for greater precision in reporting these measures.
ISSN: 0049-1241
DOI: 10.1177/0049124116638107
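
Illustration: the abstract describes log-likelihood-based pseudo-R² measures as the improvement in model likelihood over a null model. The sketch below is not taken from the article. It shows how three widely used measures of this kind (McFadden, Cox and Snell, Nagelkerke) can be computed from a fitted model's log-likelihood, the null model's log-likelihood, and the sample size, using their standard textbook definitions. The function name `pseudo_r2` and its parameter names are hypothetical, chosen only for this example.

```python
import math


def pseudo_r2(ll_model: float, ll_null: float, n: int) -> dict:
    """Return three log-likelihood-based pseudo-R² values.

    ll_model: log-likelihood of the fitted model
    ll_null:  log-likelihood of the intercept-only (null) model
    n:        number of observations
    (Hypothetical helper; the names are assumptions for this sketch.)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if ll_null >= 0:
        raise ValueError("ll_null must be negative for a categorical outcome")

    # McFadden: relative reduction in log-likelihood versus the null model.
    mcfadden = 1.0 - ll_model / ll_null

    # Cox and Snell: based on the likelihood ratio, scaled by sample size.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

    # Nagelkerke: Cox and Snell divided by its maximum attainable value,
    # which is reached when the fitted log-likelihood equals 0.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell

    return {
        "mcfadden": mcfadden,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
    }


if __name__ == "__main__":
    # Made-up numbers: 100 observations, balanced binary outcome,
    # so the null log-likelihood is 100 * ln(0.5).
    print(pseudo_r2(ll_model=-50.0, ll_null=100 * math.log(0.5), n=100))
```

Because the upper bound of the Cox and Snell measure depends on the null log-likelihood, its attainable range changes with the distribution of the dependent variable. This is one simple way to see why the same numeric value can mean different things across study designs.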