Generalized Inflated Discrete Models: A Strategy to Work with Multimodal Discrete Distributions

Analysts of discrete data often face the challenge of managing the tendency of inflation on certain values. When treated improperly, such phenomenon may lead to biased estimates and incorrect inferences. This study extends the existing literature on single-value inflated models and develops a genera...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Sociological methods & research 2021-02, Vol.50 (1), p.365-400
Hauptverfasser: Cai, Tianji, Xia, Yiwei, Zhou, Yisu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Analysts of discrete data often face the challenge of managing the tendency of inflation on certain values. When treated improperly, such phenomenon may lead to biased estimates and incorrect inferences. This study extends the existing literature on single-value inflated models and develops a general framework to handle variables with more than one inflated value. To assess the performance of the proposed maximum likelihood estimator, we conducted Monte Carlo experiments under several scenarios for different levels of inflated probabilities under multinomial, ordinal, Poisson, and zero-truncated Poisson outcomes with covariates. We found that ignoring the inflations leads to substantial bias and poor inference of the inflations—not only for the intercept(s) of the inflated categories but other coefficients as well. Specifically, higher values of inflated probabilities are associated with larger biases. By contrast, the generalized inflated discrete models (GIDMs) perform well with unbiased estimates and satisfactory coverages even when the number of parameters that need to be estimated is quite large. We showed that model fit criteria, such as Akaike information criterion, could be used in selecting the appropriate specifications of inflated models. Lastly, the GIDM was implemented using large-scale health survey data as a comparison to conventional modeling approaches such as various Poisson and Ordered Logit models. We showed that the GIDM fits the data better in general. The current work provides a practical approach to analyze multimodal data that exists in many fields, such as heaping in self-reported behavioral outcomes, inflated categories of indifference and neutral in attitude surveys, large amounts of zero, and low occurrences of delinquent behaviors.
ISSN:0049-1241
1552-8294
DOI:10.1177/0049124118782535