A Flexible Multivariate Distribution for Correlated Count Data

Multivariate count data are often modeled via a multivariate Poisson distribution, but it contains an underlying, constraining assumption of data equi-dispersion (where its variance equals its mean). Real data are oftentimes over-dispersed and, as such, consider various advancements of a negative bi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Stats (Basel, Switzerland) Switzerland), 2021-06, Vol.4 (2), p.308-326
Hauptverfasser: Sellers, Kimberly F., Li, Tong, Wu, Yixuan, Balakrishnan, Narayanaswamy
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Multivariate count data are often modeled via a multivariate Poisson distribution, but it contains an underlying, constraining assumption of data equi-dispersion (where its variance equals its mean). Real data are oftentimes over-dispersed and, as such, consider various advancements of a negative binomial structure. While data over-dispersion is more prevalent than under-dispersion in real data, however, examples containing under-dispersed data are surfacing with greater frequency. Thus, there is a demonstrated need for a flexible model that can accommodate both data types. We develop a multivariate Conway–Maxwell–Poisson (MCMP) distribution to serve as a flexible alternative for correlated count data that contain data dispersion. This structure contains the multivariate Poisson, multivariate geometric, and the multivariate Bernoulli distributions as special cases, and serves as a bridge distribution across these three classical models to address other levels of over- or under-dispersion. In this work, we not only derive the distributional form and statistical properties of this model, but we further address parameter estimation, establish informative hypothesis tests to detect statistically significant data dispersion and aid in model parsimony, and illustrate the distribution’s flexibility through several simulated and real-world data examples. These examples demonstrate that the MCMP distribution performs on par with the multivariate negative binomial distribution for over-dispersed data, and proves particularly beneficial in effectively representing under-dispersed data. Thus, the MCMP distribution offers an effective, unifying framework for modeling over- or under-dispersed multivariate correlated count data that do not necessarily adhere to Poisson assumptions.
ISSN:2571-905X
2571-905X
DOI:10.3390/stats4020021