A Flexible Mixed Model for Clustered Count Data
Abstract
1. Introduction
2. COM-Poisson Distribution and Regression Model
3. COM-Poisson Regression Mixed Model
3.1. Model Formulation
3.2. COM-Poisson-Lognormal Model
3.3. COM-Poisson-Conjugate Model
4. Analysis: Simulated Data
4.1. Simulated Data Scenario I: Normal-Distributed Random Effects
4.2. Simulated Data Scenario II: Gamma-Distributed Random Effects
5. Analysis: Epilepsy Data
6. Conclusions and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
COM-Poisson | Conway-Maxwell-Poisson |
CMP | Conway-Maxwell-Poisson |
CMP-LN | COM-Poisson-lognormal |
CMP-C | COM-Poisson-conjugate |
Poi-LN | Poisson-lognormal |
NB-LN | negative binomial-lognormal |
L-LN | logistic-lognormal |
AIC | Akaike information criterion |
Simulated Dataset | Estimate | Model | |||
---|---|---|---|---|---|
Poi-LN | NB-LN | CMP-LN | CMP-C | ||
Poisson | Dispersion | ||||
Variance | |||||
min AIC | 0.96 | 0.72 | 0.96 | 0.12 | |
max | 0.00 | 0.04 | 0.76 | 0.20 | |
Bernoulli * | Dispersion | ||||
Variance | |||||
min AIC | 0.00 | 0.00 | 1.00 | 1.00 | |
max | 0.00 | 0.00 | 0.55 | 0.45 | |
Geometric | Dispersion | ||||
Variance | |||||
min AIC | 0.00 | 0.98 | 0.22 | 0.34 | |
max | 0.00 | 0.64 | 0.02 | 0.34 | |
CMP | Dispersion | ||||
(under) | Variance | ||||
min AIC | 0.00 | 0.00 | 1.00 | 0.97 | |
max | 0.00 | 0.00 | 0.58 | 0.42 | |
CMP | Dispersion | ||||
(over) | Variance | ||||
min AIC | 0.17 | 0.19 | 0.93 | 0.19 | |
max | 0.00 | 0.10 | 0.79 | 0.12 |
Simulated Dataset | Estimate | Model | |||
---|---|---|---|---|---|
Poi-LN | NB-LN | CMP-LN | CMP-C | ||
Poisson | Dispersion | ||||
Variance | |||||
min AIC | 0.28 | 0.08 | 0.22 | 0.88 | |
max | 0.00 | 0.00 | 0.12 | 0.88 | |
Bernoulli * | Dispersion | ||||
Variance | |||||
min AIC | 0.00 | 0.00 | 1.00 | 1.00 | |
max | 0.00 | 0.00 | 0.69 | 0.31 | |
Geometric | Dispersion | ||||
Variance | |||||
min AIC | 0.00 | 0.97 | 0.00 | 0.45 | |
max | 0.00 | 0.76 | 0.00 | 0.24 | |
CMP | Dispersion | ||||
(under) | Variance | ||||
min AIC | 0.00 | 0.00 | 1.00 | 1.00 | |
max | 0.00 | 0.00 | 0.21 | 0.79 | |
CMP | Dispersion | ||||
(over) | Variance | ||||
min AIC | 0.06 | 0.09 | 0.23 | 0.86 | |
max | 0.00 | 0.06 | 0.14 | 0.80 |
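The simulated datasets named in these tables (Poisson, Bernoulli, geometric, and under- or overdispersed CMP) line up with the well-known special cases of the COM-Poisson distribution, whose probability mass function is P(Y = y) = λ^y / ((y!)^ν Z(λ, ν)) with Z(λ, ν) = Σ_j λ^j / (j!)^ν. The following is a minimal sketch of that pmf, not the authors' implementation: the function names `cmp_log_z` and `cmp_pmf`, the truncation rule, and the defaults `max_terms` and `tol` are all assumptions made for illustration.

```python
import math


def cmp_log_z(lam, nu, max_terms=10_000, tol=1e-12):
    """Log of a truncated normalizing constant Z(lam, nu) = sum_j lam^j / (j!)^nu.

    Hypothetical helper: the series is cut off once its terms are decreasing
    and negligible relative to the largest term (an assumed stopping rule).
    """
    if lam <= 0 or nu < 0:
        raise ValueError("requires lam > 0 and nu >= 0")
    if nu == 0 and lam >= 1:
        raise ValueError("the series diverges when nu == 0 and lam >= 1")
    log_tol = math.log(tol)
    log_terms = []
    peak = -math.inf
    for j in range(max_terms):
        # log of the j-th term, using lgamma(j + 1) = log(j!)
        log_term = j * math.log(lam) - nu * math.lgamma(j + 1)
        decreasing = bool(log_terms) and log_term < log_terms[-1]
        log_terms.append(log_term)
        peak = max(peak, log_term)
        if decreasing and log_term - peak < log_tol:
            break
    # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def cmp_pmf(y, lam, nu):
    """COM-Poisson probability P(Y = y) for a non-negative integer y."""
    log_p = y * math.log(lam) - nu * math.lgamma(y + 1) - cmp_log_z(lam, nu)
    return math.exp(log_p)


# Special cases of the dispersion parameter nu:
poisson_like = cmp_pmf(2, 3.0, 1.0)     # nu = 1: Poisson with mean lam
geometric_like = cmp_pmf(3, 0.5, 0.0)   # nu = 0, lam < 1: geometric, success prob 1 - lam
bernoulli_like = cmp_pmf(1, 2.0, 30.0)  # large nu: close to Bernoulli, p = lam / (1 + lam)
```

Values of ν below 1 give overdispersion relative to the Poisson and values above 1 give underdispersion, which is the distinction between the CMP (over) and CMP (under) rows. For small ν combined with large λ the series converges slowly, so the assumed `max_terms` cap may need to be raised.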
Parameter | Model | |||
---|---|---|---|---|
Poi-LN | NB-LN | CMP-LN | CMP-C | |
or | 1.200 (0.157) | 1.116 (0.180) | 0.407 (0.106) | - |
0.920 (0.034) | 0.983 (0.072) | 0.429 (0.051) | 0.422 (0.023) | |
−0.024 (0.211) | 0.073 (0.242) | 0.051 (0.106) | 0.150 (0.077) | |
−0.104 (0.065) | −0.310 (0.141) | −0.177 (0.053) | −0.158 (0.035) | |
k | - | 0.148 [0.000] | - | - |
- | - | 0.420 [0.000] | 0.412 [0.000] | |
0.608 [0.000] | 0.661 [0.000] | 0.143 [0.000] | - | |
a, c | - | - | - | 3.18, 0.71 [0.000] |
1011.0 | 888.7 | 871.1 | 881.6 | |
AIC | 2031.0 | 1789.5 | 1754.1 | 1775.3 |
664.6 | 758.1 | 1026.0 | 927.3 | |
AIC [] | 1907.2 | 1725.4 | 1755.1 | 1759.1 |
[] | 600.6 | 647.9 | 669.5 | 665.0 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Morris, D.S.; Sellers, K.F. A Flexible Mixed Model for Clustered Count Data. Stats 2022, 5, 52-69. https://doi.org/10.3390/stats5010004