Article

A Simple and Adaptive Dispersion Regression Model for Count Data

1
Department of Statistics, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
2
Department of Mathematics, Brunel University London, Uxbridge UB8 3PH, UK
*
Author to whom correspondence should be addressed.
Entropy 2018, 20(2), 142; https://doi.org/10.3390/e20020142
Submission received: 19 January 2018 / Revised: 14 February 2018 / Accepted: 16 February 2018 / Published: 22 February 2018
(This article belongs to the Special Issue Power Law Behaviour in Complex Systems)

Abstract

Regression for count data is widely performed by models such as Poisson, negative binomial (NB) and zero-inflated regression. A challenge often faced by practitioners is the selection of the right model to take into account dispersion, which typically occurs in count datasets. It is highly desirable to have a unified model that can automatically adapt to the underlying dispersion and that can be easily implemented in practice. In this paper, a discrete Weibull regression model is shown to be able to adapt in a simple way to different types of dispersions relative to Poisson regression: overdispersion, underdispersion and covariate-specific dispersion. Maximum likelihood can be used for efficient parameter estimation. The description of the model, parameter inference and model diagnostics is accompanied by simulated and real data analyses.

1. Introduction

Count data, which refers to the number of times an item or an event occurs within a fixed period of time, commonly arises in many fields. Indeed, examples of count data include the number of heart attacks or the number of hospitalisation days in medical studies, the number of students absent during a period of time in education studies, or the number of times parents perpetrate domestic violence against their child in social science investigations. There is now a great deal of interest in the literature on investigating the relationship between a count response variable and other variables: for example, how the education level of parents can affect the incidence of domestic violence against their children. Methods to address these questions fall in the general area of regression analysis of count data (see [1,2] among others).
Classical regression models for count data belong to the family of generalised linear models [3] such as Poisson regression, which models the conditional mean of the counts as a linear regression on a set of covariates through the log link function. Although Poisson regression is fundamental to the regression analysis of count data, it is often of limited use for real data, because of its property of an equal mean and variance. Real data usually feature overdispersion relative to Poisson regression, or the opposite case of underdispersion. Thus, accounting for overdispersion and underdispersion when modelling count data is essential, and failing to cope with these features of the data can lead to biased parameter estimates and thus false conclusions and decisions.
Negative Binomial (NB) regression is widely considered as the default choice for data that are overdispersed relative to Poisson regression. However, NB regression may not be the best choice for power-law data with long tails, or for highly skewed data with an excessive number of zeros, because of the rare occurrence of non-zero events. These often require the application of zero-inflated and hurdle models. In addition, NB regression cannot deal with data that are underdispersed relative to Poisson regression. There have been some attempts to extend Poisson regression-based models to underdispersion, such as generalised Poisson (GP) regression [4,5], COM–Poisson regression [6] or hyper-Poisson regression models [7]. However, these models are all modifications of a Poisson model and have been shown to be rather complex and computationally expensive in practice. In this paper, we introduce the discrete Weibull (DW) distribution to the regression modelling of count data and present the DW regression model. The motivation behind considering the DW distribution [8] stems from the vital role played by the continuous Weibull distribution in survival analysis and failure time studies. However, few contributions can be found in the literature on statistical inference and applications using this distribution, aside from some early work on parameter estimation [9,10] and some limited use in applied contexts: for example, Refs. [11,12], who showed that the counts of living microbes (pathogens) in water are highly skewed and can be efficiently modelled using a DW distribution. In contrast to this, this paper shows a number of desirable features of this distribution, which are particularly appealing within a regression context. Specifically, this simple count regression model can capture different levels of dispersion adaptively, which is a challenge faced by existing count regression models. 
Moreover, we show how the DW model can capture power-law behaviour and high skewness in the underlying distributions.
Section 2 provides a review and description of the DW distribution and its properties. Section 3 illustrates the ability of a DW distribution to model data that are both overdispersed and underdispersed relative to Poisson regression. The DW regression model is introduced in Section 4 to investigate the relationship between a count response and a set of covariates. Section 5 shows the ability of the DW regression model to handle cases of mixed levels of dispersion. This model is applied to a number of real datasets in Section 6.

2. Discrete Weibull Distribution

2.1. The Distribution

If Y follows a (type 1) DW distribution [8], then the cumulative distribution function of Y is given by
$$F(y; q, \beta) = \begin{cases} 1 - q^{(y+1)^{\beta}} & \text{for } y = 0, 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
and its probability mass function is given by
$$f(y; q, \beta) = \begin{cases} q^{y^{\beta}} - q^{(y+1)^{\beta}} & \text{for } y = 0, 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
with the parameters $0 < q < 1$ and $\beta > 0$. Because $f(0) = 1 - q$, the parameter q is the probability of obtaining a non-zero response. We refer later to this distribution as $DW(q, \beta)$. The distribution is connected to other well-known distributions. In particular, the following:
  • The discrete Rayleigh distribution in [13] is a special case of a DW distribution with $\beta = 2$ and $q = \theta$.
  • The geometric distribution is a special case of a DW distribution, with $\beta = 1$ and $q = 1 - p$. Moreover, for the geometric distribution, the variance is always greater than its mean. Therefore, a DW distribution with $\beta = 1$ is a case of overdispersion relative to Poisson regression, regardless of the value of q. In particular, when $\beta = 1$ and $q = e^{-\lambda}$, the distribution is the discrete exponential distribution introduced by [14].
  • $\beta$ can be considered as controlling the range of values of the variable. In other words, this parameter controls the skewness of the DW distribution. In order to show this, Figure 1 plots the probability mass functions for a fixed parameter q and different values of $\beta$. The plot shows how the probability of 0 stays constant, while the tail of the distribution becomes increasingly longer as $\beta \to 0$, and the distribution approaches a Bernoulli distribution with probability q as $\beta \to \infty$.
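The definitions above can be checked numerically with a minimal Python sketch. The function names `dw_cdf` and `dw_pmf` are illustrative choices for this example and are not taken from any package; the parameter values are arbitrary.

```python
import math


def dw_cdf(y, q, beta):
    """CDF of the type 1 discrete Weibull: F(y) = 1 - q^((y+1)^beta) for y >= 0."""
    if y < 0:
        return 0.0
    return 1.0 - q ** ((math.floor(y) + 1) ** beta)


def dw_pmf(y, q, beta):
    """PMF: f(y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ...; 0 otherwise."""
    if y < 0 or y != int(y):
        return 0.0
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


if __name__ == "__main__":
    q, beta = 0.7, 1.5  # illustrative values only
    # f(0) = 1 - q, so q is the probability of a non-zero response.
    print(dw_pmf(0, q, beta), 1.0 - q)
    # With beta = 1 and q = 1 - p, the pmf reduces to the geometric pmf p(1-p)^y.
    p = 0.4
    print(dw_pmf(3, 1.0 - p, 1.0), p * (1.0 - p) ** 3)
```

The geometric comparison in the last two lines corresponds to the second special case in the list above.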

2.2. Moments and Quantiles

The first two moments of a DW distribution are given by
$$E(Y) = \mu = \sum_{y=1}^{\infty} q^{y^{\beta}}, \qquad E(Y^2) = 2 \sum_{y=1}^{\infty} y \, q^{y^{\beta}} - E(Y) \tag{3}$$
From these, the variance for a DW distribution is given by
$$\mathrm{Var}(Y) = \sigma^2 = 2 \sum_{y=1}^{\infty} y \, q^{y^{\beta}} - \mu - \mu^2 \tag{4}$$
for which there are no closed-form expressions, although numerical approximations can be obtained on a truncated support [15]. Equations (3) and (4) show that both $E(Y) > \mathrm{Var}(Y)$ and $E(Y) < \mathrm{Var}(Y)$ are generally possible, making the DW distribution suitable both for overdispersion and underdispersion.
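As a rough illustration of the truncated-support approximation, the sketch below computes the mean and variance in two ways: directly from the probability mass function, and from the series in Equations (3) and (4). The truncation point `n_terms` and the function names are assumptions made for this example; the truncation is only adequate when the tail beyond it is negligible.

```python
def dw_pmf(y, q, beta):
    """f(y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def moments_from_pmf(q, beta, n_terms=2000):
    """Mean and variance from sum(y f(y)) and sum(y^2 f(y)) on a truncated support."""
    mean = sum(y * dw_pmf(y, q, beta) for y in range(n_terms))
    second = sum(y * y * dw_pmf(y, q, beta) for y in range(n_terms))
    return mean, second - mean ** 2


def moments_from_series(q, beta, n_terms=2000):
    """Mean and variance from the series in Equations (3) and (4), truncated."""
    mean = sum(q ** (y ** beta) for y in range(1, n_terms))
    weighted = sum(y * q ** (y ** beta) for y in range(1, n_terms))
    return mean, 2.0 * weighted - mean - mean ** 2


if __name__ == "__main__":
    print(moments_from_pmf(0.7, 1.5))
    print(moments_from_series(0.7, 1.5))
```

For $\beta = 1$ and $q = 0.5$ both routes should return the geometric values, a mean of 1 and a variance of 2.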
A nice property of the DW distribution is that its $\tau$th ($0 < \tau < 1$) quantile, that is, the smallest value of y for which $F(y) \geq \tau$, has a closed-form expression, given by
$$Q(\tau) = \left\lceil \left( \frac{\log(1 - \tau)}{\log(q)} \right)^{1/\beta} - 1 \right\rceil \tag{5}$$
for $\tau \geq 1 - q$. This is in contrast to the Poisson and NB distributions, whose quantiles do not have a closed-form expression.
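The closed-form quantile can be sketched in a few lines of Python. The helper names are illustrative, and the explicit return of 0 for $\tau \leq 1 - q$ is an assumption that follows from $F(0) = 1 - q$.

```python
import math


def dw_cdf(y, q, beta):
    """F(y) = 1 - q^((y+1)^beta) for y >= 0, and 0 for y < 0."""
    return 0.0 if y < 0 else 1.0 - q ** ((y + 1) ** beta)


def dw_quantile(tau, q, beta):
    """Smallest integer y with F(y) >= tau, from the closed-form expression."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if tau <= 1.0 - q:
        return 0  # F(0) = 1 - q already reaches tau
    value = (math.log(1.0 - tau) / math.log(q)) ** (1.0 / beta) - 1.0
    return math.ceil(value)


if __name__ == "__main__":
    q, beta = 0.7, 1.2  # illustrative values only
    for tau in (0.1, 0.5, 0.9):
        y = dw_quantile(tau, q, beta)
        print(tau, y, dw_cdf(y - 1, q, beta), dw_cdf(y, q, beta))
```

The printed CDF values on either side of each quantile make it easy to confirm that the returned y is the first point at which $F(y)$ reaches $\tau$.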

2.3. Parameter Estimation

Given a sample $y_1, y_2, \ldots, y_n$ from a DW distribution, the log-likelihood can be written as
$$\ell = \sum_{i=1}^{n} \log \left( q^{y_i^{\beta}} - q^{(y_i + 1)^{\beta}} \right)$$
from which the maximum likelihood estimators (MLEs) of q and β can be easily obtained by directly maximising this log-likelihood using any standard nonlinear optimisation tool.
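A minimal sketch of this direct maximisation is given below, using only the Python standard library. It is not the procedure used in the paper or in any package: the compass (coordinate) search, its starting point, the reparameterisation $a = \log(-\log q)$, $b = \log \beta$ that keeps both parameters in range, and all function names are assumptions made for illustration. Any standard optimiser could replace the search.

```python
import math
import random


def dw_sample(q, beta, rng):
    """Draw one DW(q, beta) value by inverting the CDF at a uniform draw."""
    u = rng.random()
    if u <= 1.0 - q:
        return 0
    return max(0, math.ceil((math.log(1.0 - u) / math.log(q)) ** (1.0 / beta) - 1.0))


def dw_loglik(ys, q, beta):
    """Log-likelihood: sum of log(q^(y^beta) - q^((y+1)^beta))."""
    total = 0.0
    try:
        for y in ys:
            p = q ** (y ** beta) - q ** ((y + 1) ** beta)
            if p <= 0.0:  # numerical underflow: treat as impossible
                return -math.inf
            total += math.log(p)
    except OverflowError:  # y ** beta too large for a float
        return -math.inf
    return total


def fit_dw(ys, tol=1e-6):
    """Compass search on (a, b) = (log(-log q), log beta); returns (q, beta, loglik)."""
    def objective(a, b):
        return dw_loglik(ys, math.exp(-math.exp(a)), math.exp(b))

    a, b, step = 0.0, 0.0, 1.0  # assumed starting point: q = exp(-1), beta = 1
    best = objective(a, b)
    while step > tol:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            value = objective(a + da, b + db)
            if value > best:
                a, b, best, improved = a + da, b + db, value, True
        if not improved:
            step /= 2.0  # shrink the search pattern when no move helps
    return math.exp(-math.exp(a)), math.exp(b), best


if __name__ == "__main__":
    rng = random.Random(1)
    data = [dw_sample(0.7, 1.5, rng) for _ in range(1500)]
    print(fit_dw(data))
```

With a simulated sample of this size, the estimates should land close to the generating values; the search is slow compared with gradient-based optimisers but avoids any external dependency.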

3. DW Accounts for Different Types of Dispersion

In this section, we discuss a property of the DW distribution that is particularly advantageous as a model for count data. Dispersion in count data is formally defined in relation to a specified model being fitted to the data [1,2]. In particular, we let
$$\mathrm{VR} = \frac{\text{observed variance}}{\text{theoretical variance}} \tag{6}$$
Thus the variance ratio (VR) is the ratio between the observed variance from the data and the theoretical variance from the model. The data are said to be overdispersed, equidispersed or underdispersed relative to the fitted model if the observed variance is larger than, equal to or smaller than the theoretical variance specified by the model, respectively. It is common to refer to dispersion relative to Poisson regression. In this case, the variance of the model is estimated by the sample mean. Thus, overdispersion, equidispersion and underdispersion relative to Poisson regression refer to cases in which the sample variance is larger than, equal to or smaller than the sample mean, respectively. Because the theoretical variance of a NB regression is always greater than its mean, as $\sigma^2 = \mu + \frac{1}{k} \mu^2$ for $k > 0$, the NB regression model is the natural choice for data that are overdispersed relative to Poisson regression. However, crucially, NB regression cannot handle underdispersed data.
In contrast to this, we show how a DW distribution can handle data that are both overdispersed and underdispersed relative to Poisson regression. In particular, Figure 2 (left) shows the VR values in Equation (6) for data simulated by $DW(\beta, 0.7)$ and fitted by Poisson and NB distributions, respectively. Comparing these values of the VR to 1, the plot shows cases of data overdispersed and underdispersed relative to Poisson regression. In addition, while NB regression can fit data that are overdispersed relative to Poisson regression well (i.e., VR close to 1), this does not happen for underdispersed data, for which both Poisson and NB regression are inappropriate.
Figure 2 (right) considers more closely the case of dispersion relative to Poisson regression for a range of values of q and β and shows how the DW distribution, a single distribution with as many parameters as NB regression, can capture cases of underdispersion, equidispersion and overdispersion relative to Poisson regression.
In particular, these numerical analyses have approximately shown the following:
  • $0 < \beta \leq 1$ is a case of overdispersion, regardless of the value of q.
  • $\beta \geq 3$ is a case of underdispersion, regardless of the value of q. In fact, the DW distribution approaches the Bernoulli distribution with mean p and variance $p(1 - p)$ for $\beta \to \infty$.
  • $1 < \beta < 3$ leads to both cases of overdispersion and underdispersion depending on the value of q.
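These approximate findings can be explored with a small numerical scan of the ratio $\mathrm{Var}(Y)/E(Y)$, computed from the truncated series for the mean and variance. The grid of q values, the stopping tolerance and the function names below are assumptions made for this sketch; the scan only covers the values tried and is not a proof for all q.

```python
def dw_mean_var(q, beta, tol=1e-12, max_terms=200_000):
    """Mean and variance of DW(q, beta) from truncated series."""
    mean = 0.0
    weighted = 0.0  # running sum of y * q^(y^beta)
    for y in range(1, max_terms + 1):
        term = q ** (y ** beta)
        mean += term
        weighted += y * term
        if y * term < tol:  # remaining terms treated as negligible
            break
    return mean, 2.0 * weighted - mean - mean ** 2


def dispersion_index(q, beta):
    """Var(Y) / E(Y): above 1 is overdispersion relative to Poisson, below 1 underdispersion."""
    mean, var = dw_mean_var(q, beta)
    return var / mean


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0, 3.0):
        row = [round(dispersion_index(q, beta), 3) for q in (0.1, 0.5, 0.9)]
        print(beta, row)
```

On this grid, the rows for $\beta = 0.5$ and $\beta = 1$ should lie above 1 and the row for $\beta = 3$ below 1, while for $\beta = 2$ the direction depends on q.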

4. DW Regression Model

We now exploit this advantageous property of a DW distribution within a regression context, where the interest is to model the relationship between a count response variable and a set of covariates.

4.1. Model Formulation

We introduce the DW regression model for count data in analogy with the continuous Weibull regression, which is well known in survival analysis and life-time modelling. Recalling that the distribution function of a continuous Weibull distribution is given by
$$F(y; \lambda, \beta) = 1 - e^{-\lambda y^{\beta}}, \qquad y \geq 0, \tag{7}$$
with scale parameter $\lambda$, one can see that the parameter q of a DW distribution is equivalent to $e^{-\lambda}$ in the continuous case. Because Weibull regression imposes a log link between the parameter $\lambda$ and the predictors [16,17], the DW regression can be introduced via the parameter q. Figure 3 shows how the parameter q affects the scale and the shape of the probability mass function of the DW distribution.
From Equation (5) with $\tau = 1/2$, the median of Y, denoted by M, satisfies
$$\log(M + 1) = \frac{1}{\beta} \log(\log(2)) - \frac{1}{\beta} \log(-\log(q))$$
Thus, in order to introduce a DW regression model, we assume that, for $i = 1, 2, \ldots, n$, the response $Y_i$ has a DW conditional distribution $f(y_i, q(x_i), \beta \mid x_i)$, where $q(x_i)$ is the DW parameter related to the explanatory variables $x_i$ through the link function:
$$\log(-\log(q_i)) = x_i'\alpha, \qquad x_i'\alpha = \alpha_0 + x_{i1}\alpha_1 + \cdots + x_{iP}\alpha_P \tag{8}$$
This transforms q from the probability scale (i.e., the interval $[0, 1]$) to the interval $[-\infty, +\infty]$ and ensures that the parameter q remains in $[0, 1]$. The $\log(-\log)$ link in q is also motivated by the analytical formula for the quantile in Equation (5), which facilitates the interpretation of the parameters, as discussed in the next subsection. Other link functions are possible, such as the logit (analogous with geometric regression) or probit link on q. Moreover, the DW regression model can be introduced by relating $\beta$ to the explanatory variables, $f(y_i, q, \beta(x_i) \mid x_i)$, or more generally, by adding a link to both parameters, $f(y_i, q(x_i), \beta(x_i) \mid x_i)$.
Then, from Equation (8), q i can be expressed as
$$q_i = e^{-e^{x_i'\alpha}} \tag{9}$$
from which the conditional probability mass function of the response variable Y i given x i is as follows
$$f(y_i \mid x_i) = \left( e^{-e^{x_i'\alpha}} \right)^{y_i^{\beta}} - \left( e^{-e^{x_i'\alpha}} \right)^{(y_i + 1)^{\beta}} \tag{10}$$
Finally, in order to obtain the MLEs for the unknown parameters α and β , the log-likelihood of Equation (10) is maximised numerically using standard nonlinear optimisation tools.
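The regression log-likelihood can be sketched as follows. The function names, the convention that `alpha[0]` is the intercept, and the small data set in the usage example are assumptions made for illustration; the function would be passed to a numerical optimiser of the reader's choice.

```python
import math


def dw_q(x, alpha):
    """q_i = exp(-exp(x_i' alpha)); alpha[0] is the intercept, x holds the covariates."""
    eta = alpha[0] + sum(a * xj for a, xj in zip(alpha[1:], x))
    return math.exp(-math.exp(eta))


def dw_reg_loglik(alpha, beta, X, ys):
    """Sum over i of log f(y_i | x_i) under the log(-log) link on q."""
    total = 0.0
    for x, y in zip(X, ys):
        q = dw_q(x, alpha)
        p = q ** (y ** beta) - q ** ((y + 1) ** beta)
        if p <= 0.0:  # numerical underflow: treat as impossible
            return -math.inf
        total += math.log(p)
    return total


if __name__ == "__main__":
    # Hypothetical toy data: two covariates, three observations.
    X = [(0.0, 0.0), (1.0, 2.0), (-0.5, 4.0)]
    ys = [0, 3, 1]
    print(dw_reg_loglik((0.5, 0.4, -0.3), 2.0, X, ys))
```

Because the link maps any real linear predictor into $(0, 1)$, no constraint on $\alpha$ is needed during optimisation; only $\beta > 0$ has to be enforced, for instance by optimising over $\log \beta$.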

4.2. Interpretation of the Regression Coefficients

After a DW regression model has been estimated, the following can be obtained:
  • The fitted values for the central trend of the conditional distribution, namely, the following:
    - Mean: Equation (3), as mentioned earlier, can be calculated numerically using the approximated moments of the DW regression [15].
    - Median: The quantile formula provided in Equation (5) can be applied. Because of the skewness, which is common for count data, the median is more appropriate than the mean. The fitted conditional median can be obtained easily from the closed-form expression of quantiles for DW regression, as
    $$M(x) = \left( -\frac{\log(2)}{\log(q(x))} \right)^{1/\beta} - 1 \tag{11}$$
  • The conditional quantile for any $\tau$ can be obtained from Equation (5).
The analytical expression of the quantiles, combined with the chosen $\log(-\log)$ link function, offers a way of interpreting the parameters. Indeed, substituting Equation (9) into Equation (11) leads to
$$\log(M(x) + 1) = \frac{1}{\beta} \log(\log(2)) - \frac{1}{\beta} x'\alpha$$
Thus, the regression parameters $\alpha$ can be interpreted in relation to the log of the median. This is in analogy with Poisson and NB models, for which the parameters are linked to the mean. In particular, $\frac{\log(\log(2)) - \alpha_0}{\beta}$ is related to the conditional median when all covariates are set to zero, whereas $-\frac{\alpha_p}{\beta}$, $p = 1, \ldots, P$, can be related to the change in the median of the response corresponding to a one-unit change of $X_p$, keeping all other covariates constant.
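The interpretation can be verified numerically with a short sketch. It evaluates the conditional median expression without rounding to an integer, and checks that a one-unit increase in a covariate shifts $\log(M(x) + 1)$ by $-\alpha_p/\beta$. The function name and the parameter values are illustrative assumptions.

```python
import math


def dw_median_unrounded(x, alpha, beta):
    """Conditional median expression (-log 2 / log q(x))^(1/beta) - 1, not rounded."""
    eta = alpha[0] + sum(a * xj for a, xj in zip(alpha[1:], x))
    q = math.exp(-math.exp(eta))  # log(-log) link on q
    return (-math.log(2.0) / math.log(q)) ** (1.0 / beta) - 1.0


if __name__ == "__main__":
    alpha, beta = (0.5, 0.4, 0.3), 2.1  # illustrative values only
    base = dw_median_unrounded((0.2, 3.0), alpha, beta)
    shifted = dw_median_unrounded((1.2, 3.0), alpha, beta)  # X_1 increased by one unit
    print(math.log(shifted + 1.0) - math.log(base + 1.0), -alpha[1] / beta)
```

The two printed numbers should agree up to floating-point error, which is the sense in which the coefficients describe changes in the log of the median.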

4.3. Diagnostics Checks

After fitting a DW regression model, it is essential to consider a diagnostics analysis to investigate the appropriateness of the model. Given that the response is discrete, we advise performing a residual analysis on the basis of the randomised quantile residuals, as developed by [18] and used in many other studies (e.g., [19,20]). In particular, we let
$$r_i = \Phi^{-1}(u_i), \qquad i = 1, \ldots, n$$
where $\Phi(\cdot)$ is the standard normal distribution function and $u_i$ is a uniform random variable on the interval
$$(a_i, b_i] = \left( \lim_{y \uparrow y_i} F(y; \hat{q}_i, \hat{\beta}), \; F(y_i; \hat{q}_i, \hat{\beta}) \right] \approx \left[ F(y_i - 1; \hat{q}_i, \hat{\beta}), \; F(y_i; \hat{q}_i, \hat{\beta}) \right]$$
These residuals follow the standard normal distribution, apart from the sampling variability in $\hat{q}_i$ and $\hat{\beta}$. Hence, the validity of a DW model can be assessed using goodness-of-fit investigations of the normality of the residuals, such as Q-Q plots and normality tests. Simulated envelopes can be added to the Q-Q plots, as in [7,21,22,23].
In addition to the residual analysis, it is also informative to check whether the data show any underdispersion or overdispersion relative to the specified DW conditional distribution. For a well-fitting model, we would expect the ratio of observed and theoretical variance in Equation (6) to be close to 1 for each x. We expand more on this point in the next section.
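The randomised quantile residuals described above can be sketched with the Python standard library. The function names, the clamping of $u_i$ away from 0 and 1 (needed only to keep the inverse normal CDF finite) and the simulated data are assumptions made for this example; in practice the fitted $\hat{q}_i$ and $\hat{\beta}$ would be supplied.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def dw_cdf(y, q, beta):
    """F(y) = 1 - q^((y+1)^beta) for y >= 0, and 0 for y < 0."""
    return 0.0 if y < 0 else 1.0 - q ** ((y + 1) ** beta)


def dw_sample(q, beta, rng):
    """Draw one DW(q, beta) value by inverting the CDF at a uniform draw."""
    u = rng.random()
    if u <= 1.0 - q:
        return 0
    return max(0, math.ceil((math.log(1.0 - u) / math.log(q)) ** (1.0 / beta) - 1.0))


def randomised_quantile_residuals(ys, qs, beta, rng):
    """r_i = Phi^{-1}(u_i), with u_i uniform between F(y_i - 1) and F(y_i)."""
    std_normal = NormalDist()
    residuals = []
    for y, q in zip(ys, qs):
        a = dw_cdf(y - 1, q, beta)
        b = dw_cdf(y, q, beta)
        u = a + (b - a) * rng.random()
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well defined
        residuals.append(std_normal.inv_cdf(u))
    return residuals


if __name__ == "__main__":
    rng = random.Random(7)
    q, beta = 0.7, 1.5  # true values stand in for fitted ones here
    ys = [dw_sample(q, beta, rng) for _ in range(3000)]
    res = randomised_quantile_residuals(ys, [q] * len(ys), beta, rng)
    print(mean(res), stdev(res))
```

When the model is correct, the residual mean and standard deviation should be close to 0 and 1; a Q-Q plot or a normality test can then be applied to `res`.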

5. DW Regression Naturally Handles Covariate-Specific Dispersion

We have already shown in Section 3 how the DW distribution can model data that are either overdispersed or underdispersed relative to Poisson regression. In this section, we investigate this further within a regression context. Here, it is also possible that the conditional variance is larger than the conditional mean for a specific covariate pattern (overdispersion), but the conditional variance is smaller than the conditional mean for another covariate pattern (underdispersion).
In the literature, regression models for count data that can capture underdispersion, or both overdispersion and underdispersion simultaneously, are in the form of extended versions of Poisson regression, such as quasi-Poisson, COM–Poisson or hyper-Poisson regression [7]. In the case of mixed types of dispersion, the dispersion parameter can be assumed to be linked to the covariates. However, a covariate-dependent dispersion increases the complexity of the model significantly and reduces its interpretability. Thus, in practice, most implementations fix the dispersion parameter and assume that only the mean is linked to the covariates. As the DW distribution naturally accounts for overdispersion and underdispersion, a DW regression model becomes a simple and attractive alternative to existing regression models for count data.
We emphasise this point with a simple simulation study. We considered two cases, a small sample size ($n = 25$) and a large sample size ($n = 600$), with two covariates, $X_1 \sim N(0, 1)$ and $X_2 \sim \mathrm{Uniform}(0, 10)$. We assumed the regression parameters to take values $\alpha = (\alpha_0, \alpha_1, \alpha_2) = (0.5, 0.4, 0.3)$. In addition, the parameter $\beta$ of the DW regression was assumed to be $\beta = 2.1$. Then, for each case, we respectively sampled 25 and 600 values of the covariates and the corresponding response from $DW(q_i, \beta)$, where $q_i$ is calculated as in Equation (9), for $i = 1, \ldots, n$. Table 1 reports the estimates of the parameters, together with the average bias and the mean-squared error (MSE) over 1000 iterations.
Figure 4 shows a boxplot of the dispersions in Equation (6) in the case of Poisson, NB and DW fitting. A note is required on the calculation of these ratios, as the observed variance could not be computed for each individual covariate vector x. For the calculation, we split the response values into 10 groups of similar size, on the basis of the percentiles of the linear predictors $x'\alpha$. Then the observed variance was computed within each group, while the theoretical variance was averaged within each group. If the model is well specified, we would expect these values to be close to 1. This is shown in Figure 4 for DW regression, which was the model used in the simulation. Poisson and NB regression showed underdispersion in most cases and overdispersion in two cases. Thus, this simulation shows a simple scenario of a mixed level of dispersion, which could not be captured well by standard Poisson and NB models.
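The grouping procedure for the variance ratios can be sketched as follows. The function name and its arguments are illustrative assumptions: `linear_predictor` holds the fitted values of $x'\alpha$, and `theoretical_var` holds the model-implied variance for each observation under whichever model is being assessed.

```python
from statistics import mean, variance


def grouped_variance_ratio(linear_predictor, y, theoretical_var, n_groups=10):
    """Variance ratio per group, with groups formed by ranking the linear predictor.

    Within each group: sample variance of y divided by the average theoretical variance.
    Each group needs at least two observations for the sample variance to exist.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: linear_predictor[i])
    ratios = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        observed = variance([y[i] for i in idx])
        expected = mean([theoretical_var[i] for i in idx])
        ratios.append(observed / expected)
    return ratios


if __name__ == "__main__":
    # Tiny hypothetical example with two groups of two observations each.
    eta = [0.0, 1.0, 2.0, 3.0]
    y = [1, 3, 2, 6]
    theo = [1.0, 1.0, 2.0, 2.0]
    print(grouped_variance_ratio(eta, y, theo, n_groups=2))
```

Values near 1 across groups indicate that the model reproduces the conditional variance; values consistently above or below 1 point to overdispersion or underdispersion relative to the fitted model.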

6. Application to Real Datasets

To demonstrate the ability of the DW regression model to handle overdispersion and underdispersion automatically, in this section, DW regression has been applied to different datasets that show various types of dispersions relative to Poisson regression. The first subsection uses an underdispersed dataset, while the second uses an overdispersed case. The third subsection focuses on a zero-inflated dataset. Finally, an illustrative example for the mixed level of dispersion is provided. Various popular count data regression models, namely, Poisson regression (R function glm), NB regression (R function glm.nb), COM–Poisson regression (R package COMPoissonReg [24]), GP regression (R package VGAM [25]), and zero-inflated and hurdle models (pscl R package [26]), have been applied and compared with DW regression by means of classical AIC and BIC criteria [27].

6.1. The Case of Underdispersion: Inhaler Usage Data

For this example, we used data from [28]. These consisted of 5209 observations and reported the daily count of (albuterol) asthma inhaler uses during the school day for 48 children with asthma, aged between 6 and 13, over a period of time at the Kunsberg School of National Jewish Health in Denver, Colorado. The main objective of this analysis was to investigate the relationship between the inhaler use (representing the asthma severity) and air pollution, which was recorded by four covariates: the percentage of humidity, the barometric pressure (in mmHg/1000), the average daily temperature (in °F/100), and the morning levels of PM$_{2.5}$, which are small air particles less than 2.5 μm in diameter. The response variable, which was the inhaler use count, had a sample mean of 1.2705 and variance of 0.8433, thus pointing to a case of underdispersion relative to Poisson regression.
The results in Table 2 suggest that DW and COM–Poisson regression provided better fitting than both Poisson and NB models, according to both AIC and BIC. The COM–Poisson regression considered here was based on [6] with the following probability mass function:
$$P(Y_i = y_i) = \frac{\lambda_i^{y_i}}{(y_i!)^{\nu} \, Z(\lambda_i, \nu)}, \qquad Z(\lambda_i, \nu) = \sum_{s=0}^{\infty} \frac{\lambda_i^{s}}{(s!)^{\nu}}$$
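To make the role of the normalising constant concrete, the sketch below evaluates this probability mass function with the infinite sum truncated at a fixed number of terms and accumulated on the log scale. This is an illustrative approximation only, not the method used by the COMPoissonReg package; the function names and the truncation point `n_terms` are assumptions, and the truncation can be inadequate when $\nu$ is small and $\lambda_i$ is large.

```python
import math


def com_poisson_log_z(lam, nu, n_terms=500):
    """Log of the truncated normalising constant: sum over s of lam^s / (s!)^nu."""
    logs = [s * math.log(lam) - nu * math.lgamma(s + 1) for s in range(n_terms)]
    m = max(logs)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(v - m) for v in logs))


def com_poisson_pmf(y, lam, nu, n_terms=500):
    """P(Y = y) = lam^y / ((y!)^nu * Z(lam, nu)), with Z truncated at n_terms."""
    log_p = y * math.log(lam) - nu * math.lgamma(y + 1) - com_poisson_log_z(lam, nu, n_terms)
    return math.exp(log_p)


if __name__ == "__main__":
    # With nu = 1 the distribution reduces to the Poisson distribution.
    print(com_poisson_pmf(2, 1.5, 1.0), math.exp(-1.5) * 1.5 ** 2 / 2.0)
```

Every likelihood evaluation requires this sum for each observation, which is one reason the model is more expensive to fit than one with a closed-form probability mass function.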
GP regression was also attempted on this dataset, but it did not improve on COM–Poisson regression (AIC = 13550.17) and is thus not reported in the table. For DW regression, the parameters are reported with the parameterisation linked to the median, as previously described. The left panel of Figure 5 indicates underdispersion relative to Poisson and NB regression across the full range of the covariates and a good fit of DW regression compared to the other models (VR values close to 1). COM–Poisson regression could not be added to this plot because of the complexity of calculating the theoretical variances in this case. The right panel compares the observed and expected frequencies for the four models and shows again a good fit for DW regression. Finally, Figure 6 plots the randomised quantile residuals from the DW regression model, which only moderately depart from normality (p-value of Kolmogorov–Smirnov (KS) test: 0.025).

6.2. The Case of Overdispersion: Doctor Visits from German Health Survey Data

This dataset comes from the German Health Survey and is available in the COUNT R package [29], under the name of badhealth. The response variable is the number of visits to certain doctors during 1998. Two predictors are considered: an indicator variable representing patients claiming to be in bad health (1) or not (0), and the age of the patient. The response variable ranges from 0 to 40 visits and has a sample mean of 2.3532 and variance of 11.9818, suggesting overdispersion relative to Poisson regression. Indeed, a comparison of Poisson and NB distributions solely on the response variable using a likelihood ratio test (lmtest R package [30]) shows evidence of overdispersion with a chi-square test statistic of 1165.267 and p-value < 0.001.
After fitting three regression models and comparing them via AIC and BIC, Table 3 shows that the DW model was only marginally superior to the NB, but both DW and NB regression models gave much better fits to the data than the Poisson regression model. The left panel of Figure 7 indicates a case of overdispersion relative to Poisson regression across the whole range of the covariates. Additionally, it indicates the better fit of the NB and DW models than Poisson regression with VR values closer to 1. The right panel confirms the good fit of NB and DW regression. For visualisation purposes, the small number of observations larger than 16 are grouped together in this plot. Finally, Figure 8 shows that the residuals closely followed a normal distribution (KS p-value: 0.06), with not many points falling outside the simulated 95% envelope’s bounds.

6.3. The Case of Excessive Zeros: Doctor Visits from the United States Data

The following dataset illustrates the case of excessive zero counts. Thus, besides the Poisson, NB, and DW regression, we also include zero-inflated and hurdle models in the comparison. In particular, zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), hurdle Poisson (HP) and hurdle negative binomial (HNB) are considered with the logit link function for the binomial distribution representing the probability of the extra zeros (R package pscl [26]). The data are available from the Ecdat R package, under the name Doctor. The data consist of 485 observations from the United States in the year 1986 and contain four variables for each patient: the number of doctor visits, which is taken as the response, and the number of children in the household, a measure of access to health care and a measure of health status (larger positive numbers are associated with poorer health). The response variable in this study, the number of doctor visits, consists of approximately 50% zeros, and thus it could be considered as a case of excessive zeros. Indeed, the response variable has a mean of 1.6103 and variance of 11.2011, and the likelihood ratio test between NB and Poisson regression returned a test statistic of 599.61 and a p-value < 0.001.
Table 4 shows the best fit for the DW regression model in terms of AIC and BIC. The left panel of Figure 9 shows a case of overdispersion relative to Poisson regression across the full range of covariates and a good fit for DW and ZINB regression. We excluded Poisson regression from this plot, as the VR values were very large in this case, as well as the hurdle models, as they provided almost identical results to the corresponding zero-inflated models. The right panel confirms the good fit of ZINB and DW regression. For visualisation purposes, the small number of observations larger than 12 are grouped together on this plot. As in the previous example, the residuals of the DW model were well approximated by a normal distribution (KS p-value: 0.856). This example shows how DW regression in its simplest form can also model cases of excessive zeros, although additional zero-inflated components could also be added to a DW model if necessary and will be explored in future work.

6.4. The Case of a Mixed Level of Dispersion: Bids Data

In this section, we report the analysis of a dataset for which a mixed level of dispersion was observed; that is, the conditional distribution is overdispersed relative to Poisson regression for some covariate pattern but is underdispersed for another covariate pattern. The data are taken from [31] and are available in the Ecdat R package under the name of Bids. The data record the number of bids received by 126 U.S. firms that were targets of tender offers during a certain period of time. The dependent variable here is the number of bids, with a mean of 1.7381 and a variance of 2.0509. The objective of the study was to investigate the effect of some variables on the number of bids. For this analysis, we considered the following covariates: bid price, taken as the price at a particular week divided by the price 14 working days before the bid; the size, that is, the total book value of assets measured in billions of dollars; and a regulator, a dummy variable, which was 1 if there was an intervention by federal regulators and 0 otherwise.
Figure 10 and Table 5 show once again a very good fit of the DW regression model to these data, compared to Poisson and NB regression. Figure 10 shows a mixed level of dispersion relative to Poisson and NB regression, with most covariate patterns leading to underdispersion, but with a small number of overdispersed cases. The DW model has a clearer distribution of VR values around 1, and it also fits the data well, with a KS p-value of 0.11 for the randomised quantile residuals.

7. Conclusions

In this paper, we introduce a regression model based on a DW distribution and show how this model can be seen as a simple and unified model to capture different levels of dispersion in the data, namely, underdispersion and overdispersion relative to Poisson regression. This is an attractive feature of DW regression, similar to the flexibility of the continuous Weibull distribution in adapting to a variety of hazard rates. In addition, the proposed DW regression model, unlike generalised linear models in which the conditional mean is central to the interpretation, has the advantage that the conditional quantiles can be easily extracted from the fitted model and the regression coefficients can be easily interpreted in terms of changes in the conditional median. This is particularly useful, as most count data have a highly skewed distribution.
A popular model for underdispersion is the COM–Poisson regression model. However, the probability mass function of COM–Poisson regression is not in a closed form and it contains an infinite sum, which requires an approximate computation. In fact, the COM–Poisson implementation that was used for the examples in this paper requires more computational time than the DW regression model, which uses a straightforward MLE procedure. This is particularly beneficial in the case of large sample sizes. While NB regression is the most widely applied model for overdispersion, the DW regression model is shown to be an attractive alternative to the NB regression model for overdispersion.
The DW regression model described in this paper is implemented in the R package DWreg, freely available in CRAN [32].

Acknowledgments

This work was supported by the Major Program of Hubei Provincial Department of Education for Philosophy and Social Science Research (Grant No. 17ZD018) and the National Institute for Health Research Method Grant (NIHR-RMOFS-2013-03-09).

Author Contributions

Under the idea and detailed guidance of the corresponding author, Hadeel S. Klakattawi together with Veronica Vinciotti explored details of inference. Hadeel S. Klakattawi carried out all numerical calculations and a first draft, and Veronica Vinciotti and Keming Yu polished the writing. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DW      Discrete Weibull
NB      Negative binomial
GP      Generalised Poisson
ZIP     Zero-inflated Poisson
ZINB    Zero-inflated negative binomial
HP      Hurdle Poisson
HNB     Hurdle negative binomial
MSE     Mean-squared error
MLE     Maximum likelihood estimator
VR      Variance ratio
KS      Kolmogorov–Smirnov

References

  1. Cameron, A.C.; Trivedi, P.K. Regression Analysis of Count Data; Cambridge University Press: Cambridge, UK, 2013.
  2. Hilbe, J.M. Modeling Count Data; Cambridge University Press: Cambridge, UK, 2014.
  3. Nelder, J.A.; Wedderburn, R.W. Generalized linear models. J. R. Stat. Soc. Ser. A 1972, 135, 370–384.
  4. Efron, B. Double exponential families and their use in generalized linear regression. J. Am. Stat. Assoc. 1986, 81, 709–721.
  5. Famoye, F. Restricted generalized Poisson regression model. Commun. Stat. Theory Methods 1993, 22, 1335–1354.
  6. Sellers, K.F.; Shmueli, G. A flexible regression model for count data. Ann. Appl. Stat. 2010, 4, 943–961.
  7. Sáez-Castillo, A.; Conde-Sánchez, A. A hyper-Poisson regression model for overdispersed and underdispersed count data. Comput. Stat. Data Anal. 2013, 61, 148–157.
  8. Nakagawa, T.; Osaki, S. The discrete Weibull distribution. IEEE Trans. Reliab. 1975, 24, 300–301.
  9. Khan, M.A.; Khalique, A.; Abouammoh, A. On estimating parameters in a discrete Weibull distribution. IEEE Trans. Reliab. 1989, 38, 348–350.
  10. Kulasekera, K. Approximate MLE’s of the parameters of a discrete Weibull distribution with type I censored data. Microelectron. Reliab. 1994, 34, 1185–1188.
  11. Englehardt, J.D.; Li, R. The discrete Weibull distribution: An alternative for correlated counts with confirmation for microbial counts in water. Risk Anal. 2011, 31, 370–381.
  12. Englehardt, J.D.; Ashbolt, N.J.; Loewenstine, C.; Gadzinski, E.R.; Ayenu-Prah, A.Y. Methods for assessing long-term mean pathogen count in drinking water and risk management implications. J. Water Health 2012, 10, 197–208.
  13. Roy, D. Discrete Rayleigh distribution. IEEE Trans. Reliab. 2004, 53, 255–260.
  14. Sato, H.; Ikota, M.; Sugimoto, A.; Masuda, H. A new defect distribution metrology with a consistent discrete exponential formula and its applications. IEEE Trans. Semicond. Manuf. 1999, 12, 409–418.
  15. Barbiero, A. DiscreteWeibull: Discrete Weibull Distributions (Type 1 and 3), 2015. R Package Version 1.0.1. Available online: https://cran.r-project.org/web/packages/DiscreteWeibull/index.html (accessed on 17 February 2018).
  16. Da Silva, M.F.; Ferrari, S.L.; Cribari-Neto, F. Improved likelihood inference for the shape parameter in Weibull regression. J. Stat. Comput. Simul. 2008, 78, 789–811.
  17. Lee, E.T.; Wang, J. Statistical Methods for Survival Data Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2003.
  18. Dunn, P.K.; Smyth, G.K. Randomized quantile residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
  19. Ospina, R.; Ferrari, S.L. A general class of zero-or-one inflated beta regression models. Comput. Stat. Data Anal. 2012, 56, 1609–1623.
  20. Vanegas, L.H.; Rondón, L.M.; Cordeiro, G.M. Diagnostic tools in generalized Weibull linear regression models. J. Stat. Comput. Simul. 2013, 83, 2315–2338.
  21. Ferrari, S.; Cribari-Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 2004, 31, 799–815.
  22. Garay, A.M.; Hashimoto, E.M.; Ortega, E.M.; Lachos, V.H. On estimation and influence diagnostics for zero-inflated negative binomial regression models. Comput. Stat. Data Anal. 2011, 55, 1304–1318.
  23. Atkinson, A.C. Plots, Transformations, and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis; Clarendon Press: Oxford, UK, 1985.
  24. Sellers, K.; Lotze, T. COMPoissonReg: Conway-Maxwell Poisson (COM-Poisson) Regression. 2011. R Package Version 0.3.4. Available online: https://mran.microsoft.com/snapshot/2014-11-09/web/packages/COMPoissonReg/index.html (accessed on 17 February 2018).
  25. Yee, T.W. The VGAM package for categorical data analysis. J. Stat. Softw. 2010, 32, 1–34.
  26. Zeileis, A.; Kleiber, C.; Jackman, S. Regression Models for Count Data in R. J. Stat. Softw. 2008, 27.
  27. Dayton, C.M. Model comparisons using information measures. J. Mod. Appl. Stat. Methods 2003, 2, 2.
  28. Grunwald, G.K.; Bruce, S.L.; Jiang, L.; Strand, M.; Rabinovitch, N. A statistical model for under-or overdispersed clustered and longitudinal count data. Biom. J. 2011, 53, 578–594.
  29. Hilbe, J.M. COUNT: Functions, Data and Code for Count Data. 2014. R Package Version 1.3.2. Available online: https://cran.r-project.org/web/packages/COUNT/index.html (accessed on 17 February 2018).
  30. Zeileis, A.; Hothorn, T. Diagnostic Checking in Regression Relationships. R News 2002, 2, 7–10.
  31. Cameron, A.C.; Johansson, P. Count data regression using series expansions: With applications. J. Appl. Econom. 1997, 12, 203–223.
  32. DWreg: Parametric Regression for Discrete Response. Available online: https://cran.r-project.org/web/packages/DWreg/index.html (accessed on 17 February 2018).
Figure 1. The effect of β on the discrete Weibull (DW) probability mass function with q = 0.6.
Figure 2. Ratio of observed and theoretical variance of data simulated by DW(q, β). Left: q = 0.7; data fitted by Poisson and negative binomial (NB) regression. Right: a range of q and β values; data fitted by Poisson regression. The area below 1 corresponds to cases of underdispersion relative to Poisson regression, whereas the area above 1 corresponds to cases of overdispersion.
Figure 3. Effect of q on the probability mass function of the discrete Weibull (DW) distribution for different β values.
Figure 4. Distribution of ratios of observed and theoretical conditional variance on simulated data from the discrete Weibull (DW) regression model, with the theoretical variance fitted by Poisson, negative binomial (NB) and DW models.
Figure 5. Comparison of discrete Weibull (DW) regression with the other regression models on inhaler use data. Left: distribution of ratios of observed and theoretical conditional variance on the data fitted by Poisson, negative binomial (NB) and DW regression, respectively. Right: observed and expected frequencies for each model.
Figure 6. Residuals analysis for inhaler use data for the discrete Weibull (DW) regression model: histogram of randomised quantile residuals with superimposed N(0, 1) density (dotted line).
Figure 7. Comparison of discrete Weibull (DW) with negative binomial (NB) and Poisson regression on doctor visits from German Health Survey data. Left: distribution of ratios of observed and theoretical conditional variance on the data fitted by Poisson, NB and DW regression, respectively. Right: observed and expected frequencies for each model.
Figure 8. Residuals analysis for doctor visits from German Health Survey data: Q-Q plot of randomised quantile residuals of the discrete Weibull (DW) regression model.
Figure 9. Comparison of discrete Weibull (DW) regression with negative binomial (NB), zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) regression on doctor visits from the United States data. Left: distribution of ratios of observed and theoretical conditional variance on the data. Right: observed and expected frequencies for each model.
Figure 10. Distribution of ratios of observed and theoretical conditional variance on the Bids data fitted by Poisson, negative binomial (NB) and discrete Weibull (DW) regression.
Table 1. Simulation study: discrete Weibull (DW) parameter estimates by maximum likelihood estimators (MLEs), together with bias, mean-squared error (MSE) and length of the 95% confidence interval (CI), averaged over 1000 iterations.

| Sample size | Parameter | MLE | Bias | MSE | 95% CI length |
|---|---|---|---|---|---|
| n = 25 | α₀ | 0.6467 | 0.1467 | 0.2932 | 1.7324 |
| | α₁ | 0.4908 | 0.0908 | 0.0763 | 0.8821 |
| | α₂ | −0.3651 | −0.0651 | 0.0241 | 0.4694 |
| | β | 2.5924 | 0.4924 | 0.8455 | 2.4718 |
| n = 600 | α₀ | 0.5074 | 0.0074 | 0.009 | 0.3611 |
| | α₁ | 0.402 | 0.002 | 0.0021 | 0.1782 |
| | α₂ | −0.3033 | −0.0033 | 0.0004 | 0.0789 |
| | β | 2.1196 | 0.0196 | 0.0086 | 0.3549 |
Table 2. Maximum likelihood estimates, AIC and BIC from different regression models fitted to the inhaler use data.

| Model | Humidity | Pressure | Temperature | Particles | Other | AIC | BIC |
|---|---|---|---|---|---|---|---|
| Poisson | −0.1125 | 4.0950 | −0.2035 | 0.0225 | | 13915.47 | 13948.26 |
| NB | −0.1125 | 4.0950 | −0.2035 | 0.0225 | k̂ = 31905.28 | 13917.54 | 13956.89 |
| COM–Poisson | −0.1724 | 6.2864 | −0.3128 | 0.0348 | ν̂ = 1.9203 | 13450.77 | 13490.12 |
| DW | −0.1050 | 2.6376 | −0.1735 | 0.0136 | β̂ = 2.1277 | 13484.36 | 13523.71 |
Table 3. Maximum likelihood estimates, AIC and BIC from different regression models fitted to the doctor visits from German Health Survey data.

| Model | Bad Health | Age | Other | AIC | BIC |
|---|---|---|---|---|---|
| Poisson | 1.1083 | 0.0058 | | 5638.552 | 5653.634 |
| NB | 1.1073 | 0.0070 | k̂ = 0.9975 | 4475.285 | 4495.394 |
| DW | 1.0068 | 0.0120 | β̂ = 0.9887 | 4474.973 | 4495.083 |
Table 4. Maximum likelihood estimates, AIC and BIC from different regression models fitted to doctor visits from the United States data.

| Model | Children | Access | Health | Other | AIC | BIC |
|---|---|---|---|---|---|---|
| Poisson | −0.1759 | 0.9369 | 0.2898 | | 2179.487 | 2196.223 |
| NB | −0.1706 | 0.4197 | 0.3154 | k̂ = 0.5525 | 1581.88 | 1602.801 |
| Zero-inflated Poisson: count model | −0.1498 | 0.8053 | 0.1736 | | 1885.813 | 1919.287 |
| Zero-inflated Poisson: logit model | 0.0843 | −0.1048 | −0.4147 | | | |
| Zero-inflated NB: count model | −0.1414 | 0.6491 | 0.2239 | k̂ = 0.6869 | 1578.5 | 1616.158 |
| Zero-inflated NB: logit model | 0.2465 | 1.2085 | −2.0676 | | | |
| Hurdle: logit model | −0.1462 | 0.4252 | 0.4524 | | | |
| Hurdle: Poisson count model | −0.1506 | 0.8143 | 0.1733 | | 1885.808 | 1919.281 |
| Hurdle: NB count model | −0.1664 | 0.5404 | 0.2157 | k̂ = 0.2596 | 1576.302 | 1613.959 |
| DW | −0.1309 | 0.3403 | 0.2758 | β̂ = 0.7823 | 1575.796 | 1596.717 |
Table 5. Maximum likelihood estimates, AIC and BIC from different regression models fitted to Bids data.

| Model | Price | Size | Regulator | Other | AIC | BIC |
|---|---|---|---|---|---|---|
| Poisson | −0.7849 | 0.0362 | 0.0547 | | 402.2602 | 413.6054 |
| NB | −0.7824 | 0.0369 | 0.0544 | k̂ = 33.3289 | 403.9481 | 418.1295 |
| DW | −0.6761 | 0.0552 | 0.0293 | β̂ = 1.9403 | 395.1214 | 409.3028 |

Share and Cite

MDPI and ACS Style

Klakattawi, H.S.; Vinciotti, V.; Yu, K. A Simple and Adaptive Dispersion Regression Model for Count Data. Entropy 2018, 20, 142. https://doi.org/10.3390/e20020142
