On Discrete Poisson–Mirra Distribution: Regression, INAR(1) Process and Applications

: Several pieces of research have spotlighted the importance of count data modelling and its applications in real-world phenomena. In light of this, a novel two-parameter compound-Poisson distribution is developed in this paper. Its mathematical functionalities are investigated. The two unknown parameters are estimated using both maximum likelihood and Bayesian approaches. We also offer a parametric regression model for the count datasets based on the proposed distribution. Furthermore, the ﬁrst-order integer-valued autoregressive process, or INAR(1) process, is also used to demonstrate the utility of the suggested distribution in time series analysis. The unknown parameters of the proposed INAR(1) model are estimated using the conditional maximum likelihood, conditional least squares, and Yule–Walker techniques. Simulation studies for the suggested distribution and the INAR(1) model based on this innovative distribution are also undertaken as an assessment of the long-term performance of the estimators. Finally, we utilized three real datasets to depict the new model’s real-world applicability.


Introduction
Many studies have underlined the importance of modelling count data as well as the time series of counts, which has sparked considerable interest in a variety of sectors, including medical science, earth science, physics, finance, and insurance. For modelling count datasets, the Poisson distribution is the most frequently used, but it has the drawback of not being able to model over-dispersed datasets, while the negative binomial distribution is used for modelling over-dispersed data. When investigating counting data, the existence of at least one overdispersion warrants extra consideration when selecting a count model (for details, see [1,2]). As a consequence, the conventional Poisson regression model is rarely used in these situations.
Furthermore, one of the most widely utilized methodologies for analyzing timedependent data are time series analysis. The corresponding time series is known as a time series of counts or, equivalently, an integer-valued time series when the data are generated by a random counting procedure. The demand for modelling count time series is seen in diverse real-life situations-for example, the monthly number of earthquakes in a specific place, the monthly number of automobile sales, the monthly number of deaths due to a specific disease, the monthly number of traffic accidents, and the number of living As indicated by [11], the hf of the MiD is shaped as a bathtub, decreasing for t < 2 α , and increasing for t > 2 α . As a special case, for α = θ = λ, the Xgamma distribution (XGD) is obtained with the pdf given by where t > 0 and the parameter λ > 0. For more information on the XGD, see [12].

Presentation
Through the following proposition, a novel mixed-Poisson distribution is introduced by compounding the Poisson and Mirra distributions.
Proof. The pmf of X can be derived using the general compounding formula as follows: We have employed the gamma function defined by Γ(x) = Now, for α = θ = λ, the pmf of the PMiD reduces to p(x; λ) = λ 2 2(1 + λ) x+4 2(1 + λ) 2 + λ(x + 1)(x + 2) , x = 0, 1, 2, . . . . (2) The expression in Equation (2) is the pmf of Poisson-Xgamma distribution (PXGD), which was introduced by [13]. Thus, the PXGD is a special case of the PMiD. Now, the possible pmf plots for various values of the parameters of the PMiD are portrayed in Figure 2. The pmf appears to be declining, growing, and unimodal, with some fluctuation in the mode and tails.

Mode
The next result clarifies the mode analysis of the PMiD. Proposition 2. Let X be a random variable following a PMiD. Then, if α ≥ 8θ 2 (1+θ) 2 (θ+2) 2 , the mode of X, denoted by x m , exists, and lies in either of these two cases: Proof. We have to find the integer x = x m , for which p(x; α, θ) takes its maximum value. That is, we aim to solve p(x; α, θ) ≥ p(x − 1; α, θ) and p(x; α, θ) ≥ p(x + 1; α, θ), which is equivalent to solving: and respectively. On solving the quadratic inequality in Equation (4), we obtain a 1 (α, θ) ≤ x m ≤ b 1 (α, θ), and on solving the quadratic inequality in Equation (5), , a 2 (α, θ), and b 2 (α, θ) are given in Equation (3). Combining these three inequalities, we obtain the mode x m such that This completes the proof.

Cdf and Hf
The corresponding cdf of the PMiD is given by and the hf of the PMiD is given by where p(x; α, θ) and F(x; α, θ) are respectively given in Equations (1) and (6). Furthermore, plots in Figure 3 refer to the shapes of the hf of the PMiD.  The hf is found to have all of the typical shapes, such as increasing, decreasing, bathtub, and upside-down bathtub shapes.

Moments
Some aspects of the PMiD are now being studied using various moment measures. Proposition 3. The r th factorial moment of X ∼ PMiD(α, θ) is given by Proof. Based on the compound-Poisson theory (see [14]), the r th factorial moment of X can be obtained as follows: Thus, the proof is complete.
The first four factorial moments of the PMiD can be obtained by substituting r = 1, 2, 3, and 4 in Equation (7). That is, Now, the first four moments about the origin of the PMiD are obtained by utilizing the general relationship between factorial moments and moments about the origin. We get Therefore, the variance of the PMiD is obtained as The dispersion index of the PMiD is given by Since α and θ > 0, one can establish that the PMiD's dispersion index is greater than one, i.e., DI > 1, indicating that the PMiD is over-dispersed. The following formulae can be used to get explicit expressions for the skewness and kurtosis of the PMiD: and Now, Table 1 shows some numerical values for the mean, variance, DI, skewness, and kurtosis for the PMiD for various parameter settings.

Proof.
Owing to the well-known compound-Poisson theory, the pgf of the PMiD is obtained as follows: Thus, the proof is complete. When s is replaced by e t and e it in Equation (11), we obtain the moment generating function (mgf ) and characteristic function (cf ) of the PMiD, respectively. They are respectively given by for t ≤ 0, and for t ∈ R.

Rényi and Shannon Entropies
Entropy is a measure of uncertainty fluctuation in a stochastic situation, in which higher entropy indicates less information. The most popular entropy measures are Rényi entropy and Shannon entropy, which are among the most accessible in the literature.
For every discrete distribution with pmf p(x), the related Rényi entropy is defined by for γ > 0 and γ = 1.
In the context of the PMiD, by using Equation (1), we obtain Thus, the Rényi entropy of the PMiD is simplified to the following formula: Now, the Shannon entropy for a discrete distribution with pmf p(x) is given by Hence, the Shannon entropy for the PMiD can be expressed as

Estimation of the Parameters
Hereafter, we perform the estimation of parameters of the PMiD using two well-known estimation approaches: ML and Bayesian methods.

Maximum Likelihood Estimation
Let X 1 , X 2 , . . . , X n be a random sample of size n from X ∼ PMiD(α, θ) (so n independent and identically distributed (iid) random variables with the PMiD), with unknown α and θ, and x 1 , x 2 , . . . , x n be observations of X 1 , X 2 , . . . , X n . Then, the log-likelihood function is The maximization of log L n with respect to the parameters give their ML estimates (MLEs).
The following approach can be considered. The score function associated with this log-likelihood function is

Now, by solving
∂ log L n ∂α = 0, and ∂ log L n ∂θ = 0, we obtain the associated nonlinear log-likelihood equations. They are respectively given by The solutions of these two equations give the MLEs. We obtained the MLEs numerically using the fitdistrplus package of the R software (see [15]). For more details on the fitdistrplus package, one should go through the lin k https://CRAN.R-project.org/ package=fitdistrplus accessed on 14 February 2021. For the detailed R-code for finding the MLEs of the PMiD, see Appendix A of this article.

Bayesian Estimation
The Bayesian estimation technique is used to estimate the PMiD parameters in this subsection. That is, each parameter of PMiD must have some prior densities. For both of the parameters α and θ, the half-Cauchy (hC) distribution is used as the prior densities. The hereunder is the pdf of the hC distribution with scale parameter δ: There is no mean and variance for the hC distribution. Other than that, its mode is equal to 0. The hC distribution with the value of δ equals 25 is the preferable alternative to the uniform distribution, if more information is needed, according to [16]. Thus, as a noninformative prior distribution for the parameters α and θ, we utilize hC distribution with its δ value fixed to 25. That is, we use Thus, using Equation (12), the joint posterior pdf is given by where L n is the likelihood function for the PMiD, and ψ(x) is the pdf of the hC distribution with δ = 25. It is clear from Equation (13) that there is no analytical solution for determining the Bayesian estimates. As a consequence, we adopt the Metropolis-Hastings algorithm (MHA) of the Markov Chain Monte Carlo (MCMC) technique, which is a phenomenal simulation method, using the R software.

Performance of the PMiD Parameters Using Simulation Study
For some finite sample sizes, we execute the simulation studies to test the long-run accuracy of the MLEs of the PMiD parameters. We have generated samples of sizes n = 100, 250, 500, 750, and 1000 from the PMiD using various sets of parameter values. The R-code for generating the PMiD random samples for the specified parameter settings are given in Appendix A. The iteration is conducted 1001 times in this case. As a consequence, we calculated the average of the biases, mean squared errors (MSEs), coverage probabilities (CPs), and average lengths (ALs) of each parameter estimate for all iterations in the relevant sample sizes. The results are reported in Table 2. It can be seen that, as the sample size increases, the MSEs and ALs were associated with each estimate decrease. Interestingly, the CPs of the confidence intervals (CIs) for each parameter are relatively close to the nominal 95 percent level. This illustrates the steady performances of the MLEs.

PMiD Regression Model
In this section, we define a new count regression model based on the PMiD known as a PMiD regression model. Let Y ∼ PMiD(α, θ). By considering the re-parametrization α = θ 2 (µθ − 1)/(3 − µθ), the pmf of the PMiD can be expressed in terms of the mean E(Y) = µ > 0 and θ > 0, and it is given by The pmf in Equation (14) that defines a distribution is denoted as PMiD(µ, θ). Thus, we have Y ∼ PMiD(µ, θ). Assume that the response variable Y satisfies Y ∼ PMiD(µ, θ), and the covariates are in relation with the i th mean by the log-link function given as follows: where . . , v ip is the vector of covariates and τ = τ 0 , τ 1 , . . . , τ p T is the unknown vector of regression coefficients. The log-likelihood function of the PMiD regression model is derived by inserting the log-link function of Equation (15) in Equation (14), and is given by where y i is the i th observations of Y, µ i is given in Equation (15) for i = 1, 2, . . . , n, and Θ = (τ, θ) are obtained by maximizing Equation (16) using optim function in the R software.

INAR(1) Model with PMiD Innovations
The novel distribution is particularly well suited for modelling integer-valued time series data.
The stochastic process, {X t } t∈Z , is an INAR(1) process if it is given by where p ∈ [0, 1) and ε t is represented as the innovation process, thus composed of iid integer-valued random variables, with mean E(ε t ) = µ ε and variance Var(ε t ) = σ 2 ε . The operator symbol '•' represents the binomial thinning operator, which is defined as where B j j≥1 is a sequence of iid Bernoulli random variables with success probability, p.
For p ∈ [0, 1), the INAR(1) process is stationary, while, for the case p = 1, the process is non-stationary. The INAR(1) process, according to [4,5], is a homogeneous Markov chain with 1-step transition probabilities given by where p ∈ (0, 1). Therefore, in general, the mean, variance, and dispersion index of the INAR(1) process are, respectively, given by Var and where µ ε , σ 2 ε , and DI ε are given in Equations (8)-(10), respectively. For the detailed information on the INAR process, one can go through [17]. Now, in this section, based on the PMiD innovations, we present a new over-dispersed INAR(1) model, and we call it the PMiD-INAR(1) process.

Definition 1.
Assume that the innovation process {ε t } t∈Z follows a PMiD. Then, the (one-step) transition probability of the PMiD-INAR(1) process is given by Now, using Equations (17)- (19), the mean, variance, and dispersion index of the PMiD-INAR(1) process are respectively obtained as and Since DI X t is clearly greater than 1, the process is appropriate for over-dispersed integer valued autoregressive time series data. The conditional expectation and variance of the PMiD-INAR(1) process are given by and where σ 2 ε is given in Equation (9) (see [17,18]).

Estimation of the Parameters: PMiD-INAR(1) Process
In this section, the conditional maximum likelihood (CML), conditional least squares (CLS), and Yule-Walker (YW) methods are used to investigate the inference of the PMiD-INAR(1) process. To examine the efficiency of these three strategies, a simulation study is also carried out.

Conditional Maximum Likelihood (CML) Estimation
Let X 1 , X 2 , . . . , X T be a random sample of the stationary PMiD-INAR(1) process, and x 1 , x 2 , . . . , x T be observations of X 1 , X 2 , . . . , X T . Then, the conditional log-likelihood function is given by By applying the direct maximization technique on Equation (23), the respective CML estimates, sayp cls ,α cls , andθ cls corresponding to the parameters of the PMiD-INAR(1) process, p, α, and θ can be obtained using the optim function of the R software.

Conditional Least Squares (CLS) Estimation
By using Equation (21), the CLS estimates of the PMiD-INAR(1) process parameters ζ = (p, α, θ) are obtained by minimizing the function given below: As a result, by solving ∂ ∂ζ S(ζ) = 0, the system of equations based on the estimates can be derived. The CLS estimate of α is obtained aŝ Hence, in S(ζ), p is switched withp CLS , and the resultant function S(α, θ) should be minimized by concerning α and θ to obtain their CLS estimates. Sinceα CLS andθ CLS do not have the closed-form expression, we utilize optim function of the R software to obtain it numerically by minimizing S(α, θ).

Yule-Walker (YW) Estimation
The concept of the YW approach is to synchronously solve the theoretical moments with the empirical ones. Because of the autocorrelation function (ACF) of the INAR(1) process at lag τ is ρ x (τ) = p τ , the YW estimate of p is given bŷ Now, the theoretical mean and dispersion index of the PMiD-INAR(1) process is then solved with their empirical equivalents to derive the YW estimates of α and θ. When the theoretical mean is equated to the empirical mean, the YW estimateα YW of α is calculated as follows:α wherex = 1 N ∑ T t=1 x t . Then, substitutingα YW with Equation (24) in Equation (20) and equating Equation (20) to the empirical value of the dispersion index ( DI X t ), we obtain the YW estimateθ YW of θ. Sinceθ YW does not have the closed-form expression, we utilize the uniroot function available in the R software to obtain it numerically.

Simulation: PMiD-INAR(1) Process
In this section, a simulation study is conducted for the comprehensive assessment of the long-standing performances of CML, YW, and CLS estimates of the PMiD-INAR(1) process parameters. We have generated samples of sizes n = 100, 250, 500, 750, and 1000 using values of the parameter setting (p = 0.5, α = 0.6, θ = 0.7), and the iteration process is conducted 1001 times. The simulation results are analyzed using estimated biases, MSEs, and mean relative errors (MREs), and are presented in Table 3. The following formulae are used to determine these values: where ζ = (p, α, θ).

Applications and Empirical Study
To show the usage of the PMiD model, we utilize three real datasets in this section: the first is COVID-19 data, the second is hospital length of stay, and the third is the number of earthquakes data.

COVID-19 Data: Armenia
First, we utilize the dataset of the daily new deaths in Armenia due to the COVID-19 disease. The data are available at https://www.worldometers.info/coronavirus/country/ armenia/ accessed on the 10 January 2021 and are also studied by [19]. They contain the daily new COVID cases between 15 February 2020 and 4 October 2020. To demonstrate the PMiD's potential benefit, the distributions given in Table 4 are considered for comparison. We compare the competitive distributions to the suggested distribution using the statistical techniques provided, namely the negative log-likelihood (− log L), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), χ 2 statistic, and p-value. Tables 5 and 6 display the corresponding MLEs (with standard errors (SEs) and CIs) and goodness-of-fit (GOF) results, respectively. The PMiD's GOF statistics values are less than the other examined distributions, as shown in these tables. As a result, the suggested model is the most appropriate for modelling the given COVID-19 data. Now, the empirical mean, variance, and DI of this Armenia dataset are 4.1931, 18.7944, and 4.4822, respectively, and the theoretical values for the mean, variance, and DI measures of the PMiD are 4.1931, 19.6659, and 4.6900. It is observed that the empirical and the theoretical means are exactly the same, and the empirical and the theoretical values of variances, and DIs are the closest to each other. The empirical cdf, pmf, and P-P plots for the Armenia dataset are respectively given in Figures 4-6. It again gives some better-shaped curves for those fitted in the PMiD.   The next goal is to use the Bayesian technique to estimate the parameters of the PMiD using the above-mentioned COVID-19 dataset. The analysis was carried out using the MHA of the MCMC technique with 1000 iterations in this perspective. For comparison purposes, both the Bayesian and MLE estimates of the PMiD parameters for the real dataset are given in Table 7. R programming is used to perform the numerical computations.

Length of Hospital Stay
By applying the PMiD regression model to an actual dataset, we can confirm its prominence. The dataset is about the 1991 Arizona cardiovascular patients (AZPRO data), which is available in COUNT package of the R software. The goal of this study is to investigate the combined relationship between the length of hospital stay (y i ) with the covariates x i1 -the cardiovascular procedures (CABG = 1, PTCA = 0), x i2 -the sex of the patients (male = 1, f emale = 0), x i3 -the type of the admission (urgent = 1, elective = 0), and x i4 -the age of the patients ((age > 75) = 1, (age ≤ 75) = 0). The fitted nonlinear regression model is given by In Table 8, we compare the performance of the PMiD regression model with the Poisson regression model, the NPWE regression model, and the PXGD regression model and also display the summaries due to the real dataset, which include the SEs, p-value, negative log-likelihood, AIC, and BIC values. Table 8 points out that the PMiD regression model has the lowest values across all the model selection criteria, indicating that it is the better count regression model than all the Poisson, NPWE, and PXGD regression models.

Japan Earthquake Data
We use data from the ETAS package of R software to calculate annual counts of earthquakes with a magnitude of 4.5 or higher that occurred in Japan between the years 1926 and 2007. The data comprise 82 observations. For more details on this package, one can go through [23]. In addition, Figure 7 depicts the Japan earthquake catalog. The spatial distribution of earthquakes in the study area is depicted in the top-left picture. The three figures in the right half of Figure 7 depict variations in the latitude, longitude, and magnitude of the earthquakes over time. The two figures in the bottom-left corner of Figure 7 depict the earthquake catalog's completeness and time stationarity. Here, the number of earthquakes having a magnitude higher than or equal to m is represented by N m . If the plot of log(N m ) versus m exhibits a linear trend, it reflects the completeness of the earthquake catalog. Furthermore, the time stationarity of the catalog can be determined by looking at the plot of N t versus t, where N t is the total number of occurrences in the catalog up to time t. The earthquake time series is stationary if the plotted points of N t versus t have a functional form in such a way that N t = λ 0 t, where λ > 0.
We  (1)), for the comparison. For more details of these innovations, see Table 4. Now, Table 9 displays the results corresponding to the earthquake data. We see that the − log L, AIC, and BIC values of the PMiD-INAR(1) model is smaller than those of the other compared INAR(1) models, and we conclude that the PMiD-INAR(1) is the most suitable model for the earthquake data compared to that of the other models.  The residual analysis is done to make sure the fitted model is accurate for the earthquake data. To do so, the Pearson residuals for the PMiD-INAR(1) process are computed using the following formula: where E(X t |X t−1 = x t−1 ) and Var(X t |X t−1 = x t−1 ) can be found in Equations (21) and (22), respectively. In general, the mean and variance of Pearson residuals should be near zero and one, respectively, and the computed Pearson residuals should not have any autocorrelation issues if the fitted INAR(1) process is statistically accurate. Here, the obtained Pearson residuals of the PMiD-INAR(1) process have a mean and variance of 0.0012 and 1.1612, respectively, which were quite close to the anticipated values. Therefore, the PMiD-INAR(1) process is well judged for the given earthquake data. Now, the fitted PMiD-INAR(1) process is obtained as follows: where the innovation process ε t ∼ PMiD(0.6869, 0.0247). Now, we can calculate the predicted number of earthquakes in Japan viâ

Context
Discovering new count models is inevitable in the scenario of overdispersion, which will provide more possibilities for superiorly fitting the real datasets by choosing the right models according to the situations. In that sense, we proposed a new overdispersed count model, discussed its regression aspects, and constructed an INAR(1) process based on them. The main motivation behind the construction of this model is also discussed. In all the aspects, we found that the proposed model outperforms the compared models.

This Work
The Poisson-Mirra distribution (PMiD), a novel two-parameter discrete distribution, is introduced and thoroughly investigated. We delivered specific expressions for the factorial moments, mean, variance, dispersion index, skewness, kurtosis, mode, probability generating function, moment generating function, characteristic function, and the entropy measures. The distribution parameters were estimated by using the classical maximum likelihood and Bayesian estimation methods and were also studied in their real-data analysis. A simulation study on MLE was also conducted. A new regression model for count data based on the PMiD is proposed and compared with its competitive regression models based on a real dataset. More importantly, we introduced the integer-valued autoregressive model based on the PMiD, known as the PMiD-INAR(1) process. The parameter estimation problems, which include the conditional maximum likelihood, conditional least squares, and the Yule-Walker estimation procedures for the PMiD-INAR(1) process, are discussed, and simulation studies based on these three estimation procedures are also conducted. In total, three real-world datasets were used to demonstrate the use of the novel model, the first regarding COVID-19 data, the second regarding the length of hospital stay, and the third concerning earthquake data.

Contributions and Limitations
The article thus developed the new distribution PMiD from which we derived a new count model, its regression model, and the first-order integer-valued autoregressive model.
The proposed distribution, we believe, is ideal for researching data from COVID-19 related areas, biology, and earthquake-related fields, and we hope the PMiD applies to other fields of study also. The absence of a bimodal feature is the possible limitation of the proposed distribution.

Future Work
This research could take another route if the bivariate version of the PMiD and the associated BINAR(1) process are built. This work needs considerable modifications and studies, which we shall delegate to future research.

Conclusions
In this article, we fitted three real datasets and concluded that the PMiD model outperforms all the compared models in all aspects. We anticipate that the proposed model will increase its prevalence and have a wider variety of applications in the modelling of positive integer-valued real-world datasets from different study areas such as physics, astronomy, survival analysis, and so on.

Acknowledgments:
We appreciate the constructive feedback from the three reviewers on the first version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.