A New Family of Discrete Distributions with Mathematical Properties, Characterizations, Bayesian and Non-Bayesian Estimation Methods

Abstract: In this work, we propose and study a new family of discrete distributions. Many useful mathematical properties, such as ordinary moments, the moment generating function, cumulant generating function, probability generating function, central moments, and dispersion index, are derived. Some special discrete versions are presented. A certain special case is discussed graphically and numerically. The hazard rate function of the new class can be “decreasing”, “upside down”, “increasing”, and “decreasing-constant-increasing (U-shape)”. Some useful characterization results based on the conditional expectation of a certain function of the random variable and in terms of the hazard function are derived and presented. Bayesian and non-Bayesian methods of estimation are considered. The Bayesian estimation procedure under the squared error loss function is discussed. Markov chain Monte Carlo simulation studies for comparing non-Bayesian and Bayesian estimations are performed using the Gibbs sampler and Metropolis–Hastings algorithm. Four applications to real data sets are employed for comparing the Bayesian and non-Bayesian methods. The importance and flexibility of the new discrete class are illustrated by means of four real data applications.


Introduction and Genesis
The discretization of existing continuous probability distributions has received noticeable attention in recent years. In this paper, we present and study a new discrete analogue based on the continuous Rayleigh distribution, called the discrete Rayleigh G (DRG) family of distributions. The relevant mathematical properties, such as moments, the cumulant generating function, moment generating function, probability generating function, central moments, and dispersion index (DisIx), are derived and analyzed. A special version of the new family related to the well-known Weibull model is discussed. Some classical (non-Bayesian) estimation methods, namely Cramér-von Mises estimation (CVME), ordinary least squares estimation (OLSE), maximum likelihood estimation (MLE), and weighted least squares estimation (WLSE), are described and considered. Since the conditional posteriors of the parameters cannot be obtained in any standard forms, using a hybrid Markov chain Monte Carlo for drawing samples from the joint posterior of the parameters is suggested. The Bayesian estimation procedure under the squared error loss function (SELF) is also presented. Markov chain Monte Carlo (MCMC) simulations for comparing non-Bayesian and Bayesian estimations are performed. The Gibbs sampler and Metropolis-Hastings (M-H) algorithm are employed. The flexibility and importance of the new family are illustrated by four real data sets. The new family provides a better fit than sixteen competitive distributions. Many special member distributions could be considered and studied in the future. As future work, we may consider the bivariate and the multivariate extensions of this new family.
In probability theory, the Rayleigh distribution is a continuous probability model for nonnegative-valued random variables (NNRVs). A Rayleigh distribution can often be observed when the overall magnitude of a vector is related to its directional components. A random variable (RV) is said to have the continuous Rayleigh distribution if its cumulative distribution function (CDF) is given by Equation (1), and the corresponding reliability function (RF) is given by Equation (2). The probability mass function (PMF) of the discrete analogue, the DRG family, corresponding to (2) is the difference of the RF at consecutive integers, $f(x) = S(x) - S(x + 1)$, where $S$ denotes the RF in (2); substituting (2) gives the PMF in Equation (3).
In the statistical literature, many discrete versions of continuous distributions have been proposed and studied, such as a generalization of the Poisson distribution [2], the discrete Weibull (DW) [3], the discrete Rayleigh (DR) [4], the discrete half-normal distribution [5], the discrete Pareto (DPa) [6], a new discrete geometric (DGc) [7], the discrete inverse Weibull (DIW) [8], the discrete Lindley (DLi) [9], the discrete generalized exponentiated type II (DGE-II) [10], the discrete inverse Rayleigh (DIR) [11], the exponentiated discrete Weibull (EDW) [12], the discrete Lindley type II (DLi-II) [13], the discrete Lomax (DLx) [14], the discrete log-logistic (DLL) [15], the discrete Burr type XII (DBXII) [14], and the exponentiated discrete Lindley (EDLi) [16], among others. Table 1 presents some new sub-models based on the DRG family. In Section 3, however, a comprehensive graphical and numerical analysis is given for the discrete Rayleigh Weibull (DRW) model. Many useful discretization processes can be applied to many existing continuous families (see [17][18][19][20][21][22][23][24]) and continuous probability distributions (see [25][26][27][28][29][30]).
This paper is organized as follows. Some mathematical properties of the DRG family are derived and analyzed in Section 2. Some useful characterization results are derived and presented in Section 3. Estimation and inference procedures are presented in Section 4. Markov chain Monte Carlo simulations for comparing Bayesian and non-Bayesian estimation methods are performed in Section 5. Four real data applications for comparing Bayesian and non-Bayesian estimation methods are presented in Section 6. Four applications for comparing the competitive discrete models are considered in Section 7. Finally, Section 8 offers some concluding remarks.
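The discretization step used in the Introduction, where the PMF is the difference of the reliability function at consecutive integers, can be sketched in Python. This is a minimal illustration applied to the plain continuous Rayleigh distribution with an assumed scale parameter `sigma`; it is not the DRG family itself, and the function names are hypothetical.

```python
import math

def rayleigh_sf(x, sigma=1.0):
    """Reliability function of the continuous Rayleigh distribution (assumed scale sigma)."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))

def discretize(sf, x):
    """PMF of the discrete analogue at integer x >= 0, computed as S(x) - S(x + 1)."""
    return sf(x) - sf(x + 1)

def sf_sigma2(t):
    """Rayleigh reliability function with the illustrative choice sigma = 2."""
    return rayleigh_sf(t, sigma=2.0)

# PMF on 0, 1, ..., 49; the probability mass beyond 49 is negligible for sigma = 2
pmf = [discretize(sf_sigma2, x) for x in range(50)]
total = sum(pmf)
```

Because the sum telescopes to S(0) - S(50), `total` should be numerically indistinguishable from 1, which is a quick sanity check for any reliability function plugged into `discretize`.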

Central Moments
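The $r$th central moment of $X$, say $\mu_r$, is $\mu_r = E[(X - \mu'_1)^r]$, where $\mu'_1 = E(X)$. The dispersion index, $\mathrm{DisIx}(X) = \mathrm{Var}(X)/E(X)$, is used to quantify whether a specific set of observations is "clustered" or "dispersed" compared to a specific statistical model. In particular, the DisIx is used to determine whether an observed real data set can be modeled via a Poisson process, for which the DisIx equals 1. For any real data set, when the DisIx is less than 1 the data set is said to be "under-dispersed", and when it is greater than 1 the data set is said to be "over-dispersed". A numerical analysis with its related calculations for the DisIx($X$) is presented in Table 2.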
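As a small illustration of how the DisIx is read in practice, the following sketch computes the sample dispersion index (sample variance over sample mean) for a made-up set of counts. The data and the function name are hypothetical, and the unbiased sample variance is an assumed choice.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample dispersion index: unbiased sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m

sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 7]  # hypothetical count data
dis_ix = dispersion_index(sample)

if dis_ix < 1:
    label = "under-dispersed"
elif dis_ix > 1:
    label = "over-dispersed"
else:
    label = "equi-dispersed"
```

For these counts the variance exceeds the mean, so the sample is labeled over-dispersed relative to a Poisson model.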
Proof. See Appendix A.2. □
The first $r$ derivatives of Equation (6) with respect to $t$, evaluated at $t = 0$, yield the first $r$ raw moments. The first cumulant is the mean, the second is the variance, and the third equals the third central moment. However, the fourth and higher order cumulants are not equal to central moments. In some cases, theoretical treatments of problems in terms of cumulants are simpler than those using moments. In particular, when two or more RVs are statistically independent, the $r$th order cumulant of their sum is equal to the sum of their $r$th order cumulants. Moreover, the cumulants can also be obtained recursively from the raw moments via $\kappa_r = \mu'_r - \sum_{m=1}^{r-1} \binom{r-1}{m-1} \kappa_m \mu'_{r-m}$, $r \ge 1$.
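The recursion between raw moments and cumulants can be checked numerically. The sketch below implements the standard recursion $\kappa_r = \mu'_r - \sum_{m=1}^{r-1} \binom{r-1}{m-1} \kappa_m \mu'_{r-m}$ and applies it to the raw moments of a Poisson distribution with mean 2, used here only as a convenient test case because all of its cumulants equal the mean. The function name is hypothetical.

```python
from math import comb

def cumulants_from_raw_moments(raw):
    """raw[r - 1] holds the r-th raw moment; returns [kappa_1, ..., kappa_n]."""
    kappa = []
    for r in range(1, len(raw) + 1):
        # Sum over lower-order cumulants times the matching raw moments
        correction = sum(
            comb(r - 1, m - 1) * kappa[m - 1] * raw[r - m - 1]
            for m in range(1, r)
        )
        kappa.append(raw[r - 1] - correction)
    return kappa

# First four raw moments of a Poisson distribution with mean 2
poisson_raw = [2, 6, 22, 94]
kappa = cumulants_from_raw_moments(poisson_raw)
```

The same function can be fed raw moments computed from any PMF, for example by truncated summation.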
The skewness, Skew($X$), and kurtosis, Kur($X$), of $X$ follow from the well-known relationships with the central moments. A numerical analysis of Skew($X$) and Kur($X$) for a special member of the new family is presented in Table 2 with useful comments. The probability generating function follows from the PMF in Equation (3).
The PMF of the DRW model (see Table 1) is obtained from Equation (3) with a Weibull baseline. It has two parameters, $\pi \in (0, 1)$ and a positive Weibull shape parameter, written $\theta$ here. Clearly, when $\theta = 1$, the DRW model reduces to the DRE model. Figure 1 gives some plots of the PMF of the DRW model for various values of the parameters. Figure 2 shows some plots of the hazard rate function (HRF) of the DRW model for various values of the parameters. Based on Figure 1, we conclude that the PMF of the DRW can be "heavy tail right skewed PMF" and "left skewed PMF". Based on Figure 2, we see that the HRF of the DRW can be "decreasing" ($\pi = 0.5$, $\theta = 5$), "upside down" ($\pi = 0.9$, $\theta = 0.75$), "increasing" ($\pi = 0.992$, $\theta = 0.263$), and "decreasing-constant-increasing" ($\pi = 0.99$, $\theta = 0.25$). Based on Table 2, it is noted that the mean, $E(X)$, of the DRW model increases as $\pi$ increases and decreases as $\theta$ increases. The Skew($X$) of the DRW distribution can be positive or negative. The Kur($X$) ranges from 1.9 to ∞. The DisIx($X$) of the DRW distribution can lie in the interval (0, 1), exceed 1, or equal 1 (for $\pi = 0.001$, $\theta = 5$), as for the Poisson distribution. So, the new DRW distribution could be useful in modeling "under-dispersed" or "over-dispersed" count data.
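The HRF shapes discussed above can be explored numerically for any discrete model once its reliability function is available. The sketch below assumes the convention S(x) = P(X ≥ x), so that the PMF is S(x) - S(x + 1) and the hazard is h(x) = 1 - S(x + 1)/S(x). The two reliability functions are stand-ins (a geometric model and the classical discrete Rayleigh form q^(x²)), not the DRW model, and all names are hypothetical.

```python
def hazard(sf, x):
    """Discrete hazard h(x) = P(X = x) / P(X >= x), written via the reliability function."""
    s = sf(x)
    if s <= 0.0:
        raise ValueError("reliability function must be positive at x")
    return 1.0 - sf(x + 1) / s

def geometric_sf(x, q=0.7):
    """Stand-in reliability function with constant hazard 1 - q."""
    return q ** x

def discrete_rayleigh_sf(x, q=0.9):
    """Stand-in reliability function q^(x^2), whose hazard increases with x."""
    return q ** (x * x)

geo_h = [hazard(geometric_sf, x) for x in range(5)]
ray_h = [hazard(discrete_rayleigh_sf, x) for x in range(5)]
```

Plotting such hazard sequences against x is one way to reproduce shape classifications like "increasing" or "decreasing" for a fitted model.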

Characterizations of DRG Distribution
Characterizing a distribution is an important problem in applied sciences, where an investigator is vitally interested in knowing whether their model follows the right distribution. To this end, the investigator relies on conditions under which their model would follow specifically the chosen distribution. In this section, we present certain characterizations of the DRG distribution. These characterizations are based on the conditional expectation of a certain function of the random variable and on the hazard function.

Proposition 1. Let $X: \Omega \to \mathbb{N}^*$ be a random variable. The PMF of $X$ is Equation (3) if and only if Equation (7) holds.
Proof. If $X$ has PMF Equation (3), then the left-hand side of Equation (7) reduces to its right-hand side. Conversely, if Equation (7) holds, then Equation (8) follows. From Equation (8), we also have Equation (9). Now, subtracting Equation (9) from Equation (8), we arrive at an equality which implies that $X$ has PMF Equation (3). □
Proposition 2. Let $X: \Omega \to \mathbb{N}^*$ be a random variable. The PMF of $X$ is Equation (3) if and only if its hazard function, $h_F$, satisfies a difference equation together with an initial condition on $h_F(0)$.

Estimation and Inference
In this section, non-Bayesian and Bayesian estimation methods are considered. In the first subsection, we consider the MLE, CVME, OLSE, and WLSE methods. In the second subsection, we consider the Bayesian estimation method under the SELF. All non-Bayesian estimation methods are discussed in more detail in the statistical literature. For the MLE method, setting the partial derivatives of the log-likelihood function, ℓ, with respect to each parameter equal to zero and solving the resulting equations simultaneously yields the MLEs for the parameters of the DRG family. The Newton-Raphson algorithm is employed for obtaining the numerical solutions in such cases.
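The Newton-Raphson step for solving a likelihood equation can be sketched as follows. Since the DRG score equations are not reproduced here, the example uses the geometric model P(X = x) = p(1 - p)^x as a stand-in, chosen only because its MLE has a closed form to check against. The function name, starting value, and data are assumptions.

```python
def newton_mle_geometric(data, p0=0.5, tol=1e-10, max_iter=100):
    """Newton-Raphson for the geometric log-likelihood n*log(p) + sum(x)*log(1 - p)."""
    n, total = len(data), sum(data)
    p = p0
    for _ in range(max_iter):
        score = n / p - total / (1.0 - p)                # first derivative of the log-likelihood
        hessian = -n / p ** 2 - total / (1.0 - p) ** 2   # second derivative (always negative)
        p_new = p - score / hessian
        p_new = min(max(p_new, 1e-8), 1.0 - 1e-8)        # keep the iterate inside (0, 1)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p

data = [0, 1, 2, 3, 4]                               # hypothetical sample
p_hat = newton_mle_geometric(data)
closed_form = len(data) / (len(data) + sum(data))    # known MLE for this stand-in model
```

For a two-parameter model the same idea applies with the score vector and Hessian matrix in place of the scalar derivatives.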

The CVME Method
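The CVMEs of the parameters are obtained by minimizing the following expression with respect to (WRT) each parameter:
$$\mathrm{CVM} = \frac{1}{12n} + \sum_{i=1}^{n} \left[ F(x_{i:n}) - c_{(i,n)} \right]^2,$$
where $c_{(i,n)} = (2i - 1)/(2n)$, $F$ is the CDF of the fitted model, and $x_{1:n} \le \dots \le x_{n:n}$ are the ordered observations.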
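The distance-based criteria can be written as short objective functions of the fitted CDF evaluated at the ordered sample. The sketch below uses the standard textbook forms of the Cramér-von Mises, ordinary least squares, and weighted least squares criteria, which are assumed to match the ones used in this paper. The input values are hypothetical.

```python
def cvm_objective(cdf_values):
    """Cramér-von Mises criterion; cdf_values[i - 1] is the fitted CDF at the i-th order statistic."""
    n = len(cdf_values)
    return 1.0 / (12 * n) + sum(
        (F - (2 * i - 1) / (2 * n)) ** 2 for i, F in enumerate(cdf_values, start=1)
    )

def ols_objective(cdf_values):
    """Ordinary least squares criterion with plotting positions i / (n + 1)."""
    n = len(cdf_values)
    return sum((F - i / (n + 1)) ** 2 for i, F in enumerate(cdf_values, start=1))

def wls_objective(cdf_values):
    """Weighted least squares criterion with weights (n + 1)^2 (n + 2) / (i (n - i + 1))."""
    n = len(cdf_values)
    return sum(
        (n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) * (F - i / (n + 1)) ** 2
        for i, F in enumerate(cdf_values, start=1)
    )

fitted = [0.25, 0.75]  # hypothetical fitted CDF values at two ordered observations
cvm, ols, wls = cvm_objective(fitted), ols_objective(fitted), wls_objective(fitted)
```

In an actual fit, `cdf_values` would be recomputed for each candidate parameter value and the objective passed to a numerical minimizer.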

Bayesian Estimation
Assume beta and uniform priors for the two parameters, respectively. Under the SELF, the Bayesian estimators of the parameters are the means of their marginal posteriors. It is not possible to obtain the Bayesian estimates analytically from these posteriors, so numerical approximations are needed. We propose the use of MCMC techniques, namely the Gibbs sampler and M-H algorithm (see [31][32][33][34] for more details). Since the conditional posteriors of the parameters cannot be obtained in any standard forms, using a hybrid MCMC for drawing samples from the joint posterior of the parameters is suggested. The full conditional posteriors of the parameters can be easily derived, and the simulation algorithm iterates over them, with the draws generated during the burn-in period of the MCMC discarded.
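To make the M-H step and the SELF estimator concrete, here is a minimal random-walk Metropolis-Hastings sketch. It is not the hybrid Gibbs/M-H scheme of this paper: it samples a single parameter of a stand-in geometric likelihood with a Beta(1, 1) prior, chosen because the exact posterior is Beta and the posterior mean can be checked. The step size, chain length, burn-in, seed, and data are all assumptions.

```python
import math
import random

def log_posterior(p, n, total, a=1.0, b=1.0):
    """Log posterior (up to a constant) for geometric data with a Beta(a, b) prior on p."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return (a - 1 + n) * math.log(p) + (b - 1 + total) * math.log(1.0 - p)

def metropolis_hastings(n, total, iters=20000, burn_in=2000, step=0.2, seed=1):
    rng = random.Random(seed)
    p, draws = 0.5, []
    current = log_posterior(p, n, total)
    for t in range(iters):
        proposal = p + rng.uniform(-step, step)        # symmetric random-walk proposal
        candidate = log_posterior(proposal, n, total)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < candidate - current:
            p, current = proposal, candidate
        if t >= burn_in:                               # discard the burn-in draws
            draws.append(p)
    return draws

data = [0, 1, 2, 3, 4]                                 # hypothetical sample
draws = metropolis_hastings(len(data), sum(data))
bayes_estimate = sum(draws) / len(draws)               # posterior mean, the estimator under SELF
exact_mean = (1 + len(data)) / (2 + len(data) + sum(data))  # mean of the exact Beta(6, 11) posterior
```

In a two-parameter setting, a hybrid scheme would apply an update of this kind to each parameter in turn, conditioning on the current value of the other.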

Simulations for Comparing Non-Bayesian and Bayesian Estimation Methods
For the DRW case, MCMC simulation studies are performed for assessing and comparing the performance of non-Bayesian and Bayesian estimations. The numerical assessment is based on the mean squared errors (MSEs). First, we generated 1000 samples from the DRW distribution for each sample size n = 50, 150, 300, 500. The MSEs are obtained and listed in Tables 3-5. Based on Tables 3-5, we note that all methods perform well, though the Bayesian method is better in some cases. The performance of all estimation methods improves, with the MSEs tending to 0, as n → ∞.
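The structure of such a simulation study (generate replicated samples, estimate, and average the squared errors for each sample size) can be sketched as below. A geometric model with its closed-form MLE stands in for the DRW model, and the number of replications is reduced to 200 to keep the run short. The true parameter value, seed, and function names are assumptions.

```python
import math
import random

def sample_geometric(p, size, rng):
    """Inverse-transform draws from P(X = x) = p (1 - p)^x, x = 0, 1, ..."""
    return [int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) for _ in range(size)]

def mse_of_mle(p_true, n, replications, rng):
    """Monte Carlo MSE of the geometric MLE p_hat = n / (n + sum(x))."""
    squared_errors = []
    for _ in range(replications):
        x = sample_geometric(p_true, n, rng)
        p_hat = n / (n + sum(x))
        squared_errors.append((p_hat - p_true) ** 2)
    return sum(squared_errors) / replications

rng = random.Random(2020)
mse = {n: mse_of_mle(0.3, n, replications=200, rng=rng) for n in (50, 150, 300, 500)}
```

Comparing several estimators amounts to computing one such MSE table per method on the same simulated samples.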

Real Data Modeling for Comparing Bayesian and Non-Bayesian Estimation Methods
In this section, four examples of real data sets are introduced for comparing the Bayesian and non-Bayesian estimation methods. We consider the Akaike information criterion (AIC) and the corrected Akaike information criterion (CAIC) statistics for comparing the Bayesian and non-Bayesian estimation methods.
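The two criteria can be computed from a maximized log-likelihood as in the sketch below. It assumes the usual forms AIC = 2k - 2ℓ and CAIC = AIC + 2k(k + 1)/(n - k - 1); the values AIC = 132.1 and CAIC = 133.1 reported for the 15 electronic components are consistent with this correction for a two-parameter model. The log-likelihood value used here is hypothetical, chosen only to reproduce that AIC.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * loglik

def caic(loglik, k, n):
    """Small-sample corrected AIC for a sample of size n (requires n > k + 1)."""
    if n <= k + 1:
        raise ValueError("sample size must exceed k + 1")
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

loglik, k, n = -64.05, 2, 15   # hypothetical log-likelihood, two parameters, 15 observations
aic_value = aic(loglik, k)
caic_value = caic(loglik, k, n)
```

Smaller values of either criterion indicate the preferred fit when the same data set is used throughout.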

Failure Times Data of 50 Devices
These data represent the failure times of 50 devices (in weeks) put on a certain life test (see [35]). Table 6 gives the estimators under Bayesian and non-Bayesian estimation methods, together with the AIC and CAIC statistics, for data set I. Figure 3 presents the frequency distribution using Bayesian estimation for data set I. Based on Table 6, the MLE method is the best method with AIC = 478.9 and CAIC = 479.20. However, all other methods perform very well.

Failure Times of 15 Electronic Components
These lifetime data give the failure times for 15 electronic components in an acceleration lifetime test (see [36]). Table 7 gives the estimators under Bayesian and non-Bayesian estimation methods, together with the AIC and CAIC statistics, for data set II. Figure 4 shows the frequency distribution using Bayesian estimation for data set II. Based on Table 7, the MLE method is the best method with AIC = 132.1 and CAIC = 133.1. However, all other methods perform very well.

Real Data Modeling for Comparing the Competitive Discrete Models Using the MLE Method
We illustrate the flexibility and the importance of the DRW distribution using four real data applications. The fitted distributions (see Table 10) are analyzed and compared using the log-likelihood function (ℓ), AIC, CAIC, the chi-square statistic ($\chi^2$) with its degrees of freedom (d.f.) and p-value, and the Kolmogorov-Smirnov (K-S) statistic with its p-value. Table 10 gives the competitive models.
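For count data, the chi-square goodness-of-fit statistic compares observed and expected cell frequencies. The sketch below computes the Pearson statistic and the usual degrees of freedom (number of cells minus one minus the number of estimated parameters); the frequencies are hypothetical, and the p-value computation is left out because it needs a chi-square tail function that is not in the Python standard library.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum over cells of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(n_cells, n_estimated_params):
    """Cells minus one, minus the number of parameters estimated from the data."""
    return n_cells - 1 - n_estimated_params

observed = [10, 20, 30]   # hypothetical observed frequencies
expected = [20, 20, 20]   # hypothetical expected frequencies under a fitted model
chi2 = chi_square_statistic(observed, expected)
df = degrees_of_freedom(len(observed), n_estimated_params=0)
```

In practice, cells with small expected frequencies are usually pooled before the statistic is computed.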

Failure Times Data of 50 Devices
We compare the fits of the DRW model with some competitive models, such as DW, EDW, DIW, EDLi, DPa, DLi-II, and DLL. The MLEs along with their corresponding standard errors (SEs), and the goodness of fit (GOF) test statistics are listed in Tables 11 and 12, respectively. Statisticians have developed a remarkably powerful set of tools for analyzing normally distributed data. The most popular one is the "normal quantile-quantile (Q-Q) plot". If the distribution of the data matched the normal distribution perfectly, all the quantile points would lie between the two blue lines. Figure 7 (left plot) gives the Q-Q plot for data set I. Figure 7 (right plot) gives the box plot for the failure time data (data set I). In the applications, the shape of the HRF can help in selecting a particular model. For this aim, the total time on test (TTT) plot (see [39]) can be used. The TTT plot is obtained by plotting $T(r/n)$ against $r/n$, where $T(r/n) = \left[ \sum_{i=1}^{r} y_{(i)} + (n - r)\, y_{(r)} \right] / \sum_{i=1}^{n} y_{(i)}$, $r = 1, \ldots, n$, and the $y_{(i)}$'s are the order statistics of the sample. It is "convex shape" for the "monotonically decreasing HRF" and is "concave shape" for the "monotonically increasing HRF". When the solid line matches the dashed line, the HRF of the data is "constant". Figure 8 shows the TTT plot (left panel) and estimated HRF (EHRF) (right panel) for the DRW model for data set I. Based on Table 12, the DRW provides the best fit against all competitive models.
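The TTT transform described above is straightforward to compute. The following sketch returns the points (r/n, T(r/n)) for a sample, using the scaled total-time-on-test formula with order statistics; the input data are hypothetical, and plotting is left to the reader.

```python
def ttt_transform(data):
    """Return the points (r/n, T(r/n)), r = 1, ..., n, of the scaled TTT transform."""
    y = sorted(data)                      # order statistics y_(1) <= ... <= y_(n)
    n, total = len(y), sum(y)
    points, running = [], 0.0
    for r, value in enumerate(y, start=1):
        running += value                  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * value) / total))
    return points

points = ttt_transform([3, 1, 2])         # hypothetical failure times
```

A curve of these points lying above the diagonal (concave) suggests an increasing HRF, and one lying below it (convex) suggests a decreasing HRF.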

Failure Times of 15 Electronic Components
For this application, we compare the fits of the DRW model with some competitive models such as DGE-II, DLx, DEx, DIR, DR, DIW, DPa, and DBXII. The MLEs with their SEs, and the GOF statistics are reported in Tables 13 and 14, respectively. Figure 9 gives the Q-Q plot and box plot for the failure times data (data set II). Figure 10 shows the TTT plot and estimated HRF (EHRF) for the DRW model for data set II. Based on Table 14, the DRW provides the best fit against all competitive models.

Counts of Cysts of Kidneys
For this real data set, we compare the fits of the DRW distribution with DW, DR, DIW, DE, DLx, DLi-II, DLi, and Poisson. The MLEs with their SEs are listed in Table 15. Table 16 gives the GOF statistics. Figure 11 gives the TTT plot, Q-Q plot, and box plot versus the EHRFs for data set III. Figure 12 presents the fitted PMFs and EHRF for data set III. Based on Table 16, the DRW provides the best fit against all competitive models.
Figure 11. TTT plot, Q-Q plot, and box plot versus the EHRFs for data set III.

Number of European Corn-Borer Larvae Parasites
We shall compare the fits of the DRW distribution with the DIW, DGIW, DIR, DPa, DR, DBXII, negative binomial (NB), and Poisson distributions. The MLEs with their corresponding SEs are listed in Table 17. Table 18 gives the GOF statistics. Figure 13 gives the TTT plot, Q-Q plot, and box plot versus the EHRFs for data set IV. Figure 14 presents the fitted PMFs and EHRF for data set IV. Based on Table 18, the DRW provides the best fit against all competitive models.

Concluding Remarks
In this work, we presented and studied a new discrete analogue based on the continuous Rayleigh distribution, called the discrete Rayleigh G (DRG) family of distributions. Some of its statistical properties, such as moments, the cumulant generating function, moment generating function, probability generating function, central moments, and DisIx, are derived. A special discrete version of the DRG family related to the Weibull distribution is discussed. The HRF of the new family can be "decreasing", "upside down", "increasing", and "decreasing-constant-increasing (U-HRF)". The PMF of the DRG can be "heavy tail right skewed PMF" and "left skewed PMF". Some classical (non-Bayesian) estimation methods, such as Cramér-von Mises estimation, maximum likelihood estimation, weighted least squares estimation, and ordinary least squares estimation, are described and considered. Some useful characterization results based on the conditional expectation of a certain function of the random variable and in terms of the hazard function are derived and presented.
The Bayesian procedure under the SELF is also presented. Since the conditional posteriors of the parameters cannot be obtained in any standard forms, using a hybrid MCMC for drawing samples from the joint posterior of the parameters is suggested. MCMC simulations for comparing non-Bayesian and Bayesian estimation are performed. The Gibbs sampler and M-H algorithm are employed. The Bayesian method is the best for all sample sizes with the minimum mean squared errors. The non-Bayesian estimation methods perform well but not better than the Bayesian method. The performance of all estimation methods improves as n → ∞.
Four applications to real data sets are employed for comparing the Bayesian and non-Bayesian methods. The importance and flexibility of the new discrete class are illustrated by means of four real data applications. Many special member distributions could be considered and studied in the future in separate papers. As future work, we may consider the bivariate and the multivariate extensions of the DRG family. We hope that the DRG family will attract wider applications in reliability, engineering, and other areas of research.

Acknowledgments:
The authors gratefully acknowledge, with thanks, the very thoughtful and constructive comments and suggestions of the four reviewers, which resulted in a much improved paper.

Conflicts of Interest:
The authors declare no conflict of interest.