Abstract
In this paper, a new discrete distribution called the Binomial–Natural Discrete Lindley distribution is proposed by compounding the binomial and natural discrete Lindley distributions. Some properties of the distribution are discussed, including the moment-generating function, moments and hazard rate function. Estimation of the distribution’s parameter is studied by the methods of moments, proportions and maximum likelihood. A simulation study is performed to compare the performance of the different estimates in terms of bias and mean square error. An application to SO2 data is also presented to show that the new distribution is useful in modeling count data.
1. Introduction
Count data modeling is a challenging task in many areas, including, but not limited to, public health, medicine, epidemiology, applied science, sociology, and agriculture. In many situations, the life length of a device cannot be measured on a continuous scale, and the survival function is assumed to be a function of a count random variable instead of a continuous-time random variable. Therefore, discrete distributions are appropriate for modeling lifetime data in situations where the output is of a discrete nature. The traditional discrete distributions have limited applicability as models for reliability, failure times, aggregate loss, etc., especially for over-dispersed count data, in which the variance is greater than the mean. This has led to the development of discrete distributions based on popular continuous models in reliability analysis, actuarial sciences, survival analysis, etc. The discretization of continuous distributions has produced many discrete distributions in the statistical literature over the last few decades. However, no single model suits every application, and the search for more flexible models continues.
One of the many approaches to define new models is the discretization of distributions. Until recently, the majority of discrete lifetime distributions have been proposed in the statistical literature by discretizing the survival function of continuous lifetime distributions (see the work of authors, for example, in references [1,2,3,4,5,6,7,8,9,10,11,12]).
The probability mass function (pmf) is defined as follows
Departing from this method, Afify et al. [12] introduced and studied a new discrete Lindley distribution by constructing a mixture of discrete analogs to the continuous components used in creating the continuous Lindley distribution.
In this paper, we propose and study a new probability mass function (pmf) obtained by compounding the binomial and the NDL distributions. The basic principle of this method is as follows: if two random variables denote the number of particles entering (input) and leaving (output) an attenuator, then their probability functions are connected by the binomial decay transformation
Here, the attenuating coefficient is the one discussed by Hu et al. [7]. They took the input distribution to be Poisson and showed that the output distribution is then also Poisson, with its parameter scaled by the attenuating coefficient. For clarity, attenuators are electrical devices built to reduce the power of a signal passing through them without severely compromising the signal’s integrity. They serve as a safeguard against systems being exposed to signals with power levels that are too high to be decoded. Déniz [13] introduced the uniform Poisson distribution using the idea of Hu et al. [7] by replacing the binomial distribution in Equation (1) with the discrete uniform distribution while keeping the input distribution Poisson. Some other new discrete distributions have also been proposed in the literature using the methodology of [7]: Akdoğan et al. [14] proposed the uniform-geometric distribution and Kuş et al. [15] constructed the binomial–discrete Lindley distribution.
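The Poisson result attributed to Hu et al. [7] can be checked numerically. The following is a minimal sketch, assuming the binomial decay transformation has the usual thinning form P(Y = m) = Σ_{n ≥ m} C(n, m) p^m (1 − p)^(n − m) P(X = n). The function names, the truncation point `n_max` and the values λ = 3 and p = 0.4 are illustrative choices and do not come from the paper.

```python
import math

def poisson_pmf(k, lam):
    # Evaluated in log space to avoid overflow for large k.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def decay_transform(m, input_pmf, p, n_max=150):
    # Truncated sum over n >= m of C(n, m) p^m (1 - p)^(n - m) P(X = n).
    return sum(
        math.comb(n, m) * p**m * (1 - p) ** (n - m) * input_pmf(n)
        for n in range(m, n_max + 1)
    )

lam, p = 3.0, 0.4  # illustrative values only
for m in range(5):
    transformed = decay_transform(m, lambda n: poisson_pmf(n, lam), p)
    # Second column: transformed input; third column: Poisson(lam * p) directly.
    print(m, transformed, poisson_pmf(m, lam * p))
```

The two printed columns should agree up to floating-point error, which illustrates that a Poisson input with parameter λ is mapped to a Poisson output with parameter λp.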
The rest of the paper is arranged as follows: Section 2 defines the natural discrete Lindley distribution and proposes the new binomial–natural discrete Lindley distribution with important properties, subsequently. In Section 3, various parameter estimation and simulation studies are given. Section 4 concerns the real data illustration of the findings. In Section 5, some conclusions are provided.
2. Natural Discrete Lindley Distribution
Recently, Al-Babtain et al. [16] proposed and studied a new natural discrete analog of the continuous Lindley distribution as a mixture of geometric and negative binomial distributions. The new distribution is called natural discrete Lindley (NDL) distribution and it has many interesting properties that make it superior to many other discrete distributions, particularly in analyzing over-dispersed count data. The NDL can be applied in the collective risk models and is competitive with the Poisson distribution to fit automobile-claim-frequency data. Let be a non-negative random variable obtained as a finite mixture of geometric () and negative binomial (2, ) with mixing probabilities and , respectively, then the probability mass function of the NDL distribution is defined as
One of the most important features of this distribution is that it has a single parameter and it has attractive properties, which makes it suitable for applications not only in insurance settings but also in other fields where over-dispersions are observed. For more details about this distribution, see Al-Babtain et al. [16]. Given the usefulness of NDL, the discrete analogue due to NDL known as the binomial NDL (BNDL) seems to be naturally interesting to explore.
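As an illustration of the mixture construction described above, the sketch below builds the NDL pmf from its two components. It assumes the mixing probabilities are p/(1 + p) for the geometric component and 1/(1 + p) for the negative binomial (2, p) component, both supported on {0, 1, 2, …}, which mirrors the construction of the continuous Lindley distribution; these weights and the resulting closed form p²(2 + x)(1 − p)^x/(1 + p) are stated here as assumptions, not quoted from the text.

```python
def geometric_pmf(x, p):
    # Number of failures before the first success.
    return p * (1 - p) ** x

def negbin2_pmf(x, p):
    # Negative binomial with 2 successes: number of failures before the 2nd success.
    return (x + 1) * p**2 * (1 - p) ** x

def ndl_pmf_mixture(x, p):
    w = p / (1 + p)  # assumed weight of the geometric component
    return w * geometric_pmf(x, p) + (1 - w) * negbin2_pmf(x, p)

def ndl_pmf_closed(x, p):
    # Closed form implied by the assumed mixture.
    return p**2 * (2 + x) * (1 - p) ** x / (1 + p)

p = 0.3  # illustrative value
for x in range(5):
    print(x, ndl_pmf_mixture(x, p), ndl_pmf_closed(x, p))
```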
2.1. The Proposed Discrete Analog
The probability mass function (1) can be expressed as
where has the binomial distribution. Suppose that is the random variable from NDL with parameter given in (2); then, the probability mass function of the discrete random variable is obtained as
If has the pmf (3), then it is called a binomial natural discrete Lindley (BNDL) random variable and it is denoted by For , this means that no particles enter into the attenuator and it will be termed as failure. Consequently, the corresponding cumulative distribution function (cdf) of BNDL distribution is given by
Figure 1 shows the probability mass function (pmf) plots of the proposed distribution for various values of parameter p. Thus, the pmf is always a decreasing function, and the new discrete random variable tends to take small values when p increases. The stochastic process tends to happen very quickly once the parameter value grows, which is implied quite strongly by the model’s behavior. Therefore, the BNDL model is a logical substitute for the traditional exponential distribution to characterize such phenomena. Additionally, the flexibility of the proposed BNDL can be tested for varied count data sources. For example, this model may be helpful for simulating aggregate losses that are typically limited to actuarial data by maximizing the overall garment fit for a particular number of sizes and accommodation rate, crucial to assessing the goodness of the scaling system. Furthermore, it may be helpful to overcome the problem of over-dispersed data in social sciences, as in anthropology where civilizations grew near the existence of a consistent water source, which is necessary for human survival. Figure 2 complements the results of Figure 1.
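The compounding step can be sketched numerically as follows. The sketch assumes that X given N = n is Binomial(n, p) and that N follows the NDL distribution with the same parameter p, with NDL pmf p²(2 + n)(1 − p)^n/(1 + p). Under that assumption the sum collapses to P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²) with r = (1 − p)/(2 − p). This closed form is a reconstruction made for illustration and is not quoted from the paper; the function names are hypothetical.

```python
import math

def ndl_pmf(n, p):
    return p**2 * (2 + n) * (1 - p) ** n / (1 + p)

def bndl_pmf_compound(x, p, n_max=1000):
    # Direct (truncated) compounding: X | N = n ~ Binomial(n, p), N ~ NDL(p).
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x) * ndl_pmf(n, p)
        for n in range(x, n_max + 1)
    )

def bndl_pmf(x, p):
    # Reconstructed closed form of the compound sum (an assumption).
    r = (1 - p) / (2 - p)
    return r**x * (x + 1 + 2 * p - p**2) / ((1 + p) * (2 - p) ** 2)

def bndl_cdf(x, p):
    return sum(bndl_pmf(k, p) for k in range(x + 1))

for p in (0.2, 0.5, 0.8):  # illustrative parameter values
    print(p, [round(bndl_pmf(x, p), 4) for x in range(5)], round(bndl_cdf(4, p), 4))
```

For each p the printed probabilities decrease in x, and the probability at zero grows as p increases, in line with the behavior described for Figure 1.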
Figure 1.
Pmf of BNDL distribution for some choices of p.

Figure 2.
Histograms of the BNDL model for simulated data.
2.2. Statistical Properties of the BNDL Distribution
In this section, we provide some explicit results on the mathematical properties of the BNDL distribution.
2.2.1. Moment-Generating Function
If distribution, then the moment-generating function of is given as
For more on generating functions, see Yalcin and Simsek [17], Yalcin and Simsek [18] and Simsek [19].
2.2.2. Probability-Generating Function
The probability-generating function of the random variable can be obtained using its moment-generating function which is equivalent to calculating ; therefore, the probability-generating function of the random variable is
Since,
Therefore, at , we can obtain
where is the th factorial moment of .
2.2.3. Non-Central Moments and Variance
If distribution, then the kth moment about zero of X is given by
The first four raw moments can be obtained as follows
and
The variance of the random variable is
2.2.4. Central Moments
The kth moment about the mean of X is
Therefore, the second, third and fourth central moments of the random variable are
and
2.2.5. Skewness and Kurtosis
The coefficient of skewness and the coefficient of kurtosis of the BNDL distribution are, respectively,
2.2.6. Index of Dispersion
The index of dispersion (ID), defined as the ratio of the variance to the mean, indicates whether a certain distribution is suitable for under- or over-dispersed datasets. For example, ID = 1 for the Poisson distribution, where the variance is equal to the mean; ID > 1 for the geometric and negative binomial distributions; and ID < 1 for the binomial distribution.
Theorem 1.
If , then for all
Proof.
We have
This function is monotonically decreasing as p increases. It converges to 2 when p → 0, while it tends to 1 as p → 1; therefore, 1 < ID < 2, which means that the variance exceeds the mean, and hence the distribution is over-dispersed. □
From Theorem 1, the BNDL distribution should only be used for count data with over-dispersion. Table 1 reports some numerical values of these measures.
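The behavior of the index of dispersion can be explored numerically. The sketch below assumes the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²) with r = (1 − p)/(2 − p), obtained by compounding Binomial(n, p) with NDL(p); this form, the truncation point `x_max` and the grid of p values are assumptions made for illustration.

```python
def bndl_pmf(x, p):
    # Reconstructed closed form (an assumption, not quoted from the paper).
    r = (1 - p) / (2 - p)
    return r**x * (x + 1 + 2 * p - p**2) / ((1 + p) * (2 - p) ** 2)

def mean_and_variance(p, x_max=400):
    # Truncated sums; the pmf decays at least as fast as 2**(-x).
    probs = [bndl_pmf(x, p) for x in range(x_max + 1)]
    mean = sum(x * q for x, q in enumerate(probs))
    second = sum(x * x * q for x, q in enumerate(probs))
    return mean, second - mean**2

def index_of_dispersion(p):
    mean, var = mean_and_variance(p)
    return var / mean

for p in (0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 0.99):
    mean, var = mean_and_variance(p)
    print(p, round(mean, 4), round(var, 4), round(var / mean, 4))
```

Under the stated assumption, the printed ratio stays between 1 and 2 and decreases as p grows, which agrees with the limits given in the proof of Theorem 1.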
Table 1.
Mean, Variance, Skewness, kurtosis and ID of the BNDL distribution for different values of the parameter p.
2.2.7. Log-Concavity
A necessary and sufficient condition for a pmf to be strongly unimodal is that it is log-concave, i.e., [P(X = x)]² ≥ P(X = x − 1) P(X = x + 1) for all x ≥ 1 (see Keilson and Gerber [20]).
Theorem 2.
The pmf of the BNDL distribution in (3) is log-concave.
Proof.
From (3), we can directly reach
and
After some algebraic operations, we find that
for all and for all choices .
Theorem 2 confirms that the BNDL distribution is strongly unimodal. □
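The log-concavity condition [P(X = x)]² ≥ P(X = x − 1) P(X = x + 1) can be checked numerically. The sketch below works with the logarithm of the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²), r = (1 − p)/(2 − p); this closed form is an assumption derived from compounding Binomial(n, p) with NDL(p), and the function names are hypothetical.

```python
import math

def bndl_log_pmf(x, p):
    # Logarithm of the reconstructed closed form (an assumption).
    r = (1 - p) / (2 - p)
    return (
        x * math.log(r)
        + math.log(x + 1 + 2 * p - p**2)
        - math.log(1 + p)
        - 2 * math.log(2 - p)
    )

def is_log_concave(p, x_max=200):
    # Checks 2 log P(x) >= log P(x - 1) + log P(x + 1) for x = 1, ..., x_max.
    return all(
        2 * bndl_log_pmf(x, p) >= bndl_log_pmf(x - 1, p) + bndl_log_pmf(x + 1, p)
        for x in range(1, x_max + 1)
    )

print([is_log_concave(p) for p in (0.05, 0.25, 0.5, 0.75, 0.95)])
```

With this form of the pmf, the ratio [P(X = x)]² / (P(X = x − 1) P(X = x + 1)) reduces to c²/(c² − 1) with c = x + 1 + 2p − p², which exceeds 1 for every x ≥ 1, so the check is expected to succeed for all p in (0, 1).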
2.3. Reliability Properties of the BNDL Distribution
2.3.1. Survival Function
If distribution, then from (4), the survival function of is
2.3.2. Hazard Rate and Mean Residual Life Functions
The hazard (failure) rate function is the probability that an item fails at time x, given that it has survived to at least time x. If X ~ BNDL(p), then its hazard rate (failure rate) function is given as
Obviously, the upper limit of the failure rate function is , i.e., . Graphical illustrations of hazard rate function are presented in Figure 3 while descriptive measures are presented in Figure 4.
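The hazard rate can be sketched numerically as the ratio P(X = x)/P(X ≥ x). The code below assumes the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²) with r = (1 − p)/(2 − p); under that assumption the hazard rate increases in x and approaches 1/(2 − p). Both the pmf and this limiting value are derived here for illustration and are not quoted from the paper.

```python
def bndl_pmf(x, p):
    # Reconstructed closed form (an assumption).
    r = (1 - p) / (2 - p)
    return r**x * (x + 1 + 2 * p - p**2) / ((1 + p) * (2 - p) ** 2)

def bndl_tail(x, p, terms=500):
    # Truncated P(X >= x); the remainder is negligible because r <= 1/2.
    return sum(bndl_pmf(k, p) for k in range(x, x + terms))

def bndl_hazard(x, p):
    return bndl_pmf(x, p) / bndl_tail(x, p)

p = 0.4  # illustrative value
print([round(bndl_hazard(x, p), 4) for x in (0, 1, 2, 5, 10, 30)])
print("assumed limit 1 / (2 - p):", round(1 / (2 - p), 4))
```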
Figure 3.
Plots of hazard rate of BNDL distribution for some choices of p.
Figure 4.
Plots of the BNDL model for (a) Mean, (b) Variance, (c) Skewness, (d) Kurtosis and (e) ID.
The mean residual life function of is given by
Corollary 1.
If distribution, then it has an increasing failure rate and decreasing mean residual life.
As we explained through Theorem 2, the BNDL distribution has a property of log-concavity; therefore, according to Gupta et al. [21], the BNDL distribution has an IFR property. According to Kemp [22], the next chain is verified
So, the BNDL distribution is
- IFR (increasing failure rate).
- IFRA (increasing failure rate average).
- NBU (new better than used).
- NBUE (new better than used in expectation).
- DMRL (decreasing mean residual lifetime).
2.4. Stochastic Orderings
Stochastic orders are important measures to judge comparative behaviors of random variables. Shaked and Shanthikumar [8] showed that many stochastic orders exist and have various applications. Given two random variables and we say that is smaller than in the
- Usual stochastic order, denoted by , if , for all .
- Hazard rate order, denoted by , if , for all .
- Reversed hazard rate order, denoted by , if decreases in .
- Mean residual life order, denoted by , if , for all x.
- Likelihood ratio order, denoted by , if decreases in .
For all the previous orders, we have the following chains of implications:
and
also,
Theorem 3.
Let and ; then, for all
Proof.
Let
Now,
and
Therefore,
Let and , where and
After substitution of the values and in (5), we obtain
where
and
After some algebraic operations, we find that
Therefore,
This implies that
□
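The likelihood ratio order can be illustrated numerically. The sketch below assumes that the theorem compares X ~ BNDL(p1) and Y ~ BNDL(p2) with p1 > p2, and it uses the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²), r = (1 − p)/(2 − p). Both the direction of the inequality and the pmf are assumptions made for this example.

```python
def bndl_pmf(x, p):
    # Reconstructed closed form (an assumption).
    r = (1 - p) / (2 - p)
    return r**x * (x + 1 + 2 * p - p**2) / ((1 + p) * (2 - p) ** 2)

def lr_ratio_is_decreasing(p1, p2, x_max=40):
    # X <=_lr Y when P(X = x) / P(Y = x) decreases in x.
    ratios = [bndl_pmf(x, p1) / bndl_pmf(x, p2) for x in range(x_max + 1)]
    return all(a >= b for a, b in zip(ratios, ratios[1:]))

print(lr_ratio_is_decreasing(0.7, 0.3))  # p1 > p2
print(lr_ratio_is_decreasing(0.3, 0.7))  # p1 < p2
```

Under these assumptions the ratio is decreasing only when p1 > p2, so the variable with the larger parameter is the smaller one in the likelihood ratio order, and the remaining orders follow from the chains of implications listed earlier.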
2.5. Entropy
Entropy is a measure of uncertainty of a random variable. The entropy of a discrete random variable with pmf and alphabet is given by
Entropy can be interpreted as the measure of average uncertainty in or the average number of bits needed to describe . For more details on entropy and information theory, we refer the reader to Gray [23].
Now, if , then the entropy of the random variable can be calculated by the following formula
where the special function appearing in the formula is the Lerch transcendent. Table 2 presents some numerical values of the entropy of X for different choices of p. From Table 2, one can observe that the entropy is monotonically decreasing in p, tending to 1.88 as p tends to 0 and to 0 as p tends to 1.
Table 2.
Numerical results of the entropy for different values of the parameter p.
Figure 5 plots the entropy of X against the parameter p. One may note that the entropy is monotonically decreasing in p ∈ (0, 1), with its limit tending to zero as p tends to 1.
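The entropy can also be evaluated by direct summation, without the Lerch transcendent. The sketch below assumes the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²), r = (1 − p)/(2 − p), and uses natural logarithms; the choice of natural logarithms is an assumption, made because it reproduces the limiting value of about 1.88 quoted in the text.

```python
import math

def bndl_pmf(x, p):
    # Reconstructed closed form (an assumption).
    r = (1 - p) / (2 - p)
    return r**x * (x + 1 + 2 * p - p**2) / ((1 + p) * (2 - p) ** 2)

def bndl_entropy(p, x_max=2000):
    # Shannon entropy in nats, by truncated summation.
    h = 0.0
    for x in range(x_max + 1):
        q = bndl_pmf(x, p)
        if q > 0.0:  # skip terms that have underflowed to zero
            h -= q * math.log(q)
    return h

for p in (0.001, 0.1, 0.3, 0.5, 0.7, 0.9, 0.999):
    print(p, round(bndl_entropy(p), 4))
```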
Figure 5.
Entropy of X versus p.
3. Estimation and Simulation
In this section, we estimate the unknown parameter p by the maximum likelihood, moment and proportion methods.
3.1. Method of Maximum Likelihood Estimation
Let be the observed values from the BNDL distribution with parameter . The likelihood and log-likelihood function are given, respectively, as
and
The maximum likelihood estimate (MLE) of the parameter can be obtained by solving the following equation using some numerical procedures.
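One possible numerical procedure is a repeated grid search on the log-likelihood. The sketch below assumes the reconstructed pmf P(X = x) = r^x (x + 1 + 2p − p²)/((1 + p)(2 − p)²) with r = (1 − p)/(2 − p), so that the log-likelihood is Σ ln(x_i + 1 + 2p − p²) + (Σ x_i) ln(1 − p) − n ln(1 + p) − (Σ x_i + 2n) ln(2 − p). The pmf, the search routine and the sample `xs` are illustrative assumptions, not the authors' procedure or data.

```python
import math
from collections import Counter

def log_lik(p, counts, n, total):
    # Log-likelihood under the reconstructed pmf (an assumption).
    c = 1 + 2 * p - p * p
    s = sum(k * math.log(x + c) for x, k in counts.items())
    return s + total * math.log(1 - p) - n * math.log(1 + p) - (total + 2 * n) * math.log(2 - p)

def bndl_mle(xs, rounds=5, grid=50):
    # Evaluate on a grid, then zoom in around the best grid point.
    counts, n, total = Counter(xs), len(xs), sum(xs)
    lo, hi = 1e-6, 1 - 1e-6
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        pts = [lo + i * step for i in range(grid + 1)]
        best = max(pts, key=lambda q: log_lik(q, counts, n, total))
        lo, hi = max(best - step, 1e-6), min(best + step, 1 - 1e-6)
    return best

# Hypothetical sample of 100 counts, for illustration only.
xs = [0] * 40 + [1] * 30 + [2] * 15 + [3] * 10 + [4] * 5
print(bndl_mle(xs))
```

A root finder applied to the score equation would serve the same purpose; the grid search is used here only because it needs no derivative and no external library.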
3.2. Method of Moments Estimation
Let be a random sample from the BNDL distribution with parameter . The moment estimate (ME) of the parameter can be obtained by solving the following equation.
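The moment equation can be sketched in closed form under one assumption: that the mean of the BNDL distribution is E(X) = (1 − p)(2 + p)/(1 + p), which follows from compounding Binomial(n, p) with NDL(p). Equating this to the sample mean m gives the quadratic p² + (1 + m)p + (m − 2) = 0, whose root in (0, 1) is used below. The mean formula and the function names are assumptions made for illustration.

```python
import math

def bndl_mean(p):
    # Mean under the assumed compounding scheme.
    return (1 - p) * (2 + p) / (1 + p)

def moment_estimate_from_mean(m):
    # Root in (0, 1) of p^2 + (1 + m) p + (m - 2) = 0; requires 0 < m < 2.
    if not 0 < m < 2:
        raise ValueError("sample mean must lie in (0, 2) under this model")
    return (-(1 + m) + math.sqrt((m - 1) ** 2 + 8)) / 2

def moment_estimate(xs):
    return moment_estimate_from_mean(sum(xs) / len(xs))

# Hypothetical sample, for illustration only.
xs = [0] * 40 + [1] * 30 + [2] * 15 + [3] * 10 + [4] * 5
print(moment_estimate(xs))
```

Under this assumption the mean never exceeds 2, so the moment estimate exists only for samples whose mean lies strictly between 0 and 2.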
3.3. Method of Proportions Estimation
Let a random sample of size n be drawn from the BNDL distribution with parameter p. For each observation, we define an indicator function that equals 1 if the observation is 0 and equals 0 otherwise. The proportion of 0s in the sample is then the average of these indicators. The proportion estimate (PE) of the parameter p can be obtained by solving the following equation with respect to p.
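The proportion method can be sketched by matching the observed proportion of zeros to P(X = 0). The code below assumes P(X = 0) = (1 + 2p − p²)/((1 + p)(2 − p)²), which follows from the reconstructed pmf obtained by compounding Binomial(n, p) with NDL(p). Under that assumption P(X = 0) increases from 1/4 to 1 on (0, 1), so a simple bisection applies; the formula, the function names and the sample are illustrative assumptions.

```python
def bndl_p0(p):
    # P(X = 0) under the reconstructed pmf (an assumption).
    return (1 + 2 * p - p * p) / ((1 + p) * (2 - p) ** 2)

def proportion_estimate(xs, tol=1e-12):
    y = sum(1 for x in xs if x == 0) / len(xs)  # observed proportion of zeros
    if not 0.25 < y < 1:
        raise ValueError("proportion of zeros must lie in (0.25, 1) under this model")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bndl_p0(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical sample with 60 zeros out of 100 observations.
xs = [0] * 60 + [1] * 25 + [2] * 10 + [3] * 5
print(proportion_estimate(xs))
```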
3.4. Simulation Study
In this section, we assess the behavior of the maximum likelihood estimators for a finite sample of size n. Based on BNDL distribution, a simulation study is carried out. The simulation study is based on the following steps: firstly, generate N = 1000 samples of sizes n = 25, 50, …, 500 from the BNDL distribution. Then, compute the maximum likelihood estimators for the model parameters. Lastly, compute the MSEs given by
For various parameter values, the simulation results provided in Figure 6 indicate that the estimated MSEs decrease toward zero as the sample size n increases. This behavior is what one expects from a consistent estimator of p; decreasing MSEs alone do not establish asymptotic normality, which is a classical property of the MLE under the usual regularity conditions. In a parametric model, we say that an estimator of p based on a sample of size n is consistent if it converges to p in probability as n → ∞. We say that it is asymptotically normal if √n times the estimation error converges in distribution to a normal distribution.
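The simulation steps can be sketched as follows, taking the MSE to be the average of the squared differences between the estimates and the true p over the replications. The sketch assumes that N follows NDL(p), generated as a geometric variable with probability p/(1 + p) and as a sum of two geometric variables otherwise, and that X given N = n is Binomial(n, p). It uses 200 replications and three sample sizes to keep the run short, rather than the N = 1000 and n = 25, 50, …, 500 of the study; the sampler, the grid-search MLE and the seed are illustrative choices.

```python
import math
import random
from collections import Counter

def sample_bndl(p, rng):
    def geom():
        # Number of failures before the first success, by inversion.
        return int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
    n = geom() if rng.random() < p / (1 + p) else geom() + geom()
    return sum(rng.random() < p for _ in range(n))  # Binomial(n, p) draw

def log_lik(p, counts, n, total):
    c = 1 + 2 * p - p * p
    s = sum(k * math.log(x + c) for x, k in counts.items())
    return s + total * math.log(1 - p) - n * math.log(1 + p) - (total + 2 * n) * math.log(2 - p)

def bndl_mle(xs, rounds=5, grid=50):
    counts, n, total = Counter(xs), len(xs), sum(xs)
    lo, hi = 1e-6, 1 - 1e-6
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        pts = [lo + i * step for i in range(grid + 1)]
        best = max(pts, key=lambda q: log_lik(q, counts, n, total))
        lo, hi = max(best - step, 1e-6), min(best + step, 1 - 1e-6)
    return best

rng = random.Random(2022)  # fixed seed for reproducibility
true_p, replications = 0.4, 200
mse = {}
for n in (25, 100, 400):
    estimates = [bndl_mle([sample_bndl(true_p, rng) for _ in range(n)])
                 for _ in range(replications)]
    mse[n] = sum((e - true_p) ** 2 for e in estimates) / replications
    print(n, mse[n])
```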

Figure 6.
Plots of the estimated parameter and MSEs for various values of p.
4. Applications to Count Data
In this section, we use a real-life data set, studied by Balakrishnan et al. [24] and consisting of 744 discrete observations, to examine how well the BNDL distribution models real data. Santiago, Chile is recognized as one of the most environmentally contaminated cities in the world. To assess the level of air pollution and its adverse effects on humans in Santiago, the National Commission of Environment (CONAMA) of the government of Chile collects data on sulfur dioxide (SO2) concentrations in the air. The data corresponding to the hourly SO2 concentrations (in ppm) observed at a monitoring station located in Santiago city are:
| x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 and above |
| f | 86 | 235 | 120 | 119 | 35 | 15 | 11 | 9 | 4 | 10 |
The descriptive statistics of the data set are: Mean = 2.93, Median = 2, Mode = 3, SD = 2.02, Coefficient of Variation = 0.69, Skewness = 4.32, Kurtosis = 34.57, Range = 24, Min value = 1 and Max value = 25.
We compare BNDL to Binomial–Discrete Lindley Distribution (BDLD) by Kuş et al. [15] and Negative Binomial distribution. The pmf of BDLD is given as
We considered the AIC (Akaike Information Criterion), CAIC (Consistent Akaike Information Criterion), BIC (Bayesian Information Criterion) and HQIC (Hannan–Quinn Information Criterion). The model with the minimum values of these statistics is chosen as the best model to fit the data. All results in Table 3 were obtained using R.
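For reference, the four criteria can be computed from a maximized log-likelihood as sketched below. The formulas are the commonly used ones, with k the number of estimated parameters and n the sample size. The CAIC expression, −2ℓ + k(ln n + 1), is one common definition and is an assumption here, since some papers use a different expression under the same abbreviation. The numerical inputs are hypothetical and are not results from Table 3.

```python
import math

def information_criteria(log_lik, k, n):
    # log_lik: maximized log-likelihood, k: number of parameters, n: sample size.
    return {
        "AIC": -2 * log_lik + 2 * k,
        "BIC": -2 * log_lik + k * math.log(n),
        "CAIC": -2 * log_lik + k * (math.log(n) + 1),  # assumed definition
        "HQIC": -2 * log_lik + 2 * k * math.log(math.log(n)),
    }

# Hypothetical values, for illustration only.
criteria = information_criteria(log_lik=-1500.0, k=1, n=744)
for name, value in criteria.items():
    print(name, round(value, 3))
```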
Table 3.
MLEs and their standard errors (in parentheses) with statistics AIC, BIC, HQIC and CAIC values for given data.
Figure 7 gives the quantile–quantile (Q-Q) plot and box plot, and Figure 8 gives the TTT plot together with the EHRF for the given data set. The total time on test (TTT) plot shows that the data set has an increasing hazard rate shape, which is confirmed by the EHRF. Figure 9 and Figure 10 show the fitted model against its comparative distributions. These plots indicate that the BNDL model fits the data better than the well-known BDLD and negative binomial models.
Figure 7.
(a) Q-Q plot and (b) box plot for the given data.
Figure 8.
(a) TTT plot and (b) Expected Hazard Rate Function (EHRF) for the BDLD model for the dataset.
Figure 9.
Fitted plots of BNDL and BDLD distribution for given data set.
Figure 10.
Fitted plot of Negative Binomial distributions for given data set.
5. Concluding Remarks
A new one-parameter discrete distribution was proposed and its important distributional, monotonic, and reliability characteristics were explored. Some statistical and reliability properties of the proposed discrete model were derived. Various estimation approaches were discussed. A simulation study was conducted to assess the accuracy and precision of the MLE. The applicability of the proposed distribution in modeling a real-life discrete data set was demonstrated. The comparison indicates that, among the tested distributions, the new distribution provides the best fit to the data set, and it should be a useful contribution to the field of count data modeling.
Author Contributions
Conceptualization, S.S. and S.K.; methodology, W.M.; software, J.G.; validation, S.S. and S.K.; formal analysis, W.M.; investigation, S.S.; resources, F.J.; data curation, W.M.; writing—original draft preparation, S.S. and W.M.; writing—review and editing, S.K.; visualization, J.G.; supervision, S.K.; project administration, F.J. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Aryuyuen, S.; Bodhisuwan, W.; Volodin, A. Discrete Generalized Odd Lindley–Weibull Distribution with Applications. Lobachevskii J. Math. 2020, 41, 945–955.
- Chakraborty, S. A New Discrete Distribution Related to Generalized Gamma Distribution and Its Properties. Commun. Stat. Theory Methods 2015, 44, 1691–1705.
- Chakraborty, S.; Chakravarty, D. Discrete Gamma Distributions: Properties and Parameter Estimations. Commun. Stat. Theory Methods 2012, 41, 3301–3324.
- Chakraborty, S.; Dhrubajyoti, C. A Discrete Gumbel Distribution. arXiv 2014. Available online: https://arxiv.org/abs/1410.7568 (accessed on 8 June 2022).
- El-Morshedy, M.; Eliwa, M.S.; Nagy, H. A New Two-Parameter Exponentiated Discrete Lindley Distribution: Properties, Estimation and Applications. J. Appl. Stat. 2018, 47, 354–375.
- Gómez-Déniz, E.; Calderín-Ojeda, E. The Discrete Lindley Distribution: Properties and Applications. J. Stat. Comput. Simul. 2011, 81, 1405–1416.
- Hu, Y.; Peng, X.; Li, T.; Guo, H. On the Poisson Approximation to Photon Distribution for Faint Lasers. Phys. Lett. A 2007, 367, 173–176.
- Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Springer: New York, NY, USA, 2007.
- Nekoukhou, V.; Alamatsaz, M.H.; Bidram, H. Discrete Generalized Exponential Distribution of a Second Type. Statistics 2013, 47, 876–887.
- Para, B.A.; Jan, T.R. Discrete Generalized Weibull Distribution: Properties and Applications in Medical Sciences. Pak. J. Stat. 2017, 33, 337–354.
- Roy, D. The Discrete Normal Distribution. Commun. Stat.-Theory Methods 2003, 32, 1871–1883.
- Afify, A.Z.; Elmorshedy, M.; Eliwa, M.S. A New Skewed Discrete Model: Properties, Inference, and Applications. Pak. J. Stat. Oper. Res. 2021, 17, 799–816.
- Déniz, E.G. A New Discrete Distribution: Properties and Applications in Medical Care. J. Appl. Stat. 2013, 40, 2760–2770.
- Akdoğan, Y.; Kuş, C.; Asgharzadeh, A.; Kinaci, I.; Sharafi, F. Uniform-Geometric Distribution. J. Stat. Comput. Simul. 2016, 86, 1754–1770.
- Kuş, C.; Akdoğan, Y.; Asgharzadeh, A.; Kınacı, I.; Karakaya, K. Binomial-Discrete Lindley Distribution. Commun. Fac. Sci. Univ. Ank. Ser. A1 Math. Stat. 2019, 68, 401–411.
- Al-Babtain, A.A.; Ahmed, A.H.N.; Afify, A.Z. A New Discrete Analog of the Continuous Lindley Distribution, with Reliability Applications. Entropy 2020, 22, 603.
- Yalcin, F.; Simsek, Y. Formulas for characteristic function and moment generating functions of beta type distribution. Rev. Real Acad. Cienc. Exactas Físicas Y Naturales. Ser. A Matemáticas 2022, 116, 86.
- Yalcin, F.; Simsek, Y. A new class of symmetric beta type distributions constructed by means of symmetric Bernstein type basis functions. Symmetry 2020, 12, 779.
- Simsek, B. Formulas derived from moment generating functions and Bernstein polynomials. Appl. Anal. Discret. Math. 2019, 13, 839–848.
- Keilson, J.; Gerber, H. Some Results for Discrete Unimodality. J. Am. Stat. Assoc. 1971, 66, 386–389.
- Gupta, P.L.; Gupta, R.C.; Tripathi, R.C. On the monotonic properties of discrete failure rates. J. Stat. Plan. Inference 1997, 65, 255–268.
- Kemp, A.W. Classes of discrete lifetime distributions. Commun. Stat. Theory Methods 2004, 33, 3069–3093.
- Gray, R.M. Entropy and Information Theory; Springer: New York, NY, USA, 2011.
- Balakrishnan, N.; Leiva, V.; Sanhueza, A.; Cabrera, E. Mixture inverse Gaussian distributions and its transformations, moments and applications. Statistics 2009, 431, 91–104.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).