A Probability Mass Function for Various Shapes of the Failure Rates, Asymmetric and Dispersed Data with Applications to Coronavirus and Kidney Dysmorphogenesis

In this article, a discrete analogue of an extension to a two-parameter half-logistic model is proposed for modeling count data. The probability mass function of the new model can be expressed as a mixture representation of a geometric model. Some of its statistical properties, including the hazard rate function, moments, moment generating function, conditional moments, stress-strength analysis, residual entropy, cumulative residual entropy and order statistics with their moments, are derived. It is found that the new distribution can be used to model positively skewed data and to analyze equi- and over-dispersed data. Furthermore, the hazard rate function can be decreasing, increasing or bathtub-shaped. The parameters are estimated from the classical point of view using the method of maximum likelihood. A detailed simulation study is carried out to examine the performance of the estimators. Finally, two distinct real data sets are analyzed to illustrate the flexibility of the proposed discrete distribution.

In several cases, lifetimes need to be recorded on a discrete scale rather than on a continuous analogue. Thus, discretizing CPr distributions has received noticeable attention in recent years. See, for example, Pillai and Jayakumar [19], Kemp [20], Roy [21],

Synthesis of the DHLo-II Model
In this section, the new discrete model is generated using the survival discretization technique. The CDF of the DHLo-II distribution is given in Equation (2), where $F(-1; \alpha, \beta) = 0$, $0 < \alpha < 1$ and $0 < \beta < 1$, and the corresponding PMF is given in Equation (3). Using the generalized binomial expansion, Equation (3) can be written as a mixture of geometric (Geo) models, where $g(x; \nu) = \nu(1 - \nu)^{x}$ denotes the PMF of the Geo distribution with parameter $\nu$. The HRF is defined as $H_X(x; \alpha, \beta) = \dfrac{\Pr(X = x; \alpha, \beta)}{1 - F_X(x - 1; \alpha, \beta)}$. Figure 1 shows the PMF and HRF plots for various values of the DHLo-II parameters. It is noted that the shape of the PMF is always unimodal. Further, the DHLo-II distribution can be used to model asymmetric data. Regarding the HRF, it is found that the proposed model has several shapes, including bathtub, increasing and decreasing, which means this model can be used to analyze various types of data in different fields, especially in medicine, insurance and engineering.
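The survival discretization step and the HRF definition above can be illustrated with a minimal Python sketch. Since the DHLo-II baseline is not reproduced here, the sketch assumes an exponential survival function $S(x) = e^{-\lambda x}$ as a stand-in; the function names `discretize_survival` and `hazard_rate` are hypothetical. With this stand-in, the discretized PMF is exactly the Geo PMF $g(x; \nu)$ with $\nu = 1 - e^{-\lambda}$, and the HRF is constant.

```python
import math

def discretize_survival(survival, x_max):
    """PMF on 0..x_max via survival discretization: P(X = x) = S(x) - S(x + 1)."""
    return [survival(x) - survival(x + 1) for x in range(x_max + 1)]

def hazard_rate(pmf):
    """HRF h(x) = P(X = x) / (1 - F(x - 1)), using F(-1) = 0."""
    rates, cdf_prev = [], 0.0
    for p in pmf:
        rates.append(p / (1.0 - cdf_prev))
        cdf_prev += p
    return rates

# Stand-in continuous model (an assumption, NOT the DHLo-II baseline):
# exponential survival S(x) = exp(-lam * x).
lam = 0.5
pmf = discretize_survival(lambda x: math.exp(-lam * x), x_max=40)
hrf = hazard_rate(pmf)

# For this stand-in the result is geometric with nu = 1 - exp(-lam).
nu = 1.0 - math.exp(-lam)
print(pmf[:3], hrf[:3], nu)
```

Replacing the lambda with another continuous survival function gives the corresponding discrete analogue in the same way.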

Moments and Generating Functions
Let X be a DHLo-II RV. Its probability generating function (PrGF) is given in Equation (6). Replacing $s$ by $e^{s}$ in Equation (6) yields the moment generating function (MGF), from which the first moment of the DHLo-II distribution follows. The other moments can be derived similarly. Based on the MGF, the mean, variance, index of dispersion (IOD), skewness and kurtosis are reported in Tables 1-5 as numerical computations (NuCo).
From Tables 1-5 it is clear that: the mean, variance and IOD increase for constant values of $\beta$ as $\alpha \to 1$; the proposed model is appropriate only for modelling equi- and over-dispersed data, because the IOD is always greater than or equal to one; and this distribution is capable of modeling positively skewed and leptokurtic data sets.
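The descriptive measures discussed above can be computed numerically from any PMF on the non-negative integers. The following Python sketch is an illustration only: it uses a Geo PMF as a stand-in (an assumption, since the DHLo-II PMF is not reproduced here), truncates the support at 400 points, and uses the hypothetical helper name `descriptive_measures`. The IOD is the ratio of the variance to the mean, so values above one indicate over-dispersion.

```python
def descriptive_measures(pmf):
    """Mean, variance, IOD, skewness and kurtosis of a PMF listed over 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(pmf))

    def central(r):
        # r-th central moment
        return sum((x - mean) ** r * p for x, p in enumerate(pmf))

    var = central(2)
    return {
        "mean": mean,
        "variance": var,
        "iod": var / mean,                   # index of dispersion
        "skewness": central(3) / var ** 1.5,
        "kurtosis": central(4) / var ** 2,   # non-excess kurtosis
    }

# Stand-in PMF (assumption): geometric g(x; nu) = nu * (1 - nu)^x, truncated.
nu = 0.4
geo_pmf = [nu * (1 - nu) ** x for x in range(400)]
stats = descriptive_measures(geo_pmf)
print(stats)
```

For the Geo stand-in the IOD equals $1/\nu$, which is always at least one.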

Conditional Moments
This section lists results on the conditional moments (CoMos) of the DHLo-II distribution. The CoMos can be used to derive the mean deviations and the Bonferroni and Lorenz curves. The $n$th CoMo of the DHLo-II model under $X^{n} \mid X \le x$ and $X^{n} \mid X > x$ can be expressed as

respectively. The mean residual life function is given by
is referred to as the vitality function of the distribution function F.
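As a numerical illustration of the conditional moments $E[X^{n} \mid X \le x]$ and $E[X^{n} \mid X > x]$, the Python sketch below evaluates both as truncated sums over a PMF. It is a sketch under stated assumptions: the Geo PMF is used as a stand-in for the DHLo-II PMF, and `conditional_moments` is a hypothetical helper name.

```python
def conditional_moments(pmf, x, n=1):
    """Return (E[X^n | X <= x], E[X^n | X > x]) for a PMF listed over 0, 1, 2, ..."""
    lower_mass = sum(pmf[: x + 1])       # P(X <= x)
    upper_mass = sum(pmf[x + 1:])        # P(X > x)
    lower = sum(k ** n * p for k, p in enumerate(pmf[: x + 1])) / lower_mass
    upper = sum(k ** n * p for k, p in enumerate(pmf[x + 1:], start=x + 1)) / upper_mass
    return lower, upper

# Stand-in PMF (assumption): geometric with nu = 0.4, truncated at 400 points.
nu = 0.4
geo_pmf = [nu * (1 - nu) ** k for k in range(400)]
lower, upper = conditional_moments(geo_pmf, x=2, n=1)
print(lower, upper)
```

For the Geo stand-in, the memoryless property gives $E[X \mid X > x] = x + 1 + (1 - \nu)/\nu$, which provides a simple check of the computation.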

Stress-Strength Analysis
Stress-strength analysis has been utilized in mechanical component design. It is noted that the stress-strength value depends on the model parameters only.

Residual Entropy and Cumulative Residual Entropy
Residual entropy (RE) and cumulative residual entropy (CRE) are two important measures in information theory. The RE of the RV X is given by whereas the CRE can be listed as where $\bar{F}(x; \alpha, \beta)$ represents the survival function of the distribution. The previous two equations can be derived by using the geometric expansion and the generalized binomial expansion (simple algebra).

Order Statistics
Order statistics (OrSt) play an important role in different fields of statistical theory. Suppose $X_1, X_2, \ldots, X_n$ is a random sample (RS) from the DHLo-II distribution, and let $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ be the corresponding OrSt. Then, the CDF of the $i$th OrSt $X_{i:n}$ for an integer value of $x$ is proposed as where $\Omega^{(n,k)}_{(j,r,l)} = (-1)^{k+l}\binom{n}{j}$ The PMF of the $i$th OrSt can be formulated as The $v$th moments of $X_{i:n}$ can be proposed as Based on Equation (14), L-moments can be listed as which can be utilized to discuss some descriptive statistics.
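The distribution of the $i$th OrSt can also be evaluated numerically from the parent CDF using the standard binomial-sum identity $F_{i:n}(x) = \sum_{k=i}^{n} \binom{n}{k} F(x)^{k} [1 - F(x)]^{n-k}$, which holds for discrete parents as well. The Python sketch below uses this identity rather than the series expansion of the paper, and it assumes a Geo parent as a stand-in for the DHLo-II CDF; the function names are hypothetical.

```python
from math import comb

def order_stat_cdf(F_x, i, n):
    """CDF of X_{i:n} at a point where the parent CDF equals F_x."""
    return sum(comb(n, k) * F_x ** k * (1 - F_x) ** (n - k) for k in range(i, n + 1))

def order_stat_pmf(cdf, i, n):
    """PMF of X_{i:n} on 0, 1, 2, ... from the parent CDF values cdf[x] = F(x)."""
    pmf, prev = [], 0.0
    for F_x in cdf:
        cur = order_stat_cdf(F_x, i, n)
        pmf.append(cur - prev)   # F_{i:n}(x) - F_{i:n}(x - 1)
        prev = cur
    return pmf

def moment(pmf, v):
    """v-th raw moment of a PMF listed over 0, 1, 2, ..."""
    return sum(x ** v * p for x, p in enumerate(pmf))

# Stand-in parent (assumption): geometric CDF F(x) = 1 - (1 - nu)^(x + 1).
nu, n = 0.4, 3
parent_cdf = [1 - (1 - nu) ** (x + 1) for x in range(60)]
pmf_min = order_stat_pmf(parent_cdf, i=1, n=n)   # sample minimum
pmf_max = order_stat_pmf(parent_cdf, i=n, n=n)   # sample maximum
print(moment(pmf_min, 1), moment(pmf_max, 1))
```

For the Geo stand-in, the sample minimum is again geometric with parameter $1 - (1 - \nu)^{n}$, which gives a quick sanity check.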

Maximum Likelihood Estimation (MLE)
In this section, we determine the MLEs of the DHLo-II parameters based on a complete sample. Assume $X_1, X_2, \ldots, X_n$ to be an RS of size $n$ from the DHLo-II distribution. The log-likelihood function (L) can be listed as follows To estimate the model parameters $\alpha$ and $\beta$, the first partial derivatives $\partial L(x; \alpha, \beta)/\partial \alpha$ and $\partial L(x; \alpha, \beta)/\partial \beta$ are obtained and set equal to zero, which gives the normal equations. These two equations cannot be solved analytically. Thus, an iterative procedure such as Newton-Raphson is required to solve them numerically.
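The Newton-Raphson idea can be sketched in Python on a simpler problem. The example below is an illustration under stated assumptions, not the DHLo-II procedure: it solves the single likelihood equation of a one-parameter Geo model, whose score and second derivative are available in closed form, and the sample values are hypothetical. The two-parameter DHLo-II case would use the gradient vector and the Hessian matrix in place of the scalar derivatives.

```python
def newton_raphson(score, hessian, start, tol=1e-10, max_iter=100):
    """Solve score(theta) = 0 by Newton-Raphson: theta <- theta - score / hessian."""
    theta = start
    for _ in range(max_iter):
        step = score(theta) / hessian(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical count data, used only for illustration.
sample = [0, 1, 3, 2, 0, 5, 1, 0, 2, 4]
n, total = len(sample), sum(sample)

# Geo log-likelihood (stand-in): L(nu) = n*log(nu) + total*log(1 - nu).
score = lambda nu: n / nu - total / (1 - nu)                  # dL/dnu
hessian = lambda nu: -n / nu ** 2 - total / (1 - nu) ** 2     # d^2L/dnu^2

nu_hat = newton_raphson(score, hessian, start=0.5)
print(nu_hat)
```

In this stand-in the MLE has the closed form $n / (n + \sum x_i)$, so the iterative solution can be checked directly.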

Simulation
In this section, we assess the performance of the maximum likelihood estimators (MLEs) with respect to the sample size $n$ using R software. The assessment is based on a simulation study in which 10,000 samples of size $n = 10, 12, 14, \ldots, 60$ are generated from the DHLo-II distribution. The bias and mean squared error (MSE) of the estimators are given in Figures 2 and 3, respectively. From Figures 2 and 3, it is noted that the magnitudes of the bias and MSE decrease toward zero as $n$ grows, which is in line with the consistency of the MLEs. We can say that the maximum likelihood approach works quite well in estimating the model parameters, and consequently, it can be used effectively for analyzing count data.
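The structure of such a simulation study can be sketched as follows. This Python sketch is an illustration only and makes several assumptions: it uses a Geo model as a stand-in for the DHLo-II distribution, a reduced number of replications (2000 instead of 10,000), three sample sizes, and hypothetical function names and seed.

```python
import math
import random

def simulate_geo(nu, n, rng):
    """Draw n geometric variates on 0, 1, 2, ... by inversion."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [int(math.log(1.0 - rng.random()) / math.log(1.0 - nu)) for _ in range(n)]

def geo_mle(sample):
    """Closed-form MLE of nu for the geometric stand-in."""
    return len(sample) / (len(sample) + sum(sample))

def bias_and_mse(nu, n, reps, rng):
    """Monte Carlo estimates of the bias and MSE of the MLE for sample size n."""
    estimates = [geo_mle(simulate_geo(nu, n, rng)) for _ in range(reps)]
    bias = sum(estimates) / reps - nu
    mse = sum((e - nu) ** 2 for e in estimates) / reps
    return bias, mse

rng = random.Random(2021)   # fixed seed for reproducibility
results = {n: bias_and_mse(0.4, n, reps=2000, rng=rng) for n in (10, 30, 60)}
for n, (bias, mse) in results.items():
    print(f"n={n:2d}  bias={bias:+.4f}  mse={mse:.5f}")
```

Plotting the bias and MSE against $n$ would give the analogue of Figures 2 and 3 for the stand-in model.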

Applications
In this section, we illustrate the importance and the flexibility of the DHLo-II distribution using data from different fields. We compare the fit of the DHLo-II distribution with some competitive models, namely the discrete inverse Weibull (DIW), discrete gamma Lindley (DGL), discrete Burr II (DB-II), discrete log-logistic (DLL), discrete inverse Rayleigh (DIR), discrete Burr-Hatke (DBH), discrete Lindley and discrete Pareto (DP) distributions. The fitted models are compared using the maximized log-likelihood (L), the Akaike information criterion (Aic) and its corrected version (Caic), the Hannan-Quinn information criterion (Hqic), the Bayesian information criterion (Bic), and the chi-square (Chi$^2$) test with its corresponding P-value (Pv).
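For reference, the information criteria can be computed from the maximized log-likelihood L, the number of parameters $k$ and the sample size $n$. The Python sketch below assumes the commonly used definitions (Aic $= 2k - 2L$, Caic $=$ Aic $+ 2k(k+1)/(n-k-1)$, Bic $= k \ln n - 2L$, Hqic $= 2k \ln(\ln n) - 2L$); the exact forms used by the authors are not stated in this section, and the numerical inputs are hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Standard model selection criteria; smaller values indicate a better fit."""
    aic = 2 * k - 2 * loglik
    return {
        "Aic": aic,
        "Caic": aic + 2 * k * (k + 1) / (n - k - 1),          # small-sample corrected Aic
        "Bic": k * math.log(n) - 2 * loglik,
        "Hqic": 2 * k * math.log(math.log(n)) - 2 * loglik,
    }

# Hypothetical values: a two-parameter model fitted to n = 50 observations.
criteria = information_criteria(loglik=-100.0, k=2, n=50)
print(criteria)
```

All four criteria penalize the number of parameters, so a two-parameter model is preferred over a one-parameter competitor only if its log-likelihood gain outweighs the penalty.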

Data set I: COVID-19 in Armenia
The data are available at (https://www.worldometers.info/coronavirus/country/armenia/, accessed on 20 July 2021) and represent the daily new COVID-19 deaths in Armenia from 15 February to 4 October 2020. The initial shape of the mass function for these data is explored using the nonparametric kernel mass estimation (Kme) technique in Figure 4, and it is observed that the mass function is asymmetric. The normality condition (Nc) is checked by the quantile-quantile (Qu-Qu) plot in Figure 4. The extreme observations (ExOb) are spotted from the box plot in Figure 4, and it is observed that some ExOb are present.
The MLEs with their corresponding standard errors (Se), confidence intervals (CI) for the parameter(s) and goodness of fit tests for data set I are listed in Tables 6 and 7. The abbreviations "Of" and "Df" represent the observed frequency and degrees of freedom, respectively. From Table 7, it is noted that the DGL distribution works quite well in addition to the DHLo-II distribution. However, the DHLo-II model is the best among all tested distributions. Figure 5 shows that the MLEs are unique because the L profiles are unimodal. Figure 6 supports our empirical results, indicating that the DHLo-II distribution fits these data better, whereas Figure 7 shows the probability-probability (Pr-Pr) plot for data set I, which suggests that the data plausibly came from the DHLo-II distribution.
The data exhibit over-dispersion. Moreover, they are moderately skewed to the right and leptokurtic.

Data Set II: Kidney Dysmorphogenesis
This data set is taken from the study of Chan et al. [40]. The initial shape of the mass function for the kidney data is explored using the nonparametric Kme approach in Figure 8, and it is noted that the mass function is asymmetric and multimodal. The Nc is checked via the Qu-Qu plot in Figure 8. The ExOb are spotted from the box plot in Figure 8, and it is noted that some ExOb are present. Here, we examine the fitting capability of the DHLo-II distribution against some other competitive distributions. The MLEs, Se and CI for the parameter(s) as well as the goodness of fit tests for these data are reported in Tables 8 and 9. It is noted that the DIW, DB-II, DLL, DBH and DP distributions work quite well in addition to the DHLo-II distribution. However, the DHLo-II distribution is the best model among all tested models. Figure 9 shows that the MLEs are unique. Figure 10 supports our empirical results, indicating that the DHLo-II distribution fits data set II better, whereas Figure 11 shows the Pr-Pr plot for the same data.
According to the MLEs, the EDS for the mean, variance, IOD, skewness and kurtosis are 1.45414, 5.87716, 4.04167, 2.33510 and 10.05033, respectively. The data are over-dispersed, skewed to the right and leptokurtic.

Figure 11. The Pr-Pr plots for data set II.

Conclusions
In this paper, we proposed a flexible discrete probability model with two parameters, called the discrete half-logistic (DHLo-II) distribution. Various statistical properties of the proposed model have been derived. It was found that the DHLo-II model is convenient for modelling skewed data sets, especially those with very extreme observations. Furthermore, it can be used as a flexible model to analyze equi- and over-dispersed phenomena, especially in the medicine, insurance and engineering fields. A further advantage of the proposed model is that it provides a wide variation in the shape of the HRF, including decreasing, increasing and bathtub shapes, and consequently this distribution can be used in modelling various kinds of data. The DHLo-II parameters have been estimated via the MLE approach. A simulation has been performed based on different sample sizes, and it was found that the MLE method works quite effectively in estimating the DHLo-II parameters, in line with the consistency property. Finally, two distinct data sets, on COVID-19 and kidney dysmorphogenesis, have been analyzed to illustrate the flexibility of the DHLo-II model. In our future work, bivariate and multivariate extensions of the DHLo-II distribution will be derived, with applications in the medicine and engineering fields.