The Uniform Poisson–Ailamujia Distribution: Actuarial Measures and Applications in Biological Science

Abstract: We propose a new asymmetric discrete model by combining the uniform and Poisson–Ailamujia distributions using the binomial decay transformation method. Owing to its flexibility, the resulting distribution, named the uniform Poisson–Ailamujia distribution, is a good alternative to the well-known Poisson and geometric distributions for real data applications in public health, biology, sociology, medicine, and agriculture. Its main statistical properties are studied, including the cumulative and hazard rate functions, moments, and entropy. The new distribution is suitable for modeling purposes, and its parameter is estimated by eight classical methods. Three applications to biological data are presented.


Introduction
Discrete distributions are quite useful for modeling discrete lifetime data in many situations. Recently, several continuous distributions have been discretized for modeling lifetime data, such as those summarized in Table 1, which includes the discrete Chen distribution of Noughabi et al. [7]. On the other hand, a natural discrete analog of the continuous Lindley model, called natural discrete Lindley (NDL), was introduced by [8] as a mixture of the negative binomial and geometric distributions. Several reliability properties of the NDL were explored by [9].
Let N and X be two discrete random variables denoting the numbers of particles entering and leaving an attenuator, respectively, with probability mass functions (pmfs) p(n) and P(X = x) connected by the binomial decay transformation introduced by Hu et al. [10], $P(X = x) = \sum_{n=x}^{\infty} \binom{n}{x} p^{x} (1-p)^{n-x} \, p(n)$, $x = 0, 1, \ldots$, where 0 ≤ p ≤ 1 is the attenuating coefficient. Hu et al. [10] defined p(n) as the pmf of a Poisson distribution with rate parameter λ > 0 and showed that P(X = x) is then also a Poisson pmf, with rate λp. They investigated the quantitative relation between the input and output distributions after the attenuation. In recent studies, new discrete models have been constructed by compounding two discrete distributions. For example, Déniz [11] defined the uniform Poisson, Akdogan et al. [12] proposed the uniform geometric, and Kuş et al. [13] introduced the binomial discrete Lindley.
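The Poisson result of Hu et al. [10] quoted above can be checked numerically. The following is a minimal sketch, not code from the paper: the function names, the truncation point `n_max`, and the values λ = 3 and p = 0.4 are illustrative assumptions.

```python
import math


def poisson_pmf(k, lam):
    """Poisson pmf, evaluated on the log scale for numerical stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def attenuated_pmf(x, input_pmf, p, n_max=200):
    """Binomial decay transformation, with the infinite sum truncated at n_max.

    P(X = x) = sum_{n >= x} C(n, x) p^x (1 - p)^(n - x) p(n)
    """
    return sum(
        math.comb(n, x) * p**x * (1.0 - p) ** (n - x) * input_pmf(n)
        for n in range(x, n_max + 1)
    )


lam, p = 3.0, 0.4  # illustrative values only
for x in range(6):
    transformed = attenuated_pmf(x, lambda n: poisson_pmf(n, lam), p)
    direct = poisson_pmf(x, lam * p)
    print(x, transformed, direct)
```

The two printed columns should agree up to truncation and rounding error, which is the Poisson(λp) property stated in the text.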
In this paper, we introduce the asymmetric uniform Poisson-Ailamujia (UPA) distribution using the methodology of Hu et al. [10]. This distribution is a competitor to the Poisson-Ailamujia (PA) model, and it is suitable for fitting datasets with excesses of ones. We estimate the parameter α of the UPA distribution using eight classical methods and provide detailed simulations to explore the behavior of the estimators.
The rest of the paper is organized as follows. Section 2 defines the new one-parameter distribution and some of its properties. Two actuarial measures are calculated in Section 3. The estimation methods are discussed in Section 4. In Section 5, the efficiency of the estimators is studied via Monte Carlo simulations. Section 6 provides three real applications of the new distribution. Section 7 offers some conclusions.

The Discrete UPA Distribution
The PA distribution was derived from the Poisson compounding scheme based on the continuous Ailamujia distribution of Lv et al. [14]. It was pioneered by Hassan et al. [15] for modeling count data, offering a new alternative to the Poisson and the negative binomial, among other models. Its pmf has the form (for α > 0) $P(N = n) = \frac{4\alpha^{2}(n+1)}{(1+2\alpha)^{n+2}}$, $n = 0, 1, \ldots$
Equation (2) can be expressed as $P(X = x) = \sum_{n=x}^{\infty} P(X = x \mid N = n) \, P(N = n)$, where X|N = n has the binomial B(n, p) model. Now, let X|N = n have the discrete uniform distribution on {0, 1, . . . , n}, so that $P(X = x \mid N = n) = 1/(n+1)$, and let N have a PA distribution with parameter α > 0. Then, the pmf of the UPA random variable (rv), say, X ∼ UPA(α), is as follows (for x = 0, 1, . . .): $P(X = x) = \frac{2\alpha}{(1+2\alpha)^{x+1}}$ (4). Figure 1 displays plots of the pmf of X, which is unimodal. The probabilities P(X = x) decrease as x increases.
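The compounding step can be verified numerically. The sketch below is not taken from the paper. It assumes the PA pmf $4\alpha^{2}(n+1)/(1+2\alpha)^{n+2}$ and a conditional uniform law on {0, ..., n}, and compares the truncated mixture sum with the closed form $2\alpha/(1+2\alpha)^{x+1}$ that follows from those two assumptions. All function names and the truncation point are hypothetical.

```python
import math


def pa_pmf(n, alpha):
    """Assumed Poisson-Ailamujia pmf: 4 a^2 (n + 1) / (1 + 2a)^(n + 2)."""
    return 4.0 * alpha**2 * (n + 1) * math.exp(-(n + 2) * math.log1p(2.0 * alpha))


def upa_pmf_mixture(x, alpha, n_max=5000):
    """Mixture sum: P(X = x) = sum_{n >= x} [1 / (n + 1)] * P(N = n), truncated."""
    return sum(pa_pmf(n, alpha) / (n + 1) for n in range(x, n_max + 1))


def upa_pmf(x, alpha):
    """Closed form implied by the mixture: 2a / (1 + 2a)^(x + 1)."""
    return 2.0 * alpha * math.exp(-(x + 1) * math.log1p(2.0 * alpha))


for alpha in (0.35, 1.5):  # illustrative parameter values
    for x in range(4):
        print(alpha, x, upa_pmf_mixture(x, alpha), upa_pmf(x, alpha))
```

Powers are computed through `exp` and `log1p` so that large exponents underflow quietly to zero instead of overflowing.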

Properties
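The survival function (sf) of the UPA distribution is as follows (for x = 0, 1, . . .): $S(x) = P(X > x) = (1+2\alpha)^{-(x+1)}$ (5). The cumulative distribution function (cdf) of X reduces to $F(x) = P(X \le x) = 1 - (1+2\alpha)^{-(x+1)}$ (6). The hazard rate function (hrf) of X can be defined as h(x) = P(X = x | X ≥ x) = P(X = x) / P(X ≥ x), where P(X ≥ x) > 0. Then, the hrf of the UPA distribution follows from Equations (4) and (5) as $h(x) = 2\alpha/(1+2\alpha)$, since $P(X \ge x) = P(X = x) + S(x) = (1+2\alpha)^{-x}$.
The moment generating function of X is $M_X(t) = 2\alpha/(1 + 2\alpha - e^{t})$ for $t < \log(1+2\alpha)$. The first four ordinary moments of X are $E(X) = \frac{1}{2\alpha}$, $E(X^{2}) = \frac{1+\alpha}{2\alpha^{2}}$, $E(X^{3}) = \frac{3 + 6\alpha + 2\alpha^{2}}{4\alpha^{3}}$, and $E(X^{4}) = \frac{3 + 9\alpha + 7\alpha^{2} + \alpha^{3}}{2\alpha^{4}}$. The variance, skewness, and (excess) kurtosis of X are obtained from these expressions as $V(X) = \frac{1+2\alpha}{4\alpha^{2}}$, $\gamma_1(X) = \frac{2(1+\alpha)}{\sqrt{1+2\alpha}}$, and $\gamma_2(X) = 6 + \frac{4\alpha^{2}}{1+2\alpha}$.
We note that the new distribution is over-dispersed since the index of dispersion (ID) satisfies $\mathrm{ID}(X) = V(X)/E(X) = 1 + \frac{1}{2\alpha} > 1$. Hence, the UPA distribution can be used for modeling over-dispersed data. In addition, it is right-skewed and leptokurtic, since $\gamma_1(X) > 0$ and $\gamma_2(X) > 0$, respectively. The UPA distribution is a heavy-tailed distribution. Table 2 gives some moments, variances, and IDs in terms of α. Figure 2 displays the plots of the skewness and kurtosis versus α. The ID decreases monotonically in α, whereas the skewness and kurtosis increase monotonically for α ∈ (0, ∞).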
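The over-dispersion property can be illustrated by computing the mean, variance, and ID directly from the pmf. This is a rough sketch under the assumption that the pmf is $2\alpha/(1+2\alpha)^{x+1}$. The helper names and the truncation point `x_max` are hypothetical choices, not part of the paper.

```python
import math


def upa_pmf(x, alpha):
    """Assumed UPA pmf: 2a / (1 + 2a)^(x + 1)."""
    return 2.0 * alpha * math.exp(-(x + 1) * math.log1p(2.0 * alpha))


def raw_moment(k, alpha, x_max=5000):
    """k-th ordinary moment E(X^k), with the series truncated at x_max."""
    return sum(x**k * upa_pmf(x, alpha) for x in range(x_max + 1))


def mean_var_id(alpha):
    """Return (mean, variance, index of dispersion) from the truncated series."""
    m1 = raw_moment(1, alpha)
    m2 = raw_moment(2, alpha)
    variance = m2 - m1**2
    return m1, variance, variance / m1


for alpha in (0.35, 0.5, 1.5, 3.0):  # the values used in the simulation study
    mean, variance, disp = mean_var_id(alpha)
    print(alpha, mean, variance, disp)
```

Under the assumed pmf, the printed ID should exceed one for every α and shrink toward one as α grows.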

Stochastic Orders of the Parameter α
Shaked and Shanthikumar [16] showed that some stochastic orders exist and have several applications. Theorem 1 shows that the UPA distribution is ordered according to the strongest stochastic order, namely, the likelihood ratio (lr) order.
Definition 1. Consider two random variables X and Y with respective pmfs $f_X(\cdot)$ and $f_Y(\cdot)$. Then, X is said to be smaller than Y in the lr order, denoted by $X \le_{lr} Y$, if $f_X(x)/f_Y(x)$ is decreasing in x.
Theorem 1. Let X ∼ UPA($\alpha_1$) and Y ∼ UPA($\alpha_2$). Then $X \le_{lr} Y$ for all $\alpha_1 > \alpha_2$.

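Proof. We have $\frac{f_X(x)}{f_Y(x)} = \frac{\alpha_1}{\alpha_2} \left( \frac{1+2\alpha_2}{1+2\alpha_1} \right)^{x+1}$
and $\frac{f_X(x+1)/f_Y(x+1)}{f_X(x)/f_Y(x)} = \frac{1+2\alpha_2}{1+2\alpha_1}$. Clearly, one can note that this quantity is smaller than one for $\alpha_1 > \alpha_2$, so $f_X(x)/f_Y(x)$ is decreasing in x, and hence $X \le_{lr} Y$.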

Entropy
The Shannon entropy of X can be expressed as $H(X) = -\sum_{x=0}^{\infty} P(X = x) \log P(X = x) = \frac{1+2\alpha}{2\alpha} \log(1+2\alpha) - \log(2\alpha)$. Table 3 gives some values of H(X) in terms of the parameter α. Figure 3 displays the plot of H(X) versus α. The entropy H(X) is monotonically decreasing for α ∈ (0, ∞), and it tends to zero as α → ∞.
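As a quick check on the entropy, the defining series can be compared with a closed form. The sketch below assumes the pmf $2\alpha/(1+2\alpha)^{x+1}$, for which the series sums to $\frac{1+2\alpha}{2\alpha}\log(1+2\alpha) - \log(2\alpha)$. Function names and the truncation point are illustrative.

```python
import math


def upa_pmf(x, alpha):
    """Assumed UPA pmf: 2a / (1 + 2a)^(x + 1)."""
    return 2.0 * alpha * math.exp(-(x + 1) * math.log1p(2.0 * alpha))


def entropy_series(alpha, x_max=5000):
    """Shannon entropy -sum p log p, truncated; zero-probability terms are skipped."""
    total = 0.0
    for x in range(x_max + 1):
        p = upa_pmf(x, alpha)
        if p > 0.0:
            total -= p * math.log(p)
    return total


def entropy_closed_form(alpha):
    """Closed form implied by the assumed pmf."""
    return (1.0 + 2.0 * alpha) / (2.0 * alpha) * math.log1p(2.0 * alpha) - math.log(2.0 * alpha)


for alpha in (0.35, 0.5, 1.5, 3.0, 50.0):
    print(alpha, entropy_series(alpha), entropy_closed_form(alpha))
```

The output can be used to inspect the decreasing behavior in α described in the text.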

Quantile Function
The quantile function (qf) of the UPA distribution is determined by inverting (6) as $Q(u) = -\frac{\log(1-u)}{\log(1+2\alpha)} - 1$ for 0 < u < 1 (17). The ath quantile ($x_a$) of X is obtained from Equation (17) by taking integer parts, where $[x]$ denotes the integer part of x and F is the cdf given in (6). The median of the UPA(α) distribution is $x_{0.5}$.
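A quantile routine for a discrete law needs a rounding convention. The sketch below assumes the cdf $F(x) = 1 - (1+2\alpha)^{-(x+1)}$ and returns the smallest integer x with F(x) ≥ u. That convention is an assumption made here for illustration and may differ from the integer-part rule used in the paper.

```python
import math


def upa_cdf(x, alpha):
    """Assumed UPA cdf: 1 - (1 + 2a)^(-(x + 1))."""
    return -math.expm1(-(x + 1) * math.log1p(2.0 * alpha))


def upa_quantile(u, alpha):
    """Smallest integer x >= 0 with F(x) >= u (assumed convention)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    # Start from the continuous inverse of the cdf, rounded up.
    x = max(0, math.ceil(-math.log1p(-u) / math.log1p(2.0 * alpha) - 1.0))
    # Correct for floating-point rounding near cell boundaries.
    while x > 0 and upa_cdf(x - 1, alpha) >= u:
        x -= 1
    while upa_cdf(x, alpha) < u:
        x += 1
    return x


for u in (0.25, 0.5, 0.9, 0.99):
    print(u, upa_quantile(u, alpha=1.5))
```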

Actuarial Measures
In this section, we determine the value at risk (VaR) and tail value at risk (TVaR) of the UPA(α) distribution.

VaR Measure
Let X denote a loss rv. The $\mathrm{VaR}_p$ of X at the 100p% level, say, $\pi_p$, is the 100p percentile of the distribution of X, where p ∈ (0, 1), and F(x) is the cdf of the UPA distribution given in (6). The quantity $\mathrm{VaR}_p$ of the UPA distribution comes from the qf (17) evaluated at u = p.

TVaR Measure
The TVaR of X at the 100p% security level, say, $\mathrm{TVaR}_p$, is the expected loss given that the loss exceeds $\pi_p$. The $\mathrm{TVaR}_p$ measure for the UPA(α) model follows from Equations (4) and (6).
Some VaR p and TVaR p values for the UPA distribution are listed in Table 4. The figures in Table 4 and the plots in Figure 4 indicate that the VaR and TVaR measures are increasing functions of α.
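The two risk measures can be approximated directly from the pmf. The following sketch rests on several assumptions that are not confirmed by the text: the pmf $2\alpha/(1+2\alpha)^{x+1}$, VaR taken as the smallest integer x with F(x) ≥ p, and TVaR taken as E(X | X > VaR_p). The function names and the truncation point `x_max` are hypothetical.

```python
import math


def upa_pmf(x, alpha):
    """Assumed UPA pmf: 2a / (1 + 2a)^(x + 1)."""
    return 2.0 * alpha * math.exp(-(x + 1) * math.log1p(2.0 * alpha))


def value_at_risk(p, alpha):
    """Smallest integer x with F(x) >= p (assumed convention)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x, cdf = 0, upa_pmf(0, alpha)
    while cdf < p:
        x += 1
        cdf += upa_pmf(x, alpha)
    return x


def tail_value_at_risk(p, alpha, x_max=20000):
    """E(X | X > VaR_p), computed from a truncated tail sum (assumed definition)."""
    var_p = value_at_risk(p, alpha)
    tail_prob = 0.0
    tail_mean = 0.0
    for x in range(var_p + 1, x_max + 1):
        prob = upa_pmf(x, alpha)
        tail_prob += prob
        tail_mean += x * prob
    return tail_mean / tail_prob


for p in (0.75, 0.90, 0.95):
    print(p, value_at_risk(p, alpha=0.5), tail_value_at_risk(p, alpha=0.5))
```

Other actuarial conventions for discrete losses exist, so the numbers produced here need not coincide with those in Table 4.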

Estimation
In this section, the parameter α is estimated by eight methods, and their performances are investigated via Monte Carlo simulations. The proposed estimators are determined from the maximum likelihood, moments, proportions, ordinary and weighted least-squares, Cramér-von Mises, right-tail Anderson-Darling, and percentiles methods. For all methods, let x 1 , . . . , x n be n independent observations from the UPA distribution.

Maximum Likelihood
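The log-likelihood function for α comes from (4) as follows: $\ell_n(\alpha) = n \log(2\alpha) - \left( n + \sum_{i=1}^{n} x_i \right) \log(1+2\alpha)$. Then, the maximum likelihood estimate (MLE) of α, say, $\hat{\alpha}$, is determined by maximizing $\ell_n(\alpha)$ with respect to this parameter as the solution of $\frac{n}{\alpha} - \frac{2\left( n + \sum_{i=1}^{n} x_i \right)}{1+2\alpha} = 0$. Under some regularity conditions, the distribution of $\hat{\alpha}$ can be approximated by the $N(\alpha, 1/I(\hat{\alpha}))$ distribution, where I(α) is the observed Fisher information.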
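The maximization can be sketched with a simple one-dimensional search. The code below assumes the log-likelihood $n\log(2\alpha) - (n + \sum x_i)\log(1+2\alpha)$ implied by the pmf $2\alpha/(1+2\alpha)^{x+1}$. Under that assumption the score equation has the root $n/(2\sum x_i)$, which is used here only as a cross-check. The golden-section routine, the search interval, and the toy counts are illustrative and are not the paper's data or software.

```python
import math


def log_likelihood(alpha, data):
    """Assumed UPA log-likelihood: n log(2a) - (n + sum x) log(1 + 2a)."""
    n = len(data)
    return n * math.log(2.0 * alpha) - (n + sum(data)) * math.log1p(2.0 * alpha)


def golden_section_max(f, lo, hi, tol=1e-10):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


toy_counts = [0, 1, 0, 2, 3, 0, 1, 5, 0, 1]  # made-up counts for illustration only
alpha_numeric = golden_section_max(lambda a: log_likelihood(a, toy_counts), 1e-6, 50.0)
alpha_closed = len(toy_counts) / (2.0 * sum(toy_counts))
print(alpha_numeric, alpha_closed)
```

The closed-form root coincides with the moment estimate of the next subsection, which is consistent with the near-identical simulation results reported for the MLE and MOE.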

Moments
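The moment estimate (MOE) of α, say, $\hat{\alpha}_{\mathrm{MOE}}$, follows from E(X) given in Section 2.1 as $\hat{\alpha}_{\mathrm{MOE}} = 1/(2\bar{x})$ if $\bar{x} > 0$. From the central limit theorem, $\sqrt{n}\,(\bar{X} - \mu) \to N(0, \sigma^{2})$ in distribution, where $\mu = 1/(2\alpha)$ and $\sigma^{2} = (1+2\alpha)/(4\alpha^{2})$. Based on the delta method, $\sqrt{n}\,(\hat{\alpha}_{\mathrm{MOE}} - \alpha) \to N(0, \alpha^{2}(1+2\alpha))$ in distribution. For any 0 < γ < 1, an approximate 100(1 − γ)% confidence interval for the parameter α comes from (30) as $\hat{\alpha}_{\mathrm{MOE}} \pm z_{\gamma/2} \, \hat{\alpha}_{\mathrm{MOE}} \, S/\sqrt{n}$, where $S = \sqrt{2\hat{\alpha}_{\mathrm{MOE}} + 1}$ and $z_{\gamma/2}$ is the upper γ/2 quantile of the standard normal distribution.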
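The moment estimate and its delta-method interval are simple to compute. The sketch below assumes E(X) = 1/(2α) and the asymptotic variance α²(1 + 2α)/n, both of which follow from the pmf $2\alpha/(1+2\alpha)^{x+1}$. The function name and the toy counts are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def moment_estimate_ci(data, gamma=0.05):
    """Return (estimate, lower, upper) for alpha under the assumed UPA moments."""
    n = len(data)
    xbar = fmean(data)
    if xbar <= 0:
        raise ValueError("the moment estimate requires a positive sample mean")
    alpha_hat = 1.0 / (2.0 * xbar)
    s = math.sqrt(2.0 * alpha_hat + 1.0)
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)  # upper gamma/2 normal quantile
    half_width = z * alpha_hat * s / math.sqrt(n)
    return alpha_hat, alpha_hat - half_width, alpha_hat + half_width


toy_counts = [0, 1, 0, 2, 3, 0, 1, 5, 0, 1]  # made-up counts for illustration only
print(moment_estimate_ci(toy_counts))
```

For small samples the lower limit can fall below zero. Truncating it at zero is one possible remedy, which this sketch deliberately leaves out.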

Proportions
We define the indicator function ν(·) (for i = 1, . . . , n) as $\nu(x_i) = 1$ if $x_i = 0$ and $\nu(x_i) = 0$ otherwise. Clearly, $y = n^{-1} \sum_{i=1}^{n} \nu(x_i)$ is the proportion of zeros in the sample, and it is an unbiased and consistent estimate of the probability $P(X = 0) = 2\alpha/(1+2\alpha)$. Then, the proportions estimate (POE) of α [17] follows by solving $y = 2\alpha/(1+2\alpha)$, which leads to the estimate $\hat{\alpha} = -y/[2(y-1)] = y/[2(1-y)]$.
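The proportions estimate translates into a few lines of code. This sketch uses the formula α = −y/[2(y − 1)] stated above. The function name and the toy counts are illustrative, and the handling of samples with only zeros is a choice made here rather than something the paper specifies.

```python
def proportion_estimate(data):
    """Proportions estimate of alpha from the observed share of zeros."""
    y = sum(1 for x in data if x == 0) / len(data)
    if y == 1.0:
        raise ValueError("all observations are zero, so the estimate is undefined")
    return -y / (2.0 * (y - 1.0))


toy_counts = [0, 1, 0, 2, 3, 0, 1, 5, 0, 1]  # made-up counts for illustration only
print(proportion_estimate(toy_counts))
```

A sample with no zeros gives an estimate of exactly zero, which lies outside the parameter space α > 0 and should be treated with care.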

Ordinary and Weighted Least-Squares
Let $X_{j:n}$ be the jth-order statistic in a sample of size n. We adopt lower cases for sample values. It is well-known that $E[F(X_{j:n})] = \frac{j}{n+1}$ and $V[F(X_{j:n})] = \frac{j(n-j+1)}{(n+1)^{2}(n+2)}$. The least-squares estimate (LSE) of α, $\hat{\alpha}_{\mathrm{LSE}}$, follows by minimizing $\sum_{j=1}^{n} \left[ F(x_{j:n}; \alpha) - \frac{j}{n+1} \right]^{2}$ in relation to α.
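Both least-squares criteria can be sketched with one objective function. The code assumes the cdf $F(x) = 1 - (1+2\alpha)^{-(x+1)}$ and, for the weighted version, the customary weights $1/V[F(X_{j:n})] = (n+1)^{2}(n+2)/[j(n-j+1)]$, which the surviving text does not state explicitly. The nested grid search, its bounds, and the toy counts are illustrative choices, not the optimizer used in the paper.

```python
import math


def upa_cdf(x, alpha):
    """Assumed UPA cdf: 1 - (1 + 2a)^(-(x + 1))."""
    return -math.expm1(-(x + 1) * math.log1p(2.0 * alpha))


def ls_objective(alpha, data, weighted=False):
    """Sum of squared gaps between F(x_(j)) and j / (n + 1), optionally weighted."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for j, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (j * (n - j + 1)) if weighted else 1.0
        total += weight * (upa_cdf(x, alpha) - j / (n + 1)) ** 2
    return total


def minimize_1d(f, lo=1e-3, hi=20.0, steps=2000, rounds=3):
    """Crude nested grid search: refine around the best grid point each round."""
    best = lo
    for _ in range(rounds):
        grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
        best = min(grid, key=f)
        width = (hi - lo) / steps
        lo, hi = max(best - width, 1e-9), best + width
    return best


toy_counts = [0, 1, 0, 2, 3, 0, 1, 5, 0, 1]  # made-up counts for illustration only
alpha_lse = minimize_1d(lambda a: ls_objective(a, toy_counts))
alpha_wlse = minimize_1d(lambda a: ls_objective(a, toy_counts, weighted=True))
print(alpha_lse, alpha_wlse)
```

A grid search is used because it makes no smoothness or unimodality assumption about the objective. A gradient-based routine would be faster when those assumptions hold.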

Cramér-von Mises
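The Cramér-von Mises estimate (CVME) (see [18,19]) is based on the difference between the estimate of the cdf and its empirical cdf [20]. The CVME of α follows by minimizing $C(\alpha) = \frac{1}{12n} + \sum_{j=1}^{n} \left[ F(x_{j:n}; \alpha) - \frac{2j-1}{2n} \right]^{2}$ with respect to α. Further, the CVME of α is also obtained by solving $\sum_{j=1}^{n} \left[ F(x_{j:n}; \alpha) - \frac{2j-1}{2n} \right] \frac{\partial F(x_{j:n}; \alpha)}{\partial \alpha} = 0$.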

Right-Tail Anderson-Darling
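The right-tail Anderson-Darling estimate (RADE) of α follows by minimizing $R(\alpha) = \frac{n}{2} - 2 \sum_{j=1}^{n} F(x_{j:n}; \alpha) - \frac{1}{n} \sum_{j=1}^{n} (2j-1) \log\left[ 1 - F(x_{n+1-j:n}; \alpha) \right]$ in relation to α. The RADE of α is also found by solving the equation $-2 \sum_{j=1}^{n} \Delta(x_{j:n}; \alpha) + \frac{1}{n} \sum_{j=1}^{n} (2j-1) \frac{\Delta(x_{n+1-j:n}; \alpha)}{1 - F(x_{n+1-j:n}; \alpha)} = 0$, where $\Delta(x; \alpha) = \partial F(x; \alpha)/\partial \alpha$.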

Percentiles
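The percentile estimate (PCE) is obtained by equating the sample percentile point to the population percentile. If $p_j$ denotes an estimate of $F(x_{j:n}; \alpha)$, the PCE of α, say, $\hat{\alpha}_{\mathrm{PCE}}$, follows by minimizing $\sum_{j=1}^{n} \left[ x_{j:n} - Q(p_j; \alpha) \right]^{2}$, where $p_j = \frac{j}{n+1}$ is an unbiased estimator of $F(x_{j:n}; \alpha)$ and $Q(\cdot\,; \alpha)$ is the qf given in (17).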

Simulation Study
We conducted a simulation study to evaluate the accuracy of the eight estimators discussed before. We generated samples of sizes n = 30, 75, 100, 150, 200, and 300 from the UPA distribution and then calculated the average values of the MLE, MOE, POE, LSE, WLSE, CVME, RADE, and PCE of α (AVEs), mean square errors (MSEs), average absolute biases (ABBs), and mean relative errors (MREs) when α = 0.35, 0.5, 1.5, and 3.0. The ABBs, MSEs, and MREs are given by $\mathrm{ABB} = \frac{1}{N} \sum_{i=1}^{N} |\hat{\alpha}_i - \alpha|$, $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (\hat{\alpha}_i - \alpha)^{2}$, and $\mathrm{MRE} = \frac{1}{N} \sum_{i=1}^{N} |\hat{\alpha}_i - \alpha|/\alpha$, where $\hat{\alpha}_i$ is the estimate from the ith replication. We repeated the simulation N = 5000 times to calculate these measures for the MLE, MOE, POE, LSE, WLSE, CVME, RADE, and PCE under the previous settings. The results reported in Tables 5-8 were found using the optim function (CG method) of the R software.
The numbers in Tables 5-8 reveal that the AVEs became closer to the true values of α when the sample size n increased, as expected. Further, the ABBs, MREs, and MSEs for all estimators decreased when n increased. Moreover, the MLE and MOE were the best estimators under these criteria. The MLE and MOE were almost identical in terms of the ABBs, MSEs, and MREs, and both had better performances than the other estimators. Additionally, the biases and MSEs of all estimators decayed toward zero when n increased. In summary, the performance ordering of the proposed estimators, from best to worst, was MLE, MOE, WLSE, LSE, POE, PCE, RADE, and CVME. Hence, maximum likelihood was adopted for the work in the next section.
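A scaled-down version of such a study can be sketched as follows. This is not the authors' R code. It assumes the pmf $2\alpha/(1+2\alpha)^{x+1}$, draws samples by inverse transform, and compares only two estimators: the closed-form root of the score equation, $n/(2\sum x_i)$, and the proportions estimate. The seed, the number of replications, and all function names are illustrative.

```python
import math
import random


def sample_upa(alpha, n, rng):
    """Inverse-transform sampling: X = floor(log U / log q), q = 1 / (1 + 2a)."""
    log_q = -math.log1p(2.0 * alpha)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always finite.
    return [int(math.log(1.0 - rng.random()) / log_q) for _ in range(n)]


def simulate(alpha, n, reps=1000, seed=1):
    """Monte Carlo AVE, ABB, MSE, and MRE for two estimators of alpha."""
    rng = random.Random(seed)
    estimates = {"MLE": [], "POE": []}
    for _ in range(reps):
        data = sample_upa(alpha, n, rng)
        total = sum(data)
        if total == 0:
            continue  # all-zero sample: both estimators are undefined
        y = data.count(0) / n
        estimates["MLE"].append(n / (2.0 * total))
        estimates["POE"].append(y / (2.0 * (1.0 - y)))
    summary = {}
    for name, values in estimates.items():
        m = len(values)
        summary[name] = {
            "AVE": sum(values) / m,
            "ABB": sum(abs(v - alpha) for v in values) / m,
            "MSE": sum((v - alpha) ** 2 for v in values) / m,
            "MRE": sum(abs(v - alpha) / alpha for v in values) / m,
        }
    return summary


for n in (30, 100, 300):
    print(n, simulate(alpha=0.5, n=n))
```

With more replications and the remaining six estimators added, the same loop structure would produce tables in the format of Tables 5-8.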

Modeling Biological Data
In this section, the UPA distribution is fitted to three real biological datasets and compared, in terms of goodness of fit, with the discrete Burr-Hatke (DBH) [21], discrete Poisson Lindley (DPL) [22], natural discrete Lindley (NDL) [8], discrete Pareto (DP) [5], PA, and Poisson distributions. The first dataset (Catcheside et al. [23]) refers to numbers of chromatid aberrations, and it was adopted by Hassan et al. [15] for comparing the Poisson and PA distributions. We aimed to test whether the UPA model is a reasonable choice for these data based on the chi-squared test, with the null hypothesis $H_0$ that the data follow the UPA distribution. Under the null hypothesis, the estimated probabilities were $\hat{p}_i = P(X = i; \hat{\alpha})$. The estimated expected frequencies were $\hat{e}_i = n \hat{p}_i$. The results of the chi-squared test are reported in Table 9 considering five cells, where $\chi^{2} = \sum_{i} (o_i - \hat{e}_i)^{2}/\hat{e}_i$, and $\hat{e}_i$ and $o_i$ are, respectively, the expected and observed frequencies for x = i. We cannot reject $H_0$ at the 5% significance level, so the UPA distribution is quite suitable for these data. We also report in Table 9 the results of the $\chi^{2}$ test for the UPA and other distributions based on the MLE of α. The UPA distribution provided the best fit since it resulted in the smallest $\chi^{2}$ value. This conclusion can also be confirmed by the log-likelihood test. Figure 5 displays the empirical pmf and the seven pmfs fitted to the first dataset, which confirms that the new distribution yielded the best fit to the current data.

Conclusions
New discrete distributions are very important for modeling real-life scenarios since the traditional ones have limited applications in failure times, reliability, counts, etc. We proposed and studied the uniform Poisson–Ailamujia (UPA) distribution, which can give better fits than other discrete distributions, especially when modeling over-dispersed count data. Eight methods were discussed to estimate its parameter, and Monte Carlo simulations showed that the maximum likelihood and moments methods are the best ones. The flexibility of the UPA model was illustrated empirically by means of three real biological datasets. Furthermore, the UPA distribution can be extended in some ways. For example, the transmuted UPA, exponentiated UPA, beta UPA, and Kumaraswamy UPA distributions can be defined to provide more flexibility with two or three parameters and to increase the potential applicability of the UPA distribution. It is sometimes difficult to measure lifetimes or counts on a continuous scale. In practice, we come across situations where lifetimes are discrete random variables, for example, the number of days that COVID-19 patients stay in hospital beds, the number of hospital beds occupied by coronavirus patients in a hospital, and the number of comorbidities in these patients. We have pointed out examples from epidemiology, but the distribution can be applied in several other areas.