A Novel Generalization of Zero-Truncated Binomial Distribution by Lagrangian Approach with Applications for the COVID-19 Pandemic

Abstract: The importance of Lagrangian distributions and their applicability to real-world events have been highlighted in several studies. In light of this, we create a new zero-truncated Lagrangian distribution. It is presented as a generalization of the zero-truncated binomial distribution (ZTBD) and is hence named the Lagrangian zero-truncated binomial distribution (LZTBD). The moments, probability generating function, factorial moments, and skewness and kurtosis measures of the LZTBD are discussed. We also show that the finite mixture of the new model is identifiable. The unknown parameters of the LZTBD are estimated using the maximum likelihood method. A broad simulation study is carried out to evaluate the performance of the maximum likelihood estimates. The likelihood ratio test is used to assess the significance of the additional parameter in the new model. Six COVID-19 datasets are used to demonstrate the LZTBD's applicability, and we conclude that the LZTBD is very competitive in terms of fit.


Introduction
Certain discrete distributions whose support is a set of positive integers are known as zero-truncated discrete distributions (ZTDDs). ZTDDs are used in ecology to represent data relating to counts, such as the number of flower heads, fly eggs, European red mites, or the number of times snowshoe hares were collected over seven days. These distributions are also employed in sociology to simulate data such as the size of human groups in parks, beaches, and public locations. As a result, ZTDDs have applications in practically every discipline of study, including biology, medicine, psychology, demography, and political science. In particular, the zero-truncated Poisson distribution (ZTPD) was used in [1] to analyze the number of eggs and gall-cell counts in flower heads. The authors of [2] used the ZTPD to model deer hunting in California. The author of [3] employed the zero-truncated negative binomial distribution (ZTNBD) to model the number of children ever born to a sample of moms over 40 years old; additionally, the authors of [4] used the ZTNBD in a regression model to treat over-dispersed count data of ischemic stroke hospitalizations. The author of [5] analyzed stroke count data based on the ZTPD, ZTNBD, and zero-truncated generalized negative binomial distribution (ZTGNBD). The application of the ZTNBD in the investigation of rare species abundance and hospital stays was discussed in [6]. The authors of [7] considered the use of ZTBD as a randomization device.
Stats 2022, 5

Considering the health aspect, many different diseases, ranging from the common cold to much more dangerous ailments such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS), can be caused by the large family of viruses known as coronaviruses. The first cases of the novel coronavirus were found in Wuhan, China, in 2019, and the World Health Organization (WHO) has declared it a pandemic. A coordinated international effort has been launched to halt the virus from spreading further, and the scientific community has contributed by starting various investigations. When it comes to modeling phenomena, statisticians play a critical role, and several attempts have already been made in the statistical literature. To estimate the daily new COVID-19 cases in China, the author of [8] used a mathematical model called the SIR distribution. The authors of [9] developed a discrete version of the generalized Lindley distribution to model the daily new cases and deaths in COVID-19 count data. A discrete type-2 half logistic exponential distribution was presented in [10] for estimating the number of COVID-19 deaths in Pakistan and Saudi Arabia. To model COVID-19 data in Singapore, the authors of [11] employed a discrete Marshall-Olkin inverted Topp-Leone distribution. Following the outbreak of such a widespread epidemic, at least one new positive case is reported daily in practically all nations. Since such daily counts are strictly positive, ZTDDs are, to the best of our knowledge, the most appropriate statistical models for such a situation. As far as we know, no study has attempted to model regularly occurring positive cases using ZTDDs. Hence, in this article, our aim is to propose a novel ZTDD to model the daily new positive cases. Furthermore, based on the same ZTDD, we also model the number of deaths attributable to COVID-19 in a day.
On the other hand, Lagrangian distributions are a subclass of Lagrangian expansions, which were initially introduced in [12]. The authors of [13,14] introduced a discrete Lagrangian family (DLF) of probability distributions, which encompasses a vast and important class of probability distributions. It includes many families. Additionally, the authors of [14] showed that, under certain conditions, all discrete Lagrangian distributions converge to the normal and inverse Gaussian distributions. The author of [15], who discovered the Lagrangian negative binomial distribution, demonstrated its utility in a queuing process. The authors of [16] created the Lagrangian Katz family. The authors of [17] looked at how Lagrangian probability distributions can be employed to solve inferential difficulties in random mapping theory. The generalized Poisson gamma dependency model was developed in [18] using Lagrangian probability models. For collisional turbulent fluid-particle flows, the authors of [19] used the Lagrangian probability density function (pdf) models. The above-mentioned importance of the Lagrangian distributions immensely motivated us to propose a new ZTDD based on the Lagrangian approach. Therefore, based on the Lagrangian technique, we propose a unique ZTDD known as the Lagrangian zero-truncated binomial distribution (LZTBD) that can serve as a discrete model for a variety of count datasets.
The remaining parts of the paper are organized as follows: Section 2 presents some preliminaries of the Lagrangian probability distribution. In Section 3, we discuss the definition and properties of the LZTBD. The finite mixture of the new Lagrangian model is displayed in Section 4. In Section 5, we derive the maximum likelihood (ML) estimation method to estimate the unknown parameters of the LZTBD. The significance of the additional parameter is tested by using a generalized likelihood ratio test in Section 6. The finite sample performance of the ML estimation method is analyzed in Section 7 with a simulation study. Six real-world datasets are considered in Section 8 to demonstrate the usefulness of the proposed model. The concluding remarks are given in Section 9.
Consider the Lagrange expansion given in (1), where $g_1(z)$ and $g_2(z)$ are successively differentiable analytic functions over $[-1, 1]$ such that $g_1(1) = g_2(1) = 1$, $g_1(0) \neq 0$, and $g_2(0) \geq 0$. A new type of probability mass function (pmf) was defined in [13,22], and it is indicated as follows: provided that $\sum_{r=0}^{\infty} b_r$ is finite. Putting $z = u = 1$ into (1), we obtain which gives, from (2), This pmf defines the DLF in the broad sense. The corresponding probability generating function (pgf) is given by where $z = u\, g_1(z)$. Given the applications of the DLF built with $g_1(z)$ and $g_2(z)$ in (3), it is worthwhile to investigate further distributions obtained with a new function $g_2(z)$. This is the basis for the distribution proposed in this study, which is presented below.

Construction of Lagrangian Zero-Truncated Binomial Distribution
The LZTBD is introduced in this section as a new member of the DLF.

Proposition 1. Assume that the random variable (rv) $X$ follows the LZTBD, in which $0 < \alpha < \beta^{-1}$, $0 < \beta < 1$, and $\gamma > 0$. Then, the pmf of $X$ is given by (5), where $\binom{y}{x}$ stands for the generalized binomial coefficient, that is, $\binom{y}{x} = \frac{y(y-1)\cdots(y-x+1)}{x!}$.

Proof. The functions $g_1(z)$ and $g_2(z)$ used in the construction satisfy the statements given in Section 2. Using the DLF given in (3), the pmf of the LZTBD can be derived as follows: Thus, the proof is completed.

The distribution described in (5) is denoted as LZTBD(α, β, γ), and we write X ∼ LZTBD(α, β, γ) to indicate that an rv X follows the LZTBD with parameters α, β, and γ. Some special cases of the LZTBD are described below:
• For α → 0, the LZTBD(α, β, γ) reduces to the one-parameter ZTBD. In this sense, the LZTBD is a generalization of the ZTBD;
• For γ = 1, the LZTBD(α, β, γ) reduces to the Lagrangian weighted Consul distribution given in [23].

Figure 1 portrays the pmf of the LZTBD for different values of the parameters α, β, and γ. The hazard rate function (hrf) of the LZTBD is obtained by substituting the pmf in the following equation: From (6), it is clear that a closed-form expression of the hrf is difficult to determine; therefore, in order to determine the shape of the hrf, we sketch its graph. Figure 2 shows that the hrf of the LZTBD takes all of the typical shapes, such as increasing, decreasing, and bathtub shapes, for varying parameter values. Furthermore, the choice of various specific functions for $g_2(z)$ will provide various members of the DLF. In the following, we list some DLF distributions available in the literature.

Weighted Consul Distribution
If we take $g_1(z) = (1 - \beta + \beta z)^{\alpha}$ and $g_2(z) = z$, based on (3), the pmf of the considered distribution can be derived as which is the pmf of the weighted Consul distribution (see [23]).

Weighted Delta Binomial Distribution
If we take $g_1(z) = (1 - \beta + \beta z)^{\alpha}$ and $g_2(z) = z^{\gamma}$, based on (3), the pmf of the considered distribution is obtained as which corresponds to the pmf of the weighted delta binomial distribution (see [23]).

Proposition 2.
Let $X$ be an rv following the LZTBD. Then, the median of $X$ is the smallest integer $m \geq 1$ such that
$$\sum_{x=1}^{m} f(x) \geq \frac{1}{2},$$
where $f$ denotes the pmf given in (5).
Proof. By definition, the median $m$ is the smallest integer in the support of the rv, i.e., {1, 2, . . .}, such that $P(X \leq m) \geq \frac{1}{2}$, which is equivalent to the desired result.

Proposition 3. Let $X$ be an rv following the LZTBD. Then, the mode of $X$, denoted by $x_m$, exists in {1, 2, . . .} and lies in the case:
Proof. We must find the integer $x = x_m$ for which $f(x)$ has the greatest value. That is, First, note that $f(x)$ can also be written as .
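The defining property of the median can be checked numerically by accumulating the pmf until the cumulative probability reaches 1/2. The sketch below is only illustrative: because the LZTBD pmf is not reproduced in this text, it uses the standard zero-truncated binomial pmf with an integer number of trials as a stand-in, and the names `ztb_pmf` and `median_from_pmf` are hypothetical.

```python
from math import comb


def ztb_pmf(x: int, n: int, p: float) -> float:
    """Standard zero-truncated binomial pmf on {1, ..., n} (stand-in, not the LZTBD)."""
    if x < 1 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x) / (1 - (1 - p) ** n)


def median_from_pmf(pmf, max_x: int) -> int:
    """Return the smallest integer m >= 1 such that P(X <= m) >= 1/2."""
    cdf = 0.0
    for m in range(1, max_x + 1):
        cdf += pmf(m)
        if cdf >= 0.5:
            return m
    raise ValueError("cumulative probability did not reach 1/2 within max_x")


# Illustrative parameter values (assumed, not taken from the paper).
m = median_from_pmf(lambda x: ztb_pmf(x, 10, 0.3), 10)
```

Any other pmf supported on the positive integers can be passed to `median_from_pmf` in the same way, provided `max_x` is large enough for the cumulative probability to reach 1/2.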

Proposition 4.
The pgf of an rv X following the LZTBD is expressed as Proof. Using (4), the pgf of the LZTBD is of the following form: Thus, the proof is complete.
Corollary 1. The moment generating function (mgf) of an rv $X$ following the LZTBD is obtained by putting $z = e^{s}$ and $u = e^{k}$ in (11). That is, where $s = k + \alpha \log(1 - \beta + \beta e^{s})$.
Proposition 5. Let $X_1, X_2, \ldots, X_n$ be $n$ independently and identically distributed (iid) rvs following the LZTBD(α, β, γ). Then, the random sum $V = \sum_{i=1}^{n} X_i$ has the following pgf: Proof. Based on the pgf of the LZTBD given in (11), the pgf of the rv $V$ becomes This completes the proof.
Proof. By definition, the $r$th factorial moment of the LZTBD(α, β, γ) is obtained by successively differentiating $\Psi(u)$ given in (11) $r$ times with respect to (wrt) $u$ and by putting $u = z = 1$. First, note that Taking the first derivative wrt $u$ on both sides, we obtain Taking the second derivative of the above equation wrt $u$ on both sides, we obtain Proceeding in a similar manner, the $r$th derivative is of the following form: Substituting (15), we obtain (14).

Proposition 7.
The mean ($\mu$) and variance ($\sigma^2$) of the LZTBD are of the following forms, respectively, and Proof. Using (14), we obtain On the other hand, we have The desired expressions are obtained.
A normalized measure of dispersion can be obtained by utilizing the variance-to-mean relationship. This measure is the well-known index of dispersion (IOD). The next result expresses it for the LZTBD, among others.
Proposition 8. The IOD and coefficient of variation (CV) for the LZTBD are given as, respectively, Analogously, the CV is given by
A probabilistic model's degree of asymmetry and flatness are commonly assessed by its skewness and kurtosis coefficients, respectively. The skewness is the third central moment normalized by the variance raised to the power of 3/2, whereas the kurtosis is the fourth central moment divided by the square of the variance. The mean, variance, CV, IOD, skewness, and kurtosis for selected parameter values of the LZTBD(α, β, γ) are summarized in Table 1. From this table, it is evident that the LZTBD can be both over-dispersed (IOD > 1) and under-dispersed (IOD < 1) for varying parameter values. It is also noted that the LZTBD is mainly right-skewed and has several kurtosis levels.
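The measures described above can be evaluated directly from any pmf with finite support by summation. The following sketch assumes the usual definitions (IOD as variance over mean, CV as standard deviation over mean) and, since the LZTBD pmf is not reproduced here, uses the standard zero-truncated binomial pmf as a stand-in; the function names are hypothetical.

```python
from math import comb, sqrt


def ztb_pmf(x: int, n: int, p: float) -> float:
    """Standard zero-truncated binomial pmf on {1, ..., n} (stand-in, not the LZTBD)."""
    if x < 1 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x) / (1 - (1 - p) ** n)


def summary_measures(pmf, support) -> dict:
    """Mean, variance, IOD, CV, skewness, and kurtosis of a pmf on a finite support."""
    mean = sum(x * pmf(x) for x in support)

    def central(r: int) -> float:
        # r-th central moment
        return sum((x - mean) ** r * pmf(x) for x in support)

    var = central(2)
    return {
        "mean": mean,
        "variance": var,
        "IOD": var / mean,                    # index of dispersion
        "CV": sqrt(var) / mean,               # coefficient of variation
        "skewness": central(3) / var**1.5,    # third central moment / variance^(3/2)
        "kurtosis": central(4) / var**2,      # fourth central moment / variance^2
    }


# Illustrative parameter values (assumed, not taken from Table 1).
measures = summary_measures(lambda x: ztb_pmf(x, 10, 0.3), range(1, 11))
```

For this stand-in the IOD is below 1, i.e., the zero-truncated binomial with these values is under-dispersed; the over-dispersed cases reported in Table 1 arise from the additional flexibility of the LZTBD and are not reproduced by this sketch.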

Identifiability
Finite mixture models have received a lot of attention in recent years in real contexts. In astronomy, biology, genetics, medicine, psychiatry, marketing, and other fields, mixture models are widely utilized (see [24]). We derive finite mixtures of the LZTBD(α, β, γ) in this section. This mixed model may be appropriate in the context of future initiatives.
Let $Y$ be a discrete rv with the pmf
$$h(y) = \sum_{i=1}^{g} l_i h_i(y),$$
where $l_i > 0$ for $i = 1, 2, \ldots, g$ and $\sum_{i=1}^{g} l_i = 1$. Then, we state that $Y$ has a mixture distribution and $h(y)$ is a finite mixture of distributions. The constants $l_1, l_2, \ldots, l_g$ are known as the mixing weights, and $h_1(y), h_2(y), \ldots, h_g(y)$ as the components of the mixture. We denote by $\Theta$ the collection of all distinct parameters in the components.
A distribution with pmf given in (20) is called the Lagrangian zero-truncated binomial mixture distribution with g components (LZTBMDg).
The following theorem from [25] is adopted to construct the identifiability conditions of the finite mixture model.
Theorem 1. A necessary and sufficient condition for ∆ to be identifiable is that ∆ should be linearly independent over the field of real numbers.
Proof. The proof is stated in [25], hence, it is not included here.
Proof. For the first step, take $g = 2$ and consider the following equation: where $b_1$ and $b_2$ are any two arbitrary real numbers, $F_1(y) = \sum_{j=1}^{y} h(j)$ and $F_2(y) = \sum_{j=1}^{y} \varphi(j)$ for $y = 1, 2, \ldots$, in which $\varphi(j)$ is obtained from $h(j)$ by replacing $\alpha_j$ by $\tau_j$, $\beta_j$ by $\delta_j$, and $\gamma_j$ by $\omega_j$.
Assume that for each $i = 1, 2$ and $\alpha_i = \tau_i$, $\beta_i = \delta_i$ and $\gamma_i = \omega_i$. Thus, for $l_1 = l$, we have and Now, from (22)-(24), we obtain the following equations: Solving (25) and (26), we obtain $b_1$ Hence, by (27), we have $b_1 = 0$ and thus, $b_2 = 0$. Therefore, it may be inferred from Theorem 2 that $F_1(y)$ and $F_2(y)$ are linearly independent. Since the argument may be applied to any positive integer $g$, the proof follows.
Proof. The proof follows simply from Definition 1, given the pgf of the LZTBD mentioned in (11).

Estimation of Parameters
In this section, we estimate the unknown parameters of the LZTBD by the ML estimation method.
It is worth mentioning that the model corresponding to the LZTBD(α, β, γ) is a three-parameter model with parameters α, β, and γ. Consider a random sample of size $n$ from the LZTBD and let the observed frequencies be $n_x$, $x = 1, 2, \ldots, k$, so that $\sum_{x=1}^{k} n_x = n$, where $k$ is the largest observed value having a non-zero frequency. Then, the likelihood function is given by Therefore, the log-likelihood function is given by where $\bar{x} = \frac{1}{n} \sum_{x=1}^{k} x\, n_x$. The ML estimates (MLEs) are defined by maximizing $L_n$ wrt the parameters. Let us denote by $\hat{\alpha}$, $\hat{\beta}$, and $\hat{\gamma}$ the MLEs of α, β, and γ, respectively. On the computational side, the score vector is where the partial derivatives of $L_n$ wrt the parameters are .
The MLEs can then be found by setting the score vector to zero, i.e., $S = 0$, and solving the resulting equations simultaneously. These equations cannot be solved analytically; they can be solved numerically with the R statistical software by means of iterative techniques such as the Newton-Raphson algorithm.
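To illustrate the kind of iteration involved, the sketch below applies Newton-Raphson to a deliberately simpler problem than the three-parameter LZTBD: the standard zero-truncated binomial with a known number of trials `n` and unknown success probability `p`. For that one-parameter model the score equation reduces to matching the sample mean, np / (1 − (1 − p)^n) = x̄, which has an interior solution when 1 < x̄ < n. The function names and the frequency data are hypothetical, and Python is used here only for illustration, whereas the study itself relies on R.

```python
def ztb_mean(n: int, p: float) -> float:
    """Mean of the standard zero-truncated binomial with n trials and probability p."""
    return n * p / (1.0 - (1.0 - p) ** n)


def fit_ztb_newton(freq: dict, n: int, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Newton-Raphson solution of the score equation ztb_mean(n, p) = sample mean.

    freq maps each observed value x (1 <= x <= n) to its observed frequency n_x.
    """
    total = sum(freq.values())
    xbar = sum(x * c for x, c in freq.items()) / total
    p = xbar / n  # starting value: the untruncated binomial estimate
    for _ in range(max_iter):
        q = 1.0 - p
        d = 1.0 - q**n
        g = n * p / d - xbar                             # function whose root is sought
        g_prime = n * (d - n * p * q ** (n - 1)) / d**2  # its derivative wrt p
        step = g / g_prime
        # Keep the iterate strictly inside (0, 1).
        p = min(max(p - step, 1e-10), 1.0 - 1e-10)
        if abs(step) < tol:
            break
    return p


# Hypothetical frequency table {value: frequency}, for illustration only.
freq = {1: 5, 2: 8, 3: 6, 4: 3, 5: 1}
p_hat = fit_ztb_newton(freq, n=10)
```

In the three-parameter case the same idea applies to the full score vector, with the scalar derivative replaced by the Hessian matrix of the log-likelihood.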

Likelihood Ratio Test
In this section, we test the significance of an additional parameter included in the LZTBD using the generalized likelihood ratio test (GLRT) (see [26]).
More precisely, to test the significance of the parameter α of the LZTBD(α, β, γ), we consider the GLRT procedure. We test the null hypothesis $H_0$: "X follows the ZTBD" against the alternative hypothesis $H_1$: "X follows the LZTBD". Here, the test statistic is given by
$$-2\log\Lambda = 2\left(\log L(\hat{\Theta}) - \log L(\hat{\Theta}^{*})\right), \qquad (29)$$
where $\hat{\Theta}$ is the vector of MLEs of $\Theta = (\alpha, \beta, \gamma)$ with no constraints, and $\hat{\Theta}^{*}$ is the vector of MLEs of $\Theta$ under $H_0$.
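The computation behind such a test can be sketched in a few lines. The example assumes the usual large-sample approximation in which the statistic is compared with a chi-square distribution with one degree of freedom (one additional parameter); this reference distribution is an assumption of the sketch, and the log-likelihood values used are hypothetical.

```python
from math import erfc, sqrt


def lr_statistic(loglik_full: float, loglik_null: float) -> float:
    """Likelihood ratio statistic: twice the gain in maximized log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)


def chi2_sf_1df(x: float) -> float:
    """Upper tail P(W > x) for W ~ chi-square with 1 degree of freedom.

    Uses the identity P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    if x <= 0.0:
        return 1.0
    return erfc(sqrt(x / 2.0))


# Hypothetical maximized log-likelihoods under H1 (full model) and H0 (reduced model).
stat = lr_statistic(loglik_full=-98.0, loglik_null=-100.0)
p_value = chi2_sf_1df(stat)
```

The null hypothesis is rejected at level a whenever `p_value` is smaller than a.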

Simulation
We perform a simulation study by generating observations using the R software to examine the asymptotic behavior of the MLEs of the parameters of the LZTBD. Here, we apply the inverse transformation method to simulate an LZTBD random sample (see [27]). The algorithm is as follows:
Step 1: Generate a random number U from the uniform U(0, 1) distribution.
Step 2: Set i = 1, P = P(X = 1), and F = P.
Step 3: If U < F, set X = i and stop.
Step 4: Otherwise, set i = i + 1, update P to P(X = i), set F = F + P, and go to Step 3.
Conceptually, P is the probability that X = i, and F is the probability that X is less than or equal to i. Additionally, indices such as the average MLEs, absolute biases, and mean squared errors (MSEs) are calculated using the following equations:
• Average value of MLEs: $\mathrm{MLE}(\hat{a}) = \frac{1}{N} \sum_{i=1}^{N} \hat{a}_i$.
• Absolute average bias:
Here, a = α or β or γ, and the index i represents the i th generated sample. The simulation takes into account sample sizes of n = 15, 50, 175, 500, and 1000 for two different sets of parameter values of the LZTBD. We repeat the process N = 1000 times and report the estimates and MSEs in Table 2. From this table, one can infer that the estimates are quite stable and close to the true parameter values for these sample sizes. A decreasing trend is observed in the absolute average bias and MSEs as the sample size increases. Hence, the performance of the ML estimation is quite consistent and reliable.
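The sequential inverse transformation method and the accuracy indices can be sketched as follows. This is a minimal illustration under stated assumptions: the standard zero-truncated binomial pmf stands in for the LZTBD pmf, the sample mean stands in for the MLEs, the absolute bias is taken as the absolute difference between the average estimate and the true value (one common convention), and all function names and settings are hypothetical. Python is used for illustration only.

```python
import random
from math import comb


def ztb_pmf(x: int, n: int, p: float) -> float:
    """Standard zero-truncated binomial pmf on {1, ..., n} (stand-in, not the LZTBD)."""
    if x < 1 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x) / (1 - (1 - p) ** n)


def sample_inverse_transform(pmf, max_x: int, rng: random.Random) -> int:
    """Draw one value by walking up the cdf until it exceeds a uniform number."""
    u = rng.random()   # uniform random number on [0, 1)
    i = 1
    prob = pmf(i)      # P: probability that X = i
    cdf = prob         # F: probability that X <= i
    # The bound on i guards against floating-point rounding in the cdf.
    while u >= cdf and i < max_x:
        i += 1
        prob = pmf(i)
        cdf += prob
    return i


def absolute_bias(estimates, true_value: float) -> float:
    """Absolute difference between the average estimate and the true value."""
    return abs(sum(estimates) / len(estimates) - true_value)


def mse(estimates, true_value: float) -> float:
    """Mean squared error of the estimates around the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


rng = random.Random(2022)            # fixed seed for reproducibility
n_trials, p = 10, 0.3                # illustrative parameter values
true_mean = n_trials * p / (1 - (1 - p) ** n_trials)

# 200 replications of samples of size 50; each sample mean estimates true_mean.
estimates = []
for _ in range(200):
    sample = [
        sample_inverse_transform(lambda x: ztb_pmf(x, n_trials, p), n_trials, rng)
        for _ in range(50)
    ]
    estimates.append(sum(sample) / len(sample))

bias = absolute_bias(estimates, true_mean)
error = mse(estimates, true_mean)
```

Repeating the last part for increasing sample sizes shows the decreasing pattern in bias and MSE that a consistent estimator is expected to display.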

Applications and Empirical Study
The aim of this section is to show the empirical importance of the LZTBD. We use six real COVID-19 datasets from different nations, namely Italy, Senegal, Pakistan, Saudi Arabia, Belgium, and Ethiopia, to illustrate the quality of the LZTBD fit. The graphical method used to determine the shape of the hrf of a dataset is based on the Total Time on Test (TTT) plot. Convex, concave, convex-then-concave, and concave-then-convex empirical TTT plots correspond to decreasing, increasing, bathtub-shaped, and upside-down bathtub-shaped hrfs, respectively (see [28]). We employ the statistical software R to evaluate these datasets numerically. To show the possible benefit of the LZTBD, the following competing distributions are considered:
• ZTBD with parameters β and γ, which has the following pmf:
• Zero-truncated generalized binomial distribution (ZTGBD) with parameters α, β, and γ, which has the following pmf:
• Zero-truncated discrete two-parameter Poisson-Lindley distribution (ZTDTPPLD) with parameters γ and β (see [29]), which has the following pmf:
• Zero-truncated Poisson-Lindley distribution (ZTPLD) with parameter α (see [30]), which has the following pmf:
• Intervened generalized Poisson distribution (IGPD) with parameters α, β, and γ (see [31]), which has the following pmf:

COVID-19 Data Set from Italy
Italy's 61-day COVID-19 data collection, conducted from 13 June to 12 August 2021, is accessible in [11]. Daily newly reported cases are included in this data collection. The descriptive measures of the real data set, which include sample size (n), minimum (min), first quartile (Q 1 ), median (M d ), third quartile (Q 3 ), maximum (max), and inter-quartile range (IQR), are given in Table 3. In addition, Figure 3 shows an empirical TTT plot of the data, from which we deduce an increasing hrf. We compare the competitive distributions to the LZTBD using the statistical techniques provided, namely, the negative log-likelihood (− log L), Akaike information criterion (AIC), Bayesian information criterion (BIC), and χ 2 statistic. Table 4 displays the corresponding MLEs, model adequacy measures, and χ 2 values. The LZTBD has lower model adequacy measures and χ 2 values than the other distributions studied, as shown in Table 4. As a result, the suggested model is the most appropriate for modeling the given COVID-19 data. It is interesting to note that the empirical mean, variance, and IOD of this COVID-19 data set are 22.6229, 160.3388, and 7.0874, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 21.6234, 160.3248, and 7.4144, respectively. Thus, the empirical and theoretical means are almost the same, and the empirical and theoretical variances and IOD values are close to each other. In the case of GLRT, the calculated value based on the test statistic (29) is 2(−234.5071 + 485.9380) = 251.4309 (p-value = 0.0001). As a result, at any level greater than 0.0001, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.
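The information criteria used in the comparison follow from the maximized log-likelihood through the standard formulas AIC = 2k − 2 log L and BIC = k log n − 2 log L, where k is the number of estimated parameters and n the sample size. The sketch below applies them to the LZTBD log-likelihood value that appears in the GLRT computation for the Italian data (−234.5071), with k = 3 and n = 61; the resulting numbers are computed here from these formulas and are not copied from Table 4. The function names are hypothetical.

```python
from math import log


def aic(neg_loglik: float, k: int) -> float:
    """Akaike information criterion from the negative log-likelihood."""
    return 2 * k + 2 * neg_loglik


def bic(neg_loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion from the negative log-likelihood."""
    return k * log(n) + 2 * neg_loglik


# Values taken from the text: -log L = 234.5071, three parameters, 61 observations.
aic_lztbd = aic(234.5071, k=3)
bic_lztbd = bic(234.5071, k=3, n=61)

# Empirical index of dispersion from the reported variance and mean.
iod_empirical = 160.3388 / 22.6229
```

Smaller values of either criterion indicate a better trade-off between fit and the number of parameters.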

COVID-19 Data Set from Senegal
The LZTBD is fitted to another set of data for the COVID-19 in Senegal for 56 days of infection, which was recorded from 29 March 2021 to 23 May 2021. These data, which show the daily incidence of COVID-19 cases, were gathered by the World Health Organization (WHO) and are accessible at http://covid19.who.int/data, (accessed on 24 August 2022). Table 5 includes some information as well as descriptive statistics for these data. In addition, Figure 4 shows an empirical TTT plot of the data from which an increasing hrf is revealed.
We compare the competitive distributions to the suggested distribution using the statistical techniques provided, specifically, the − log L, AIC, BIC, and χ 2 values. Table 6 displays the corresponding MLEs, model adequacy measures, and χ 2 values of the LZTBD. The LZTBD's model adequacy measures and χ 2 values are less than those of the other examined models. As a result, the suggested model is the most appropriate for modeling the COVID-19 data from Senegal. It is worth noting that the empirical mean, variance, and IOD of this COVID-19 dataset are 46.54, 394.326, and 8.47, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 46.4, 394.324, and 8.49, respectively. Thus, the empirical and theoretical means are almost the same, and the empirical and theoretical variances and IOD values are very close to each other.
In the case of GLRT, the calculated value based on the test statistic (29) is 2(−244.0212 + 584.8307) = 340.8095 (p-value = 0.0004). As a result, at any level greater than 0.0004, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.

COVID-19 Data Set from Pakistan
The LZTBD is fitted to another set of data for the COVID-19 in Pakistan for 95 days of infection, which was recorded from 23 May 2021 to 25 August 2021. These data, which are available at http://covid19.who.int/data, (accessed on 24 August 2022), were acquired by the WHO and show the daily incidence of COVID-19 cases. Table 7 contains some information and descriptive statistics for these data. In addition, Figure 5 shows an empirical TTT plot of the data, showing an increasing hrf. Using the statistical methods offered, specifically the − log L, AIC, BIC, and χ 2 values, we compare the competing distributions to the suggested distribution. Table 8 displays the corresponding MLEs, model adequacy measures, and χ 2 values of the LZTBD. The LZTBD's model adequacy measures and χ 2 values are less than those of the other examined models. The suggested model is, therefore, the most suitable one to model the COVID-19 data from Pakistan. In addition, let us mention that the empirical mean, variance, and IOD of this COVID-19 dataset are 52.6842, 538.5801, and 10.2228, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 52.6704, 538.5509, and 10.2249, respectively. We thus observe that the empirical and theoretical means are almost equal, and the empirical and theoretical variances and IOD values are very close to each other.
In the case of GLRT, the calculated value based on the test statistic (29) is 2(−431.4451 + 1313.275) = 881.8299 (p-value = 0.0001). As a result, at any level greater than 0.0001, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.

COVID-19 Data Set from Saudi Arabia
The LZTBD is fitted to another set of data of COVID-19 mortality numbers in Saudi Arabia for 83 days of infection, which was recorded from 30 May to 20 August 2020. The WHO gathered these data, which represent the number of deaths per day, and they are available at http://covid19.who.int/data, (accessed on 24 August 2022). Table 9 contains some information and descriptive statistics for these data. In addition, Figure 6 shows an empirical TTT plot of the data and it shows an increasing hrf. We compare the competitive distributions to the suggested distribution using the statistical techniques provided, specifically, the − log L, AIC, BIC, and χ 2 values. Table 10 displays the corresponding MLEs, model adequacy measures, and χ 2 values of the LZTBD. The LZTBD's model adequacy measures and χ 2 values are less than those of the other examined models. For modeling the COVID-19 data from Saudi Arabia, the suggested model is therefore the most suitable. Furthermore, the empirical mean, variance, and the IOD values of this COVID-19 dataset are 36.9277, 70.5313, and 1.9099, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 36.8724, 71.0567, and 1.9270, respectively. Hence, the empirical and theoretical means are almost the same, and the empirical and theoretical variances and IOD values are very close to each other.
In the case of GLRT, the calculated value based on the test statistic (29) is 2(−294.288 + 415.1392) = 120.8512 (p-value = 0.0002). As a result, at any level greater than 0.0002, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.

COVID-19 Data Set from Belgium
A different set of data on the COVID-19 infection in Belgium for 425 days (more than a year), which was recorded from 22 July 2021 to 19 September 2022, is fitted using the LZTBD. The WHO gathered these data, which represent the number of deaths per day, and they are accessible at http://covid19.who.int/data, (accessed on 24 August 2022). Table 11 contains some information and descriptive statistics for these data. In addition, Figure 7 shows an empirical TTT plot from which we can distinguish an increasing hrf. We compare the competitive distributions to the suggested distribution using the statistical techniques provided, specifically, the − log L, AIC, BIC, and χ 2 values. Table 12 displays the corresponding MLEs, model adequacy measures, and χ 2 values of the LZTBD. The LZTBD's model adequacy measures and χ 2 values are less than those of the other examined models. As a result, the suggested model is the most appropriate for modeling the COVID-19 data from Belgium. It is worth noting that the empirical mean, variance, and IOD of this COVID-19 dataset are 17.122, 178.419, and 10.420, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 17.213, 178.412, and 10.365, respectively. Thus, the empirical and theoretical means are almost the same, and the empirical and theoretical variances and IOD values are very close to each other. In the case of GLRT, the calculated value based on the test statistic (29) is 2(−1601.074 + 3825.833) = 2224.759 (p-value = 0.0012). As a result, at any level greater than 0.0012, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.

COVID-19 Data Set from Ethiopia
The LZTBD is fitted to another set of data on the COVID-19 infection in Ethiopia for 301 days, which was recorded from 25 August 2020 to 21 June 2021. The WHO collected these data, which represent the number of deaths per day, and they are accessible at http://covid19.who.int/data, (accessed on 24 August 2022). Table 13 contains some information and descriptive statistics for these data. In addition, Figure 8 shows an empirical TTT plot of the data, where an increasing hrf can be seen. We use the statistical techniques provided to compare the competitive distributions to the suggested distribution, specifically, the − log L, AIC, BIC, and χ 2 values. Table 14 displays the corresponding MLEs, model adequacy measures, and χ 2 values of the LZTBD. The LZTBD's model adequacy measures and χ 2 values are less than those of the other examined models. The suggested model is therefore the most suitable one to model the Ethiopian COVID-19 data. In addition, it is found that the empirical mean, variance, and IOD of this COVID-19 dataset are 11.973, 67.00, and 5.5959, respectively, and the theoretical values for the mean, variance, and IOD measures of the LZTBD are 11.891, 67.02, and 5.6361, respectively. Thus, the empirical and theoretical means are almost equal, and the empirical and theoretical variances and IOD values are very close to each other.
In the case of GLRT, the calculated value based on the test statistic (29) is 2(−1003.902 + 1680.734) = 676.832 (p-value = 0.0002). As a result, at any level greater than 0.0002, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter α in the LZTBD is significant in light of the test procedure outlined in Section 6.

Conclusions
In this article, we used the Lagrange expansion to develop a new three-parameter distribution called the Lagrangian zero-truncated binomial distribution (LZTBD). It is worth noting that the proposed distribution is a generalized form of the well-known zero-truncated binomial distribution and the Lagrangian weighted Consul distribution. We investigated the shape properties of the probability mass and hazard rate functions. The expressions for the factorial moments, generating functions, mean, and median were derived. The identifiability of the LZTBD model was also proved. The model parameters of the LZTBD were estimated using the maximum likelihood estimation method. A simulation study was also performed to show how well the maximum likelihood estimates perform. Six real datasets were used to validate the applicability of the LZTBD and to demonstrate that it offers a better fit than the competing models.