The Harris Extended Bilal Distribution with Applications in Hydrology and Quality Control

: In this research work, a new three-parameter lifetime distribution is introduced and studied. It is called the Harris extended Bilal distribution due to its construction from a mixture of the famous Bilal and Harris distributions, resulting from a branching process. The basic properties, such as the moment generating function, moments, quantile function, and Rényi entropy, are discussed. We show that the hazard rate function has ideal features for modeling increasing, upside-down bathtub, and roller-coaster data sets. In a second part, the Harris extended Bilal model is investigated from a statistical viewpoint. The maximum likelihood estimation is used to estimate the parameters, and a simulation study is carried out. The ﬂexibility of the proposed model in a hydrological data analysis scenario is demonstrated using two practical data sets and compared with important competing models. After that, we establish an acceptance sampling plan that takes advantage of all of the features of the Harris extended Bilal model. The operating characteristic values, the minimum sample size that corresponds to the maximum possible defects, and the minimum ratios of lifetime associated with the producer’s risk are discussed.


Introduction
Modern data are diverse and complex, so new statistical models based on appealing distributions have long been popular in the statistical literature. Among the most notable references, the authors in [1] introduced a new one-parameter distribution known as the Bilal distribution, which yields more flexibility in the modeling of real data sets than the exponential and Lindley distributions. Although the Bilal distribution has received less attention, there has been great interest in its extensions, generalizations, and related applications. A short retrospective on this topic is offered below. A two-parameter generalization was introduced in [2] as a solution to the unimodal hazard rate function (hrf) of the Bilal distribution. The authors in [3] suggested that the scale parameter involved should be estimated using U-statistics. The log-Bilal distribution and associated regression, which provide better modeling of extremely skewed dependent variables with associated covariates, were introduced in [4]. In addition, the author in [5] proposed a new distribution based on the Poisson-Bilal distribution to model count data regression. The corresponding INAR(1) process for over-dispersed count data sets was also provided. The authors in [6] introduced the Farlie-Gumbel-Morgenstern bivariate Bilal distribution and its inferential aspects using concomitant order statistics. Some properties and an estimation under ranked set sampling were established for the generalized Bilal distribution in [7]. These studies have demonstrated the versatility of the Bilal distribution. There is, however, room for improvement in reaching the goal of perfect statistical modeling.
As a matter of fact, the extended distributions proposed by adding additional parameters generally provide an improved flexibility. To that end, the Harris extended family of distributions was introduced in [8] by modifying the baseline distribution with two parameters. A physical interpretation at the heart of this family is as follows: if a device is made up of N serial components with a fixed failure rate, where N is a random variable, then the Harris extended family is defined by the distribution of the device's time until failure. Thus, it results from a branching process. More information on this construction can be found in [9]. The Harris extended family can also be viewed as a generalization of the Marshall-Olkin distribution developed in [10], with an additional new parameter providing more control over the distribution's shape. It provides an adequate model in various research fields, such as hydrology, insurance, biology, and life testing.
The new lifetime distributions have a large amount of room for quality control because of the non-standard lifetime data scenarios. In this context, acceptance sampling plans (ASPs) play a major role. Due to certain restrictions, examining the whole production unit is impossible. Thus, the ASP acts as a decision rule for the acceptance of a lot from a sample of products. It arose from the consideration of both consumer and producer risks, representing a middle ground between complete inspection and no inspection.
The goal of this paper is to introduce the Harris extended Bilal (HEB) distribution, a three-parameter generalization of the Bilal distribution based on the idea in [8]. We emphasize its practical usefulness. In addition, we intend to compare the proposed distribution with the Harris extended Lindley (HEL) distribution proposed in [11] and the Harris extended exponential (HEE) distribution introduced in [12]. This is demonstrated through hydrological data analysis. We also propose the ASP, a reliability test plan for accepting or rejecting lots, where the lifetime of the product follows the HEB distribution and discusses its properties.
The remaining part of the paper is organized in the following order: Section 2 describes the nature of the probability density function (pdf) and hrf of the HEB distribution. In Section 3, we describe its associated statistical properties, such as the moment generating function (mgf), moments, quantile function, and (Rényi) entropy. The estimation of the parameters and the Fisher information matrix are discussed in Section 4. The large sample behavior of the HEB distribution, with the help of certain simulated data sets, is detailed in Section 5. In Section 6, two real data sets are analyzed using the proposed distribution. Section 7 investigates the ASP with a lifetime following the HEB distribution. Finally, the study is concluded in Section 8.

The Harris Extended Bilal Distribution
In this section, we describe the HEB distribution and elucidate some of its statistical properties.
As suggested in [8], assume that X 1 , X 2 , . . . is a sequence of independent and identically distributed (iid) random variables with the pdf f 1 (x) and the survival function (sf) F 1 (x). Consider a positive integer random variable N, independent of X 1 , X 2 , . . ., following the Harris distribution with parameters θ > 0 and δ > 0. Let X = min(X 1 , X 2 , . . . , X N ). Then, the resulting distribution of X is known as the Harris extended family of distributions with sf of the form where θ = 1 − θ. Thus, θ and δ are the shape parameters, providing additional flexibility to the baseline sf F 1 (x). The corresponding pdf is indicated as follows: It can be considered as a generalization of the Marshall-Olkin family of distributions in [10], obtained by taking δ = 1 in (1).
On the other hand, the author in [1] introduced the Bilal distribution as a new oneparameter lifetime distribution with the following pdf: and ( r i ) = Γ(r+1) Γ(i+1) Γ(r−i+1) for i > 0 and r > 0, where Γ(x) denotes the standard gamma function. (It is worth noting that θ = 1 is omitted voluntarily because it corresponds to the well-known Bilal distribution.) In order not to weigh down the presentation, this proof (as well as all some future proofs) is given in Appendix A.
The main interest of Theorem 1 is in terms of functional approximation: for large enough M, we can efficiently approximate the sophisticated pdf g(x; λ, θ, δ) to a manageable sum of simple exponential functions as Some important statistical measures related to the HEB distribution can therefore be simply approximated, as developed later.
Using (2) and (3), the hrf of the HEB distribution is given by It is understood that h(x; λ, θ, δ) = 0 for x ≤ 0. Figures 1 and 2 display the pdf and hrf of the HEB distribution for different parameter values, respectively.
From Figure 1, we can see that the pdf is unimodal and right-skewed. Figure 2 shows that the hrf can be increasing (IFR), upside-down bathtub (UBT), and roller-coaster, which is not shared by the general Bilal (GB) distribution (see [2]).

Statistical Properties
In this section, some mathematical properties of the HEB distribution are intended for comparison because of the structural complexity of the Harris extended family of distributions.

Moment Generating Function and Moments
The next result presents a series expansion of the mgf of the HEB distribution. Proposition 2. Let X be a random variable following the HEB distribution. Then, the mgf of X is given by M(t) = E(e tX ), and can be expressed as Proof. From the expansion in Theorem 1 and integration, we obtain Considering the change in variables, v = e − x λ , we obtain where B(a, b) refers to the standard beta function, B(a, b) = 1 0 v a−1 (1 − v) b−1 dv, with a > 0 and b > 0. Since B(a, 2) = 1 a(a+1) , we obtain the desired result.
As usual, the mgf can serve to generate the raw moments of X, find bounds for some probability involving X into an event via the Markov inequality, or characterize the independence of several random variables following the HEB distribution.
The next result presents a comprehensive expansion of the raw moments of X.
Proposition 3. Let X be a random variable following the HEB distribution and r be an integer. Then, the rth raw moment of X is given by µ r = E(X r ) and can be expressed as where z i,j is expressed in (4).
The derivation is given in Appendix B. In particular, from Proposition 3, we can derive the first two raw moments of X as respectively. The variance and standard deviation follow immediately.

Quantile Function
The following proposition gives the quantile function of the HEB distribution.

Proposition 4.
The quantile function of the HEB distribution is given by Q(u; λ, θ, δ) = F −1 (u; λ, θ, δ), and can be expressed as Proof. Using (3), we need to solve 1 − G(x; λ, θ, δ) = u, which is equivalent to The left term is the cumulative distribution function (cdf) of the Bilal distribution. Hence, the proof follows from the quantile function of the Bilal distribution (see ( [1] Equation (7))).
Since the HEB distribution has a closed-form quantile function, it has a variate generation property, which is very useful in simulation studies.

Entropy
Entropy is the measure of uncertainty about a random variable. The most common measure of uncertainty is the Rényi entropy. It is given in the following proposition in the context of the HEB distribution.
Proposition 5. Let X be a random variable following the HEB distribution. Then, the Rényi entropy of X is given by with v > 0 and v = 1, and can be expressed as The derivation is given in Appendix C.

Parameter Estimation
Here, we estimate the unknown parameters of the HEB distribution using the maximum likelihood (ML), least squares (LS), and weighted least squares (WLS) methods. Through the use of a simulation study, the effectiveness of these methods is assessed.

Maximum Likelihood Estimation
Let n be a positive integer and X 1 , X 2 , . . . , X n be n iid random variables, which constitutes a random sample of size n, from the HEB(λ, θ, δ) distribution. Let x 1 , x 2 , . . . , x n be observations of these random variables. From (2), the log-likelihood function is given by The ML estimates λ, θ, and δ of the parameters λ, θ, and δ, respectively, are those maximizing log L (λ,θ,δ) with respect to λ, θ, and δ. They may be obtained from the solution of the following equations: Since we cannot find the solution in explicit form when equating to zero, we would go for the direct maximization of (5) using numerical methods.
The inference analysis on the parameters can be performed using the underlying asymptotic properties of the random ML estimators. For the vector parameter estimate φ = ( λ, θ, δ), assuming classical regularity conditions, the asymptotic distribution behind φ is the trivariate normal N 3 (φ, K −1 ) distribution, where K −1 is the information matrix of the parameters, K = −J n , and .

Least and Weighted Least Squares Estimation
Let X (1) , X (2) , . . . , X (n) be the order statistics of X 1 , X 2 , . . . , X n , i.e., such that (2) , . . . , x (n) be observations of these random variables. By minimizing the following function with respect to λ, θ, and δ, we obtain the LS estimates of the parameters: Similarly, the WLS estimates of the parameters λ, θ, and δ are obtained by minimizing the following function:

Simulation
The performance of the HEB model is analyzed by means of a simulation study. The simulation is run with N = 1000 replications for a sample of size of n = 50, 100, 150, 200, and 250, and the following arbitrary choices of parameter values: (λ = 1.5, θ = 0.8, . The parameter estimation is carried out by the ML, LS, and WLS methods, and the following quantities are computed:

1.
Average bias (Bias) of the parameters, given by the following formula: Root mean square error (RMSE) of the parameters, given by the following formula: The simulation result is displayed in Table 1. In general, we can conclude that the ML, LS, and WLS estimations perform very well. Indeed, as n increases, the RMSE and bias decrease.

Methodology
We assess the performance of the proposed model with two real hydrological data sets: the Wheaton River data set given in [11], and the Kiama Blowhole data used in [12].
We compare the performance of the HEB distribution to that of some other competing distributions, such as the HEL distribution, HEE distribution, MOB distribution, GB distribution introduced in [2], exponentiated exponential (EE) distribution defined in [13], exponentiated Weibull (EW) distribution introduced in [14], power Lindley (PL) distribution proposed in [15], Marshall-Olkin exponential (MOE) distribution, which is the HEE distribution with δ = 1, Marshall-Olkin Lindley (MOL) distribution, which is the HEL distribution with δ = 1, and the exponentiated Lindley (EL) distribution discussed in [16].
The ML method is used to estimate the unknown parameters of the HEB model, and the model's performance is evaluated using well-referenced information criteria and goodness-of-fit statistics. The smaller values of the Akaike information criterion (AIC) and Bayesian information criterion (BIC) and the large value of the estimated log likelihood (log L) indicate the model adequacy. The goodness-of-fit statistics are evaluated by employing the Kolmogorov-Smirnov (KS) statistic and associated p value, Anderson-Darling (AD), Cramér-von Mises (CM), and average scaled absolute error (ASAE) statistics (see [17]). The smaller the goodness-of-fit measures, the better the fit.

Wheaton River Data
The considered data set consists of the exceedances of flood peaks (in m 3 /s) of the Wheaton River, Canada, for the years 1958-1984, which is used to fit the HEL distribution proposed in [11]. Figure 3 displays the total time on test (TTT) plot and box plot for the data, and we can see that the observations are right-skewed and have an increasing hrf, which is applicable under the HEB model.   Table 2 lists the ML estimates with standard errors (SEs), information criteria, and goodness-of-fit-measures for different models. We can see that the HEB model has the maximum log L and the lowest AIC and BIC values. Moreover, the associated KS statistic is the minimum with a large p value, and the AD, CM, and ASAE statistics have the smallest values. We can conclude that the HEB model performs well among the considered competitive models.  The fitted pdf and cdf plots, the quantile-quantile (Q-Q) plot, and the probabilityprobability (P-P) plots of the HEB model for the Wheaton river data are given in Figure 4. The points in the Q-Q and P-P plots are almost in a straight line. We can infer that the HEB model yields the best fit for the Wheaton river data.

Kiama Blowhole Data
Here, the considered data set is the waiting times between consecutive eruptions of the Kiama Blowhole used in [12] to fit the HEE distribution. Figure 5 displays the TTT plot and box plot for the data, and we can see that the observations are right-skewed and with an increasing hrf that is applicable under the HEB model. Table 3 lists the ML estimates with SEs and goodness-of-fit-measures for different models. We can see that the HEB model has the minimum KS statistic with a large p value, and the AD, CM, and ASAE statistics have the smallest values. We can conclude that the HEB model performs well among the considered competitive models.
The fitted pdf and cdf plots, the Q-Q plot, and the P-P plot of the HEB model for the Kiama Blowhole data are displayed in Figure 6. The points in the Q-Q and P-P plots are almost in a straight line. We can infer that the HEB model yields the best fit for the Kiama Blowhole data.

Method
Here, we look forward to introducing an ASP based on the assumption that the lifetimes of sample products follow the HEB distribution. The number of items to be examined and the maximum possible number of defects in them for acceptance are the major concerns of an ASP. The number of defects when the test is terminated at a predetermined time is recorded. We accept the lot with a probability of at least p * if the number of defects out of n inspected items does not exceed the maximum possible number of defects (c) at time t. When the number of defects exceeds c before the specified time t, the lot is rejected. Thus, the minimum sample size required for the decision rule is the primary interest of our study.
Assume that the lifetime distribution follows the HEB distribution, with known θ and δ and unknown λ, so that the average lifetime is solely dependent on λ. We recall that the cdf of the HEB distribution is given by Let λ 0 be the required minimum average lifetime. Then, the following equivalence holds: The ASP is characterized by the following elements: • The number of units n on the test; • The acceptance number c; • The maximum test duration t; • The ratio t λ 0 , where λ 0 is the specified average lifetime and t is the maximum test duration. For the sake of consumers, the lot with a true average life λ less than λ 0 should be rejected by the ASP. As a result, the consumer's risk should not exceed the value 1 − p * , where p * is a lower bound for the probability that a lot is rejected by the ASP. The triplet n, c, t λ 0 characterizes the ASP for a given p * . We can obtain the acceptance probability by using a binomial distribution for sufficiently large lots. The main goal is to find the smallest sample size n for known c and t λ 0 values so that where p 0 = G(t; λ 0 , θ, δ) is the failure probability before time t. Table 4 1  7  5  4  4  3  3  3  2  2  2   2  11  8  6  6  5  4  4  4  3  3   3  14  11  8  7  6  6  5  5  5  4   4  18  13  10  9  8  7  6  6  6  5   5  21  15  12  11  9  9  8  7  7  6   6  24  18  14  13  11  10  9  8  8  7   7  27  20  16  14  12  11  10  10  9  8   8  31  23  18  16  14  13  11  11  10  9   9  34  25  20  18  15  14  13  12  11  11   10  37  27  22  20  17  15  14  13  12  12   0.95   0  8  5  4  4  3  3  2  2  2  1   1  12  9  7  6  5  4  4  3  3  3   2  17  12  9  8  7  6  5  5  4  4   3  21  15  12  10  8  7  7  6  6  5   4  24  18  14  12  10  9  8  7  7  6   5  28  20  16  14  12  11  9  9  8  7   6  32  23  18  16  13  12  11  10  9  8   7  36  26  21  18  15  14  12  11  10  9   8  39  29  23  20  16  15  13  12  11  10   9  43  31  25  22  18  16  15  13  13  11   10  46  34  27  24  20  18  16  15  14  12 For large values of n and small values of p 0 , we can use the Poisson approximation with parameter α = np 0 as The minimum values of n satisfying (6) are obtained in the same way as above, and are given in Table 5. The operating characteristic (OC) function of the ASP n, c, t λ gives the probability of accepting the lot. It is given by where p = G(t; λ, θ, δ). The OC function acts as a base for the choices of n and c for given values of p * and t λ 0 . By considering the fact that the OC values for the ASP n, c, t λ are obtained and displayed in Table 6. Figure 7 shows the OC curve for p * = 0.75, d = t λ 0 , and f = λ λ 0 .  For the sake of producers, the lot with λ greater than λ 0 should be accepted. The probability of rejecting a lot when λ is greater than λ 0 , called producer's risk, can be found by determining p = G(t; λ, θ, δ) and with the help of a binomial distribution. For a specified producer's risk of, say 0.05, it would be interesting to know what value of λ λ 0 will ensure that a producer's risk is less than or equal to 0.05 if the proposed ASP is adopted. The smallest value of λ λ 0 must satisfy the following inequality: For the given ASP n, c, t λ and prefixed p * , Table 7 displays the minimum values of λ λ 0 required to satisfy (7).

Illustration
Allow the lifetime to follow the HEB distribution with parameters λ = 1.3, θ = 2, and δ = 1.5. Suppose that our interest is an ASP with an unknown average lifetime of 1000 h, such that the termination time is 1100 h. The consumer's risk is prefixed at 1 − p * = 0.25. The required number of n is 6 for an acceptance number of c = 2 and t λ 0 = 1.1, according to Table 4. Hence, the considered ASP is n = 7, c = 2, t λ 0 = 1.1 . During the test time, we have a confidence level of 0.75 that the average lifetime is at least 1000 h if at most two failures out of six are observed. The ASP under consideration for the Poisson approximation is n = 7, c = 2, t λ 0 = 1.1 . From Table 6, the OC values of the ASP n = 6, c = 2, t λ 0 = 1.1 under the binomial case with a consumer's risk of 0.25 are: 0.83142, 0.99150, 0.99890, 0.99976, 0.99993, 0.99997 for λ λ 0 = 2, 4, 6, 8, 10, 12, respectively. As a result, if λ λ 0 = 2, the producer's risk is 0.17. The producer's risk is negligible if it is 10 or 12. From Table 7, the minimum value of λ λ 0 giving a producer's risk of 0.05 is 3.23.
Thus, if the consumer's risk is fixed at a specified level, then the quality can be reached by a predetermined ratio.
Let the testing time be 3600 h and the prefixed average lifetime be 3000 h. The ASP is adopted under the assumption that the lifetime follows the HEB distribution. The Q-Q plot and goodness-of-fit statistics guarantee a good agreement (θ = 1.758011 × 10 13 , δ = 4.921590 × 10 2 ). By taking t λ 0 = 1.2, p * = 0.95, and n = 13, we obtain c as 6. Thus, the considered ASP is (n = 13, c = 6, t λ 0 = 1.2). We accept the lot if and only if the number of failures is at most 6. There are six values here that are less than t. Thus, we accept the lot.

Conclusions
The Harris extended Bilal (HEB) distribution is a three-parameter extension of the Bilal distribution that we suggested. It is obtained by applying the Harris extended scheme to the Bilal distribution. The aim of the two additional shape parameters is to provide more flexibility to the Bilal distribution. The Bilal distribution is included as a sub-distribution, and the HEB distribution can be considered as a generalization of the Marshall-Olkin Bilal distribution. The corresponding pdf is unimodal and better suited for right-skewed data sets. The hrf can increase, or have an upside-down bathtub shape, or a roller coaster shape. The mathematical properties were discussed and are meant for comparative purposes with respect to the members of the Harris extended family. Then, an emphasis on the statistical HEB model's efficiency was made. The performance of the model parameter estimation was evaluated using a simulation study. The proposed HEB model provided a better modeling of hydrological data when compared to the competing models. We developed an acceptance sampling plan that has a lifetime following the HEB distribution. The operating characteristic values, the minimum sample size that corresponds to the maximum possible defects, and the minimum ratios of lifetime associated with the producer's risk were discussed. The results were illustrated using a real data set. The perspectives of this work are numerous, including the applications to various applied fields (biology, medicine, engineering, informatics, etc.); the extension to the multidimensional case with use in regression and classification modeling; and the discrete version for the modeling of count data.

Acknowledgments:
The authors would like to thank the two reviewers and the associate editor for their constructive comments on the paper.

Conflicts of Interest:
The authors declare no conflict of interest.