Box-Cox Gamma-G family of distributions: Theory and Applications

This paper is devoted to a new family of distributions called the Box-Cox Gamma-G family of distributions. It can be viewed as a natural generalization of the useful RB-G family of distributions, containing a wide variety of power Gamma-G distributions, including the odd one. The key tool of this generalization is the use of the Box-Cox transformation. Some mathematical properties of the new family of distributions are derived. Then a specific member with three parameters, based on the half-Cauchy distribution as baseline, is studied and considered as a statistical model. The estimation of its parameters is discussed by the method of maximum likelihood. A simulation study is provided to support the theoretical convergence of the estimators. Finally, two real data sets are considered to illustrate the goodness of fit of the new model compared with other competitive models.


Introduction
The standard probability distributions do not offer convincing statistical models for a large panel of practical data sets. This fact has motivated studies dedicated to the creation of new probability distributions by various approaches. One of the most popular approaches is the use of generators of probability distributions. The most popular ones include the Marshall-Olkin-G by Marshall and Olkin (1997), the exp-G by Gupta et al. (1998), the beta-G by Eugene et al. (2002), the gamma-G by Zografos and Balakrishnan (2009), the Kumaraswamy-G by Cordeiro and de Castro (2011), the RB-G (also called gamma-G type 2) by Ristić and Balakrishnan (2012), the exponentiated generalized-G by Cordeiro et al. (2013), the logistic-G by Torabi and Montazeri (2014), the TX-G by Alzaatreh et al. (2013), the Weibull-G by Bourguignon et al. (2014), the exponentiated half-logistic-G by , the odd generalized exponential family by Tahir et al. (2015), the odd Burr III-G by Jamal et al. (2017), the cosine-sine-G by Chesneau et al. (2018), the generalized odd Gamma-G by Hosseini et al. (2018) and the extended odd-G family by Bakouch et al. (2019).
In this paper, we propose a new family of distributions, based on a new generator, with interesting features for the statistician. As a first approach, we can say that it generalizes, in some sense, the RB-G by Ristić and Balakrishnan (2012) by the use of the Box-Cox transformation. Let us now briefly present the RB-G. Let G(x; φ) be a cumulative distribution function (cdf), where φ denotes one parameter or several parameters. Then the RB-G is characterized by the following cdf:
$$F(x;\delta,\phi) = 1 - \frac{1}{\Gamma(\delta)}\,\gamma\left(\delta, -\log[G(x;\phi)]\right), \qquad (1)$$
where δ > 0, Γ(δ) denotes the gamma function and γ(δ, u) denotes the lower incomplete gamma function defined by $\gamma(\delta,u) = \int_0^u t^{\delta-1}e^{-t}\,dt$. The RB-G family of distributions is very rich, providing successful solutions for modeling various kinds of data. We refer to the extensive survey of Cordeiro and Bourguignon (2016), and the references therein. The success of the RB-G family of distributions is a motivation to introduce natural extensions, with a great potential of applicability, such as the one studied in this paper. Our idea is to introduce an additional parameter λ > 0 and to consider the cdf given by
$$F(x;\lambda,\delta,\phi) = 1 - \frac{1}{\Gamma(\delta)}\,\gamma\left(\delta, \frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right). \qquad (2)$$
Thus, in comparison to the RB-G cdf given by (1), we have replaced the logarithmic term $-\log[G(x;\phi)] = \log[G(x;\phi)^{-1}]$ by the affine power transformation $[G(x;\phi)^{-\lambda}-1]/\lambda$, which corresponds to the so-called Box-Cox transformation of $G(x;\phi)^{-1}$ with parameter λ. Indeed, introducing the Box-Cox transformation defined by $b_\lambda(y) = (y^\lambda - 1)/\lambda$, we have $[G(x;\phi)^{-\lambda}-1]/\lambda = b_\lambda\left(G(x;\phi)^{-1}\right)$. The advantages of considering the Box-Cox transformation in our context are the following:
(i) When λ → 0, we have $b_\lambda\left(G(x;\phi)^{-1}\right) \to -\log[G(x;\phi)]$ and the cdf F(x; λ, δ, φ) given by (2) becomes the RB-G cdf given by (1).
(ii) When λ = 1, we have $b_\lambda\left(G(x;\phi)^{-1}\right) = G(x;\phi)^{-1} - 1 = (1 - G(x;\phi))/G(x;\phi)$, which corresponds to the odd transformation of G(x; φ), widely used in recent decades to define new flexible families of distributions, with physical interpretations (in the context of Gamma-G, see Hosseini et al. (2018), and the references therein).
(iii) In full generality, the consideration of the power transform of G(x; φ) increases its flexibility and the flexibility of the related family (see, for instance, Gupta et al. (1998) for the former power family of distributions).
The family of distributions characterized by the cdf (2) will be called the Box-Cox Gamma-G family of distributions, with BCG-G as an abbreviation of Box-Cox Gamma-G. In this paper, we aim to study the BCG-G family (of distributions) in detail, by examining both the theoretical and practical aspects, with discussions. For practical purposes, the half-Cauchy cdf is considered for G(x; φ), offering a new solution for modelling data with a highly right-skewed distribution. In particular, we show that our model outperforms, in some sense, well established competitors, highlighting the importance of the BCG-G family.
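The two limiting cases of the Box-Cox argument can be checked numerically. The sketch below is only illustrative: the function names `box_cox` and `bcg_argument` and the test value G = 0.3 are assumptions introduced here, not notation from the paper.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Box-Cox transformation b_lambda(y) = (y**lam - 1) / lam (log limit at lam = 0)."""
    if lam == 0.0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def bcg_argument(G: float, lam: float) -> float:
    """Box-Cox transform of 1/G, i.e. (G**(-lam) - 1) / lam, for a cdf value G in (0, 1)."""
    return box_cox(1.0 / G, lam)


if __name__ == "__main__":
    G = 0.3  # arbitrary cdf value chosen for illustration
    for lam in (1.0, 0.1, 1e-3, 1e-6):
        print(f"lambda = {lam:g}: {bcg_argument(G, lam):.8f}")
    print("RB-G limit  -log(G):", -math.log(G))
    print("odd transform (1 - G)/G:", (1.0 - G) / G)
```

As λ decreases, the printed values approach −log(G), and λ = 1 reproduces the odd transformation (1 − G)/G.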
This paper is organized as follows. In Section 2, we present the two other crucial functions of the BCG-G family: the probability density function and the hazard rate function. Then some members of the BCG-G family with a potential of interest are listed. In Section 3, we derive the general mathematical properties of the BCG-G family such as the asymptotic behavior of the crucial functions, their shapes, some immediate characterizations of the family, the quantile function, a result on stochastic ordering, the useful series expansions of the crucial functions, the moments, the incomplete moments with some derivations, the moment generating function, the Rényi entropy, the probability density function of order statistics with moments and some generalities on the maximum likelihood estimates. Section 4 is devoted to a member of the BCG-G family defined with the half-Cauchy distribution as baseline. The most interesting features of this new distribution are then presented with illustration by numerical results and graphics. Section 5 is devoted to the statistical inference of this distribution via the use of the maximum likelihood method. Analyses of two practical data sets are also performed. The paper is concluded in Section 6.

Description of the BCG-G family
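First of all, let us recall that the BCG-G family is characterized by the cdf F(x; λ, δ, φ) given by (2). By differentiation, the associated probability density function (pdf) is given by
$$f(x;\lambda,\delta,\phi) = \frac{1}{\Gamma(\delta)}\, g(x;\phi)\, G(x;\phi)^{-\lambda-1} \left\{\frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right\}^{\delta-1} \exp\left\{-\frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right\}, \qquad (3)$$
where g(x; φ) denotes the pdf associated with G(x; φ). The associated hazard rate function (hrf) is given by
$$h(x;\lambda,\delta,\phi) = \frac{f(x;\lambda,\delta,\phi)}{1 - F(x;\lambda,\delta,\phi)} = \frac{g(x;\phi)\, G(x;\phi)^{-\lambda-1} \left\{[G(x;\phi)^{-\lambda}-1]/\lambda\right\}^{\delta-1} \exp\left\{-[G(x;\phi)^{-\lambda}-1]/\lambda\right\}}{\gamma\left(\delta, [G(x;\phi)^{-\lambda}-1]/\lambda\right)}.$$
Some members of the BCG-G family are presented in Table 1, taking standard distributions for the baseline cdf G(x; φ), with various supports and numbers of parameters. To the best of our knowledge, none of them has been studied in the literature.
Table 1: Some members of the BCG-G family (columns: distribution G, support, cdf of the BCG-G, parameters).
Let us mention that the BCG-G member defined with the half-Cauchy cdf as baseline will be at the center of the applications in Section 4 (for reasons explained later).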
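The cdf, pdf and hrf can be evaluated numerically for any baseline. The following is a minimal sketch, assuming the cdf has the form 1 − γ(δ, u)/Γ(δ) with u = [G^(−λ) − 1]/λ and the pdf is its derivative. The helper `reg_lower_gamma` is a hand-written approximation of the regularized incomplete gamma function (power series plus continued fraction), and the standard exponential baseline is a hypothetical choice made only for the demonstration.

```python
import math


def reg_lower_gamma(a: float, x: float) -> float:
    """Regularized lower incomplete gamma P(a, x) = gamma(a, x) / Gamma(a)."""
    if x <= 0.0:
        return 0.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Power series: gamma(a, x) = e^{-x} x^a * sum_n x^n / (a (a+1) ... (a+n)).
        ap, term = a, 1.0 / a
        total = term
        for _ in range(10_000):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * 1e-16:
                break
        return total * math.exp(log_pref)
    # Continued fraction for the upper tail (modified Lentz iteration).
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10_000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return 1.0 - math.exp(log_pref) * h


def bcg_u(G_x: float, lam: float) -> float:
    """Box-Cox argument u = (G^{-lam} - 1) / lam for a baseline cdf value G_x in (0, 1)."""
    return (G_x ** (-lam) - 1.0) / lam


def bcg_cdf(x, lam, delta, G):
    return 1.0 - reg_lower_gamma(delta, bcg_u(G(x), lam))


def bcg_pdf(x, lam, delta, G, g):
    G_x = G(x)
    u = bcg_u(G_x, lam)
    log_f = (math.log(g(x)) - (lam + 1.0) * math.log(G_x)
             + (delta - 1.0) * math.log(u) - u - math.lgamma(delta))
    return math.exp(log_f)


def bcg_hrf(x, lam, delta, G, g):
    # The survival function 1 - F(x) equals the regularized lower incomplete gamma at u.
    survival = reg_lower_gamma(delta, bcg_u(G(x), lam))
    return bcg_pdf(x, lam, delta, G, g) / survival


if __name__ == "__main__":
    G = lambda x: 1.0 - math.exp(-x)  # hypothetical baseline: standard exponential cdf
    g = lambda x: math.exp(-x)        # and its pdf
    for x in (0.5, 1.0, 2.0):
        print(x, bcg_cdf(x, 0.5, 2.0, G), bcg_pdf(x, 0.5, 2.0, G, g), bcg_hrf(x, 0.5, 2.0, G, g))
```

The sketch is intended for moderate arguments only; baseline values extremely close to 0 make G^(−λ) overflow in floating point. Where SciPy is available, `scipy.special.gammainc` could replace the hand-written helper.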

General mathematical properties
This section is devoted to general mathematical properties of the BCG-G family.

Shapes of the BCG-G pdf and the hrf
In what follows, we describe analytically the shapes of the BCG-G pdf and hrf. The critical points of the BCG-G pdf are the roots of the equation ∂ log[f(x; λ, δ, φ)]/∂x = 0. We have no guarantee of the uniqueness of a critical point for any G(x; φ); more than one root can exist. Let $\xi(x) = \partial^2 \log[f(x;\lambda,\delta,\phi)]/\partial x^2$. Then we have In the same way, the critical points of the BCG-G hrf are the roots of the equation ∂ log[h(x; λ, δ, φ)]/∂x = 0. Again, more than one root can be obtained. Let $\zeta(x) = \partial^2 \log[h(x;\lambda,\delta,\phi)]/\partial x^2$. Then we have If $x = x_0$ is a critical point, then it corresponds to a local maximum if $\zeta(x_0) < 0$, a local minimum if $\zeta(x_0) > 0$ and a point of inflection if $\zeta(x_0) = 0$. Finally, let us mention that the critical points can be determined by using symbolic computation software (Mathematica, Maple, etc.).
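As a worked sketch, assume the pdf has the form $f = g\,G^{-\lambda-1}u^{\delta-1}e^{-u}/\Gamma(\delta)$ with $u = [G^{-\lambda}-1]/\lambda$, as obtained by differentiating the cdf. The two critical-point equations can then be written out as below, where $g'$ denotes $\partial g/\partial x$. The arrangement of terms is one possible form and may differ from the authors' displayed equations.

```latex
% Critical points of the pdf: roots of
\frac{\partial}{\partial x}\log f(x;\lambda,\delta,\phi)
 = \frac{g'(x;\phi)}{g(x;\phi)}
 - (\lambda+1)\frac{g(x;\phi)}{G(x;\phi)}
 + g(x;\phi)\,G(x;\phi)^{-\lambda-1}
   \left[1 - \frac{\lambda(\delta-1)}{G(x;\phi)^{-\lambda}-1}\right] = 0 .

% Critical points of the hrf: since h = f/(1-F), we have
\frac{\partial}{\partial x}\log h(x;\lambda,\delta,\phi)
 = \frac{\partial}{\partial x}\log f(x;\lambda,\delta,\phi)
 + h(x;\lambda,\delta,\phi) = 0 .
```

The second identity follows from $\log h = \log f - \log(1-F)$ and $\partial[-\log(1-F)]/\partial x = f/(1-F) = h$.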

Quantile function, skewness and kurtosis
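The quantile function of the BCG-G family of distributions is defined by
$$Q(y;\lambda,\delta,\phi) = Q_G\left(\left[1 + \lambda\,\gamma^{-1}\left(\delta, (1-y)\Gamma(\delta)\right)\right]^{-1/\lambda};\phi\right), \qquad y \in (0,1),$$
where $Q_G(y;\phi)$ denotes the quantile function associated with G(x; φ) and $\gamma^{-1}(\delta,u)$ denotes the inverse of γ(δ, u) with respect to u. The first quartile is given by Q(1/4; λ, δ, φ), the median is given by M = Q(1/2; λ, δ, φ) and the third quartile is given by Q(3/4; λ, δ, φ). Also, from Q(y; λ, δ, φ), one can define several robust measures of skewness and kurtosis, such as the Galton skewness S introduced by Galton (1883) and the Moors kurtosis K introduced by Moors (1988). The skewness S measures the degree of the long tail, while the kurtosis K measures the degree of tail heaviness. They are respectively defined by
$$S = \frac{Q(3/4;\lambda,\delta,\phi) - 2Q(1/2;\lambda,\delta,\phi) + Q(1/4;\lambda,\delta,\phi)}{Q(3/4;\lambda,\delta,\phi) - Q(1/4;\lambda,\delta,\phi)} \qquad (5)$$
and
$$K = \frac{Q(7/8;\lambda,\delta,\phi) - Q(5/8;\lambda,\delta,\phi) + Q(3/8;\lambda,\delta,\phi) - Q(1/8;\lambda,\delta,\phi)}{Q(6/8;\lambda,\delta,\phi) - Q(2/8;\lambda,\delta,\phi)}. \qquad (6)$$
Their main advantages are that they are robust to possible outliers and that they always exist (regardless of whether the moments exist).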
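The quantile-based measures are straightforward to compute once a quantile function is available. The sketch below uses the usual definitions of the Galton skewness and Moors kurtosis. To stay within the standard library it treats only the special case δ = 1, where the inverse incomplete gamma function is available in closed form, so that the BCG-G quantile reduces to Q_G([1 − λ log y]^(−1/λ)). The half-Cauchy baseline quantile θ tan(πy/2) and all function names are illustrative assumptions; for general δ a numerical inverse (for instance `scipy.special.gammaincinv`) would be needed.

```python
import math


def galton_skewness(Q):
    """Galton skewness from a quantile function Q defined on (0, 1)."""
    return (Q(0.75) - 2.0 * Q(0.5) + Q(0.25)) / (Q(0.75) - Q(0.25))


def moors_kurtosis(Q):
    """Moors kurtosis from a quantile function Q defined on (0, 1)."""
    return (Q(0.875) - Q(0.625) + Q(0.375) - Q(0.125)) / (Q(0.75) - Q(0.25))


def bcg_quantile_delta1(y, lam, Q_G):
    """BCG-G quantile for delta = 1 only: the inverse of gamma(1, u) = 1 - exp(-u) is explicit."""
    return Q_G((1.0 - lam * math.log(y)) ** (-1.0 / lam))


def half_cauchy_quantile(p, theta):
    """Quantile function of the half-Cauchy distribution with scale theta."""
    return theta * math.tan(0.5 * math.pi * p)


if __name__ == "__main__":
    lam, theta = 1.0, 2.5  # illustrative parameter values
    Q = lambda y: bcg_quantile_delta1(y, lam, lambda p: half_cauchy_quantile(p, theta))
    print("quartiles:", Q(0.25), Q(0.5), Q(0.75))
    print("Galton skewness:", galton_skewness(Q))
    print("Moors kurtosis:", moors_kurtosis(Q))
```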

Stochastic ordering
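A stochastic ordering result on the BCG-G family is now presented. Let $\delta_1 > 0$, $\delta_2 > 0$, and $X_1$ and $X_2$ be two random variables with $X_1$ having the BCG-G pdf $f(x;\lambda,\delta_1,\phi)$ and $X_2$ having the BCG-G pdf $f(x;\lambda,\delta_2,\phi)$. If $\delta_2 \le \delta_1$, then the ratio
$$\frac{f(x;\lambda,\delta_1,\phi)}{f(x;\lambda,\delta_2,\phi)} = \frac{\Gamma(\delta_2)}{\Gamma(\delta_1)} \left\{\frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right\}^{\delta_1-\delta_2}$$
is a non-increasing function of x, since G(x; φ) is increasing in x. This implies that $X_2$ is stochastically greater than $X_1$ with respect to the likelihood ratio order, implying other stochastic ordering results (see, for instance, Shaked and Shanthikumar (2007)).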

Useful expansions
Firstly, let us determine a series expansion for the BCG-G cdf. Using the exponential series expansion, we have It follows from the general binomial theorem that Applying the general binomial theorem two times, we obtain Hence we can write
$$F(x;\lambda,\delta,\phi) = 1 - \sum_{u=0}^{+\infty} c_u W_u(x;\phi),$$
where $c_u$ denotes the coefficient of $G(x;\phi)^u$ resulting from the previous expansions and $W_u(x;\phi) = G(x;\phi)^u$ is the well-known exp-G cdf with power parameter u. Further details on this family of distributions can be found in Gupta et al. (1998). An alternative expression is given by
$$F(x;\lambda,\delta,\phi) = \sum_{u=0}^{+\infty} d_u W_u(x;\phi), \qquad (7)$$
where $d_0 = 1 - c_0$ and $d_u = -c_u$ for u ≥ 1. With this configuration, the BCG-G cdf can be expressed as an infinite linear combination of exp-G cdfs.
Let us now introduce the pdf of the exp-G distribution with power parameter u + 1, i.e. $w_{u+1}(x;\phi) = (u+1)G(x;\phi)^u g(x;\phi)$. Then, by differentiation of (7), we deduce the series expansion for f(x; λ, δ, φ) given by
$$f(x;\lambda,\delta,\phi) = \sum_{u=0}^{+\infty} d_{u+1}\, w_{u+1}(x;\phi). \qquad (8)$$
This expansion can be useful to determine important mathematical properties of the BCG-G family, such as moments of various kinds. Some of them are presented below.
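The following sketch shows one way the coefficients of the powers of G(x; φ) can be obtained, assuming the cdf has the form $1 - \gamma(\delta, z)/\Gamma(\delta)$ with $z = [G^{-\lambda}-1]/\lambda$. It uses the exponential series once and the general binomial theorem three times, and it interchanges the summations formally. The index names k, ℓ, m and the final arrangement of $c_u$ are assumptions made here; the authors' coefficients may be organized differently.

```latex
% Exponential series inside the incomplete gamma integral:
\gamma(\delta, z) = \sum_{k=0}^{+\infty} \frac{(-1)^k}{k!\,(k+\delta)}\, z^{k+\delta},
\qquad z = \frac{G(x;\phi)^{-\lambda}-1}{\lambda}.

% General binomial theorem applied to (1 - G^{\lambda})^{k+\delta}:
z^{k+\delta} = \lambda^{-(k+\delta)} \sum_{\ell=0}^{+\infty} \binom{k+\delta}{\ell} (-1)^{\ell}\,
  G(x;\phi)^{-\lambda(k+\delta-\ell)} .

% Two further binomial expansions, for a real exponent a = -\lambda(k+\delta-\ell):
G(x;\phi)^{a} = \sum_{m=0}^{+\infty}\sum_{u=0}^{m} \binom{a}{m}\binom{m}{u} (-1)^{m+u}\, G(x;\phi)^{u} .

% Collecting the coefficient of G(x;\phi)^u:
c_u = \frac{1}{\Gamma(\delta)} \sum_{k=0}^{+\infty}\sum_{\ell=0}^{+\infty}\sum_{m=u}^{+\infty}
  \frac{(-1)^{k+\ell+m+u}}{k!\,(k+\delta)\,\lambda^{k+\delta}}
  \binom{k+\delta}{\ell}\binom{-\lambda(k+\delta-\ell)}{m}\binom{m}{u} .
```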

Moments
Important note: Hereafter, let X be a random variable having the BCG-G pdf f(x; λ, δ, φ) given by (3) and, for any integer u, let $Y_u$ be a random variable having the exp-G pdf given by $w_{u+1}(x;\phi) = (u+1)G(x;\phi)^u g(x;\phi)$. Also, when a quantity is introduced, it is assumed that it exists, which is not necessarily the case depending on the choice for G(x; φ).
The r-th moment of X is given by
$$\mu_r = E(X^r) = \int_{-\infty}^{+\infty} x^r f(x;\lambda,\delta,\phi)\,dx.$$
Using the series expansion given by (8), we also have
$$\mu_r = \sum_{u=0}^{+\infty} d_{u+1}\, E(Y_u^r),$$
where
$$E(Y_u^r) = \int_{-\infty}^{+\infty} x^r w_{u+1}(x;\phi)\,dx.$$
If the closed-form expression is not available, the integral term can be evaluated numerically for a given cdf G(x; φ). The mean of X is given by $E(X) = \mu_1$ and the variance of X is given by $V(X) = \mu_2 - \mu_1^2$. In order to complete this part, let us mention that the r-th central moment of X can be determined by using $\mu_1, \ldots, \mu_r$ as follows:
$$E[(X-\mu_1)^r] = \sum_{k=0}^{r} \binom{r}{k} (-1)^{r-k} \mu_1^{r-k} \mu_k, \qquad \mu_0 = 1.$$

Cumulants, skewness and kurtosis
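As usual, the r-th cumulant of X can be obtained by the following recursion formula:
$$\kappa_r = \mu_r - \sum_{k=1}^{r-1} \binom{r-1}{k-1} \kappa_k\, \mu_{r-k},$$
with the initial value $\kappa_1 = \mu_1$. From the cumulants of X, we can define the skewness and the kurtosis of X, which are respectively defined by
$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}}, \qquad \gamma_2 = \frac{\kappa_4}{\kappa_2^{2}}. \qquad (10)$$
Both of them can be computed numerically for a given cdf G(x; φ).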
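The step from raw moments to cumulants, skewness and kurtosis is easy to automate. The sketch below implements the standard moment-to-cumulant recursion κ_r = μ_r − Σ_{k=1}^{r−1} C(r−1, k−1) κ_k μ_{r−k}, with the skewness κ_3/κ_2^{3/2} and kurtosis κ_4/κ_2^2. The function names are assumptions, and the unit exponential moments μ_r = r! are used only as a check case, not as BCG-G moments.

```python
from math import comb


def cumulants_from_moments(mu):
    """Given raw moments mu[r-1] = E(X^r), r = 1..R, return [kappa_1, ..., kappa_R]."""
    kappa = []
    for r in range(1, len(mu) + 1):
        # mu[r - k - 1] is the raw moment of order r - k.
        correction = sum(comb(r - 1, k - 1) * kappa[k - 1] * mu[r - k - 1]
                         for k in range(1, r))
        kappa.append(mu[r - 1] - correction)
    return kappa


def skewness_kurtosis(mu):
    """Cumulant-based skewness and kurtosis from the first four raw moments."""
    k1, k2, k3, k4 = cumulants_from_moments(mu[:4])
    return k3 / k2 ** 1.5, k4 / k2 ** 2


if __name__ == "__main__":
    exp_moments = [1, 2, 6, 24]  # raw moments of the unit exponential distribution
    print(cumulants_from_moments(exp_moments))
    print(skewness_kurtosis(exp_moments))
```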

Incomplete moments
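The r-th incomplete moment of X is given by
$$\mu_r^{*}(t) = E\left(X^r \mathbf{1}_{\{X \le t\}}\right) = \int_{-\infty}^{t} x^r f(x;\lambda,\delta,\phi)\,dx,$$
where $\mathbf{1}_A$ denotes the indicator function over the event A. Using the series expansion given by (8), we also have
$$\mu_r^{*}(t) = \sum_{u=0}^{+\infty} d_{u+1} \int_{-\infty}^{t} x^r w_{u+1}(x;\phi)\,dx.$$
From the incomplete moments, several important mathematical quantities related to the BCG-G family can be expressed. Some of them are presented below.
The mean deviation about the mean is given by
$$E(|X-\mu_1|) = 2\mu_1 F(\mu_1;\lambda,\delta,\phi) - 2\mu_1^{*}(\mu_1).$$
The mean deviation about the median is given by
$$E(|X-M|) = \mu_1 - 2\mu_1^{*}(M).$$
These two mean deviations can be used as measures of the degree of scatter of X. The Bonferroni curve is given by
$$B(p) = \frac{\mu_1^{*}(Q(p;\lambda,\delta,\phi))}{p\,\mu_1}, \qquad p \in (0,1).$$
The Lorenz curve is given by
$$L(p) = \frac{\mu_1^{*}(Q(p;\lambda,\delta,\phi))}{\mu_1}, \qquad p \in (0,1).$$
These curves find applications in various areas such as econometrics, finance, medicine, demography and insurance. We refer to Sarabia (2008). Finally, one can also mention the r-th moment of the reversed residual life for an integer r. Further details and applications on this mathematical object can be found in Nanda et al. (2003). It is defined by
$$E\left[(t-X)^r \mid X \le t\right] = \frac{1}{F(t;\lambda,\delta,\phi)} \int_{-\infty}^{t} (t-x)^r f(x;\lambda,\delta,\phi)\,dx.$$
Using the binomial theorem, one can express it as
$$E\left[(t-X)^r \mid X \le t\right] = \frac{1}{F(t;\lambda,\delta,\phi)} \sum_{k=0}^{r} \binom{r}{k} (-1)^k t^{r-k} \mu_k^{*}(t).$$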

Moment generating function
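The moment generating function of X is given by
$$M(t;\lambda,\delta,\phi) = E\left(e^{tX}\right) = \int_{-\infty}^{+\infty} e^{tx} f(x;\lambda,\delta,\phi)\,dx.$$
Using the series expansion given by (8), we also have
$$M(t;\lambda,\delta,\phi) = \sum_{u=0}^{+\infty} d_{u+1} \int_{-\infty}^{+\infty} e^{tx} w_{u+1}(x;\phi)\,dx.$$
Again, the integral term can be evaluated numerically for a given cdf G(x; φ). We can recover the r-th moment of X from M(t; λ, δ, φ) by using the formula $\mu_r = \partial^r M(t;\lambda,\delta,\phi)/\partial t^r \,|_{t=0}$.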

Rényi entropy
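The Rényi entropy introduced by Rényi (1961) is a useful measure of variation of the uncertainty used in many areas such as engineering, quantum information and ecology. This subsection is devoted to the Rényi entropy of the BCG-G family. Let υ > 0 with υ ≠ 1. Then the Rényi entropy of the BCG-G family is given by
$$\frac{1}{1-\upsilon} \log\left[\int_{-\infty}^{+\infty} f(x;\lambda,\delta,\phi)^{\upsilon}\,dx\right].$$
We have
$$f(x;\lambda,\delta,\phi)^{\upsilon} = \frac{1}{\Gamma(\delta)^{\upsilon}}\, g(x;\phi)^{\upsilon}\, G(x;\phi)^{-\upsilon(\lambda+1)} \left\{\frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right\}^{\upsilon(\delta-1)} \exp\left\{-\upsilon\,\frac{G(x;\phi)^{-\lambda}-1}{\lambda}\right\}.$$
Using the exponential series decomposition, we obtain By the general binomial theorem, we obtain Putting this series expansion in (11), we can express I(υ) as The last integral term can be expressed as Numerical evaluation of this integral is feasible.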

Order statistics
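The order statistics are useful in statistics and probability theory. Here we aim to give tractable expressions for the pdfs of the order statistics, as well as their moments, in the context of the BCG-G family. Let $X_1, \ldots, X_n$ be n independent and identically distributed random variables having the BCG-G pdf. Then the pdf of the i-th order statistic of $X_1, \ldots, X_n$ is given by
$$f_{i:n}(x;\lambda,\delta,\phi) = \frac{n!}{(i-1)!\,(n-i)!} \sum_{j=0}^{n-i} (-1)^j \binom{n-i}{j} f(x;\lambda,\delta,\phi)\, F(x;\lambda,\delta,\phi)^{j+i-1}.$$
It follows from the series expansions for F(x; λ, δ, φ) and f(x; λ, δ, φ) given by (7) and (8) respectively that
$$f_{i:n}(x;\lambda,\delta,\phi) = \frac{n!}{(i-1)!\,(n-i)!} \sum_{j=0}^{n-i} (-1)^j \binom{n-i}{j} \left[\sum_{u=0}^{+\infty} d_{u+1} w_{u+1}(x;\phi)\right] \left[\sum_{v=0}^{+\infty} d_v W_v(x;\phi)\right]^{j+i-1}. \qquad (12)$$
Since $W_u(x;\phi) = G(x;\phi)^u$, by virtue of a result by Gradshteyn and Ryzhik (2000), we have the following equality:
$$\left[\sum_{v=0}^{+\infty} d_v G(x;\phi)^v\right]^{j+i-1} = \sum_{k=0}^{+\infty} \xi_k\, G(x;\phi)^k,$$
with $\xi_k$ defined by the following recursive formula: $\xi_0 = d_0^{\,j+i-1}$ and, for k ≥ 1,
$$\xi_k = \frac{1}{k\,d_0} \sum_{m=1}^{k} \left[m(j+i) - k\right] d_m\, \xi_{k-m}.$$
Now equation (12) becomes
$$f_{i:n}(x;\lambda,\delta,\phi) = \frac{n!}{(i-1)!\,(n-i)!} \sum_{j=0}^{n-i} \sum_{u=0}^{+\infty} \sum_{k=0}^{+\infty} (-1)^j \binom{n-i}{j} \frac{u+1}{u+k+1}\, d_{u+1}\, \xi_k\, w_{u+k+1}(x;\phi). \qquad (13)$$
Let $X_{i:n}$ be the i-th order statistic of $X_1, \ldots, X_n$. Then, using (13), the r-th moment of $X_{i:n}$ is given by
$$E(X_{i:n}^r) = \frac{n!}{(i-1)!\,(n-i)!} \sum_{j=0}^{n-i} \sum_{u=0}^{+\infty} \sum_{k=0}^{+\infty} (-1)^j \binom{n-i}{j} \frac{u+1}{u+k+1}\, d_{u+1}\, \xi_k \int_{-\infty}^{+\infty} x^r w_{u+k+1}(x;\phi)\,dx.$$
The last integral term can be computed numerically for most of the considered cdfs G(x; φ).
Proceeding as in the subsections above, one can also express other mathematical quantities, such as the incomplete moments and the moment generating function of $X_{i:n}$.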
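The power of a power series used in this subsection can be computed with a short recursion. The sketch below implements the standard formula for the coefficients of (Σ_u d_u x^u)^n, namely ξ_0 = d_0^n and ξ_k = (k d_0)^(−1) Σ_{m=1}^{k} [m(n + 1) − k] d_m ξ_{k−m}, which with n = j + i − 1 gives the weights m(j + i) − k. The function name and the truncation level `K` are assumptions, and the recursion requires d_0 ≠ 0.

```python
def power_series_power(d, n, K):
    """Coefficients xi_0..xi_K of (sum_u d[u] * x**u) ** n, assuming d[0] != 0.

    Coefficients of d beyond len(d) - 1 are treated as zero.
    """
    xi = [d[0] ** n]
    for k in range(1, K + 1):
        acc = 0.0
        for m in range(1, k + 1):
            d_m = d[m] if m < len(d) else 0.0
            acc += (m * (n + 1) - k) * d_m * xi[k - m]
        xi.append(acc / (k * d[0]))
    return xi


if __name__ == "__main__":
    # Check case: (1 + x)^3 = 1 + 3x + 3x^2 + x^3.
    print(power_series_power([1.0, 1.0], 3, 4))
```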

Maximum likelihood: general formula
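In this section, we investigate the estimation of the parameters of the BCG-G model by the method of maximum likelihood. Let $x_1, \ldots, x_n$ be observations of n independent and identically distributed random variables having the BCG-G pdf. The log-likelihood function is given by
$$\ell(\lambda,\delta,\phi) = -n\log\Gamma(\delta) + \sum_{i=1}^{n} \log g(x_i;\phi) - (\lambda+1)\sum_{i=1}^{n} \log G(x_i;\phi) + (\delta-1)\sum_{i=1}^{n} \log\left\{\frac{G(x_i;\phi)^{-\lambda}-1}{\lambda}\right\} - \frac{1}{\lambda}\sum_{i=1}^{n} \left[G(x_i;\phi)^{-\lambda}-1\right].$$
The maximum likelihood estimates (MLEs) for λ, δ and φ are the real numbers $\hat{\lambda}$, $\hat{\delta}$ and $\hat{\phi}$ such that $\ell(\hat{\lambda},\hat{\delta},\hat{\phi})$ is maximal. They are simultaneous solutions of the three nonlinear equations: ∂ℓ(λ, δ, φ)/∂λ = 0, ∂ℓ(λ, δ, φ)/∂δ = 0 and ∂ℓ(λ, δ, φ)/∂φ = 0, with, writing $G_i = G(x_i;\phi)$ and $g_i = g(x_i;\phi)$,
$$\frac{\partial \ell}{\partial \lambda} = -\sum_{i=1}^{n} \log G_i - (\delta-1)\sum_{i=1}^{n} \left\{\frac{G_i^{-\lambda}\log G_i}{G_i^{-\lambda}-1} + \frac{1}{\lambda}\right\} + \frac{1}{\lambda^2}\sum_{i=1}^{n} \left\{\lambda G_i^{-\lambda}\log G_i + G_i^{-\lambda} - 1\right\}, \qquad (14)$$
$$\frac{\partial \ell}{\partial \delta} = -n\,\psi(\delta) + \sum_{i=1}^{n} \log\left\{\frac{G_i^{-\lambda}-1}{\lambda}\right\}, \qquad (15)$$
where ψ(δ) = Γ'(δ)/Γ(δ) denotes the digamma function, and
$$\frac{\partial \ell}{\partial \phi} = \sum_{i=1}^{n} \frac{\partial g_i/\partial\phi}{g_i} - (\lambda+1)\sum_{i=1}^{n} \frac{\partial G_i/\partial\phi}{G_i} + \sum_{i=1}^{n} G_i^{-\lambda-1}\,\frac{\partial G_i}{\partial\phi}\left[1 - \frac{\lambda(\delta-1)}{G_i^{-\lambda}-1}\right]. \qquad (16)$$
Naturally, $\hat{\lambda}$, $\hat{\delta}$ and $\hat{\phi}$ can be determined numerically via statistical software such as R. Under specific regularity conditions, their random versions are asymptotically unbiased and asymptotically normal. This allows the construction of confidence intervals (Wald intervals, etc.), hypothesis tests (likelihood-ratio tests, etc.) and various measures of goodness of fit. This aspect will be developed for a special cdf G(x; φ) in Section 5.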

Box-Cox Gamma-half-Cauchy distribution
Among all the distributions belonging to the BCG-G family, we now focus on the one defined with the half-Cauchy distribution as baseline. The reasons are threefold: (i) The half-Cauchy distribution is a simple, heavy-tailed distribution that is highly skewed to the right. (ii) The few existing generalizations of the half-Cauchy distribution give models that demonstrate nice goodness of fit properties (see Alzaatreh et al. (2016), and the references therein). (iii) Since the parameter λ has a great influence on the BCG-G pdf and hrf in the neighborhood of x = 0, one can expect to add more flexibility to the left tail of the half-Cauchy distribution (with success, as we shall see in Section 5).
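Let θ > 0. The cdf of the half-Cauchy distribution with parameter θ is given by
$$G(x;\theta) = \frac{2}{\pi}\arctan\left(\frac{x}{\theta}\right), \qquad x > 0.$$
The associated pdf is given by
$$g(x;\theta) = \frac{2}{\pi\theta}\,\frac{1}{1+(x/\theta)^2}, \qquad x > 0.$$
Putting these expressions into the BCG-G cdf given by (2), we obtain the cdf given by
$$F(x;\lambda,\delta,\theta) = 1 - \frac{1}{\Gamma(\delta)}\,\gamma\left(\delta, \frac{\left[\frac{2}{\pi}\arctan(x/\theta)\right]^{-\lambda}-1}{\lambda}\right), \qquad x > 0. \qquad (17)$$
The related distribution is called the Box-Cox Gamma-half-Cauchy distribution (BCG-HC for short). Then all the general mathematical properties presented in Section 3 can be applied to this special case (with φ = θ and a quantile function $Q_G(y;\theta)$ that will be presented later).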
In the following, we present and discuss the most useful mathematical properties of this new distribution.
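The associated pdf is given by
$$f(x;\lambda,\delta,\theta) = \frac{2}{\pi\theta\,\Gamma(\delta)}\,\frac{1}{1+(x/\theta)^2} \left[\frac{2}{\pi}\arctan\left(\frac{x}{\theta}\right)\right]^{-\lambda-1} \left\{\frac{\left[\frac{2}{\pi}\arctan(x/\theta)\right]^{-\lambda}-1}{\lambda}\right\}^{\delta-1} \exp\left\{-\frac{\left[\frac{2}{\pi}\arctan(x/\theta)\right]^{-\lambda}-1}{\lambda}\right\}, \qquad x > 0. \qquad (18)$$
The associated hrf is given by
$$h(x;\lambda,\delta,\theta) = \frac{f(x;\lambda,\delta,\theta)}{1 - F(x;\lambda,\delta,\theta)}, \qquad x > 0,$$
with F(x; λ, δ, θ) given by (17). Let us now investigate the asymptotic behavior of the BCG-HC pdf only. When x → 0, we have g(x; θ) ∼ 2/(πθ) and G(x; θ) ∼ (2/(πθ))x. Then, when x → 0, we have When x → +∞, we have $g(x;\theta) \sim (2\theta/\pi)(1/x^2)$ and G(x; θ) = 1 − (2/π) arctan(θ/x) ∼ 1 − (2θ/π)(1/x). Hence, when x → +∞, we have
$$f(x;\lambda,\delta,\theta) \sim \frac{1}{\Gamma(\delta)}\left(\frac{2\theta}{\pi}\right)^{\delta} \frac{1}{x^{\delta+1}}.$$
In Figure 1, we have plotted the BCG-HC pdfs and hrfs for selected values of λ, δ and θ. We see various shapes, with varying degrees of bell-shapedness and right-skewness. In some cases, a light left tail can be observed. These features are desirable for constructing flexible models for a wide variety of lifetime data. Let us notice that the quantile function of the half-Cauchy distribution with parameter θ is given by $Q_G(y;\theta) = \theta\tan\left(\frac{\pi}{2}y\right)$.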
Thus we have the following immediate characterization. Let Y be a random variable following the Gamma distribution with parameters 1 and δ. Then the random variable $X = \theta\tan\left\{\frac{\pi}{2}\left[1+\lambda Y\right]^{-1/\lambda}\right\}$ follows the BCG-HC distribution.
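This characterization gives a direct way to simulate BCG-HC values. The sketch below is illustrative: it reads "Gamma distribution with parameters 1 and δ" as scale 1 and shape δ, and the function names are assumptions. The closed-form cdf used for checking holds only for δ = 2, where 1 − γ(2, u)/Γ(2) = e^(−u)(1 + u).

```python
import math
import random


def rbcg_hc(n, lam, delta, theta, rng):
    """Draw n values via X = theta * tan((pi/2) * (1 + lam * Y)**(-1/lam)), Y ~ Gamma(shape=delta, scale=1)."""
    sample = []
    for _ in range(n):
        y = rng.gammavariate(delta, 1.0)  # shape delta, scale 1
        sample.append(theta * math.tan(0.5 * math.pi * (1.0 + lam * y) ** (-1.0 / lam)))
    return sample


def bcg_hc_cdf_delta2(x, lam, theta):
    """BCG-HC cdf in the special case delta = 2 only (closed form of the incomplete gamma)."""
    G = (2.0 / math.pi) * math.atan(x / theta)
    u = (G ** (-lam) - 1.0) / lam
    return math.exp(-u) * (1.0 + u)


if __name__ == "__main__":
    rng = random.Random(12345)
    xs = rbcg_hc(20000, 1.0, 2.0, 1.0, rng)
    for x0 in (0.5, 1.0, 2.0):
        empirical = sum(x <= x0 for x in xs) / len(xs)
        print(x0, empirical, bcg_hc_cdf_delta2(x0, 1.0, 1.0))
```

Comparing the empirical proportions with the closed-form cdf gives a quick sanity check of the generator.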
From this quantile function, we can express the Galton skewness S defined by (5) and the Moors kurtosis K defined by (6). Figure 2 presents the plots of these two measures for θ = 2.5, λ ∈ (1, 5) and δ ∈ (1, 5). We see that the skewness S increases when δ increases, with a magnitude that varies according to λ. Varying shapes are observed for the kurtosis K.
Figure 2: Plots of (a) the Galton skewness S and (b) the Moors kurtosis K for the BCG-HC distribution with parameters θ = 2.5, λ ∈ (1, 5) and δ ∈ (1, 5).
A random variable X following the BCG-HC distribution does not have moments of all orders. More precisely, $\mu_r$ exists if and only if we have δ > r. Indeed, when x → +∞, we have
$$x^r f(x;\lambda,\delta,\theta) \sim \frac{1}{\Gamma(\delta)}\left(\frac{2\theta}{\pi}\right)^{\delta} \frac{1}{x^{\delta+1-r}},$$
which is integrable in the neighborhood of +∞ (as a Riemann integral) if and only if δ > r (there is no problem of convergence for $x^r f(x;\lambda,\delta,\theta)$ at x = 0). Under this condition, the r-th moments are given by (9). As consequences, if δ > 2, the variance exists and, if δ > 4, the skewness $\gamma_1$ and the kurtosis $\gamma_2$ given by (10) exist. Table 2 provides a numerical evaluation of these quantities for selected values for λ, δ and θ.
The same numerical approach can be performed to compute the r-th incomplete moments with a given value for t, the moment generating function (defined with t < 0), the Rényi entropy, the moments of the order statistics and the maximum likelihood estimates (as done in Subsection 5.3 for two practical data sets).

Statistical inference and data analysis with the BCG-HC model
The BCG-HC distribution with parameters λ, δ and θ, characterized by the cdf (17) (and having the pdf (18)), can be considered as a parametric model. Statistical inference and applications of the BCG-HC model are explored in this section.

Maximum likelihood method
The MLEs $\hat{\lambda}$, $\hat{\delta}$ and $\hat{\theta}$ of the parameters λ, δ and θ respectively can be obtained by solving the nonlinear equations (14), (15) and (16) with These estimates will be considered in the rest of the study.
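In practice the MLEs are often obtained by maximizing the log-likelihood directly rather than solving the score equations. The sketch below evaluates the BCG-HC log-likelihood, assuming the pdf has the form g G^(−λ−1) u^(δ−1) e^(−u)/Γ(δ) with u = [G^(−λ) − 1]/λ and the half-Cauchy baseline G(x; θ) = (2/π) arctan(x/θ). The function name and the data values are hypothetical; the maximization itself would be delegated to a numerical optimizer, which is not shown.

```python
import math


def bcg_hc_loglik(lam, delta, theta, data):
    """Log-likelihood of the BCG-HC model at (lam, delta, theta) for positive observations."""
    if lam <= 0.0 or delta <= 0.0 or theta <= 0.0:
        return -math.inf  # outside the parameter space
    total = -len(data) * math.lgamma(delta)
    for x in data:
        G = (2.0 / math.pi) * math.atan(x / theta)           # half-Cauchy cdf
        g = 2.0 * theta / (math.pi * (theta ** 2 + x ** 2))  # half-Cauchy pdf
        u = (G ** (-lam) - 1.0) / lam                        # Box-Cox argument
        total += (math.log(g) - (lam + 1.0) * math.log(G)
                  + (delta - 1.0) * math.log(u) - u)
    return total


if __name__ == "__main__":
    data = [0.8, 1.3, 2.1, 2.9, 4.4]  # made-up observations for illustration only
    print(bcg_hc_loglik(1.0, 2.0, 1.5, data))
```

Returning −∞ outside the parameter space lets a generic optimizer reject invalid parameter values without a separate constraint mechanism.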

Simulation study
Here we provide a numerical evaluation of the performance of the MLEs $\hat{\lambda}$, $\hat{\delta}$ and $\hat{\theta}$ in the estimation of λ, δ and θ respectively via a graphical (Monte Carlo) simulation study. The R program is used. We generate N = 3000 samples of size n = 5, 10, 20, 40, . . . , 140 from the BCG-HC distribution with the following parameter values: λ = 3.5, δ = 5 and θ = 2. Then, for h ∈ {λ, δ, θ}, denoting by $\hat{h}_i$ the MLE of h obtained from the i-th sample, we calculate
• the empirical bias of the MLEs defined by $\mathrm{Bias}_h(n) = \frac{1}{N}\sum_{i=1}^{N} (\hat{h}_i - h)$,
• the empirical mean square error (MSE) of the MLEs defined by $\mathrm{MSE}_h(n) = \frac{1}{N}\sum_{i=1}^{N} (\hat{h}_i - h)^2$.
The results of this simulation study can be viewed in Figure 3 for the empirical bias and in Figure 4 for the empirical MSE. From these figures, we observe that, when the sample size increases, the empirical biases and MSEs approach 0 in all cases, which is consistent with the theoretical properties of the MLEs.
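The bookkeeping of such a Monte Carlo study can be sketched as follows, with the empirical bias and MSE computed as averages of (estimate − true value) and its square over the replications. To keep the example short and dependency-free, the estimator is deliberately not the MLE of the BCG-HC model: the sample median of half-Cauchy data is used as a stand-in estimator of the scale θ (the half-Cauchy median equals θ). All function names are assumptions.

```python
import math
import random
import statistics


def empirical_bias(estimates, true_value):
    """Average of (estimate - true value) over the replications."""
    return sum(e - true_value for e in estimates) / len(estimates)


def empirical_mse(estimates, true_value):
    """Average of (estimate - true value)^2 over the replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def half_cauchy_sample(n, theta, rng):
    """Inverse-transform sampling with the half-Cauchy quantile theta * tan(pi * p / 2)."""
    return [theta * math.tan(0.5 * math.pi * rng.random()) for _ in range(n)]


def simulate(n, n_rep, theta, rng):
    """Bias and MSE of the sample median (stand-in estimator of theta) over n_rep samples of size n."""
    estimates = [statistics.median(half_cauchy_sample(n, theta, rng)) for _ in range(n_rep)]
    return empirical_bias(estimates, theta), empirical_mse(estimates, theta)


if __name__ == "__main__":
    rng = random.Random(2024)
    for n in (5, 10, 20, 40, 80, 140):
        bias, mse = simulate(n, 3000, 2.0, rng)
        print(f"n = {n:3d}  bias = {bias: .4f}  MSE = {mse:.4f}")
```

Replacing the median by a routine returning the MLEs for each simulated sample would reproduce the design described above.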

Data analysis
In this subsection, we illustrate empirically the flexibility of the BCG-HC distribution by means of two practical data sets. The BCG-HC distribution will be compared with some competitive models listed in Table 3. We compare the fitted distributions by using the following usual goodness of fit measures: $-\hat{\ell}$ (where $\hat{\ell}$ denotes the maximized log-likelihood), AIC (Akaike information criterion), BIC (Bayesian information criterion), CVM (Cramér-von Mises), AD (Anderson-Darling) and KS (Kolmogorov-Smirnov, with its p-value (PV)) statistics.
The two considered data sets are described below.
To show the uniqueness of the obtained MLEs for the BCG-HC model, we provide the profile plots of the log-likelihood function of λ, δ and θ in Figure 5. Table 6 gives the confidence intervals of the parameters of the BCG-HC model for Data set 1 at the levels 95% and 99%. Table 7 provides the values of goodness of fit measures for the BCG-HC model and other fitted models. Based on these numerical results, we see that the BCG-HC model provides a better fit to Data set 1 than the competitors (smallest values of AIC, BIC, KS, etc.).
The PP, QQ, epdf and ecdf plots of the BCG-HC are shown in Figure 6. The Box plot and the Kaplan Meier survival plot are presented in Figure 7. The nice fits of the different estimated curves indicate that the BCG-HC model yields the best fit to Data set 1.
Let us now focus on the analysis of Data set 2 with the same tools as for Data set 1. Table 8 lists the MLEs and their corresponding standard errors (SEs) (in parentheses) for the BCG-HC model and other fitted models.

The uniqueness of the obtained MLEs for the BCG-HC model is observed in Figure 8. Table 9 gives the confidence intervals of the parameters of the BCG-HC model for Data set 2 at the levels 95% and 99%. Table 10 provides the values of goodness of fit measures for the BCG-HC model and other fitted models. Based on these numerical results, we see that the BCG-HC model provides a better fit to Data set 2 than the competitors.

Concluding remarks
In this paper, a new family of distributions called the BCG-G family is studied. This new family can be viewed as a generalization of the RB-G family of distributions introduced by Ristić and Balakrishnan (2012) by the use of the Box-Cox transformation. The mathematical and practical properties of the BCG-G family are investigated in detail. A member of the BCG-G family using the half-Cauchy distribution as baseline distribution is introduced and considered as a statistical model. It is called the BCG-HC distribution. The maximum likelihood estimation of the parameters of the BCG-HC model is discussed. In terms of model adequacy, we show that it leads to a better goodness of fit than some serious competitors from the literature.