The Transmuted Odd Fréchet-G Family of Distributions: Theory and Applications

Abstract: In recent years, the odd Fréchet-G family has been applied successfully in various statistical settings. Its popularity can be explained by its simple and flexible exponential-odd structure, which differs markedly from that of other existing families and requires only one additional parameter. On the other hand, some of its statistical properties lack adaptivity, in the sense that they depend strongly on the choice of the baseline distribution. Hence, efforts have been made to relax this dependence by investigating extensions or generalizations of the odd transformation at the heart of the construction of this family, also with the aim of reaching new perspectives of application. This study explores another possibility, based on the transformation of the whole cumulative distribution function of this family (while keeping the odd transformation intact), through the use of the quadratic rank transmutation that has proven itself in other contexts. We thus introduce and study a new family of flexible distributions called the transmuted odd Fréchet-G family. We show how the former odd Fréchet-G family is enriched by the proposed transformation through theoretical and practical results. We emphasize the special distribution based on the standard exponential distribution because of its desirable features for statistical modeling. In particular, different kinds of monotonic and nonmonotonic shapes for the probability density and hazard rate functions are observed. Then, we show how the new family can be used in practice. We discuss in detail the parametric estimation of a special model, along with a simulation study. Practical data sets are handled with quite favorable results for the new modeling strategy.


Introduction
Nowadays, there is still a need for statistical models capable of extracting as much information as possible from data, so that the data can be communicated and put to use. This is particularly the case in engineering, economics, biological studies and environmental sciences. For this reason, several generations of statisticians have concentrated their efforts on improving the desirable properties of the probability distributions underlying these models, through various kinds of extensions or generalizations. In this regard, sophisticated mathematical modifications have emerged, with practical use encouraged by modern computing developments. A classical strategy consists in adding scale or shape parameter(s), possibly through the use of special functions (beta, gamma, hypergeometric, etc.), with the aim of making the former distribution more pliant on some important modeling aspects (mean, variance, tails of the distribution, skewness, kurtosis, etc.). Thus, new families of continuous distributions were proposed, including those developed in the following short list of references: [1][2][3][4][5][6][7][8][9][10].
In this study, a hybrid family of continuous distributions is constructed on the basis of the so-called transmuted-G and odd Fréchet-G families. The main motivations behind this family are presented below. First of all, the transmuted-G (T-G) family by [9] is defined by the cumulative distribution function (cdf) and probability density function (pdf) given by

$$F(x; \lambda, \zeta) = (1 + \lambda) H(x; \zeta) - \lambda H(x; \zeta)^2 \tag{1}$$

and

$$f(x; \lambda, \zeta) = h(x; \zeta) \left[ 1 + \lambda - 2 \lambda H(x; \zeta) \right], \tag{2}$$

respectively, where |λ| ≤ 1 (allowing negative values for λ), H(x; ζ) and h(x; ζ) are the cdf and pdf of a baseline continuous distribution, respectively, with ζ as parameter vector. The definition of F(x; λ, ζ) is based on the concept of quadratic rank transmutation as described in [9]. As a first remark, one can notice that the cdf of the T-G family can be written as a two-component linear combination: one component is the baseline cdf (obtained for λ = 0) and the other is the exponentiated-G cdf (see [5]) with power parameter two (obtained for λ = −1). Numerous studies have shown that the simple polynomial structure behind the T-G family can improve the desirable characteristics of the baseline distribution and make the choice of the baseline distribution less decisive (see [11] (Introduction), and the references therein). In addition, the T-G family serves to generalize or extend other existing families. For notable studies in this regard, we refer the reader to the transmuted exponentiated generalized-G by [12], new transmuted-G family by [13], generalized transmuted-G family by [14], transmuted Weibull-G family by [15], transmuted odd Lindley-G family by [16], generalized transmuted-G family by [17], transmuted Gompertz-G family by [18], T transmuted-X family by [19], transmuted transmuted-G family by [20] and transmuted generalized odd generalized exponential-G family by [21], among others.

In parallel with these modern transmuted-G families, [7] proposed the odd Fréchet-G (OFr-G) family, a new and simple family using the Fréchet distribution as main generator. More precisely, it is based on the cdf and pdf given by

$$Q(x; \theta, \zeta) = \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} \tag{3}$$

and

$$q(x; \theta, \zeta) = \frac{\theta \, g(x; \zeta) \, [1 - G(x; \zeta)]^{\theta - 1}}{G(x; \zeta)^{\theta + 1}} \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\}, \tag{4}$$

respectively, where θ > 0 (a shape parameter), G(x; ζ) and g(x; ζ) are the cdf and pdf of a baseline continuous distribution with ζ as parameter vector, respectively. It is shown in [7] that the OFr-G family is easily applicable for modeling purposes. See also [22], where a special member of the OFr-G family, called the odd Fréchet inverse exponential distribution, is applied with success. This was also discussed in several notable extensions and generalizations, as in [23] introducing the extended odd Fréchet family, [24] developing the Fréchet Topp Leone-G family, [25] for the generalized odd inverted-exponential-G family and [26] introducing the extended odd Fréchet-G family. However, all these families are based on thorough transformations of the odd function: none of them investigates a simple and direct modification of Q(x; θ, ζ). As argued in the previous paragraph about the T-G family, a well-motivated idea is to investigate the tunable quadratic rank transmutation. To the best of our knowledge, this direction of work remains new and promising in view of the respective qualities of the T-G and OFr-G families. We thus introduce the transmuted odd Fréchet-G (TOFr-G) family defined with the cdf and pdf given by Equations (1) and (2) with H(x; ζ) as Equation (3) and h(x; ζ) as Equation (4), i.e.,

$$F(x; \lambda, \theta, \zeta) = (1 + \lambda) \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} - \lambda \exp\left\{ -2 \left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} \tag{5}$$

and

$$f(x; \lambda, \theta, \zeta) = \frac{\theta \, g(x; \zeta) \, [1 - G(x; \zeta)]^{\theta - 1}}{G(x; \zeta)^{\theta + 1}} \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} \left[ 1 + \lambda - 2 \lambda \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} \right], \tag{6}$$

respectively, where the notations of the previous paragraphs have been used. The motivation behind the TOFr-G family is to improve the overall adaptability of the former OFr-G family through the use of the quadratic rank transmutation and, more specifically, the tuning of the additional parameter λ (the OFr-G family being obtained with λ = 0). In addition, this modification makes the choice of the baseline distribution less crucial; globally, the joint action of λ and θ in Equations (5) and (6) ensures a high level of flexibility for important distributional characteristics, such as the mode(s), skewness, kurtosis, mean and variance. We illustrate this aspect by discussing in detail a special three-parameter distribution of the family defined with the (standard one-parameter) exponential model as baseline.
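The construction of the family can be made concrete with a few lines of code. The sketch below is only illustrative: it assumes the TOFr-G cdf has the form (1 + λ)Q − λQ², where Q = exp{−[(1 − G)/G]^θ} is the OFr-G cdf, and it takes the baseline cdf and pdf as callables. The function names (`tofr_g_cdf`, `tofr_g_pdf`, `G_exp`, `g_exp`) and the rate value are hypothetical choices, and no special handling of extreme arguments is attempted.

```python
import math

def tofr_g_cdf(x, lam, theta, G):
    """cdf of the TOFr-G family; G is the baseline cdf (a callable)."""
    Gx = G(x)
    if Gx <= 0.0:
        return 0.0
    if Gx >= 1.0:
        return 1.0
    q = math.exp(-(((1.0 - Gx) / Gx) ** theta))  # odd Frechet-G cdf
    return (1.0 + lam) * q - lam * q * q          # quadratic rank transmutation

def tofr_g_pdf(x, lam, theta, G, g):
    """pdf of the TOFr-G family; g is the baseline pdf (a callable)."""
    Gx = G(x)
    if Gx <= 0.0 or Gx >= 1.0:
        return 0.0
    odds = (1.0 - Gx) / Gx
    q = math.exp(-(odds ** theta))
    ofr_pdf = theta * g(x) * (1.0 - Gx) ** (theta - 1.0) / Gx ** (theta + 1.0) * q
    return ofr_pdf * (1.0 + lam - 2.0 * lam * q)

# Illustrative baseline: exponential distribution with an assumed rate.
BETA = 1.5

def G_exp(x):
    return 1.0 - math.exp(-BETA * x) if x > 0 else 0.0

def g_exp(x):
    return BETA * math.exp(-BETA * x) if x > 0 else 0.0

if __name__ == "__main__":
    for x in (0.2, 0.8, 2.0):
        print(x, tofr_g_cdf(x, 0.5, 1.2, G_exp), tofr_g_pdf(x, 0.5, 1.2, G_exp, g_exp))
```

Setting `lam = 0` in either function returns the OFr-G value, which is a quick way to check an implementation against the parent family.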
A graphical analysis reveals that the corresponding probability density and hazard rate functions possess a wide range of monotonic and nonmonotonic shapes, making the distribution attractive for data fitting, among other uses. Additionally, by considering real data sets of interest, we show that the corresponding model fits better than the transmuted linear exponential distribution developed by [27], the new generalized linear exponential distribution proposed by [28], the standard Fréchet model and the standard exponential model. The gain in terms of statistical modeling is significant.
The rest of the study is structured as follows. In Section 2, we complete the presentation of the TOFr-G family by mentioning other important functions of interest, and some special members including the one based on the standard exponential distribution. The mathematical properties of the TOFr-G family are investigated in Section 3, deriving some useful representations, measures and functions. Turning to the TOFr-G family as a source of statistical models, the parametric estimation of the models is discussed via the maximum likelihood method in Section 4, with a simulation study assessing their numerical performance. In Section 5, three practical data sets are analyzed, showing how useful the TOFr-G models can be. Some conclusions are provided in Section 6.

Some Complements on the TOFr-G Family
Here, some functions of the TOFr-G family are described, with discussions.

Other Functions of Interest
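We now present some functions of the TOFr-G family having several applications in probability and statistics. First of all, the survival function of the TOFr-G family is given by

$$S(x; \lambda, \theta, \zeta) = 1 - F(x; \lambda, \theta, \zeta) = 1 - (1 + \lambda) \exp\left\{ -\left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\} + \lambda \exp\left\{ -2 \left[ \frac{1 - G(x; \zeta)}{G(x; \zeta)} \right]^{\theta} \right\}.$$

In addition, the cumulative hazard function of the TOFr-G family is given as

$$\Omega(x; \lambda, \theta, \zeta) = -\log\left[ S(x; \lambda, \theta, \zeta) \right].$$

Finally, the hazard rate function (hrf) of the TOFr-G family is

$$h(x; \lambda, \theta, \zeta) = \frac{f(x; \lambda, \theta, \zeta)}{S(x; \lambda, \theta, \zeta)}. \tag{7}$$

When the baseline distribution is a lifetime distribution, i.e., with support on (0, +∞), these functions are particularly meaningful in survival and hazard analyses. See [29] for instance.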
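As an illustration of these functions, the following sketch evaluates the survival, cumulative hazard and hazard rate functions for the exponential baseline G(x) = 1 − exp(−βx). It assumes the TOFr-G cdf (1 + λ)q − λq² with q = exp(−w) and w = (e^{βx} − 1)^{−θ}; the function names are hypothetical. The survival function is computed through the factorization 1 − (1 + λ)q + λq² = (1 − q)(1 − λq), which avoids cancellation in the right tail.

```python
import math

def _w(x, theta, beta):
    # ((1 - G)/G)^theta for the exponential baseline G(x) = 1 - exp(-beta x)
    return math.expm1(beta * x) ** (-theta)

def tofre_cdf(x, lam, theta, beta):
    if x <= 0.0:
        return 0.0
    q = math.exp(-_w(x, theta, beta))
    return (1.0 + lam) * q - lam * q * q

def tofre_pdf(x, lam, theta, beta):
    if x <= 0.0:
        return 0.0
    q = math.exp(-_w(x, theta, beta))
    # theta * g * (1 - G)^(theta - 1) / G^(theta + 1) for the exponential baseline
    base = theta * beta * math.exp(-beta * theta * x) \
        * (-math.expm1(-beta * x)) ** (-theta - 1.0)
    return base * q * (1.0 + lam - 2.0 * lam * q)

def tofre_sf(x, lam, theta, beta):
    """Survival function, using 1 - F = (1 - q)(1 - lam q)."""
    if x <= 0.0:
        return 1.0
    w = _w(x, theta, beta)
    return -math.expm1(-w) * (1.0 - lam * math.exp(-w))

def tofre_cumhaz(x, lam, theta, beta):
    """Cumulative hazard function, -log of the survival function."""
    return -math.log(tofre_sf(x, lam, theta, beta))

def tofre_hrf(x, lam, theta, beta):
    """Hazard rate function, pdf divided by survival function."""
    return tofre_pdf(x, lam, theta, beta) / tofre_sf(x, lam, theta, beta)

if __name__ == "__main__":
    for x in (0.25, 1.0, 3.0):
        print(x, tofre_sf(x, 0.5, 1.0, 1.0), tofre_hrf(x, 0.5, 1.0, 1.0))
```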

Notable Members
Here, we introduce three special members of the TOFr-G family. In order to have tractable expressions for Equations (5) and (6), we select the following basic baseline distributions: the exponential, Lindley (see [30,31]) and Lomax (see [32]) distributions. They belong to the family of lifetime distributions. The cdfs and pdfs of these distributions are listed in Table 1, as well as the expression of the following central transformation: [1 − G(x; ζ)]/G(x; ζ).

Table 1. Some examples of baseline lifetime distributions which can be used to define special members of the transmuted odd Fréchet-G (TOFr-G) family.

Transmuted odd Fréchet Lindley (TOFrLi) Distribution: The cdf and pdf of the TOFrLi distribution are obtained by substituting the Lindley cdf and pdf of Table 1 into Equations (5) and (6), respectively, where |λ| ≤ 1 and θ, a > 0. Similarly, the hrf can be expressed by using Equation (7).

On the TOFrE Distribution
Motivated by a preliminary investigation, we chose to focus on the TOFrE distribution. As a first approach, we display some plots of the corresponding pdf and hrf. By fixing two parameters and varying the remaining one, Figures 1 and 2 show some interesting shapes for the pdf and hrf, respectively. For the considered sets of parameters, we see in Figure 1 that λ mainly influences the kurtosis of the distribution. In addition, β impacts the central parameters (mean and mode) and the kurtosis. As for θ, it has a strong effect on the skewness, possibly making the pdf decreasing. Additionally, the following shapes, desirable for modeling purposes, are observed: almost symmetrical, left or right skewed and reverse J shapes. From Figure 2, we deduce a similarly high flexibility for the hrf, showing bathtub, increasing, decreasing and reversed bathtub shapes, which are welcome for a deep analysis of any lifetime data. Due to their importance, these different shapes are highlighted in Figure 3. In particular, the hrf of the TOFrE distribution proves to be more pliant than the hrf of the OFrW distribution, the special distribution of the former OFr-G family based on the Weibull distribution (see [7] (Section 3.1 and Figure 2a)). In this sense, there is a gain in using the quadratic rank transmutation as described in the TOFr-G family (introducing the parameter λ), instead of considering the former OFr-G family with an extended baseline distribution (the Weibull distribution extends the exponential distribution through the addition of a shape parameter).
In the next section, thanks to its singular flexibility, an emphasis will be put on the TOFrE distribution.

Some Results
Here, some mathematical aspects of the TOFr-G family are discussed, and specifically, alternative expressions for the corresponding pdf and cdf, various moments and related functions (incomplete moments, Lorenz curve, etc.).
Henceforth, X denotes a random variable (rv) having the cdf of the TOFr-G family.

Alternative Expression of the Pdf
Here, we establish a linear/series representation for the pdf of the TOFr-G family in terms of pdfs of the exponentiated-G family. As developed in detail in [33], such a representation provides series expansions of important related measures and functions, such as ordinary moments, the moment generating function, incomplete moments and so on. In practice, precise approximations of them can be derived by replacing the infinite limit by any large integer. This remains an acceptable analytical approach, and is less opaque than relying on tools already implemented in mathematical software. Moreover, as mentioned in [33], the use of such series expansions can be more precise than numerical integration techniques.
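Based on Equation (6), f(x; λ, θ, ζ) is expanded in two steps: first, the power series of the exponential function is applied to the exponential terms of the form exp{−a([1 − G(x; ζ)]/G(x; ζ))^θ} for a ∈ {1, 2}; then, the generalized binomial formula is applied to the resulting powers of G(x; ζ) and 1 − G(x; ζ). Hence, f(x; λ, θ, ζ) can be written as a linear combination (Equation (8)) of the functions π_υ(x; ζ) = υ g(x; ζ) G(x; ζ)^{υ−1}, where π_υ(x; ζ) denotes the pdf of the exponentiated-G family (with υ as power parameter), with coefficients depending on λ and θ. Similarly, upon integration over (−∞, x), the cdf of the TOFr-G family can also be expressed as the corresponding linear combination of the functions Π_υ(x; ζ) = G(x; ζ)^υ, where Π_υ(x; ζ) denotes the cdf of the exponentiated-G family (with υ as power parameter). Some applications of the above results will be presented later.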

Quantile Function
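Like the cdf, the quantile function characterizes the distribution. It plays an essential role in many statistical applications. The quantile function of the TOFr-G family, say Q(u; λ, θ, ζ), is defined as the inverse function of F(x; λ, θ, ζ). After some algebra, we establish that, for u ∈ (0, 1) and λ ≠ 0,

$$Q(u; \lambda, \theta, \zeta) = Q_G\left( \left[ 1 + \left\{ -\log\left( \frac{1 + \lambda - \sqrt{(1 + \lambda)^2 - 4 \lambda u}}{2 \lambda} \right) \right\}^{1/\theta} \right]^{-1}; \zeta \right),$$

where Q_G(u; ζ) denotes the quantile function corresponding to the baseline distribution. For λ = 0, the argument of the logarithm is replaced by u.
In addition, the quantile function allows the definition of several shape measures, such as the pioneering Bowley skewness and Moors kurtosis [34,35].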
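The inversion can be sketched numerically for the exponential baseline. The code below assumes the TOFr-G cdf (1 + λ)q − λq² with q the OFr-G cdf, solves the quadratic for q in the algebraically equivalent form q = 2u/[(1 + λ) + sqrt((1 + λ)² − 4λu)] (which also covers λ = 0), and then inverts the odd Fréchet and exponential layers. The function names are hypothetical; the Bowley and Moors measures use their usual quantile-based definitions.

```python
import math

def tofre_cdf(x, lam, theta, beta):
    """cdf of the TOFrE distribution (exponential baseline with rate beta)."""
    if x <= 0.0:
        return 0.0
    q = math.exp(-(math.expm1(beta * x) ** (-theta)))
    return (1.0 + lam) * q - lam * q * q

def tofre_quantile(u, lam, theta, beta):
    """Quantile function of the TOFrE distribution for 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    # Root in (0, 1) of (1 + lam) q - lam q^2 = u; reduces to q = u when lam = 0.
    s = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    q = 2.0 * u / ((1.0 + lam) + s)
    w = (-math.log(q)) ** (1.0 / theta)  # odds (1 - G)/G at the quantile
    return math.log1p(1.0 / w) / beta    # exponential quantile at G = 1/(1 + w)

def bowley_skewness(lam, theta, beta):
    Q = lambda u: tofre_quantile(u, lam, theta, beta)
    return (Q(0.75) + Q(0.25) - 2.0 * Q(0.5)) / (Q(0.75) - Q(0.25))

def moors_kurtosis(lam, theta, beta):
    Q = lambda u: tofre_quantile(u, lam, theta, beta)
    return (Q(7 / 8) - Q(5 / 8) + Q(3 / 8) - Q(1 / 8)) / (Q(6 / 8) - Q(2 / 8))

if __name__ == "__main__":
    print(tofre_quantile(0.5, 0.5, 1.0, 1.0))
    print(bowley_skewness(0.5, 1.0, 1.0), moors_kurtosis(0.5, 1.0, 1.0))
```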

On the Moments
Here, the moments of the TOFr-G family are discussed, with natural extensions. Henceforth, Z_υ denotes a rv having the cdf and pdf given by Π_υ(x; ζ) and π_υ(x; ζ), respectively. In addition, it is assumed that all the presented sums and integrals exist (in the convergence sense), which is not guaranteed a priori since most of them depend on the definition of the baseline distribution.

Ordinary Moments
The ordinary moments of X are the essential ingredients to define important measures of the TOFr-G family, such as the mean, variance, coefficient of variation, and coefficients of skewness and kurtosis, among others. They are determined below. The r-th ordinary moment of the TOFr-G family, µ'_r = E(X^r), can be obtained from Equation (8) as the corresponding series in the moments E(Z_υ^r) of the exponentiated-G family (Equation (9)), where, in full generality, E(Z_υ^r) is the integral of x^r π_υ(x; ζ) over the real line. For instance, in the setting of the TOFrE distribution, the expression of E(Z_υ^r) can be found in [5] (Equation (2.1)). From the computational point of view, truncating the infinite sum at a large integer remains an acceptable approximation (the choice of "40" remains subjective; any large integer can be chosen).
The mean and variance of X are, respectively, given by µ = µ'_1 and σ² = µ'_2 − µ². Additionally, the coefficients of skewness and kurtosis are defined by CS = (µ'_3 − 3µ'_2 µ + 2µ³)/σ³ and CK = (µ'_4 − 4µ'_3 µ + 6µ'_2 µ² − 3µ⁴)/σ⁴, respectively. Table 2 presents some of the measures above when X follows the TOFrE distribution. Several sets of parameter values are considered. Strong variations are mainly observed for the mean, variance and kurtosis. In particular, we see that β has an important effect on the kurtosis, as already suggested by Figure 1. In line with what has been observed in Figure 1, the skewness remains oriented to the right, but with small variations.
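These measures can also be approximated numerically without the series expansion. The sketch below deliberately swaps in a different technique: it uses the identity E(X^r) = ∫₀¹ Q(u)^r du with a midpoint rule applied to the TOFrE quantile function (exponential baseline with rate β, cdf assumed to be (1 + λ)q − λq²). The function names and the grid size are hypothetical choices, and the accuracy in the right tail is only moderate.

```python
import math

def tofre_quantile(u, lam, theta, beta):
    """Quantile function of the TOFrE distribution for 0 < u < 1."""
    s = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    q = 2.0 * u / ((1.0 + lam) + s)      # root of (1 + lam) q - lam q^2 = u
    w = (-math.log(q)) ** (1.0 / theta)  # odds (1 - G)/G at the quantile
    return math.log1p(1.0 / w) / beta

def tofre_raw_moments(lam, theta, beta, r_max=4, n=100_000):
    """Midpoint-rule approximation of E(X^r), r = 0..r_max, via the quantile function."""
    sums = [0.0] * (r_max + 1)
    for i in range(n):
        x = tofre_quantile((i + 0.5) / n, lam, theta, beta)
        power = 1.0
        for r in range(r_max + 1):
            sums[r] += power
            power *= x
    return [s / n for s in sums]

def moment_summary(m):
    """Mean, variance, skewness (CS) and kurtosis (CK) from raw moments m[0..4]."""
    mean = m[1]
    var = m[2] - mean ** 2
    cs = (m[3] - 3.0 * m[2] * mean + 2.0 * mean ** 3) / var ** 1.5
    ck = (m[4] - 4.0 * m[3] * mean + 6.0 * m[2] * mean ** 2 - 3.0 * mean ** 4) / var ** 2
    return {"mean": mean, "variance": var, "skewness": cs, "kurtosis": ck}

if __name__ == "__main__":
    print(moment_summary(tofre_raw_moments(0.5, 1.0, 1.0)))
```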

Moment Generating Function
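The moment generating function of the TOFr-G family, i.e., E(e^{tX}), can be obtained from Equation (8) as the corresponding series in the moment generating functions E(e^{tZ_υ}) of the exponentiated-G family. In the setting of the TOFrE distribution, the expression of E(e^{tZ_υ}) can be found in [5] (Equation (2.3)), i.e.,

$$E(e^{t Z_\upsilon}) = \frac{\Gamma(\upsilon + 1) \, \Gamma(1 - t/\beta)}{\Gamma(\upsilon - t/\beta + 1)}$$

for t ∈ (0, β), where Γ(x) = ∫₀^{+∞} t^{x−1} e^{−t} dt, x > 0, denotes the gamma function.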

Incomplete Moments and Application
Some functions are useful for prediction purposes in lifetime models, finding numerous applications in demography, economics, econometrics, insurance, reliability and medicine.Several of them can be defined through the use of incomplete moments, as discussed below.

Incomplete Moments
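Thanks to Equation (8), the r-th incomplete moment of X evaluated at t ≥ 0, say ϑ_r(t), i.e., the integral of x^r f(x; λ, θ, ζ) over (−∞, t), can be expressed as the corresponding series in the incomplete moments of Z_υ (Equation (10)), where, in full generality, the r-th incomplete moment of Z_υ is the integral of x^r π_υ(x; ζ) over (−∞, t). For instance, in the framework of the TOFrE distribution, we can show that the latter can be written in terms of γ(x, y) = ∫₀^y t^{x−1} e^{−t} dt, x > 0, y ≥ 0, which denotes the lower incomplete gamma function. Alternatively, an integral representation can be used. Some functions defined with the incomplete moments are presented below.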

Applications
On some residual life functions. The mean residual life and reversed residual life functions have many applications in applied sciences. In addition, as a significant theoretical result, it is proved that the mean residual life function characterizes the distribution (see [36]). See [37], and the references therein.
For the TOFr-G family, we can determine the r-th moment of the residual life. It corresponds to the function of t defined as µ_r(t) = E[(X − t)^r | X > t], which can be expressed in terms of the survival function and of µ'_h and ϑ_h(t), h = 0, ..., r, where µ'_h and ϑ_h(t) are given by Equations (9) and (10), respectively.
In particular, the mean residual life function is given as µ_1(t). In addition, as a complementary function, the r-th moment of the reversed residual life is the function of t defined as m_r(t) = E[(t − X)^r | X ≤ t]. The mean reversed residual life function is defined by m_1(t).
Mean deviations. The first incomplete moment allows the definition of some mean deviations, which find applications in the study of income and property in economics (see [34]). In the context of the TOFr-G family, the mean deviation of X about the mean µ and the mean deviation of X about the median M = Q(1/2; λ, θ, ζ) are defined as

$$\delta_1 = E(|X - \mu|) = 2 \mu F(\mu; \lambda, \theta, \zeta) - 2 \vartheta_1(\mu) \quad \text{and} \quad \delta_2 = E(|X - M|) = \mu - 2 \vartheta_1(M),$$

respectively, where ϑ_1(t) is the first incomplete moment given by Equation (10) with r = 1.
Bonferroni and Lorenz curves. Lorenz and Bonferroni curves are essential tools to determine inequality measures, with numerous applications in medicine, reliability and demography. See [38], and the references therein. In the setting of the TOFr-G family, the Bonferroni and Lorenz curves are defined by

$$B(p) = \frac{\vartheta_1(q)}{p \mu} \quad \text{and} \quad L(p) = \frac{\vartheta_1(q)}{\mu},$$

respectively, where ϑ_1(t) is the first incomplete moment given by Equation (10) with r = 1 and q = Q(p; λ, θ, ζ).
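For a quick numerical illustration of these curves, one can use the change of variable ϑ_1(Q(p)) = ∫₀^p Q(u) du, so that the Lorenz curve is the ratio of two integrals of the quantile function. The sketch below applies a midpoint rule to the TOFrE quantile function (exponential baseline, cdf assumed to be (1 + λ)q − λq²) instead of the series of Equation (10); the function names and grid size are hypothetical.

```python
import math

def tofre_quantile(u, lam, theta, beta):
    """Quantile function of the TOFrE distribution for 0 < u < 1."""
    s = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    q = 2.0 * u / ((1.0 + lam) + s)      # root of (1 + lam) q - lam q^2 = u
    w = (-math.log(q)) ** (1.0 / theta)  # odds (1 - G)/G at the quantile
    return math.log1p(1.0 / w) / beta

def integral_of_quantile(p, lam, theta, beta, n=20_000):
    """Midpoint-rule approximation of the integral of Q(u) over (0, p)."""
    h = p / n
    return h * sum(tofre_quantile((i + 0.5) * h, lam, theta, beta) for i in range(n))

def lorenz(p, lam, theta, beta):
    """L(p): first incomplete moment at the p-quantile divided by the mean."""
    return integral_of_quantile(p, lam, theta, beta) / integral_of_quantile(1.0, lam, theta, beta)

def bonferroni(p, lam, theta, beta):
    """B(p) = L(p) / p."""
    return lorenz(p, lam, theta, beta) / p

if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(p, lorenz(p, 0.5, 1.0, 1.0), bonferroni(p, 0.5, 1.0, 1.0))
```

The same integral gives the mean deviation about the median as µ − 2ϑ_1(M), with ϑ_1(M) obtained from `integral_of_quantile(0.5, ...)`.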

Parametric Estimation
The parametric estimation of the TOFr-G model is now investigated, employing the famous maximum likelihood method.

Method
The maximum likelihood method provides attractive estimates of the model parameters, called the maximum likelihood estimates (MLEs). They enjoy some asymptotic properties allowing the construction of confidence intervals and some test statistics. In the context of the TOFr-G family, the basics on the MLEs are presented below. Let (x_1, ..., x_n) be a random sample of size n from X. Additionally, let ψ = (λ, θ, ζ) be the vector of parameters, with ζ = (ζ_1, ζ_2, ...) the vector of parameters of the baseline model. Then, writing u_i = [1 − G(x_i; ζ)]/G(x_i; ζ), the log-likelihood function for ψ is given by

$$L_n(\psi) = n \log \theta + \sum_{i=1}^{n} \log g(x_i; \zeta) + (\theta - 1) \sum_{i=1}^{n} \log[1 - G(x_i; \zeta)] - (\theta + 1) \sum_{i=1}^{n} \log G(x_i; \zeta) - \sum_{i=1}^{n} u_i^{\theta} + \sum_{i=1}^{n} \log\left[ 1 + \lambda - 2 \lambda e^{-u_i^{\theta}} \right].$$

The vector of the MLEs of ψ = (λ, θ, ζ), say ψ̂ = (λ̂, θ̂, ζ̂), is defined by

$$\hat{\psi} = \operatorname{argmax}_{\psi} L_n(\psi).$$

Any statistical software can be used to evaluate them numerically. Now, let ψ_r be the r-th component of ψ and J_n(ψ) = {−∂²L_n(ψ)/∂ψ_r ∂ψ_s}_{r,s} be the sample information matrix corresponding to ψ (assuming that L_n(ψ) is twice differentiable). Then, by denoting ψ̂_r the r-th component of ψ̂ and using the asymptotic normality of the MLEs, an asymptotic two-sided confidence interval for ψ_r at the level 100(1 − γ)% with γ ∈ (0, 1) is given by ACI_r = [LB_r, UB_r], with

$$LB_r = \hat{\psi}_r - z_{1 - \gamma/2} \sqrt{d_r}, \qquad UB_r = \hat{\psi}_r + z_{1 - \gamma/2} \sqrt{d_r},$$

where d_r is the r-th diagonal element of J_n(ψ̂)^{−1} obtained from x_1, ..., x_n and z_{1−γ/2} satisfies Φ(z_{1−γ/2}) = 1 − γ/2, where Φ(x) denotes the cdf of the normal distribution N(0, 1). The details can be found in the book of [39].
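To make the objective function concrete, the following sketch evaluates the TOFrE log-likelihood (exponential baseline with rate β, pdf assumed to be the OFr-G pdf multiplied by 1 + λ − 2λq) and builds an asymptotic confidence interval from an estimate and its asymptotic variance d_r. The function names are hypothetical; the maximization itself is left to any numerical optimizer, and the variance is supposed to come from the inverse of the observed information matrix.

```python
import math
from statistics import NormalDist

def tofre_logpdf(x, lam, theta, beta):
    """Log of the TOFrE pdf at x > 0."""
    w = math.expm1(beta * x) ** (-theta)  # ((1 - G)/G)^theta
    return (math.log(theta) + math.log(beta) - beta * theta * x
            - (theta + 1.0) * math.log(-math.expm1(-beta * x))
            - w + math.log(1.0 + lam - 2.0 * lam * math.exp(-w)))

def tofre_loglik(data, lam, theta, beta):
    """Log-likelihood; returns -inf outside the parameter space."""
    if not (-1.0 <= lam <= 1.0 and theta > 0.0 and beta > 0.0):
        return -math.inf
    return sum(tofre_logpdf(x, lam, theta, beta) for x in data)

def asymptotic_ci(estimate, variance, level=0.95):
    """Two-sided interval: estimate -/+ z_{1 - gamma/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

if __name__ == "__main__":
    toy = [0.3, 0.8, 1.1, 2.0]  # made-up values, for illustration only
    print(tofre_loglik(toy, 0.5, 1.2, 1.5))
    print(asymptotic_ci(1.0, 0.04, level=0.95))
```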

Numerical Study
This subsection provides a simulation study, offering a numerical check of the behavior of the MLEs for the TOFrE model parameters. We determine the mean squared errors (MSEs), as well as the average lower bounds (LBs), average upper bounds (UBs) and average lengths (ALs) (defined by the generic formula AL = UB − LB) of the asymptotic two-sided confidence intervals of the model parameters (the levels 90% and 95% are considered). The software Mathematica 9 is used. The following scheme is adopted.
Tables 3-6 list the obtained results. For the considered sets of parameters, we see that the MLEs stabilize around the true parameter values as the sample size n increases. In addition, the ALs decrease in this case, which is consistent with the well-known theory of the MLEs. This confirms the pertinence of the use of the MLEs in the estimation of the TOFrE model parameters.
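The data-generation step of such a simulation study can be sketched as follows. This is only an assumption about how samples may be drawn, not the scheme used in the study (which relies on Mathematica): it applies inverse transform sampling with the TOFrE quantile function (exponential baseline, cdf assumed to be (1 + λ)q − λq²). The function names and the parameter values are hypothetical, and the estimation step is omitted.

```python
import math
import random

def tofre_quantile(u, lam, theta, beta):
    """Quantile function of the TOFrE distribution for 0 < u < 1."""
    s = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    q = 2.0 * u / ((1.0 + lam) + s)      # root of (1 + lam) q - lam q^2 = u
    w = (-math.log(q)) ** (1.0 / theta)  # odds (1 - G)/G at the quantile
    return math.log1p(1.0 / w) / beta

def tofre_sample(n, lam, theta, beta, seed=None):
    """Draw n values by inverse transform sampling."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # random() can return exactly 0.0
            values.append(tofre_quantile(u, lam, theta, beta))
    return values

if __name__ == "__main__":
    for n in (50, 200, 1000):
        sample = tofre_sample(n, 0.5, 1.0, 1.0, seed=n)
        print(n, sum(sample) / n)
```

In a full study, each generated sample would be passed to a maximum likelihood routine, and the MSEs and interval bounds would be averaged over the replications.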

Applications
In this section, we use the TOFrE model for the statistical analysis of three well-known data sets; the first two data sets have exponential-like right tails and the third one has a heavy-like right tail. In particular, we aim to compare the fits of the TOFrE model with those of the transmuted linear exponential distribution (TLE) (see [27]), new generalized linear exponential (NGLE) (see [28]), Fréchet (Fr) and exponential (E) models.
The maximum likelihood method is used for all the models, allowing us to determine the following measures: AIC, CVM, AD and KS, i.e., the Akaike information criterion, Cramér-von Mises, Anderson-Darling and Kolmogorov-Smirnov statistics. In addition, the p-value of the corresponding KS test is provided. The best model is the one with the smallest AIC, CVM, AD and KS values and the biggest p-value for the KS test. The calculations are performed by using the package maxLik of the R software.
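As a reminder of how some of these measures are computed, the sketch below gives the usual textbook formulas for the AIC, BIC and HQIC from a maximized log-likelihood, together with the KS statistic for a fitted cdf. The function names are hypothetical, the code is independent of the R workflow used in the study, and the CVM and AD statistics are not covered.

```python
import math

def information_criteria(loglik, k, n):
    """AIC, BIC and HQIC from the maximized log-likelihood, k parameters, n observations."""
    return {
        "AIC": 2.0 * k - 2.0 * loglik,
        "BIC": k * math.log(n) - 2.0 * loglik,
        "HQIC": 2.0 * k * math.log(math.log(n)) - 2.0 * loglik,
    }

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # The empirical cdf jumps from (i - 1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d

if __name__ == "__main__":
    print(information_criteria(-100.0, 3, 50))
    print(ks_statistic([0.1, 0.4, 0.7], lambda x: x))  # uniform cdf on (0, 1)
```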

Data Sets I and II (Exponential Tail)
Let us now present our first two data sets of interest, both coming from real-life phenomena.

Data set I. The first data set, called Data set I, is obtained from [40] and comes from a reliability analysis. The data are also available at the following electronic address: https://chesneau.users.lmno.cnrs.fr/DatasetI.txt

Data set II. The second data set, called Data set II, contains 72 measurements of exceedances of the Wheaton River in Canada, from 1958 to 1984. These data were also considered by [41], among others. The data are also available at the following electronic address: https://chesneau.users.lmno.cnrs.fr/DatasetII.txt

The basic statistics of these data sets are given in Table 7, supported by the corresponding boxplots in Figure 4. The main observable differences between the two data sets are in the central and dispersion parameters and also in their right-skewed nature: Data set I is highly right skewed whereas Data set II is moderately right skewed. We refine our descriptive analysis by showing the corresponding total time on test (TTT) plots, as introduced by [42], in Figure 5. These plots reveal that Data set I has a concave TTT line, corresponding to a possible underlying increasing hrf, whereas Data set II has a concave-convex TTT line, corresponding to a possible underlying bathtub-shaped hrf. These two cases are covered by the TOFrE model, motivating its use to analyze such data.
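The points of a TTT plot are simple to compute from the ordered sample. The sketch below uses the usual scaled TTT transform, T(r/n) = [x_(1) + ... + x_(r) + (n − r) x_(r)] / (x_(1) + ... + x_(n)); the function name and the toy values are hypothetical, and plotting is left to any graphics tool.

```python
def scaled_ttt(data):
    """Return the points (r/n, T(r/n)) of the scaled total time on test transform."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    partial = 0.0
    for r, x in enumerate(xs, start=1):
        partial += x  # sum of the r smallest observations
        points.append((r / n, (partial + (n - r) * x) / total))
    return points

if __name__ == "__main__":
    for u, t in scaled_ttt([3.0, 1.0, 2.0]):  # made-up values, for illustration only
        print(u, t)
```

A curve lying above the diagonal (concave) suggests an increasing hrf, whereas a curve below it (convex) suggests a decreasing hrf.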
The MLEs of the model parameters along with their standard errors (SEs) are collected in Tables 8 and 9 for Data sets I and II, respectively. Tables 10 and 11 present the values of the criteria of fitness of the models for Data sets I and II, respectively. Visually, in Figures 6 and 7, in comparison to the competitor models, the blue curves of the estimated fits of the TOFrE model are closer to the empirical pdfs and cdfs. We complete this part by determining the confidence intervals of the TOFrE model parameters in Table 12, as described in Section 4.

We now consider a data set of a different nature, from insurance losses, called Data set III. It represents the vehicle insurance losses as considered in [43]. It is fully available at the electronic address: https://chesneau.users.lmno.cnrs.fr/DatasetIII.csv We may refer to [43] for all the necessary descriptive statistics. Thus, we aim to apply our statistical methodology to this new data set. As in [43], we also introduce the following two criteria: the consistent Akaike information criterion (CAIC) and the Hannan-Quinn information criterion (HQIC), which have the same interpretation as the AIC. The MLEs of the considered models are provided in Table 13. The numerical results in Table 14 show that the TOFrE model provides a better fit than the considered competitors. In addition, for the same data, it is better than the new heavy tailed Weibull (NHTW) model developed by [43], which has the following values for the considered criteria: AIC = 67,278.87, BIC = 67,300.22, CAIC = 67,281.04 and HQIC = 67,286.13 (see [43] (Table 6)), and which was itself shown to be better than other heavy tailed models, such as the Weibull, Kumaraswamy Weibull, Lomax, Marshall-Olkin Weibull and Burr-XII models.
The estimated pdfs and cdfs of the models for Data set III are sketched in Figure 8.

In the light of this study, thanks to its numerous qualities, we hope that the TOFr-G family will attract practitioners for wider applications in applied sciences. As perspectives, multivariate extensions of the TOFr-G family can be of interest for the construction of various regression models as well as clustering methods, allowing new possibilities for the analysis of big data.

Conclusions
In this paper, on the basis of the well-established transmuted-G and odd Fréchet-G families of distributions, we introduce a new family of distributions, called the transmuted odd Fréchet-G (TOFr-G) family. It contains a myriad of new flexible distributions, which can be used as models to analyze a wide variety of data sets. We treat the theoretical and practical features of the TOFr-G family, with a focus on its member defined with the exponential distribution as baseline. We estimate the model parameters by the maximum likelihood method, showing via a simulation study that it gives convincing results. Then, three practical data sets are analyzed favorably by the proposed model. Among the possible applications, an interesting direction is to study heavy tailed distributions from the TOFr-G family and investigate results related to insurance, as performed in [44,45].

Figure 1. A panel of shapes for the pdf of the transmuted odd Fréchet exponential (TOFrE) distribution.

Figure 2. A panel of shapes for the hrf of the TOFrE distribution.

Figure 3. Individual plots of the hazard rate function (hrf) of the TOFrE distribution for the main observed shapes.

Figure 4. Box plots for Data sets I and II, respectively.

Figure 5. Total time on test (TTT) plots for Data sets I and II, respectively.

Figure 6. Plots of all the estimated pdfs and cdfs for Data set I.

Figure 7. Plots of all the estimated pdfs and cdfs for Data set II.

Figure 8. Plots of all the estimated pdfs and cdfs for Data set III.

Table 7. First statistical approaches of Data sets I and II.

Table 8. Maximum likelihood estimates (MLEs) and standard errors (SEs) of the models for Data set I.

Table 9. MLEs and SEs of the models for Data set II.

Table 10. Numerical measures of fitness of the models for Data set I.

Table 11. Numerical measures of fitness of the models for Data set II.

From Table 10, we see that the TOFrE and TLE models are the best for Data set I. In particular, the TOFrE model possesses the lowest AIC and KS values, and has the biggest p-value for the KS test; it is the best under these two criteria. From Table 11, the TOFrE model is the best for Data set II; it possesses the lowest AIC, CVM, AD and KS values and has the biggest p-value for the KS test. Now, we display the plots of the estimated pdfs and cdfs for Data sets I and II in Figures 6 and 7, respectively.

Table 12. Asymptotic two-sided confidence intervals of the TOFrE model parameters at levels 90% and 95% for Data sets I and II, respectively.

Table 13. MLEs and SEs of the models for Data set III.

Table 14 indicates the values of the considered criteria of fitness of the models for Data set III.

Table 14. Numerical measures of fitness of the models for Data set III.