The Transmuted Muth Generated Class of Distributions with Applications

Abstract: Recently, the Muth generated class of distributions has been shown to be useful for diverse statistical purposes. Here, we make some contributions to this class by first discussing new theoretical facts and then introducing a natural extension of it via the transmuted scheme. The extended class is described in detail, emphasizing the characteristics of its probability and reliability functions, as well as its moments. Among other things, we show that it can extend the possible values of the mean and variance of the parental distribution, while maintaining symmetry or creating various types of asymmetry. The mathematical inference of the parameters is also discussed. Special attention is paid to the distribution of the new class using the log-logistic distribution as a parent. In an applied work, we evaluate the behavior of the corresponding model by using simulated and practical data. In particular, we employ it to fit two real-life data sets, one with environmental data and the other with survival data. Standard statistical criteria validate the importance of the proposed model.


Introduction
The Muth distribution is a one-parameter lifetime distribution introduced by [1], which is of interest for the modelling of some reliability phenomena. As an essential mathematical definition, it has the following cumulative distribution function (cdf):
$$F(x; \alpha) = 1 - \exp\left\{\alpha x - \frac{1}{\alpha}\left(e^{\alpha x} - 1\right)\right\}, \quad x > 0, \tag{1}$$
where α ∈ (0, 1] is a shape parameter. The popularity of the Muth distribution is explained by the combination of the following facts: (i) it corresponds to the classical exponential distribution with parameter 1 when α tends to 0, (ii) its probability mass in the right tail is less than those of the standard gamma, log-normal and Weibull distributions, (iii) it satisfies the variate generation property, (iv) it satisfies the mode-median-mean inequality and (v) it has enough flexibility to properly fit certain lifetime data sets, especially those resulting from reliability experiments. All these aspects are detailed in [1][2][3]. The Muth distribution was later extended through the power transform by [4].
The perspective of generating new symmetric or asymmetric distributions from the Muth distribution has been explored in [5] through the Muth generated (M-G) class of distributions. Let us now present the M-G class constituting the basis of our study. First of all, it is defined by the following cdf:
$$F(x; \alpha, \phi) = [G(x; \phi)]^{-\alpha} \exp\left\{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)\right\}, \quad x \in \mathbb{R}, \tag{2}$$
where G(x; φ) is the cdf of a parental continuous distribution having φ for parameter vector. This cdf comes from the combination of the second type of the T-X transformation by [6] and the cdf of the Muth distribution. That is, we have F(x; α, φ) = 1 − F(− log[G(x; φ)]; α), where F(·; α) on the right-hand side is the Muth cdf given in (1). With this construction, when α tends to 0, F(x; α, φ) is reduced to G(x; φ). The probability density function (pdf) of the M-G class is specified by
$$f(x; \alpha, \phi) = g(x; \phi)\,[G(x; \phi)]^{-\alpha-1}\left([G(x; \phi)]^{-\alpha} - \alpha\right) \exp\left\{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)\right\}, \quad x \in \mathbb{R}, \tag{3}$$
where g(x; φ) refers to the pdf of the parental distribution, and the corresponding hazard rate function (hrf) follows:
$$h(x; \alpha, \phi) = \frac{f(x; \alpha, \phi)}{1 - F(x; \alpha, \phi)}, \quad x \in \mathbb{R}.$$
Then, in order to illustrate the flexibility of the M-G class, Reference [5] considered the following five special distributions: Muth-uniform, Muth-Rayleigh, Muth-Lomax, Muth-exponential and Muth-Weibull distributions. Graphics reveal diverse curvatures for the related pdfs and hrfs, indicating their ability to model various types of phenomena. This is illustrated in [5] with the Muth-Weibull distribution as the representative of the M-G class and the failure times of aircraft windshield data by [7]. In particular, Reference [5] showed that these data are better fitted by the Muth-Weibull model than by several extensions of the Weibull model: the beta Weibull model by [8], McDonald-Weibull model by [9], and exponentiated Weibull model by [10]. Other special Muth distributions can be constructed and studied, symmetric or not, in the spirit of [11][12][13][14].
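The M-G functions lend themselves to a quick numerical check. The following is a minimal Python sketch, assuming the closed forms that follow from the construction F(x; α, φ) = 1 − F(− log[G(x; φ)]; α), that is, cdf G^(−α) exp{(1 − G^(−α))/α} and pdf g G^(−α−1) (G^(−α) − α) exp{(1 − G^(−α))/α}. The function names and the unit exponential parent are hypothetical choices made for illustration only.

```python
import math

def mg_cdf(x, alpha, G):
    """M-G cdf: G^(-alpha) * exp((1 - G^(-alpha)) / alpha)."""
    t = G(x) ** (-alpha)
    return t * math.exp((1.0 - t) / alpha)

def mg_pdf(x, alpha, G, g):
    """M-G pdf: g * G^(-alpha-1) * (G^(-alpha) - alpha) * exp(...)."""
    t = G(x) ** (-alpha)
    return g(x) * t / G(x) * (t - alpha) * math.exp((1.0 - t) / alpha)

def mg_hrf(x, alpha, G, g):
    """Hazard rate: pdf divided by the survival function."""
    return mg_pdf(x, alpha, G, g) / (1.0 - mg_cdf(x, alpha, G))

# Hypothetical parent for illustration: unit exponential distribution.
G_exp = lambda x: 1.0 - math.exp(-x)
g_exp = lambda x: math.exp(-x)

x, alpha, eps = 1.3, 0.5, 1e-6
# Central finite difference of the cdf should match the pdf.
numeric = (mg_cdf(x + eps, alpha, G_exp) - mg_cdf(x - eps, alpha, G_exp)) / (2 * eps)
pdf_gap = abs(numeric - mg_pdf(x, alpha, G_exp, g_exp))
# For alpha close to 0, the M-G cdf should be close to the parental cdf.
limit_gap = abs(mg_cdf(x, 1e-6, G_exp) - G_exp(x))
```

Both gaps should be small: the first confirms that the pdf is the derivative of the cdf, the second illustrates the reduction to G(x; φ) when α tends to 0.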
In this study, we first discuss two new facts about the M-G class. One is about the possible values of α and the other is about the quantile function (qf). Next, we deepen the perspectives of the M-G class by extending it through the use of the quadratic rank transmutation map. More precisely, we introduce the transmuted Muth generated (TM-G) class of distributions defined by the following cdf:
$$F(x; \Omega) = T_{\lambda}\left[F(x; \alpha, \phi)\right], \quad x \in \mathbb{R}, \tag{4}$$
where T_λ(y) = y(1 + λ − λy), y ∈ (0, 1), is the quadratic rank transmutation map, λ ∈ [−1, 1] and Ω = (λ, α, φ) is the vector containing all the parameters. We thus apply the general approach developed by [15] to the M-G class. The idea is to offer an intermediate class between the exponentiated generated class with power parameter 2 corresponding to λ = −1 (see [16]) and the Topp-Leone generated class with power parameter 1 corresponding to λ = 1 (see [17]), the former M-G class being obtained with λ = 0. The benefits of transforming existing classes of distributions via the quadratic rank transmutation map are now well established: it widens the possible values of the mean and variance, while maintaining symmetry or creating skewness with varying kurtosis. We may refer to [18][19][20][21][22][23]. We carry out a theoretical study of the TM-G class, determining its main functions and discussing the shape properties of the corresponding pdf and hrf, various moments through series techniques, and mathematical inference on the parameters by using the maximum likelihood approach. Then, we illustrate the applicability of the new class by studying a special case based on the log-logistic distribution. We develop the related model to fit two real-life data sets, one with environmental data and the other with survival data. The adequacy of the model proves to be quite acceptable, and better than that of competitors connected with the log-logistic model. The outline of the study is as follows. In Section 2, the new facts about the M-G class, as well as more information on the TM-G class, are provided. Technical results on the TM-G class are described in Section 3. The practical side of the TM-G class is explored in Section 4 through the analysis of real-life data. Section 5 is the concluding section.

The TM-G Class
In this section, we complete the motivations leading to the development of the M-G class with some new facts, and complete the presentation of the TM-G class.

New Facts about the Former M-G Class
Two new facts about the M-G class are formulated below. First, we can relax the possible range of values for α, i.e., α ∈ (0, 1]. Indeed, for any α ∈ (−∞, 1] \ {0},
• The function F(x; α, φ) given by (2) can be defined as a continuous function; only an extension by continuity at the greatest point x_0 such that G(x_0; φ) = 0 is required when α < 0, depending on the support of G(x; φ),
• Standard arguments give lim x→−∞ F(x; α, φ) = 0 and lim x→+∞ F(x; α, φ) = 1, the infinite limits being possibly adjusted according to the support of G(x; φ),
• In view of the definition of the pdf given as (3), it is clear that f(x; α, φ) ≥ 0 for x ∈ R; the sign of α does not affect its positivity, implying that F(x; α, φ) is an increasing function with respect to x.
Consequently, the cdf (2) remains mathematically valid for α ∈ (−∞, 1] \ {0}, allowing negative values for α. This direction was not investigated in [24], and opens some interesting perspectives of applications. In what follows, we thus consider α ∈ (−∞, 1] \ {0}. Note that, when α → 0, we arrive at F(x; α, φ) = G(x; φ). Therefore, one can define the M-G class by continuity at α = 0 as the parental cdf. The second contribution concerns the qf of the M-G class. It was not determined in [5], but can be with the use of the so-called Lambert function. Indeed, based on (2), the following equation: F(Q(u; α, φ); α, φ) = u with u ∈ (0, 1) yields an identifiable function Q(u; α, φ); the qf of the M-G class is given by
$$Q(u; \alpha, \phi) = Q_G\left(\left[-\alpha\, W\!\left(-\frac{u}{\alpha}\, e^{-1/\alpha}\right)\right]^{-1/\alpha}; \phi\right), \quad u \in (0, 1), \tag{5}$$
where Q_G(u; φ) denotes the qf of the parental distribution and W(x) denotes the Lambert function satisfying the following equation: W(x)e^{W(x)} = x. One can mention that the Lambert function has no closed-form expression in terms of elementary functions, but it is implemented in most mathematical software, making its use straightforward in practice. These new facts will be used in the context of the TM-G class.
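The quantile computation can be sketched in a few lines of Python. The sketch below assumes the M-G cdf G^(−α) exp{(1 − G^(−α))/α}; setting y = G^(−α) turns F = u into (−y/α) e^(−y/α) = −(u/α) e^(−1/α), so that y = −α W(−(u/α) e^(−1/α)). Working this out, the branch of W appears to matter: for α ∈ (0, 1], y ≥ 1 forces W ≤ −1 (the lower real branch), whereas for α < 0 the argument is positive and the principal branch applies. The helper `lambert_w` is a hypothetical bisection-based stand-in written for self-containment (numerical libraries such as SciPy offer `scipy.special.lambertw` with a branch index), and the uniform parent is an illustrative assumption.

```python
import math

def lambert_w(z, branch):
    """Solve w * exp(w) = z by bisection (hypothetical helper).
    branch 0:  z >= 0, returns w >= 0;
    branch -1: -1/e <= z < 0, returns w <= -1."""
    if branch == 0:
        lo, hi = 0.0, math.log1p(z)   # W0(z) <= log(1 + z) for z >= 0
        increasing = True             # w * e^w increases on [0, inf)
    else:
        lo, hi = -2.0, -1.0
        while lo * math.exp(lo) < z:  # move left until w * e^w exceeds z
            lo *= 2.0
        increasing = False            # w * e^w decreases on (-inf, -1]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        below = mid * math.exp(mid) < z
        if below == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mg_cdf_of_G(G, alpha):
    """M-G cdf evaluated at a parental cdf value G in (0, 1)."""
    t = G ** (-alpha)
    return t * math.exp((1.0 - t) / alpha)

def mg_quantile(u, alpha, parent_quantile):
    """M-G qf: parental qf applied to [-alpha * W(-(u/alpha) e^(-1/alpha))]^(-1/alpha)."""
    z = -(u / alpha) * math.exp(-1.0 / alpha)
    w = lambert_w(z, -1 if alpha > 0 else 0)
    return parent_quantile((-alpha * w) ** (-1.0 / alpha))

# Illustrative parent: uniform on (0, 1), for which Q_G(p) = p and G(x) = x.
identity = lambda p: p
roundtrip_errors = [
    abs(mg_cdf_of_G(mg_quantile(u, a, identity), a) - u)
    for u, a in [(0.3, 0.5), (0.9, 1.0), (0.3, -2.0), (0.05, -0.4)]
]
```

Each entry of `roundtrip_errors` compares F(Q(u)) with u, for both positive and negative values of α.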

Distribution Functions
As sketched in the introduction, based on (4) and (2), the TM-G class is defined with the following cdf:
$$F(x; \Omega) = [G(x; \phi)]^{-\alpha} e^{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)} \left\{1 + \lambda - \lambda\, [G(x; \phi)]^{-\alpha} e^{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)}\right\}, \quad x \in \mathbb{R}. \tag{6}$$
We recall that α ∈ (−∞, 1] \ {0}, λ ∈ [−1, 1], G(x; φ) is the cdf of a parental distribution which may be of different nature in terms of symmetry, support, etc., φ is the related parameter vector and Ω = (λ, α, φ). Furthermore, the pdf corresponding to G(x; φ) is denoted by g(x; φ). Clearly, the M-G class is obtained by taking λ = 0. Furthermore, the possible negative and positive values of λ offer a certain analytical versatility on the functions derived from F(x; Ω), as developed later. In addition, one can notice that, when α tends to 0, F(x; Ω) is reduced to T_λ[G(x; φ)], corresponding to the cdf of the transmuted parental distribution. We can derive the pdf of the TM-G class as
$$f(x; \Omega) = g(x; \phi)\,[G(x; \phi)]^{-\alpha-1}\left([G(x; \phi)]^{-\alpha} - \alpha\right) e^{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)} \left\{1 + \lambda - 2\lambda\, [G(x; \phi)]^{-\alpha} e^{\frac{1}{\alpha}\left(1 - [G(x; \phi)]^{-\alpha}\right)}\right\}, \quad x \in \mathbb{R}.$$
The characteristics of the shapes of f(x; Ω) are crucial for data fitting purposes; the more diverse the curvatures of f(x; Ω), the more flexible the related model will be in fitting diverse data sets. As shown later, f(x; Ω) is also predominant in the definition of another important function of interest: the hrf. It is also the main ingredient of the well-known transfer formula. Indeed, by introducing a random variable X having f(x; Ω) as pdf, for any subset A ⊆ R and any function Υ(x), the transfer formula states that
$$E\left[\Upsilon(X)\, I_A(X)\right] = \int_A \Upsilon(x)\, f(x; \Omega)\, dx, \tag{7}$$
where I_A(X) = 1 if {X ∈ A} is satisfied, and 0 otherwise, and E denotes the expectation operator.
In particular, for Υ(x) = 1, we get P(X ∈ A) = ∫_A f(x; Ω)dx, which is the probability measure characterizing the TM-G class. More generally, through the formula in (7), f(x; Ω) thus allows the determination of various characteristics of the TM-G class, including central and dispersion parameters, asymmetry coefficients, and indexes on the weight of the tails.
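As a numerical illustration of these functions, here is a minimal Python sketch. It assumes the TM-G cdf T_λ[F(x; α, φ)] with F(x; α, φ) = G^(−α) exp{(1 − G^(−α))/α}, and the pdf obtained by the chain rule, f(x; α, φ){1 + λ − 2λF(x; α, φ)}. The standard logistic parent, the parameter values and all function names are hypothetical choices for illustration; the parent lives on R, and α is taken negative to exercise the extended range.

```python
import math

def muth_g(G, alpha):
    """M-G cdf at a parental cdf value G in (0, 1]."""
    t = G ** (-alpha)
    return t * math.exp((1.0 - t) / alpha)

def transmute(y, lam):
    """Quadratic rank transmutation map T_lambda(y) = y (1 + lam - lam y)."""
    return y * (1.0 + lam - lam * y)

def tmg_cdf(x, lam, alpha, G):
    return transmute(muth_g(G(x), alpha), lam)

def tmg_pdf(x, lam, alpha, G, g):
    Gx = G(x)
    t = Gx ** (-alpha)
    e = math.exp((1.0 - t) / alpha)
    F = t * e                          # M-G cdf
    f = g(x) * t / Gx * (t - alpha) * e  # M-G pdf
    return f * (1.0 + lam - 2.0 * lam * F)

# Hypothetical parent: standard logistic distribution on the real line.
G_logis = lambda x: 1.0 / (1.0 + math.exp(-x))
g_logis = lambda x: G_logis(x) * (1.0 - G_logis(x))

lam, alpha = 0.7, -1.5                 # illustrative parameter values
a, b, n = -30.0, 30.0, 6000
h = (b - a) / n
# Trapezoidal rule: the pdf should integrate to (approximately) 1.
total = 0.5 * (tmg_pdf(a, lam, alpha, G_logis, g_logis)
               + tmg_pdf(b, lam, alpha, G_logis, g_logis))
total += sum(tmg_pdf(a + i * h, lam, alpha, G_logis, g_logis) for i in range(1, n))
total *= h
```

The special cases λ = 0 (M-G class) and λ = −1 (squared M-G cdf) can be checked directly with `tmg_cdf`.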

Reliability Functions
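The survival function of the TM-G class can be expressed as
$$S(x; \Omega) = 1 - F(x; \Omega) = 1 - F(x; \alpha, \phi)\left\{1 + \lambda - \lambda F(x; \alpha, \phi)\right\}, \quad x \in \mathbb{R},$$
where F(x; α, φ) is the cdf of the M-G class given in (2). The hrf of the TM-G class follows from f(x; Ω) and S(x; Ω); it is obtained as
$$h(x; \Omega) = \frac{f(x; \Omega)}{S(x; \Omega)} = \frac{f(x; \alpha, \phi)\left\{1 + \lambda - 2\lambda F(x; \alpha, \phi)\right\}}{1 - F(x; \alpha, \phi)\left\{1 + \lambda - \lambda F(x; \alpha, \phi)\right\}}, \quad x \in \mathbb{R},$$
where f(x; α, φ) is the pdf of the M-G class given in (3). The role of h(x; Ω) is important in determining the modelling capacities of the related model, especially when we deal with a lifetime distribution. In this regard, we refer to [25].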
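The relation between the hrf of the TM-G class and that of the M-G class can be made explicit with a short derivation, given here as a sketch. The shorthand F_M, f_M and h_M for the cdf, pdf and hrf of the M-G class is introduced for this derivation only, and S(x; Ω) = 1 − T_λ(F_M) is assumed.

```latex
\begin{align*}
S(x;\Omega) &= 1 - F_M\,(1 + \lambda - \lambda F_M)
             = 1 - F_M - \lambda F_M + \lambda F_M^2
             = (1 - F_M)(1 - \lambda F_M), \\
h(x;\Omega) &= \frac{f_M\,(1 + \lambda - 2\lambda F_M)}{(1 - F_M)(1 - \lambda F_M)}
             = h_M \,\frac{1 + \lambda - 2\lambda F_M}{1 - \lambda F_M}.
\end{align*}
```

Since λ ∈ [−1, 1] and F_M < 1 inside the support, the factor 1 − λF_M stays positive, so the transmutation acts on the M-G hrf through a bounded multiplicative weight equal to 1 + λ in the left tail (F_M → 0).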

Quantile Function
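The qf of the TM-G class, say Q(u; Ω), is obtained by solving the following equation: F(Q(u; Ω); Ω) = u, with u ∈ (0, 1). For λ ≠ 0, the equation T_λ(y) = u has the unique root y_λ(u) = [1 + λ − √((1 + λ)² − 4λu)]/(2λ) in (0, 1), and we set y_0(u) = u. By virtue of (5), the qf of the TM-G class is given by
$$Q(u; \Omega) = Q_G\left(\left[-\alpha\, W\!\left(-\frac{y_{\lambda}(u)}{\alpha}\, e^{-1/\alpha}\right)\right]^{-1/\alpha}; \phi\right), \quad u \in (0, 1).$$
The fact that this qf has an explicit expression is a plus for the TM-G class. It allows us to define various measures of interest, such as the median, quartiles, octiles, as well as measures of asymmetry and kurtosis. Furthermore, it is the main tool to generate values from any distribution of the TM-G class.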
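The two-step inversion behind this qf can be sketched in Python as follows. The first step inverts the transmutation map in closed form, written as 2u / (1 + λ + √((1 + λ)² − 4λu)), which is algebraically the root of λy² − (1 + λ)y + u = 0 lying in (0, 1) and remains valid at λ = 0. For the second step, this sketch replaces the Lambert function by a plain bisection on the parental cdf value, purely to stay short and self-contained; that substitution, the exponential parent and the function names are assumptions of the sketch.

```python
import math

def inverse_transmute(u, lam):
    """Root in (0, 1) of y * (1 + lam - lam * y) = u, in a cancellation-free form."""
    return 2.0 * u / (1.0 + lam + math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u))

def muth_g(G, alpha):
    """M-G cdf at a parental cdf value G; increasing in G."""
    t = G ** (-alpha)
    return t * math.exp((1.0 - t) / alpha)

def muth_g_inverse(v, alpha):
    """Parental cdf value G such that muth_g(G, alpha) = v (bisection stand-in)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if muth_g(mid, alpha) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tmg_quantile(u, lam, alpha, parent_quantile):
    return parent_quantile(muth_g_inverse(inverse_transmute(u, lam), alpha))

def tmg_cdf(x, lam, alpha, G):
    F = muth_g(G(x), alpha)
    return F * (1.0 + lam - lam * F)

# Hypothetical parent: unit exponential distribution.
G_exp = lambda x: -math.expm1(-x)
Q_exp = lambda p: -math.log1p(-p)

roundtrip_errors = [
    abs(tmg_cdf(tmg_quantile(u, lam, a, Q_exp), lam, a, G_exp) - u)
    for u, lam, a in [(0.1, 0.5, 0.8), (0.5, -0.7, -1.2), (0.9, 1.0, 0.3), (0.5, 0.0, -2.0)]
]
median = tmg_quantile(0.5, 0.5, 0.8, Q_exp)  # example: median for illustrative parameters
```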

Diverse Results
Diverse results on the TM-G class are now established.

Critical Points
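The critical points of the functions f(x; Ω) and h(x; Ω) are of interest for the following reasons: (i) a maximum for f(x; Ω) corresponds to a mode for the TM-G class and (ii) critical points for h(x; Ω) within the support imply the presence of non-monotonic shapes. The generic equations allowing the determination of such critical points are expressed below. First, assuming that g(x; φ) is differentiable with derivative g'(x; φ), a critical point for f(x; Ω) satisfies the following general equation, obtained from ∂ log f(x; Ω)/∂x = 0:
$$\frac{g'(x; \phi)}{g(x; \phi)} - (\alpha + 1)\frac{g(x; \phi)}{G(x; \phi)} + g(x; \phi)\,[G(x; \phi)]^{-\alpha-1}\left\{1 - \frac{\alpha}{[G(x; \phi)]^{-\alpha} - \alpha}\right\} - \frac{2\lambda f(x; \alpha, \phi)}{1 + \lambda - 2\lambda F(x; \alpha, \phi)} = 0.$$
Similarly, since ∂ log S(x; Ω)/∂x = −h(x; Ω), a critical point for h(x; Ω) satisfies ∂ log f(x; Ω)/∂x + h(x; Ω) = 0. From these equations, we see the complex roles of G(x; φ), α and λ in the determination of the critical points. In particular, we can have several minima or maxima. If G(x; φ), α and λ are explicit, the critical points can be determined numerically, at some computational cost, through standard iterative techniques of the Newton-Raphson type. We can also visualize them through a graphical analysis, which remains the simplest approach when we deal with such sophisticated functions.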

Series Expansions
For any integer m ≥ 1, as in [24], a series expansion for [F(x; Ω)]^m can be investigated. Indeed, by the classical binomial formula, we have The main interest of this expansion is that Ψ_z(x; φ) corresponds to the cdf of the well-known exponentiated-generated class, which has been examined through various parental distributions. Therefore, some existing results can be directly used in the TM-G class. Note that, by choosing m = 1 in (9), we thus derive a tractable series for the cdf of the TM-G class.
By differentiating with respect to x, we get the following series expansion: where provided that the sum exists according to all the quantities involved. A direct consequence is the expansion of the pdf of the TM-G class by taking m = 1. More generally, the previous expansions allow us to define many probabilistic measures and functions in a manageable way, such as the raw moments, incomplete moments, negative moments, probability weighted moments, characteristic function, mean deviations, and functions related to the order statistics, along with their moments, exactly as performed in ([24], Subsections 3.2-3.5). In particular, based on (7) and (10), most moments-type measures or functions can be written under the following form: where y ∈ R ∪ {+∞}, r(x) denotes a certain function related to the desired measure or function of interest, and For instance, the sth raw moment corresponds to Ξ(y; m, Ω)[r] with r(x) = x^s, y → +∞ and m = 1, the sth incomplete moment at y follows by taking r(x) = x^s and m = 1, the (s, m − 1)th probability weighted moment is obtained with r(x) = x^s and y → +∞, the characteristic function at t ∈ R corresponds to r(x) = e^{itx}, i² = −1, m = 1 and y → +∞, and so on. In particular, from the raw moments, one can define central and dispersion parameters, asymmetry coefficients, and indexes on the weight of the tails. More details on the applications of such a series expansion approach can be found in [26].

Maximum Likelihood Approach: Theory and Practice
One of the main purposes of the TM-G class is to produce practical models for data analysis. In this regard, we provide the theory on the related semi-parametric models through the use of the maximum likelihood (ML) approach, as presented in [27]. The advantages of this approach are the following: (i) we have theoretical guarantees on the efficiency of the estimates, (ii) useful statistical objects, such as confidence regions of the parameters, can be derived, and (iii) we can define criteria for comparing the fit behavior of several models.
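Here, Ω = (λ, α, φ) is an unknown vector of parameters of interest. Furthermore, the following notations are adopted: q is the number of components in φ, φ_s is the sth component of φ, and Ω_s is the sth component of Ω. Now, we introduce n i.i.d. random variables X_1, . . . , X_n whose common distribution is specified by the cdf of the TM-G class as defined by (6). Then, the ML random estimator of Ω is given as
$$\hat{\Omega} = \operatorname{argmax}_{\Omega}\, \ell(\Omega \mid X_1, \ldots, X_n),$$
provided it is unique, where ℓ(Ω | X_1, . . . , X_n) is the random log-likelihood function defined by
$$\ell(\Omega \mid X_1, \ldots, X_n) = \sum_{i=1}^{n} \log f(X_i; \Omega) = \sum_{i=1}^{n} \log g(X_i; \phi) - (\alpha + 1)\sum_{i=1}^{n} \log G(X_i; \phi) + \sum_{i=1}^{n} \log\left\{[G(X_i; \phi)]^{-\alpha} - \alpha\right\} + \frac{1}{\alpha}\sum_{i=1}^{n} \left\{1 - [G(X_i; \phi)]^{-\alpha}\right\} + \sum_{i=1}^{n} \log\left\{1 + \lambda - 2\lambda F(X_i; \alpha, \phi)\right\},$$
with F(x; α, φ) given in (2). If the random log-likelihood function is differentiable with respect to Ω, the ML random estimator satisfies the following system of equations: ∂ℓ(Ω | X_1, . . . , X_n)/∂Ω = 0_{q+2}. By distinguishing the main elements in the vector Ω, this system can be decomposed as
$$\frac{\partial}{\partial \lambda}\ell(\Omega \mid X_1, \ldots, X_n) = 0, \qquad \frac{\partial}{\partial \alpha}\ell(\Omega \mid X_1, \ldots, X_n) = 0,$$
and, for s = 1, . . . , q,
$$\frac{\partial}{\partial \phi_s}\ell(\Omega \mid X_1, \ldots, X_n) = 0.$$
The complexity of these equations is such that an explicit form of Ω̂ is impossible. However, practical solutions exist when we deal with observed values, as discussed in the next paragraph. As a well-known result, the asymptotic distribution of Ω̂ is the multivariate normal distribution N_{q+2}(Ω, J(Ω)^{−1}), where
$$J(\Omega) = -E\left[\frac{\partial^2}{\partial \Omega_s\, \partial \Omega_t}\ell(\Omega \mid X_1, \ldots, X_n)\right]_{1 \le s, t \le q+2}.$$
In particular, this asymptotic result is the main mathematical ingredient to determine confidence regions for any sub-vector of Ω, including confidence intervals for each parameter.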
Numerous numerical techniques can be applied to determine Ω̂, such as those implemented in the package AdequacyModel of the software R (see [28]). This package is well suited to data fitting and can be used quite efficiently to fit TM-G models; only the cdf and pdf need to be implemented, and initial values must be supplied. In this regard, for the tested TM-G models, arbitrary initial values give stable numerical results.
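Furthermore, for the construction of practical confidence regions, we need to estimate the matrix J(Ω). Here, we use the following approximation: J(Ω) ≈ I, where I denotes the observed information matrix given as
$$I = -\left[\frac{\partial^2}{\partial \Omega_s\, \partial \Omega_t}\ell(\Omega \mid x_1, \ldots, x_n)\right]_{1 \le s, t \le q+2}\Bigg|_{\Omega = \hat{\Omega}},$$
where x_1, . . . , x_n denote the observed values of X_1, . . . , X_n.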
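To make the likelihood machinery concrete, the following Python sketch evaluates the TM-G log-likelihood and checks one score component numerically. It assumes the pdf f(x; α, φ){1 + λ − 2λF(x; α, φ)}, from which the derivative of the log-likelihood with respect to λ is the sum of (1 − 2F)/(1 + λ − 2λF) over the observations. The unit exponential parent (so that φ is empty), the small data vector and the function names are hypothetical and serve only as an illustration; they are not the data or the software used in this study.

```python
import math

def mg_parts(G, g, alpha):
    """Return (cdf, pdf) of the M-G class at parental values G and g."""
    t = G ** (-alpha)
    e = math.exp((1.0 - t) / alpha)
    return t * e, g * t / G * (t - alpha) * e

def loglik(lam, alpha, data, G, g):
    """TM-G log-likelihood: sum of log f_MG + log(1 + lam - 2 lam F_MG)."""
    total = 0.0
    for x in data:
        F, f = mg_parts(G(x), g(x), alpha)
        total += math.log(f) + math.log(1.0 + lam - 2.0 * lam * F)
    return total

def score_lambda(lam, alpha, data, G):
    """Analytic partial derivative of the log-likelihood with respect to lam."""
    s = 0.0
    for x in data:
        F, _ = mg_parts(G(x), 1.0, alpha)  # the pdf part is not needed here
        s += (1.0 - 2.0 * F) / (1.0 + lam - 2.0 * lam * F)
    return s

# Hypothetical parent (unit exponential) and made-up observations.
G_exp = lambda x: -math.expm1(-x)
g_exp = lambda x: math.exp(-x)
data = [0.2, 0.5, 0.9, 1.4, 2.3, 3.1]

lam, alpha, eps = 0.4, -0.8, 1e-6
finite_diff = (loglik(lam + eps, alpha, data, G_exp, g_exp)
               - loglik(lam - eps, alpha, data, G_exp, g_exp)) / (2 * eps)
score_gap = abs(finite_diff - score_lambda(lam, alpha, data, G_exp))
```

In practice the same `loglik` function would be handed to a numerical optimizer, and second-order finite differences at the optimum would approximate the observed information matrix.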

Practice of a Special TM-G Model
We now focus on a special distribution of the TM-G class, emphasizing its ability to adapt to real data.

Transmuted Muth Log-Logistic Distribution
There are as many distributions in the TM-G class as there are parental distributions. Here, we choose the log-logistic distribution as parental distribution, aiming to extend it for more statistical objectives. Basically, the log-logistic distribution is a continuous lifetime distribution used to model the time until an event occurs, such as cancer mortality after diagnosis or treatment. It is also used in hydrology to model the flow of a river or the level of needs, and in economics to model income inequality. From a probabilistic point of view, the log-logistic distribution corresponds to the distribution of a random variable whose logarithm is distributed according to a logistic distribution. It closely resembles the log-normal distribution, but is distinguished by heavier tails. Moreover, its cdf admits an explicit expression, unlike that of the log-normal distribution. Further details on the log-logistic distribution can be found in [29][30][31].
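Here, we extend the log-logistic distribution by applying the transmuted Muth scheme, as described in Section 2. First, we define the log-logistic distribution by the following cdf:
$$G(x; \theta) = \frac{x^{\theta}}{1 + x^{\theta}} = \left(1 + x^{-\theta}\right)^{-1}, \quad x > 0,$$
where θ > 0 is a shape parameter, the corresponding pdf being given as
$$g(x; \theta) = \frac{\theta x^{\theta - 1}}{\left(1 + x^{\theta}\right)^{2}}, \quad x > 0.$$
Then, based on (6), we introduce the transmuted Muth log-logistic (TMLL) distribution defined by the following cdf:
$$F(x; \Omega) = \left(1 + x^{-\theta}\right)^{\alpha} e^{\frac{1}{\alpha}\left[1 - \left(1 + x^{-\theta}\right)^{\alpha}\right]} \left\{1 + \lambda - \lambda \left(1 + x^{-\theta}\right)^{\alpha} e^{\frac{1}{\alpha}\left[1 - \left(1 + x^{-\theta}\right)^{\alpha}\right]}\right\}, \quad x > 0,$$
where Ω = (λ, α, θ). The corresponding pdf is obtained as
$$f(x; \Omega) = \theta x^{-\theta - 1}\left(1 + x^{-\theta}\right)^{\alpha - 1}\left[\left(1 + x^{-\theta}\right)^{\alpha} - \alpha\right] e^{\frac{1}{\alpha}\left[1 - \left(1 + x^{-\theta}\right)^{\alpha}\right]} \left\{1 + \lambda - 2\lambda \left(1 + x^{-\theta}\right)^{\alpha} e^{\frac{1}{\alpha}\left[1 - \left(1 + x^{-\theta}\right)^{\alpha}\right]}\right\}, \quad x > 0.$$
The hrf of the TMLL distribution is specified by
$$h(x; \Omega) = \frac{f(x; \Omega)}{1 - F(x; \Omega)}, \quad x > 0.$$
All the theory developed in the sections above can be applied to the TMLL distribution. Here, we go straight to the point by performing a graphical study on the critical points and possible shapes of f(x; Ω) and h(x; Ω). First, Figure 1 shows the possible shapes of f(x; Ω) by varying only one of the parameters. In particular, we see that the TMLL distribution is mainly unimodal, and is versatile in skewness and kurtosis. For the considered values of the parameters, λ mainly affects the degree of right-skewness, α mainly impacts the peak of the pdf (we observe that increasing α clearly increases this peak), and θ shows a strong influence on the mode and the overall curvature, making the pdf possibly decreasing. The decreasing property is mainly observed for the small values of θ. Figure 2 is about the possible shapes of h(x; Ω) for four sets of parameters. We observe in Figure 2 that the hrf can be decreasing, or increasing-decreasing with a single maximum. These characteristics are desirable for the modelling of diverse lifetime phenomena, as discussed in [25].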
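The TMLL functions are easy to evaluate numerically. The Python sketch below assumes the log-logistic parent with unit scale, G(x; θ) = x^θ/(1 + x^θ), so that G^(−α) = (1 + x^(−θ))^α, and plugs it into the TM-G construction. The function names and the parameter values are hypothetical choices for illustration.

```python
import math

def tmll_parts(x, alpha, theta):
    """Return (u, t, e, F_M): u = 1 + x^(-theta), t = u^alpha = G^(-alpha),
    e = exp((1 - t)/alpha), and the Muth log-logistic cdf F_M = t * e."""
    u = 1.0 + x ** (-theta)
    t = u ** alpha
    e = math.exp((1.0 - t) / alpha)
    return u, t, e, t * e

def tmll_cdf(x, lam, alpha, theta):
    _, _, _, F = tmll_parts(x, alpha, theta)
    return F * (1.0 + lam - lam * F)

def tmll_pdf(x, lam, alpha, theta):
    u, t, e, F = tmll_parts(x, alpha, theta)
    return (theta * x ** (-theta - 1.0) * u ** (alpha - 1.0)
            * (t - alpha) * e * (1.0 + lam - 2.0 * lam * F))

def tmll_hrf(x, lam, alpha, theta):
    return tmll_pdf(x, lam, alpha, theta) / (1.0 - tmll_cdf(x, lam, alpha, theta))

lam, alpha, theta = 0.5, -1.0, 2.0   # illustrative parameter values
eps = 1e-6
# The pdf should match a central finite difference of the cdf.
pdf_gaps = [
    abs((tmll_cdf(x + eps, lam, alpha, theta) - tmll_cdf(x - eps, lam, alpha, theta)) / (2 * eps)
        - tmll_pdf(x, lam, alpha, theta))
    for x in (0.3, 1.0, 2.5)
]
grid = [0.1 * k for k in range(1, 60)]
cdf_values = [tmll_cdf(x, lam, alpha, theta) for x in grid]
```

Evaluating `tmll_pdf` and `tmll_hrf` on a grid for several parameter sets gives a quick way to explore the kinds of shapes discussed above.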

Parametric Estimation
The parameters of the TMLL model can be estimated through the ML approach, as developed in Section 3.3. In order to check the convergence of the obtained estimates, we perform a simulation study with samples of varying sizes. We thus examine the numerical properties of the corresponding ML estimates, their mean square errors (MSEs), and the average lengths (ALs) and coverage probabilities (CPs) of the confidence intervals (CIs) at a certain fixed level.
More precisely, we generate N = 5000 random samples of n values from the TMLL distribution defined with some (known) values of the parameters through the inverse transform sampling method. Then, we determine the average ML estimates, the MSEs, and the ALs and CPs of the CIs at the levels 90% and 95%. We do that for chosen increasing values of n, that is n = 20, 50, 100, 200, 300, and 1000, in order to see if (i) the ML estimates tend to the true values of the parameters, (ii) the MSEs decrease to 0, (iii) the ALs become smaller and (iv) the CPs tend to the expected values, i.e., 0.90 or 0.95, depending on the considered level. The obtained results are put in Tables 1-3. From Tables 1-3, it is clear that the ML estimates converge to the corresponding values of the parameters. Furthermore, when n increases, the MSEs decrease to 0, the ALs become smaller and the CPs tend to the considered level value. This motivates us to use the ML approach in the estimation of the parameters of the TMLL model.
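The sampling step of such a simulation study can be sketched in Python as follows. The sketch assumes the TMLL cdf built from the unit-scale log-logistic parent, whose qf is (p/(1 − p))^(1/θ), and applies inverse transform sampling: a uniform draw is passed through the inverse transmutation map, then through a numerical inverse of the Muth step (a bisection is used here instead of the Lambert function, for brevity), then through the parental qf. Parameter values, the seed, the sample size and the function names are hypothetical; the empirical check at the end is a Kolmogorov-type distance, not one of the criteria reported in the tables.

```python
import math
import random

def tmll_cdf(x, lam, alpha, theta):
    t = (1.0 + x ** (-theta)) ** alpha
    F = t * math.exp((1.0 - t) / alpha)
    return F * (1.0 + lam - lam * F)

def tmll_quantile(u, lam, alpha, theta):
    # Step 1: invert the quadratic rank transmutation map.
    v = 2.0 * u / (1.0 + lam + math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u))
    # Step 2: invert the Muth step in the parental cdf value p (bisection).
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        t = mid ** (-alpha)
        if t * math.exp((1.0 - t) / alpha) < v:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    # Step 3: log-logistic quantile function.
    return (p / (1.0 - p)) ** (1.0 / theta)

rng = random.Random(2024)            # fixed seed for reproducibility
lam, alpha, theta = 0.5, -1.0, 2.0   # illustrative "true" parameter values
n = 2000
sample = sorted(tmll_quantile(rng.random(), lam, alpha, theta) for _ in range(n))

# Largest gap between the empirical cdf and the theoretical cdf.
ks = max(
    max(abs(tmll_cdf(x, lam, alpha, theta) - i / n),
        abs(tmll_cdf(x, lam, alpha, theta) - (i + 1) / n))
    for i, x in enumerate(sample)
)
```

Repeating this draw N times for each n, and fitting the model to each sample, would give the ingredients for the averages, MSEs, ALs and CPs described above.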

Applications
We now apply the TMLL model, along with the ML approach, to show its adequacy with environmental and survival data. The two considered data sets are described below.
Data set I: the first data set is obtained from [32]. Table 4 presents the descriptive statistics of these data sets. It is clear that the data sets mainly differ in their kurtosis. Now, we aim to compare the fitting power of the TMLL model with those of the following well-known models:
As a first step, we determine the ML estimates of all the parameters of the models in Tables 5 and 6, for data sets I and II, respectively. From these tables, concerning the TMLL model, we see that the parameter λ is estimated "relatively far" from 0, motivating the use of the transmuted scheme. Furthermore, the parameter α is estimated to be negative, attesting the importance of the new remarks formulated in Section 2.1.
We now compare the models through the following well-established criteria: Akaike and Bayesian information criteria (AIC and BIC), Cramér-von Mises (W), and Anderson-Darling (A). Roughly speaking, the smaller the values of these criteria, the better the model fits the data. The R software is used, along with the package AdequacyModel by [28]. Tables 7 and 8 collect the obtained results. Since it has the smallest values of AIC, BIC and A, the TMLL model is the best among them all, for the two data sets. In particular, it outperforms the BLL model, and thus represents a better extension of the log-logistic model for the considered data sets.
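For reference, the two information criteria have simple closed forms: AIC = 2k − 2ℓ̂ and BIC = k log(n) − 2ℓ̂, where k is the number of estimated parameters, n the sample size and ℓ̂ the maximized log-likelihood. The Python sketch below applies them to made-up numbers; the model labels and log-likelihood values are hypothetical placeholders and are not the results of Tables 7 and 8.

```python
import math

def aic(loglik_max, k):
    """Akaike information criterion: 2k - 2 * maximized log-likelihood."""
    return 2.0 * k - 2.0 * loglik_max

def bic(loglik_max, k, n):
    """Bayesian information criterion: k * log(n) - 2 * maximized log-likelihood."""
    return k * math.log(n) - 2.0 * loglik_max

# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
n = 50
fits = {"model_A": (-100.0, 3), "model_B": (-101.5, 2), "model_C": (-99.8, 4)}

table = {name: (aic(ll, k), bic(ll, k, n)) for name, (ll, k) in fits.items()}
best_by_aic = min(table, key=lambda name: table[name][0])
best_by_bic = min(table, key=lambda name: table[name][1])
```

Both criteria penalize the number of parameters, the BIC more heavily as soon as n exceeds about 7, so a model with more parameters is preferred only if its likelihood gain offsets the penalty.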
We now illustrate graphically the adequacy of the model by plotting the following objects:
• The curve of the estimated cdf of the TMLL model, i.e., F(x; Ω̂), as well as the curves of the estimated cdfs of the other models over the curve of the empirical cdf for data sets I and II in Figures 3 and 4, respectively.
• The curve of the estimated pdf of the TMLL model, i.e., f(x; Ω̂), as well as the curves of the estimated pdfs of the other models, all over the histogram for data sets I and II in Figures 5 and 6, respectively.
Visually, the fits of the TMLL model are quite satisfying, as anticipated.

Concluding Remarks and Perspectives
This study contributes to the development of the M-G class by showing new facts, and by proposing a well-motivated extension, called the transmuted Muth generated class of distributions. We derive several of its probabilistic and analytical properties, including the expressions of the pdf and hrf, discussions on their critical points, the expression of the qf, series expansions for various moments and theory on the estimation of the parameters. Then, a focus is put on a special distribution of the TM-G class, called the transmuted Muth log-logistic distribution. We show that this distribution is flexible enough to fit various symmetrical or right-skewed data. Combined with the maximum likelihood approach, the fit behavior of the model is discussed through the analysis of two real data sets. The adequacy of the new model is quite satisfactory, surpassing that of the well-known beta log-logistic model. As further developments, one can investigate other special distributions of the TM-G class, such as those with support in (0, 1) or R, finding applications in diverse regression models. The regression aspect, however, needs further investigation in the future.