A New Kumaraswamy Generalized Family of Distributions with Properties, Applications, and Bivariate Extension

: For bounded unit interval, we propose a new Kumaraswamy generalized (G) family of distributions through a new generator which could be an alternate to the Kumaraswamy-G family proposed earlier by Cordeiro and de Castro in 2011. This new generator can also be used to develop alternate G-classes such as beta-G, McDonald-G, Topp-Leone-G, Marshall-Olkin-G, and Transmuted-G for bounded unit interval. Some mathematical properties of this new family are obtained and maximum likelihood method is used for the estimation of G-family parameters. We investigate the properties of one special model called the new Kumaraswamy-Weibull (NKwW) distribution. Parameters of NKwW model are estimated by using maximum likelihood method, and the performance of these estimators are assessed through simulation study. Two real life data sets are analyzed to illustrate the importance and ﬂexibility of the proposed model. In fact, this model outperforms some generalized Weibull models such as the Kumaraswamy–Weibull, McDonald–Weibull, beta-Weibull, exponentiated-generalized Weibull, gamma-Weibull, odd log-logistic-Weibull, Marshall–Olkin–Weibull, transmuted-Weibull and exponentiated-Weibull distributions when applied to these data sets. The bivariate extension of the family is also proposed, and the estimation of parameters is dealt. The usefulness of the bivariate NKwW model is illustrated empirically by means of a real-life data set. ,

Kumaraswamy [18] pioneered a two-parameter model for bounded unit interval (0, 1) which we denote here by using random variable (rv) T ∼ Kw(a, b). The cumulative distribution function (cdf) and probability density function (pdf) of T are and respectively, where a > 0 and b > 0 are shape parameters.
Some other special models of the Kw-G family were also reported in the literature but these suffer non-identifiability issue (when two parameters appear, for example, in a product and it is impossible to determine their individual effects). These special models are: power function, Burr III, generalized linear failure rate, exponentiated-Pareto, exponentiated Burr and exponentiated-Lomax. Note 1. The citations and the references of the authors of special models of the Kw-G family [8] are avoided in this section and in references to save some space.
Alzaatreh et al. [21] proposed a general method for constructing G-families by using the transformed-transformer (T-X) approach. Let r(t) be the pdf and R(t) be the cdf of a rv T ∈ [a, b] for −∞ < a < b < ∞ and let W[G(x)] be a function of the cdf G(x) or survival function (sf) G(x) = 1 − G(x) of any baseline rv W(·) is known as generator such that W[G(x)] satisfies three conditions: ] is differentiable and monotonically non-decreasing, and The cdf of the T-X family is where W[G(x)] satisfies the conditions (i)-(iii).
(ii) The proposed extension of the Kumaraswamy-G model is based on a new generator W[G(x)] = 1 −Ḡ(x) G(x) for T ∈ (0, 1) instead of the only existing generator G(x) for which the beta-G, Kw-G, Mc-G and TL-G classes were developed so far.
(iii) The proposed generator 1 −Ḡ(x) G(x) seems little complicated in comparison to earlier well-established generator for the unit interval but it has the ability to produce better estimates and goodness-of-fit (GoF) tests results that can make it distinguishable and attractive for applied researchers (as evident from the results in Sections 5 and 7).
(iv) For most of the families and models, if the cdf is in closed form, then the quantile function (qf) can be straightforward to obtain. In some families and models, where the qf is based on some special functions such as beta, gamma, and others, then the qfs can only be determined by using power series. In our case, the cdf of the family is in closed form but the qf can be obtained only numerically.

Note 2.
A complete and independent investigation of the properties and application of our proposed generator F(x) = 1 −Ḡ(x) G(x) as a new family such as transmuted-G (Tr-G) and exponentiated-generalized-G (EG-G) will appear in another outlet very soon. It is noted here that the two G-families (Tr-G and EG-G) have not been developed from any existing parent model similar to our proposed one.
The paper is unfolded as follows. In Section 2, we define the new Kumaraswamy generalized (NKw-G) family. In Section 3, some of its mathematical properties are determined from a useful linear representation of the family density. We investigate the asymptotics and shapes of the density and hazard rate, ordinary and incomplete moments, generating function, mean deviations and estimation of the model parameters. Several properties of a special model viz. new Kumaraswamy Weibull (NKwW) distribution are discussed in Section 4. A simulation study is also conducted to assess the performance of maximum likelihood estimators of the newly proposed model in this section. In Section 5, the usefulness of new model is illustrated by means of two real-life data sets. In Section 6, we define the Bivariate New Kumaraswamy G-family of distributions. In Section 7, the usefulness of the new bivariate model is illustrated by means of a real-life data set. In fact, we prove empirically that our proposed model outperforms some well-known univariate and bivariate distributions. Finally, Section 8 offers some concluding remarks.
The important feature of our article is that the proposed model from this new generalized family, NKwW, is better in performance as compared to some well-known (or well-established) generalized Weibull models selected from the statistical literature. It can be noted from Section 5 that the Kolmogrov-Smirnov GoF statistic yields minimum GoF values along with high p-values of this statistic. Furthermore, it can also been observed from Section 5 that the GoF values of some other well-established statistics such as Akaike information Criterion, Bayesian Information Criterion, Hannan-Quinn Information Criterion, Anderson-Darling and Cramér-von Mises for our propose model are smallest as compared to some important generalized Weibull models. This fact reveals that the performance and flexibility of our proposed model is better in comparison to all other competitive models, when applied to these selected real-life data sets. The same fact is valid for our proposed bivariate model (see, Section 7).

The NKw-G Family
For W G(x) = G(x) and T ∈ (0, 1) just only the beta-G, Kw-G, Mc-G and TL-G families were reported so far. No other generators for T ∈ (0, 1) were published until now. Therefore, our main objective is to introduce a new family of distributions for T ∈ (0, 1) called the NKw-G family and to study its main structural properties.
Let r(t) be the Kumaraswamy density. By inserting Equation (2) in Equation (5) and letting G(x) , the cdf of the NKw-G family is given by where a > 0 and b > 0 are two shape parameters of the Kw distribution and ξ is the vector of the baseline parameters.
The pdf corresponding to Equation (7) becomes Henceforth, a rv X with the density (8) is denoted by X ∼ NKw-G(a, b, ξ). The survival function (sf) S(x) and hazard rate function (hrf) h(x) of X are, respectively,

Properties of the NKw-G Family
In this section, we obtain some mathematical properties of the NKw-G family.

Quantile Function
The most common and simplest method for generating random variates is based on the inverse cdf. For an arbitrary cdf, the quantile function (qf) is define as The qf of the NKw-G family can be determined by inverting (7) and then solving the two non-linear equations numerically. We can use the following procedure: (ii) Find w = w(u) numerically in w log(1 − w) = log(z) using any Newton-Raphson algorithm; (iii) Solving numerically for x in G(x; ξ) = w yields the qf x = Q(u) of X.

Asymptotics
The following asymptotics for the density, distribution function and hrf of X hold. Corollary 1. The asymptotics of Equations (7)-(9) when x → 0 or (G(x) → 0) are

Analytic Shapes of the Density and Hazard Rate Function
The shapes of the density and hrf of X can be described analytically. The critical points of the density of X are the roots of the equation: The critical points of the hrf of X are obtained from the equation:

Linear Representation of the NKw-G Density
Here, we derive useful expansions for Equations (7) and (8) based on the concept of exponentiated distributions. For an arbitrary baseline cdf G(x; ξ), a rv is said to have the exponentiated-G (exp-G) distribution with power parameter a > 0 if its cdf and pdf are respectively.
The properties of the exponentiated distributions were studied by many authors in recent years. We consider the generalized binomial expansion which holds for any real non-integer b and |z| < 1. Using (11) twice in the following expression ). Then, we can expand Equation (7) as Furthermore, using Mathematica, the power series holds where By inserting Equation (13) in Equation (12) and noting that ∑ ∞ j=0 w j+1 = 1, we obtain where By differentiating F(x), the NKwG density has the form where h i+1 (x; ξ) is the exp-G density with power parameter (i + 1). Equation (16) reveals that the NKw-G density function is a linear combination of exp-G densities. Then, some of its mathematical properties can be determined directly from those of the exp-G distribution.

Mathematical Properties
The formulae derived throughout the paper can be easily handled in most symbolic computation platforms such as Maple, Mathematica and Matlab which have the ability to deal with analytic expressions of formidable size and complexity. Henceforth, let Y i+1 be a rv with the exp-G distribution with power parameter (i + 1). We obtain some mathematical quantities of the NKw-G family from (16) and those properties of the exp-G distribution. The exp-G properties are known for at least fifty distributions; see those distributions listed in Tahir and Nadarajah [16].
First, the nth ordinary moment of X, say E(X n ), can be expressed from (16) as where is the qf of the baseline G. The quantities E(Y n i+1 ) are known for many G distributions as can been seen in those papers cited in Tahir and Nadarajah (2015).
Moments are important in any statistical analysis. Some of the most important features of a distribution can be studied through moments. For instance, the first four moments can be used to describe some characteristics of a distribution. Clearly, the central moments and cumulants of X can be determined from (17) using well-known relationships.
Second, the nth lower incomplete moment of X, say m n (y) = The last two integrals can be evaluated numerically for most G distributions.
The first incomplete moment m 1 (y) is used to construct the Bonferroni and Lorenz curves (popular measures in economics, reliability, demography, insurance, and medicine) and to determine the totality of deviations from the mean and median of X (important statistics in statistical applications).
Third, for a given probability π, the Bonferroni and Lorenz curves (popular measures in economics, reliability, demography, insurance and medicine) of X are given by B(π) = m 1 (q)/(π µ 1 ) and L(π) = m 1 (q)/µ 1 , respectively, where q = Q(π; ξ) can be found from the procedure described at the last paragraph of Section 2.
Fourth, the total deviations from the mean and median are δ 1 = 2µ 1 F(µ 1 ) − 2m 1 (µ 1 ) and Fifth, the moment generating function (mgf) M(t) = E(e t X ) of X follows from (16) as where Hence, we can obtain the mgfs of many special NKw-G distributions directly from exp-G generating function and Equation (19).

Estimation
Here, we consider the estimation of the unknown parameters of the NKw-G family by the maximum likelihood method. The MLEs enjoy desirable properties and deliver simple approximations that work well in finite samples when constructing confidence intervals. The normal approximation for the MLEs can be handled either analytically or numerically.
The log-likelihood function (θ) for the vector of parameters θ = (a, b, ξ) from n observations x 1 , · · · , x n has the form The MLE θ of θ can be evaluated by maximizing (θ). There are several routines for numerical maximization of (θ) in the R program (optim function), SAS (PROC NLMIXED), Ox (sub-routine MaxBFGS), among others.
All distributions belonging to the NKw-G family can be fitted to real data using the AdequacyModel package for the R statistical computing environment (https://www.r-project.org/). An important advantage of this package is that it is not necessary to define the log-likelihood function and that it computes the MLEs, their standard errors and some GoF statistics. We only need to provide the pdf and cdf of the distribution to be fitted to a data set.
Alternatively, we can differentiate the log-likelihood and solving the resulting nonlinear likelihood equations. Then, the score components with respect to a, b and ξ are ∂ξ are column vectors of the same dimension of ξ. Setting the score components to zero and solving them simultaneously yields the MLEs of the model parameters. The resulting equations cannot be solved analytically, but some statistical softwares can be used to solve them numerically through iterative Newton-Raphson type algorithms.
For interval estimation and hypothesis tests on the model parameters, we can obtain the (p + 2) × (p + 2) observed information matrix J(θ) numerically (p is the dimension of ξ) since the expected information matrix K(θ) is very complicated and requires numerical integration.
Under standard regularity conditions, we have ∼ means approximately distributed and K(θ) is the expected information matrix. The asymptotic behavior remains valid if K(θ) is replaced by the observed information matrix J(θ) evaluated at θ, i.e., J( θ). The multivariate normal N p+2 (0, J( θ) −1 ) distribution can be used to construct approximate confidence intervals for the model parameters.

The NKwW Distribution
We now define the NKwW distribution by taking the Weibull baseline with cdf . Then, the cdf and pdf of the NKwW distribution are, respectively, given by and Henceforth, a rv with density (21) is denoted by X ∼ NKwW(a, b, α, β). The hrf of X has the form Figures 1 and 2 display some plots of the pdf and hrf of X for selected parameter values. Figure 1 reveals that the NKwW distribution is right-skewed and reversed-J shaped. Also, Figure 2 shows that the NKwW hrf can produce increasing, decreasing, bathtub and upside-down bathtub shapes.

Linear Representation
The cdf of the NKwW distribution follows from Equation (14) is By expanding the binomial term in (22) and noting that ∑ ∞ i=2 t i = 1, we can write and then by changing the index p by (p + 1) we get Let δ p = 2 for p = 0, 1, 2 and δ p = p for p ≥ 3. We can interchange the sums conveniently to obtain By differentiating the last expression, the NKwW density can be expressed as where denotes the Weibull density with scale parameter (p + 1)α and shape parameter β.
Equation (23) shows that the NKwW density is a linear combination of Weibull densities. Therefore, several NKwW mathematical properties can be derived from those of the Weibull distribution.

Properties
Let Z p be a rv with density π(x; (p + 1)α, β). Then, several properties of X can follow from those of Z p . First, the nth ordinary moment of X can be written as Second, the cumulants (κ n ) of X can be determined recursively from (24) The skewness γ 1 = κ 3 /κ 3/2 2 and kurtosis γ 2 = κ 4 /κ 2 2 of X can be calculated from the third and fourth standardized cumulants. The skewness and kurtosis plots of the NKwW distribution are displayed in Figure 3. These plots reveal that the parameters a and b play a significant role in modeling the skewness and kurtosis behaviors of X. Third, we derive an approximation for the density of the sample average X = ∑ n i=1 X i / √ n of independent and identically (iid) rvs X 1 , · · · , X n with density (21). Without loss of generality, we can replace each X i by (X i − µ 1 )/Var(X i ) in order to simplify the approximation. By doing this, the previous third and fourth standardized cumulants are γ 1 = µ 3 and γ 2 = µ 4 − 3. Furthermore, we require the first six Hermite polynomials defined by (−1) n ∂ n φ(x)/∂x n = H r (x) φ(x) for n ≥ 0, where φ(x) is the standard normal pdf. They satisfy the recurrence equation H r (x) = yH r−1 (x) − (r − 1)H r−2 (x) (r ≥ 2) and follow as H 0 (x) = 1, The second-order Edgeworth expansion for the sample mean X of standardized NKwW rvs can be expressed as It is much more frequent in statistical applications to compute distribution functions than density functions. By integrating Equation (25), the cdf of X has the form where Φ(x) is the standard normal cdf. Equation (26) provides highly accurate results for the probabilities associated with Y. Fourth, the nth incomplete moment of X, denoted by m n (y) = E(X n | X ≤ y) = y 0 x n f NKwW (x)dx, can easily be obtained by changing variables from the lower incomplete gamma function γ(s, x) = ∞ 0 x s−1 e −x dx when calculating the corresponding moment of Z p . Then, we obtain Fifth, the first incomplete moment m 1 (z) is used to to determine the totality of deviations from the mean and median of a distribution and construct the Bonferroni and Lorenz curves. The total deviations from the mean and median M of X can be expressed as δ 1 = 2µ 1 F NKwW (µ 1 ) − 2m 1 (µ 1 ) and δ 2 = µ 1 − 2m 1 (M), where M can be determined from F NKwW (M) = 0.5. The Bonferroni and Lorenz curves of X for a given probability π are given by B(π) = m 1 (q)/(π mu 1 ) and L(π) = π B(π), respectively, where q = Q ( π) is the qf of X discussed in Section 4.1.
The R script to generate observations from the NKwW distribution is given in the Appendix A.
Here we study the performance and accuracy of maximum likelihood estimates of the NKwW parameters using Monte Carlo simulations. The simulation study is carried out for sample sizes n = 25, 50, 75, 100, 200 and parameter scenarios: I: α = 0.5, β = 0.5, a = 2.5, and b = 1.5, II: α = 1.5, β = 1.5, a = 1.5, and b = 1.5 and III: α = 1.1, β = 5.5, a = 0.5, and b = 0.5. We used the above algorithm for sample generation whose R-codes ae given in Appendix A. The simulation study is repeated for N = 1000 times each with given sample size and computed the average estimates (AE) along with their average biases (Bias)of the MLEs and mean squared errors (MSE).
We display Bias and MSE for the parameters α, β, a and b in Figures 4 and 5, respectively, which indicate that as sample size increases the bias and MSE decreases. Thus, MLEs perform well in estimating the parameters of the NKwW distribution.

Estimation
Let x 1 , · · · , x n be a sample of size n from the NKwW distribution given in Equation (21). The log-likelihood function = (θ) for the vector of parameters θ = ( α, β, a, b) is The function can be easily maximized using the AdequacyModel package. The components of the score vector U(θ) are The MLEθ of θ can be obtained by solving the nonlinear equations U α = 0, U β = 0, U a = 0 and U b = 0. These equations cannot be solved analytically and statistical software can be used to obtain the estimates numerically. We can use iterative techniques such as a Newton-Raphson type algorithm to obtainθ using a wide range of initial values. The initial values for the parameters are important but are not hard to obtain from the fit of the Weibull distribution. This process often results or leads to more than one maximum. However, in these cases, we consider the MLEs corresponding to the largest value of the maximum. In a few cases, no maximum is identified for the selected initial values. In these cases, a new initial value is tried in order to obtain a maximum.
All the calculations in these two applications are performed using the AdequacyModel package in R. The unknown parameters of the models are estimated by the maximum likelihood method.  Table 1 and Table 2 list the MLEs and their standard errors (SEs) for the NKwW distribution and other competitive models (KwW, BW, EGW, McW, GaW, OLLW, MOW, TrW and W) fitted to the two hydrological data sets. The values of the GoFS in Table 3 and Tabl 4 indicate that the NKwW model shows small values of these statistics and hence it provides the best fit as compared to the other models. The plots in Figures 6 and 7 also support our claim.    Table 3. The statistics AIC, BIC, HQIC, A * , W * , K-S and p-value for data set 1.

Bivariate New Kumaraswamy G-Family
In this Section, we introduce a bivariate extension of the NKw-G family according to Marshall and Olkin shock model (see, Marshall and Olkin, [47]). Several authors used the Marshall and Olkin approach as a method to generate bivariate distributions, see for example Sarhan and Balakrishnan, [48], Kundu and Dey [49], El-Gohary et al. [50], Muhammed [51], El-Bassiouny et al. [52], Ghosh and Hamedani [53], El-Morshedy et al. [54,55], Eliwa et al. [56], Hussain et al. [57], among others. The bivariate new Kumaraswamy (BvNKw) G-family is constructed from three independent NKw-G families by using a minimization process. Assume three independent rvs Y k ∼ NKw-G(a, b k , ξ); k = 1, 2, 3 and defining X j = min{Y j , Y 3 }; j = 1, 2, the bivaraite random vector X is said to have the BvNKw-G family with parameters vector Υ =(a, b 1 , b 2 , b 3 , ξ) if its joint reliability function (jrf) is given by The marginal reliability functions (rfs) corresponding to (28) can be written as The corresponding joint pdf (jpdf) to (28) can be formulated as where the jpdf in Equation (30) can be derived from a well-known formula (see Eliwa and El-Morshedy, [58]). The marginal pdfs corresponding to Equation (29) can be proposed as If X have the BvNKw-G family, then the distributions of max{X 1 , X 2 } and min{X 1 , respectively. If X j ∼ NKw-G(a, b j + b 3 , ξ); j = 1, 2, then the coefficient of correlation between X 1 and X 2 is The BvNKw-G family has a singular part along the line ξ); j = 1, 2, the jrf of the proposed family can be derived by using copula of the Marshall-Olkin model as . For more details on copula property, see Gijbels et al. [59] and Husková and Maciak [60].

The MLE for the BvNKw-G Family
In this section, the unknown parameters of the BvNKw-G family are estimated by using the maximum likelihood approach. Assuming that (x 11 , x 21 ), (x 12 , x 22 ), ..., (x 1p , x 2p ) is a sample of size p from the BvNKw-G family where |Λ s | ; s = 1, 2, 3 and |Λ| = p = p 1 + p 2 + p 3 . Using Equation (30), the likelihood function l(Υ) can be expressed as Through differentiation of the term L (Υ) = log l(Υ) with respect to a, b 1 , b 2 , b 3 and ξ, and then equating the resulting equations to zeros, we get the non-linear normal equations. An iterative procedure such as Newton-Raphson technique is required to solve them numerically.

Empirical Illustrations of BvNKwW Model Through Motors Data
In this Section, the flexibility of the BvNKwW model is shown through a real-life data application. This data is reported in Relia and Staff [61] which represents the failure times of a parallel system constituted by two identical motors in days. The fitted bivariate models are compared using some statistical criteria, namely AIC, CAIC, BIC and HQIC. To fit the marginals of the BvNKwW model, the K-S with its p-value are used. The BvNKwW model is compared with other bivariate distributions such as: bivariate generalized power Weibull (BvGPW), bivariate exponentiated Weibull (BvEW), bivariate Weibull (BvW), bivariate generalized exponential (BvGEx), bivariate exponential (BvEx), and bivariate generalized linear failure rate (BvGLFR) distributions when applied to this data set. At first, the marginals X 1 , X 2 and max(X 1 , X 2 ) are fitted separately to this data set. The MLEs of the parameters (a, b, α, β) of the corresponding NKwW distribution for X 1 , X 2 and max(X 1 , X 2 ) are (60.5030, 732.6059, 0.9694, 0.17440), (1.8182, 84.4223, 0.0019, 0.9434) and (27.8306, 263.4773, 0.4649, 0.2626), respectively. The −L, K-S, p-value for the marginals are reported in Table 5. Table 5. The -L, K-S and p-values for X 1 , X 2 and max(X 1 , X 2 ).  Table 5 lists that the NKwW model fits to the real data for the marginals. Figures 11-13 show the fitted pdf, cdf and probability-probability (pp) plots, which support our empirical results in Table 5.  Figure 11. The fitted pdfs plots for X 1 , X 2 and max(X 1 , X 2 ).  Figure 13. The pp-plots for X 1 , X 2 and max(X 1 , X 2 ). Figure 14 show the box and TTT plots for the X 1 , X 2 and max(X 1 , X 2 ), and the scatter plot for the motors data. It is noted that the BvNKwW model can be used to analyze and the real data on motors. The MLEs, −L, AIC, CAIC, BIC and HQIC values for the BvNKwW model and some competitive models are listed in Table 6.  From Table 6, it is observed that the BvNKwW model provides a better fit as compared to competitive models.

Concluding Remarks
Proposing new and flexible models through G-classes is an active research area in distribution theory. The new era has proved that flexible models can prove very helpful to researchers and practitioners in investigating data genertated from different phenomenons. G-classes are one of the basic source which provides a paradigm to data related science and its investigation.
The purpose of our article is to contribute a new G-family and hence a new Kumaraswamy-G family of distributions is introduced from a new generator 1 −Ḡ(x) G(x) for support (0,1), that has ability to serve as an alternative to well-known Kumaraswamy-G family (pioneered in 2011) and other classes of distributions for T ∈ (0, 1). The proposed generator adopted here involves a different function of the cumulative function instead of existing generator which is only based on G(x). In the literature, beta-G, Kw-G, Mc-G and TL-G families were introduced from the existing generator G(x) for bounded unit interval. Therefore, similar G-families can be developed from our proposed generator 1 −Ḡ(x) G(x) . We obtain some structural properties of this new Kumaraswamy-G family, and also study some properties of the special model called the new Kumaraswamy-Weibull (NKwW) distribution. We compare this distribution with the well-known generalized Weibull models (Kumaraswamy-Weibull, McDonald-Weibull, beta-Weibull, exponentiated-generalized Weibull, gamma-Weibul, odd log-logistic-Weibull, Marshall-Olkin-Weibull, transmuted-Weibull and Weibull) using six popular GoF test-statistics. We found that the new distribution provides better estimates and minimum GoF-tests values. Thus, the NKwW distribution outperforms the well-established competitive models on the basis of numerical and graphical analysis. Similarly, the BvNKwW distribution is introduced, and is compared with other well-known bivariate models such as bivariate generalized power Weibull, bivariate exponentiated Weibull, bivariate Weibull, bivariate generalized exponential, bivariate exponential, and bivariate generalized linear failure rate distributions. The results of popular goodness-of-fit statistics showed that our proposed bivariate model is better as compared to other well-known bivariate models. We expect that this new family will be able to attract readers and applied statisticians.