Abstract
In this article, a new flexible probability density function with three parameters is proposed for modeling asymmetric data (positive and negative) with different types of kurtosis (mesokurtic, leptokurtic and platykurtic). Some of its statistical and reliability properties, including hazard rate function, moments, moment generating function, incomplete moments, mean deviations, moment of the residual life, moment of the reversed residual life, and order statistics are derived. Its hazard rate function can be either constant, increasing-constant, decreasing-constant, U shape, upside down shape or upside down-U shape. Seven classical estimation methods are considered to estimate the unknown model parameters. Monte Carlo simulation experiments are performed to compare the performance of the seven different estimation methods. Finally, a distinctive asymmetric real data application is analyzed for illustrating the flexibility of the new model.
1. Introduction
Recently, Nadarajah and Haghighi [1] presented and studied a new lifetime model and pointed out that its probability density function (PDF) has a zero mode. A random variable (RV) Z is said to follow the Nadarajah and Haghighi (NH) model if its survival function (SF) and PDF are given by
and
respectively, where is a shape parameter. Nadarajah and Haghighi [1] considered (2) as an alternative to the exponential (Exp), gamma (Gam), Weibull (W) and exponentiated-exponential (ExpExp) distributions. Several extensions of the NH model can be cited, such as the exponentiated NH (E-NH) model by Lemonte [2], the Gam NH (Gam-NH) and Poisson Gam NH (PGam-NH) by Ortega et al. [3], transmuted NH (Tr-NH) by Ahmed et al. [4], Kumaraswamy NH (Kuw-NH) by Lima [5], modified NH (Mo-NH) by El-Damcese and Ramadan [6], Marshall–Olkin NH (MO-NH) by Lemonte et al. [7], Topp–Leone NH (TL-NH) by Yousof and Korkmaz [8], beta NH (B-NH) by Dias et al. [9], inverted NH (I-NH) by Tahir et al. [10], and NH Lindley (NH-L) by Pena-Ramirez et al. [11]. The PDF and cumulative distribution function (CDF) of the Topp–Leone exponentiated-G (TLE-G) family are given by
and
respectively, where is the CDF of any baseline model and is the PDF of any baseline model. For we get the TL family. By inserting (1) and (2) into (3), we can write the PDF of the TLE-NH model as
where After a quick study of TLE-NH properties, different classical estimation methods under uncensored schemes are considered, such as the maximum likelihood (ML), Anderson–Darling (AD), ordinary least squares (OLS), Cramér–von Mises (CVM), weighted least squares (WLS), left-tail Anderson–Darling (LTAD), and right-tail Anderson–Darling (RTAD) methods. Numerical simulations are performed for comparing the estimation approaches using different sample sizes for three different combinations of parameters. The corresponding CDF is given by
For the TLE-NH reduces to the TL-NH (see [8]). We provide some plots of the PDF and hazard rate function (HRF) of the TLE-NH model to show its flexibility. The CDF in (6) can be expressed as
where and is the CDF of the E-NH model with power parameter . The corresponding TLE-NH density function can be formulated as
where is the E-NH PDF with power parameter . Figure 1a displays some plots of the TLE-NH density for some parameter values of , and . The plots of the HRF of the TLE-NH model for some parameter values of , and are obtained in Figure 1b.
Figure 1.
Plots of the TLE-NH PDF (a) and HRF (b) for some parameter values.
Figure 1a shows that the PDF of the new model is right-skewed with a variety of shapes, whereas Figure 1b shows that the HRF of the TLE-NH can take many important shapes: constant, increasing-constant, decreasing-constant, U shape, upside-down shape and upside-down-U shape.
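The construction described above can be sketched numerically. The displayed equations are not reproduced in this text, so the sketch below rests on two loudly stated assumptions: the NH baseline is taken in its one-shape-parameter form with SF exp{1 − (1 + z)^θ}, and the TLE-G CDF is assumed to be [1 − (1 − G^β)^2]^α. The names `alpha`, `beta` and `theta` are hypothetical labels, not the paper's notation.

```python
import math


def nh_cdf(z, theta):
    """CDF of the one-parameter NH model (assumed form), z > 0."""
    return -math.expm1(1.0 - (1.0 + z) ** theta)


def nh_pdf(z, theta):
    """PDF of the one-parameter NH model (assumed form), z > 0."""
    return theta * (1.0 + z) ** (theta - 1.0) * math.exp(1.0 - (1.0 + z) ** theta)


def tle_nh_cdf(z, alpha, beta, theta):
    """ASSUMED TLE-G form with NH baseline: [1 - (1 - G^beta)^2]^alpha."""
    g = nh_cdf(z, theta)
    return (1.0 - (1.0 - g ** beta) ** 2) ** alpha


def tle_nh_pdf(z, alpha, beta, theta):
    """Derivative of the assumed CDF above with respect to z (valid for z > 0)."""
    g = nh_cdf(z, theta)
    gb = g ** beta
    return (2.0 * alpha * beta * nh_pdf(z, theta) * g ** (beta - 1.0)
            * (1.0 - gb) * (1.0 - (1.0 - gb) ** 2) ** (alpha - 1.0))


def tle_nh_hrf(z, alpha, beta, theta):
    """Hazard rate f / (1 - F); only meaningful while 1 - F has not underflowed."""
    return tle_nh_pdf(z, alpha, beta, theta) / (1.0 - tle_nh_cdf(z, alpha, beta, theta))


# Illustrative (hypothetical) parameter values, not those used in Figure 1.
for z in (0.25, 0.5, 1.0, 2.0):
    print(z, round(tle_nh_pdf(z, 1.5, 2.0, 0.8), 5), round(tle_nh_hrf(z, 1.5, 2.0, 0.8), 5))
```

Evaluating `tle_nh_hrf` on a grid for different parameter triples is one way to explore hazard shapes of the kind listed above, provided the assumed functional form matches the paper's equations.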
We are motivated to introduce and study the TLE-NH model for the following reasons:
- i. The new density in (5) can be “asymmetric unimodal and right skewed” with many useful shapes.
- ii. The HRF of the new model can be constant, increasing-constant, bathtub (U-HRF), decreasing-constant and upside-down (reversed U-HRF). These characteristics give the TLE-NH model a great advantage in analyzing data sets whose HRF is constant, increasing-constant, bathtub, decreasing-constant or upside-down bathtub.
- iii. The new TLE-NH model is recommended for modeling the remission times (in months) of the bladder cancer patients. The bladder cancer data have some extreme values.
- iv. Also, the nonparametric kernel density estimate of these data is asymmetric with a heavy right tail. Therefore, the TLE-NH model could be useful for asymmetric real data, especially unimodal heavy-tailed right-skewed data and bimodal heavy-tailed right-skewed data.
- v. On the other hand, the new TLE-NH model is flexible enough to exhibit asymmetric densities and heavy right-tail shapes, as illustrated in Figure 1a.
- vi. Moreover, the HRF of the bladder cancer data is upside-down; this property matches our new model, which admits the upside-down HRF illustrated in Figure 1b. It is vital to mention that the presented class of probability distributions suitable for modeling asymmetric data also has important uses in insurance (see Maciak et al. [12] for more details) and in dependence modeling (see Gijbels et al. [13]).
- vii. The skewness of the TLE-NH model falls in the interval (). However, the skewness of the standard baseline NH model falls in the interval (). The wide range of the skewness gives priority to the TLE-NH model in modeling and future prediction, since many real-life data sets are negatively skewed. The standard baseline NH model cannot be useful in such cases (see Tables 1 and 2 and Figures 2 and 3).
- viii. The kurtosis of the TLE-NH model is located between and , whereas the kurtosis of the NH model ranges from to . Thus, the TLE-NH extension could be useful for mesokurtic, leptokurtic and platykurtic data sets (see Tables 1 and 2 and Figures 2 and 3).
- ix. Parameter estimation for the TLE-NH model can be performed under the maximum likelihood, Anderson–Darling, ordinary least squares, Cramér–von Mises, weighted least squares, left-tail Anderson–Darling, and right-tail Anderson–Darling methods. Although all estimation methods perform well, the weighted least squares method is the best in real data modeling, with only slight differences in results.
2. Properties
2.1. Moments
The ordinary moment of is given by Then we obtain
where , and refers to the complementary incomplete Gamma function. The moments in (9) reduce to ( integer)
when in (9) and (10), we have the mean of . The central moment of , say , is
Table 1 lists the expected value E(Z), variance V(Z), skewness S(Z) and kurtosis K(Z) for the TLE-NH model, whereas Table 2 reports the same quantities for the NH model. From Tables 1 and 2, we note that the range of S(Z) for the TLE-NH model is (), whereas the S(Z) of the NH model is (). The K(Z) of the TLE-NH model is located between and , whereas the K(Z) of the NH model ranges from to . Figure 2 shows three-dimensional (3-D) skewness plots for 0.01, 0.25, 5.75, 75, 150, 1000. Figure 3 shows 3-D kurtosis plots for 0.01, 0.25, 5.75, 75, 150, 1000. Figures 2 and 3 illustrate how S(Z) and K(Z) change with respect to the new parameter . Table 1 and the plots in Figures 2 and 3 show that the proposed model can be used for analyzing asymmetric data with different types of kurtosis.
Table 1.
E(Z), V(Z), S(Z) and K(Z) for the TLE-NH model.
Table 2.
E(Z), V(Z), S(Z) and K(Z) for the NH model.
Figure 2.
3-D skewness plots for 0.01, 0.25, 5.75, 75, 150, 1000 (a–f).
Figure 3.
3-D kurtosis plots for 0.01, 0.25, 5.75, 75, 150, 1000 (a–f).
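The four summary measures E(Z), V(Z), S(Z) and K(Z) can be checked numerically. The sketch below does this for the baseline NH model only, assuming its one-shape-parameter form with SF exp{1 − (1 + z)^θ} (the symbol `theta` is a hypothetical label). It uses the fact that T = (1 + Z)^θ − 1 is then standard exponential, so E(Z^r) is the integral of ((1 + t)^{1/θ} − 1)^r e^{−t} over t > 0, evaluated here with a composite Simpson rule. The truncation point `upper` is adequate for moderate θ and would need to be raised for small θ, where the integrand decays slowly.

```python
import math


def nh_raw_moment(r, theta, upper=60.0, steps=6000):
    """E[Z^r] for the NH model via the substitution T = (1 + Z)^theta - 1 ~ Exp(1)."""
    h = upper / steps  # steps must be even for Simpson's rule

    def integrand(t):
        return ((1.0 + t) ** (1.0 / theta) - 1.0) ** r * math.exp(-t)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


def nh_summary(theta):
    """Return (mean, variance, skewness, kurtosis) from the first four raw moments."""
    m1, m2, m3, m4 = (nh_raw_moment(r, theta) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return m1, var, mu3 / var ** 1.5, mu4 / var ** 2


for theta in (0.75, 1.0, 2.0):
    print(theta, [round(v, 4) for v in nh_summary(theta)])
```

With θ = 1 the NH model is the unit exponential, so the routine should return values close to mean 1, variance 1, skewness 2 and kurtosis 9, which gives a simple sanity check.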
2.2. Moment Generating Function (MGF)
The MGF of can be derived from Equation (9) or (10) as
or integer we have
2.3. Incomplete Moments (I-Ms)
The I-M, say , of can be expressed from (8) as then
or integer we have
2.4. The Moment of the Residual Life (MoRL)
The MoRL can be formulated as
Then, the MoRL of the TLE-NH model can be reported by
where and or integer we have
The life expectation at age can be defined by
which represents the additional expected life length for a certain unit which is alive at age .
2.5. The Moment of the Reversed Residual Life (MoRRL)
The moment of the reversed residual life can be expressed as
Then, the MoRRL of the TLE-NH model can be formulated as
where and or integer we have
The mean inactivity time (MIT) is given by
or integer we get
which is the elapsed waiting time since the failure of a certain subsystem occurred in .
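The life expectation at age t and the mean inactivity time have simple integral forms that hold for any lifetime model: m(t) = ∫_t^∞ S(z) dz / S(t) and MIT(t) = ∫_0^t F(z) dz / F(t). The sketch below evaluates both by Simpson's rule for a user-supplied SF or CDF. It is illustrated with the baseline NH model in its assumed one-parameter form (`theta` is a hypothetical symbol), not with the TLE-NH closed forms of this section; the integration width for m(t) is an assumption that must be large enough for the tail to be negligible.

```python
import math


def simpson(fn, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = fn(lo) + fn(hi)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * fn(lo + k * h)
    return total * h / 3.0


def mean_residual_life(sf, t, width=200.0, steps=20000):
    """Life expectation at age t: integral of S over [t, t + width] divided by S(t)."""
    return simpson(sf, t, t + width, steps) / sf(t)


def mean_inactivity_time(cdf, t, steps=20000):
    """Mean inactivity time at t: integral of F over [0, t] divided by F(t)."""
    return simpson(cdf, 0.0, t, steps) / cdf(t)


def nh_sf(z, theta):
    """SF of the one-parameter NH model (assumed form)."""
    return math.exp(1.0 - (1.0 + z) ** theta)


def nh_cdf(z, theta):
    return 1.0 - nh_sf(z, theta)


# theta = 2 gives an increasing hazard, so the life expectation shrinks with age.
for t in (0.0, 1.0, 5.0):
    print(t, round(mean_residual_life(lambda z: nh_sf(z, 2.0), t), 5))
print(round(mean_inactivity_time(lambda z: nh_cdf(z, 2.0), 1.0), 5))
```

For θ = 1 the model is memoryless and m(t) equals 1 at every age, which is a convenient check on the numerical integration.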
2.6. Order Statistics
Let be an observed random sample (RS) from the TLE-NH model and let be the corresponding order statistics. Then the PDF of order statistic can be written as
where B is the beta function. Substituting (5) and (6) in (11), the PDF of can be expressed as
where and can be obtained recursively from
where . The moments of can be proposed as
where or integer; the moments in (12) reduce to
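The general order-statistic density used in (11), f_{i:n}(z) = f(z) F(z)^{i−1} [1 − F(z)]^{n−i} / B(i, n − i + 1), can be evaluated directly for any baseline. The sketch below is generic in the PDF and CDF and is illustrated with the NH baseline in its assumed one-parameter form; the helper names and the symbol `theta` are hypothetical.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def order_stat_pdf(z, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n."""
    big_f = cdf(z)
    return pdf(z) * big_f ** (i - 1) * (1.0 - big_f) ** (n - i) / beta_fn(i, n - i + 1)


def nh_cdf(z, theta):
    return -math.expm1(1.0 - (1.0 + z) ** theta)


def nh_pdf(z, theta):
    return theta * (1.0 + z) ** (theta - 1.0) * math.exp(1.0 - (1.0 + z) ** theta)


# Density of the sample median (i = 3, n = 5) under NH with theta = 0.8.
theta = 0.8
for z in (0.25, 0.75, 1.5):
    value = order_stat_pdf(z, 3, 5, lambda x: nh_pdf(x, theta), lambda x: nh_cdf(x, theta))
    print(z, round(value, 5))
```

When θ = 1 the baseline is unit exponential, and the minimum of n observations is exponential with rate n, so `order_stat_pdf(z, 1, n, ...)` should agree with n·e^{−nz}.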
3. Estimation Methods
We discuss seven methods to estimate the parameters of the TLE-NH model. They can be implemented using the “AdequacyModel” package in the “R” software, which provides a general meta-heuristic optimization technique for maximizing or minimizing an arbitrary objective function. The major aim of using various estimation approaches is to get the best estimators for good analytics; see, for instance, Eliwa et al. [14], El-Morshedy et al. [15], Hamedani et al. [16] and Elgohari et al. [17], among others.
3.1. Maximum Likelihood Estimation (MLE) Method
Let be any observed RS from the new TLE-NH model. The log likelihood function for may be expressed as
where Following the usual routine of parameter estimation for the MLE of and we differentiate with respect to and to obtain the score vector as follows
where Setting and solving them simultaneously yields the MLE of .
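The score-equation approach can be illustrated on the baseline NH model, where only one parameter is involved. This is a sketch under the assumption that the NH PDF is θ(1 + z)^{θ−1} exp{1 − (1 + z)^θ} (with `theta` a hypothetical symbol); it is not the three-parameter TLE-NH likelihood, whose expression is not reproduced here. For this baseline the score is strictly decreasing in θ, so a bisection search finds its unique root.

```python
import math
import random


def nh_sample(n, theta, rng):
    """Inverse-transform sampling: Z = (1 - ln(1 - U))^(1/theta) - 1."""
    return [(1.0 - math.log(1.0 - rng.random())) ** (1.0 / theta) - 1.0 for _ in range(n)]


def nh_loglik(theta, data):
    """Log-likelihood of the one-parameter NH model (assumed form)."""
    return sum(math.log(theta) + (theta - 1.0) * math.log1p(z) + 1.0 - (1.0 + z) ** theta
               for z in data)


def nh_score(theta, data):
    """Derivative of the log-likelihood with respect to theta."""
    return sum(1.0 / theta + math.log1p(z) * (1.0 - (1.0 + z) ** theta) for z in data)


def nh_mle(data, tol=1e-10):
    """Solve score(theta) = 0 by bisection; assumes at least one observation is > 0."""
    lo, hi = 1e-8, 1.0
    while nh_score(hi, data) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nh_score(mid, data) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(2021)
sample = nh_sample(500, 1.5, rng)
theta_hat = nh_mle(sample)
print(round(theta_hat, 4), round(nh_loglik(theta_hat, sample), 3))
```

For the full TLE-NH model the three score equations have to be solved simultaneously, typically with a general-purpose numerical optimizer rather than a one-dimensional search.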
3.2. Cramér–Von-Mises Estimation (CVME) Method
The CVMEs of the parameters and are obtained by minimizing the following expression with respect to (WRT) the parameters and respectively,
where and
The CVME of the parameters and are obtained by solving the three following non-linear equations
and
where and
3.3. Ordinary Least Squares Estimation (OLSE) Method
Let denote the CDF of TLE-NH model and let be the ordered RS. The OLSEs are obtained upon minimizing
then, we have
where . The OLSEs are obtained via solving the following non-linear equations
and
where and , defined above.
3.4. Weighted Least Squares Estimation (WLSE) Method
The WLSE are obtained by minimizing the function WLSE WRT and
where . The WLSEs are obtained by solving
and
3.5. Anderson–Darling Estimation (ADE) Method
The ADE are obtained by minimizing the function
The parameter estimates follow by solving the nonlinear equations and
3.6. Right Tail-Anderson–Darling Estimation (RT-ADE) Method
The RTADE is obtained by minimizing
The estimates follow by solving the nonlinear equations and
3.7. Left Tail-Anderson–Darling Estimation (LT-ADE) Method
The LTADE is obtained by minimizing
The parameter estimates can be derived by solving
and
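The six minimum-distance criteria of Sections 3.2 to 3.7 can be written as objective functions of the ordered sample. The sketch below uses the forms of these criteria that are standard in the literature, which are assumed to match the expressions of this section, and applies them to the one-parameter NH baseline (an assumption, with `theta` a hypothetical symbol) so that a simple golden-section search suffices. The search presumes each objective is unimodal on the chosen interval.

```python
import math
import random


def nh_cdf(z, theta):
    return -math.expm1(1.0 - (1.0 + z) ** theta)


def nh_log_sf(z, theta):
    return 1.0 - (1.0 + z) ** theta  # log of the NH survival function


def log_cdf(z, theta):
    return math.log(max(nh_cdf(z, theta), 1e-300))  # guard against log(0)


def cvm(theta, xs):
    n = len(xs)
    return 1.0 / (12 * n) + sum((nh_cdf(x, theta) - (2 * i - 1) / (2 * n)) ** 2
                                for i, x in enumerate(xs, 1))


def ols(theta, xs):
    n = len(xs)
    return sum((nh_cdf(x, theta) - i / (n + 1)) ** 2 for i, x in enumerate(xs, 1))


def wls(theta, xs):
    n = len(xs)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
               * (nh_cdf(x, theta) - i / (n + 1)) ** 2 for i, x in enumerate(xs, 1))


def rtad(theta, xs):
    n = len(xs)
    return (n / 2.0 - 2.0 * sum(nh_cdf(x, theta) for x in xs)
            - sum((2 * i - 1) * nh_log_sf(xs[n - i], theta) for i in range(1, n + 1)) / n)


def ltad(theta, xs):
    n = len(xs)
    return (-1.5 * n + 2.0 * sum(nh_cdf(x, theta) for x in xs)
            - sum((2 * i - 1) * log_cdf(x, theta) for i, x in enumerate(xs, 1)) / n)


def ad(theta, xs):
    return rtad(theta, xs) + ltad(theta, xs)  # the AD criterion splits into its two tails


def golden_min(fn, lo, hi, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd = fn(c), fn(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = fn(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = fn(d)
    return 0.5 * (lo + hi)


OBJECTIVES = {"CVM": cvm, "OLS": ols, "WLS": wls, "AD": ad, "RTAD": rtad, "LTAD": ltad}


def fit_all(data, lo=0.05, hi=10.0):
    xs = sorted(data)  # every criterion is defined on the order statistics
    return {name: golden_min(lambda th, f=fn: f(th, xs), lo, hi)
            for name, fn in OBJECTIVES.items()}


rng = random.Random(7)
sample = [(1.0 - math.log(1.0 - rng.random())) ** (1.0 / 1.5) - 1.0 for _ in range(500)]
print({name: round(est, 4) for name, est in fit_all(sample).items()})
```

For the three-parameter TLE-NH model the same objectives would be minimized jointly over all parameters, which calls for a multivariate optimizer in place of the one-dimensional search used here.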
4. Simulation for Comparing Various Estimation Methods
Simulation studies are performed to compare and assess the above-mentioned estimation methods. The simulation studies are based on generated data sets from the TLE-NH version, where and and . The performance of the different estimators is compared in terms of the average of the estimates (A-Vs) and the mean squared error (MSE). The confidence intervals (lower CI (LCI), upper CI (UCI)) have also been calculated. Tables 3–5 list the simulation results. From Tables 3–5, it is noted that the MSEs tend to zero and the A-Vs tend to the initial values as the sample size increases, which indicates that the estimators are consistent. As an illustration based on Table 3, we have the following results for a = 3:
Table 3.
Simulation results for the parameter a = 3.
Table 4.
Simulation results for the parameter b = 0.3.
Table 5.
Simulation results for the parameter c = 0.1.
- i. The MSE under ML decreased from 0.27643 to 0.08205.
- ii. The MSE under CVM decreased from 0.24741 to 0.06579.
- iii. The MSE under OLS decreased from 0.29624 to 0.07117.
- iv. The MSE under WLS decreased from 0.29093 to 0.06882.
- v. The MSE under AD decreased from 0.19796 to 0.05345.
- vi. The MSE under RT-AD decreased from 0.27743 to 0.07353.
- vii. The MSE under LT-AD decreased from 0.18347 to 0.04942.
Similar results are recorded regarding the other two parameters.
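The logic of such a Monte Carlo study (generate samples, estimate, then summarize by the average estimate and the MSE for increasing sample sizes) can be sketched as follows. To keep the example short it uses the one-parameter NH baseline with maximum likelihood, under the assumed PDF θ(1 + z)^{θ−1} exp{1 − (1 + z)^θ}; the true value 1.5, the sample sizes, the number of replications and the seed are all hypothetical choices and do not reproduce Tables 3 to 5.

```python
import math
import random


def nh_sample(n, theta, rng):
    """Inverse-transform sampling from the one-parameter NH model (assumed form)."""
    return [(1.0 - math.log(1.0 - rng.random())) ** (1.0 / theta) - 1.0 for _ in range(n)]


def nh_mle(data, tol=1e-8):
    """ML estimate of theta: bisection on the strictly decreasing score function."""
    logs = [math.log1p(z) for z in data]

    def score(theta):
        return sum(1.0 / theta + lg * (1.0 - math.exp(theta * lg)) for lg in logs)

    lo, hi = 1e-8, 1.0
    while score(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def monte_carlo(theta, n, reps, seed):
    """Return (average estimate, mean squared error) over `reps` simulated samples."""
    rng = random.Random(seed)
    estimates = [nh_mle(nh_sample(n, theta, rng)) for _ in range(reps)]
    average = sum(estimates) / reps
    mse = sum((e - theta) ** 2 for e in estimates) / reps
    return average, mse


results = {n: monte_carlo(1.5, n, reps=100, seed=2021) for n in (25, 100, 400)}
for n, (average, mse) in results.items():
    print(n, round(average, 4), round(mse, 5))
```

A consistent estimator should show the pattern described above: the MSE shrinks and the average estimate approaches the true value as n grows.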
5. Asymmetric Data Analysis
5.1. For Comparing Methods under Asymmetric Data
For comparing the classical methods, an application to a real data set is analyzed. We consider the Cramér–Von Mises (CM) and the Anderson–Darling (AD) statistics. The real data set represents the remission time (in months) of an RS of 128 bladder cancer patients (0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63, 0.20, 2.23, 3.52, 4.98, 6.97, 9.02, 13.29, 0.40, 2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50, 2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51, 2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81, 2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64, 3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69, 4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69, 4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75, 4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33, 5.49, 7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87, 11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 11.79, 18.10, 1.46, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76, 3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 3.31, 4.51, 6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 6.76, 12.07, 21.73, 2.07, 3.36, 6.93, 8.65, 12.63, 22.69) (see Lee and Wang [18]). Table 6 lists the different estimators as well as CM and AD.
Table 6.
Application results for comparing methods.
From Table 6, the WLS method is the best among all estimation techniques, with CM = 0.03673 and AD = 0.24138; however, the MLE, CVM, OLS, ADE, RT-ADE and LT-ADE methods also performed well. Figure 4 shows the probability-probability (P-P) plots for comparing estimation methods. Figure 5 shows the estimated CDF (ECDF) plots for comparing estimation methods. Figure 6 provides Kaplan–Meier estimation plots for comparing estimation methods. Figures 4–6 confirm the results obtained in Table 6.
Figure 4.
P-P plots for comparing estimation methods.
Figure 5.
ECDF plots for comparing estimation methods.
Figure 6.
Kaplan–Meier estimation plots for comparing estimation methods.
5.2. For Comparing Competitive Models
An application is presented, based on the data set of Cordeiro et al. [18], to show the flexibility of the TLE-NH model. We compare the TLE-NH model with some competitive models such as the Burr type-XII NH (BuXII-NH) ([19]), Lomax NH (Lx-NH) (Selim [19]), exponentiated exponential (Exp-Exp), beta exponential (B-Exp), Kumaraswamy exponential, TL-NH, inverse generalized power Weibull (IGPW) ([20]), inverse NH (I-NH), inverse Weibull (IW), inverse Rayleigh (IR), inverse exponential (IE) and NH distributions. The best model is selected using the estimated log-likelihood, the Akaike information criterion (AI), the consistent Akaike information criterion (CAI), the Bayesian information criterion (BI), and the Hannan–Quinn information criterion (HQI). These data have a unimodal HRF shape. The results of this application are listed in Tables 7 and 8. Table 7 lists the MLEs and the standard errors (SEs) for the asymmetric real data. Table 8 lists the statistics for the asymmetric real data. These results show that the TLE-NH distribution has the lowest AI, CAI, BI and HQI values among all the fitted models. Hence, it could be chosen as the best model under these criteria. Figure 7 gives the total time in test (TTT), box, quantile-quantile (QQ) and nonparametric kernel density estimation (NKDE) plots for the real data. Figure 8 shows the estimated PDF (EPDF), ECDF, EHRF and Kaplan–Meier estimation plots. Clearly, the TLE-NH distribution provides a closer fit to the empirical functions. For this data, we have the following results: 12.21692 91.23032 4.546743 and 38.4308.
Table 7.
The MLEs (SEs) for the real data.
Table 8.
Statistics for the real data.
Figure 7.
The TTT, box, QQ and NKDE plots for real data.
Figure 8.
ECDF, EPDF, EHRF and Kaplan–Meier estimation plots.
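The information criteria used to rank the competing models are simple functions of the maximized log-likelihood, the number of parameters k and the sample size n. The sketch below uses the usual definitions of the AI, BI and HQI criteria. For CAI it assumes the small-sample form −2ℓ + 2kn/(n − k − 1), which is an assumption about the paper's definition; Bozdogan's consistent AIC would instead use the penalty k(ln n + 1). The log-likelihood value in the example is hypothetical and is not taken from Table 8.

```python
import math


def information_criteria(loglik, k, n):
    """Return the four criteria for a model with k parameters fitted to n observations."""
    return {
        "AI": -2.0 * loglik + 2.0 * k,
        # Assumed small-sample form; see the note above on the alternative definition.
        "CAI": -2.0 * loglik + 2.0 * k * n / (n - k - 1),
        "BI": -2.0 * loglik + k * math.log(n),
        "HQI": -2.0 * loglik + 2.0 * k * math.log(math.log(n)),
    }


# Hypothetical three-parameter fit to n = 128 observations.
for name, value in information_criteria(loglik=-410.0, k=3, n=128).items():
    print(name, round(value, 4))
```

Smaller values indicate a better trade-off between fit and complexity, which is the sense in which a model with the lowest values of all four criteria is preferred.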
6. Conclusions
In this paper, we have introduced a new flexible extension to the Nadarajah and Haghighi model called the Topp–Leone exponentiated Nadarajah and Haghighi model (TLE-NH). The PDF of the TLE-NH model can be expressed as a simple linear representation of the exponentiated NH density. Some of its statistical properties have been derived and studied in detail. The HRF can take different shapes, such as constant, increasing-constant, decreasing-constant, bathtub, upside down and upside down-U, which make the TLE-NH model able to analyze different types of data sets in various fields. Moreover, the TLE-NH model can be utilized to discuss both negatively and positively skewed data. The model parameters have been estimated by utilizing various estimation methods. Monte Carlo simulation experiments have been performed to compare the estimation methods. Finally, a real data set is analyzed for illustrating the flexibility of the proposed model, and it is found that the TLE-NH model showed its superiority in modeling the real data set.
Author Contributions
M.M.A.A. (writing-review and editing; Funding acquisition and conceptualization), M.A.A. (writing-review and editing, validation), M.S.E. (writing-review and editing; conceptualization; software; methodology and validation), M.E.-M. (writing-review and editing; software; conceptualization and validation) and H.M.Y. (writing the original draft preparation; software; resources; project administration and validation). All authors have read and agreed to the published version of the manuscript.
Funding
The author extends his appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP. 1/26/42), received by Mohammed M. Almazah (www.kku.edu.sa (accessed date: 12 August 2021)).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data set is available in Lee and Wang (2003) and given in Section 5.1.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Nadarajah, S.; Haghighi, F. An extension of the exponential distribution. Statistics 2011, 45, 543–558.
- Lemonte, A.J.; Cordeiro, G.M.; Moreno-Arenas, G. A new useful three-parameter extension of the exponential distribution. Statistics 2016, 50, 312–337.
- Ortega, E.M.; Lemonte, A.J.; Silva, G.O.; Cordeiro, G.M. New flexible models generated by gamma random variables for lifetime modeling. J. Appl. Stat. 2015, 42, 2159–2179.
- Ahmed, A.; Muhammed, H.Z.; Elbatal, I. A new class of extension exponential distribution. Int. J. Appl. Math. Sci. 2015, 8, 13–30.
- Lima, S.R.L. The Half Normal Generalized Family and Kummaraswamy Nadarajah Haghighi Distribution. Master’s Thesis, Universidade Federal de Pernumbuco, Recife, Brazil, 2015.
- El Damcese, M.A.; Ramadan, D.A. Studies on Properties and Estimation Problems for Modified Extension of Exponential Distribution. Int. J. Comput. Appl. 2015, 2015, 125.
- Lemonte, A.J. A new exponential-type distribution with constant, decreasing, increasing, upside-down bathtub and bathtub-shaped failure rate function. Comput. Stat. Data Anal. 2013, 62, 149–170.
- Yousof, H.M.; Korkmaz, M.C. Topp-Leone Nadarajah Haghighi distribution: Mathematical properties and applications, International Journal of Applied Mathematics. J. Stat. Stat. Actuar. Sci. 2017, 2, 119–128.
- Dias, C.R.B.; Alizadeh, M.; Cordeiro, G.M. The beta Nadarajah-Haghighi distribution. Hacet. J. Math. Stat. 2018, 47, 1302–13203.
- Tahir, M.H.; Cordeiro, G.M.; Ali, S.; Dey, S. The inverted Nadarajah-Haghighi distribution: Properties, estimation methods and applications. J. Stat. Comput. Simul. 2019, 88, 2775–2798.
- Pena-Ramirez, F.A.; Guerra, R.R.; Cordeiro, G.M. The Nadarajah-Haghighi Lindley distribution. Acad Bras Cienc. 2019, 91, 1–20.
- Maciak, M.; Okhrin, O.; Pešta, M. Infinitely stochastic micro reserving. Insur. Math. Econ. 2021, 100, 30–58.
- Gijbels, I.; Omelka, M.; Pešta, M.; Veraverbeke, N. Score tests for covariate effects in conditional copulas. J. Multivar. Anal. 2017, 159, 111–133.
- Eliwa, M.S.; Altun, E.; Alhussain, Z.A.; Ahmed, E.A.; Salah, M.M.; Ahmed, H.H.; El-Morshedy, M. A new one-parameter lifetime distribution and its regression model with applications. PLoS ONE 2021, 16, e0246969.
- El-Morshedy, M.; Eliwa, M.S.; Altun, E. Discrete Burr-Hatke distribution with properties, estimation methods and regression model. IEEE Access 2020, 8, 74359–74370.
- Hamedani, G.G.; Korkmaz, M.C.; Butt, N.S.; Yousof, H.M. The Type I Quasi Lambert Family: Properties, Characterizations and Different Estimation Methods. Pak. J. Stat. Oper. Res. 2021, 17, 545–558.
- Elgohari, H.; Ibrahim, M.; Yousof, H.M. A New Probability Distribution for Modeling Failure and Service Times: Properties, Copulas and Various Estimation Methods. Stat. Optim. Inf. Comput. 2021, 8, 555–586.
- Lee, E.T.; Wang, J. Statistical Methods for Survival Data Analysis; John Wiley & Sons: New York, NY, USA, 2003; Volume 476.
- Cordeiro, G.M.; Yousof, H.M.; Ramires, T.G.; Ortega, E.M. The Burr XII system of densities: Properties, regression model and applications. J. Stat. Comput. Simul. 2018, 88, 432–456.
- Selim, M.A. Some theoretical and computational aspects of the inverse generalized power Weibull distribution. J. Data Sci. 2019, 17, 742–755.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).