Abstract
Over the years, several researchers have worked to model phenomena in which the distribution of data presents more or less heavy tails. With this aim, several generalizations or extensions of the Lomax distribution have been proposed. In this paper, an attempt is made to create a hybrid distribution mixing the functionalities of the Nadarajah–Haghighi and Lomax distributions, namely the Nadarajah–Haghighi Lomax (NHLx) distribution. It can also be thought of as an extension of the exponential Lomax distribution. The NHLx distribution has four parameters, a lower-bounded support, and very flexible distributional functions, including a decreasing or unimodal probability density function and an increasing, decreasing, or upside-down bathtub hazard rate function. In addition, it benefits from tractable moment and quantile properties. The statistical applicability of the NHLx model is highlighted through a simulation study. Four real data sets are also used to illustrate the practical applications. In particular, results are compared with important Lomax-based models, such as the Lomax, Weibull Lomax, and exponential Lomax models, and it is observed that the NHLx model fits better.
1. Introduction
Modeling heavy-tailed data is an important task in many engineering and medical domains. Initial work on this topic was carried out by Pareto [1] to model income data. In later years, applications of the Pareto family, particularly the Pareto type II distribution (see [2]), usually referred to as the Lomax (Lx) distribution, branched into fields such as engineering sciences, actuarial sciences, medicine, income modeling, and many more. The cumulative distribution function (cdf) and probability density function (pdf) of the Lx distribution are given by
and
respectively, where is a shape parameter, and is a scale parameter. We have for . References [3,4] considered the Lx distribution to model income and wealth data. Reference [5] used the Lx distribution as an alternative to the exponential, gamma, and Weibull distributions for heavy-tailed data. Reference [6] derived various estimation techniques based on the Lx distribution. References [7,8] examined various structural properties and record value moments of the Lx distribution. Reference [9] extensively studied and extended the family of distributions that were used in the Lx distribution. Reference [10] considered the Lx distribution as an important distribution to model lifetime data, since it belongs to the family of decreasing hazard rate distributions.
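For illustration, the Lx functions can be sketched in Python as follows. This is a minimal sketch assuming the standard Lomax parametrization F(x) = 1 − (1 + x/β)^(−α) for x > 0; the argument names `alpha` (shape) and `beta` (scale) are illustrative choices, not notation recovered from the text.

```python
def lomax_cdf(x, alpha, beta):
    """Lomax cdf under the assumed form 1 - (1 + x/beta)^(-alpha), x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + x / beta) ** (-alpha)


def lomax_pdf(x, alpha, beta):
    """Derivative of the assumed cdf with respect to x."""
    if x < 0:
        return 0.0
    return (alpha / beta) * (1.0 + x / beta) ** (-alpha - 1.0)


def lomax_hrf(x, alpha, beta):
    """Hazard rate pdf / (1 - cdf); algebraically equal to alpha / (beta + x)."""
    return lomax_pdf(x, alpha, beta) / (1.0 - lomax_cdf(x, alpha, beta))


if __name__ == "__main__":
    # Illustrative parameter values only.
    for x in (0.5, 1.0, 3.0):
        print(x, lomax_cdf(x, 2.0, 1.0), lomax_hrf(x, 2.0, 1.0))
```

Under this assumed form the hazard rate simplifies to α/(β + x), which decreases in x, in line with the decreasing hazard rate property mentioned for the Lx distribution.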
Building on this work, many researchers have proposed distributions for heavy-tailed data that generalize the functional forms of the Lx distribution, mainly by adding scale or shape parameters. A few to mention are the exponentiated Lx (EL) distribution in [11], beta Lx (BL) distribution in [12], Poisson Lx distribution in [13], exponential Lx (EXL) distribution in [14], gamma Lx (GL) distribution in [15], Weibull Lx (WL) distribution in [16], beta exponentiated Lx distribution in [17], power Lx distribution in [18], exponentiated Weibull Lx distribution in [19], Marshall–Olkin exponential Lx distribution in [20], type II Topp–Leone power Lx distribution in [21], Marshall–Olkin length biased Lomax distribution in [22], Kumaraswamy generalized power Lx distribution in [23] and sine power Lx distribution in [24]. For the purpose of this study, a retrospective on the EXL distribution is required. To begin, it is defined by the following cdf and pdf:
and
respectively, where is a shape parameter, and and are scale parameters. We have for . Thus, the EXL distribution combines the functionalities of the exponential and Lx distributions through a specific composition scheme. This scheme may be called the extended Lx scheme (it will be discussed mathematically later). As immediate remarks, the EXL distribution has three parameters and a lower-bounded support. It is shown in [14] that the pdf of the EXL distribution is unimodal and that its hazard rate function (hrf) is increasing. Moreover, its quantile and moment properties are manageable. On the statistical side, by considering the aircraft windshield data collected in [25], it is shown in [14] that the EXL model outperforms several three- or four-parameter extensions of the Lx model, including the EL, BL, and GL models. Thus, there is strong evidence for the use of the extended Lx scheme in the construction of efficient distributions and models.
On the other hand, recently, a generalized version of the exponential distribution was given by Nadarajah and Haghighi [26]. It can be presented as an alternative to the Weibull, gamma, and exponentiated exponential (EE) distributions. It is called the Nadarajah–Haghighi (NH) distribution. The cdf and pdf of the NH distribution are
and
respectively. We have for . Among its main features, the pdf can have decreasing and unimodal shapes, and the hrf exhibits increasing, decreasing, and constant shapes. According to [26], if the pdfs of the gamma, Weibull, and exponentiated exponential distributions are monotonically decreasing, then their hrfs cannot be increasing. However, such an hrf property can be achieved by the NH distribution.
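The hrf behaviour described above can be checked numerically. The following is a minimal sketch assuming the NH cdf of [26] in the form F(x) = 1 − exp{1 − (1 + λx)^α} for x > 0; the argument names `alpha` and `lam` are illustrative.

```python
import math


def nh_cdf(x, alpha, lam):
    """NH cdf under the assumed form 1 - exp(1 - (1 + lam*x)^alpha), x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(1.0 - (1.0 + lam * x) ** alpha)


def nh_pdf(x, alpha, lam):
    """Derivative of the assumed cdf with respect to x."""
    if x < 0:
        return 0.0
    t = 1.0 + lam * x
    return alpha * lam * t ** (alpha - 1.0) * math.exp(1.0 - t ** alpha)


def nh_hrf(x, alpha, lam):
    """Hazard rate pdf / (1 - cdf) = alpha * lam * (1 + lam*x)^(alpha - 1)."""
    return alpha * lam * (1.0 + lam * x) ** (alpha - 1.0)


if __name__ == "__main__":
    # alpha > 1: increasing hrf; alpha < 1: decreasing; alpha = 1: constant.
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, [round(nh_hrf(x, alpha, 1.0), 4) for x in (0.1, 0.5, 1.0)])
```

With `alpha = 2` and `lam = 1`, the pdf values decrease over the evaluated points while the hrf increases, which is the combination highlighted in [26].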
In light of the above research work, we present a new distribution based on the extended Lx scheme with the use of the NH distribution as the main generator. It is called the NH Lx (NHLx) distribution. In this sense, the NHLx distribution is to the NH distribution what the EXL distribution is to the exponential distribution. The NHLx distribution can also be presented as a generalization of the EXL distribution through the introduction of an additional shape parameter. We investigate the theoretical and practical facets of the NHLx distribution. Among its functional features, it has four parameters, it is lower-bounded (as with the EXL distribution, with a bound governed by a scale parameter), its pdf exhibits non-increasing and inverted J-shaped curves, and its hrf possesses increasing, decreasing, and upside-down bathtub shapes. This combination of qualitative characteristics is rare for a lower-bounded distribution and, in this way, it has better functionality to model lifetime data than the EXL and Lx distributions, among others. We illustrate this aspect by considering four different data sets referenced in the literature.
The rest of the article is organized as follows: Section 2 presents the most important functions of the NHLx distribution, namely the cdf, pdf, hrf, and quantile function (qf), along with a graphical analysis when necessary. Section 3 is devoted to moment analysis and related functions. Section 4 concerns the maximum likelihood estimates of the NHLx model parameters. This estimation approach is complemented by a simulation study in Section 5. Concrete applications of the NHLx model are developed in Section 6. A conclusion is formulated in Section 7.
2. NHLx Distribution
In order to understand the essence of the NHLx distribution, let us describe more precisely the extended Lx scheme on the basis of the EXL distribution. One can remark that , where denotes the cdf of the exponential distribution with parameter , and for , and otherwise. Thus, can be thought of as a support-extended version of over the semi-infinite interval . It is worth noting that is not a cdf anymore, but it is increasing and satisfies and , which ensure that as a cdf is mathematically correct. The same scheme can be applied to any lifetime distribution in place of the exponential generator.
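The composition idea can be sketched in code. The block below is a hedged illustration, not the authors' implementation: it assumes the inner function has the form H(x) = (1 + x/β)^α on [−β, ∞), which is increasing with H(−β) = 0 and H(x) → ∞, and it plugs H into a generator cdf. The names `extended_lomax_link`, `compose_cdf`, `exl_cdf`, and the parameter names are hypothetical.

```python
import math


def extended_lomax_link(x, alpha, beta):
    """Assumed inner function H(x) = (1 + x/beta)^alpha for x >= -beta, else 0."""
    if x < -beta:
        return 0.0
    return (1.0 + x / beta) ** alpha


def compose_cdf(generator_cdf, x, alpha, beta):
    """Extended Lx scheme: evaluate a lifetime cdf at H(x)."""
    return generator_cdf(extended_lomax_link(x, alpha, beta))


def exl_cdf(x, alpha, beta, lam):
    """EXL-type cdf obtained with the exponential generator 1 - exp(-lam * t)."""
    return compose_cdf(lambda t: 1.0 - math.exp(-lam * t), x, alpha, beta)


if __name__ == "__main__":
    # The composed function starts at 0 at the lower bound and tends to 1.
    for x in (-1.0, 0.0, 1.0, 5.0):
        print(x, exl_cdf(x, alpha=2.0, beta=1.0, lam=1.0))
```

Because `compose_cdf` takes the generator as an argument, swapping the exponential cdf for another lifetime cdf only requires passing a different function.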
Based on the extended Lx scheme with the NH distribution as a generator, the cdf and pdf of the NHLx distribution are specified by
and
respectively, where a and are shape parameters, and b and are scale parameters. We have for . Thus, the cdf has been derived from the following formula: , . By taking , we remark that ; the NHLx distribution is reduced to the EXL distribution with . The asymptotic properties of the pdf depend on the values of mainly; with the use of standard asymptotic techniques, we establish that
Figure 1 completes these asymptotic results by showing some curves of the pdf for several parameter values.
Figure 1.
Curves of the pdf of the NHLx distribution for various parameter values, but with the fixed value: .
In Figure 1, we see that the pdf can be decreasing with an inverted J shape or unimodal. It shows considerable flexibility in skewness, peakedness, and flatness at a small value of (at least), and for different selected parameter values of a, b, and . Such flexibility is not observed for the pdf of the EXL distribution, as visually shown in the figures in [14].
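To experiment with such shapes, the cdf and pdf can be coded directly. The sketch below assumes that composing the NH generator with the extended Lx scheme gives F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a} for x ≥ −β; this functional form and the argument names `a`, `b`, `alpha`, `beta` are assumptions made for illustration, since the displayed formulas are not reproduced here.

```python
import math


def nhlx_cdf(x, a, b, alpha, beta):
    """Assumed NHLx cdf: 1 - exp(1 - (1 + b*(1 + x/beta)^alpha)^a), x >= -beta."""
    if x <= -beta:
        return 0.0
    t = 1.0 + b * (1.0 + x / beta) ** alpha
    return 1.0 - math.exp(1.0 - t ** a)


def nhlx_pdf(x, a, b, alpha, beta):
    """Derivative of the assumed cdf with respect to x."""
    if x <= -beta:
        return 0.0
    u = 1.0 + x / beta
    t = 1.0 + b * u ** alpha
    return (a * b * alpha / beta) * u ** (alpha - 1.0) * t ** (a - 1.0) * math.exp(1.0 - t ** a)


if __name__ == "__main__":
    # Evaluate the pdf on a small grid for illustrative parameter values.
    grid = [-0.9 + 0.3 * k for k in range(8)]
    print([round(nhlx_pdf(x, 0.8, 1.0, 2.0, 1.0), 4) for x in grid])
```

Setting `a = 1` in this sketch gives 1 − exp{−b(1 + x/β)^α}, an EXL-type cdf with exponential rate `b`, which matches the reduction stated in the text.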
The analysis of the corresponding hrf is now examined. By applying the definition , it is given by
and for . Contrary to the pdf, the asymptotic properties of the hrf mainly depend on the values of a and ; we have
In full generality, the possible shapes of the hrf are decisive for modeling purposes: the more different shapes it has, the wider the range of data sets to which the associated model applies.
Figure 2 presents the identified shapes for the hrf of the NHLx distribution. From Figure 2, we see that the hrf can be increasing, decreasing, or upside-down bathtub-shaped, with flexible convex–concave properties. In particular, these curve modulations are possible thanks to the variation of the additional parameter a. This goes far beyond the curve possibilities of the hrf of the EXL distribution, which is only increasing according to [14]. Thus, from one perspective, the NHLx distribution adds a new shape parameter a to the EXL distribution, considerably improving its modeling properties.
Figure 2.
Curves of the hrf of the NHLx distribution for various parameter values.
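The three hrf shapes can be reproduced numerically. This sketch assumes the NHLx cdf F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a} for x ≥ −β, for which the ratio pdf/(1 − cdf) simplifies to the closed form coded below; the form and the argument names are assumptions, and the parameter values are chosen only for illustration.

```python
def nhlx_hrf(x, a, b, alpha, beta):
    """Hazard rate under the assumed NHLx form:
    (a*b*alpha/beta) * u^(alpha-1) * (1 + b*u^alpha)^(a-1), with u = 1 + x/beta."""
    if x <= -beta:
        return 0.0
    u = 1.0 + x / beta
    return (a * b * alpha / beta) * u ** (alpha - 1.0) * (1.0 + b * u ** alpha) ** (a - 1.0)


if __name__ == "__main__":
    points = (-0.9, 0.0, 9.0)
    # Illustrative settings: increasing, decreasing, upside-down bathtub.
    for label, a, alpha in (("increasing", 2.0, 2.0),
                            ("decreasing", 0.5, 0.5),
                            ("upside-down bathtub", 0.25, 2.0)):
        values = [round(nhlx_hrf(x, a, 1.0, alpha, 1.0), 4) for x in points]
        print(label, values)
```

In this assumed form, the hrf behaves like u^(α−1) near the lower bound and like u^(aα−1) for large x, so a value α > 1 combined with aα < 1 produces the rise-then-fall pattern.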
The qf of the NHLx distribution is now studied. To begin, it is defined in terms of by , . After some mathematical development, we establish that
Based on this qf, the main quartiles of the NHLx distribution can be made explicit: by taking , , and into , we get the first, second, and third quartiles. Several quantile-based functions, as well as skewness and kurtosis measures, can also be listed and analyzed (see [27]), and various quantile regression models can be constructed (see [28]).
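A closed-form qf also makes inverse-transform sampling straightforward. The sketch below assumes the NHLx cdf F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a} for x ≥ −β; solving F(x) = u under that assumption gives the expression in `nhlx_quantile`. The function names, argument names, and parameter values are illustrative.

```python
import math
import random


def nhlx_cdf(x, a, b, alpha, beta):
    """Assumed NHLx cdf, x >= -beta."""
    if x <= -beta:
        return 0.0
    return 1.0 - math.exp(1.0 - (1.0 + b * (1.0 + x / beta) ** alpha) ** a)


def nhlx_quantile(u, a, b, alpha, beta):
    """Inverse of the assumed cdf for u in [0, 1)."""
    inner = ((1.0 - math.log(1.0 - u)) ** (1.0 / a) - 1.0) / b
    return beta * (inner ** (1.0 / alpha) - 1.0)


def nhlx_sample(n, a, b, alpha, beta, seed=None):
    """Inverse-transform sampling: apply the quantile function to uniforms."""
    rng = random.Random(seed)
    return [nhlx_quantile(rng.random(), a, b, alpha, beta) for _ in range(n)]


if __name__ == "__main__":
    params = (1.5, 0.8, 2.0, 3.0)  # illustrative values of a, b, alpha, beta
    quartiles = [nhlx_quantile(u, *params) for u in (0.25, 0.5, 0.75)]
    print("quartiles:", quartiles)
    print("first draws:", nhlx_sample(3, *params, seed=1))
```

Evaluating `nhlx_cdf` at each computed quartile should return 0.25, 0.5, and 0.75 up to rounding, which is a quick consistency check of the inversion.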
3. Moment Properties of the NHLx Distribution
The moment properties of the NHLx distribution are now under investigation. First, for a random variable X with the NHLx distribution and any integer r, the rth moment of X is defined by
which can be explicated as
For given distribution parameters, this integral can be computed numerically with the help of scientific software. An analytical expression involving sums is given in the next proposition.
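As one way to carry out such a numerical computation, the rth moment can be written as the integral of Q(u)^r over (0, 1), where Q is the qf, and approximated with a midpoint rule. The sketch assumes the NHLx cdf F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a}; the quantile expression, function names, and the grid size `n` are illustrative assumptions, and a dedicated quadrature routine would normally be preferred.

```python
import math


def nhlx_quantile(u, a, b, alpha, beta):
    """Inverse of the assumed NHLx cdf for u in [0, 1)."""
    inner = ((1.0 - math.log(1.0 - u)) ** (1.0 / a) - 1.0) / b
    return beta * (inner ** (1.0 / alpha) - 1.0)


def nhlx_raw_moment(r, a, b, alpha, beta, n=20000):
    """Midpoint-rule approximation of E[X^r] = integral of Q(u)^r over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        total += nhlx_quantile((k + 0.5) * h, a, b, alpha, beta) ** r
    return h * total


if __name__ == "__main__":
    # Illustrative parameter values only.
    m1 = nhlx_raw_moment(1, 1.2, 0.5, 1.5, 2.0)
    m2 = nhlx_raw_moment(2, 1.2, 0.5, 1.5, 2.0)
    print("mean:", m1, "variance:", m2 - m1 ** 2)
```

In the special case a = α = 1 of the assumed form, X equals β(E − 1) with E exponential of rate b, so the mean is β(1/b − 1); this gives a simple reference value for checking the approximation.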
Proposition 1.
Let X be a random variable with the NHLx distribution. Then, its rth moment can be expressed as
where and with and , which defines the incomplete gamma function.
Proof.
Let us apply the following change of variable:
which satisfies and . Then, we have
By applying the standard and generalized binomial formulas, we get
This ends the proof of Proposition 1. □
Based on Proposition 1, the mean of X can be expanded as
and the moment of order 2 of X can be expressed as
From the above moments, we derive the variance of X by . Several other moment measures can be expressed in a similar manner, including the dispersion index, coefficient of variation, moment skewness, and moment kurtosis. More details on the moment skewness and moment kurtosis will be provided later.
The two following points can be proven by following the lines of the proof of Proposition 1.
- The rth moment of X about the mean can be expressed as
  Based on it, the standard moment skewness measure is defined by , and the standard moment kurtosis measure is defined by , among other moment measures.
- The rth unconditional moment of X at a certain can be expanded as
  It is immediate that . The unconditional moments are useful in the expression of various important functions, such as the mean residual life and reversed mean residual life functions. For more information on these functions, see [29].
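The passage from raw moments to the variance, skewness, and kurtosis measures can be sketched as follows. The code uses the usual textbook definitions (skewness as the third central moment over the variance to the power 3/2, kurtosis as the fourth central moment over the squared variance), which are assumed here to match the measures referred to above; the function name is hypothetical.

```python
def moment_measures(m1, m2, m3, m4):
    """Variance, skewness and kurtosis from the first four raw moments."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return {
        "variance": mu2,
        "cv": mu2 ** 0.5 / m1 if m1 != 0 else float("nan"),
        "skewness": mu3 / mu2 ** 1.5,
        "kurtosis": mu4 / mu2 ** 2,
    }


if __name__ == "__main__":
    # Raw moments of the unit exponential distribution are r!: 1, 2, 6, 24.
    print(moment_measures(1.0, 2.0, 6.0, 24.0))
```

For the unit exponential distribution this returns a variance of 1, a skewness of 2, and a kurtosis of 9, the known values for that distribution.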
4. Maximum Likelihood Estimates of the Parameters
We now consider the NHLx distribution as a statistical model, and we assume that the parameters a, b, , and are unknown. We aim to give some details on the maximum likelihood estimates (MLEs) of the parameters. First, let n be a positive integer, be independent and identically distributed random variables drawn from the NHLx distribution, and be corresponding observations. Then, provided that , the likelihood and log-likelihood functions are defined by
and
respectively. Then, the MLEs of the parameters a, b, , and , say , , , and , respectively, are defined by
In the case where is known and we have surely , the MLEs of a, b, and are the solution of the following equations: , and , where
and
The above equations do not have closed-form solutions; hence, they must be solved numerically by iterative methods. The numerical values can be obtained using optimization tools available in statistical software such as R, and the MLE of is obtained by taking its first order statistic, as in [14]. It is also possible to determine the values of the standard errors (SEs) of the MLEs. For more information, see [30].
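A numerical maximization of this kind can be sketched in plain Python. The block assumes the NHLx pdf implied by F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a}, treats β as known, and uses a basic compass (pattern) search on the log-parameters; this search is a simple stand-in chosen for self-containment, not the optimizer used by the authors, and all names and data values are hypothetical.

```python
import math


def nhlx_loglik(params, data, beta):
    """Log-likelihood in (a, b, alpha) under the assumed NHLx pdf, beta known."""
    a, b, alpha = params
    try:
        total = 0.0
        for x in data:
            u = 1.0 + x / beta
            if u <= 0.0:
                return -math.inf  # observation outside the support
            t = 1.0 + b * u ** alpha
            total += (math.log(a * b * alpha / beta) + (alpha - 1.0) * math.log(u)
                      + (a - 1.0) * math.log(t) + 1.0 - t ** a)
        return total
    except (OverflowError, ValueError):
        return -math.inf


def compass_search(objective, start, step=0.5, tol=1e-6, max_iter=5000):
    """Maximize objective by trying +/- step moves along each coordinate."""
    point, best = list(start), objective(start)
    for _ in range(max_iter):
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                cand = list(point)
                cand[i] += delta
                value = objective(cand)
                if value > best:
                    point, best, improved = cand, value, True
        if not improved:
            step /= 2.0  # shrink the pattern when no move helps
            if step < tol:
                break
    return point, best


def fit_nhlx(data, beta, start=(1.0, 1.0, 1.0)):
    """Approximate MLEs of (a, b, alpha); positivity enforced via logs."""
    def objective(log_params):
        try:
            params = [math.exp(v) for v in log_params]
        except OverflowError:
            return -math.inf
        return nhlx_loglik(params, data, beta)

    log_est, loglik = compass_search(objective, [math.log(v) for v in start])
    return [math.exp(v) for v in log_est], loglik


if __name__ == "__main__":
    toy = [0.1, 0.5, 1.2, 2.0, 0.8, 1.5, 0.3, 0.9]  # made-up values
    print(fit_nhlx(toy, beta=1.0))
```

Optimizing over log-parameters keeps a, b, and α positive without explicit constraints. In practice a quasi-Newton routine with several starting points would be a more reliable choice than this compass search, which may stop at a local maximum.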
Based on the MLEs, we define the estimated pdf of the NHLx distribution by . Conceptually, the curve of this estimated function must be close to the shape of the histogram of the data, among other visual criteria.
5. Simulation Study
In this section, we perform a Monte Carlo simulation study with 1000 replications for three different sets of parameters and each of the sample sizes of . By considering the order , these sets of parameters are Set I , Set II , and Set III . Table 1 shows the mean MLEs (MMLEs), biases, and mean squared errors (MSEs) obtained.
Table 1.
Simulation results related to the MLEs of the NHLx model parameters.
From Table 1, it can be observed that as the sample size increases, the biases and MSEs of the MLEs decrease, and the MMLEs get closer to the true parameter values. These results support the accuracy of the considered estimation strategy.
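The mechanics of such a study (simulate, estimate, then summarize by mean estimate, bias, and MSE) can be sketched on a deliberately simplified case. Assuming the NHLx cdf F(x) = 1 − exp{1 − [1 + b(1 + x/β)^α]^a} with a = α = 1 and β known, the variable 1 + X/β is exponential with rate b, so the MLE of b has the closed form n divided by the sum of the 1 + x_i/β. This one-parameter special case, the function names, and the chosen values are illustrative and do not reproduce the four-parameter study of Table 1.

```python
import random


def simulate_special_case(n, b, beta, rng):
    """Draw n values with 1 + X/beta ~ Exp(rate b) (assumed form, a = alpha = 1)."""
    return [beta * (rng.expovariate(b) - 1.0) for _ in range(n)]


def mle_b(data, beta):
    """Closed-form MLE of b in the special case."""
    return len(data) / sum(1.0 + x / beta for x in data)


def monte_carlo(n, b=2.0, beta=1.0, replications=1000, seed=123):
    """Return (mean estimate, bias, MSE) of the MLE over the replications."""
    rng = random.Random(seed)
    estimates = [mle_b(simulate_special_case(n, b, beta, rng), beta)
                 for _ in range(replications)]
    mean_estimate = sum(estimates) / replications
    mse = sum((e - b) ** 2 for e in estimates) / replications
    return mean_estimate, mean_estimate - b, mse


if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, monte_carlo(n))
```

In this toy setting the bias and MSE shrink as n grows, which is the same qualitative pattern reported for the full model.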
6. Applications of the NHLx Model
6.1. Heavy-Tailed Data Applications
Two real data sets taken from [31], namely the theft and claim data, are considered to illustrate the proposed methodology. These data sets are known to have heavy tail features. Table 2 presents the estimation of the tails of several standard distributions, namely the lognormal, Weibull, gamma, and exponential distributions, and the proposed NHLx distribution, taken at several values. The survival function, denoted by for all distributions in full generality, determines the tail probabilities at the point x.
Table 2.
Estimation of the tail probabilities of various distributions for the considered data sets.
Table 2 indicates that the NHLx model provides a better fit for both data sets, and its corresponding tail probabilities are also fairly high. This suggests that the proposed distribution is itself heavy-tailed and places more mass in the tails than the other distributions considered for comparison.
The rest of the study is devoted to the in-depth analysis of two well-known data sets in the literature, highlighting the efficiency of the estimated NHLx model under real-life scenarios.
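Tail probabilities of this kind are evaluations of a survival function at chosen points. The sketch below assumes the NHLx survival function exp{1 − [1 + b(1 + x/β)^α]^a} for x ≥ −β and compares it with an exponential survival function and with the empirical proportion of observations above x; the parameter values are hypothetical and are not the fitted values behind Table 2.

```python
import math


def nhlx_survival(x, a, b, alpha, beta):
    """Assumed NHLx survival function 1 - F(x), x >= -beta."""
    if x <= -beta:
        return 1.0
    return math.exp(1.0 - (1.0 + b * (1.0 + x / beta) ** alpha) ** a)


def exponential_survival(x, rate):
    """Survival function of the exponential distribution."""
    return math.exp(-rate * x) if x > 0 else 1.0


def empirical_survival(data, x):
    """Proportion of observations strictly greater than x."""
    return sum(1 for value in data if value > x) / len(data)


if __name__ == "__main__":
    for x in (5.0, 20.0, 50.0):
        print(x, nhlx_survival(x, 0.5, 1.0, 1.0, 1.0), exponential_survival(x, 1.0))
```

Under the assumed form, the survival function decays roughly like exp(−c x^(aα)) for large x, so parameter values with aα < 1 give a tail that is eventually heavier than that of any exponential distribution.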
6.2. Practical Applications
The first data set contains the waiting times (in seconds) associated with 65 successive eruptions of the Kiama Blowhole. It was studied in [32,33]. The second data set concerns 37 intensive care unit (ICU) patients observed over varying time periods. It was analyzed in [34] and, more recently, in [35].
The descriptive measures such as mean, median, skewness, and kurtosis have been computed for both the eruption data and ICU data sets. The results are presented in Table 3.
Table 3.
Descriptive measures for the two data sets.
From the measures of skewness and kurtosis, it is clear that the data are highly skewed and heavy-tailed. Furthermore, the mean value is larger than the median.
For comparison purposes, we consider some of the most accurate extended Lx models: the WL, EXL, and Lx models.
The MLEs and the corresponding SEs of these models are listed in Table 4.
Table 4.
MLEs with SEs in parentheses of the considered models for the two data sets.
Goodness-of-fit measures are used to verify whether a data set is distributionally compatible with a given model. To judge the accuracy of a model, we use the Cramér–von Mises (W*), Anderson–Darling (A*), and Kolmogorov–Smirnov (K-S) statistics (D), along with the K-S p-value related to D. Adequacy measures are widely used to determine which model is best. Here, we consider the Akaike information criterion (AIC), consistent AIC (CAIC), Bayesian information criterion (BIC), and Hannan–Quinn information criterion (HQIC), which are based on the MLEs of the models. The model with the minimum W*, A*, D, AIC, CAIC, BIC, and HQIC values and the maximum p-value is chosen as the one that best fits the data. We refer to [36] for the precise definitions of these measures. Their values for the considered models and the two data sets are collected in Table 5.
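Some of these measures are simple functions of the maximized log-likelihood, the number of parameters k, and the sample size n. The sketch below uses common textbook definitions; in particular, the CAIC is taken as −2ℓ + 2kn/(n − k − 1), which is an assumption since definitions of the CAIC vary across the literature. The K-S statistic D is computed against any fitted cdf passed as a function. Function names and example values are hypothetical.

```python
import math


def information_criteria(loglik, k, n):
    """AIC, CAIC (assumed definition), BIC and HQIC from the log-likelihood."""
    return {
        "AIC": 2.0 * k - 2.0 * loglik,
        "CAIC": -2.0 * loglik + 2.0 * k * n / (n - k - 1.0),
        "BIC": k * math.log(n) - 2.0 * loglik,
        "HQIC": 2.0 * k * math.log(math.log(n)) - 2.0 * loglik,
    }


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a given cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


if __name__ == "__main__":
    # Made-up log-likelihood for a four-parameter model and 65 observations.
    print(information_criteria(loglik=-100.0, k=4, n=65))
    # Distance between three points and the uniform cdf on [0, 1].
    print(ks_statistic([0.1, 0.5, 0.9], lambda x: min(max(x, 0.0), 1.0)))
```

Smaller values of each criterion and of D indicate a better fit, which is how the comparison between competing models is read.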
Table 5.
Values of the statistical measures for the considered models.
From Table 5, we see that the proposed NHLx model fits both data sets better than the other three models.
The histogram plots and estimated pdfs of the considered models are reported in Figure 3.
Figure 3.
Curves of the estimated pdfs of the considered models for the two data sets. (a) Eruption data. (b) ICU data.
From Figure 3, we see that both histograms exhibit the skewed nature of the two data sets, and that the estimated pdf of the NHLx model follows the histograms more closely than those of the other three models.
7. Conclusions
In this paper, we propose a new four-parameter Lomax-based distribution called the Nadarajah–Haghighi Lomax distribution. It aims to provide a new lower-bounded distribution that combines the functionalities of the Nadarajah–Haghighi and Lomax distributions, and extends the modeling scope of the so-called exponential Lomax distribution. We have derived various properties, including the expressions of the probability density, hazard rate, and quantile functions, and diverse kinds of moments. The maximum likelihood method is used for estimating the model parameters. Simulation studies show its effectiveness by considering different sets of parameters. Furthermore, two real data sets are used to illustrate the applications of the Nadarajah–Haghighi Lomax distribution, and it is compared with other Lomax-based distributions. The obtained results indicate that the Nadarajah–Haghighi Lomax distribution provides a better fit than the other Lomax models considered. The perspectives of new work based on the Nadarajah–Haghighi Lomax distribution are numerous, including:
- the development of various extensions, such as parametric-functional, multivariate, and discrete versions;
- the creation of new families of distributions;
- the construction of diverse regression models;
- the study, by viewing the related cdf as a sigmoidal function, of the “confidential intervals” (or “confidential bounds”) and “supersaturation” to the horizontal asymptote (at the median level) in the Hausdorff sense (see [37]). These two characteristics are important for researchers in choosing an appropriate model for approximating specific data from very different branches of scientific knowledge, such as computer virus propagation (see [38]).
Author Contributions
Conceptualization, V.B.V.N., R.V.V. and C.C.; methodology, V.B.V.N., R.V.V. and C.C.; software, V.B.V.N., R.V.V. and C.C.; validation, V.B.V.N., R.V.V. and C.C.; formal analysis, V.B.V.N., R.V.V. and C.C.; investigation, V.B.V.N., R.V.V. and C.C.; data curation, V.B.V.N., R.V.V. and C.C.; writing—original draft preparation, V.B.V.N., R.V.V. and C.C.; writing—review and editing, V.B.V.N., R.V.V. and C.C.; visualization, V.B.V.N., R.V.V. and C.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Acknowledgments
We thank the three reviewers and the associate editor for their in-depth comments on the first version of the article.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Pareto, V. Cours d’Économie Politique; Rouge: Lausanne, Switzerland, 1897; Volume II.
2. Lomax, K. Business failures: Another example of the analysis of failure data. J. Am. Stat. Assoc. 1954, 49, 847–852.
3. Harris, C.M. The Pareto distribution as a queue service discipline. Oper. Res. 1968, 16, 307–313.
4. Atkinson, A.B.; Harrison, A.J. Distribution of Personal Wealth in Britain; Cambridge University Press: Cambridge, UK, 1978.
5. Bryson, M.C. Heavy-tailed distributions: Properties and tests. Technometrics 1974, 16, 61–68.
6. Lingappaiah, G. Bayes prediction in exponential life-testing when sample size is a random variable. IEEE Trans. Reliab. 1986, 35, 106–110.
7. Ahsanullah, M. Record values of the Lomax distribution. Stat. Neerl. 1991, 45, 21–29.
8. Balakrishnan, N.; Ahsanullah, M. Relations for single and product moments of record values from Lomax distribution. Sankhya Indian J. Stat. Ser. B 1994, 56, 140–146.
9. Marshall, A.W.; Olkin, I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 1997, 84, 641–652.
10. Chahkandi, M.; Ganjali, M. On some lifetime distributions with decreasing failure rate. Comput. Stat. Data Anal. 2009, 53, 4433–4440.
11. Abdul-Moniem, I.B. Recurrence relations for moments of lower generalized order statistics from Exponentiated Lomax Distribution and its characterization. J. Math. Comput. Sci. 2012, 2, 999–1011.
12. Rajab, M.; Aleem, M.; Nawaz, T.; Daniyal, M. On Five Parameter Beta Lomax Distribution. J. Stat. 2013, 20, 102–118.
13. Al-Jarallah, R.A.; Ghitany, M.E.; Gupta, R.C. A proportional hazard Marshall–Olkin extended family of distributions and its application to Gompertz distribution. Commun. Stat. Theory Methods 2014, 43, 4428–4443.
14. El-Bassiouny, A.H.; Abdo, N.F.; Shahen, H.S. Exponential Lomax Distribution. Int. J. Comput. Appl. 2015, 121, 24–29.
15. Cordeiro, G.M.; Ortega, E.M.; Popovic, B.V. The Gamma–Lomax Distribution. J. Stat. Comput. Simul. 2015, 85, 305–319.
16. Tahir, M.H.; Cordeiro, G.M.; Mansoor, M.; Zubair, M. The Weibull–Lomax distribution: Properties and applications. Hacet. J. Math. Stat. 2015, 44, 455–474.
17. Mead, M.E. On five-parameter Lomax distribution: Properties and applications. Pak. J. Stat. Oper. Res. 2016, 12, 185–200.
18. Rady, E.H.A.; Hassanein, W.A.; Elhaddad, T.A. The power Lomax distribution with an application to bladder cancer data. SpringerPlus 2016, 5, 1–22.
19. Hassan, A.S.; Abd-Allah, M. Exponentiated Weibull–Lomax distribution: Properties and estimation. J. Data Sci. 2018, 16, 277–298.
20. Nagarjuna, B.V.; Vishnu Vardhan, R. Marshall–Olkin exponential Lomax distribution: Properties and its application. Stoch. Model. Appl. 2020, 24, 161–177.
21. Al-Marzouki, S.; Jamal, F.; Chesneau, C.; Elgarhy, M. Type II Topp–Leone power Lomax distribution with applications. Mathematics 2020, 8, 4.
22. Mathew, J.; Chesneau, C. Some new contributions on the Marshall–Olkin length biased Lomax distribution: Theory, modelling and data analysis. Math. Comput. Appl. 2020, 25, 79.
23. Nagarjuna, B.V.; Vishnu Vardhan, R.; Chesneau, C. Kumaraswamy Generalized Power Lomax Distribution and Its Applications. Stats 2021, 4, 28–45.
24. Nagarjuna, B.V.; Vishnu Vardhan, R.; Chesneau, C. On the Accuracy of the Sine Power Lomax Model for Data Fitting. Modelling 2021, 2, 78–104.
25. Murthy, D.N.P.; Xie, M.; Jiang, R. Weibull Models; John Wiley & Sons: New York, NY, USA, 2004.
26. Nadarajah, S.; Haghighi, F. An extension of the exponential distribution. Statistics 2011, 45, 543–558.
27. Gilchrist, W.G. Statistical Modelling with Quantile Functions; Chapman & Hall/CRC: London, UK, 2000.
28. Koenker, R. Quantile Regression; Cambridge University Press: Cambridge, UK, 2005.
29. Cordeiro, G.M.; Silva, R.B.; Nascimento, A.D.C. Recent Advances in Lifetime and Reliability Models; Bentham Books: Sharjah, United Arab Emirates, 2020.
30. Casella, G.; Berger, R.L. Statistical Inference; Brooks/Cole Publishing Company: Pacific Grove, CA, USA, 1990.
31. Boland, P.J. Statistical and Probabilistic Methods in Actuarial Science; CRC Press: Boca Raton, FL, USA, 2007.
32. da Silva, R.V.; de Andrade, T.A.; Maciel, D.B.; Campos, R.P.; Cordeiro, G.M. A New Lifetime Model: The Gamma Extended Fréchet Distribution. J. Stat. Theory Appl. 2013, 12, 39–54.
33. Pinho, L.G.B.; Cordeiro, G.M.; Nobre, J.S. The Harris extended exponential distribution. Commun. Stat. Theory Methods 2015, 44, 3486–3502.
34. Kang, I.; Hudson, I.; Rudge, A.; Chase, J.G. Density estimation and wavelet thresholding via Bayesian methods: A wavelet probability band and related metrics approach to assess agitation and sedation in ICU patients. In Discrete Wavelet Transforms—A Compendium of New Approaches and Recent Applications; IntechOpen: London, UK, 2013.
35. Khan, M.S.; King, R.; Hudson, I.L. Transmuted generalized exponential distribution: A generalization of the exponential distribution with applications to survival data. Commun. Stat. Simul. Comput. 2017, 46, 4377–4398.
36. Chen, G.; Balakrishnan, N. A general purpose approximate goodness-of-fit test. J. Qual. Technol. 1995, 27, 154–161.
37. Sendov, B. Hausdorff Approximations; Kluwer: Boston, MA, USA, 1990.
38. Iliev, A.; Kyurkchiev, N.; Rahnev, A.; Terzieva, T. Some Models in the Theory of Computer Viruses Propagation; LAP LAMBERT Academic Publishing: Saarbrucken, Germany, 2019.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).