1. Introduction
New distributions often result from adding one or more shape parameters to an existing lifetime distribution (the baseline model). The resulting models are known as the generalized (or generated) G-classes of distributions. According to [1], the G-classes attract researchers in several areas for a number of reasons. One is the computational refinement of symbolic and numerical software, which makes it easier to derive important mathematical and statistical properties. In addition, the structure of the new generators allows the tail properties of the distribution to be explored. Another reason is that the extra parameters added to the baseline G models have been shown to improve the quality of fit. Ref. [2] also showed that the G-classes might provide better fits than classical distributions for skewed data.
Several generators have been defined as special cases of the transformed-transformer (T–X) method introduced by [3]. This technique allows for the derivation of families of distributions by using any probability density function (pdf) as a generator.
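Let $r(t)$ be the pdf of a random variable $T \in [a, b]$ for $-\infty \le a < b \le \infty$. Let $G(x)$ be the baseline cumulative distribution function (cdf) of a random variable X, and let $W[G(x)]$ be a function of $G(x)$ that satisfies the following conditions:
$W[G(x)] \in [a, b]$;
$W[G(x)]$ is differentiable and monotonically non-decreasing;
$W[G(x)] \to a$ when $x \to -\infty$ and $W[G(x)] \to b$ when $x \to \infty$.
Therefore, the T–X family cdf is defined by
$$F(x) = \int_a^{W[G(x)]} r(t)\,dt, \tag{1}$$
and its corresponding pdf is given by
$$f(x) = \left\{\frac{d}{dx} W[G(x)]\right\} r\{W[G(x)]\}.$$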
The T–X family of distributions can be classified into subfamilies. One subfamily has the same X distribution but different T distributions, and the other has the same T distribution but different X distributions. Different choices of the function $W[G(x)]$ also define different subfamilies.
For example, consider $W[G(x)] = G(x)$. If T is a beta random variable, we have the beta-generated family pioneered by [4]. The Kumaraswamy generalized family [5] follows when T is a Kumaraswamy random variable. The exponentiated logarithmic generated family [2] is also a special T–X model.
We can also refer to [6] for a class of univariate distributions generated by extending the logistic distribution, called the logistic-X class (“LX” for short). The LX family is a special model of the T–X family defined by $W[G(x)] = \log\{-\log[1 - G(x)]\}$ in Equation (1) by taking a logistic random variable for T. The cdf and pdf of T are given by (for $t \in \mathbb{R}$)
$$R(t) = \left(1 + e^{-\lambda t}\right)^{-1}$$
and
$$r(t) = \lambda\,e^{-\lambda t}\left(1 + e^{-\lambda t}\right)^{-2},$$
respectively, where $\lambda > 0$. Thus, the LX family cdf is defined by
$$F(x) = \left[1 + \{-\log[1 - G(x)]\}^{-\lambda}\right]^{-1}, \tag{2}$$
and its pdf is given by
$$f(x) = \frac{\lambda\,g(x)}{1 - G(x)}\,\{-\log[1 - G(x)]\}^{-(\lambda+1)}\left[1 + \{-\log[1 - G(x)]\}^{-\lambda}\right]^{-2}, \tag{3}$$
where $G(x)$ is any baseline cdf and $g(x) = dG(x)/dx$. The LX family has the same parameters as the baseline distribution plus an additional shape parameter $\lambda$. Note that the baseline distribution is not a special case of the LX family. However, the family can be interpreted as a compounding model between the logistic and the baseline distributions. According to [6], this family may allow the construction of symmetric, left-skewed, right-skewed, and/or reverse J-shaped distributions; the definition of models with more types of hazard rate function (hrf); and the provision of competitive models to other generated families under the same baseline distribution, among other characterizations.
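The LX construction can be sketched numerically for any baseline. The snippet below is a minimal illustration, assuming the usual LX form $F(x) = [1 + \{-\log[1 - G(x)]\}^{-\lambda}]^{-1}$; the function names and the unit-exponential baseline are hypothetical choices made only for this example. With that baseline, $-\log[1 - G(x)] = x$, so the LX cdf reduces to the log-logistic cdf $1/(1 + x^{-\lambda})$, which gives a simple sanity check.

```python
import numpy as np

def lx_cdf(x, base_cdf, lam):
    """LX cdf for a baseline cdf G: [1 + {-log(1 - G(x))}^(-lam)]^(-1)."""
    u = -np.log1p(-base_cdf(x))          # u = -log[1 - G(x)]
    return 1.0 / (1.0 + u ** (-lam))

def lx_pdf(x, base_cdf, base_pdf, lam):
    """LX pdf obtained by differentiating lx_cdf with respect to x."""
    G = base_cdf(x)
    u = -np.log1p(-G)
    return lam * base_pdf(x) / (1.0 - G) * u ** (-(lam + 1.0)) / (1.0 + u ** (-lam)) ** 2

# Hypothetical baseline used only for illustration: unit exponential.
exp_cdf = lambda x: 1.0 - np.exp(-x)
exp_pdf = lambda x: np.exp(-x)

x = np.linspace(0.2, 5.0, 50)
lam = 2.5

# Central-difference derivative of the cdf should match the pdf.
h = 1e-6
numeric_pdf = (lx_cdf(x + h, exp_cdf, lam) - lx_cdf(x - h, exp_cdf, lam)) / (2 * h)
max_abs_err = np.max(np.abs(numeric_pdf - lx_pdf(x, exp_cdf, exp_pdf, lam)))
```

Passing the baseline as a pair of callables keeps the generator separate from the baseline, which mirrors how the family is defined.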
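In this paper, we introduce a new four-parameter distribution called the logistic Burr XII (LBXII) distribution. It is defined by inserting the three-parameter Burr XII (BXII) distribution as the baseline in Equations (2) and (3). The BXII distribution has a cdf and pdf (for $x > 0$) given by
$$G(x) = 1 - \left[1 + \left(\frac{x}{s}\right)^{c}\right]^{-k} \tag{4}$$
and
$$g(x) = c\,k\,s^{-c}\,x^{c-1}\left[1 + \left(\frac{x}{s}\right)^{c}\right]^{-k-1},$$
respectively, where $c > 0$ and $k > 0$ are shape parameters and $s > 0$ is a scale parameter.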
The BXII distribution was originally proposed by [7]. Its use is appealing for several reasons. Notably, it can capture asymmetric behavior and heavy tails in positive outcomes [8]. These characteristics have led to its adoption as a fundamental tool for the development of generalized probability distributions.
Table 1 provides a review of some BXII generalizations through different G families or transformation methods. The BXII distribution also finds extensive application in diverse fields, such as remote sensing [9], econometrics [8], and environmetrics [10].
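The cdf of the LBXII distribution is given by (for $x > 0$)
$$F(x) = \left[1 + \left\{k \log\left[1 + (x/s)^{c}\right]\right\}^{-\lambda}\right]^{-1},$$
where $\lambda > 0$, $c > 0$, and $k > 0$ are shape parameters and $s > 0$ is a scale parameter. The corresponding pdf has the form
$$f(x) = \frac{\lambda\,c\,k\,s^{-c}\,x^{c-1}}{1 + (x/s)^{c}}\,\left\{k \log\left[1 + (x/s)^{c}\right]\right\}^{-(\lambda+1)}\left[1 + \left\{k \log\left[1 + (x/s)^{c}\right]\right\}^{-\lambda}\right]^{-2}. \tag{6}$$
Henceforth, if X is a random variable with density function (6), we write $X \sim \text{LBXII}(\lambda, c, k, s)$.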
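As a rough numerical check of the LBXII cdf and pdf, the sketch below assumes the BXII baseline $G(x) = 1 - [1 + (x/s)^c]^{-k}$, so that $-\log[1 - G(x)] = k\log[1 + (x/s)^c]$. The function names, the argument order `(lam, c, k, s)`, and the parameter values are illustrative assumptions. The expressions are written in the algebraically equivalent form $u^{\lambda}/(1 + u^{\lambda})$, which avoids large negative powers near $x = 0$.

```python
import numpy as np
from scipy.integrate import quad

def lbxii_cdf(x, lam, c, k, s):
    """LBXII cdf written as u^lam / (1 + u^lam), with u = k*log[1 + (x/s)^c]."""
    u = k * np.log1p((x / s) ** c)
    return u ** lam / (1.0 + u ** lam)

def lbxii_pdf(x, lam, c, k, s):
    """LBXII pdf, the derivative of lbxii_cdf with respect to x."""
    z = (x / s) ** c
    u = k * np.log1p(z)
    # Hazard rate of the BXII baseline, g(x) / [1 - G(x)].
    base_hazard = c * k * x ** (c - 1.0) / (s ** c * (1.0 + z))
    return lam * base_hazard * u ** (lam - 1.0) / (1.0 + u ** lam) ** 2

# Illustrative parameter values (not estimates from the paper).
theta = (2.0, 1.5, 0.8, 3.0)
a, b = 0.5, 20.0

# The pdf integrated over [a, b] should equal F(b) - F(a).
area, _ = quad(lbxii_pdf, a, b, args=theta)
cdf_increment = lbxii_cdf(b, *theta) - lbxii_cdf(a, *theta)
```

The check is done on a finite interval because the upper tail decays slowly, which makes numerical integration to infinity unreliable.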
Figure 1 displays plots of the LBXII density function for selected parameter values. It can take various forms, and has as special models some well-known distributions. For
and
, we have the logistic-log-logistic (LLL) distribution. For
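$c = 1$, it becomes the logistic-Lomax (LLo) model. The hrf of X can be expressed as
$$h(x) = \frac{\lambda\,c\,k\,s^{-c}\,x^{c-1}}{\left[1 + (x/s)^{c}\right]\,k \log\left[1 + (x/s)^{c}\right]\,\left[1 + \left\{k \log\left[1 + (x/s)^{c}\right]\right\}^{-\lambda}\right]}.$$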
Figure 2 provides plots of the hrf for some parameter values. It reveals that the LBXII distribution can have decreasing and upside-down-bathtub hazard functions. The proposed distribution is quite flexible regarding the pdf and hrf and may be a useful alternative to the BXII model and its generalizations. Therefore, it can be considered for modeling income distributions, as well as data from actuarial science, bioscience, and lifetime analysis, among other areas.
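The two hazard shapes mentioned above can be reproduced with a short sketch. It assumes the hrf $h(x) = f(x)/[1 - F(x)]$ with the BXII baseline $G(x) = 1 - [1 + (x/s)^c]^{-k}$, written in the equivalent form $\lambda\,[g(x)/\{1 - G(x)\}]\,u^{\lambda-1}/(1 + u^{\lambda})$ with $u = k\log[1 + (x/s)^c]$. The function name and the parameter values are illustrative assumptions and are not those used in Figure 2.

```python
import numpy as np

def lbxii_hrf(x, lam, c, k, s):
    """LBXII hazard rate f(x) / [1 - F(x)] in closed form."""
    z = (x / s) ** c
    u = k * np.log1p(z)                                   # -log[1 - G(x)]
    base_hazard = c * k * x ** (c - 1.0) / (s ** c * (1.0 + z))
    return lam * base_hazard * u ** (lam - 1.0) / (1.0 + u ** lam)

x = np.linspace(0.01, 50.0, 2000)

# With c <= 1 and lam <= 1, every factor of the hrf is decreasing in x.
h_decreasing = lbxii_hrf(x, lam=0.5, c=1.0, k=0.8, s=3.0)

# With c*lam > 1, the hrf starts near zero, rises, and then falls again.
h_unimodal = lbxii_hrf(x, lam=2.0, c=1.5, k=0.8, s=3.0)
peak_index = int(np.argmax(h_unimodal))
```

An interior `peak_index` is what an upside-down-bathtub hazard looks like on a grid.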
The rest of the paper is organized as follows: We derive useful expansions for the cdf and pdf of the new distribution in Section 2. In Section 3, some mathematical properties of the LBXII distribution are investigated. In Section 4, the maximum likelihood method is presented to estimate the model parameters. A simulation study is performed in Section 5. In Section 6, we illustrate the flexibility of the new model using two real data sets. Some concluding remarks are offered in Section 7.
2. Useful Expansions
Tahir et al. [6] demonstrated that the LX pdf can be written as an infinite linear combination of exponentiated-G (exp-G) densities; see [32] for the definition of the exp-G distribution. In this section, we derive useful expansions for the LBXII pdf based not on exponentiated models but on our baseline model. Inserting (4) into Equation (2), the LBXII cdf can be rewritten as
Using the
Mathematica software, version 12.0, we obtain a power series for
as
Applying this power series for
in (7) and after some algebraic manipulation, we obtain
where the
’s are
,
,
,
, etc. For any
real non-integer, the following expansion holds since the left-hand-side expression is a cdf
where
Thus, Equation (8) can be rewritten as
where
. The coefficients of the quotient of the two power series in (9) can be determined from the recurrence equation (for
)
and then, Equation (9) reduces to
where
By differentiating (10), we obtain
where
is the exp-BXII pdf with power parameter
. Using the binomial theorem (for
), we can write
Inserting (12) into Equation (11) and after some algebra, we obtain
where
is the BXII density function with scale parameter
s and shape parameters
c and
. Since the sums in the above expressions vary in equal sets of indices, we can exchange
for
. Therefore, the LBXII pdf can be reduced to
where
Equation (13) is the main result of this section. It shows that the LBXII pdf is an infinite linear combination of BXII densities. Thus, some mathematical properties of X can be derived from the corresponding BXII properties.
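As an example of a BXII property that such a linear combination could draw on, the sketch below evaluates the ordinary BXII moment, which for the parameterization $G(x) = 1 - [1 + (x/s)^c]^{-k}$ is $E(Y^r) = s^r\,k\,B(k - r/c,\, 1 + r/c)$ for $r < ck$. It is checked against numerical integration. The function names and parameter values are assumptions for illustration, and the weights of the expansion are not used here.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import beta

def bxii_pdf(x, c, k, s):
    """BXII density with shape parameters c, k and scale parameter s."""
    z = (x / s) ** c
    return c * k * x ** (c - 1.0) / s ** c * (1.0 + z) ** (-k - 1.0)

def bxii_moment(r, c, k, s):
    """r-th ordinary BXII moment, s^r * k * B(k - r/c, 1 + r/c), for r < c*k."""
    if r >= c * k:
        raise ValueError("the moment of order r exists only for r < c*k")
    return s ** r * k * beta(k - r / c, 1.0 + r / c)

# Illustrative parameter values (not taken from the paper).
c, k, s, r = 3.0, 2.0, 1.5, 1
closed_form = bxii_moment(r, c, k, s)
numeric, _ = quad(lambda x: x ** r * bxii_pdf(x, c, k, s), 0.0, np.inf)
```

The restriction $r < ck$ matters in practice: a term-by-term moment expansion is only meaningful for the component densities whose moments exist.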
6. Applications
In this section, we present two examples to illustrate the potential of the LBXII distribution for modeling income data. The first data set consists of the annual salaries of professional hockey players for the 2012–2013 season. It has 714 observations in US dollars and is available for download at https://www.usatoday.com/sports/nhl/ (accessed on 17 September 2016).
The second example represents the individual payroll income of 5024 Italian households with positive income. These data are obtained from the Survey of Household Income and Wealth (SHIW) of the Bank of Italy for 2014. The observations are measured in euros.
We fit the LBXII model to both data sets and compare it with six competing models. The distributions covered in this comparison include five-parameter BXII generalizations and some special models of our proposal. For reference, the density functions of the competing models are defined below (for $x > 0$):
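The KwBXII density is given by
$$f(x) = a\,b\,c\,k\,s^{-c}\,x^{c-1}\left[1 + (x/s)^{c}\right]^{-k-1}\left\{1 - \left[1 + (x/s)^{c}\right]^{-k}\right\}^{a-1}\left(1 - \left\{1 - \left[1 + (x/s)^{c}\right]^{-k}\right\}^{a}\right)^{b-1},$$
where $a > 0$ and $b > 0$ are shape parameters.
The BBXII density is given by
$$f(x) = \frac{c\,k\,s^{-c}\,x^{c-1}}{B(a, b)}\left\{1 - \left[1 + (x/s)^{c}\right]^{-k}\right\}^{a-1}\left[1 + (x/s)^{c}\right]^{-(kb+1)},$$
where $a > 0$ and $b > 0$ are shape parameters and $B(a, b)$ is the beta function.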
The BXII density is given in (6).
The exponentiated Weibull (EW) density [32] is given by
where
and
are shape parameters and
is a scale parameter.
The Weibull (W) density, which arises from the EW density when .
The LL density obtained from the BXII density with and .
The following statistics are considered for these models: the Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), and Kolmogorov–Smirnov (KS) statistic. Lower values of these statistics indicate a better fit of the distribution to the data. We use the R programming language to obtain the MLEs and goodness-of-fit statistics of the LBXII and all competing models.
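The criteria listed above can be computed from a maximized log-likelihood as in the following sketch. The source obtains them in R; Python is used here only for illustration, and the function names and the log-likelihood value are hypothetical. The CAIC is taken in the form $-2\ell + p(\log n + 1)$, which is an assumption, since the source does not state the formula and some papers use the same label for the small-sample corrected AIC.

```python
import numpy as np

def information_criteria(loglik, n_params, n_obs):
    """AIC, CAIC, BIC and HQIC from a maximized log-likelihood."""
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * np.log(n_obs)
    # Assumed CAIC form: -2*loglik + p*(log(n) + 1).
    caic = -2.0 * loglik + n_params * (np.log(n_obs) + 1.0)
    hqic = -2.0 * loglik + 2.0 * n_params * np.log(np.log(n_obs))
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    F = cdf(x)
    d_plus = np.max(np.arange(1, n + 1) / n - F)   # ecdf above the fitted cdf
    d_minus = np.max(F - np.arange(0, n) / n)      # ecdf below the fitted cdf
    return max(d_plus, d_minus)

# Hypothetical log-likelihood for a four-parameter model fitted to 714 observations.
crit = information_criteria(loglik=-100.0, n_params=4, n_obs=714)
```

All four criteria penalize the same $-2\ell$ term and differ only in the penalty per parameter, so models with different numbers of parameters can be ranked on a common scale.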
6.1. Hockey Players’ Salaries
Table 3 provides a descriptive summary of the hockey players’ data. The standard deviation (SD) is large and the range (amplitude) is 13,475,000, which indicates that the data have great variability. The skewness is positive, and the kurtosis is large. Further, the mean and median are not close. These statistics suggest that hockey players’ salaries may follow a power law distribution, which is very common in income data sets.
The MLEs and their standard errors for all fitted distributions are listed in Table 4. The Bayes estimates, following the procedure described in Section 4.2, are also included. We note that the parameter estimates are significant for all considered models.
Table 5 presents the goodness-of-fit statistics and reveals that the LBXII distribution fits the hockey players’ data well. It has the lowest values for all statistics, which indicates that it is a competitive alternative to the classical W and EW models, the other BXII generalizations, and the special models.
The three estimated densities with the lowest values of the goodness-of-fit statistics are plotted with the histogram of the data in Figure 4. They agree with the descriptive summary and the results in Table 5. Thus, the LBXII model is very competitive with the other fitted distributions and provides a better fit to the current data.
6.2. Individual Payroll Income
Table 6 provides a descriptive summary of the individual payroll income data. For these data, the mean and median are close and the SD is large. We also note large values for the skewness and kurtosis coefficients. The range (amplitude) is 134,900. As with the first data set, the descriptive statistics indicate that the payroll income may follow a power law distribution with a right-skewed tail.
Tables 7 and 8 present the MLEs with their standard errors and the goodness-of-fit statistics, respectively, for the seven fitted models. The Bayes estimates, following the procedure described in Section 4.2, are also included. These results are obtained for the LBXII distribution and six competing models. The parameter estimates are significant for all fitted models, and the LBXII distribution exhibits the lowest values for all goodness-of-fit statistics. As in the first empirical example, the LBXII model is a competitive alternative to the other fitted models.
Figure 5 displays a histogram of the payroll income data and the estimated densities of the three most competitive models according to the goodness-of-fit statistics. These plots agree with the results in Table 8. As with the first data set, the LBXII distribution provides better fits than the other income distributions considered for these data, and it is a very competitive alternative to the W and EW models.
7. Concluding Remarks
We introduce the four-parameter logistic Burr XII (LBXII) distribution. It can have decreasing and upside-down-bathtub hazard functions and can be considered for modeling income distributions, among other applications. We demonstrate that the LBXII density function is an infinite linear combination of BXII densities. Some mathematical properties of the new distribution are obtained using this result, such as the ordinary and incomplete moments and the generating function. We also determine the quantile function of the LBXII distribution, which is useful to obtain any quantiles of interest, simulate LBXII random variables, and provide alternative expressions for the skewness and kurtosis. We estimate the model parameters using the maximum likelihood method and carry out a Monte Carlo simulation study. In this study, the efficiency of the maximum likelihood estimators improves for larger sample sizes, which is an important aspect to consider when applying the LBXII distribution to real-world data. We present two applications to illustrate the potential of the LBXII distribution for modeling income data. Both data sets exhibit characteristics of a power law distribution, which is very common in income data sets. The LBXII distribution fits both data sets well and is thus a competitive model against the classical Weibull distribution, the exponentiated Weibull model, other BXII generalizations, and special models. Finally, the LBXII model may provide an attractive alternative to describe and understand income distribution behavior.