1. Introduction
Analysts often have to deal with data that exhibit bimodality; for example, when observing the size of worker ants in weaver ant colonies [1], the duration of volcanic eruptions [2], the amount of excretion of mercury in urine [3], the grain size in sintered Zirconia [4], or the amount of tropospheric water vapor in the tropics [5].
Two-component mixture distributions are usually used to model data that exhibit bimodality. These distributions have a very flexible density function (unimodal/bimodal), a highly valued feature when modeling data in different real settings. However, one difficulty in working with them is the non-identifiability of their parameters; see McLachlan et al. [6] for more detail.
The literature offers guidelines for dealing with the problem of non-identifiability in a mixture distribution; for example, Aitkin and Rubin [7] suggest assuming a certain restriction on the mixing parameter to evaluate the behavior of the maximum likelihood estimates. An alternative considered in various studies is to impose restrictions on the components of the mixture distribution; for example, assuming that the components have the same variance. In this way, sub-families of mixture distributions with a simpler parametric structure (where the parameters are identifiable) are defined.
Imposing restrictions on the components of a mixture distribution implies working with a sub-family whose density function is less flexible than in the unrestricted case. However, the literature contains sub-families of two-component mixture distributions that have been shown to be useful for data modeling in various real settings. One of them is the generalized bimodal (GB) distribution, originally proposed by Rao [8] and later studied as a special case of the bimodal symmetric distribution proposed in Sarma et al. [9].
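A random variable $X$ has a generalized bimodal distribution, denoted as $X \sim \mathrm{GB}(\gamma)$, if its probability density function (pdf) is given by
$$g(x;\gamma) = \frac{\gamma + x^{2}}{1+\gamma}\,\phi(x), \quad x \in \mathbb{R}, \tag{1}$$
and its cumulative distribution function (cdf) is given by
$$G(x;\gamma) = \Phi(x) - \frac{x}{1+\gamma}\,\phi(x), \quad x \in \mathbb{R}, \tag{2}$$
where $\gamma \in [0,2)$ is a shape parameter that controls bimodality and $\phi(\cdot)$ and $\Phi(\cdot)$ denote the pdf and the cdf of the standard normal distribution, respectively.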
It is easy to verify that Equation (1) corresponds to the pdf of a two-component mixture distribution with mixing parameter $\gamma/(1+\gamma)$, for $\gamma \in [0,2)$, one component (with weight $\gamma/(1+\gamma)$) having a standard normal distribution, and the other a standard bimodal-normal (BN) distribution. If $\gamma = 0$, the GB distribution reduces to the BN distribution. Details on the structural properties of the BN distribution can be found in Hassan and Hijazi [10] and Elal-Olivero [11]. A class of generalized bimodal distributions that extends the GB distribution can be found in [12]. This class is defined by the cdf F(x) = Φ(x) − α(x) ϕ(x), where α(x) is a linear function of x. Thus, the GB distribution (with cdf given in Equation (2)) can be derived as a special case of the class proposed in [12] when α(x) = x/(1 + γ).
The restriction of the parameter space of $\gamma$ to the interval [0,2) can be explained by the fact that the pdf given in Equation (1) is bimodal for any value of $\gamma$ in this interval. However, we observe that the range of $\gamma$ can be extended so that $\gamma$ assumes values in the interval $[0,\infty)$. In this case, it is possible to verify the following properties: the GB pdf is symmetric-bimodal when $0 \le \gamma < 2$ and symmetric-unimodal when $\gamma \ge 2$. The GB distribution reduces to the bimodal-normal distribution when $\gamma = 0$ and tends to the standard normal distribution as $\gamma \to \infty$.
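As a minimal illustration of these properties, the following Python sketch evaluates the GB cdf implied by the class F(x) = Φ(x) − α(x) ϕ(x) with α(x) = x/(1 + γ), the pdf obtained by differentiating it, and the location of the modes. The function names are hypothetical and chosen only for this example; the mode formula follows from the sign of the derivative of the pdf, which is proportional to x(2 − γ − x²)ϕ(x).

```python
import math


def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cdf, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gb_cdf(x, gamma):
    """GB cdf: Phi(x) - alpha(x) * phi(x) with alpha(x) = x / (1 + gamma)."""
    return norm_cdf(x) - x * norm_pdf(x) / (1.0 + gamma)


def gb_pdf(x, gamma):
    """GB pdf, i.e. the derivative of gb_cdf with respect to x."""
    return (gamma + x * x) / (1.0 + gamma) * norm_pdf(x)


def gb_modes(gamma):
    """Modes of the GB pdf: two symmetric modes if gamma < 2, one mode at 0 otherwise."""
    if gamma < 2.0:
        r = math.sqrt(2.0 - gamma)
        return [-r, r]
    return [0.0]
```

For example, `gb_modes(1.0)` returns the two modes at -1 and 1, whereas `gb_modes(3.0)` returns the single mode at 0.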
The symmetry of the GB pdf can be considered a desirable characteristic in the analysis of certain data sets, but a limitation in the analysis of others. In the literature, it is possible to find different construction methods that generate an asymmetric distribution from a symmetric baseline distribution; see Azzalini [13], Eugene et al. [14], Cordeiro and de Castro [15], Ferreira and Steel [16], Goerg [17], and Alzaatreh et al. [18], among others.
Recently, Iriarte et al. [19] introduced the Lambert-F distribution generator defined as
$$F(x;\alpha) = 1 - \left[1 - G(x)\right]\alpha^{G(x)}, \tag{3}$$
where $\alpha \in (0, e)$ is an extra shape parameter and $G(\cdot)$ is the cdf of an arbitrary baseline distribution.
The transformation given in Equation (3) defines a new family of distributions more flexible in terms of skewness than the baseline distribution. Iriarte et al. [19] study two special cases of Equation (3), extending the classical exponential and Rayleigh distributions and showing that the hazard rate functions induced by this transformation can be understood as modifications in the early times of the baseline hazard rate functions.
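To make the role of the extra shape parameter concrete, the following Python sketch applies a Lambert-F type transformation to an arbitrary baseline cdf. It assumes that the generator has the form F = 1 − (1 − G)·α^G with α in (0, e); this form is our assumption for the illustration and should be checked against Equation (3). The names `lambert_f_cdf` and `lambert_f_pdf` are hypothetical.

```python
import math


def lambert_f_cdf(G, alpha):
    """Return x -> 1 - (1 - G(x)) * alpha**G(x) for a baseline cdf G (assumed form)."""
    if not (0.0 < alpha < math.e):
        raise ValueError("alpha must lie in (0, e)")

    def F(x):
        u = G(x)
        return 1.0 - (1.0 - u) * alpha ** u

    return F


def lambert_f_pdf(g, G, alpha):
    """Return the pdf obtained by differentiating the transformed cdf."""
    if not (0.0 < alpha < math.e):
        raise ValueError("alpha must lie in (0, e)")
    log_alpha = math.log(alpha)

    def f(x):
        u = G(x)
        # d/du [1 - (1 - u) * alpha**u] = alpha**u * (1 - log(alpha) * (1 - u))
        return g(x) * alpha ** u * (1.0 - log_alpha * (1.0 - u))

    return f
```

Under this assumed form, α = 1 returns the baseline distribution unchanged, and the bound α < e keeps the factor 1 − log(α)(1 − u) positive, so the transformed function remains a valid pdf.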
In this article, we introduce a new unimodal/bimodal distribution that generalizes the GB distribution and is capable of modeling different levels of skewness. The proposal, called the Lambert generalized bimodal (LGB) distribution, arises from Equation (3) when considering a baseline GB distribution. The result is a new distribution that generalizes the GB and BN distributions and that can serve as an alternative to other asymmetric bimodal distributions in the literature.
The article is organized as follows. In Section 2, we define the LGB random variable and derive the density and distribution functions. In Section 3, we describe the characteristics of unimodality, bimodality, asymmetry, and kurtosis. In addition, alias distributions are discussed. In Section 4, we consider the problem of parameter estimation using the maximum likelihood (ML) method. In Section 5, the behavior of the ML estimators and the utility of the proposed distribution are evaluated through simulation experiments. Section 6 presents two application examples illustrating the usefulness of the LGB distribution in real settings. Finally, the main conclusions are reported in Section 7.
4. Maximum Likelihood Estimator
In this section, we deal with the problem of parameter estimation in the LGB distribution under the maximum likelihood (ML) method.
If
, then (for
’) the log-likelihood is given by
where
Thus, the elements of the score vector are
For a random sample from the LGB distribution, we observe that the ML estimator of the parameter vector cannot be expressed in closed form. The likelihood equations form a system of four nonlinear equations (see Appendix D) that must be solved numerically to obtain the ML estimates.
In this case, as the ML estimators do not have a closed form, a good alternative to obtain ML estimates is to solve the following optimization problem,
where
is given in Equation (14). We solved (15) using the optim function of the R language [20], specifically with the L-BFGS-B algorithm [23]. This algorithm requires a feasible starting point in the parameter space to start the iterative process. Considering that the bimodal-normal distribution is a special case of the LGB distribution, we verified through simulation experiments that (
), where
is the mean of the observations and
the corresponding standard deviation, is a good starting point.
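The objective handed to a box-constrained optimizer can be sketched as follows. This is a Python analogue written for illustration, not the R code used in this work. It assumes a location-scale LGB density obtained by applying a generator of the form F = 1 − (1 − G)·α^G to a GB baseline, with the hypothetical parameter ordering (mu, sigma, gamma, alpha). The function returns infinity outside the assumed feasible region sigma > 0, gamma ≥ 0, 0 < alpha < e.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lgb_logpdf(x, mu, sigma, gamma, alpha):
    """Log-density of the assumed location-scale LGB model at x."""
    z = (x - mu) / sigma
    phi = norm_pdf(z)
    G = norm_cdf(z) - z * phi / (1.0 + gamma)    # GB baseline cdf
    g = (gamma + z * z) / (1.0 + gamma) * phi    # GB baseline pdf
    if g <= 0.0:
        return -math.inf                         # e.g. gamma = 0 and z = 0
    log_alpha = math.log(alpha)
    return (math.log(g) - math.log(sigma) + G * log_alpha
            + math.log(1.0 - log_alpha * (1.0 - G)))


def neg_loglik(theta, data):
    """Negative log-likelihood; theta = (mu, sigma, gamma, alpha) (hypothetical ordering)."""
    mu, sigma, gamma, alpha = theta
    if sigma <= 0.0 or gamma < 0.0 or not (0.0 < alpha < math.e):
        return math.inf
    return -sum(lgb_logpdf(x, mu, sigma, gamma, alpha) for x in data)
```

Minimizing `neg_loglik` is equivalent to maximizing the log-likelihood. In Python, one option would be to pass it to `scipy.optimize.minimize` with `method="L-BFGS-B"` and explicit bounds, mirroring the role played by optim in R.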
Under regularity conditions, the asymptotic distribution of
is
, where
is the expected information matrix. As the function
is not simple, it is not easy to obtain the analytical expression of this matrix. However, we obtain an approximation from the observed information matrix, whose elements are computed as minus the second partial derivatives of the log-likelihood function with respect to all the parameters (evaluated at the ML estimates). Thus, for a random sample
from
, the observed information matrix is given by
with
,
,
and
, where the analytical expressions of the second partial derivatives are presented in Appendix E.
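When the analytical second derivatives are inconvenient to code, the observed information matrix can also be approximated numerically. The following Python sketch, given under that assumption and not used in this work, builds the Hessian of a negative log-likelihood by central finite differences, inverts it, and returns the square roots of the diagonal as approximate standard errors. All function names are hypothetical.

```python
import math


def numerical_hessian(f, theta, rel_step=1e-4):
    """Central-difference Hessian of f at theta (a list of floats)."""
    p = len(theta)
    h = [rel_step * max(1.0, abs(t)) for t in theta]

    def shifted(i, si, j, sj):
        t = list(theta)
        t[i] += si * h[i]
        t[j] += sj * h[j]
        return f(t)

    H = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            val = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                   - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4.0 * h[i] * h[j])
            H[i][j] = H[j][i] = val
    return H


def invert(A):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[piv][c]) < 1e-300:
            raise ValueError("matrix is numerically singular")
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def standard_errors(neg_loglik, theta_hat):
    """Approximate standard errors from the inverse observed information at theta_hat."""
    cov = invert(numerical_hessian(neg_loglik, theta_hat))
    return [math.sqrt(cov[i][i]) for i in range(len(theta_hat))]
```

Here `neg_loglik` is any function of the parameter list alone, so that its Hessian at the ML estimate equals minus the matrix of second partial derivatives of the log-likelihood. For a normal model, this reproduces the familiar standard error of the mean.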
5. Simulation Studies
In the analysis of data exhibiting bimodality, it is common to use a two-component mixture normal (MN) distribution. In this section, we first carry out a simulation study to evaluate the behavior of the ML estimators of the LGB distribution parameters. We then conduct a second simulation study to evaluate the usefulness of the LGB distribution in a context where the MN distribution performs well.
5.1. First Simulation Study
In this study, 1000 random samples from the LGB distribution were generated considering the sample sizes n = 100, 200, 300, 500, and 1000 in the following two scenarios:
Scenario A: , , and .
Scenario B: , , and .
Simulated random samples were generated using the qf given in Equation (7). The LambertW package [24] in the R language was used to compute the principal branch of the Lambert W function. R code is provided in Appendix A.
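The inverse transform idea behind this sampling scheme can be sketched in Python as follows. This is not the code of Appendix A: instead of the closed-form qf based on the Lambert W function, the sketch inverts the cdf numerically by bisection, which we name plainly as a substitute technique. The LGB cdf used here assumes a generator of the form F = 1 − (1 − G)·α^G applied to a location-scale GB baseline, and the parameter ordering (mu, sigma, gamma, alpha) is hypothetical.

```python
import math
import random


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lgb_cdf(x, mu, sigma, gamma, alpha):
    """Assumed LGB cdf: Lambert-F type transform of a location-scale GB cdf."""
    z = (x - mu) / sigma
    u = norm_cdf(z) - z * norm_pdf(z) / (1.0 + gamma)
    return 1.0 - (1.0 - u) * alpha ** u


def lgb_quantile(p, mu, sigma, gamma, alpha, tol=1e-10):
    """Numerical quantile: bisection on the monotone cdf over mu +/- 12 sigma."""
    lo, hi = mu - 12.0 * sigma, mu + 12.0 * sigma
    while hi - lo > tol * sigma:
        mid = 0.5 * (lo + hi)
        if lgb_cdf(mid, mu, sigma, gamma, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lgb_sample(n, mu, sigma, gamma, alpha, seed=None):
    """Draw n values by applying the numerical quantile to uniform variates."""
    rng = random.Random(seed)
    return [lgb_quantile(rng.random(), mu, sigma, gamma, alpha) for _ in range(n)]
```

Bisection is slower than a closed-form qf but needs only the cdf, which makes it a convenient cross-check for an implementation based on the Lambert W function.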
For each simulated sample, we obtain the ML estimates by solving (15) under the considerations mentioned in Section 4. Table 3 reports the average estimate (AE), the empirical standard deviation (SD), and the root mean square error (RMSE) of the 1000 estimates obtained for each scenario and sample size considered. Table 4 reports the average of the asymptotic standard errors (SE) of the ML estimates along with the coverage probability (CP) of the 95% asymptotic confidence intervals.
Table 3 indicates that the AEs tend to be close to the true values of the parameters as the sample size increases. The SDs and RMSEs are close and decrease towards 0 as the sample size increases, as expected under standard asymptotic theory. Table 4 indicates that the SEs are close to the SDs and RMSEs given in Table 3. As expected, the SEs decrease towards 0 and the CPs converge to the nominal level used to construct the confidence intervals as the sample size increases.
5.2. Second Simulation Study
First, 1000 random samples from the MN distribution were generated considering the sample sizes n = 50, 100, 200, 300 and the following four scenarios: Scenario A, , , , , and . Scenario B, , , , , and . Scenario C, , , , and . Scenario D, , , , , and .
For each simulated sample, the LGB and MN distributions are fitted via the ML method using the optim function in the R language. Subsequently, based on the Akaike Information Criterion (AIC) [25], Corrected Akaike Information Criterion (CAIC) [26], and Bayesian Information Criterion (BIC) [27], we calculate the proportion of samples for which the AIC, CAIC, and BIC values are lower for the LGB distribution. We call this the hit rate for the LGB distribution. In addition, the modified Cramér–von Mises and Anderson–Darling statistics [28] are calculated for the LGB distribution in order to test the null hypothesis that the data are a random sample from an LGB population, where the parameters have been estimated by the ML method. Thus, we calculate the proportion of simulated samples for which the null hypothesis is not rejected, which we call the non-rejection rate.
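The hit rate computation described above can be sketched in Python as follows. The AIC and BIC formulas are standard. For the CAIC we assume the usual small-sample correction AIC + 2k(k + 1)/(n − k − 1); this is our assumption and should be checked against [26]. The function names are hypothetical.

```python
import math


def information_criteria(loglik, k, n):
    """AIC, CAIC (assumed small-sample corrected AIC), and BIC.

    loglik: maximized log-likelihood, k: number of parameters, n: sample size.
    """
    aic = 2.0 * k - 2.0 * loglik
    caic = aic + 2.0 * k * (k + 1) / (n - k - 1)  # assumed form of the correction
    bic = k * math.log(n) - 2.0 * loglik
    return {"AIC": aic, "CAIC": caic, "BIC": bic}


def hit_rate(values_a, values_b):
    """Proportion of paired samples in which model A has the lower criterion value."""
    hits = sum(1 for a, b in zip(values_a, values_b) if a < b)
    return hits / len(values_a)
```

With equal maximized log-likelihoods, a model with k = 4 parameters obtains lower values of all three criteria than a model with k = 5, which is the mechanism behind the comparison between the four-parameter LGB distribution and the five-parameter MN distribution.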
Finally, considering a procedure analogous to the one described above, we simulate random samples from the LGB distribution and calculate the hit and non-rejection rates for the MN distribution. The scenarios considered here are the following: Scenario A, , , , and . Scenario B, , , , and . Scenario C, , , , and . Scenario D, , , , and .
Table 5 and Table 6 report the hit and non-rejection rates for the LGB and MN distributions, respectively. In Table 5, we observe that the non-rejection rates are high, which means that a considerable proportion of samples generated from the MN distribution can be appropriately fitted with the LGB distribution. We also observe that the hit rates are high, exceeding 0.5 even for moderate sample sizes. Note that the non-rejection rates decrease considerably in scenario D as the sample size increases and that the hit rates are lower than in the other scenarios. This is because the samples are generated from an MN population whose two component scales are considerably different. This shows that the LGB distribution can perform well in settings where the MN distribution is used and where the estimates of its two component scales are similar.
In Table 6, we observe that the non-rejection rates are very high, which was expected as the MN distribution has one more parameter than the LGB distribution. This means that a very considerable proportion of samples generated from an LGB population can be appropriately fitted with the MN distribution. However, because the AIC, CAIC, and BIC values penalize the number of parameters and the MN distribution has one more parameter, its hit rates in the different scenarios are small. Therefore, in a real setting where both distributions appropriately fit a given dataset, the information criteria AIC, CAIC, and BIC can be expected to favor the LGB distribution because it has fewer parameters to estimate.
6. Data Analysis
In this section, two applications are presented in order to illustrate the usefulness of the LGB distribution and its special cases in data modeling in different real settings. Other symmetric/asymmetric unimodal/bimodal distributions are also considered to illustrate that the LGB distribution or some of its special cases may have a better fit than other distributions in the literature. Specifically, the odd log-logistic skew-normal (OLLSN) [29] and gamma sinh-Cauchy (GSC) [30] distributions are considered. Like the LGB distribution, these distributions have four parameters: two shape parameters (which together control skewness and bimodality), a location parameter, and a scale parameter. A mixture distribution of two normal components (MN) [6] is also included in the analysis as it is a commonly used distribution for analyzing data exhibiting bimodality.
The first dataset corresponds to 188 observations on the inflation rate (in %) registered quarterly between the years 1950 and 1996 in Canada. This dataset can be found with the name Tbrate in the R language [31].
The second dataset refers to 128 observations on the electrical resistance (in ohms) of nectarine fruits. These data can be found with the name fruitohms in the R language [32].
For the datasets described above, we test the null hypothesis that the data have exactly one mode against the alternative hypothesis that the data have at least two modes. For this, we consider the excess mass test [33] using the modetest function in the R language [34]. For the inflation rate data, the observed statistic was 0.05 with a p-value equal to 0.05. For the electrical resistance data, these values were 0.074 and 0.01, respectively. Thus, at a significance level equal to 0.05, the null hypothesis is rejected in both datasets; that is, the distributions of the inflation rate and electrical resistance data are at least bimodal.
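We compare the distributions fitted by the ML method using the information criteria AIC, CAIC, and BIC. We also calculate the modified Cramér–von Mises and Anderson–Darling statistics to test the null hypothesis that the data are a random sample from a continuous distribution whose functional form is known but whose parameter vector is unknown. In these tests, the null hypothesis is rejected at a significance level equal to 0.05 if the two statistics exceed their respective critical values.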
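As a rough illustration of how these goodness-of-fit statistics are built, the following Python sketch computes the classical Cramér–von Mises and Anderson–Darling statistics from the fitted cdf values of the observations. The modified versions used in this work apply further adjustments that are not reproduced here, so this sketch only shows the underlying quantities. The function name is hypothetical.

```python
import math


def cvm_ad_statistics(u):
    """Classical Cramer-von Mises (W2) and Anderson-Darling (A2) statistics.

    u: fitted cdf values F(x_i; theta_hat), each strictly between 0 and 1.
    """
    v = sorted(u)
    n = len(v)
    # W2 = 1/(12n) + sum_i (u_(i) - (2i - 1)/(2n))^2, with i = 1, ..., n
    w2 = 1.0 / (12.0 * n) + sum(
        (v[i] - (2 * i + 1) / (2.0 * n)) ** 2 for i in range(n)
    )
    # A2 = -n - (1/n) sum_i (2i - 1) [log u_(i) + log(1 - u_(n+1-i))]
    a2 = -n - sum(
        (2 * i + 1) * (math.log(v[i]) + math.log(1.0 - v[n - 1 - i]))
        for i in range(n)
    ) / n
    return w2, a2
```

Both statistics are small when the fitted cdf values look like an ordered uniform sample and grow as the fitted distribution departs from the data.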
Table 7 reports the ML estimates with the corresponding standard errors for each distribution fitted to the inflation rate and electrical resistance data. In addition, the values of the modified Cramér–von Mises and Anderson–Darling statistics and of the information criteria are reported.
In Table 7, with respect to the inflation rate data, based on the values of the modified Cramér–von Mises and Anderson–Darling statistics, the hypothesis that the data are a random sample from the GSC, GB, LBN, or BN distributions is rejected at a significance level equal to 0.05. In addition, the LGB distribution has the lowest AIC, CAIC, and BIC values among the fitted distributions, indicating that it should be selected over the others for modeling these data.
With respect to the electrical resistance data, the hypothesis that the data are a random sample from the OLLSN, GB, BN, or GSC distributions is rejected at a significance level equal to 0.05. Note that the AIC, CAIC, and BIC values for the LGB, MN, and LBN distributions are close, the values associated with the LBN distribution being slightly lower. Thus, the LBN distribution (which has a smaller parametric dimension than the LGB and MN distributions) is capable of fitting the electrical resistance data as well as the LGB and MN distributions.
In the left panels of Figure 7 and Figure 8, the histograms of inflation rate and electrical resistance are displayed along with the fitted densities. In the right panels of the same figures, the empirical cdf and the fitted cdfs are compared. In these plots, we see that the LGB distribution fits the inflation rate data appropriately, while the LBN distribution fits the electrical resistance data appropriately. Note that the LGB and LBN distributions have one and two fewer parameters, respectively, than the MN distribution, and that despite this fact they are capable of presenting good fits to the analyzed data.
7. Final Comments
This article introduces a new symmetric/asymmetric unimodal/bimodal distribution called the Lambert generalized bimodal (LGB) distribution. Some special cases of the LGB distribution are discussed. One of the special cases, the Lambert bimodal-normal (LBN) distribution, can be considered as an alternative to other symmetric/asymmetric bimodal distributions, including the LGB distribution. The LGB distribution arises from the Lambert-F transformation when the generalized bimodal distribution is taken as the baseline distribution. We study the main structural properties of the LGB distribution, such as the pdf, cdf, qf, and raw moments, which are used to describe its skewness and kurtosis characteristics. Parameter estimation for the LGB distribution is discussed using the ML method. Through simulation experiments, we observe that the ML method provides acceptable estimates of the parameters of the LGB distribution. Furthermore, through simulation experiments, we observe that the LGB distribution can adequately fit datasets generated from the mixture normal distribution, despite having one less parameter. Finally, two applications illustrate that the LGB distribution and its LBN special case can present a better fit to data in real settings than other symmetric/asymmetric unimodal/bimodal distributions such as the odd log-logistic skew-normal (OLLSN), gamma sinh-Cauchy (GSC), and mixture normal (MN) distributions.
As a final consideration, we leave open the question of whether the LGB distribution has an intuitive stochastic representation. In the literature, it is possible to find distribution families that have an attractive intuitive generation mechanism, such as the skew-elliptical (SE) distributions [35] and the closed-skew-normal (CSN) distribution [36]. According to Loperfido et al. [37], any linear combination of the largest and smallest components of a bivariate exchangeable elliptical random vector has a skew-elliptical distribution. According to Loperfido [38], any order statistic from a random vector with exchangeable normal distribution has a closed-skew-normal distribution. As far as we know, no similar property is known for the Lambert-F distributions class or for any of its special cases.