An Asymmetric Bimodal Distribution with Application to Quantile Regression

In this article, we study an extension of the sinh Cauchy model in order to obtain asymmetric bimodality. The behavior of the distribution may be either unimodal or bimodal. We calculate its cumulative distribution function and use it to carry out quantile regression. We calculate the maximum likelihood estimators and carry out a simulation study. Two applications are analyzed based on real data to illustrate the flexibility of the distribution for modeling unimodal and bimodal data.


Introduction
It frequently occurs in real life that we find continuous data that are bimodal; these cannot be modeled by known unimodal distributions.It is therefore of interest to investigate more flexible distributions in modes that will be useful for professionals working in different areas of knowledge.
In unimodal distributions, the flexibility is based on the asymmetry and kurtosis of the data.In this context, Azzalini [1] introduced the skew-normal (SN) distribution, with asymmetry parameter λ.It has a probability density function (pdf) given by f (y; µ, σ, λ) = 2 where φ and Φ denote, respectively, the density and cumulative distribution functions of the N (0, 1)  distribution.This is denoted as Y ∼ SN(λ).SN(0) becomes the standard normal distribution.Bimodal distributions generated from skew distributions can be found in Ma and Genton [2], Kim [3], Lin et al. [4,5], Elal-Olivero et al. [6], Arnold et al. [7], Arnold et al. [8], and Venegas et al. [9], among others.The importance of studying these distributions is based on the fact that they do not have identifiability problems and can be used as alternative parametric models to replace the use of mixtures of distributions that present estimation problems from either the classical or the Bayesian point of view (see McLachlan and Peel [10]; Marin et al. [11]).One difficulty with these distributions is that in general, there is no closed-form expression for their cumulative distribution function (cdf).This makes it more difficult to generate data from these distributions for simulation studies or to carry out quantile regression.Additionally, many such bimodal distributions have complicated expressions for a general quantile (say, the q-th).
A variety of bimodal data sets and appropriate models have been presented by many authors.For example, Cobb et al. [12] used the quartic exponential density presented by Fisher [13] to model crude birth rates data; Rao et al. [14] used a bimodal distribution to analyze fish length data; Famoye et al. [15] used the beta-normal distribution to analyze egg diameter data; Everitt and Hand [16] discussed some mixture distributions for modeling bimodal data; Chatterjee et al. [17] and Weisberg [18] presented two bimodal data sets on the eruption and interruption times of the Old Faithful geyser; Bansal et al. [19] discussed the bimodality of quantum dot size distribution; Famoye et al. [15] cited a variety of bimodal distributions that arise from different areas of science.On the other hand, the sinh Cauchy (SC) distribution is given by , where Λ = (λ, µ, σ), z = y−µ σ , z ∈ R, µ ∈ R is a location parameter, σ > 0 is a scale parameter, and λ > 0 is a symmetric parameter.The SC distribution produces unimodal and bimodal densities.The disadvantage of the SC distribution is that it is symmetric, which limits it to modeling only symmetric bimodal data.The main objective of this article is therefore to study a bimodal skew-symmetric model with closed cdf, in order to apply it to quantile regression.To do this, we used an extension of the SC distribution that we call the gamma-sinh Cauchy (GSC) distribution, which presents flexibility in its modes and also closed-form expression in its cdf.The GSC distribution belongs to the (gamma-G generator) family introduced by Zografos and Balakrishnan [20].For any baseline cdf G(y; Λ), x ∈ R, they defined the gamma-G generator by the pdf and cdf given by and respectively, where φ > 0 is a skewness parameter, Λ is a vector of parameters, g(y) = d dy G(y), γ(y, a) = y 0 t a−1 e −t dt is the incomplete gamma function, and Γ(a) = γ(+∞, a) is the usual gamma function.We remark that in the literature, there are many models that can accommodate bimodal distributions.However, in only a few of them do the parameters have an interpretation in terms of measures of central tendency (mean, median, for instance) or a general q-th quantile.As we will show in Section 3, the main advantage of the GSC is that the location parameter represents the respective q-th quantile under a certain restriction over φ, which is very convenient for the use of this model in a quantile regression framework.
The paper is organized as follows.Section 2 develops the GSC distribution, its basic properties, and quantile regression.In Section 3, we perform a small-scale simulation study of the maximum likelihood (ML) estimators for parameters.Two applications to real data are discussed in Section 4, which illustrate the usefulness of the proposed model.Finally, conclusions are given in Section 5.

Gamma-Sinh Cauchy Distribution
The GSC distribution is obtained considering G in (2) as the cdf of the SC distribution.The pdf can be written as where Θ = (φ, λ, µ, σ), and φ > 0 is an asymmetric parameter.We denoted this by Z ∼ GSC(φ, λ, µ, σ).
The following proposition states conditions for the symmetry of the GSC distribution.
The unimodal and bimodal regions for GSC(φ, λ, µ, σ) are illustrated in Figure 1.We can see that for all φ, there is λ such that GSC is bimodal.Figure 2 shows the density function for some values of the parameters φ and λ, considering the location and scale parameters fixed at 0 and 1, respectively.The distribution assumes symmetric unimodal and bimodal shapes and asymmetric unimodal and bimodal shapes.Figure 3 shows the skewness and kurtosis coefficients for the GSC model under different values of λ and φ (such coefficients do not depend on µ and σ).As illustrated previously, the model can assume positive and negative values for the skewness coefficient and can also accommodate kurtosis coefficients lower than, equal to, and greater than the normal model (<3, =3 and >3, respectively).The GSC Model for Quantile Regression From Equation ( 5), it follows that the cdf of the GSC distribution evaluated in µ is given by where G(y, a) = γ(y, a)/Γ(a) corresponds to the cdf of the gamma distribution with shape and scale parameters a and 1, respectively.Note that G (log(2), φ) depends only on φ (and not on σ or λ).As G(log(2), φ) is an increasing function in terms of φ and G(•, φ) ∈ (0, 1), the equation G(log(2), φ) = q has an unique solution for q ∈ (0, 1), which can also be written as Equation ( 7) can be solved numerically.For instance, in R the uniroot function can be used.Table 1 shows some values for φ(q) with different values for q.For this reason, for a fixed q, if we take φ = φ(q) satisfying ( 7), by ( 6) the parameter µ directly represents the q-th quantile, allowing regression to be performed conveniently even though µ.Under this setting, a set of available p-covariates, say x i = (x i1 , . . ., x ip ), for i = 1, . . ., n, can be introduced as follows: This is a convenient property of the GSC distribution because it provides a simple way to performing quantile regression in a model that can be unimodal or bimodal, depending only on parameter λ (because φ = φ(q) is considered as fixed in this setting).
As far as we know, there is no model in the literature that is parameterized conveniently in terms of the q-th quantile and can also be unimodal or bimodal.Figure 1 shows that for any φ = φ(q) fixed in the GSC, there is an interval Λ (1) φ(q) for λ where the distribution is unimodal and an interval Λ where the distribution is bimodal.

ML Estimation
Consider y 1 , y 2 , . . ., y n as a size n random sample from the pdf GSC(φ, λ, µ, σ).Hence, the log-likelihood function is given by where z i = y i −µ σ .To compute the ML estimation for Θ = (φ, λ, µ, σ), (8) must be maximized.That is, we have to solve the following system of equations: δl δφ , δl δλ , δl δµ , and δl δσ .More precisely, we have to solve where t i = 1 2 − 1 π arctan{λ sinh(z i )}, and Ψ(•) is the digamma function.The system of equations given above can be solved using numerical procedures such as the Newton-Raphson procedure.An alternative is to use the NumDeriv routine with the R software (R Core Team [22]).

Simulation Study
In this Section, we present a brief simulation study to assess the performance of MLE in the GSC model.To draw values from the model, we used the inversion method.If U ∼ U(0, 1), then where G −1 (•; φ) is the inverse of the cdf of the gamma distribution with shape and rate parameters equal to φ and 1, respectively.In all scenarios, we considered µ = 0 and σ = 1, three values for φ-0.75, 1, and 1.5-and three values for λ-0.5, 1, and 2. We also considered three sample sizes: 100, 200, and 500.For each combination of the parameters, we drew 1,000 samples and computed the ML estimates.Table 2 summarizes the results considering the average of the bias (bias), the root of the estimated mean squared error (RMSE), and the 95% coverage probability (CP).Note that in all cases, the bias and RMSE decreased when the sample size was increased, suggesting that the estimators are consistent.Finally, we also remark that the coverage probability converged to the nominal values used for the construction of the confidence intervals when the sample size was increased, suggesting that the normality for the ML estimates is reasonable in sample sizes.

Applications
In this section, we carry out two applications to real data, the first using the GSC model without covariates and the second applying quantile regression to uni-and bimodal data.

Application 1: Without Covariates
The first application reported is for the data set consisting of 1150 heights measured at 1 micron intervals along the drum of a roller (i.e., parallel to the axis of the roller).This was part of an extensive study of roller surface roughness.It is available for downloading at http://lib.stat,emu.edu/jasadata/laslett.Table 3 presents summary statistics for the data set where b 1 and b 2 correspond to the sample asymmetry and kurtosis coefficients, respectively.We fitted this using the SN model, the exponentiated sinh Cauchy (ESC) model (see Cooray [23]), and the GSC model.A summary of these fits is presented in Table 4. Based on the AIC criteria, the GSC provided the best fit for the height data set.Figure 4 shows plots of the density functions for the fitted models using the MLEs for SN, ECG and GSC distributions.

Data Set 2: Quantile Regression to Bimodal Data
The second application we consider is the Australian data set available in the sn package in R.This data set is related to 102 male and 100 female athletes collected at the Australian Institute of Sport.The linear model considered is where Bfat i is the body fat percentage for the i-th athlete and bmi i and lbm i are the covariates body mass index and lean body mass for the i-th athlete, respectively.In addition, i (q) ∼ GSC(0, σ, λ, φ(q)), φ(q) satisfies Equation ( 7), and q ∈ (0, 1) is the fixed quantile that is being modeled.This data set was also analyzed in Martínez-Flórez et al. [24] using a bimodal regression model.However, the authors modeled the mean of the distribution.In our approach, we model the 0.1, 0.25, 0.5, 0.75, and 0.9 percentiles of the distribution, which provides a more informative scenario to explain body fat in terms of the body mass index and lean body mass.Our approach is compared with the skewed Laplace (SKL) and skewed Student-t (SKT) models discussed in Galarza et al. [25], where the authors proposed a flexible model in a quantile regression model context.Table 5 shows the AIC for those models considering different quantiles.We also present the p-value for the Kolmogorov-Smirnov (K-S) test of the hypothesis that the respective quantile residuals came from the standard normal distribution.P-values greater than 5% suggest that with this significance level, the standard normal assumption is reasonable for those residuals, in which case the model would be appropriate for this data set.Note that based on the AIC criteria, the GSC presents a better fit for this data set, except for the median regression (q = 0.5).On the other hand, based on the p-value for the K-S test applied to the quantile residuals, we conclude that GSC, SKL, and SKT are appropriate models for q = 0.25 and q = 0.5 (the GSC and SKT provide a better fit based on the AIC criteria).However, for q = 0.1 and 0.75, GSC provides the better fit because the p-values are (significantly) greater than 0.05.Finally, for q = 0.9 no model seems appropriate, but based on the p-values, the GSC provides a better fit than SKL and SKT distributions. Figure 5 shows the regression coefficients for the quantile regression presented in Equation ( 9) and their respective 95% confidence intervals.Note that body mass index and lean body mass are significant in explaining all the quantiles modeled.
Figure 6 shows the profile density for the q-th quantile of body fat percentage for q = 0.1 and q = 0.75.Note that the distribution of the 0.1 quantile is unimodal, and the distribution of the 0.9 quantile is bimodal.

Final Comments
This paper proposes the GSC distribution, which is flexible in its modes and contains the SC distribution as a special case.We implemented ML estimation in a quantile regression, obtaining significant results.In the applications to real data it performs very well, better than potential rival models.Some further characteristics of the GSC distribution are:

•
The GSC distribution contains the SC and hyperbolic secant models as special cases.

•
The GSC distribution presents great flexibility in its modes, as can be observed in Figure 1.

•
The proposed model has a closed-form expression for its cdf.

•
In the two applications, we show that the GSC model fits better than the other models.

Figure 4 .
Figure 4. Fitted models for roller data set.

Figure 6 .
Figure 6.Distribution for 0.1 and 0.75 quantiles of body fat percentage considering body mass index and lean body mass equal to 22.96 and 64.87, respectively.Curves in black, red, and green represent the density functions estimated by the GSC, SKL, and SKT models, respectively.

Table 1 .
Some values for φ(q) in terms of q.

Table 3 .
Descriptive statistics for the data set

Table 4 .
Maximum likelihood (ML) estimates for the GSC, exponentiated sinh Cauchy (ECG), and skew-normal (SN) models for the roller data set.

Table 5 .
AIC and p-value for K-S test in the ais data set for the GSC, skewed Laplace (SKL), and skewed Student-t (SKT) models and different quantiles.