1. Introduction
The different bimodal and skew distributions have been proposed over the last decade to construct flexible distributions. The proposed distributions are in [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25,
26] and references therein via using different generating techniques [
27] to get a probability density function (PDF). In these distributions,
$\epsilon $-skew form of gamma distribution on the real line was proposed by [
5,
6]. The deficiency of these functions is that different height and shape of peakedness around location on the real line cannot be modelled separately. The model proposed by [
6] has a bimodality with the same height, which is not flexible enough to model bimodal data with different height and shape of peakedness. The bimodal and alpha-skew Laplace distribution that does not model shape peakedness around location on the real line was proposed by [
24]. However, the best way is to find a function that can fit data around location separately. In other words, the left and right sides of location will be modelled with different parameters to have an efficient fitting for both sides of the location. A bimodal exponential power (BEP) distribution is proposed by [
28]. The properties of BEP distribution are few when BEP is compared with distribution proposed by [
29] because BEP has the same level of peaks around location on the real line, and it is also symmetric on both sides of the location. The shape of peakedness around location on the real line is modelled by only one parameter; however, two parameters are added in order to model different modes from distribution on the real line [
29]. Two parameters controlling fitting the shape of peakedness and two parameters controlling fitting the height of bimodality will be used together. Skewness parameter is also added to model asymmetry in data. Thus, modelling capacity of asymmetric bimodal exponential power (ABEP) distribution is better than current candidates proposed by [
5,
6,
28,
29] because ABEP distribution has parameters that control the fitting both sides of location separately.
The second aim is that we do not only propose ABEP distribution but also derive this distribution via constructing a normalizing constant (NC), which leads to producing a PDF. While deriving a PDF, producing NC can be a preferable approach. This approach can be taken care for deriving a PDF when one wants to add a new parameter to increase the modelling capacity of function if it is tractable to get NC from a function. The NC approach was examined by [
30] to construct asymmetric distributions from symmetric distributions. Some techniques used to derive a PDF are reviewed by [
27]. There are other techniques to produce PDFs derived from entropy functions via the method of Lagrange multipliers as well [
31,
32] and references therein. The different goodness of fit tests (GOFTs) are applied to the ABEP. Thus, importance and advantage of GOFTs, such as Kolmogorov–Smirnov (KS), Cramér von Mises (CVM), Anderson–Darling (AD) via a cumulative distribution function (CDF) of a PDF will be expressed for ABEP distribution when the optimization problem of ABEP can arise.
In particular, the estimation of location parameter is important—for example, the proteins in cancer cells need to be determined, and the image processing demands for obtaining the quantitative value of colors at a prescribed range. A radar data, speech processing, etc. in many phenomena can be modelled via ABEP. The parametric models that can accommodate the shape of peakedness, bimodality and skewness are mostly preferred to be able to model the data set efficiently. In other words, the frequented data can be represented by the parameters that control fitting the shape of peakedness, the parameters that control fitting the bimodality and the skewness that controls fitting the asymmetry in the data set. Due to this reason, ABEP distribution having these parameters is proposed. In addition, since the generalized gamma distribution is a class for many distributions, it is chosen in order to reflect to the negative side of the real line.
The paper is organized as follows. In
Section 2, ABEP distribution is defined and mode, distributional properties, related distributions and tail behaviour of ABEP distribution are given. Maximum likelihood (ML) estimations of parameters are provided in
Section 3. A brief simulation study is given in
Section 4. In
Section 5, the real data examples are provided to make a comparison among candidate densities. The results are commented. Finally, in the last section, conclusions are given and remarks are considered.
3. Maximum Likelihood Estimations for Parameters of ABEP Distribution
Let
${x}_{1},{x}_{2},...,{x}_{n}$ be a random sample of size
n from an ABEP distributed population. The unknown parameters
$\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0}$ and
$\epsilon $ will be estimated by an ML estimation method [
35]. Here, the parameters
${k}_{1}$ and
${k}_{0}$ are nuisance parameters. The log-likelihood
$log(L)$ function is:
where
${n}_{0}$ is the number of non-negative observations and
${n}_{1}$ is the number of negative observations.
$\widehat{\mathbf{\theta}}=(\widehat{\mu},\widehat{\sigma},{\widehat{\alpha}}_{1},{\widehat{\alpha}}_{0},{\widehat{\delta}}_{1},{\widehat{\delta}}_{0},\widehat{\epsilon})$ are ML estimators of parameter vector
$\theta =(\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0},\epsilon )$.
The second derivative test can be used to determine whether or not the
$log(L)$ function in Equation (
18) has the maximum value. However, since PDF has seven parameters
$\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0}$ and
$\epsilon $, it is not easy to get a Hessian matrix because of two cases and seven parameters in ABEP. One can get it via using the mathematical software programs, which are Maple 18.00 (Maplesoft, Waterloo, ON, Canada) or Mathematica 9.0.1.0 (Wolfram Research, Champaign, IL, USA). It is also noted that ABEP can have a discontinuity point at
$x=\mu $ for some values of parameters. There can be a solution to overcome this problem if we focus on improving the modelling capacity of PDF having more parameters, which help us to increase flexibility of the function. Thus, the efficiency for ML estimators of the parameters
$\mu $ and
$\sigma $ is increased. A solution in an indirect way for this problem is that one can use GOFT statistics, such as KS, CVM and AD to see the distances between expected and empirical cumulative distributions. It is well known that the smaller values of the GOFT statistics mean that more fitting performance accomplished by the function. In the computation process, optimization of nonlinear function in Equation (
18) is conducted via hybrid genetic algorithm (HGA) in MATLAB 2016a (MathWorks, Natick, MA, USA). In HGA, intervals for parameters that will optimize the
$log(L)$ function in Equation (
18) are used. The intervals for
$\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0}$ and
$\epsilon $ are
$[-5,5],[0,5],[0,10],[0,10],[0,10],[0,10]$ and
$(-1,1),$ which is a domain of skewness parameter
$\epsilon $.
${k}_{1}$ and
${k}_{0},$ as nuisance parameters are taken to be
${\alpha}_{1}$ and
${\alpha}_{0}$. This form is appropriate to have the same form of normal and Laplace. Let us remind that ABEP is a generalized normal or Laplace distribution. Thus,
${k}_{1}$ and
${k}_{0}$ are nuisance parameters.
The Fisher information matrix for parameters
$\mu $ and
$\sigma $ from ABEP is given by matrix
F in the following form:
The Cramér–Rao lower bounds (RCLBs) for ML estimators of parameters are given. The Monte Carlo numerical integration is used to compute the integrals in Fisher information in Equation (
19) for RS, ESC and ASL distributions.
Equations (
6) and (
9) are used to calculate the integrals in matrix
F. Due to the analytical expression of PDF in Equation (
12), off-diagonal elements of matrix
F are non-zero. Here, shape
${\alpha}_{1}$,
${\alpha}_{0}$, bimodality
${\delta}_{1}$,
${\delta}_{0}$, skewness
$\epsilon $ and nuisance
${k}_{1}$,
${k}_{0}$ parameters make a covariance structure between location
$\mu $ and scale
$\sigma $ parameters. From this result, covariance structure on ML estimators of other parameters can be seen. Since it is possible to obtain the covariance among ML estimators, the Fisher information matrix is obtained only for the ML estimators of two parameters
$\mu $ and
$\sigma $. If there can be a covariance among ML estimators, the inverse of matrix
F cannot be obtained except the generalized inverse. Note that getting matrix
F for
$\mu $ and
$\sigma $ from ABEP is tractable for the calculation of integration of Fisher information. Using the generalized inverse cannot be preferable due to a loss of information in an inverse of a matrix. Loss of information occurs because the multiplication of generalized inverse of matrix
F and
F, that is,
${F}^{-}F$, does not give an identity matrix [
53]. When
${\alpha}_{1}={\alpha}_{0}=\alpha $,
${\delta}_{1}={\delta}_{0}=\delta $,
$\epsilon =0$ and
${k}_{1}={k}_{0}=k$,
${\mathbb{E}}_{1}\left(\right)open="["\; close="]">\frac{{\partial}^{2}log[f(x;\mu ,\sigma )]}{\partial \mu \partial \sigma}=0$, that is, the covariance between ML estimators of
$\mu $ and
$\sigma $ from ABEP is zero:
Some of the regularity conditions [
35] are as follows:
det$[F(\mu ,\sigma )]<\infty $ and
$|\frac{{\partial}^{3}}{\partial {\theta}^{3}}logf(x;\theta )|\le M(x).$ Then, $\mathbb{E}[M(X)]<\infty $.
One can verify that the conditions can be satisfied by using the mathematical software programs, which are Maple 18.00 (Maplesoft, Waterloo, ON, Canada) or Mathematica 9.0.1.0 (Wolfram Research, Champaign, IL, USA). Here, it is possible to get
$M(X)$ as
${X}^{r}$ in Equation (
15). Then, the condition 2 is satisfied. The other regularity conditions are already satisfied obviously. Since the ABEP distribution satisfies these two conditions,
that is,
$\sqrt{n}\left(\right)open="("\; close=")">\left[\begin{array}{c}\widehat{\mu}\\ \widehat{\sigma}\end{array}\right]-\left[\begin{array}{c}\mu \\ \sigma \end{array}\right]$ is asymptotically normal with mean zero vector and covariance matrix
${[F(\mu ,\sigma )]}^{-1},$ and
$\widehat{\mu}$,
$\widehat{\sigma}$ are asymptotically efficient and asymptotic normally distributed [
35].
4. Simulation Study
In this section, a brief simulation study to verify the behaviour of the ML estimators in finite samples is presented. The data were drawn based on the stochastic representation given in
Appendix A. The arbitrary values for the parameters are chosen. The probable handling for the values of parameters is considered such that the different combination for the values of
${\alpha}_{1}$,
${\alpha}_{0}$,
${\delta}_{1}$ and
${\delta}_{0}$ can be observable. Thus, an asymmetry and a bimodality can be constructed.
Table 1 has the chosen values for the parameters. Three sample sizes are utilized: 100, 200 and 500. Based on 1000 replicates for each sample size, the average bias and the root of the mean squared error (RMSE) are computed and they are given in
Table 1. HGA is used to optimize the
$log(L)$ function in Equation (
18) according to the parameters
$\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0}$ and
$\epsilon $. In the computation process of HGA, the intervals for
$\mu ,\sigma ,{\alpha}_{1},{\alpha}_{0},{\delta}_{1},{\delta}_{0}$ and
$\epsilon $ are
$[-5,5],[0,5],[0,10],[0,10],[0,10],[0,10]$ and
$(-1,1)$.
In general, it is observed from the results of simulation that the bias can be acceptable and it can decrease when the sample size increases. When the sample size n increases, the values of $\widehat{\mathrm{RMSE}}$ go to zero because the function is represented well by the artificial data, which is an expected result.
5. Real Data Illustration
In this section, the modelling capability of ABEP is shown by applying it to two data sets from microarray [
54]. The real data sets from the web site in [
54] are given in
Table A1 and
Table A2 in
Appendix B. The analysis of proteins in cancer cells is important. The efficient estimates of location and scale parameters for these proteins have a crucial role in medical care. For this reason, we prefer to focus on these data sets that have different shapes of peakedness, bimodality and asymmetry.
In the second step, the distributions are considered to model these data sets. In the estimation process, we use the maximum likelihood method together with GOFT statistics, mostly prominent ones that are KS, CVM and AD (robust one) distances to test the fitting capability of distributions [
55]. When the estimates of parameters are computed, we can examine via GOFT statistics which of the five PDFs is the best fit on data.
The bimodal extended generalized gamma (BEGG) [
29], the Rathie-Swamee (RS) (RS is also known to be a modified version of generalized logistic) [
11,
12,
13], the exponentiated sinh Cauchy (ESC) [
10] and the alpha-skew Laplace (ASL) [
24] distributions are used to fit the data and make a comparison between them and ABEP. There are many different distributions that have been proposed; however, using an explicit expression for CDF would be preferred to fit the data. For this reason, the distributions having an explicit expression for their CDFs are used. Thus, GOFTs can be used without including the numerical integration methods having computational errors.
Modelling data (or Riemann integration in randomly putting the bin of histograms on the real line) is equivalent to an integration. Thus, the discontinuity at
$x=\mu $ is not a problem for estimations of parameters. For computation, the HGA is used. HGA also includes the derivative free approach [
56] for optimization. Then, the discontinuity point
$x=\mu $ is not a problem for optimization of
$log(L)$ function in Equation (
18) according to parameters. At the same time, GOFT statistics are used while performing the computation process.
The algorithm given in
Appendix A allows for generating a random variable that is distributed according to a PDF given in Equation (
12). Thus, the performance of fitting can be checked via the counted data at the prescribed ranges of domain. However, this procedure is rough when it is compared with GOFTs. It is also beneficial to observe the performance of the random number generation procedure.
The number of replicated sample size n is 100,000. Data generated from ABEP, BEGG and ESC distributions are sorted from small to big values for each sample size n. After sorting, arithmetic mean of 100,000 artificial data is obtained for $n=118$. After artificial data are generated from their corresponding PDFs, it is also possible to check the fitting performance of these functions via the artificial data (see Tables 4 and 7). Since ABEP, BEGG and ESC are competitive distributions for fitting data and they have a random number generation procedure, they are preferred to check their similarities with real data. In two examples given in the following subsections, Tables 2 and 5 give the ML estimates of parameters in PDFs and GOFT statistics for these estimated values of parameters. Tables 3 and 6 show the asymptotic variances and covariances of ML estimators for the parameters $\mu $ and $\sigma $. The Monte Carlo numerical integration method is used while performing the computation process of integration in the Fisher information matrix. Tables 4 and 7 represent the counted data at the prescribed ranges of domain.
5.1. Example 1: Modelling Shape of Peakedness, Bimodality and Asymmetry
The data labelled as “Homo sapiens Pig7 (PIG7) mRNA, complete cds Chr.16 [381663, (EW), 5’:AA059047, 3’:AA059031]” from microarray are modelled by ABEP, BEGG, RS, ESC and ASL distributions.
Table 2 gives the ML estimates of parameters in these distributions and GOFT statistics for these estimated values of parameters
$\mu $,
$\sigma $,
${\alpha}_{1}$,
${\alpha}_{0}$,
${\delta}_{1}$,
${\delta}_{0}$ and
$\epsilon $.
Table 3 shows the asymptotic variances and covariances of ML estimators for the parameters
$\mu $ and
$\sigma $.
Table 4 represents the counted data at the ranges [−10, −0.3, −0.1, 0, 0.1, 0.3, 10].
5.2. Example 2: Modelling Shape of Peakedness, Bimodality and Asymmetry
The data from microarray labelled as “SID 377353, ESTs [5’:, 3’:AA055048]” are modelled by ABEP, BEGG, RS, ESC and ASL distributions.
Table 5 gives the ML estimates of parameters in these distributions and GOFT statistics for these estimated values of parameters
$\mu $,
$\sigma $,
${\alpha}_{1}$,
${\alpha}_{0}$,
${\delta}_{1}$,
${\delta}_{0}$ and
$\epsilon $.
Table 6 shows the asymptotic variances and covariances of ML estimators for the parameters
$\mu $ and
$\sigma $.
Table 7 represents the counted data at the ranges [−10, −0.4, −0.2, 0, 0.2, 0.4, 10].
5.3. Comments on the Results of Examples 1 and 2
For both of the examples,
Figure 3a and
Figure 4a show that ABEP fits better than the other distributions. In particular, the modalities around location have been modelled as the different modes of heights. The shape of peakedness can be modelled as well. The right side of the location is especially modelled very well by ABEP in Example 2. The asymmetry illustrated in Example 1 has been modelled.
The histograms of data in Example 2 do not show an asymmetry and the ML estimate of skewness parameter is very near to zero because, as it is seen from
Figure 4a, the histograms do not have an asymmetry when they are compared with histograms in
Figure 3a. The unequally distributed histograms around location in
Figure 3a can show that there is an asymmetry in the data set.
For both of the examples,
Table 2 and
Table 5 give the ML estimates of parameters of distributions and GOFT statistics of fitted densities. ABEP distribution has the best fitting on data when we consider the values of KS and CVM statistics. When we look at the fitting performance for all distributions from
Figure 3a and
Figure 4a, it is seen that ABEP, BEGG and ESC have better fitting performance than RS and ASL. However, when ABEP and ESC are compared, it is observed that two parameters
$\lambda $ and
$\beta $ of ESC are not enough to get the precise fitting on data because these parameters work together around the location. In BEGG, there is only one parameter
$\alpha $ to control the fitting shape of function on the real line. In ABEP, the role of parameters
${\alpha}_{1},{\alpha}_{0},{\delta}_{1}$ and
${\delta}_{0}$ around the location is constructed definitely. Thus, more precise estimates for parameters
$\mu $ and
$\sigma $ can be obtained through these parameters if the data are from many phenomena.
It is well known that the probability value (p-value) of a test statistic depends on the fitted density. For this reason, more efficient density must be preferred before getting the p-value of a test statistic from corresponding density. Then, the potential problem that can occur in future from phenomena can be refrained. The estimates of $\mu $ from fitted densities of ABEP, BEGG, RS and ESC can be close to each other, but the estimates of $\mu $ of ABEP are more precise because ABEP is the best one for fitting on data. Similarly, the estimates of $\sigma $ of ABEP for both of examples are the best ones.
The random number generation procedure can be conducted in a more precise way for ABEP, BEGG and ESC distributions because ABEP and BEGG have an algorithm of random number generation in
Appendix A. The inverse of CDF of ESC distribution [
10] can be taken to get the random numbers from ESC. The artificial data generated from ABEP distribution also show that the counted artificial data at ranges can be similar to the counted real data at ranges (see
Table 4 and
Table 7). It is noted that the mostly counted data (the numbers 37 and 62 in Examples 1 and 2, respectively) at an interval for real data are constructed by the artificial data generated from ABEP distribution for the prescribed ranges on the real line. The counted artificial data from ABEP represent the counted real data when they are compared with the counted artificial data from BEGG and ESC. Thus, we can infer that the data generation procedure is also successful after we get the precise estimates of parameters in ABEP via collaboration with GOFT statistics.
GOFT statistics in
Table 2 and
Table 5 show that there can be a numerical error in the computation of special function from CDF of ABEP. The AD for ABEP can have a numerical error from the computation of CDF because CDF of ABEP is a special function. Even if CDF of ABEP depends on special functions that are incomplete gamma functions, the fitting performance of ABEP is the best one due to the fact that all possible parameters (shape, bimodality and skewness) are added into ABEP.
6. Conclusion and Discussion
A family for bimodal distribution with two parameters fitting the shape of peakedness (${\alpha}_{1}$ and ${\alpha}_{0}$), two parameters fitting the height of bimodality (${\delta}_{1}$ and ${\delta}_{0}$) and a parameter fitting the asymmetry ($\epsilon $) in the data set have been proposed. The unimodal case of this family is obtained when ${\delta}_{1}={\delta}_{0}=0$. The skewness parameter in this family is from the $\epsilon $-skew approach, which can produce the asymmetry around location. The importance of having these parameters in ABEP for modelling around locations separately has been observed when we make a comparison among ABEP, BEGG, RS and ESC distributions that have explicit expression for CDF. As a result, ABEP can model efficiently the shape of peakedness, the bimodality and the asymmetry at the same time because ABEP has parameters that are responsible for fitting the shape of peakedness, the bimodality and the asymmetry in data when it is compared with BEGG, RS, ESC and ASL distributions.
The well known approach that derives PDFs without consulting the variable transformation technique is applied for the tractable functions in Equations (
6)–(
11) to propose a new distribution. It is clear that this approach can be applied for other kinds of distributions that are on the negative, positive or real lines. The disadvantage of this approach is that the analytical expression of a function must be tractable to derive a PDF. Equations (
6)–(
8) are the power version of gamma, lower and upper incomplete gamma functions. The functions in Equations (
9)–(
11) are transferred to the negative side of the real line through using functions in Equations (
6)–(
8). They are a new kind of the special functions to calculate the integrals having the kernel of gamma function. One can get distributions via these functions. For example, alpha-skew Laplace [
24], alpha–beta skew normal [
3], alpha-skew generalized
t with variable transformation [
57,
58], symmetric and asymmetric EP [
40,
41,
42,
43,
44,
45] distributions with the recalculated NC can also be obtained by these special functions. The special cases, the related distributions and the flexibility of ABEP are given in the relevant section.
An algorithm for generating artificial data from ABEP is provided. Thus, the similarity between artificial and real data sets has been observed as a rough approach and the performance of optimization for the $log(L)$ function and GOFTs can be supported by this similarity as well. The benefit of GOFTs is depicted when a PDF has more parameters. The best performance on optimizing the $log(L)$ function can also be checked by GOFT statistics. Thus, if CDF of a PDF exists, using GOFTs as an indirect way to check the potential optimization problem(s) is provided when the second derivative test is a problem for getting the Hessian matrix with respect to parameters of $log(L)$ function. HGA is also used to overcome the problems that can occur while performing optimization of $log(L)$ function according to the parameters in ABEP. As a result, performing a cross check between the optimization tool HGA and the GOFT statistics is a beneficial approach to overcome the potential problem(s) from the computation process. Thus, more precise ML estimates for parameters can be gained. When it is considered on overall results from illustrating of PDF and CDF and also artificial data, the GOFT statistics and these results support each other to show the fitting performance of ABEP. The brief simulation study gives the satisfactory results for the bias and RMSE.
RCLBs for ML estimators of parameters $\mu $ and $\sigma $ are obtained. The properties of ABEP are provided and so the heavy-tailedness property of ABEP distribution has been examined. The heavy-tailedness of ABEP from Definitions 1 and 2 are guaranteed when $b>a$ in the $\gamma $ function. Definitions 1 and 2 imply that ABEP can be a heavy-tailed distribution together with that comment in there.
The entropy-based parameter estimation for ABEP is an ongoing issue in [
31,
32] to study via the proposed special functions in Equations (
6)–(
11). We will introduce the information theoretic model selection criteria [
59] to fit the models, and the results will be published separately. In the future, a package in a statistical software R from open access will be prepared for ABEP distribution with different estimation methods [
31,
55], and the information theoretic model selection criteria will be added into this package.