Asymmetric Bimodal Exponential Power Distribution on the Real Line

The asymmetric bimodal exponential power (ABEP) distribution is an extension of the generalized gamma distribution to the real line via adding two parameters that fit the shape of peakedness in bimodality on the real line. The special values of peakedness parameters of the distribution are a combination of half Laplace and half normal distributions on the real line. The distribution has two parameters fitting the height of bimodality, so capacity of bimodality is enhanced by using these parameters. Adding a skewness parameter is considered to model asymmetry in data. The location-scale form of this distribution is proposed. The Fisher information matrix of these parameters in ABEP is obtained explicitly. Properties of ABEP are examined. Real data examples are given to illustrate the modelling capacity of ABEP. The replicated artificial data from maximum likelihood estimates of parameters of ABEP and other distributions having an algorithm for artificial data generation procedure are provided to test the similarity with real data. A brief simulation study is presented.


Introduction
The different bimodal and skew distributions have been proposed over the last decade to construct flexible distributions. The proposed distributions are in  and references therein via using different generating techniques [27] to get a probability density function (PDF). In these distributions, ε-skew form of gamma distribution on the real line was proposed by [5,6]. The deficiency of these functions is that different height and shape of peakedness around location on the real line cannot be modelled separately. The model proposed by [6] has a bimodality with the same height, which is not flexible enough to model bimodal data with different height and shape of peakedness. The bimodal and alpha-skew Laplace distribution that does not model shape peakedness around location on the real line was proposed by [24]. However, the best way is to find a function that can fit data around location separately. In other words, the left and right sides of location will be modelled with different parameters to have an efficient fitting for both sides of the location. A bimodal exponential power (BEP) distribution is proposed by [28]. The properties of BEP distribution are few when BEP is compared with distribution proposed by [29] because BEP has the same level of peaks around location on the real line, and it is also symmetric on both sides of the location. The shape of peakedness around location on the real line is modelled by only one parameter; however, two parameters are added in order to model different modes from distribution on the real line [29]. Two parameters controlling fitting the shape of peakedness and two parameters controlling fitting the height of bimodality will be used together. Skewness parameter is also added to model asymmetry in data. Thus, modelling capacity of asymmetric bimodal exponential power (ABEP) distribution is better than current candidates proposed by [5,6,28,29] because ABEP distribution has parameters that control the fitting both sides of location separately.
The second aim is that we do not only propose ABEP distribution but also derive this distribution via constructing a normalizing constant (NC), which leads to producing a PDF. While deriving a PDF, producing NC can be a preferable approach. This approach can be taken care for deriving a PDF when one wants to add a new parameter to increase the modelling capacity of function if it is tractable to get NC from a function. The NC approach was examined by [30] to construct asymmetric distributions from symmetric distributions. Some techniques used to derive a PDF are reviewed by [27]. There are other techniques to produce PDFs derived from entropy functions via the method of Lagrange multipliers as well [31,32] and references therein. The different goodness of fit tests (GOFTs) are applied to the ABEP. Thus, importance and advantage of GOFTs, such as Kolmogorov-Smirnov (KS), Cramér von Mises (CVM), Anderson-Darling (AD) via a cumulative distribution function (CDF) of a PDF will be expressed for ABEP distribution when the optimization problem of ABEP can arise.
In particular, the estimation of location parameter is important-for example, the proteins in cancer cells need to be determined, and the image processing demands for obtaining the quantitative value of colors at a prescribed range. A radar data, speech processing, etc. in many phenomena can be modelled via ABEP. The parametric models that can accommodate the shape of peakedness, bimodality and skewness are mostly preferred to be able to model the data set efficiently. In other words, the frequented data can be represented by the parameters that control fitting the shape of peakedness, the parameters that control fitting the bimodality and the skewness that controls fitting the asymmetry in the data set. Due to this reason, ABEP distribution having these parameters is proposed. In addition, since the generalized gamma distribution is a class for many distributions, it is chosen in order to reflect to the negative side of the real line.
The paper is organized as follows. In Section 2, ABEP distribution is defined and mode, distributional properties, related distributions and tail behaviour of ABEP distribution are given. Maximum likelihood (ML) estimations of parameters are provided in Section 3. A brief simulation study is given in Section 4. In Section 5, the real data examples are provided to make a comparison among candidate densities. The results are commented. Finally, in the last section, conclusions are given and remarks are considered.

Gamma Distribution: Reparametrization and ABEP Distribution on Real Line
The random variable Y will have a gamma distribution with PDF having parameters δ+1 α : Theorem 1. Let Y be a continuous random variable defined on [0, ∞), distributed as G( δ+1 α ). Consider a discrete random variable T, which generates a function on the real line. Then, unequal probabilities at negative and positive sides of the real line will be constructed. T is 1 + ε with the probability 1+ε 2 on the positive side and T is −(1 − ε) with the probability 1−ε 2 on the negative side. A variable transformation Z = Y 1/α T is applied to get the α power of Gamma distribution. Here, the random variables Y and T are independent [5,29]. After applying this transformation on gamma distribution in Equation (1), we will get the following PDF: where the parameters α > 0, δ > 0 and ε ∈ (−1, 1) [29]. The random variable T keeps being PDF, which will be generated because the gamma distribution has a PDF defined at the interval [0, ∞). The probabilities of (1 + ε) and −(1 − ε) values of random variable T are 1+ε 2 and 1−ε 2 , respectively [30,33]. Thus, a function in Equation (2) has the unequal probabilities on positive and negative sides of the real line. The following PDF from a function in Equation (2) will be proposed: where the parameters α 1 > 0, α 0 > 0, δ 1 > 0, δ 0 > 0, k 1 > 0, k 0 > 0 and ε ∈ (−1, 1). Without consulting the variable transformation technique, PDF can be obtained. This PDF is called an asymmetric bimodal exponential power distribution (ABEP). α 1 and α 0 are for the shape of peakedness, δ 1 and δ 0 are for height of bimodality on negative and positive sides of the real line. k 1 and k 0 are nuisance parameters to have the same form of normal or Laplace distributions. ε is a skewness parameter that is responsible for having unequal probabilities at negative and positive sides of the real line. Thus, a skewness on a function can be constructed. The details for functions in Equation (3) are given by the following proof.
Proof. The preliminary tools for the calculation of integrals are required. The gamma function and the incomplete gamma functions are used to have integral kernels, which are appropriate for calculating the integrals. Thus, we can derive a PDF: where These are the gamma, the lower and upper incomplete gamma functions, respectively [34].
The reparametrization of gamma function is considered as: A variable transformation x = (yp) α is applied to get the power version of gamma function: From Equation (4), γ(s * , α * ) = Γ(s * ) − Γ(s * , α * ). Now, let s * be s + 1/α and α * = (pk) α . Then, γ(s + 1/α, (pk) α ) = (pk) α 0 x s+1/α−1 exp{−x}dx. Now, the variable transformation x = (yp) α is applied to the power version of the lower incomplete gamma function: From Equation (4), Γ(s * , α * ) = Γ(s * ) − γ(s * , α * ). Now, let s * be s + 1/α and α * = (pk) α . Then, Γ(s + 1/α, (pk) α ) = ∞ (pk) α x s+1/α−1 exp{−x}dx. Now, the variable transformation x = (yp) α is applied to the power version of the upper incomplete gamma function: Equations (6)-(8) are power versions of gamma functions defined on the positive axis. These three functions can be transferred to the negative axis via the variable transformation y = −u. For Equation (6), For Equation (7), For Equation (8), For two cases of x < 0 and x ≥ 0, we have the integrals of Equation (3). Hence, Equations (6) and (9) can be used to calculate these integrals. One can easily show that the integrated values of negative and positive sides of Equation (3) are 1/2, respectively. Due to the fact that we must have a PDF defined on the real line, the summation of these two results is 1. Here, the variable transformation technique is not used. Thus, we can guarantee that the function gotten is on the interval [0, 1]. It is well known that if a function is defined on the interval [0, 1], this function will be a PDF.

Mode of a Kernel Function in ABEP
The mode of function in Equation (12) is examined. It is obvious that this function is a reflected function in Equation (3) that comes from the reparameterized gamma function with the power parameter α. Thus, examining the mode of the positive side of Equation (3) means that the negative side of Equation (3) is also examined. Now, it is examined whether or not there is one root of the following function: Here, we will give comments about getting the root of this function: NC can be ignored because NC produces a function at interval [0, 1]. It does not affect the modes of function. At the same way, the location parameter µ can be ignored because the location shows where the function in Equation (12) is located. The scale σ, its variants k 0 or k 1 and ε parameters change the rescaling of the function in Equation (12).
The root of derivative of the function in Equation (13) with respect to t is exp{α −1 0 log(α −1 0 δ 0 )}. For t = 0, h(t) = 0 is the obvious root that does not lead to modality. Thus, there is only one root of function in Equation (13), that is, there is one mode of function of generalized gamma on the positive side. Since it is reflected on the negative side of the real line, the function has a mode at the negative side of the real line. In total, this function in Equation (12) has two modes at the real line. Note that it is not necessary to use a second derivative test because maximization of a function is equivalent to the negative version of that function. Detecting the root is enough for having modality.
The rth, r ≥ 0, non-central moment is given by One can get the results via Equations (6) and (9). Since E(X r ) is finite for finite values of parameters α 1 , α 0 , δ 1 , δ 0 , k 1 , k 0 and when the extremely big values of parameters α 1 , α 0 , δ 1 , δ 0 , k 1 , k 0 and r are not taken, the ABEP distribution can produce finite values for the estimates of parameters because finiteness of moments guarantees having a finite value of function [35]. Note that the domain of skewness parameter ε is the interval (−1, 1).

Moment Generating Function for Random Variable X Distributed as ABEP
The moment generating function of the random variable X is: where t ∈ R and m ∈ N. In order to calculate the integral E[exp(tX)], the Taylor expansion at x = 0 of the function exp(tx) = ∑ ∞ m=0 (tx) m m! must be gotten. After some straightforward calculation for the integral E[exp(tX)] via using Equations (6) and (9), the result of integral can be obtained. Figures 1 and 2 illustrate the examples of PDFs of ABEP distribution for some values of parameters that give all possible shapes of function. As can be seen from these figures, the shape of peakedness, bimodality and asymmetry can be controlled at the same time via parameters in ABEP. When different values of parameters α 1 , α 0 and δ 0 , δ 1 are chosen, a different shape of peakedness and a bimodality with different heights around location parameter µ are obtained, respectively. The skewness parameter ε makes an asymmetry around parameter µ. Laplace and their half forms due to α 1 > 0 and α 0 > 0; (b) bimodal densities due to δ 1 > 0, right of density is normal and Laplace due to δ 0 = 0 and α 0 .

Tail Behaviour Property of ABEP
Tail behaviour or heavy tailedness of a distribution is examined by means of definitions given below [36]: From Equation (14), the positive part of CDF includes the lower incomplete gamma function γ. The function γ(a, b) is examined to get the limit in Definition 1. For b > a, this function goes to zero. Then, lim x→+∞ exp(λx)G(x) can go to zero when b is more bigger than a. Otherwise, this limit is infinite.
It is seen that when b as a variable x of the function γ has big values, that is, an outlier is included by data, the heavy-tailedness property of ABEP can be obtained. For a ≥ b, there is already a tendency to get small values of variable x in γ function in Equation (14), which does not correspond to an outlier in the data set when it is compared with case b > a in γ function. Thus, having an undefined value for lim x→+∞ exp(λx)Ḡ(x) is not a problem in order to test the heavy-tailedness property of function G via Definition 1.

Definition 2.
Suppose that random variable X has a PDF g defined on [0, ∞). If E[exp(tX)] = ∞, for all t, then g is a heavy-tailed distribution.
Note that the generalized gamma distribution is reflected to a negative axis or x < µ. The tail behaviour at x > µ or x < µ has the same role. Then, Definition 2 can be used for ABEP.
From Equation (16), E[exp(tX)] = ∞ is satisfied due to m in summation in Equation (16) of ABEP distribution because m goes to infinity and Γ function gives infinity for big values of m. Then, ABEP is a heavy-tailed distribution.
A comment for heavy-tailedness from the results of Definitions 1 and 2 is given: the skewness parameter ε and also shape parameters α 1 , α 0 , δ 1 , δ 0 work together in order to get a heavy-tailed function because they are responsible for changing the shape of function.

Special Cases, Related Distributions and Flexibility of ABEP
When we want to make a comparison among distributions in [28,29] and ABEP, the ordered form from lowest to highest for capacity on modelling frequency is to be [28,29] and ABEP distribution. For this reason, ABEP distribution is defined by using the generalized gamma distribution. The obtained distribution has five parameters. Thus, ABEP distribution will have some properties: when α 1 = 1 and α 0 = 2, the left side of the location is half of Laplace distribution and the right side of location is half of normal distribution for ε = 0 and δ 1 = δ 0 = 0. For values of α 1 = 2 and α 0 = 1, the obtained function will be vice versa from the previous case. For these situations, when ε = 0, ABEP will be ε-skew form of half from Laplace and normal distributions. It is easily seen that ABEP distribution can be a combination of Laplace and normal distributions for values of peakedness parameters α 1 and α 0 of distribution in ε-skew form. The nuisance parameters k 1 and k 0 are added to have the same form of normal and Laplace distributions and also ABEP can have the same framework with algebraic and exponential power distributions in references at items given below. The location-scale form is also provided for ABEP in Equation (12). The parameters α 1 , δ 1 and α 0 , δ 0 determine the overall shape of function for x < µ and x ≥ µ, respectively. Tails at negative and positive sides of the real line can be platykurtic (α 1 , α 0 → ∞) and leptokurtic (α 1 , α 0 → 0). Note that the random variable T in the variable transformation Z = Y 1/α T has the same role with ε-skew approach in [33,37,38]. Thus, ABEP is a general scheme for distributions in the class of algebraic and exponential functions. The special cases, the related distributions and the flexibility of ABEP distribution are given in the following items: 1. When α 1 = α 0 = α > 0, ABEP distribution drops to the kernel of distribution in [29]. 2. If δ 0 = δ 1 = δ > 0, the density function has two modes (bimodal case) with the same height.
The first developer of EP is in [47] via solving the differential equation as a different sense from GG in Equation (17). The EP as generalized error distribution was proposed by [48]. In ABEP distribution, there are parameters for modelling x < µ and x ≥ µ. Thus, the bimodality can be produced (see also Section 2.1.1) and the role of parameters that creates bimodality due to reflection approach in Equation (2) of GG function can be observed easily.

Maximum Likelihood Estimations for Parameters of ABEP Distribution
Let x 1 , x 2 , ..., x n be a random sample of size n from an ABEP distributed population. The unknown parameters µ, σ, α 1 , α 0 , δ 1 , δ 0 and ε will be estimated by an ML estimation method [35]. Here, the parameters k 1 and k 0 are nuisance parameters. The log-likelihood log(L) function is: where n 0 is the number of non-negative observations and n 1 is the number of negative observations. θ = (μ,σ,α 1 ,α 0 ,δ 1 ,δ 0 ,ε) are ML estimators of parameter vector θ = (µ, σ, α 1 , α 0 , δ 1 , δ 0 , ε). The second derivative test can be used to determine whether or not the log(L) function in Equation (18) has the maximum value. However, since PDF has seven parameters µ, σ, α 1 , α 0 , δ 1 , δ 0 and ε, it is not easy to get a Hessian matrix because of two cases and seven parameters in ABEP. One can get it via using the mathematical software programs, which are Maple 18.00 (Maplesoft, Waterloo, ON, Canada) or Mathematica 9.0.1.0 (Wolfram Research, Champaign, IL, USA). It is also noted that ABEP can have a discontinuity point at x = µ for some values of parameters. There can be a solution to overcome this problem if we focus on improving the modelling capacity of PDF having more parameters, which help us to increase flexibility of the function. Thus, the efficiency for ML estimators of the parameters µ and σ is increased. A solution in an indirect way for this problem is that one can use GOFT statistics, such as KS, CVM and AD to see the distances between expected and empirical cumulative distributions. It is well known that the smaller values of the GOFT statistics mean that more fitting performance accomplished by the function. In the computation process, optimization of nonlinear function in Equation (18) is conducted via hybrid genetic algorithm (HGA) in MATLAB 2016a (MathWorks, Natick, MA, USA). In HGA, intervals for parameters that will optimize the log(L) function in Equation (18) (−1, 1), which is a domain of skewness parameter ε. k 1 and k 0 , as nuisance parameters are taken to be α 1 and α 0 . This form is appropriate to have the same form of normal and Laplace. Let us remind that ABEP is a generalized normal or Laplace distribution. Thus, k 1 and k 0 are nuisance parameters.
The Fisher information matrix for parameters µ and σ from ABEP is given by matrix F in the following form: The Cramér-Rao lower bounds (RCLBs) for ML estimators of parameters are given. The Monte Carlo numerical integration is used to compute the integrals in Fisher information in Equation (19) for RS, ESC and ASL distributions.
Equations (6) and (9) are used to calculate the integrals in matrix F. Due to the analytical expression of PDF in Equation (12), off-diagonal elements of matrix F are non-zero. Here, shape α 1 , α 0 , bimodality δ 1 , δ 0 , skewness ε and nuisance k 1 , k 0 parameters make a covariance structure between location µ and scale σ parameters. From this result, covariance structure on ML estimators of other parameters can be seen. Since it is possible to obtain the covariance among ML estimators, the Fisher information matrix is obtained only for the ML estimators of two parameters µ and σ. If there can be a covariance among ML estimators, the inverse of matrix F cannot be obtained except the generalized inverse. Note that getting matrix F for µ and σ from ABEP is tractable for the calculation of integration of Fisher information. Using the generalized inverse cannot be preferable due to a loss of information in an inverse of a matrix. Loss of information occurs because the multiplication of generalized inverse of matrix F and F, that is, F − F, does not give an identity matrix [53]. When α 1 = α 0 = α, δ 1 = δ 0 = δ, = 0, that is, the covariance between ML estimators of µ and σ from ABEP is zero: Some of the regularity conditions [35] are as follows: One can verify that the conditions can be satisfied by using the mathematical software programs, which are Maple 18.00 (Maplesoft, Waterloo, ON, Canada) or Mathematica 9.0.1.0 (Wolfram Research, Champaign, IL, USA). Here, it is possible to get M(X) as X r in Equation (15). Then, the condition 2 is satisfied. The other regularity conditions are already satisfied obviously. Since the ABEP distribution satisfies these two conditions, that is, √ n μ σ − µ σ is asymptotically normal with mean zero vector and covariance matrix [F(µ, σ)] −1 , andμ,σ are asymptotically efficient and asymptotic normally distributed [35].

Simulation Study
In this section, a brief simulation study to verify the behaviour of the ML estimators in finite samples is presented. The data were drawn based on the stochastic representation given in Appendix A.
The arbitrary values for the parameters are chosen. The probable handling for the values of parameters is considered such that the different combination for the values of α 1 , α 0 , δ 1 and δ 0 can be observable. Thus, an asymmetry and a bimodality can be constructed. Table 1 has the chosen values for the parameters. Three sample sizes are utilized: 100, 200 and 500. Based on 1000 replicates for each sample size, the average bias and the root of the mean squared error (RMSE) are computed and they are given in Table 1. HGA is used to optimize the log(L) function in Equation (18)   In general, it is observed from the results of simulation that the bias can be acceptable and it can decrease when the sample size increases. When the sample size n increases, the values of RMSE go to zero because the function is represented well by the artificial data, which is an expected result.

Real Data Illustration
In this section, the modelling capability of ABEP is shown by applying it to two data sets from microarray [54]. The real data sets from the web site in [54] are given in Tables A1 and A2 in Appendix B. The analysis of proteins in cancer cells is important. The efficient estimates of location and scale parameters for these proteins have a crucial role in medical care. For this reason, we prefer to focus on these data sets that have different shapes of peakedness, bimodality and asymmetry.
In the second step, the distributions are considered to model these data sets. In the estimation process, we use the maximum likelihood method together with GOFT statistics, mostly prominent ones that are KS, CVM and AD (robust one) distances to test the fitting capability of distributions [55]. When the estimates of parameters are computed, we can examine via GOFT statistics which of the five PDFs is the best fit on data.
The bimodal extended generalized gamma (BEGG) [29], the Rathie-Swamee (RS) (RS is also known to be a modified version of generalized logistic) [11][12][13], the exponentiated sinh Cauchy (ESC) [10] and the alpha-skew Laplace (ASL) [24] distributions are used to fit the data and make a comparison between them and ABEP. There are many different distributions that have been proposed; however, using an explicit expression for CDF would be preferred to fit the data. For this reason, the distributions having an explicit expression for their CDFs are used. Thus, GOFTs can be used without including the numerical integration methods having computational errors.
Modelling data (or Riemann integration in randomly putting the bin of histograms on the real line) is equivalent to an integration. Thus, the discontinuity at x = µ is not a problem for estimations of parameters. For computation, the HGA is used. HGA also includes the derivative free approach [56] for optimization. Then, the discontinuity point x = µ is not a problem for optimization of log(L) function in Equation (18) according to parameters. At the same time, GOFT statistics are used while performing the computation process.
The algorithm given in Appendix A allows for generating a random variable that is distributed according to a PDF given in Equation (12). Thus, the performance of fitting can be checked via the counted data at the prescribed ranges of domain. However, this procedure is rough when it is compared with GOFTs. It is also beneficial to observe the performance of the random number generation procedure.
The number of replicated sample size n is 100,000. Data generated from ABEP, BEGG and ESC distributions are sorted from small to big values for each sample size n. After sorting, arithmetic mean of 100,000 artificial data is obtained for n = 118. After artificial data are generated from their corresponding PDFs, it is also possible to check the fitting performance of these functions via the artificial data (see Tables 4 and 7). Since ABEP, BEGG and ESC are competitive distributions for fitting data and they have a random number generation procedure, they are preferred to check their similarities with real data. In two examples given in the following subsections, Tables 2 and 5 give the ML estimates of parameters in PDFs and GOFT statistics for these estimated values of parameters. Tables 3 and 6 show the asymptotic variances and covariances of ML estimators for the parameters µ and σ. The Monte Carlo numerical integration method is used while performing the computation process of integration in the Fisher information matrix. Tables 4 and 7 represent the counted data at the prescribed ranges of domain.

Comments on the Results of Examples 1 and 2
For both of the examples, Figures 3a and 4a show that ABEP fits better than the other distributions. In particular, the modalities around location have been modelled as the different modes of heights. The shape of peakedness can be modelled as well. The right side of the location is especially modelled very well by ABEP in Example 2. The asymmetry illustrated in Example 1 has been modelled.
The histograms of data in Example 2 do not show an asymmetry and the ML estimate of skewness parameter is very near to zero because, as it is seen from Figure 4a, the histograms do not have an asymmetry when they are compared with histograms in Figure 3a. The unequally distributed histograms around location in Figure 3a can show that there is an asymmetry in the data set.
For both of the examples, Tables 2 and 5 give the ML estimates of parameters of distributions and GOFT statistics of fitted densities. ABEP distribution has the best fitting on data when we consider the values of KS and CVM statistics. When we look at the fitting performance for all distributions from Figures 3a and 4a, it is seen that ABEP, BEGG and ESC have better fitting performance than RS and ASL. However, when ABEP and ESC are compared, it is observed that two parameters λ and β of ESC are not enough to get the precise fitting on data because these parameters work together around the location. In BEGG, there is only one parameter α to control the fitting shape of function on the real line.
In ABEP, the role of parameters α 1 , α 0 , δ 1 and δ 0 around the location is constructed definitely. Thus, more precise estimates for parameters µ and σ can be obtained through these parameters if the data are from many phenomena.   It is well known that the probability value (p-value) of a test statistic depends on the fitted density. For this reason, more efficient density must be preferred before getting the p-value of a test statistic from corresponding density. Then, the potential problem that can occur in future from phenomena can be refrained. The estimates of µ from fitted densities of ABEP, BEGG, RS and ESC can be close to each other, but the estimates of µ of ABEP are more precise because ABEP is the best one for fitting on data. Similarly, the estimates of σ of ABEP for both of examples are the best ones.
The random number generation procedure can be conducted in a more precise way for ABEP, BEGG and ESC distributions because ABEP and BEGG have an algorithm of random number generation in Appendix A. The inverse of CDF of ESC distribution [10] can be taken to get the random numbers from ESC. The artificial data generated from ABEP distribution also show that the counted artificial data at ranges can be similar to the counted real data at ranges (see Tables 4 and 7). It is noted that the mostly counted data (the numbers 37 and 62 in Examples 1 and 2, respectively) at an interval for real data are constructed by the artificial data generated from ABEP distribution for the prescribed ranges on the real line. The counted artificial data from ABEP represent the counted real data when they are compared with the counted artificial data from BEGG and ESC. Thus, we can infer that the data generation procedure is also successful after we get the precise estimates of parameters in ABEP via collaboration with GOFT statistics.
GOFT statistics in Tables 2 and 5 show that there can be a numerical error in the computation of special function from CDF of ABEP. The AD for ABEP can have a numerical error from the computation of CDF because CDF of ABEP is a special function. Even if CDF of ABEP depends on special functions that are incomplete gamma functions, the fitting performance of ABEP is the best one due to the fact that all possible parameters (shape, bimodality and skewness) are added into ABEP.

Conclusion and Discussion
A family for bimodal distribution with two parameters fitting the shape of peakedness (α 1 and α 0 ), two parameters fitting the height of bimodality (δ 1 and δ 0 ) and a parameter fitting the asymmetry (ε) in the data set have been proposed. The unimodal case of this family is obtained when δ 1 = δ 0 = 0. The skewness parameter in this family is from the ε-skew approach, which can produce the asymmetry around location. The importance of having these parameters in ABEP for modelling around locations separately has been observed when we make a comparison among ABEP, BEGG, RS and ESC distributions that have explicit expression for CDF. As a result, ABEP can model efficiently the shape of peakedness, the bimodality and the asymmetry at the same time because ABEP has parameters that are responsible for fitting the shape of peakedness, the bimodality and the asymmetry in data when it is compared with BEGG, RS, ESC and ASL distributions.
The well known approach that derives PDFs without consulting the variable transformation technique is applied for the tractable functions in Equations (6)- (11) to propose a new distribution. It is clear that this approach can be applied for other kinds of distributions that are on the negative, positive or real lines. The disadvantage of this approach is that the analytical expression of a function must be tractable to derive a PDF. Equations (6)-(8) are the power version of gamma, lower and upper incomplete gamma functions. The functions in Equations (9)-(11) are transferred to the negative side of the real line through using functions in Equations (6)- (8). They are a new kind of the special functions to calculate the integrals having the kernel of gamma function. One can get distributions via these functions. For example, alpha-skew Laplace [24], alpha-beta skew normal [3], alpha-skew generalized t with variable transformation [57,58], symmetric and asymmetric EP [40][41][42][43][44][45] distributions with the recalculated NC can also be obtained by these special functions. The special cases, the related distributions and the flexibility of ABEP are given in the relevant section.
An algorithm for generating artificial data from ABEP is provided. Thus, the similarity between artificial and real data sets has been observed as a rough approach and the performance of optimization for the log(L) function and GOFTs can be supported by this similarity as well. The benefit of GOFTs is depicted when a PDF has more parameters. The best performance on optimizing the log(L) function can also be checked by GOFT statistics. Thus, if CDF of a PDF exists, using GOFTs as an indirect way to check the potential optimization problem(s) is provided when the second derivative test is a problem for getting the Hessian matrix with respect to parameters of log(L) function. HGA is also used to overcome the problems that can occur while performing optimization of log(L) function according to the parameters in ABEP. As a result, performing a cross check between the optimization tool HGA and the GOFT statistics is a beneficial approach to overcome the potential problem(s) from the computation process. Thus, more precise ML estimates for parameters can be gained. When it is considered on overall results from illustrating of PDF and CDF and also artificial data, the GOFT statistics and these results support each other to show the fitting performance of ABEP. The brief simulation study gives the satisfactory results for the bias and RMSE.
RCLBs for ML estimators of parameters µ and σ are obtained. The properties of ABEP are provided and so the heavy-tailedness property of ABEP distribution has been examined. The heavy-tailedness of ABEP from Definitions 1 and 2 are guaranteed when b > a in the γ function. Definitions 1 and 2 imply that ABEP can be a heavy-tailed distribution together with that comment in there.
The entropy-based parameter estimation for ABEP is an ongoing issue in [31,32] to study via the proposed special functions in Equations (6)- (11). We will introduce the information theoretic model selection criteria [59] to fit the models, and the results will be published separately. In the future, a package in a statistical software R from open access will be prepared for ABEP distribution with different estimation methods [31,55], and the information theoretic model selection criteria will be added into this package.