A New Birnbaum–Saunders Distribution and Its Mathematical Features Applied to Bimodal Real-World Data from Environment and Medicine

: In this paper, we propose and derive a Birnbaum–Saunders distribution to model bimodal data. This new distribution is obtained using the product of the standard Birnbaum–Saunders distribution and a polynomial function of the fourth degree. We study the mathematical and statistical properties of the bimodal Birnbaum–Saunders distribution, including probabilistic features and moments. Inference on its parameters is conducted using the estimation methods of moments and maximum likelihood. Based on the acceptance–rejection criterion, an algorithm is proposed to generate values of a random variable that follows the new bimodal Birnbaum–Saunders distribution. We carry out a simulation study using the Monte Carlo method to assess the statistical performance of the parameter estimators. Illustrations with real-world data sets from environmental and medical sciences are provided to show applications that can be of potential use in real problems.


Introduction
Many phenomena may be modeled by continuous statistical distributions [1]. These distributions may be divided into symmetrical and asymmetrical with a wide variety of shapes (symmetry, skewed to the right, skewed to the left, unimodality, multimodality, platykurtosis, leptokurtosis, etc.) and supports (real numbers, positive reals, etc.). The most popular continuous distribution is the Gaussian or normal model, which has been the reference distribution for many years in statistics. However, other continuous distributions have gained space in the landscape of statistics due mainly to the important development of computation. For more details of theory, methodology, and applications of diverse continuous distributions, see [1].
A model that has received considerable attention recently is the Birnbaum-Saunders (BS) distribution, which is unimodal, asymmetric (skewed to the right), and has two parameters that modify its shape and scale. The BS distribution has good properties. For example, it is closely related to the normal distribution; it is closed under scalar multiplication (proportionality) and under reciprocation; its median coincides with the BS scale parameter; it has different shapes for its probability density function (PDF), which cover high, medium, and low asymmetry levels; it belongs to the class of log-symmetric distributions; it is a mixture model; it has mathematical arguments to be considered as a cumulative damage distribution; for more details on these and other properties, see [2][3][4][5][6][7].
Despite the wide applicability of the BS distribution, its shapes fail to model bimodal data. Note that phenomena associated with this type of data is more frequent than thought so that, under the scenario of a suitable application of the BS distribution, new bimodal models of this kind are needed to form part of the pool of statistical tools of diverse practitioners. Some BS distributions that included a bimodality parameter were introduced in [26,27]. However, these distributions showed little flexibility in the modeling of bimodal data. Therefore, a bimodal BS distribution that permits real-world phenomena to be modeled with high flexibility to gain precision in the information obtained from the data analysis based on such a model is needed.
The main objective of this investigation is to propose and derive a new bimodal BS distribution. Its derivation is followed by the corresponding mathematical expressions that allow the computation of probabilistic features and moments. These features enable us to obtain location, scale, variability, and shape measures, such as the mean, variance, and coefficients of variation (CV), skewness/asymmetry (CS), and kurtosis (CK). In addition, a simulation algorithm is presented, permitting us to generate values for a random variable that has the new proposed distribution. Such an algorithm allows us to simulate bimodal BS distributed data for assessing the performance of the parameter estimators of this new model. A computational implementation in the R software is provided, which helps to exemplify the obtained results with two real-world data sets from environmental and medical sciences and to show potential applications.
The remainder of the paper is structured in the following form: Section 2 presents background on the BS distribution and its use to model environmental and medical phenomena. In Section 3, we propose and derive the new model with its mathematical properties. In Section 4, the estimation of parameters of the bimodal BS distribution is performed. Section 5 proposes an acceptance-rejection simulation algorithm that allows us to generate bimodal BS distributed values. This section also provides the numerical results of our investigation with simulated and real data sets. At last, in Section 6, we give the conclusions, limitations, and ideas for further work that have emerged from the present study.

The Birnbaum-Saunders Distribution
In this section, we provide preliminary elements of the BS distribution. In addition, we illustrate how this distribution has mathematical arguments to describe environmental and medical data.

Definition
The results of this section can be seen in detail in [5]. If a random variable Y has a BS distribution with parameters α and β, the notation Y ∼ BS(α, β) is employed. The BS distributed random variable Y is related to the standard normal distributed random variable V by means of where V ∼ N(0, 1), α > 0, and β > 0. The BS PDF is given by with a(y) = (1/α)( y/β − β/y), where φ is the standard normal PDF.

Justifying the BS Distribution in Environmental and Medical Contexts
The BS distribution can be adequate to model data from certain fields based on an empirical fit using goodness-of-fit techniques. However, this empirical argument by means of a simple fit can be unsatisfactory. Indeed, if we show that a distribution is suitable to model a phenomenon, employing a theoretical justification may be a better argument. For more details about the results presented below, see [5] (fatigue), [18] (environment), [7] (medicine), and [28] (neural activity).
Assume a random variable associated with a generic metric. For instance, the tumor size can result in the death of a patient in a medical setting, or in an environmental context, the accumulation of pollutants in the atmosphere leads to a critical environmental situation. A generic metric can be stated considering the effect of several independent causes that interact, which work together to increase the value of the metric over time.
When the effects of such causes are accumulated randomly, this accumulation has a Gaussian asymptotic distribution based on the central limit theorem. Nevertheless, the causes do not seem to jointly act by a simple summation. It is more natural to suppose that each cause gives an impulse, and then the effect depends on both the impulse strength and value of the metric at the moment when the impulse is working.
Let M 1 , . . . , M k be independent random variables associated with the magnitude of k impulses that work sequentially according to their subscripts. Furthermore, let O be a generic metric that is obtained from these impulses. Suppose that O j+1 corresponds to value of the metric O at the instant immediately before, which increases by the (j + 1)th impulse M j+1 that is proportional to some function g of the mentioned value of the metric O stated as Thus, O j+1 is the accumulated value of the metric after applying M j+1 . The formulation expressed in (2) is usually known as the proportionate-effect law. From this law, observe that the BS distribution may be reached when g(O j ) = 1 in (2). Note that the BS model was built to describe the fatigue life of materials subject to cyclic stress. It causes damage that is accumulated over time by summing several small damages. If the cumulative damage is greater than the rupture value (threshold) of the material, it fails [5]. Table 1 shows a conceptual comparison for environmental, fatigue, medical, neural frameworks that follow the proportionate-effect law.
with o being the generic metric and do its differential, m being the number of impulses and dm its differential, S being the strength exercised in each impulse, f being an empirical function and c, c 1 being constants. Observe that o 1/2 b∆S stated in (3)  Observe how the differential-equation stated in (3) and the proportionate-effect law given in (2) are similar. Assuming (2), apply the central limit theorem, and consider that the increment ∆O j = O j+1 − O j in the (j + 1)th impulse provides a small contribution to the growth of the metric. Hence, if the summation is switched by integration, ∑ k j=1 M j is closely a Gaussian distributed random variable, with and O s is an initial value of the metric and O k its final value after applying k impulses.
To generate the BS model in a general framework, consider that the mean of O j is η and its variance is 2 . Hence, a continuous extension of (4) using the time t is established by with O(t) being the value of the metric at t and O(t 0 ) its starting value. Now, suppose that Using (6), observe that the function g in the model given in (2) establishes the dependence of the metric on its previous value. In this case, a power function for g can be suitable obtaining g(x) = x ς , where ς is an indicator associated with the metric used for each framework (see Table 1), and ς ≥ 0, but ς = 1. Hence, based on (5) and considering g(x) = x ς , we have that Then, from (7)-(6) and by symmetry, we get . Therefore, the CDF of T at t is expressed as From (8), observe that the BS distribution is obtained for ς = 0. Nonetheless, if ς = 1, it is the case of the lognormal distribution (g(x) = x) in the proportionate-effect law stated in (4), which cannot be obtained from (8). Hence, the CDF of T formulated in (8) corresponds the BS distribution and not to the lognormal distribution.
To be specific and as an example, the setting discussed above might be placed in a framework of pollution, doing an analogy between the proportionate-effect law and a contaminant concentration model. Such an example is valid as a justification to model data on indoor and outdoor environmental quality, water contamination in lakes, groundwater, and soils, and trace metals in human tissue, blood, and feces.
To contextualize the model stated in (8) for environmental contamination, suppose the following. We assume a non-random framework. Consider that an environment has a pollutant concentration X 0 and an air mass V 0 at an initial time t = 0. Hence, q 0 = X 0 V 0 corresponds to the total pollutants in the environment. If no chemical reactions are present, conservation of matter requires the total pollutant amount after dilution to be equal to the total pollutant amount before dilution. In this way, when V 1 is the volume of the mixture after the first dilution and X 1 the pollutant concentration in that diluted mixture, we have that Solving the equation stated in (9) for X 1 , the content of X 1 is inversely proportional to the ratio of the final volume to the initial volume given by X 1 = (V 0 /V 1 ) X 0 . Therefore, if V 0 /V 1 is established by a dimensionless dilution D 1 that is treated as a random variable, then X 1 = D 1 X 0 is also a random variable, where X 0 is non-random. Such a random framework is closer to reality due to the fact that contaminants vary over time because of climatology, geography, topography, and human activity.
Let D j denote the jth dilution. Then, the concentration X 2 in the second dilution is formulated as In addition, let N be the number of successive dilutions in a random dilution process. Then, the final concentration after N dilutions is defined as By applying the logarithm function in the expression given in (10), we get that Alternatively, the formulation defined in (11) may be expressed by means of a differential equation of first-order as usual in the mass balance model, after permitting the change rate of the concentration to be small enough. Once again, observe how the equations stated in (4) and (11) if g(x) = x are similar. As mentioned, this formulation is associated with a lognormal distribution. Nevertheless, a more general expression for (11) might be assumed by considering . (12) Note that the expression stated in (12) should conduct a BS model upon certain conditions, with this expression being similar to (4). Such a setting permits us to consider a statistical distribution to model contaminant concentration data, which should not be symmetrical but skewed to the right.

The BS Distribution under Bimodality
In this section, we obtain the bimodal BS distribution of a random variable Y based on the product of the BS(α, β) PDF and a fourth-degree polynomial function as a competitor of two other versions of bimodal BS distributions proposed in [26,27], which are named bimodal BS distributions of types 1 and 2, denoted by BBS1 and BBS2, respectively. The new bimodal BS distribution, which we name type 3, is denoted by Y ∼ BBS3(α, β, δ), with δ ≥ 0 being the bimodality parameter.

Probability Density Function
A bimodal BS distribution with parameters α 1 , β 1 , and δ 1 of a random variable Y, denoted by Y ∼ BBS1(α 1 , β 1 , δ 1 ), was postulated in [26]. This bimodal BS distribution of type 1 has a PDF stated as with a(y) being defined as in (1) and Φ being the standard normal CDF. A second bimodal BS distribution with parameters α 2 , β 2 , and δ 2 of a random variable Y, denoted by Y ∼ BBS2(α 2 , β 2 , δ 2 ), was postulated in [27]. This bimodal BS distribution of type 2 has a PDF as a competitor of (13) given by with a(y) being defined as in (1). Next, we propose the bimodal BS distribution of type 3 as a competitor of the classical unimodal BS distribution, as well as of the bimodal BS distributions of types 1 and 2, with PDFs established in (1), (13), and (14), respectively.
where f X is the PDF of X ∼ BS(α, β) given in (1), and with SD denoting the standard deviation, α, β being shape parameters, and δ being the bimodality parameter.

Proof. Note that
Observe that the BBS3 distribution is obtained following the idea given in [29]. Thus, to obtain an interpretation of the modality parameter δ, we present below a proposition that generalizes a classical continuous distribution allowing us to generate unimodality or bimodality in the distribution.
Proposition 2. Let f (y; µ, σ) be the PDF of a continuous distribution with finite mean µ, variance σ 2 , and CK κ. Then, it is possible to build a new distribution with PDF stated as h(y; µ, σ; δ) = ω(y; µ, σ, δ) f (y; µ, σ), with 0 ≤ δ < ∞ being the parameter that controls the unimodality or the bimodality of the distribution and where Note that the function stated in (16) is a genuine PDF.
Proof. The result is directly obtained by taking into account that f (y; µ, σ) ≥ 0 and integrating over the support Observe that the parameter δ defined in the PDF given in (16) controls the unimodality or bimodality of the corresponding distribution. Additionally, by taking δ = 0, the parent PDF f (y; µ, σ) is obtained as a special case. For a shape analysis of the BBS3 PDF compared to the BS distribution, see Figure 1.

Cumulative Distribution Function and BBS3 Properties
Next, we give some elementary properties of the BBS3 distribution. Note that this distribution is closed under reciprocation and has the standard BS distribution as a particular case.
The following proposition states the CDF of the BBS3 distribution, which depend on the standard BS CDF. ∼ BBS3(α, β, δ). Then, the CDF of Y is given by
Proof. Let Y ∼ BBS3(α, β, δ) and F X be the CDF of X ∼ BS(α, β). Then, Figure 1a shows the PDFs of the BBS3 and BS distributions for different values of α, β and δ. Here, we can see the bimodality of the BBS3 distribution and the unimodality of the BS distribution. Figure 1b shows the graphical plots of the corresponding CDFs for different values of their parameters.

Cumulant Generating Function
The following proposition states the cumulant generating function of the BBS3 distribution, which depends on the moment generating function of the standard BS distribution. From the cumulant generating function, the cumulants of a distribution may be obtained, which correspond to quantities that provide an alternative to the moments of the distribution. Note that the rth cumulant is computed as where C(t) is the cumulant generating function of the BBS3 distribution given in the following proposition, and C (r) (0) defined in (17) is obtained by taking the rth derivative of C(t) with respect to t and evaluating it at t = 0. β, δ). Then, the cumulant generating function of Y is obtained as with E(e tX ) being the moment generating function of X ∼ BS(α, β). Proof.

E(e tY ) =
∞ 0 e tY f Y (y; α, β, δ)dy e ty f X (y; α, β)dy The rth moment µ r is an rth degree polynomial function in the first r cumulants, where Observe that the "prime" states a difference between the moments µ r = E(Y r ) and the central moments µ r = E((Y − µ) r ). To obtain the central moments as functions of the cumulants, we must eliminate from these polynomials all the terms in which c 1 is a factor. Thus, from the above expressions, we generate the first six moments as µ 1 = 0, µ 2 = c 2 , µ 3 = c 3 , µ 4 = c 4 + 3c 2 2 , µ 5 = c 5 + 10c 3 c 2 , and µ 6 = c 6 + 15c 4 c 2 + 10c 2 3 + 15c 3 2 . Similarly, the rth cumulant c r is an rth degree polynomial function in the first r raw (non-central) moments so that the first six cumulants are stated as To obtain the cumulants c r for r > 1 as functions of the central moments, we must remove from these polynomials all the terms in which µ 1 is as a factor. Hence, from the above expressions, we generate the first six cumulants as c 2 = µ 2 , c 3 = µ 3 , c 4 = µ 4 − 3µ 2 2 , c 5 = µ 5 − 10µ 3 µ 2 , and c 6 = µ 6 − 15µ 4 µ 2 − 10µ 3 2 + 30µ 2 3 . This indicates that the first cumulant is the mean, the second cumulant is the variance, and the third cumulant is the third central moment. However, from the fourth cumulant, they are not equal to the central moments.

Moments
The following proposition states the moments of the BBS3 distribution. Essentially, these moments depend on the moments of the standard BS distribution. Proposition 6. Let Y ∼ BBS3(α, β, δ). Then, for r = 1, 2, . . . , we have that where E(X r ) is the rth moment of X ∼ BS(α, β), and µ, σ, κ are given in Proposition 1.

Inference
In this section, we discuss statistical inference based on the moment and maximum likelihood estimators of the BBS3 distribution parameters. For more details about moment estimation for the standard BS distribution, see [12,30].

Maximum Likelihood Estimators
Let Y = (Y 1 , . . . , Y n ) be a random sample of size n from Y ∼ BBS3(θ), with θ = (α, β, δ), and y = (y 1 , . . . , y n ) the observed value of Y. Then, the maximum likelihood estimator of θ can be obtained from the corresponding log-likelihood function defined as where z i = (y i − µ)/σ, for i = 1, . . . , n. Thus, the score vector associated with the loglikelihood function stated in (18) has elements given by By solving the estimation equations obtained from the score vector numerically, we obtain maximum likelihood estimators of the parameters α, β and δ.

Numerical Applications
In this section, we use simulated and real-world data to show our numerical results. An algorithm is employed to simulate bimodal BS distributed data and then evaluate the performance of the estimators of the new bimodal BS distribution parameters. Hence, a computational implementation in the R software produced by the authors of this study is utilized to illustrate the obtained results with real-world data sets from environmental and medical sciences to show potential applications. The data analyzed in this paper and the computational codes are available from the authors under request.

Simulation Algorithm and Computer Characteristics
Even though some software provide built-in random number generators, there are probability distributions that are not covered by these software. This is the case of the BBS3 distribution. We use the acceptance-rejection method to generate random numbers from the BBS3(α, β, δ) distribution with PDF defined in (15) according to Algorithm 1. The results in a sequence of n random numbers are stored within an array that we name n_vector. Since the BBS3 distribution has a non-finite support, we use a constant l 1 > 0 to limit the BBS3 values generated. Furthermore, we employ a constant l 2 > 0 corresponding to the maximum value of the BBS3 PDF f , which must be evaluated at the true parameters. In addition to the parameters α, β, γ of the BBS3 distribution, to build the algorithm, we need to define: • n: The length of the n_vector.
The BBS3 PDF with y > 0. • l 1 : A lower limit for the BBS3 numbers to be generated with l 1 > 0. • l 2 : The maximum value of f Y with l 2 > 0. • U 1 : A random variable with a uniform distribution in (0, l 1 ), U(0, l 1 ) in short. • U 2 : A random variable with a U(0, l 2 ) distribution.

6
Repeat steps 3-5 until the length of n_vector is equal to n;

end
The computational experiments were carried out in the R programming language, using a computer with the following characteristics: (i) OS: Windows 10 Pro for 64 bits; (ii) RAM: 8 Gigabytes; and (iii) Processor: Intel(R) Core(TM) i7-8550U CPU @ 1.99 GigaHertz. Algorithm 1 was executed 100 times with n = 100, 000, and an average processing time equal to 2.9 seconds was obtained. When executing this algorithm 5000 times, with n = 50, 200, the average processing times were 0.035 and 0.039 seconds, respectively.

Results of the Simulation Study
We use Algorithm 1 to conduct a Monte Carlo simulation study. The results of this study are reported in Table 2 to assess the behavior of the maximum likelihood estimators based on 5000 generated samples of sizes n = 50, 100, 150, 200 from a BBS3(α, β, δ) distributed population. For each sample generated, maximum likelihood estimates were computed numerically using a Newton-Raphson procedure. Then, empirical means, SDs, average length (ALI) and covering probabilities (C) of 95% confidence intervals are reported. Note that the empirical bias becomes smaller as the sample size n increases, as expected. Table 3 shows the results of a simulation with 5000 samples but with smaller sizes (n = 10, 25, 40) and using moment estimation.

Illustrative Example I with Real-World Data
The first data set corresponds to the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA; the data were reported in [31]. Descriptive statistics for the data set are the following: 272 (sample size), 43 (minimum), 96 (maximum), 76 (median), 70.897 (mean), 13.595 (SD), 19.176 (CV), −0.414 (CS) and 1.844 (CK). Table 4 reports the maximum likelihood estimates, standard errors (SEs), and log-likelihood (loglik) values for the BBS1, BBS2, and BBS3 models. Furthermore, we report the values of the Akaike (AIC) and Bayesian information (BIC) criteria as well as the p-values of the Kolmogorov-Smirnov (KS) test. Figure 2a shows the histogram with estimated PDFs, which indicates that the BBS3 model fits the data better than the BBS1 and BBS2 models. This result is supported by Figure 2b with the estimated CDF and by Figure 3 based on the theoretical versus empirical quantiles (QQ) plots. Table 2. Empirical means, SDs, ALIs, and covering probabilities of 95% confidence intervals based on simulated data from the BBS3(α, β, δ) distribution for its parameter estimators using the maximum likelihood method.  Table 3. Empirical means, SDs, ALIs, and covering probabilities of 95% confidence intervals based on simulated data from the BBS3(α, β, δ) distribution for its parameter estimators using the moment method.      Once the best model to describe these data has been selected, that is, the BBS3 distribution, and its parameters estimated, diverse statistical analyses can be conducted by practitioners and users of statistics. For example, by estimating the mean duration of the eruption of geysers, certain measures may be established to reduce the risk against people visiting the Yellowstone National Park in Wyoming, USA. In our case, the mean duration of the eruption of geysers in Yellowstone Park is 49.205 minutes. This information is obtained from the estimates reported in Table 4 and the expressions taken from Section 3.4 specified as E(X 6 ) = β 6 2 10, 395α 12 + 11, 340α 10 + 5670α 8 + 1680α 6 + 315α 4 + 36α 2 + 2

Illustrative Example II with Real-World Data
Fleroxacin is a fluoroquinolone derivative for a broad antibacterial spectrum and powerful in vitro activity against various gram-negative and many other gram-positive varieties. This antibacterial element was studied in [32] with the aim of estimating the representative values of clearance on systemic availability (D/F) and volume of distribution on systemic availability (I/F), after administering therapeutic doses of fleroxacin. In addition, the factors that influence the disposition of fleroxacin were detected and to what degree. The individuals studied were 172 volunteers, men and women, healthy, without infections and within a wide range of ages. Among the data that were analyzed in [32] are the following measurements (in ml/min) of clearance on D/F and treatinine clearance (Dcr). In this second illustration with real data, we consider a data set corresponding to the Dcr. Table 5 provides descriptive statistics of the Dcr data. Table 6 reports values of the maximum likelihood estimates, SEs, log-lik, AIC, and BIC as well as the KS p-values for the BBS1, BBS2, and BBS3 models. Figure 4 shows the good agreement of the BBS3 model in relation to the BBS2 and BBS2 models, such as supported by the QQ plots of Figure 5.   Similarly to Example I, once the best model to describe these data has been selected, that is, the BBS3 distribution, and its parameters estimated, we obtain the mean treatinine clearance, which is 70.602 mL/min. Once again, this information is obtained from the estimates reported in Table 6 and the expressions taken from Section 3.4 specified in Example I.

Conclusions, Limitations and Future Investigation
In this paper, we have proposed a new bimodal Birnbaum-Saunders distribution, which is widely flexible. We have estimated the model parameters with the moment and maximum likelihood methods and used information criteria for model selection to assess the adequacy of the new bimodal Birnbaum-Saunders distribution in comparison with other competitor distributions.
We have conducted a Monte Carlo simulation study to empirically evaluate the statistical performance of the moment and maximum likelihood estimators of the new bimodal Birnbaum-Saunders distribution. Furthermore, we have studied coverage probabilities and the average length of confidence intervals of the corresponding parameters based on the asymptotic normality of these estimators. This simulation study reported a good statistical performance of such estimators.
Two data analyses were conducted related to environmental and medical sciences with two competing models. Both of these analyses reported a suitable performance of the new model that was superior to all of the competing models, providing evidence that the bimodal Birnbaum-Saunders distribution is a good alternative for modeling bimodal data. Such results reported that the bimodal Birnbaum-Saunders model may be a new setting for analyzing this type of data. The new approach is an addition to the tools of applied statisticians and diverse practitioners interested in the modeling of data. From these applications, we have obtained helpful information that can be employed by practitioners and users of statistics.
Some limitations of our methodology are the instability of the maximum likelihood estimators for small samples. The bootstrap and Jackknife methods and the application of bias correction, mainly for small sample sizes, can be proposed to improve the statistical performance of these estimators. In addition, the use of explanatory variables in the modeling is of interest. Thus, their effect on the different parameters of the new bimodal Birnbaum-Saunders distribution is a topic to be studied in a future work, as detailed next. Other topics of future research based on this new distribution are related to studying how multivariate, regression, quantile, spatial, temporal, partial least squares, principal components, and sampling structures can be inserted in the modeling of bimodal data [33][34][35][36][37][38][39][40][41][42][43][44]. In addition, Tobit and Cobb-Douglas type frameworks may be analyzed in the present thematic [45,46]. Moreover, censored observations might be studied in the context of our investigation [7,47].
The authors are working on these and other issues associated with the present investigation, and the corresponding findings are expected to be reported in future works.