A New Extended Birnbaum – Saunders Model : Properties , Regression and Applications

We propose an extended fatigue lifetime model called the odd log-logistic Birnbaum–Saunders–Poisson distribution, which includes as special cases the Birnbaum–Saunders and odd log-logistic Birnbaum–Saunders distributions. We obtain some structural properties of the new distribution. We define a new extended regression model based on the logarithm of the odd log-logistic Birnbaum–Saunders–Poisson random variable. For censored data, we estimate the parameters of the regression model using maximum likelihood. We investigate the accuracy of the maximum likelihood estimates using Monte Carlo simulations. The importance of the proposed models, when compared to existing models, is illustrated by means of two real data sets.


Introduction
Motivated by problems of vibration in commercial aircraft that caused fatigue in the materials, Birnbaum and Saunders [1] proposed the two-parameter Birnbaum-Saunders (BS) model, also known as the fatigue life distribution, with shape parameter a > 0 and scale parameter b > 0, i.e., BS(a, b).This distribution can be used to model lifetime data and it is widely applicable to model failure times of fatiguing materials.A random variable W having the BS(a, b) distribution is defined by where Z is a standard normal random variable.Its cumulative distribution function (cdf) is given by where ν = a −1 ρ(x/b), ρ(q) = q 1/2 − q −1/2 and Φ(•) is the standard normal cumulative function.
The parameter b is the median of the distribution, i.e., G a,b (b) = Φ(0) = 1/2.For any k > 0, k W ∼ BS(a, kb).Some distributions have been proposed to extend the fatigue lifetime BS model.For example, Cordeiro et al. [2] defined the McDonald-Birnbaum-Saunders distribution, Ortega et al. [3] introduced the β-Birnbaum-Saunders distribution and Hashimoto et al. [4] proposed the Poisson Birnbaum-Saunders model with long-term survivors.Further, Pescim et al. [5] studied the Kummer beta generalized Birnbaum-Saunders distribution, Cordeiro et al. [6] introduced the negative binomial Birnbaum-Saunders model, Cordeiro et al. [7] proposed the gamma-Birnbaum-Saunders distribution, Ortega et al. [8] constructed the odd Birnbaum-Saunders regression model and Ortega et al. [9] studied a four-parameter odd Birnbaum-Saunders geometric distribution.The probability density function (pdf) corresponding to Equation ( 2) is given by where κ(a, b) = (2ae a 2 √ 2πb) −1 and τ(q) = q + q −1 .The fractional moments of Equation ( 3) are determined as E(W p ) = b p I(p, a) ( [10]), where and θ ν (z) = 0.5 ∞ −∞ exp{−z cosh(t) − νt}dt denotes the modified Bessel function of the third kind with ν representing its order and z the argument.A discussion of this function can be found in [11].
Recently, some attempts have been made to define new classes of lifetime distributions that provide great flexibility in modeling skewed data in practice.For any baseline cdf G(x) (x ∈ R), Gleaton and Lynch [12] defined the cdf F(x) of the odd log-logistic-G ("OLL-G" for short) class by ( The pdf corresponding to Equation ( 5) is given by In addition, some authors have investigated the OLL-G class in different settings.For instance, da Cruz et al. [13] proposed the log-odd log-logistic Weibull regression model in survival analysis, Braga et al. [14] studied the odd log-logistic Student t distribution, da Cruz et al. [15] considered the bivariate odd-log-logistic-Weibull regression model for oral health-related quality of life, Cordeiro et al. [16] proposed the generalized odd log-logistic family and, recently, Prataviera et al. [17] introduced the heteroscedastic odd log-logistic generalized gamma regression model for censored data.We use a similar method to define a new distribution by compounding the Birnbaum-Saunders and Poisson distributions whose procedure is described below.
The method of compounding discrete and continuous distributions started fifteen years ago to model complex situations under series and parallel structures.Many compounded classes of distributions can be constructed by choosing a positive integer discrete random variable N with probability mass function (pmf) P(N = n).Let Z 1 , • • • , Z N be independent identically distributed (iid) random variables with common cdf given in Equation (5).The minimum and maximum random variables min{Z 1 , • • • , Z N } and max{Z 1 , • • • , Z N } can generate several models that arise in series and parallel systems with identical components and have many industrial and biological applications.For example, the time to the failure of a series system with an unknown number of protected components or the cancer recurrence of an individual after undergoing a certain treatment can be modeled by the generated distribution of the minimum random variable.In a dual mechanism, the time to the failure of a parallel system with an unknown number of protected components or a disease manifestation if it occurs only after an unknown number of factors have been active can be modeled by the generated distribution of the maximum random variable.
The odd log-logistic-G-Poisson (OLLG-P for short) family is defined by the maximum random variable M N = max{Z 1 , • • • , Z N }, under the assumption that the Z i s are iid OLLG random variables independent of N, by taking N as a truncated Poisson random variable with pmf Then, the cdf of the OLLG-P family can be expressed as The pdf of the OLLG-P family reduces to where α > 0; β > 0; G(x; τ) and g(x; τ) denote the cdf and pdf of the baseline distribution, respectively; and τ is the baseline parameter vector.
In this paper, we propose a new lifetime model named the odd log-logistic Birnbaum-Saunders-Poisson (OLLBSP) distribution from Equations ( 8) and ( 9) to extend the BS model.We obtain some of its structural properties and discuss maximum likelihood estimation of the model parameters.
Regression models can be proposed in different forms in survival and reliability analysis.Among them, the location-scale regression model and Cox proportional-hazards model are distinguished and frequently used.We also construct a location-scale regression model based on the OLLBSP distribution, called the log-odd log-logistic Birnbaum-Saunders-Poisson (LOLLBSP) regression model for censored data, as a feasible alternative to the log-BS regression model.We consider a classic analysis for the LOLLBSP regression model.The inferential part is carried out using the asymptotic distribution of the maximum likelihood estimators (MLEs).
The paper is outlined as follows.In Section 2, we define the OLLBSP distribution and provide plots of its density and hazard rate functions.In Section 3, we present the maximum likelihood method to estimate the model parameters and a simulation study.In Section 4, we derive a useful linear representation for the OLLBSP density function and obtain its ordinary moments.In Section 5, we define the LOLLBSP regression model for censored data and estimate the model parameters by maximum likelihood.In Section 6, we prove empirically the potentiality of the new distribution for fatigue life data and the flexibility and relevance of the proposed regression model by means of two applications to real data sets.Finally, Section 7 ends with some concluding remarks.

The OLLBSP Distribution
Most generalized BS distributions have been proposed in reliability to provide a better fit to certain data sets than the traditional two-and three-parameter BS models.Providing a new lifetime distribution is always precious for statisticians.The pdf and cdf of the OLLBSP distribution are defined (for x > 0) by substituting Equations ( 2) and (3) into Equations ( 8) and ( 9) and respectively.Evidently, the density function in Equation (11) does not involve any complicated function.Hereafter, we denote by X ∼OLLBSP(α, β, a, b) a random variable having the density in Equation (11). Figure 1 displays some possible shapes of the density function in Equation (11) for selected parameter values.It is evident that the OLLBSP distribution is much more flexible than the BS distribution presenting one additional shapes, namely bi-modality.Some real situations require bimodal density behavior and a distribution that presents this shape is a good tentative model.Further, there are not many distributions having this shape.The hazard rate function (hrf) corresponding to Equation ( 11) is given by Plots of the hrf h(x) for some parameter values are displayed in Figure 2. We note that these plots illustrate the four classic types of hazard shapes.Further, we note the increasing-decreasing-increasing shape, which is important because there are fewer lifetime distributions that yield complicated shapes similar to this.Further, we require a mixture of two or more lifetime distributions to model this shape.Some data sets present this behavior, usually in regression study, when we have different levels of explanatory variables.Thus, probably, this model will have a better fit to certain data sets than other known models, as we show in the Application Section.
The new distribution is simulated as follows: if U is a uniform random variable (0, 1), then has the density in Equation (11), where Q BS (•) is the BS quantile function (qf) and

Inference and Estimation
In this section, we discuss the maximum likelihood method for inference and estimation of the model parameters of the OLLBSP distribution.Let x 1 , • • • , x n be a random sample of size n from the OLLBSP(α, β, a, b) distribution.The log-likelihood function for the vector of parameters Maximization of Equation ( 15) can be performed using well-established routines such as nlm or optimize in the R statistical package.Setting these equations to zero, U α (θ) = 0, U β (θ) = 0, U a (θ) = 0, U b (θ) = 0 and solving them simultaneously gives the MLE θ of θ.These equations cannot be solved analytically and statistical or mathematical software can be used to evaluate them numerically using iterative techniques such as the Newton-Raphson algorithm.
For interval estimation and tests of hypotheses on the parameters in θ, we require the 4 × 4 observed information matrix J = J(θ) = {j r,s }, whose elements j r,s (for r, s = α, β, a, b) can be evaluated numerically.Under standard regularity conditions, the approximate multivariate normal N 4 (0, J( θ) −1 )  distribution can be used to construct approximate confidence intervals for the model parameters.
The likelihood ratio (LR) statistics are useful for comparing the new distribution with some special models.For example, we may use the LR statistic to check if the fit using the OLLBSP distribution is statistically "superior" to the fit using the BS distribution for a given data set.
We assess the finite sample performance of the MLEs in the OLLBSP distribution by varying the sample size n using the inverse method with Equation (13).The sample sizes are taken as n ∈ {50, 100, 200, 300}.We perform a Monte Carlo simulation study with 1000 replications under three setups for parameters of the OLLBSP distribution: a We conclude from the figures in Table 1 that the average estimates of the parameters tend to be closer to the true parameters when n increases.This fact supports that the asymptotic normal distribution provides an adequate approximation to the finite sample distribution of the MLEs.The normal approximation can be oftentimes improved by using bias adjustments to these estimators.Approximations to the their biases in simple models may be determined analytically.Bias correction typically does a very good job for correcting the MLEs.However, it may also increase the mean squared errors (MSEs).Whether bias correction is useful in practice depends basically on the shape of the bias function and on the variance of the MLE.

Linear Representation
Expansions for Equations ( 10) and ( 11) can be derived using the concept of exponentiated distributions to obtain some structural properties of the OLLBSP distribution.Cordeiro et al. [2] defined the exponentiated Birnbaum-Saunders (EBS) distribution with positive parameters α, β and c, say Y ∼ EBS(a, b, c), if its cdf and pdf are given by H(y; a, b, c) = Φ(ν) c and h(y; a, b, c) = c g a,b (y) Φ(ν) c−1 , respectively, where Φ(ν) is defined in Equation ( 2) and g a,b (y) is the BS(a, b) pdf.The properties of some exponentiated distributions have been studied by several authors (see, for example, Mudholkar and Srivastava [18] for exponentiated Weibull (EW) and [19] for the exponentiated generalized gamma (EGG) distributions).
By using Taylor expansion, we can write Equation (10) as For p real non-integer and |z| < 1, the convergent power series holds, where the binomial coefficient is defined for any real.By applying this equation twice to ]} iα and interchanging sums, we obtain the convergent power series (for α > 0) where (for k ≥ 0) By using the binomial expansion and Equation ( 17), we obtain (for i ≥ 1) where d r,j (i, α) = (−1) j ( i r ) ( rα j ).Further, using Equation (18) in the two powers of the right-hand side and interchanging sums, we have Hence, from Equations ( 18) and ( 20), we obtain the ratio of two convergent power series where the coefficient c k (i, α) can be determined recursively as Then, we can write where (for k ≥ 0) and H k (x) = Φ(ν) k denotes the EBS cdf with parameters α, β and k.
Consequently, the pdf of X can be expressed as where h k+1 (x) denotes the EBS(k + 1, a, b) density function.Based on Equation ( 23), some mathematical properties of the OLLBSP distribution can be obtained by knowing those of the EBS distribution.

Moments
The ordinary moments of X can be derived from the probability weighted moments (PWMs) ( [20]) of the BS distribution formally defined for p and r non-negative integers by The integral in Equation ( 24) can be computed numerically in software such as MAPLE, MATLAB, MATHEMATICA, Ox and R. Cordeiro and Lemonte [21] derived an expression for τ p,r given by where ] −1 and I(p + (2s j + j − 2m)/2, α) is obtained from Equation (4).These algebraic platforms currently have the ability to deal with analytic expressions of formidable size and complexity.
The sth moment of X can be expressed from Equation (23) as where τ s,k is given by Equation (25).Equation ( 26) can be computed numerically in any symbolic mathematical software by taking in the sum a large number of summands instead of infinity.The central moments (µ r ) and cumulants (κ r ) of , where κ 1 = µ 1 .Plots of the skewness and kurtosis of X, for fixed values of a, b and β, as functions of α are displayed in Figure 3a,b, respectively.Plots of the skewness of X for a, b and α fixed, as functions of β are displayed in Figure 3c,d, respectively.We can note that it is possible to obtain negative skewness for some parameter values.

The LOLLBSP Regression Model with Censored Data
If X is a random variable following the OLLBSP density function in Equation ( 11), the random variable Y = log(X) has the LOLLBSP distribution (also referred to as the odd log-logistic sinh-normal Poisson distribution).The density function of Y obtained by replacing b = e µ and performing two re-parameterizations reduces to (for y ∈ R) where The parameter µ ∈ R is a location parameter, σ > 0 is a scale parameter and β, α and a are positive shape parameters.We refer to Equation ( 27) as the non-standard LOLLBSP model.Thus, The survival function of Y is given by The pdf of the standardized random variable Z = (Y − µ)/σ is given by Plots of the density function in Equation ( 27) for selected parameter values are given in Figure 4.These plots show great flexibility for different values of the shape parameters α and β.They indicate that the density function in Equation ( 27) is very flexible, especially for modeling bimodal and asymmetric data and hence can be used in several applications.In many practical applications, the lifetimes are affected by explanatory variables and parametric regression models are adopted to estimate univariate survival functions for censored data.A parametric model that provides a good fit to lifetime data tends to yield more precise estimates of the quantities of interest.Based on the standardized LOLLBSP density, we propose a linear location-scale regression model linking the response variable y i and the explanatory variable vector where the random error z i has the density function in Equation (29), τ = (τ 1 , • • • , τ p ) T , a > 0, α > 0, β > 0 and σ > 0 are unknown parameters.The parameter Consider a sample (y 1 , v 1 ), • • • , (y n , v n ) of n independent observations, where each random response is defined by y i = min{log(t i ), log(c i )}.We assume non-informative censoring such that the observed lifetimes and censoring times are independent.Let F and C be the sets of individuals for which y i is the log-lifetime or log-censoring, respectively.Conventional likelihood estimation techniques can be applied here.The log-likelihood function for the vector of parameters θ = (α, β, a, τ T , σ) T from the model in Equation (30) has the form l (27) and S(y i ) is the survival function in Equation ( 28) of Y i .The total log-likelihood function for θ reduces to where q is the observed number of failures and for i = 1, . . ., n.The MLE θ of the vector of unknown parameters can be calculated by maximizing the log-likelihood of Equation (31).We use the NLMixed procedure in SAS to calculate the estimate θ.Initial values for a are taken from the fit of the LBS regression model with α = 1.From the fitted LOLLBSP model, we obtain the estimated survival function for y i given by where ξi2 = sinh Under standard regularity, the asymptotic distribution of ( θ − θ) is multivariate normal N p+4 (0, K(θ) −1 ), where K(θ) is the expected information matrix.The asymptotic covariance matrix K(θ) −1 of θ can be approximated by the inverse of the (p + 4) × (p + 4) observed information matrix − L(θ).The approximate multivariate normal distribution N p+4 (0, − L(θ) −1 ) for θ can be used in the classical way to construct approximate confidence regions for some parameters in θ.

Applications
In this section, we present two applications to prove empirically the flexibility of the OLLBSP distribution.We compare the results of the fits of the OLLBSP model and other models under some goodness-of-fit statistics.In fact, we compare the fits of the OLLBSP distribution with six other lifetime models, namely: 1.
The MLEs of the model parameters are computed and the goodness-of-fit statistics for these models fitted to the data sets are compared.These estimates are evaluated using the BFGS script as well as the goodness-of-fit measures.The MLEs of the parameters and these measures including the Cramér-von Mises (W * ) and Anderson-Darling (A * ) statistics defined by Chen and Balakrishnan [25] and Kolmogorov-Smirnov (K-S) statistics are calculated to compare the fitted models.In general, the smaller the values of these statistics, the better is the fit to the data.All computations are performed using the R statistical software [26].

Data Set 1: Breaking Stress of Carbon Fibres (In Gba)
As a first application, we consider the data set consisting of n = 100 observations on breaking stress of carbon fibers (in Gba) reported by [27].A brief summary of these data are: mean = 2.621, median = 2.700, standard deviation = 1.013885, skewness = 0.36264, kurtosis = 0.0431, minimum = 0.390, and maximum = 5.560.The data have positive skewness and kurtosis.These values indicate that the empirical distribution is skewed to the right and leptokurtic.
Table 2 lists the MLEs of the parameters and their standard errors (in parentheses) from the fitted models and the measures W * , A * , K-S and its p-value.The results indicate that the OLLBSP model has the smallest values of these statistics and large p-value for the K-S distance among the fitted models, and therefore it could be chosen as the best model to the current data.Plots of the estimated pdf and the empirical and estimated cdf of the OLLBSP model are displayed in Figure 5.It is clear from these plots that the OLLBSP model provides the best fit to the histogram of the breaking stress data.

Data Set 2: Ambient Temperature Data
The data set consists of failure times (X) of eight components at three different temperatures.The data are obtained from [28].The original sample size is n = 24.The following variables are associated with each component: x i , observed time (in years); and v i1 , temperature (temperature category: 100, 120 and 140), for i = 1, 2, . . ., 24.We analyze these data using the LOLLBSP regression model.First, we consider the regression model where the random errors z i 's (i = 1, . . ., n) are independent random variables having the density function in Equation (29) and y i = log(x i ) for i = 1, . . ., 24.These data were originally analyzed by Cordeiro et al. [2] using the log-McDonald-Birnbaum-Saunders (LMcBS) regression model.The LMcBS distributions includes as special cases two important distributions: the log-beta Birnbaum-Saunders (LBBS) model and the log-Birnbaum-Saunders (LBS) model.Next, we compare the proposed regression model with other known regression models.The values of the AIC (Akaike Information Criterion), CAIC (Corrected Akaike's Information Criterion) and BIC (Bayesian Information Criterion) statistics to compare the LOLLBSP, LMcBS, LBBS and LBS regression models are listed in Table 3.The LOLLBSP regression model outperforms the LMcBS, LBBS and LBS models, irrespective of the criteria and thus the proposed regression model can be used effectively in the analysis of these data.Table 4 lists the MLEs of the parameters for these regression models fitted to the current data using the NLMixed procedure in SAS.Iterative maximization of the logarithm of the likelihood function in Equation (31) starts with initial values for τ and σ taken from the fit of the LBS regression.We note from the fitted LOLLBSP regression model that the explanatory variable v 1 is significant at 1% and that there is a significant difference among the levels of the temperature for the failure times.A graphical comparison among the levels of the explanatory variable v 1 is given in Figure 6a.These plots provide the Kaplan-Meier (KM) estimate and the estimated survival function of the LOLLBSP regression model given by Equation (32) for different levels of v 1 .The plot of the LOLLBSP hrf in Figure 6b reveals that it has an increasing-decreasing-increasing shape.As discussed in Section 2, the OLLBSP distribution is adequate for this type of hrf (see Figure 6c).Based on these plots, it is clear that the LOLLBSP regression model gives a superior fit to the ambient temperature data.
is a known model matrix.The LOLLBSP regression model in Equation (30) opens new possibilities for fitting many different types of data.

Figure 5 .
Figure 5. (a) Estimated densities for some fitted models; (b) Estimated cdfs for some fitted models.

Figure 6 .
Figure 6.(a) Estimated survival function for the LOLLBSP regression model and the KM estimate; and (b) Estimated hrf of the LOLLBSP regression model. )

Table 2 .
MLEs of the parameters for the fitted models to the breaking stress of carbon fibres data.

Table 4 .
MLEs, standard errors (SEs) and[p-values]for the LOLLBSP regression model fitted to the ambient temperature data.