Statistical Inference for Progressive Stress Accelerated Life Testing with Birnbaum-Saunders Distribution

As a result of the two-parameter Birnbaum–Saunders (BS) distribution being successful in modelling fatigue failure times, several extensions of this model have been explored from different aspects. In this article, we consider a progressive stress accelerated life testing for the BS model to introduce a generalized Birnbaum–Saunders (we call it Type-II GBS) distribution on the lifetime of products in the test. We outline some interesting properties of this highly flexible distribution, present the Fisher’s information in the maximum likelihood estimation method, and propose a new Bayesian approach for inference. Simulation studies are carried out to assess the performance of the methods under various settings of parameter values and sample sizes. Real data are analyzed for illustrative purposes to demonstrate the efficiency and accuracy of the proposed Bayesian method over the likelihood-based procedure.


Introduction
The Birnbaum-Saunders (BS) model is based on a physical argument of cumulative damage that produces fatigue in materials, which is exerted by cyclical stress.The failure follows from the development and growth of a dominant crack in the material.Considering the basic characteristics of the fatigue process, Ref. [1] derived the BS(α, β) distribution function of the failure time T expressed by: where α > 0, β > 0 are shape and scale parameters and Φ(•) is the distribution function of the standard normal variate.The BS distribution can be widely applied to describe fatigue life and lifetimes in general.Its field of application has been extended beyond the original context of material fatigue, and the model becomes the most versatile within the popular distributions for failure times.Over the years, various approaches of parameter inference, generalizations, and applications of the distribution have been introduced and developed by many authors (see [2][3][4][5][6][7][8], just to name a few).
In recent years, more research work has lied in the study of accelerated life testing (ALT) with the BS distribution.In industrial experiments, it is often very costly and time consuming to obtain information about the lifetime of highly reliable products under normal experimental conditions.To collect failure data rapidly and improve the efficiency of the experiment, people commonly apply ALT where products or materials are subjected to elevated stress conditions compared to those normally applied in practice.These stresses often include temperature, voltage, vibration, cycling rate, pressure, etc., which can be applied in mainly three ways: constant stress, step stress, and progressive stress.
Many authors have contributed extensively to the development and parameter inference on ALT for various lifetime distributions; see, for example, [9][10][11][12][13].An ALT on the BS distribution was considered in [14], who developed the model under the inverse power law accelerated form and explored the inference procedure based on the lifetime data collected under several elevated stress levels.The work in [15] presented an inference approach for the BS distribution in step-stress accelerated life testing (SSALT) with the Type-II censoring scheme.In practice, to make a more efficient test plan for collecting the lifetime of highly reliable products, people usually resort to an ALT by a progressive stress, and we will consider such a model for the BS distribution.

Progressive Stress Accelerated Life Testing
We briefly introduce the SSALT first and then extend it to the progressive stress model for the BS distribution.

Step-Stress Test
Suppose that, for a particular pattern of SSALT with I steps in total, step i runs at stress level V i , starts at time t i−1 , and runs to time t i (t 0 = 0), i = 1, 2, ..., I.The life distribution for specimens at the stress level V i is assumed as BS(α, β i ), whose distribution function where α is the common shape parameter, and the stress V i only influences the scale parameter β i through the inverse power law, i.e., with a standard level of stress V 0 > 0 and the power p > 0 being only associated with the characteristics of products.Accordingly, based on the principle of cumulative exposure (CE) [16], the population cumulative fraction of specimens failing in step i is: where the equivalent start time s i−1 at step i satisfies: By using the recursive equation above with the time duration being ∆ i = t i − t i−1 at the i th step, the population fraction having failed in (2) over the exposure time where the "cumulative exposure" u(t I ) is the sum of the "scaled" time experienced at each step given by: As all time lengths ∆ i → 0, the step-stress levels V i become a progressive or varying stress V(t) continuously over time in ALT, and as a result, the scale parameter β(V) = (V 0 /V) p is a function of time, namely β(t), and the corresponding cumulative exposure u(t) (or damage), which appears as a sum in (6) for a step-stress testing, becomes the integral u(t) = t 0 1/β(x)dx.Specifically, here, we consider a ramp stress V(t) = Rt with R > 0 being the rate of rise of stress, and then, the scale parameter becomes β(t) = [V 0 /V(t)] p = V p 0 /(R p t p ).The corresponding cumulative exposure time of products at time t is: and the lifetime distribution of products at the progressive stress V(t) becomes: where: The distribution in ( 8) is a generalization of the Birnbaum-Saunders distribution (GBS) first proposed in [17] by allowing the exponent (1/2 on the BS) to take on other values.Recently, considering the original BS distribution derived from a homogeneous Poisson process, [18] obtained the same GBS depending on a non-homogeneous Poisson process.To distinguish from the GBS developed in [19], we refer to the distribution as Type-II GBS, denoted as GBS-II(m, α, β), whose density function is given by: where m > 0 is another shape-type parameter and φ(•) is the density function of the standard normal distribution.As the BS distribution, the transformation Z = ((T/β) m − (β/T) m )/α is a standard normal variate, and thus, it is easy to generate the GBS-II random variates from the standard normal variables.In the following, we briefly summarize the main properties of the distribution.

Properties of GBS-II
The three-parameter GBS-II distribution in (10) is a flexible family of distributions, and the shape of the density widely varies with different values of the parameters.Specifically (the detailed proof is long, and so omitted here): (i) when 0 < m ≤ 1, the density is unimodal or an upside-down bathtub; (ii) when m > 1 and α 2 ≤ m/(m − 1), the density is also unimodal (upside-down); (iii) if m > 1 and α 2 > m/(m − 1), then the density is either unimodal or bimodal.Figure 1 shows various graphs of the density function for different values of m and α with the scale parameter β fixed at unity.Additionally, Figure 2 presents the failure rate function given by λ(t) = f (t)/(1 − F(t)) with various values of m and α, showing that it could have an increasing, decreasing, bathtub, and upside-down bathtub shape.Hence, it seems that the distribution is flexible enough to model various situations of product life.
The GBS-II has some interesting properties as the BS distribution.For example, β remains the median for the distribution.The reciprocal property is also preserved by that if T ∼ GBS-II(m, α, β), then T −1 ∼ GBS-II(m, α, β −1 ).In fact, the GBS-II generally describes the distribution family of power transformations for a BS random variable from the following: if T ∼ BS(α, β), then for any nonzero real-valued constant r, T r ∼ GBS-II(0.5|r|−1 , α, β r ), where | • | represents the absolute value function.Conversely, given T ∼ GBS-II(m, α, β), then T 2m ∼ BS(α, β 2m ).In addition, similar to the BS distribution, which can be written as an equal mixture of an inverse normal distribution and a distribution of the reciprocal of an inverse normal random variable [20], the GBS-II distribution can be also expressed as a mixture of power inverse normal-type distributions [21].
In the aspect of numeric characteristics for GBS-II, generally, there is no analytic form of moments except for some special cases.For example, from the fact that T m = β m αZ + √ α 2 Z 2 + 4 /2 with a standard normal variate Z, one may easily obtain the expressions of the moments E(T km ) with an even number k, such as E(T 2m ) = β 2m (1 + α 2 /2), E(T 4m ) = β 4m (1 + 2α 2 + 11α 4 /4), etc.The general moment expression was obtained in [22], who used the relationship between the GBS-II and three-parameter sinh-normal distribution (see [23]): where r is a real number and K ω (x) is an order ω modified Bessel function of the third kind, which can be expressed in an integral form as K ω (x) = ∞ 0 exp{−x cosh(s)} cosh(ωs)ds with the hyperbolic cosine function cosh(x) = [exp(x) + exp(−x)]/2.Numerous software packages can be used to evaluate K ω (x) for specific values of ω and x for calculating the moments such as the mean and variance.The finite moments guarantee the rationality of the moment-based estimation methods.So far, little research has been seen to study the parameter estimation and inference for the GBS-II distribution.The work in [21] discussed the maximum likelihood (ML) estimation of parameters and provided interval estimation based on an "observed" Fisher's information.To contribute a novel research of precise inference, in this article, we further investigate the ML method to obtain an analytic expression of Fisher's information and propose a new Bayesian approach for parameter inference.The rest of the article is arranged as follows.Section 3 presents the methodology of the estimation procedure.Subsequently, we carry out simulation studies to investigate the performance of proposed methods in Section 4. For illustrative purpose, two real datasets are analyzed in Section 5, followed by some concluding remarks in Section 6.

Estimations
Let t = (t 1 , t 2 , ..., t n ) be n observational data from the GBS-II distribution.We first consider the likelihood-based approach in the following.

Likelihood-Based Method
To make the notation simple, let ε(t) = (t/β) m − (β/t) m , δ(t) = (t/β) m + (β/t) m .Then, the likelihood and log-likelihood functions, up to a constant, are given by: and we have the first partial derivatives w.r.t the parameters: Due to the complexity of the expression above, a numerical method has to be applied to obtain maximum likelihood estimates (MLEs) m, α, β by solving the equations ∂ /∂m = 0, ∂ /∂α = 0 and ∂ /∂β = 0 simultaneously.The work in [21] used the "observed" Fisher's information matrix to obtain a large sample-based interval estimation for the parameters.However, our study shows that a "partially" analytic form of it can be obtained, so that more precise inference can be made.Let the Fisher's information matrix be the following form: whose elements are the negative expectation of the second partial derivatives of parameters for the log-likelihood function in (13), such as u αα = −E(∂ 2 /∂α 2 ), u αβ = −E(∂ 2 /∂α∂β), etc.These elements are given by (the detailed derivations are provided in Appendix A): where Z is the standard normal variate, Therefore, the asymptotic normal distribution of MLEs has the following covariance matrix: with c = u mm u αα − u 2 mα .Notice from ( 21) that the MLEs m and β, α, and β are asymptotically independent.It is also clear that when m = 1/2, the GBS-II reduces to the BS distribution whose Fisher's information is the lower 2 × 2 matrix in (16) with u ββ = n α 2 /2 − αh(α) + 1 /(αβ) 2 , the same as the one provided in [4].
By the fact of positive parameters and the asymptotic normality of MLEs, we may use log transformation to obtain approximate confidence intervals (CIs) for the parameters [24].In particular, for parameter m and its MLE m, we have the approximate normal distribution log( m) ∼ N(log(m), Var(log( m))), where the variance can be approximated by the delta method as Var(log( m)) = Var( m)/ m2 = ûαα /( ĉ m2 ) with ûαα , ĉ being the values of u αα and c, respectively, evaluated at MLEs m, α, β.Then, a (1 − γ)100% CI for m is then given by: where z γ/2 is the upper 100 × γ/2 th percentile of the standard normal distribution.The normal-approximated CIs for α and β can be constructed in the same way to have the form in (22) with m being replaced by α and β, respectively, and Var( α) = ûmm / ĉ, Var( β) = 1/ ûββ .It should be noted that these intervals can exhibit less accurate coverage for small samples.

Bayesian Inference
Since there are no tractable forms of MLEs and CIs, this is not very applicable and efficient by the use of the ML estimation method.We propose a Bayesian inference approach as an alternative.From the model development of the GBS-II distribution in [18], we notice that the parameter m is independent of the other two parameters α and β, which have a relation α 2 ∝ β, and so, we propose a joint prior π(m, α, β) = π(m)π(α|β)π(β) with the conditional prior mean E(α 2 |β) ∝ β.From the likelihood function in (12), it is easily discernible that for α 2 , an inverse gamma is a conjugate prior for the conditional likelihood L(α|m, β, t).Therefore, we specify that α 2 |β ∼ IG(a 0 /2, a 1 β/2), and thus, the prior density of α|β is: with the hyperparameters a 0 > 4 and a 1 > 0 to ensure the existence of the variance.It is also clear that there is no conjugate prior for m and β.However, we may consider a prior distribution that has a similar functional form as their conditional likelihood functions.In this case, we pick both priors of m and β to be a gamma distribution, The hyperparameter values can be specified based on the following factors: (i) Since β is the median of the GBS-II distribution, we can refer to the sample median as the estimate β = median(t 1 , t 2 , ..., t n ) in our attempt to specify b 0 and b 1 .Further, by the fact that and by the principle of method of moments, we establish two equations by equating the first two sample and theoretical moments (β is estimated by β), , to obtain the roots m and α for specifying a 0 , a 1 , d 0 and d 1 .(ii) The MLEs m, α, β can also be used to determine the hyperparameters.The non-informative distribution with a i = 0, b i = 0, d i = 0, i = 0, 1 can be chosen if no prior knowledge is available.The joint posterior distribution of the parameters (m, α, β) with the sample data t is given by: It follows that the full conditional posteriors are: We implement a Markov chain Monte Carlo (MCMC) algorithm, specifically here a Gibbs sampling procedure (see, for example, Casella and George [25]), to draw posterior samples from their full conditional posterior distributions.First, we take MLEs of m, α and β as the initial values to make the algorithm converge more quickly and then repeat the following steps M times, among which, given the values at the k th iteration, the (k + 1) th iteration is as follows: i. Draw m k+1 from π(m|α k , β k , t) in ( 26) using a Metropolis-Hastings (MH) procedure (see, for example, Chib and Greenberg [26]).We first propose m p ∼ Gamma(c m m k , c m ), where the mean of the proposal is m k and c m is a tuning parameter to make the algorithm efficient, and then take m k+1 = m p with probability: with the Gamma proposal density function q m (•), and so: ii. Draw α 2 k+1 ∼ IG (ν 0 /2, ν 1 /2) with updated ν 0 and ν 1 in (27).iii.Draw β k+1 from π(β|m k+1 , α k+1 , t) in ( 28) using an MH procedure.We propose β p from a lognormal distribution centered at the previous value, i.e., log , where c β > 0 is a tuning parameter.For the purpose of sampling efficiency, we intend to specify a proposal distribution, which closely resembles the conditional posterior of β.This consideration prompts us to specify the variance term σ 2 k , evaluated at updated values (m k+1 , α k+1 , β k ), as the reciprocal of Fisher information of the conditional posterior of log β, whose log-likelihood is log π(β|m k+1 , α k+1 , t) + log |J|.The Jacobian term J = β is needed here as we make a log transformation on β.

Simulation Study
We conduct a simulation study to assess the performance of parameter estimation by ML and the Bayesian methods, where we fix the scale parameter β = 1 and take four settings of the other two as (m, α) = (0.25, 0.50), (0.50, 0.50), (1.00, 1.00), (1.50, 2.00).We generate 1000 datasets for each of these parameter settings with three sample sizes n = 20, 30, 50.For the Bayesian analysis, we choose the hyperparameter values such that the prior distributions are rather "flat" or "less informative" to reflect little prior knowledge about the parameters.With each simulated data, we find that the rule of thumb c m = 15, c β = 2.4 as the tuning parameter values are adequate to ensure the acceptance rates to hover around 35-40%, and we run five MCMC chains with fairly different initial values and each with a burn-in period of 2000 followed by 8000 iterations.
The scale reduction factor estimate RF = Var(θ)/W is used to monitor the convergence of MCMC simulations [27], where θ is the estimand of the parameter of interest, Var(θ) = (N − 1)W/N + B/N, with the iteration number N for each chain and the between-and within-sequence variances B and W. The scale factors for the sequences of m, α, and β are within 1.00-1.02for all five MCMC chains, indicating their convergence.The remaining 8000 samples are used to compute the average biases, mean squared error (MSE) of the estimates, average lengths (AL) of the 95% credible intervals (CI), and coverage probability (CP) for the parameters.The results are shown in Table 1 along with these estimates from the ML method for comparison purposes.The main features are summarized as follows: (i) as expected, the bias of estimates, MSE, and AL of 95% CI decrease, and CP is closer to the nominal level as sample size n increases for all cases; (ii) the estimation of all parameters from the Bayesian method is much better than from the ML approach in terms of smaller biases and MSEs, narrower CIs and higher CPs.The MLE α dose not perform well for the small to moderate sample sizes (n = 20, 30).(iii) comparatively, both methods produce much more accurate estimation of β, less precise for m, and least for α.With the larger sample size (n = 50), the performances of the β estimates are similar for both methods; (iv) it seems that the estimate of α has a smaller MSE under a smaller true value of α, whereas the estimate of m has a smaller MSE under a larger true m.Overall, the Bayesian inference method outperforms the ML approach for all parameter settings, especially when the sample size is small.

Real Data Analysis
To further illustrate the usefulness of our method in parameter inference for the GBS-II, we consider a real data example given by [28] about the breakdown time in an accelerated test that employed a pair of parallel disk electrodes immersed in an insulating oil.Voltage V across the pair was increased linearly with time t at a specified rate R, and breakdown time was recorded at a one square inch electrode.The data are presented in Table 2, consisting of 60 measured breakdown times (seconds).Fitting the GBS-II distribution with the Bayesian method, we choose the hyperparameters b 0 and b 1 such that the prior mean of β is close to the sample median 4.3 of the data, a 0 and a 1 such that the conditional prior mean E(α 2 |β = 4.3) is close to the MLE α 2 = 1.3656, and the CV (coefficient of variation) of the gamma or inverse gamma priors is close to the CV of standard uniform distribution (1/ √ 3).Hence, we have the hyperparameter values as follows: We run a chain of 20,000 iterations with a burn-in period of 5000.To reduce the correlation among the samples, every fifth sample of the remaining 15,000 samples is used for posterior inference.The results are tabulated in Table 3, where, due to the relatively large sample size (n = 60), the point estimates obtained by both ML and Bayesian methods are close to each other.However, the 95% CIs of the Bayesian method are narrower than the ones of the ML approach, especially for the intervals of m and α.Additionally, the Chi-squared goodness of fit statistic and BIC values of model fitting by the Bayesian method are smaller than these by the ML method, indicating the greater accuracy in the Bayesian analysis.Based on the estimates of m and β and the given standard level of stress V 0 = 42.30,together with the relation in ( 9), the power p in the inverse power law and the rate of rise of voltage R are computed, respectively from ML and Bayesian approaches, as p = 8.9455, 8.4045 and R = 11.0734,10.1067.Our results (especially from the Bayesian) are close to the estimates obtained by [28], who fitted a Weibull distribution for these data.For graphical comparison, Figure 3 shows the histogram of the data and fitted GBS-II density curves estimated by both methods.[4].Then, the elements of Fisher's information in (16) are: g 2 (Z) ) ) with c = u mm u αα − u 2 mα .When α is small, some approximations can be implemented for the following terms:

Table 1 .
GBS-II estimation results for simulated data.AL, average length; CP, coverage probability.

Table 2 .
Oil breakdown time (seconds) in an accelerated test employed in an insulating oil.