Load-Sharing Model under Lindley Distribution and Its Parameter Estimation Using the Expectation-Maximization Algorithm

A load-sharing system is defined as a parallel system whose load will be redistributed to its surviving components as each of the components fails in the system. Our focus is on making statistical inference of the parameters associated with the lifetime distribution of each component in the system. In this paper, we introduce a methodology which integrates the conventional procedure under the assumption of the load-sharing system being made up of fundamental hypothetical latent random variables. We then develop an expectation maximization algorithm for performing the maximum likelihood estimation of the system with Lindley-distributed component lifetimes. We adopt several standard simulation techniques to compare the performance of the proposed methodology with the Newton–Raphson-type algorithm for the maximum likelihood estimate of the parameter. Numerical results indicate that the proposed method is more effective by consistently reaching a global maximum.


Introduction
We consider a load-sharing system which consists of several components associated in parallel. As each of the components fails one by one in the system, the surviving components will share the load with the remaining surviving components.
The load-sharing system has been studied since the 1940s [1]. Since then, it is widely used in many engineering applications [2][3][4]. The studies in [5,6] drew, in particular, load-sharing parameter inferences in a system with each component with a constant failure rate. Singh et al. [7] presented a more generalized load-sharing system under an assumption of a constant hazard at the beginning and a linearly increasing hazard rate after the failure of several components. Wang et al. [8] studied a more generalized load-sharing system with the work history or memory.
Park [9,10] considered the maximum likelihood estimate (MLE) of the parameters in the system, whose component lifetimes are distributed as exponential, Weibull, normal, and lognormal. Singh et al. [7] analyzed the discrete-type multi-component load-sharing parallel system and later on Deshpandé et al. [11] presented the nonparametric-type load-sharing system. Singh and Gupta [12,13] and Singh et al. [14] investigated the load-sharing system model, in which the failure times of each component can be either constant or time-dependent. It deserves mentioning that Bayesian analyses of the load-sharing systems with several well-known distributions of lifetime components have also been investigated in the literature [15][16][17]. For instance, Xu et al. [15] studied a test scheme for reliability demonstration test on load-sharing systems with Weibull-distributed (including exponential) lifetime components. More recently, Singh and Goel [16] studied a multi-component parallel system model by assuming a load-sharing tendency on its components. Xu et al. [17] proposed the expectation-maximization (EM) algorithm for estimating the parameters in the load-sharing system with continuous degradation.
It is worthwhile to mention that based on accelerated testing models, the MLE method for the parameter estimation of the load-sharing systems under the classical latent variable approach is not well studied when the normal and lognormal are assumed to be the lifetime distribution of components. As mentioned in [5], a direct generalization of the load-sharing system with these lifetime distributions is usually problematic when making likelihood-based parametric inference, because of following reasons: (a) it may be impossible or extremely difficult to obtain explicit MLEs, (b) standard numerical methods such as the Newton-Raphson-type method can perform poorly when the dimension of parameters is high, or (c) if the likelihood function is not sharp enough, then the standard numerical methods may fail to reach the global maximizer.
In this paper, we extend the load-sharing system with Lindley-distributed component lifetimes. Since the Lindley distribution was proposed by [18] as a counterexample of fiducial statistics, it has been widely used in reliability engineering due to its non-constant hazard rate in comparison to the exponential distribution. A natural question that arises is about making statistical inference of the parameters in this system. As mentioned above, the Newton-Raphson-type method can be sensitive to starting values so that it often fails to reach a correct solution especially for high-dimensional parameters. An alternative is to consider the EM algorithm particularly in engineering applications. For example, Albert and Baxter [19] derived the EM sequences with several causes, censoring, partial masking and right-censoring for the exponential model. Park [20] generalized their EM-type results in the system with different multi-modal strength distributions. Amari et al. [21] used the EM method to evaluate incomplete warranty data. Recently, Park [22] analyzed interval-data using the quantile implementation of the EM algorithm which improves the Monte Carlo EM (MCEM) algorithm's convergence and stability properties and Ouyang [23] considered an interval-based model for analyzing incomplete data in a different viewpoint. These observations motivate us to develop an EM-type MLE for the parameters in the considered system to overcome the possible difficulties mentioned above when using the standard numerical methods.
The rest of the paper is as follows. In Section 2, we discuss the construction of the likelihood of the system with Lindley-distributed component lifetimes. In Section 3, we develop an EM algorithm for performing the MLE of the system under consideration. In Section 4, we conduct numerical studies to investigate the performance of the proposed EM-type MLE method, and some concluding remarks are given in Section 5.

Likelihood Construction
In this section, we briefly overview the construction of the likelihood function for the load-sharing system in Section 2.1 and then focus on the working likelihood in the load-sharing system with Lindley-distributed component lifetimes in Section 2.2.

Working Likelihood for the Load-Sharing Model
We here provide a brief description of the likelihood construction needed for a load-sharing framework and refer the interested readers to [9,10] for details on various types of load-sharing models. We assume that in a load-sharing parallel system consisting of J components, the failure rates of the remaining surviving components could change and the lifetimes of the surviving components at each failure step are independent and identically distributed (iid). However, since each of the components fails one by one (without being replaced or repaired) in the system, the parameter of the remaining surviving components should change from θ (0) to θ (1) after the occurrence of the first failure. Similarly, it also changes from θ (j−1) to θ (j) after the occurrence of the j-th failure. Let Y (j) be the lifetime between the failures of the j-th and (j + 1)-st for j = 0, 1, . . . , J − 1. We will observe the value of Y (j) after the failure of the j-th component. For notational simplicity, we assume that the component fails in the order of the J-th, the (J − 1)-st, etc.
When analyzing the lifetime distribution in a load-sharing system, standard methods are developed based on the use of hypothetical latent random variables. After the occurrence of the j-th failure, there are J − j surviving components in the system. These surviving components are denoted by U J−j , respectively. Then, we follow the traditional latent variable approach and model For a parametric model, each lifetime distribution usually includes the unknown parameters. We shall be particularly interested in obtaining the MLEs of the unknown parameters in the system. Given that it is often impossible to obtain the analytic-form MLE of the parameters, we often employ some reliable numerical procedures for obtaining the MLEs. Of particular note is that many lifetimes of the non-observed components can be considered to be right-censoring. For more details, the reader is directed to Equation (1.1) in [24], Section 5.3.1 in [25], Example 6.30 in [26], and the EM algorithm with right-censoring in Park [20]. This observation motivates us to adopt a similar method for constructing the likelihood of the load-sharing system as follows.
We assume that for a random sample of size n, the i-th measurement can be categorized as below Let the realization of the random variables Y i . By using the formulation in [9,10], we can construct the observed likelihood function from the n observations as follows Under the equal load-sharing rule, we consider the parameter estimates for the system with Lindley-distributed component lifetimes. For the notion of completeness, we here review the likelihood construction provided in [10]. We assume that for a specific value of j, U (j) are iid for = 1, 2, . . . , J − j.
The cumulative distribution function (cdf) of Y (j) can then be defined as where S U (y) be the hazard function. The probability density function (pdf) of Y (j) can be easily obtained by differentiating Equation (2), and it is given by Then the likelihood and log-likelihood functions for the n observations are, respectively, given by

Likelihood Construction with the Lindley Distribution
We focus on the load-sharing system with Lindley-distributed component lifetimes. The pdf and cdf of the Lindley distribution with the parameter θ > 0 are given by respectively, for x > 0. Then the survival and hazard functions of the Lindley distribution are given by , respectively. Based on the Lindley distribution above, we have the survival and hazard functions of U Thus, using Equations (3) and (7), it follows that Substitution of Equations (8) into (5) leads to To the best of our knowledge, it is actually impossible to derive an analytic-expression of the MLE of the likelihood function in Equation (9). Thus, appropriate numerical procedures are usually required to obtain the MLE under this scenario, whereas, as afore-mentioned, the conventional Newton-Raphson-type method is very sensitive to starting values and it could often fail to reach a correct solution especially for high-dimensional parameters. Given that the likelihood in Equation (9) can be easily over-parameterized, we recommend the use of an EM algorithm to overcome these difficulties in maximizing the likelihood in Equation (4) discussed as follows.

The Proposed EM Algorithm
We first briefly discuss how to implement an EM algorithm based on the complete-data likelihood function in Section 3.1 and develop the EM-type MLEs for the load-sharing system with Lindley-distributed component lifetimes in Section 3.2.

An EM Algorithm with the Load-Sharing Model
The EM algorithm [24] is a numerical iterative algorithm to obtain the MLE of parametric models. The EM algorithm consists of two steps which are (i) an expectation step (E-step) and (ii) a maximization step (M-step). It is often employed to solve a difficult or complex likelihood problem to obtain the MLE by iterating two easier steps above. In the E-step, we calculate the expectation of the log-likelihood function under the incomplete data given the observed data. In the M-step, we calculate the maximizer of the expected log-likelihood. Although this algorithm may be slower than the gradient-based methods such as Newton-Raphson, it is very stable. For the sake of completeness, we briefly discuss the EM algorithm when missing values such as censored or masked data are considered. For more details on the term missing values in terms of data coarsening, the reader is referred to the references [26][27][28].
Let θ denote the set of unknown parameters. We assume that the complete data x = (x 1 , x 2 , . . . , x n ) can be reorganized as x = (y, z), where y = (y 1 , y 2 , . . . , y m ) represents the fully observed part and z = (z m+1 , z m+2 , . . . , z n ) stands for the missing part (right-censored in the load-sharing) [24]. Then the likelihood function becomes which can also be rewritten as For more details on the above, see Section 5.3.1 in [25] and Section 6.3.3 in [26]. Let θ s be the estimate at the s-th step of the algorithm and define The two distinct steps of the EM algorithm can be summarized as follows: • E-step: As yet, it is not clear how to implement the EM algorithm for performing the MLE of the considered load-sharing system. Given that the lifetime between the failures of the j-th and (j + 1)-st is observed from the minimum lifetime among the remaining J − j components, we could view this load-sharing system in the setting of competing risks [29]. Consequently, after the occurrence of the j-th J−j be the lifetimes of the remaining J − j surviving components, in which these components will compete with each other for the next failure. Also, given that the lifetime between the failures of the j-th and (j + 1)-st are partially observed with the unknown component, we may view the system as masking [30]. Then, it is equivalent to assuming that causes (failed components) are completely masked or missing [20,31]. Thus, analogous to his approach to complete masking, we can easily construct the likelihood function of the system for implementing the EM algorithm as follows.
First, we use an indicator function and a variable to express the cause of failure of the component: let I{C | D} be the indicator function of two event C and D and ∆ Since B (j) i ( ) are the indicator functions of the events for = 1, 2, . . . , J − j, they have an iid Bernoulli distribution with the probability P B (j) i ( ) = 1 , which is obtained as Analogous to the approach in Section 8.2.2 of [32], the Bernoulli probability is obtained by where i = 1, 2, . . . , n and = 1, 2, . . . , J − j. For the detailed explicit derivation of Equation (11), one can refer to Appendix A in [10]. Next, we incorporate the above Bernoulli random variables and construct the complete-data likelihood function as follows. To be more specific, by using Equation (4), the likelihood function can be constructed as i ). Using the above Bernoulli random variables in Equation (10), we can rewrite f (j) which results in the following likelihood function For convenience, we rewrite the above likelihood function as where L(θ (j) | y (j) i ) stands for the likelihood function of the i-th system given by We observe that the complete-data likelihood function in (13) can now be constructed based on the Bernoulli random variable B (10), whereas it still seems difficult to directly derive an explicit Q(·) function from Equation (13) in the E-step. We may overcome this difficulty by viewing the censored data as missing. This idea could help us avoid the use of the survival function, since in many practical cases, it is easier to deal with a density than a survival function. We thus provide a method for constructing the complete-data likelihood based on a density rather than a survival function, which could ease the difficulty in calculating the E-step. It should be noted that where the pdf of Z i is defined as This technique is widely used for simplifying the calculation of the E-step. The reader is referred to Equation (1.1) in [24], Section 5.3.1 in [25], Example 6.30 in [26], and Park [20] for details. Now, using Equation (15), we can obtain an explicit closed-form Q(·) function which will be shown in Equation (21). This brings great simplicity for finding the MLE in the M-step.
Since P[B  (11), we just write B (j) by ignoring i and for notational simplicity. Then the likelihood function of the complete-data in Equation (15) can be written as In what follows, using Equation (17), we implement the EM algorithm for finding the MLE of the parameters in the load-sharing system with Lindley-distributed component lifetimes.

The EM-Type Maximum Likelihood Estimates with the Lindley Distribution
Since the lifetime of the components follows the Lindley distribution with the pdf in Equation (6).
Then, the pdf of U (j) is defined as We substitute this function into Equation (17) and take the logarithm of L * c (θ (j) ), which leads to the complete-data log-likelihood function of θ (j) given by where It should be noted that C does not include θ (j) . Thus, we can treat it as a constant in the M-step, where Q(·) function is differentiated with respect to θ (j) .
Let θ (j) s be the estimate of θ (j) at the s-th EM sequences. We can summarize the proposed EM as follows.
• E-step: Using Equations (11) and (18), we obtain where Since the lifetime of each component follows the Lindley distribution, we observe from Equation (16) that the pdf of Z s can be written as . (20) It is immediate upon substituting Equation (20) into (19) that we obtain wherew (j) s ) with respect to θ (j) and obtain Finally, we set the above equation to be zero in order to solve for θ (j) and obtain the (s + 1)-st EM sequences, denoted by θ We can easily obtain the MLE of the parameter in the system based on the above EM sequences.
In addition, to make our procedure be accessible for practitioners, we provide the function written in R language [33] in Appendix A.

Real Data Analysis
We consider the real data set analyzed in [12,34] to illustrate the usefulness of the proposed procedure. We refer the readers to [12,34] for more details on the description of the data. Singh and Gupta [12] used the load-sharing model with the underlying Lindley distribution and estimatedθ (0) = 0.036246642,θ (1) = 0.040587738, andθ (2) = 0.069157598 based on the Newton-Raphson-type method. The corresponding log-likelihood and likelihood values are −340.2890 and 1.638193 × 10 −148 , respectively.
We reanalyze the above real data with the proposed EM algorithm in Section 3. The R function for this analysis (Lindley.LS.EM) and the real data set are provided in Appendix A. Using the Lindley.LS.EM function, we obtainedθ (0) = 0.03624714,θ (1) = 0.04104211, andθ (2) = 0.06915810. The corresponding log-likelihood and likelihood values are −340.079 and 2.021003 × 10 −148 , respectively. Comparing the two results, we observe that the estimates using the EM algorithm have a greater likelihood value, which clearly implies that it is more effective than the Newton-Raphson-type method by Singh and Gupta [12].

Sensitivity of Parameter Estimation Due to Starting Values
We compare the sensitivity of starting values to the Newton-Raphson-type method and the EM algorithm. To this end, we generated ten lifetimes (n = 10) from a five-component parallel system with the parameter values of θ (0) = 0.01, θ (1) = 0.02, θ (2) = 0.03, θ (3) = 0.04, and θ (4) = 0.05. Please note that Ghitanya et al. [35] used the mixture model to generate the Lindley random sample, whereas the sample can be directly obtained by solving F(x) = p for x where p is a uniform random number in (0, 1). In Appendix B, we provided the quantile function of the Lindley distribution by solving F(x) = p for x. We provided the generated data in Table 1. We first need to choose starting values for using both methods to obtain the estimates. We selected starting values at random and obtained the estimates based on the Newton-Raphson-type method and the EM algorithm. For the case of the proposed EM method, we obtainedθ (0) = 0.00837,θ (1) = 0.01994, θ (2) = 0.03099,θ (3) = 0.03908 andθ (4) = 0.05589 along with the corresponding log-likelihood value (Θ) = −223.63 whereΘ = (θ (0) ,θ (1) ,θ (2) ,θ (3) ,θ (4) ) regardless of starting values. On the other hand, for the case of the Newton-Raphson-type method, the parameter estimates are dependent on the choice of starting values. In Table 2, we reported only some non-convergent cases for brevity. These results clearly show that the Newton-Raphson-type method is sensitive to the choice of starting values.
We investigate the sensitivity of the estimates more thoroughly as follows.
Here, we choose starting values by randomly selecting a value in (0, 1) and then estimate the five parameters using both methods. We repeated this procedure 1000 times, which results in 1000 sets of parameter estimates. We draw the box-percentile plots [36] of the estimates under each method in Figure 1. It is easily seen from the figure that the Newton-Raphson-type method shows the wide spread of the estimates indicating that it is sensitive to starting values, while the EM algorithm does not show any spread of the estimates which clearly shows that it is not sensitive to starting values at all. In the figure, we also added the mean (red dashed line) and median (blue dotted line) of the estimates using the Newton-Raphson-type method which indicate that they tend to underestimate the true MLE but their distributions are skewed to the right. Table 2. Parameter estimates with corresponding log-likelihood values when the Newton-Raphson-type method is used.

Starting Values
Parameter Estimates Log-Likelihood It deserves mentioning that we have more extensive results, but due to the space limitations, we briefly report some important findings. As expected, with an increasing number of parameters, the Newton-Raphson-type method becomes more unstable, and in some cases, it even fails to provide reasonable results. On the other hand, the EM algorithm always provides reliable estimates. It is of interest to note that for a small number of the parameters (e.g., five or smaller), the Newton-Raphson-type method is not so bad. In summary, when the number of the parameter is more than five, we highly recommend the use of the EM algorithm.

Conclusions
In this paper, we have introduced a methodology by integrating the conventional procedure under the assumption of the load-sharing system being made up of fundamental hypothetical latent random variables. In addition, we have investigated the problem of estimating the parameter of a load-sharing system with Lindley-distributed component lifetimes. To our knowledge, it is not possible to obtain a closed-form MLE of the parameter under this system. We thus develop an EM algorithm for finding the MLE of the load-sharing system under consideration. The EM algorithm is not only easily implemented by practitioners with the R programs in Appendix A, but also can alleviate some potential issues faced by iterative numerical methods such as the Newton-Raphson-type method.
We have conducted extensive simulations to compare the performance of the EM algorithm with the Newton-Raphson-type method. Numerical results indicated that the Newton-Raphson-type method often fails to converge and is very dependent on the starting values. Also, as the number of load-sharing parameters increases, it becomes increasingly ineffective because the chance of poor convergence becomes greater. Consequently, we do not recommend the use of this method for parameter estimation of the described load-sharing system. Instead, we have a preference for the proposed EM algorithm, because it consistently offers reliable results, even as the number of parameters becomes larger.

Acknowledgments:
The authors wish to thank the anonymous referees for their careful reading and suggestions for improvement of the original manuscript. The authors also thank Mark Leeds for his help for writing an earlier version of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.   Figure A1. Lambert W function. W −1 (z) denotes the lower branch and W 0 (z) denotes the upper branch.

Appendix A. R Programs for the Proposed EM Algorithm
Please note that the Lambert function W(z) is not injective when z is in (−1/e, 0) and has two branches as seen in Figure A1. The upper branch for the case of W > −1 is denoted by W 0 and the lower branch for the case of W < −1 is by W −1 . Then we can see that the x value in Equation (A6) is always positive in the lower branch and always negative in the upper branch. Thus, the quantile function or inverse cdf function of the Lindley distribution is explicitly expressed as To ensure that the cdf possesses the standard properties of the cdf, we calculate the cdf values at the endpoints 0 and ∞ as follows. As p → 1, we know W −1 (0 + ) → −∞ and thus we have F −1 (p) → ∞ as expected. When p = 0, we have It is immediate from W −1 (−ae −1 ) = −a that W −1 − (θ + 1)e −(θ+1) = −(θ + 1) and thus F −1 (0) = 0 as expected. The Lambert W function is available in the lamW package [39] of the R language.