Statistical Inference for Alpha-Series Process with the Generalized Rayleigh Distribution

In the modeling of successive arrival times with a monotone trend, the alpha-series process provides quite successful results. Both the choice of the distribution of the first arrival time and the quality of the statistical inference play a crucial role in the modeling performance of the alpha-series process. In this study, the problem of statistical inference for the parameters α, β, and λ of the alpha-series process is considered when the first arrival time follows the generalized Rayleigh distribution. To obtain the best modeling performance from this process, estimators of the model parameters are obtained by maximum likelihood, modified maximum spacing, modified least-squares, modified moments, and modified L-moments. The efficiencies of the estimators are evaluated for different sample sizes through a series of Monte Carlo simulations. Finally, two real datasets are analyzed to illustrate the importance of modeling with the alpha-series process.


Introduction
Currently, modeling the failure times of an engineering product or a system is quite important in terms of reliability. It is a general approach to use the renewal process in modeling the non-trending times of successive failures (successive arrival times) of repairable systems. However, in most cases, successive arrival times for repairable systems may include a trend due to the effects of accumulated wear, aging, or unknown reasons such as changing the maintenance unit and quality of replacement parts. In this case, it would be more appropriate to consider a model with monotonic behavior, which takes into account the trend in the data [1].
If the geometric process (GP) cannot model the data, the alpha-series process (ASP) can be employed for modeling the successive arrival times. The ASP was introduced as a strong alternative to the GP by Braun et al. [18] and is described by the following definition.

Definition 1. Let $X_k$ be the interarrival time between the $(k-1)$th and $k$th events of a counting process $\{N(t), t \ge 0\}$ for $k = 1, 2, \ldots$. The process $\{X_k, k = 1, 2, \ldots\}$ is said to be an ASP with parameter α if there exists a real number α such that the random variables

$$Y_k = k^{\alpha} X_k, \quad k = 1, 2, \ldots \tag{1}$$

are independently and identically distributed (iid) with the distribution function $F$; see [19].
The parameter α governs the monotonic behavior of the ASP. The behavior of the ASP for different values of α is illustrated in Figure 1. As can be clearly seen from Figure 1, the ASP is monotonically decreasing when α > 0, monotonically increasing with decreasing slope when −1 < α < 0, and monotonically increasing with increasing slope when α < −1. If α = 0, the ASP reduces to the renewal process (RP). In this context, the ASP can model a wider variety of data types than the GP, because the GP cannot model data that are monotonically increasing with decreasing slope. Some useful properties and theoretical results for the ASP can be found in [18,20].
An ASP contains two types of parameters: the process parameter α and the distributional parameters of the first arrival time $X_1$. Estimation of these parameters is quite important since they determine the mean and variance of the random variable $X_k$ $(k = 1, 2, \ldots)$ in such a way that

$$E(X_k) = \frac{\mu}{k^{\alpha}}, \quad k = 1, 2, \ldots \tag{2}$$

$$\mathrm{Var}(X_k) = \frac{\sigma^2}{k^{2\alpha}}, \quad k = 1, 2, \ldots \tag{3}$$

where $\mu$ and $\sigma^2$ are the expectation and variance of the first arrival time $X_1$, respectively. Although the ASP given by Definition 1 has many useful properties and superior data modeling capability, it has not achieved the position it deserves in the literature because of the popularity of the GP in reliability and scheduling problems. Nevertheless, one can find several papers on statistical inference for the ASP; see [19,21,22]. The main goal of the present study is to solve the parameter estimation problem for the ASP under the assumption that the first arrival time $X_1$ follows the generalized Rayleigh distribution, which is a possible alternative to popular reliability distributions such as the Weibull, gamma, and log normal. The generalized Rayleigh distribution has special importance among lifetime distributions since its hazard rate function can be either bathtub-shaped or increasing, depending on the shape parameter. The hazard function is bathtub-shaped when the shape parameter is less than or equal to 1/2 and increasing when the shape parameter is greater than 1/2 [23]. Therefore, it can be applied to many life testing experiments in which an aging effect is expected [3].
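The mean and variance relations for $X_k$ stated above can be explored with a short sketch. The function names below are illustrative only; the classification simply inspects first and second differences of $E(X_k) = \mu / k^{\alpha}$ and is meant to mirror the four regimes of α described earlier, not to reproduce Figure 1.

```python
def asp_mean(mu, alpha, k):
    """Mean of the k-th interarrival time: mu / k**alpha."""
    return mu / k ** alpha


def asp_variance(sigma2, alpha, k):
    """Variance of the k-th interarrival time: sigma2 / k**(2*alpha)."""
    return sigma2 / k ** (2 * alpha)


def describe_trend(alpha, n=8, mu=1.0):
    """Classify the trend of E(X_1), ..., E(X_n) from first and second differences."""
    means = [asp_mean(mu, alpha, k) for k in range(1, n + 1)]
    d1 = [b - a for a, b in zip(means, means[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    tol = 1e-12
    if all(abs(d) < tol for d in d1):
        return "constant"
    if all(d < 0 for d in d1):
        return "decreasing"
    if all(d > tol for d in d2):
        return "increasing, increasing slope"
    if all(d < -tol for d in d2):
        return "increasing, decreasing slope"
    return "increasing, constant slope"


for a in (0.5, 0.0, -0.5, -2.0):
    print(a, describe_trend(a))
```

The boundary case α = −1 gives a linear mean function, which the sketch reports as a constant slope.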
The rest of this paper is organized as follows. In Section 2, estimators for the parameters of the ASP with the generalized Rayleigh distribution are obtained using the maximum likelihood (ML), modified maximum spacing (MMSP), modified least-squares (MLS), modified moments (MM), and modified L-moments (MLM) methods. Monte Carlo simulation results that compare the efficiencies of the estimators obtained in Section 2 are presented in Section 3. Section 4 includes two real data applications that show the superiority of the ASP with the generalized Rayleigh distribution over both the RP and the ASPs with the gamma, Weibull, log normal, and inverse Gaussian distributions. Section 5 concludes the study.

Estimation of the Parameters of ASP with the Generalized Rayleigh Distribution
In this section, we consider the parameter estimation problem for ASP with the generalized Rayleigh distribution by using the different estimation procedures such as the ML, MMSP, MLS, MM, and MLM.
Before progressing to the estimation stage, let us recall the generalized Rayleigh distribution, also known as the two-parameter Burr Type X distribution [23]. The probability density function (pdf) of the generalized Rayleigh distribution is

$$f(x) = 2\beta\lambda^{2} x\, e^{-(\lambda x)^{2}} \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta - 1}, \quad x > 0, \tag{4}$$

and the corresponding cumulative distribution function (cdf) is

$$F(x) = \left(1 - e^{-(\lambda x)^{2}}\right)^{\beta}, \quad x > 0, \tag{5}$$

where β > 0 and λ > 0 are the shape and scale parameters of the distribution, respectively. From now on, the generalized Rayleigh distribution with parameters β and λ will be denoted by GR(β, λ).

Shannon entropy is a very important inferential measure to explain the variability or uncertainty of a random variable. The Shannon entropy of a random variable $X$ with pdf $f$ is given by (see [24,25])

$$H(f) = -\int f(x) \ln f(x)\, dx. \tag{6}$$

Using the pdf (4) in Equation (6), the Shannon entropy of the generalized Rayleigh distribution is found as

$$H(f) = -\ln\left(2\beta\lambda^{2}\right) - \kappa + \Psi(\beta + 1) - \Psi(1) + \frac{\beta - 1}{\beta}, \tag{7}$$

where $f(x)$ is the pdf of the generalized Rayleigh distribution, $\kappa = E(\ln X)$, and $\Psi(\cdot)$ is the digamma function [26]. For illustrative purposes, Figure 2 displays the Shannon entropy of the generalized Rayleigh distribution for different values of the parameters.
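As a concrete reference, the density, distribution function, and an inverse-cdf sampler of GR(β, λ) can be sketched as follows. The sketch assumes the standard two-parameter Burr Type X form, $F(x) = (1 - e^{-(\lambda x)^2})^{\beta}$; the function names are illustrative and only the Python standard library is used.

```python
import math
import random


def gr_pdf(x, beta, lam):
    """Density of GR(beta, lam); zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = (lam * x) ** 2
    return 2 * beta * lam ** 2 * x * math.exp(-z) * (-math.expm1(-z)) ** (beta - 1)


def gr_cdf(x, beta, lam):
    """Distribution function (1 - exp(-(lam*x)^2))^beta."""
    if x <= 0:
        return 0.0
    return (-math.expm1(-(lam * x) ** 2)) ** beta


def gr_quantile(u, beta, lam):
    """Inverse cdf, obtained by solving F(x) = u for x, with 0 <= u < 1."""
    return math.sqrt(-math.log1p(-u ** (1.0 / beta))) / lam


def gr_sample(n, beta, lam, rng):
    """Draw n values by the inverse-transform method; rng is a random.Random."""
    return [gr_quantile(rng.random(), beta, lam) for _ in range(n)]


sample = gr_sample(5, beta=2.0, lam=0.5, rng=random.Random(0))
```

Because the cdf inverts in closed form, simulation from GR(β, λ) needs no rejection step.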
See also [23,27-29] for theoretical properties and estimation problems concerning the generalized Rayleigh distribution.

Maximum Likelihood Estimation
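Let $X_1, X_2, \ldots, X_n$ be a random sample from the ASP with parameter α and $X_1 \sim GR(\beta, \lambda)$ with the pdf (4). The log-likelihood function $\ln L(\alpha, \beta, \lambda)$ of the random variables $X_k$ $(k = 1, 2, \ldots, n)$ can be written as

$$\ln L = n\ln 2 + n\ln\beta + 2n\ln\lambda + 2\alpha\sum_{k=1}^{n}\ln k + \sum_{k=1}^{n}\ln x_k - \lambda^{2}\sum_{k=1}^{n} k^{2\alpha}x_k^{2} + (\beta - 1)\sum_{k=1}^{n}\ln\left(1 - e^{-\lambda^{2}k^{2\alpha}x_k^{2}}\right). \tag{8}$$

Differentiating the log-likelihood function (8) with respect to the parameters α, β, and λ, the three likelihood equations become

$$\frac{\partial \ln L}{\partial \alpha} = 2\sum_{k=1}^{n}\ln k - 2\lambda^{2}\sum_{k=1}^{n}k^{2\alpha}x_k^{2}\ln k + 2\lambda^{2}(\beta - 1)\sum_{k=1}^{n}\frac{k^{2\alpha}x_k^{2}\ln k\; e^{-\lambda^{2}k^{2\alpha}x_k^{2}}}{1 - e^{-\lambda^{2}k^{2\alpha}x_k^{2}}} = 0, \tag{9}$$

$$\frac{\partial \ln L}{\partial \beta} = \frac{n}{\beta} + \sum_{k=1}^{n}\ln\left(1 - e^{-\lambda^{2}k^{2\alpha}x_k^{2}}\right) = 0, \tag{10}$$

$$\frac{\partial \ln L}{\partial \lambda} = \frac{2n}{\lambda} - 2\lambda\sum_{k=1}^{n}k^{2\alpha}x_k^{2} + 2\lambda(\beta - 1)\sum_{k=1}^{n}\frac{k^{2\alpha}x_k^{2}\; e^{-\lambda^{2}k^{2\alpha}x_k^{2}}}{1 - e^{-\lambda^{2}k^{2\alpha}x_k^{2}}} = 0. \tag{11}$$

Unfortunately, these equations cannot be solved explicitly for the parameters, so closed-form expressions of the ML estimators are not available. However, Equations (9)-(11) can be solved simultaneously by a numerical method. Newton's method is commonly used to maximize likelihood functions that cannot be maximized analytically. We therefore obtain the ML estimates of the parameters α, β, and λ by Newton's method.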
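The log-likelihood and its three partial derivatives can be written compactly in code. The sketch below assumes the density of $X_k$ is $k^{\alpha} f(k^{\alpha} x)$ with $f$ the GR(β, λ) pdf, which follows from Definition 1; the function names and the shorthand `u` for $\lambda^2 k^{2\alpha} x_k^2$ are illustrative.

```python
import math


def asp_gr_loglik(alpha, beta, lam, x):
    """Log-likelihood of interarrival times x[0], x[1], ... (k = 1, 2, ...)."""
    total = len(x) * math.log(2 * beta * lam ** 2)
    for k, xk in enumerate(x, start=1):
        u = (lam * k ** alpha * xk) ** 2
        total += 2 * alpha * math.log(k) + math.log(xk) - u
        total += (beta - 1) * math.log(-math.expm1(-u))
    return total


def asp_gr_score(alpha, beta, lam, x):
    """Partial derivatives of the log-likelihood w.r.t. (alpha, beta, lam)."""
    n = len(x)
    g_alpha, g_beta, g_lam = 0.0, n / beta, 2 * n / lam
    for k, xk in enumerate(x, start=1):
        u = (lam * k ** alpha * xk) ** 2
        # w is minus the derivative of [-u + (beta-1)*ln(1-exp(-u))] w.r.t. u
        w = 1 - (beta - 1) * math.exp(-u) / (-math.expm1(-u))
        g_alpha += 2 * math.log(k) * (1 - u * w)   # du/dalpha = 2*u*ln(k)
        g_beta += math.log(-math.expm1(-u))
        g_lam -= 2 * u * w / lam                   # du/dlam = 2*u/lam
    return g_alpha, g_beta, g_lam
```

A quick way to gain confidence in such derivatives is to compare them with central finite differences of the log-likelihood at an arbitrary parameter point.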
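Newton's iterative formula is given by

$$\hat{\theta}_{j+1} = \hat{\theta}_{j} - H^{-1}(\hat{\theta}_{j})\, \nabla(\hat{\theta}_{j}), \tag{12}$$

where $j$ is the iteration number, $\hat{\theta}_j$ is the estimate of the parameter vector at the $j$th iteration, $\nabla(\hat{\theta}_j)$ is the corresponding gradient, and $H(\hat{\theta}_j)$ is the corresponding Hessian matrix. For our problem, $\hat{\theta}_j$, $\nabla(\theta)$, and $H(\theta)$ are defined as

$$\hat{\theta}_j = \begin{pmatrix} \hat{\alpha}_j \\ \hat{\beta}_j \\ \hat{\lambda}_j \end{pmatrix}, \qquad \nabla(\theta) = \begin{pmatrix} \partial \ln L / \partial\alpha \\ \partial \ln L / \partial\beta \\ \partial \ln L / \partial\lambda \end{pmatrix},$$

and

$$H(\theta) = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix},$$

respectively. The elements of the gradient vector $\nabla(\theta)$ are given in Equations (9)-(11). The elements $h_{ij}$ $(i, j = 1, 2, 3)$ of the matrix $H(\theta)$ are the second-order partial derivatives of $\ln L$ with respect to the $i$th and $j$th components of $\theta = (\alpha, \beta, \lambda)$. Note that $H$ is a symmetric matrix. The inverse of $H$ can be computed from its adjugate and determinant as $H^{-1} = \operatorname{adj}(H) / \det(H)$. Hence, we can estimate the parameter vector $\theta$ using the iterative method given by Equation (12) with an initial estimate $\hat{\theta}_0$. Then, the ML estimates of the parameters α, λ, and β, say $\hat{\alpha}_{ML}$, $\hat{\lambda}_{ML}$, and $\hat{\beta}_{ML}$, respectively, are obtained as the respective elements of $\hat{\theta}_{m+1}$, where $m$ denotes the last iteration.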
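A generic version of the Newton update $\theta_{j+1} = \theta_j - H^{-1}\nabla$ can be sketched as follows. This is not the authors' implementation: for brevity it approximates the gradient and Hessian by central finite differences instead of using analytic derivatives, solves the linear system by Gaussian elimination rather than forming $H^{-1}$, and is demonstrated on a toy concave function. All names are hypothetical.

```python
def num_grad(f, theta, h=1e-5):
    """Central-difference gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad


def num_hess(f, theta, h=1e-4):
    """Central-difference Hessian of f at theta (symmetric by construction)."""
    p = len(theta)
    hess = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            def shifted(si, sj):
                t = list(theta)
                t[i] += si * h
                t[j] += sj * h
                return f(t)
            value = (shifted(1, 1) - shifted(1, -1)
                     - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
            hess[i][j] = hess[j][i] = value
    return hess


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def newton_stationary_point(f, theta0, tol=1e-7, max_iter=50):
    """Iterate theta <- theta - H^{-1} grad until the step is below tol."""
    theta = list(theta0)
    for _ in range(max_iter):
        step = solve(num_hess(f, theta), num_grad(f, theta))
        theta = [t - s for t, s in zip(theta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return theta
```

Newton's method only locates a stationary point, so in practice a reasonable starting value matters; the nonparametric estimate of α discussed in the next subsection is one natural candidate for the α component.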

Modified Methods
In estimating the parameters of an ASP, an explicit expression for the parametric estimator of the parameter α may not always be available, as in our case. The parameter α is then estimated parametrically by numerical methods, which may run into divergence problems. To avoid these divergence problems and to provide an appropriate initial value for the numerical methods, the parameter α can be estimated nonparametrically by

$$\hat{\alpha}_{NP} = \frac{\sum_{k=1}^{n}\ln X_k \sum_{k=1}^{n}\ln k - n\sum_{k=1}^{n}\ln X_k \ln k}{n\sum_{k=1}^{n}(\ln k)^{2} - \left(\sum_{k=1}^{n}\ln k\right)^{2}}. \tag{13}$$

For further information on deriving the estimator $\hat{\alpha}_{NP}$, we refer the readers to [21]. Furthermore, substituting $\hat{\alpha}_{NP}$ given by Equation (13) into Equation (1), we have

$$\hat{Y}_k = k^{\hat{\alpha}_{NP}} X_k, \quad k = 1, 2, \ldots, n. \tag{14}$$

Thus, using Equation (14) and the estimator $\hat{\alpha}_{NP}$, the distributional parameters of the ASP can be estimated by a selected method such as maximum spacing, least-squares, moments, or L-moments. This estimation rule is known as the modified estimation rule in the literature.
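The two steps of the modified rule, estimating α nonparametrically and then detrending the data, can be sketched as below. The sketch assumes that the nonparametric estimator is the negated least-squares slope of $\ln X_k$ on $\ln k$, which is what the log-linear form of Definition 1 suggests; the function names are illustrative.

```python
import math


def alpha_np(x):
    """Negated least-squares slope of ln(x_k) regressed on ln(k)."""
    n = len(x)
    log_k = [math.log(k) for k in range(1, n + 1)]
    log_x = [math.log(v) for v in x]
    s_k, s_x = sum(log_k), sum(log_x)
    s_kx = sum(a * b for a, b in zip(log_k, log_x))
    s_kk = sum(a * a for a in log_k)
    return (s_k * s_x - n * s_kx) / (n * s_kk - s_k ** 2)


def detrend(x, alpha_hat):
    """Estimated iid observations Y_k = k**alpha_hat * x_k."""
    return [k ** alpha_hat * v for k, v in enumerate(x, start=1)]


# Noise-free illustration: x_k = 3 / k**0.7 has trend parameter 0.7.
x = [3.0 / k ** 0.7 for k in range(1, 21)]
a_hat = alpha_np(x)
y_hat = detrend(x, a_hat)
```

With noise-free input the detrended values are all equal to the first observation, which makes the role of $\hat{Y}_k$ as an approximately iid sample easy to see.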

MMSP Estimation
In this subsection, we use a method based on maximizing the spacings to estimate the unknown parameters λ and β when the parameter α is estimated by the estimator (13). This method is known as maximum spacing (MSP) or maximum product of spacings estimation. The MSP estimators have desirable properties such as consistency and asymptotic unbiasedness. We refer the readers to [30] and [31] for further information on MSP.
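For orientation, the usual maximum spacing criterion can be sketched as follows. The sketch assumes the standard form of the criterion, the mean of the log spacings $F(\hat{Y}_{(i)}) - F(\hat{Y}_{(i-1)})$ with $F(\hat{Y}_{(0)}) = 0$ and $F(\hat{Y}_{(n+1)}) = 1$, applied to the detrended sample; the function names are illustrative, and the maximization itself (for example by a numerical optimizer) is left out.

```python
import math


def gr_cdf(y, beta, lam):
    """GR(beta, lam) distribution function."""
    return (-math.expm1(-(lam * y) ** 2)) ** beta if y > 0 else 0.0


def gr_quantile(u, beta, lam):
    """GR(beta, lam) quantile function."""
    return math.sqrt(-math.log1p(-u ** (1.0 / beta))) / lam


def msp_objective(beta, lam, y_hat):
    """Mean log-spacing of the fitted cdf; larger is better."""
    ys = sorted(y_hat)
    cdf = [0.0] + [gr_cdf(v, beta, lam) for v in ys] + [1.0]
    total = 0.0
    for lower, upper in zip(cdf, cdf[1:]):
        spacing = upper - lower
        if spacing <= 0:
            return float("-inf")  # ties or numerical underflow
        total += math.log(spacing)
    return total / (len(ys) + 1)


# An idealized sample placed at the quantiles j/(n+1) of GR(2, 0.5).
n = 20
ideal = [gr_quantile(j / (n + 1), 2.0, 0.5) for j in range(1, n + 1)]
at_truth = msp_objective(2.0, 0.5, ideal)
```

For this idealized sample all spacings equal $1/(n+1)$ at the generating parameters, which is the largest value the criterion can take.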
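
MM Estimation
In general, for a family of distributions with two unknown parameters, the moment estimators are obtained by equating the first and second population moments to the corresponding sample moments. Unfortunately, the first population moment of the generalized Rayleigh distribution cannot be obtained analytically. Kundu and Raqab [23] obtained the moment estimators of the parameters λ and β by equating the second and fourth population moments of the generalized Rayleigh distribution to the corresponding sample moments. We now adapt their approach to our problem. For the sample $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n$, the second and fourth sample moments, $m_2$ and $m_4$, are calculated by

$$m_2 = \frac{1}{n}\sum_{k=1}^{n}\hat{Y}_k^{2} \tag{16}$$

and

$$m_4 = \frac{1}{n}\sum_{k=1}^{n}\hat{Y}_k^{4}, \tag{17}$$

respectively. On the other hand, the second and fourth population moments of the distribution GR(β, λ), say $\mu_2$ and $\mu_4$, can be written as

$$\mu_2 = \frac{1}{\lambda^{2}}\left[\Psi(\beta + 1) - \Psi(1)\right] \tag{18}$$

and

$$\mu_4 = \frac{1}{\lambda^{4}}\left[\Psi'(1) - \Psi'(\beta + 1) + \left(\Psi(\beta + 1) - \Psi(1)\right)^{2}\right], \tag{19}$$

respectively, where $\Psi(\cdot)$ and $\Psi'(\cdot)$ denote the digamma and polygamma functions, respectively. Then, the MM estimator of the parameter β, $\hat{\beta}_{MM}$, can be obtained from the solution of the following nonlinear equation:

$$\frac{V}{m_2^{2}} = \frac{\Psi'(1) - \Psi'(\beta + 1)}{\left(\Psi(\beta + 1) - \Psi(1)\right)^{2}}, \tag{20}$$

where $V = m_4 - m_2^{2}$. Furthermore, the MM estimator of the parameter λ, say $\hat{\lambda}_{MM}$, based on the estimate $\hat{\beta}_{MM}$, is obtained as follows (see [23]):

$$\hat{\lambda}_{MM} = \sqrt{\frac{\Psi(\hat{\beta}_{MM} + 1) - \Psi(1)}{m_2}}. \tag{21}$$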
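The moment-matching step reduces to a one-dimensional root search in β followed by a closed-form expression for λ. The sketch below is one possible way to carry it out with the standard library only: the digamma and trigamma functions are approximated by a recurrence plus asymptotic series, and the root is located by bisection on an assumed bracket [0.001, 1000]. All names and the bracket are illustrative choices.

```python
import math


def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def trigamma(x):
    """Trigamma via psi'(x) = psi'(x + 1) + 1/x^2 and an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * inv2 + inv2 / x * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))


def mm_from_moments(m2, m4):
    """Solve the moment equation for beta by bisection, then compute lambda."""
    target = (m4 - m2 ** 2) / m2 ** 2

    def gap(beta):
        d = digamma(beta + 1) - digamma(1.0)
        return (trigamma(1.0) - trigamma(beta + 1)) / d ** 2 - target

    lo, hi = 1e-3, 1e3  # assumed bracket for beta
    if not gap(lo) > 0 > gap(hi):
        raise ValueError("no root for beta inside the assumed bracket")
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = math.sqrt(lo * hi)
    lam = math.sqrt((digamma(beta + 1) - digamma(1.0)) / m2)
    return beta, lam


def mm_estimates(y_hat):
    """Modified moment estimates from the detrended sample."""
    n = len(y_hat)
    m2 = sum(v ** 2 for v in y_hat) / n
    m4 = sum(v ** 4 for v in y_hat) / n
    return mm_from_moments(m2, m4)
```

As a sanity check, GR(2, 0.5) has second and fourth population moments 6 and 56, so feeding these values to `mm_from_moments` should return parameters close to (2, 0.5).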

MLM Estimation
In this subsection, we discuss the L-moments estimators of the parameters β and λ, say $\hat{\beta}_{MLM}$ and $\hat{\lambda}_{MLM}$, respectively, when the parameter α is estimated nonparametrically by Equation (13) as $\hat{\alpha}_{NP}$. The L-moments estimation method was originally introduced by Hosking [32]. The method is more robust than the method of moments, and some valuable properties of the L-moments estimators were shown by Hosking [32]. To obtain the L-moments estimators of the parameters of a two-parameter family of distributions, as in the moments method, the first two sample L-moments are equated to the corresponding population L-moments and solved with respect to the parameters. However, the population L-moments of the generalized Rayleigh distribution cannot be obtained analytically. By using the quadratic transformation of a generalized Rayleigh random variable, Kundu and Raqab [23] obtained the modified L-moments estimators of the parameters β and λ. Now, let $X_1, X_2, \ldots, X_n$ be a random sample from the ASP with parameter α and $X_1 \sim GR(\beta, \lambda)$, and let α be estimated by the estimator (13) as $\hat{\alpha}_{NP}$. In this situation, we have the sample $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n$ from Equation (14). By using the sample $\hat{Y}_k$ $(k = 1, 2, \ldots, n)$ and following the steps of the "Modified L-Moment Estimators" section given in [23], the sample L-moments $l_1$ and $l_2$ can be written as

$$l_1 = \frac{1}{n}\sum_{k=1}^{n}\hat{Y}_{(k)}^{2} \tag{22}$$

and

$$l_2 = \frac{2}{n(n-1)}\sum_{k=1}^{n}(k-1)\hat{Y}_{(k)}^{2} - l_1, \tag{23}$$

respectively, where $\hat{Y}_{(k)}$ denotes the $k$th ordered observation. On the other hand, the population L-moments $L_1$ and $L_2$ are

$$L_1 = \frac{1}{\lambda^{2}}\left[\Psi(\beta + 1) - \Psi(1)\right] \tag{24}$$

and

$$L_2 = \frac{1}{\lambda^{2}}\left[\Psi(2\beta + 1) - \Psi(\beta + 1)\right], \tag{25}$$

respectively; see [23]. Thus, $\hat{\beta}_{MLM}$ can be obtained from the solution of the nonlinear equation

$$\frac{\Psi(2\beta + 1) - \Psi(\beta + 1)}{\Psi(\beta + 1) - \Psi(1)} = \frac{l_2}{l_1}. \tag{26}$$

Therefore, using Equations (22) and (24) and the estimate $\hat{\beta}_{MLM}$, $\hat{\lambda}_{MLM}$ is

$$\hat{\lambda}_{MLM} = \sqrt{\frac{\Psi(\hat{\beta}_{MLM} + 1) - \Psi(1)}{l_1}}, \tag{27}$$

from [23].
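The L-moment matching can be sketched in the same spirit as the moment method. The sketch assumes, following the quadratic transformation mentioned above, that the sample L-moments are computed from the squared detrended observations and that the population L-moment ratio is $[\Psi(2\beta+1) - \Psi(\beta+1)] / [\Psi(\beta+1) - \Psi(1)]$. The digamma approximation, the bisection bracket, and all names are illustrative.

```python
import math


def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def sample_l_moments(values):
    """First two sample L-moments (l1, l2) of the given values."""
    z = sorted(values)
    n = len(z)
    l1 = sum(z) / n
    # enumerate starts at 0, which is the weight (k - 1) for 1-based rank k
    l2 = 2.0 / (n * (n - 1)) * sum(i * v for i, v in enumerate(z)) - l1
    return l1, l2


def mlm_from_l_moments(l1, l2):
    """Solve the L-moment ratio equation for beta, then compute lambda."""
    target = l2 / l1

    def gap(beta):
        num = digamma(2 * beta + 1) - digamma(beta + 1)
        den = digamma(beta + 1) - digamma(1.0)
        return num / den - target

    lo, hi = 1e-3, 1e3  # assumed bracket for beta
    if not gap(lo) > 0 > gap(hi):
        raise ValueError("no root for beta inside the assumed bracket")
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = math.sqrt(lo * hi)
    lam = math.sqrt((digamma(beta + 1) - digamma(1.0)) / l1)
    return beta, lam


def mlm_estimates(y_hat):
    """Modified L-moment estimates from the detrended sample."""
    return mlm_from_l_moments(*sample_l_moments([v * v for v in y_hat]))
```

For GR(2, 0.5) the population L-moments of the squared variable are 6 and 7/3, which gives a convenient check of the solver.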

MLS Estimation
In this subsection, we obtain the least-squares estimators of the parameters λ and β of the ASP with the generalized Rayleigh distribution when the parameter α is estimated nonparametrically by the estimator (13). Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from an ASP with the generalized Rayleigh distribution, and denote the estimate of the parameter α by $\hat{\alpha}_{NP}$. In this situation, we have the estimated observations $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n$ from Equation (14). Thus, the MLS estimators of the parameters λ and β, say $\hat{\lambda}_{MLS}$ and $\hat{\beta}_{MLS}$, respectively, can be obtained by minimizing

$$\sum_{j=1}^{n}\left[\left(1 - e^{-(\lambda \hat{Y}_{(j)})^{2}}\right)^{\beta} - \frac{j}{n+1}\right]^{2} \tag{28}$$

with respect to β and λ, where $\hat{Y}_{(j)}$ $(j = 1, 2, \ldots, n)$ denotes the $j$th ordered observation of the sample $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n$.
We refer the readers to [33] for further information on least-squares estimation.
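The least-squares criterion compares the fitted cdf at the ordered detrended observations with the plotting positions $j/(n+1)$. A minimal sketch follows; the plotting positions are the usual choice for this method and are assumed here, and the coarse grid search merely stands in for a proper numerical optimizer. All names are illustrative.

```python
import math


def gr_cdf(y, beta, lam):
    """GR(beta, lam) distribution function."""
    return (-math.expm1(-(lam * y) ** 2)) ** beta if y > 0 else 0.0


def mls_objective(beta, lam, y_hat):
    """Sum of squared gaps between the fitted cdf and j/(n+1)."""
    ys = sorted(y_hat)
    n = len(ys)
    return sum((gr_cdf(v, beta, lam) - j / (n + 1)) ** 2
               for j, v in enumerate(ys, start=1))


def mls_grid_search(y_hat, betas, lams):
    """Return the (beta, lam) pair on the grid with the smallest objective."""
    return min(((b, l) for b in betas for l in lams),
               key=lambda p: mls_objective(p[0], p[1], y_hat))
```

A grid search scales poorly and only returns grid points, so it is best seen as a way to obtain starting values for a continuous optimizer.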
The entropy measure of generalized Rayleigh distribution given by Equation (7) can be easily computed by using the (plug-in) estimators of the parameters obtained by the methods of ML, MMSP, MLS, MM, and MLM [24].
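A plug-in entropy value can be checked numerically. The sketch below compares a direct midpoint-rule evaluation of $-\int f \ln f$ with the expression $-\ln(2\beta\lambda^2) - \kappa + \Psi(\beta+1) - \Psi(1) + (\beta-1)/\beta$, which is what one obtains by taking the expectation of $-\ln f(X)$ term by term for the GR(β, λ) pdf; $\kappa = E(\ln X)$ is itself evaluated numerically. The integration range, step count, and names are assumptions of this sketch, and the simple quadrature is only adequate when the density is bounded near zero (roughly β ≥ 1).

```python
import math


def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def gr_pdf(x, beta, lam):
    """Density of GR(beta, lam) for x > 0."""
    z = (lam * x) ** 2
    return 2 * beta * lam ** 2 * x * math.exp(-z) * (-math.expm1(-z)) ** (beta - 1)


def expect(g, beta, lam, steps=20000):
    """Midpoint-rule approximation of E[g(X)] for X ~ GR(beta, lam)."""
    upper = 10.0 / lam  # (lam * x)^2 = 100 there, so the tail is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += g(x) * gr_pdf(x, beta, lam)
    return total * h


def gr_entropy_numeric(beta, lam):
    """Direct numerical evaluation of -E[ln f(X)]."""
    def neg_log_pdf(x):
        p = gr_pdf(x, beta, lam)
        return -math.log(p) if p > 0 else 0.0
    return expect(neg_log_pdf, beta, lam)


def gr_entropy_closed(beta, lam):
    """Entropy expression with kappa = E(ln X) evaluated numerically."""
    kappa = expect(math.log, beta, lam)
    return (-math.log(2 * beta * lam ** 2) - kappa
            + digamma(beta + 1) - digamma(1.0) + (beta - 1) / beta)
```

In applications, β and λ would be replaced by the estimates from any of the methods above to obtain the plug-in entropy.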

Simulation Study
This section presents the results of simulation studies that compare the efficiencies of the ML, MMSP, MLS, MM, and MLM estimators obtained in the previous section. In the simulation studies, the values of the parameters λ and β were set to 0.5 and 2.0, respectively, without loss of generality. For different sample sizes n (n = 50, 100, 150, ..., 500, 750, 1000) and different values of the parameter α (α = −1.0, −0.5, 0.5, 1.0), the estimates, biases, and MSE values were computed from 1000 replications. The simulation results are shown in Figures 3-6.
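The structure of such a simulation can be sketched for a single estimator. The example below is not the study's simulation code: it treats only a least-squares (log-linear) nonparametric estimate of α, uses fewer replications than the study, and relies on an inverse-cdf sampler for GR(β, λ). The seed and all names are arbitrary choices.

```python
import math
import random


def gr_quantile(u, beta, lam):
    """GR(beta, lam) quantile function."""
    return math.sqrt(-math.log1p(-u ** (1.0 / beta))) / lam


def simulate_asp(n, alpha, beta, lam, rng):
    """Generate X_k = Y_k / k**alpha with Y_k iid GR(beta, lam)."""
    data = []
    for k in range(1, n + 1):
        u = max(rng.random(), 1e-300)  # avoid u = 0, which would give X_k = 0
        data.append(gr_quantile(u, beta, lam) / k ** alpha)
    return data


def alpha_np(x):
    """Negated least-squares slope of ln(x_k) regressed on ln(k)."""
    n = len(x)
    log_k = [math.log(k) for k in range(1, n + 1)]
    log_x = [math.log(v) for v in x]
    s_k, s_x = sum(log_k), sum(log_x)
    s_kx = sum(a * b for a, b in zip(log_k, log_x))
    s_kk = sum(a * a for a in log_k)
    return (s_k * s_x - n * s_kx) / (n * s_kk - s_k ** 2)


def monte_carlo(n, alpha, beta, lam, reps, seed):
    """Return (bias, MSE) of the alpha estimate over reps replications."""
    rng = random.Random(seed)
    errors = [alpha_np(simulate_asp(n, alpha, beta, lam, rng)) - alpha
              for _ in range(reps)]
    bias = sum(errors) / reps
    mse = sum(e * e for e in errors) / reps
    return bias, mse


for size in (50, 200):
    print(size, monte_carlo(size, alpha=0.5, beta=2.0, lam=0.5, reps=200, seed=1))
```

The same loop applies to the other estimators by replacing `alpha_np` with the corresponding estimation routine and recording one error per parameter.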
According to the simulation results shown in Figures 3-6, the estimates of all estimators were quite satisfactory. In addition, the biases and MSE values of all estimators decreased as the sample size n increased, which suggests that all estimators are asymptotically unbiased and consistent. In estimating the parameter α, the ML estimator performed better than the nonparametric estimator $\hat{\alpha}_{NP}$ according to the MSE criterion. Moreover, in estimating the parameters λ and β, the ML and MMSP estimators outperformed the MM, MLM, and MLS estimators, with the smallest MSE values.

Data Analysis
In this section, we present two practical applications with real-life datasets: the air-conditioning system dataset and the No. 4 dataset. To demonstrate the performance of the ASP in modeling successive arrival times with a monotone trend, the datasets were modeled using both the ASP with the generalized Rayleigh distribution and the RP. Before analyzing the datasets, we investigated whether the data were consistent with a generalized Rayleigh distribution by considering the following linear regression model, derived by taking the logarithm of Equation (1):

$$\ln X_k = \tau - \alpha \ln k + \varepsilon_k, \quad k = 1, 2, \ldots, n, \tag{29}$$

where $\tau = E(\ln Y_k)$ and $\varepsilon_k$ $(k = 1, 2, \ldots, n)$ is the error term; see [19] for further information on deriving this regression model. According to this regression model, if the exponentiated errors have the generalized Rayleigh distribution, then the data are consistent with a generalized Rayleigh distribution with parameters θ and ξ. When the parameter α is estimated by Equation (13), the error term $\varepsilon_k$ in Equation (29) can be estimated by

$$\hat{\varepsilon}_k = \ln X_k - \hat{\tau} + \hat{\alpha}_{NP} \ln k, \tag{30}$$

where $\hat{\tau}$ is estimated by

$$\hat{\tau} = \frac{1}{n}\sum_{k=1}^{n}\ln X_k + \frac{\hat{\alpha}_{NP}}{n}\sum_{k=1}^{n}\ln k. \tag{31}$$

Therefore, the consistency of the exponentiated errors with a generalized Rayleigh distribution can be tested by a goodness-of-fit test such as the Kolmogorov-Smirnov (K-S) test. In addition, to compare the performance of the ASP and the RP, we used the mean-squared error (MSE*) given by

$$\mathrm{MSE}^{*} = \frac{1}{n}\sum_{k=1}^{n}\left(X_k - \hat{X}_k\right)^{2}, \tag{32}$$

where $\hat{X}_k$ is calculated by

$$\hat{X}_k = \hat{\mu}_{(\cdot)}\, k^{-\hat{\alpha}}, \quad k = 1, 2, \ldots, n, \tag{33}$$

where $\hat{\mu}_{(\cdot)}$ denotes the estimate of the expected value of $X_1$ obtained with the estimators presented in Section 2, $\hat{\alpha}$ is the corresponding estimate of α, and α = 0 for the RP. Furthermore, we define $S_k = X_1 + X_2 + \cdots + X_k$, $k = 1, 2, \ldots, n$. The random variable $S_k$ is estimated from the fitted values $\hat{X}_k$ as $\hat{S}_k = \sum_{j=1}^{k} \hat{X}_j$. Thus, we can demonstrate the performances of the RP and the five ASPs with the ML, MMSP, MLS, MM, and MLM estimators by plotting $S_k$ and $\hat{S}_k$ against $k$, $k = 1, 2, \ldots, n$.
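The comparison criterion and the cumulative fitted times are straightforward to compute once estimates of the mean of $X_1$ and of α are available. The sketch below takes those estimates as inputs, since the mean of the generalized Rayleigh distribution has no closed form, and assumes fitted values of the form $\hat{\mu} / k^{\hat{\alpha}}$ with $\hat{\alpha} = 0$ for the RP. Function names and the noise-free example data are illustrative.

```python
from itertools import accumulate


def fitted_values(mu_hat, alpha_hat, n):
    """Fitted interarrival times mu_hat / k**alpha_hat for k = 1, ..., n."""
    return [mu_hat / k ** alpha_hat for k in range(1, n + 1)]


def mse_star(x, x_hat):
    """Mean squared difference between observed and fitted interarrival times."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cumulative(values):
    """Partial sums S_k = values[0] + ... + values[k-1]."""
    return list(accumulate(values))


# Noise-free illustration with a decreasing trend (alpha = 0.5).
x = [10.0 / k ** 0.5 for k in range(1, 31)]
asp_fit = fitted_values(10.0, 0.5, len(x))
rp_fit = fitted_values(sum(x) / len(x), 0.0, len(x))  # RP: constant fitted mean
print(mse_star(x, asp_fit), mse_star(x, rp_fit))
s_obs, s_asp = cumulative(x), cumulative(asp_fit)
```

Plotting `s_obs` and `s_asp` against the index k gives the kind of comparison described for the figures in this section.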

Air-Conditioning System Data
This dataset, presented by Proschan [34], relates to the failure times of the air-conditioning system of an aircraft (Aircraft Number 7912) and includes 30 observations. For this dataset, the estimates for the exponentiated errors were θ = 0.3188 and ξ = 0.2280 (K-S statistic = 0.1886, p-value = 0.2083). Hence, we can conclude that the data can be modeled with a generalized Rayleigh distribution. We also present Figure 7a,b, where Figure 7a shows the Q-Q plot of the exponentiated errors against the generalized Rayleigh distribution and Figure 7b shows the empirical and the fitted generalized Rayleigh cdf. The parameter estimates and the MSE* values of the corresponding processes are tabulated in Table 1. According to Table 1, the ASP with the ML estimators gave the best modeling performance, with the smallest MSE* value. For this dataset, the Shannon entropy calculated with the ML estimators was −1.2412. The relative performance of the employed ASPs and the RP can clearly be seen from Figure 8, which plots $S_k$ and $\hat{S}_k$ versus the number of failures $k$ $(k = 1, 2, \ldots, 30)$. As can be seen in Figure 8, the values of $\hat{S}_k$ estimated by the ASP with the ML estimators followed the actual failure times more closely than those of the other processes, consistent with the Monte Carlo simulation results presented in the previous section. Now, let us investigate the best ASP for this dataset among popular distribution models such as the generalized Rayleigh, gamma, log normal, inverse Gaussian, and Weibull distributions. For the different ASPs with ML estimators, the MSE* values and parameter estimates are given in Table 2. According to Table 2, the ASP with the generalized Rayleigh distribution outperformed the other models, with the smallest MSE* value. Therefore, it can be concluded that, among the ASPs considered, the ASP with the generalized Rayleigh distribution is the best model of the air-conditioning system data, ahead of the ASPs with the gamma, Weibull, inverse Gaussian, and log normal distributions.

No. 4 Data
The No. 4 dataset is related to unscheduled maintenance actions for the U.S.S. Grampus No. 4 main propulsion diesel engine [35]. The dataset contains 56 observations, which are the times between successive unscheduled maintenance actions.
As in the previous example, we first explored whether the underlying distribution of the data was consistent with a generalized Rayleigh distribution. For this dataset, the estimates for the exponentiated errors were θ = 0.3188 and ξ = 0.2280; the value of the K-S statistic was 0.1886; and the corresponding p-value was 0.2083. Considering the value of the K-S statistic and the corresponding p-value, we can say that the data are consistent with the generalized Rayleigh distribution. To support this conclusion, we present Figure 9a,b, which shows the Q-Q plot of the exponentiated errors against the generalized Rayleigh distribution and the empirical and fitted cdf of the generalized Rayleigh distribution, respectively. Figure 9a shows that the data points fall approximately on a straight line, and Figure 9b shows that the fitted cdf of the exponentiated errors closely follows the empirical cdf. Thus, it can be concluded that the data can be modeled by a generalized Rayleigh distribution. For the No. 4 dataset, the ML, MLS, MM, MLM, and MMSP estimates of the parameters α, λ, and β and the MSE* values of the corresponding processes are tabulated in Table 3. According to Table 3, the ASP with the ML estimators is the best process for modeling this dataset because it outperformed the RP and the ASPs with the other estimators, with a smaller MSE* value. The Shannon entropy calculated with the ML estimators was −0.2418 for this dataset. Further, the relative performances of the ASPs and the RP can be seen from Figure 10, which plots $S_k$ and $\hat{S}_k$ versus the number of unscheduled maintenance actions $k$ $(k = 1, 2, \ldots, 56)$.
As can be seen in Figure 10, the ASP with the ML estimators followed the actual values more closely than the RP. Thus, according to Figure 10 and Table 3, the ASP provides a better fit to the data than the RP. In addition, for the No. 4 dataset, the parameter estimates and the corresponding MSE* values of the alternative ASPs with the different distribution models are summarized in Table 4. According to Table 4, the ASP with the generalized Rayleigh distribution is the best model for the No. 4 dataset among those considered, since it outperformed the ASPs with the gamma, log normal, inverse Gaussian, and Weibull distributions, with a smaller MSE* value.

Conclusions
In this study, we have investigated the statistical inference problem for the ASP with the generalized Rayleigh distribution. The ASP is a useful monotonic stochastic process for successive arrival times with a trend. The monotonic behavior of the ASP for different values of the parameter α has been illustrated in Figure 1. For statistical inference, estimators of the ASP parameters have been obtained by the ML, MMSP, MM, MLM, and MLS methods. To assess the bias and MSE of the obtained estimators, Monte Carlo simulation results have been presented under different scenarios. According to these results, all of the estimators produced acceptable parameter estimates with similar accuracy in terms of bias and MSE. In addition, the simulation results suggest that all the estimators are asymptotically unbiased and consistent, since their bias and MSE values decreased as the sample size increased. In terms of the rate of convergence to the actual parameter values, the ML estimators converged faster than the modified estimators.
The behavior of the ASP in modeling real data has been demonstrated with two data analyses, on the air-conditioning system and No. 4 datasets. For both datasets, the ASP with the generalized Rayleigh distribution provided better fits than the RP, with smaller MSE* values. Moreover, for both datasets, the ASP with the generalized Rayleigh distribution outperformed the alternative ASPs with the gamma, log normal, inverse Gaussian, and Weibull distributions, with smaller MSE* values. Thus, it is concluded that the ASP with the generalized Rayleigh distribution provides quite satisfactory modeling performance for successive arrival times with a trend and is a powerful alternative to the ASPs with well-known reliability distributions such as the gamma, log normal, inverse Gaussian, and Weibull.