1. Introduction
In recent years, there has been growing interest in the analysis of life test data. In reliability analysis, data are often only partially observed, so censored data are used frequently. In the research conducted by [1], the population parameters of a life test distribution were estimated by subdividing the failure population into subpopulations that follow an exponential distribution, with sampling censored at a predetermined test termination time. Two censoring schemes are commonly used in the analysis of lifetime data: Type-I and Type-II censoring. In Type-I censoring, the censoring time is fixed in advance and the number of observed failures is random; conversely, in Type-II censoring, the life test ends once a predetermined number of failures is reached, so the termination time is random.
One limitation of these traditional censoring models is that surviving units can be removed only at the final termination point. In many life test scenarios, it may be necessary to withdraw surviving units from the test before the final termination point. Motivated by this, [2] introduced the progressive Type-II censoring model.
Numerous researchers have examined different lifetime distributions using progressively Type-II censored data. For instance, [3] studied linear inference for progressively Type-II censored order statistics across several families of population distributions, including scale, location, and location–scale families such as the Weibull distribution. In addition, [4] presented a Bayesian analysis of the Rayleigh distribution based on progressively Type-II censored samples.
The generalized Pareto distribution (GPD) is widely used in life test analysis across multiple domains, including biology and geography. For instance, [5] explored applications of the GPD in wind engineering, highlighting its advantages over the generalized extreme value distribution for extreme value analysis, and [6] investigated the effects of event independence and threshold selection by conducting a GPD analysis of gust velocity maxima recorded at several island locations.
In this paper, we consider the generalized Pareto distribution with unknown parameters, as discussed by [7], estimate these parameters from progressively Type-II censored samples, and make the corresponding Bayesian predictions.
In recent years, numerous researchers have contributed to the study of the parameters of the generalized Pareto distribution. For instance, [8] investigated minimum density power divergence estimation (MDPDE) of the GPD parameters and compared its efficiency with that of maximum likelihood estimation (MLE), Dupuis's optimally biased robust estimator, and the medians estimator proposed by Peng and Welsh. [9] addressed statistical inference for the GPD under progressive Type-I censoring, deriving MLEs via the expectation–maximization (EM) algorithm together with the Fisher information matrix and constructing asymptotic confidence intervals for the parameters. [10] estimated the GPD parameters using MLE, the probability-weighted moments method, and the method of moments. [11] implemented bootstrap algorithms and Monte Carlo methods to obtain Bayesian estimates and confidence intervals for the GPD. [12] derived Bayesian estimates of the GPD parameters under the LINEX, entropy, and precautionary loss functions, employing quasi, uniform, and inverted gamma priors. For further details and examples related to the generalized Pareto distribution, see [6,7,9,13,14,15,16].
The remainder of this article is structured as follows. Section 2 discusses maximum likelihood estimation via the EM algorithm. Section 3 presents the observed Fisher information matrix and the methodology for calculating asymptotic confidence intervals. Section 4 derives Bayesian estimates of the unknown parameters under various loss functions, computed with the Tierney and Kadane (TK) method; it also introduces the Metropolis–Hastings (MH) algorithm, which is employed to obtain Bayesian estimates, and the generated samples are used to construct highest posterior density intervals. Section 5 presents Bayesian point and interval predictions for future observable samples. Section 6 and Section 7 offer simulation studies comparing the performance of the discussed approaches and analyze a real dataset for illustration. Finally, we present our conclusions in Section 8.
2. Maximum Likelihood Estimation
In the progressive Type-II censoring model, a total of n units are placed on a reliability test, but only m units are observed until failure. At the first failure, R_1 units are randomly removed from the remaining n − 1 surviving units. This procedure is repeated at each subsequent failure, so that R_2 units are removed after the second failure, and so forth, until the remaining R_m units are removed after the m-th failure. The censoring scheme, denoted (R_1, R_2, …, R_m), is fixed before the test begins and must satisfy R_i ≥ 0 and R_1 + R_2 + … + R_m = n − m. Progressive Type-II censoring provides both practicality and flexibility by permitting units to be removed at any stage following a failure.
In what follows, (R_1, R_2, …, R_m) denotes the censoring scheme, and the ordered failure times x_1 < x_2 < … < x_m denote the corresponding progressively censored samples drawn from the generalized Pareto distribution.
A random variable X is said to follow a generalized Pareto distribution if its probability density function (PDF) and cumulative distribution function (CDF) are defined as follows:
where the two parameters correspond to the scale and the shape of the distribution, respectively. We refer to this distribution as the GPD. It is also known as the Pareto distribution of the second kind, or the Lomax distribution.
The likelihood function for samples obtained under the progressive Type-II censoring scheme is as follows:
where x_i represents the observed value of the i-th progressively censored order statistic. Consequently, we can derive the log-likelihood function as follows:
To obtain the maximum likelihood estimates, the traditional approach is the Newton–Raphson method; see [17]. A significant drawback of this method, however, is the need to compute the second-order partial derivatives of the log-likelihood function, and in this censored model the roots of Equations (5) and (6) can be quite difficult to obtain. Therefore, the expectation–maximization (EM) algorithm introduced in [18] is employed to compute the MLEs. The EM algorithm converges to a stationary point and only requires maximizing the pseudo log-likelihood function of the complete sample, which effectively resolves these difficulties.
The EM algorithm consists of two main steps. The E-step calculates the conditional expectation of the complete-data log-likelihood, given the observed data and the current parameter estimates. The M-step maximizes this expected log-likelihood to obtain updated parameter estimates.
Let Z_1, Z_2, …, Z_m denote the censored samples, where each Z_j is a 1 × R_j vector containing the lifetimes of the R_j units removed after the j-th observed failure, j = 1, …, m. The variable X denotes the observed sample, and Z_j represents the censored data following the j-th failure. The complete sample consists of the observed data together with the censored data, and its observed values are those associated with X. The log-likelihood function for the complete sample is given as follows:
The pseudo log-likelihood function is expressed as follows:
where
In the M-step, we maximize the pseudo log-likelihood function with respect to the shape and scale parameters. The estimates obtained at the s-th stage are used to evaluate the expectations in (8), and the updated estimates at the (s + 1)-th stage are then obtained by maximizing the following:
As a result, the associated likelihood equations are
and
The estimate of one of the parameters at the (s + 1)-th stage can then be characterized as follows:
Therefore, the maximization of (10) can be obtained by solving the following fixed-point equation:
where
Once the difference between successive iterates falls below a specified tolerance, the iteration stops. After the first parameter is obtained, the second is computed from the expression above. To obtain the maximum likelihood estimators of both parameters, the E-step and M-step are repeated until the algorithm converges.
The steps required to implement the EM algorithm are as follows (a schematic implementation is sketched after the list):
- (1) Choose initial values for the two parameters.
- (2) Calculate the expectations required in the E-step.
- (3) Solve the fixed-point equation in (14).
- (4) Set the updated parameter estimates to the solutions of step (3).
- (5) Repeat steps (2)–(4) until convergence.
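The closed-form E-step expectations and the fixed-point Equation (14) depend on the specific parameterization used in this paper and are not reproduced here. As a minimal sketch of the same idea, the code below assumes the Lomax parameterization F(x) = 1 − (1 + x/b)^(−a) and replaces the analytical E-step with a Monte Carlo E-step that imputes the removed lifetimes from their conditional (left-truncated) distribution; the names mcem_lomax and complete_mle and all numerical settings are illustrative, not the paper's exact expressions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)

def complete_mle(z):
    """Complete-sample MLE of (shape a, scale b) for the Lomax form, profiling over b."""
    N = z.size
    def neg_profile(log_b):
        b = np.exp(log_b)
        S = np.sum(np.log1p(z / b))
        a_hat = N / S                       # closed-form shape estimate given b
        return -(N * np.log(a_hat) - N * np.log(b) - (a_hat + 1.0) * S)
    res = minimize_scalar(neg_profile, bounds=(-5.0, 10.0), method="bounded")
    b = np.exp(res.x)
    return N / np.sum(np.log1p(z / b)), b

def mcem_lomax(x, R, a0=1.0, b0=1.0, n_iter=100, n_imp=200):
    """Monte Carlo EM for progressively Type-II censored Lomax-form GPD data.

    x : observed failure times x_1 < ... < x_m
    R : censoring scheme (R_1, ..., R_m); R_j units removed after the j-th failure
    """
    x, R = np.asarray(x, float), np.asarray(R)
    a, b = a0, b0
    for _ in range(n_iter):
        # E-step (Monte Carlo): impute each removed lifetime from its conditional
        # left-truncated Lomax distribution at the current parameter values.
        imputed = []
        for xj, Rj in zip(x, R):
            if Rj > 0:
                U = rng.uniform(size=n_imp * Rj)
                imputed.append(b * ((1.0 + xj / b) * U ** (-1.0 / a) - 1.0))
        # Replicate the observed failures so both parts carry the correct relative weight.
        z_full = np.concatenate([np.repeat(x, n_imp)] + imputed)
        # M-step: complete-sample MLE on the augmented data.
        a, b = complete_mle(z_full)
    return a, b
```

Because each censored unit is imputed n_imp times, the observed failures are replicated by the same factor so that the observed and imputed parts of the augmented sample receive equal weight per unit.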
3. Asymptotic Confidence Intervals
According to [19], let Q represent the complete data and U represent the observed data. We denote the observed information by I_U and the complete information by I_Q, and the two are related through the missing information. Accordingly, we introduce a new function, I_{Q|U}, to represent the missing information, so that I_U = I_Q − I_{Q|U}.
where
Given the i-th observed datum, we derive the corresponding Fisher information for the censored data removed at that stage.
As a result, we can obtain the total missing information as follows:
Additionally, we employ a numerical technique to obtain the following:
The expressions for the required derivative terms are provided in Appendix A.
Next, we derive the conditional distribution of the censored data and consider its observed information, which involves the expectations outlined in (18):
The representations of the corresponding terms are provided in Appendix A.
By employing the two matrices referenced above, we can derive the observed information matrix and then calculate the variances of the maximum likelihood estimates of the parameters individually. The asymptotic variance–covariance matrix of the MLEs is given by the inverse of the observed information matrix.
This allows us to establish 100(1 − γ)% asymptotic confidence intervals (ACIs) for the estimates, where 0 < γ < 1. By asymptotic normality, the MLE of each parameter is approximately normally distributed around the true value, with variance given by the corresponding diagonal element of the asymptotic variance–covariance matrix. Therefore, the ACI for a parameter can be defined as follows:
where z_{γ/2} denotes the upper (γ/2)-th quantile of the standard normal distribution and the variance term is the corresponding principal diagonal element of the inverse observed information matrix. The ACI for the other parameter is derived in the same manner.
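As a small illustration of this final step, the sketch below computes Wald-type ACIs from a given observed information matrix evaluated at the MLEs; the argument names are illustrative, and the matrix itself is assumed to have been assembled as described above.

```python
import numpy as np
from scipy.stats import norm

def asymptotic_cis(theta_hat, I_obs, level=0.95):
    """Wald-type asymptotic confidence intervals from an observed information matrix.

    theta_hat : array of MLEs, e.g. (shape, scale)
    I_obs     : observed Fisher information matrix evaluated at theta_hat
    """
    cov = np.linalg.inv(I_obs)             # asymptotic variance-covariance matrix
    se = np.sqrt(np.diag(cov))             # standard errors from the principal diagonal
    z = norm.ppf(1.0 - (1.0 - level) / 2)  # upper quantile of the standard normal
    lower = theta_hat - z * se
    upper = theta_hat + z * se
    return np.column_stack([lower, upper])
```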
4. Bayesian Estimation
In contrast to maximum likelihood estimation, the Bayesian approach regards a parameter not as an unknown deterministic quantity but as a random variable. After the sample is observed, the prior density is updated to the posterior density, which forms the basis of Bayesian estimation. The core objective of Bayesian estimation is to derive the optimal estimate of the parameter through Bayesian decision theory, thereby minimizing the expected risk.
Bayesian statistics differs from traditional statistical methods by incorporating subjective prior information about the parameters into the reliability analysis. The prior distribution reflects the experimenter's assessment, informed by knowledge and data collected before the experiment. While this subjective element may appear to conflict with the objective nature of scientific inquiry, it offers significant advantages by combining prior knowledge with the information contained in the likelihood function, allowing prior information and the observed sample to be integrated coherently.
We now outline the Bayesian estimation procedure. We assume that the two parameters follow independent gamma prior distributions with given hyperparameters. Furthermore, let the progressively Type-II censored samples be drawn from the generalized Pareto distribution.
Let x represent the observed value of X. The joint prior distribution is obtained as follows:
Based on the joint prior distribution above and the likelihood function, the joint posterior distribution can be expressed as follows:
Since the normalizing constant involves a double integral, the posterior presents significant analytical challenges. Therefore, we utilize the TK algorithm discussed by [20], alongside the MH algorithm, to approximate the posterior expectations and thereby obtain approximate Bayesian estimates of the two parameters.
4.1. Loss Functions
In this subsection, we examine three loss functions used in Bayesian statistics and their mathematical properties: the squared error loss (SEL) function, the LINEX loss function, and the balanced squared error loss (BSEL) function.
The SEL function is as follows:
where the first argument denotes a Bayesian estimate of the parameter; under this loss, the Bayes estimate is the posterior expectation.
The SEL function is a conventional symmetric loss function; however, in many practical situations, applying it without weighting or without distinguishing between overestimation and underestimation can bias the results.
In such instances, it may be beneficial to employ an asymmetric loss function. As discussed by [13], the LINEX loss function is often preferred among the available options; it is as follows:
Note that the assumption made here on the LINEX constant entails no loss of generality. The corresponding Bayes estimate of the parameter is as follows:
In their work, [21] introduced a balanced loss function that is commonly employed to combine goodness of fit and estimation precision in the evaluation of estimators. Let the observed data values be given as before. The BSEL function is formulated as follows:
where the weight lies between zero and one. We take the target estimator to be the maximum likelihood estimate of the parameter, and the remaining component is the SEL function defined above. The Bayesian estimate under the BSEL function is then expressed as follows:
Because the weight can vary, additional scenarios can be considered. For instance, for the BSEL function with the maximum likelihood estimator as the target estimator, the Bayesian estimate coincides with the MLE when the weight equals one, whereas when the weight equals zero, the Bayesian estimate reduces to the estimate under the SEL function.
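Given draws from the posterior distribution of a parameter (for example, from the Metropolis–Hastings sampler of Section 4.3), the three Bayes point estimates have simple sample-based forms, illustrated in the sketch below; the draw array, the LINEX constant h, and the BSEL weight are illustrative inputs rather than the paper's chosen values.

```python
import numpy as np

def bayes_estimates(draws, mle, h=0.5, omega=0.3):
    """Bayes point estimates from posterior draws under SEL, LINEX and BSEL.

    draws : 1-D array of posterior samples of the parameter
    mle   : maximum likelihood estimate (the target estimator in BSEL)
    h     : LINEX asymmetry constant
    omega : BSEL weight on the target estimator
    """
    sel = np.mean(draws)                              # posterior mean
    linex = -np.log(np.mean(np.exp(-h * draws))) / h  # -(1/h) * log E[exp(-h * theta)]
    bsel = omega * mle + (1.0 - omega) * sel          # convex combination of MLE and SEL
    return {"SEL": sel, "LINEX": linex, "BSEL": bsel}
```

Setting omega = 1 returns the MLE itself, while omega = 0 recovers the SEL estimate, matching the two limiting cases noted above.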
4.2. TK Method
First, the posterior expectation of a function of the parameters under the posterior distribution is given as follows:
where the integrand involves the log-likelihood function and the prior density. To estimate this posterior expectation, we utilize the TK method, which addresses the difficulty of evaluating the ratio of integrals. To derive the explicit approximation, we examine the following two functions:
and
We maximize the two functions individually, obtaining the respective maximizers, and then approximate the posterior expectation as follows:
In this context, the two determinant factors are those of the negative inverse Hessian matrices of the two functions, evaluated at their respective maximizers. The specific steps are as follows:
Now, we obtain the first of these functions as follows:
where the required expressions are provided in Appendix A.
In line with the steps outlined above, the elements of the second matrix are provided in Appendix A.
As a result, we can derive the desired approximation as follows:
Based on the calculations outlined above, we now take the function of interest to be each of the two parameters in turn, and the corresponding Bayes estimates under the SEL function are obtained accordingly.
Further, Bayesian estimates can be derived under the BSEL function introduced above, which combines the maximum likelihood estimate and the posterior expectation through the weight.
In a similar manner, we derive Bayesian estimations utilizing the LINEX loss function:
Here, the function of interest is replaced by the exponential transform required by the LINEX loss, and its posterior expectation is approximated by the TK method as above. This extends the Bayesian estimation of the unknown parameters beyond the SEL and BSEL functions and increases its applicability across various fields.
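Numerically, the TK approximation only requires two maximizations and two Hessian determinants. The following is a minimal sketch under the assumption that log_post(theta) returns the unnormalized log-posterior (log-likelihood plus log-prior) and that the function g of interest is positive (as with the parameters themselves or the exponential transform used for LINEX); the finite-difference Hessian and all names are illustrative, not the paper's exact expressions.

```python
import numpy as np
from scipy.optimize import minimize

def num_hessian(f, x, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at the point x."""
    x = np.asarray(x, dtype=float)
    k = x.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            e_i, e_j = np.zeros(k), np.zeros(k)
            e_i[i], e_j[j] = eps, eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps ** 2)
    return H

def tk_posterior_mean(g, log_post, theta0, n):
    """Tierney-Kadane approximation of E[g(theta) | data].

    g        : positive function of the parameter vector
    log_post : unnormalized log-posterior (log-likelihood + log-prior)
    theta0   : starting point for the two maximizations
    n        : sample size used in the 1/n scaling of the exponents
    """
    delta = lambda t: log_post(t) / n
    delta_g = lambda t: delta(t) + np.log(g(t)) / n

    opt = minimize(lambda t: -delta(t), theta0, method="Nelder-Mead")
    opt_g = minimize(lambda t: -delta_g(t), theta0, method="Nelder-Mead")

    sigma = np.linalg.inv(-num_hessian(delta, opt.x))      # negative inverse Hessian
    sigma_g = np.linalg.inv(-num_hessian(delta_g, opt_g.x))

    ratio = np.sqrt(np.linalg.det(sigma_g) / np.linalg.det(sigma))
    return ratio * np.exp(n * (delta_g(opt_g.x) - delta(opt.x)))
```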
4.3. Metropolis–Hastings Algorithm
In this section, we consider Bayesian estimation based on the Metropolis–Hastings algorithm, which allows a comparison with the TK-based estimates. As noted by [22], the MH algorithm is a simulation-based Markov chain Monte Carlo technique, primarily used for sampling from a specified probability distribution. The core idea is to construct a Markov chain whose stationary distribution is the desired probability density.
To estimate the parameters using Bayesian methods, we adopt a bivariate normal proposal distribution for the parameter pair. The MH algorithm is then applied to generate samples from the joint posterior distribution. Bayesian estimates are derived under the three loss functions, and highest posterior density (HPD) intervals are constructed. The detailed steps of this process are outlined in Algorithm 1.
Algorithm 1 MH algorithm
Step 1: Select an initial value of the parameter pair as the starting state of the chain.
Step 2: Generate a proposal from the bivariate normal distribution centered at the current state, whose variance–covariance matrix is usually taken to be the inverse of the Fisher information matrix.
Step 3: Calculate the acceptance probability as the ratio of the posterior density at the proposal to the posterior density at the current state, truncated at one.
Step 4: Generate u from the uniform distribution U(0, 1).
Step 5: If u does not exceed the acceptance probability, accept the proposal as the new state; otherwise, retain the current state.
Step 6: Repeat the steps outlined above a total of M times to obtain the desired number of samples.
We discard the initial burn-in iterations. Using the retained samples, the Bayesian estimates under the SEL function can be computed as follows:
Similarly, the Bayesian estimates under the LINEX function can be calculated as follows:
Additionally, we establish the 100(1 − γ)% highest posterior density (HPD) interval for each parameter, as defined below:
To establish the lower and upper bounds of the interval, we proceed as follows, based on the ordered posterior samples:
The HPD interval for the other parameter can be obtained in a similar manner.
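A compact sketch of the sampler and of the HPD computation from the retained draws is given below; the bivariate normal proposal covariance (for instance, the inverse observed information matrix), the chain length, and the burn-in size are illustrative choices, and log_post again denotes the unnormalized log-posterior.

```python
import numpy as np

def mh_sampler(log_post, theta0, prop_cov, n_iter=20000, burn_in=5000, seed=1):
    """Random-walk Metropolis-Hastings with a bivariate normal proposal."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    chol = np.linalg.cholesky(prop_cov)
    draws = []
    lp = log_post(theta)
    for _ in range(n_iter):
        prop = theta + chol @ rng.standard_normal(theta.size)
        if np.all(prop > 0):                           # both GPD parameters are positive
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp:   # accept with prob min(1, ratio)
                theta, lp = prop, lp_prop
        draws.append(theta.copy())
    return np.array(draws[burn_in:])                   # discard the burn-in draws

def hpd_interval(draws, level=0.95):
    """Shortest interval containing `level` of the sorted posterior draws."""
    x = np.sort(np.asarray(draws))
    n = x.size
    k = int(np.floor(level * n))
    widths = x[k:] - x[: n - k]
    j = np.argmin(widths)
    return x[j], x[j + k]
```

Applying hpd_interval to each column of the returned draws gives the HPD interval for the corresponding parameter, and the same draws feed the point estimates of Section 4.1.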
5. Bayesian Prediction
In this section, we use the available information to produce estimates for future samples and to construct the corresponding prediction intervals. Predicting future samples is a crucial task in fields such as industrial, clinical, and agricultural experimentation.
5.1. One-Sample Prediction
For one-sample prediction, we assume that the observed sample, obtained under the progressive Type-II censoring scheme described earlier, is drawn from a generalized Pareto distribution. We denote the failure times of a future sample of size K by their order statistics, and we let the r-th failure of the future sample (for r = 1, …, K) be the quantity to be predicted from the observed samples, with its recorded values arranged in increasing order.
Based on the function presented in Equation (2), we can derive the corresponding cumulative distribution function (CDF) as follows:
We then obtain the predictive survival function. Using the joint posterior distribution of the two parameters, we derive the posterior predictive cumulative distribution function together with the associated survival function:
Next, we derive the Bayesian prediction interval corresponding to the chosen confidence level:
Following this, we can generate the predictive estimation for the future r-th ordered lifetime:
where
The integrals above cannot be solved analytically. Utilizing the MH algorithm, we derive the predictive estimate as follows:
5.2. Two-Sample Prediction
First, we consider a progressively censored sample of size m drawn from a population following the generalized Pareto distribution. We denote the ordered failure times of an independent future sample of size K, and we define the r-th failure time from this future sample as the quantity to be predicted. Its density function is as follows:
Then, its posterior predictive density function is as follows:
Utilizing the MH algorithm, the approximate result can be efficiently calculated as follows:
Furthermore, we derive the corresponding posterior survival function, which is:
where
A Bayesian predictive interval for w is now constructed at the chosen credibility level by solving the following equations:
The next step involves obtaining the predictive estimation of the r-th future ordered lifetime as follows:
The predictive estimate is derived using the MH algorithm as follows:
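Because the prediction integrals are evaluated through the MH draws anyway, a simple simulation-based alternative is to generate, for each retained posterior draw, a future sample of size K from the fitted GPD and record its r-th order statistic; the point prediction and an equal-tailed predictive interval then follow from these simulated values. The sketch below takes this route, again assuming the Lomax parameterization, so it is a substitute for, not a reproduction of, the closed-form expressions above.

```python
import numpy as np

def predict_future_order_stat(post_draws, K, r, level=0.95, seed=2):
    """Posterior-predictive point and interval prediction of the r-th order
    statistic of a future sample of size K from a Lomax-form GPD.

    post_draws : (M, 2) array of posterior draws of (shape a, scale b)
    """
    rng = np.random.default_rng(seed)
    y_r = np.empty(post_draws.shape[0])
    for i, (a, b) in enumerate(post_draws):
        u = rng.uniform(size=K)
        y = b * (u ** (-1.0 / a) - 1.0)   # inverse-CDF sampling: F(y) = 1 - (1 + y/b)^(-a)
        y_r[i] = np.sort(y)[r - 1]        # r-th smallest future lifetime
    alpha = 1.0 - level
    point = y_r.mean()                    # predictive point estimate
    lower, upper = np.quantile(y_r, [alpha / 2, 1 - alpha / 2])
    return point, (lower, upper)
```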
6. Simulation
Following the methodology presented by [2], the performance of the proposed estimation and prediction techniques is assessed by simulation. The sample generation procedure is outlined in Algorithm 2:
Algorithm 2 Sample generation
Step 1: Generate m independent values W_1, …, W_m from the uniform distribution U(0, 1).
Step 2: After fixing the censoring scheme (R_1, …, R_m), let V_i = W_i^(1/(i + R_m + R_(m−1) + … + R_(m−i+1))) for i = 1, …, m.
Step 3: Let U_i = 1 − V_m V_(m−1) … V_(m−i+1), i = 1, …, m; then U_1, …, U_m is a progressively Type-II censored sample from the uniform distribution.
Step 4: Let X_i = F^(−1)(U_i), i = 1, …, m, where F stands for the CDF of the generalized Pareto distribution.
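A direct implementation of Algorithm 2 is sketched below; the uniform-to-progressively-censored transformation follows the standard construction of [2], while the final inverse-CDF step assumes the Lomax parameterization, so the parameter names a and b are illustrative.

```python
import numpy as np

def progressive_type2_sample(R, a, b, seed=0):
    """Generate a progressive Type-II censored sample from a Lomax-form GPD.

    R    : censoring scheme (R_1, ..., R_m) with sum(R) = n - m
    a, b : shape and scale, with F(x) = 1 - (1 + x/b)^(-a)
    """
    rng = np.random.default_rng(seed)
    R = np.asarray(R)
    m = R.size
    W = rng.uniform(size=m)                          # Step 1: W_i ~ U(0, 1)
    # Step 2: V_i = W_i^{1 / (i + R_m + R_{m-1} + ... + R_{m-i+1})}
    exponents = np.arange(1, m + 1) + np.cumsum(R[::-1])
    V = W ** (1.0 / exponents)
    # Step 3: U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}
    U = 1.0 - np.cumprod(V[::-1])
    # Step 4: X_i = F^{-1}(U_i); U is increasing, so X is already ordered.
    X = b * ((1.0 - U) ** (-1.0 / a) - 1.0)
    return X
```

For example, a scheme that removes all surviving units at the final observed failure corresponds to R = (0, …, 0, n − m).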
In our study, we generate the required censored data from the generalized Pareto distribution defined above. The true parameter values are fixed in advance, and the maximum likelihood estimates are calculated using the expectation–maximization method. For Bayesian estimation and prediction, the hyperparameters of the gamma priors are assigned fixed values. Additionally, we compute Bayesian estimates under the SEL, LINEX, and BSEL loss functions, employing both the TK and Metropolis–Hastings approaches.
We describe each censoring scheme in a compact notation: when the same number of units is removed at several consecutive failure times, the repeated value is recorded together with its multiplicity. We use the following ten censoring schemes in our simulations.
Based on the schemes presented in Table 1, we conduct a simulation study. The results are compiled and displayed below.
In Table 2, we present the results of the numerical experiments conducted for each simulation setting. We report the maximum likelihood estimates and the Bayesian estimates obtained with the TK method and the Metropolis–Hastings algorithm, together with the mean square error (MSE) values, which serve as the basis for comparing the results.
In the table, for each censoring scheme, the estimated values are displayed in the first and third rows, while the second and fourth rows contain the corresponding MSE values. The fifth and sixth columns present the results obtained from the MH method using the BSEL function. The seventh and eighth columns illustrate the results from the MH method employing the LINEX function. The tenth and eleventh columns show the outcomes using the TK method under the BSEL function, and the twelfth and thirteenth columns provide the results from the TK method under the LINEX function.
It is noteworthy that the MH method yields smaller MSEs than the TK approach. In terms of MSE, the Bayesian estimates of the parameters under the LINEX function exhibit lower MSEs than those under the SEL and BSEL functions, although the Bayesian estimates under SEL are closer to the true values. Regarding the MLEs, larger sample sizes (n and m, denoting the total number of units and the number of observed failures, respectively) result in more accurate estimates. Overall, the findings indicate that Bayesian estimation has a clear advantage over the MLEs.
In Table 3, we present the various intervals constructed using the MH algorithm, namely the average credible intervals and the highest posterior density (HPD) intervals, evaluated in terms of their average length (AL) and coverage probability (CP). In summary, the Bayesian interval estimates demonstrate a clear advantage over those based on maximum likelihood estimation, and the ALs of the HPD intervals are shorter. The results are detailed below:
In Table 4, we present the point predictions alongside the corresponding 95% prediction intervals. These predictions concern future ordered lifetimes, indexed by p, within a future dataset of size 10. Overall, we observe that as p increases, the length of the prediction intervals widens.
7. Real Data Analysis
Following [23], we apply the aforementioned methods to a real dataset comprising 50 observations (see Table 5), described as follows:
This dataset presents the cluster maxima of daily ozone concentrations exceeding 0.11 ppm during the summer months at the Pedregal station from 2002 to 2007. This research is highly relevant to efforts aimed at protecting public health in urban areas.
To conduct the analysis, we first calculate the maximum likelihood estimates of the two parameters. We then assess the goodness of fit of the generalized Pareto distribution using several practical criteria, including the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the Kolmogorov–Smirnov (K-S) statistic. For comparison, we also evaluate alternative distributions, namely the shifted exponential distribution (SED) and the inverse Weibull distribution (IWD).
The PDF of SED is as follows:
The PDF of IWD is as follows:
The statistical results and the maximum likelihood estimates are presented in Table 6. The findings indicate that the generalized Pareto distribution is the most appropriate model.
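For completeness, a sketch of how such a model comparison can be computed for one candidate distribution is given below; it assumes the maximized log-likelihood and a fitted CDF are available, and the commented usage example uses the Lomax-form GPD CDF with hypothetical fitted values a_hat and b_hat.

```python
import numpy as np
from scipy.stats import kstest

def fit_criteria(data, log_lik, n_params, cdf):
    """AIC, BIC and the Kolmogorov-Smirnov statistic for a fitted model.

    data     : observed sample
    log_lik  : maximized log-likelihood of the model
    n_params : number of estimated parameters
    cdf      : fitted CDF, callable on the data
    """
    n = len(data)
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * np.log(n) - 2 * log_lik
    ks = kstest(data, cdf)                 # K-S test against the fitted CDF
    return {"AIC": aic, "BIC": bic, "K-S": ks.statistic, "p-value": ks.pvalue}

# Example for the GPD with hypothetical fitted values a_hat and b_hat:
# gpd_cdf = lambda x: 1.0 - (1.0 + x / b_hat) ** (-a_hat)
# print(fit_criteria(data, gpd_loglik, 2, gpd_cdf))
```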
To illustrate the aforementioned methods, we present two schemes of censored samples obtained from the dataset under progressive Type-II censoring.
Scheme 1:
Scheme 2:
The maximum likelihood estimates obtained using the expectation–maximization algorithm, the Bayesian estimates obtained with the TK and Metropolis–Hastings methods, and the Bayesian predictions are presented in Table 7, Table 8 and Table 9.
We computed the maximum likelihood estimates using the expectation–maximization algorithm and derived the Bayesian estimates using the TK method and the Metropolis–Hastings method. In the absence of prior information, all hyperparameters were set to values close to zero.
The results for both the MLEs and the Bayesian estimates are presented in Table 7 and Table 8. Furthermore, Table 9 provides Bayesian point predictions along with 95% Bayesian interval predictions for future ordered lifetimes, with K set to 10.
8. Conclusions
In conclusion, this study examines censored data using both classical and Bayesian inference methods, specifically focusing on progressive Type-II censoring from the generalized Pareto distribution. Initially, maximum likelihood estimates are derived utilizing the expectation–maximization algorithm. Bayesian statistical methods are subsequently employed, utilizing three distinct loss functions (SEL, LINEX, and BSEL) for parameter estimation, with the TK method applied to manage the complexities associated with the posterior expectation. The Metropolis–Hastings algorithm is utilized for deriving Bayesian estimates and highest posterior density (HPD) intervals, as well as for making predictions regarding future samples. A simulation study is conducted to evaluate the effectiveness of these methodologies, alongside an analysis of a practical dataset. Furthermore, these approaches hold the potential for application to other distributions, such as the Gompertz and Weibull distributions, and may also facilitate further exploration of Bayesian prediction for the generalized Pareto distribution in the context of general progressive censoring in future research initiatives.
This study relies on the assumption that the data follow a Generalized Pareto Distribution (GPD). If the underlying data do not conform to this distribution, the estimates and predictions may be inaccurate. The use of a progressive Type-II censoring scheme may limit the generalizability of the findings. Different censoring schemes could yield different results, and the implications of using this specific scheme should be carefully considered. The effectiveness of the estimation and prediction methods may be sensitive to sample size. Small sample sizes can lead to unreliable estimates and wider credible intervals, affecting the precision of predictions.
The findings can be applied in various fields such as finance, environmental science, and reliability engineering, where modeling extreme values is crucial. The ability to make Bayesian predictions enhances decision-making under uncertainty. This work contributes to the statistical literature by providing a framework for Bayesian estimation and prediction in the context of the GPD. It encourages further exploration of Bayesian methods in extreme value theory. The limitations identified in this study open avenues for future research. For instance, exploring alternative censoring schemes, different prior distributions, or extending the model to accommodate other distributions could enhance the robustness of the findings. The ability to estimate and predict using the GPD can improve risk assessment strategies in various industries, allowing for better management of extreme events and their potential impacts.