Article

Statistical Analysis Under a Random Censoring Scheme with Applications

by
Mustafa M. Hasaballah
1,* and
Mahmoud M. Abdelwahab
2
1
Department of Basic Sciences, Marg Higher Institute of Engineering and Modern Technology, Cairo 11721, Egypt
2
Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
*
Author to whom correspondence should be addressed.
Symmetry 2025, 17(7), 1048; https://doi.org/10.3390/sym17071048
Submission received: 4 May 2025 / Revised: 12 June 2025 / Accepted: 24 June 2025 / Published: 3 July 2025

Abstract

The Gumbel Type-II distribution is a widely recognized and frequently utilized lifetime distribution, playing a crucial role in reliability engineering. This paper focuses on the statistical inference of the Gumbel Type-II distribution under a random censoring scheme. From a frequentist perspective, point estimates for the unknown parameters are derived using the maximum likelihood estimation method, and confidence intervals are constructed based on the Fisher information matrix. From a Bayesian perspective, Bayes estimates of the parameters are obtained using the Markov Chain Monte Carlo method, and the average lengths of credible intervals are calculated. The Bayesian inference is performed under both the squared error loss function and the general entropy loss function. Additionally, a numerical simulation is conducted to evaluate the performance of the proposed methods. To demonstrate their practical applicability, a real-world example is provided, illustrating the application and development of these inference techniques. In conclusion, the Bayesian method appears to outperform the other approaches, although each method offers unique advantages.

1. Introduction

The Gumbel Type-II (G-II) distribution was first proposed by the German mathematician Emil Gumbel [1]. It is an effective tool for modeling extreme values, such as earthquakes, floods, and other natural calamities, and is frequently used in disciplines including hydrology, life expectancy research, and rainfall analysis. According to Gumbel, it excels at modeling the anticipated lifespan of products, especially in comparative lifetime testing. The G-II distribution has also been helpful in predicting the probability of specific meteorological phenomena and natural disasters.
Statistical inference techniques for the G-II distribution have been extensively studied by numerous researchers. For example, Mousa et al. [2] utilized simulations with predetermined parameter values to evaluate Bayesian estimates, while [3] explored Bayesian estimation under the k-th lower value scenario. Nadarajah and Kotz [4] investigated maximum likelihood methods for the beta Gumbel distribution. Additionally, Miladinovic and Tsokos [5] examined the sensitivity of Bayesian reliability estimates for a modified Gumbel failure model using the SE loss function.
Feroze and Aslam [6] explored Bayesian estimation for the G-II distribution under doubly censored samples and various loss functions. This work was further developed by Abbas et al. [7], who investigated Bayesian inference for the G-II distribution, including applications for two competing units, as initially proposed in [8]. Abbas et al. [9] derived Bayesian estimation for type-II censored data under various loss functions and non-informative priors, employing Lindley’s approximation. More recently, Qiu and Gui [10] applied joint type-II censoring to achieve Bayesian estimation for two G-II distributions.
These investigations demonstrate the distribution’s adaptability and significance in advanced statistical modeling and estimation methods. The cumulative distribution function (CDF) is given as follows:
F(x) = e^{-\rho x^{-\psi}}, \qquad x > 0, \; \rho > 0, \; \psi > 0.
Furthermore, the corresponding probability density function (PDF) is expressed as follows:
f(x) = \rho \psi \, x^{-(\psi + 1)} e^{-\rho x^{-\psi}}, \qquad x > 0, \; \rho > 0, \; \psi > 0.
In this case, ψ represents the shape parameter, while ρ is the scale parameter of the distribution. Figure 1 illustrates the PDF and CDF of the G-II distribution for various values of the parameters ψ and ρ.
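Because the CDF above is invertible in closed form, samples from the G-II distribution can be drawn by inverse-transform sampling: setting u = F(x) and solving gives x = (−ln u / ρ)^(−1/ψ). The following sketch (function names are our own, not from the paper) implements the PDF, CDF, and sampler:

```python
import numpy as np

def gumbel2_cdf(x, rho, psi):
    # F(x) = exp(-rho * x**(-psi)), valid for x > 0
    return np.exp(-rho * np.power(x, -psi))

def gumbel2_pdf(x, rho, psi):
    # f(x) = rho * psi * x**(-(psi + 1)) * exp(-rho * x**(-psi))
    return rho * psi * np.power(x, -(psi + 1.0)) * np.exp(-rho * np.power(x, -psi))

def gumbel2_rvs(rho, psi, size, rng=None):
    # Inverse-transform sampling: u = F(x) gives x = (-ln(u) / rho)**(-1/psi)
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=size)
    return np.power(-np.log(u) / rho, -1.0 / psi)
```

For example, the theoretical median is (−ln 0.5 / ρ)^(−1/ψ), and roughly half of a large simulated sample should fall below it.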
Censoring data plays a critical role in statistical estimation, particularly in fields like survival analysis, reliability engineering, and medical research. It arises when complete information about a data point is not observed due to limitations in study design, time constraints, or other practical barriers. For instance, in reliability studies, the lifetime of a product might only be partially observed if the study ends before all items fail. Censoring allows researchers to make efficient use of incomplete data, ensuring that valuable information is not discarded. By employing appropriate statistical methods, such as likelihood-based approaches or Bayesian frameworks, censored data can provide robust parameter estimates and deeper insights into underlying distributions. The ability to model censored data effectively enhances the precision and applicability of statistical analyses in real-world scenarios, where complete data collection is often infeasible.
Due to time and economic limitations, which often hinder the data acquisition process, obtaining a complete data set can be challenging. Such incomplete information is referred to as censored data. Different censoring schemes for analyzing this kind of data are described in the literature, the two most widely used being Type-I and Type-II censoring. Under Type-I censoring, events are recorded only if they occur before a prespecified time. Under Type-II censoring, observation continues until a predetermined number of events has occurred.
One method that has been extensively discussed in the literature is random censoring (RC). It occurs when a research subject drops out or is removed from an experiment before the event of interest is observed. RC is common in medical time-to-event research, including clinical trials, owing to the potential for treatment discontinuation or early study termination; the data from such participants are treated as randomly censored. The concept of RC was first proposed by Gilbert in his dissertation [11]. Numerous studies have addressed this topic, including Breslow and Crowley [12], Koziol and Green [13], Ghitany and Al-Awadhi [14], Saleem and Aslam [15], Danish and Aslam [16], Krishna et al. [17], Garg et al. [18], Krishna and Goel [19], Krishna and Goel [20], Garg et al. [21], Ajmal et al. [22], Goel and Krishna [23], Hasaballah et al. [24], Alshenawy et al. [25], Hassan et al. [26], Almetwally et al. [27], Hassan and Ahmed [28], and El-Saeed and Abdellatif [29].
The manuscript emphasizes the statistical inference of the Gumbel Type-II (G-II) distribution, a model extensively utilized in reliability engineering with broad practical applications across various domains. It introduces both frequentist and Bayesian approaches to parameter estimation. The frequentist method employs maximum likelihood estimation (MLE) and derives asymptotic confidence intervals (ACIs), while the Bayesian approach utilizes Markov Chain Monte Carlo (MCMC) techniques to compute Bayes estimators and credible intervals (CRIs). The manuscript compares the performance of these methods through numerical simulations, demonstrating that Bayesian methods typically outperform traditional techniques, particularly concerning average relative bias (ARB) values. Furthermore, it examines the influence of different loss functions, such as squared error (SE) and general entropy (GE), on estimation performance, providing valuable insights into their efficacy under diverse conditions. A real-world example is also presented to highlight the practical relevance of the proposed inference methods. The primary aim is to enhance the understanding and application of the G-II distribution under RC, while delivering robust estimation techniques and comprehensive comparative analyses.
This article explores inferential techniques for the G-II distribution under an RC model using both classical and Bayesian frameworks. The structure of the paper is as follows: Section 2 provides the mathematical foundation for RC, including the distributions of censoring times and failures. Section 3 focuses on the MLE approach and the construction of ACIs for the parameters. The Bayesian inferential approach is detailed in Section 4, which incorporates two loss functions, SE and GE, along with gamma informative priors. The MCMC method is employed to derive CRIs for the parameters. Section 5 demonstrates the practical application of the proposed methods through a real-world dataset. Section 6 examines the characteristics of the various estimators based on simulation analysis. Concluding remarks are presented in Section 7.

2. Random Censoring Scheme

For a study with b participants, let X 1 , … , X b denote the event times. These event times are modeled as independent and identically distributed (iid) random variables with CDF Q X ( x ) and PDF q X ( x ) . The sequence T 1 , … , T b represents the RC times, which are also iid, with CDF Q T ( t ) and PDF q T ( t ) , and X i and T i are assumed to be mutually independent. With H i = min ( X i , T i ) for i = 1 , … , b , the observed data consist of the iid pairs ( H 1 , G 1 ) , … , ( H b , G b ) , where the censoring indicator G i is defined as follows:
G_i = \begin{cases} 1 & \text{if } X_i \le T_i, \\ 0 & \text{if } X_i > T_i. \end{cases}
Under this setup, the joint PDF of H i and G i is given by
q_{H,G}(h, g) = \left[ q_X(h) \left( 1 - Q_T(h) \right) \right]^{g} \left[ q_T(h) \left( 1 - Q_X(h) \right) \right]^{1-g}, \qquad h \ge 0, \; g = 0, 1.
Assuming a proportionality constant η > 0 , the random variables X and T follow the proportional hazards model if
1 - Q_T(t) = \left[ 1 - Q_X(t) \right]^{\eta}.
When η = 0 , this model corresponds to the case of no censoring. By combining the expressions for the joint PDF q H , G ( h , g ) and the proportional hazards model, the joint PDF of H i and G i is derived as
q_{H,G}(h, g) = q_X(h) \left[ 1 - Q_X(h) \right]^{\eta} \eta^{1-g}, \qquad h \ge 0, \; g = 0, 1.
By combining Equations (1)–(3), the joint PDF can be further expressed in terms of the specific parameters ψ , ρ , and η as
q_{H,G}(h, g; \psi, \rho, \eta) = \psi \rho \, h^{-(\psi + 1)} e^{-\rho h^{-\psi}} \left[ 1 - e^{-\rho h^{-\psi}} \right]^{\eta} \eta^{1-g}, \qquad h \ge 0, \; g = 0, 1.
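The random censoring scheme above can be simulated directly: draw X from the G-II distribution, draw T from the proportional hazards (Koziol–Green) model 1 − Q_T(t) = [1 − Q_X(t)]^η by inverting Q_T, and record H = min(X, T) with the indicator G. A sketch under these assumptions (function name is ours):

```python
import numpy as np

def censored_sample(rho, psi, eta, b, rng=None):
    # Generate a randomly censored sample (h_i, g_i), i = 1..b, with
    # X ~ Gumbel Type-II(rho, psi) and T from the proportional hazards
    # model 1 - Q_T(t) = (1 - Q_X(t))**eta.
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=b)
    x = np.power(-np.log(u) / rho, -1.0 / psi)   # event times
    v = rng.uniform(size=b)
    # Invert Q_T(t) = 1 - (1 - Q_X(t))**eta:
    # Q_X(t) = 1 - (1 - v)**(1/eta), then apply Q_X^{-1}.
    fx = 1.0 - np.power(1.0 - v, 1.0 / eta)
    t = np.power(-np.log(fx) / rho, -1.0 / psi)  # censoring times
    h = np.minimum(x, t)
    g = (x <= t).astype(int)                     # 1 = failure observed
    return h, g
```

A useful sanity check: under this model, P(G = 1) = 1/(1 + η), so with η = 1.5 about 40% of observations should be uncensored.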

3. Classical Inference

Classical inference is a cornerstone of statistical analysis, widely used across scientific disciplines for parameter estimation and hypothesis testing. It relies on the concept of long-run frequencies, interpreting probabilities as the relative frequency of an event occurring over repeated experiments. This approach offers objective results that are independent of prior beliefs, making it particularly valuable when prior information is unavailable or unreliable. Frequentist methods, such as MLE and confidence intervals, provide robust tools for analyzing data, ensuring reproducibility and consistency of results. Their importance extends to applications in experimental design, quality control, clinical trials, and more, where reliable, data-driven conclusions are critical for informed decision-making.

3.1. Maximum Likelihood Estimation

MLE is a fundamental method in statistical inference, widely used for estimating the parameters of a probability distribution. MLE operates by identifying the parameter values that maximize the likelihood function, which represents the probability of observing the given data under specific parameter assumptions. Its importance lies in its versatility and efficiency, as it provides consistent and asymptotically unbiased estimators under regular conditions. MLE is particularly valuable in a broad range of applications, including machine learning, econometrics, and experimental sciences, where accurate parameter estimation is essential for model interpretation and prediction. Additionally, MLE facilitates hypothesis testing and model comparison, making it a critical tool for data-driven decision-making in both theoretical and applied research.
In this article, we use the RC framework to obtain the MLEs for the unknown parameters of the G-II distribution. Let ( h , g ) = ( ( h 1 , g 1 ) , ( h 2 , g 2 ) , … , ( h b , g b ) ) denote a randomly censored sample of size b drawn according to Equation (4). For this censored sample ( h , g ) , the likelihood function can be written as follows:
L(\psi, \rho, \eta; h, g) = \psi^{b} \rho^{b} \, e^{-(\psi + 1) \sum_{i=1}^{b} \ln h_i} \, e^{-\rho \sum_{i=1}^{b} h_i^{-\psi}} \, e^{\eta \sum_{i=1}^{b} \ln \left( 1 - e^{-\rho h_i^{-\psi}} \right)} \, \eta^{\, b - \sum_{i=1}^{b} g_i}.
Taking the natural logarithm of both sides yields
\ell(\psi, \rho, \eta) = b \ln \psi + b \ln \rho - (\psi + 1) \sum_{i=1}^{b} \ln h_i - \rho \sum_{i=1}^{b} h_i^{-\psi} + \eta \sum_{i=1}^{b} \ln \left( 1 - e^{-\rho h_i^{-\psi}} \right) + \left( b - \sum_{i=1}^{b} g_i \right) \ln \eta.
The following system of normal equations is obtained by differentiating the log-likelihood function with respect to the parameters ψ , ρ , and η :
\frac{b}{\hat{\psi}} - \sum_{i=1}^{b} \ln h_i + \hat{\rho} \sum_{i=1}^{b} h_i^{-\hat{\psi}} \ln h_i - \sum_{i=1}^{b} \frac{\hat{\eta} \hat{\rho} \, e^{-\hat{\rho} h_i^{-\hat{\psi}}} h_i^{-\hat{\psi}} \ln h_i}{1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}}} = 0,
\frac{b}{\hat{\rho}} - \sum_{i=1}^{b} h_i^{-\hat{\psi}} + \sum_{i=1}^{b} \frac{\hat{\eta} \, e^{-\hat{\rho} h_i^{-\hat{\psi}}} h_i^{-\hat{\psi}}}{1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}}} = 0,
\sum_{i=1}^{b} \ln \left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right) + \frac{b - \sum_{i=1}^{b} g_i}{\hat{\eta}} = 0.
From Equation (8), we obtain the MLE of η as
\hat{\eta} = - \frac{b - \sum_{i=1}^{b} g_i}{\sum_{i=1}^{b} \ln \left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right)}.
The equations for ρ ^ and ψ ^ presented above do not have closed-form solutions. As a result, we use numerical iteration methods to compute the estimates.
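One simple numerical route is to maximize the log-likelihood directly rather than solve the score equations. The sketch below codes the log-likelihood of Section 3.1 (the structure follows the text; the optimizer choice and starting values are our own assumptions) and hands it to a derivative-free optimizer:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, h, g):
    # Negative log-likelihood under random censoring:
    # l = b ln(psi) + b ln(rho) - (psi+1) sum ln h_i - rho sum h_i^-psi
    #     + eta sum ln(1 - exp(-rho h_i^-psi)) + (b - sum g_i) ln(eta)
    psi, rho, eta = params
    if psi <= 0 or rho <= 0 or eta <= 0:
        return np.inf                      # enforce the parameter space
    b = len(h)
    hp = np.power(h, -psi)
    s = np.exp(-rho * hp)                  # this is F_X(h_i)
    ll = (b * np.log(psi) + b * np.log(rho)
          - (psi + 1.0) * np.sum(np.log(h))
          - rho * np.sum(hp)
          + eta * np.sum(np.log1p(-s))
          + (b - np.sum(g)) * np.log(eta))
    return -ll

# Hypothetical usage with a censored sample (h, g):
# res = minimize(neg_loglik, x0=[1.0, 1.0, 1.0], args=(h, g),
#                method="Nelder-Mead", options={"maxiter": 5000})
# psi_hat, rho_hat, eta_hat = res.x
```

With data simulated from the model, the recovered estimates should land near the generating parameter values for moderately large samples.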

3.2. Confidence Interval

Confidence intervals (CIs) are crucial tools in statistical inference that provide a range of values within which an unknown parameter is likely to lie, with a certain level of confidence. Typically expressed as a lower and upper bound, CIs offer more information than a simple point estimate by reflecting the uncertainty and variability in the data. The width of the confidence interval is influenced by factors such as sample size, variability in the data, and the chosen confidence level, often set at 95%. A narrower interval indicates more precise estimation, while a wider interval suggests greater uncertainty. CIs are particularly valuable in decision-making processes, as they help quantify the reliability of parameter estimates, allowing researchers to assess the strength of their conclusions. Furthermore, they are essential in hypothesis testing, as they provide a framework for evaluating whether a parameter falls within a hypothesized range, supporting more robust and informed statistical interpretations.
This subsection begins with the derivation of Fisher’s Information Matrix (FIM) to construct the ACI. Next, the observed FIM is utilized to calculate the estimated variances of ρ ^ , ψ ^ , and η ^ , as presented below.
I^{-1}(\hat{\rho}, \hat{\psi}, \hat{\eta}) = \begin{pmatrix} -\frac{\partial^2 \ell}{\partial \rho^2} & -\frac{\partial^2 \ell}{\partial \rho \, \partial \psi} & -\frac{\partial^2 \ell}{\partial \rho \, \partial \eta} \\ -\frac{\partial^2 \ell}{\partial \psi \, \partial \rho} & -\frac{\partial^2 \ell}{\partial \psi^2} & -\frac{\partial^2 \ell}{\partial \psi \, \partial \eta} \\ -\frac{\partial^2 \ell}{\partial \eta \, \partial \rho} & -\frac{\partial^2 \ell}{\partial \eta \, \partial \psi} & -\frac{\partial^2 \ell}{\partial \eta^2} \end{pmatrix}^{-1} = \begin{pmatrix} \widehat{\operatorname{var}}(\hat{\rho}) & \operatorname{cov}(\hat{\rho}, \hat{\psi}) & \operatorname{cov}(\hat{\rho}, \hat{\eta}) \\ \operatorname{cov}(\hat{\psi}, \hat{\rho}) & \widehat{\operatorname{var}}(\hat{\psi}) & \operatorname{cov}(\hat{\psi}, \hat{\eta}) \\ \operatorname{cov}(\hat{\eta}, \hat{\rho}) & \operatorname{cov}(\hat{\eta}, \hat{\psi}) & \widehat{\operatorname{var}}(\hat{\eta}) \end{pmatrix},
where
\frac{\partial^2 \ell}{\partial \psi^2} = -\frac{b}{\hat{\psi}^2} - \hat{\rho} \sum_{i=1}^{b} h_i^{-\hat{\psi}} (\ln h_i)^2 - \hat{\eta} \hat{\rho} \sum_{i=1}^{b} \frac{(\ln h_i)^2 \, h_i^{-\hat{\psi}} \, e^{-\hat{\rho} h_i^{-\hat{\psi}}} \left( \hat{\rho} h_i^{-\hat{\psi}} - 1 + e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right)}{\left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right)^2},
\frac{\partial^2 \ell}{\partial \rho^2} = -\frac{b}{\hat{\rho}^2} - \hat{\eta} \sum_{i=1}^{b} \frac{h_i^{-2\hat{\psi}} \, e^{-\hat{\rho} h_i^{-\hat{\psi}}}}{\left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right)^2},
\frac{\partial^2 \ell}{\partial \eta^2} = -\frac{b - \sum_{i=1}^{b} g_i}{\hat{\eta}^2},
\frac{\partial^2 \ell}{\partial \eta \, \partial \rho} = \frac{\partial^2 \ell}{\partial \rho \, \partial \eta} = \sum_{i=1}^{b} \frac{h_i^{-\hat{\psi}} \, e^{-\hat{\rho} h_i^{-\hat{\psi}}}}{1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}}},
\frac{\partial^2 \ell}{\partial \psi \, \partial \rho} = \frac{\partial^2 \ell}{\partial \rho \, \partial \psi} = \sum_{i=1}^{b} h_i^{-\hat{\psi}} \ln h_i - \hat{\eta} \sum_{i=1}^{b} \frac{h_i^{-\hat{\psi}} \ln h_i \, e^{-\hat{\rho} h_i^{-\hat{\psi}}} \left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} - \hat{\rho} h_i^{-\hat{\psi}} \right)}{\left( 1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}} \right)^2},
\frac{\partial^2 \ell}{\partial \psi \, \partial \eta} = \frac{\partial^2 \ell}{\partial \eta \, \partial \psi} = -\hat{\rho} \sum_{i=1}^{b} \frac{h_i^{-\hat{\psi}} \ln h_i \, e^{-\hat{\rho} h_i^{-\hat{\psi}}}}{1 - e^{-\hat{\rho} h_i^{-\hat{\psi}}}}.
The asymptotic normality of the MLEs enables the computation of 100 ( 1 − φ ) % ACIs for the parameters ρ , ψ , and η , which are expressed as follows:
\hat{\rho} \pm Z_{\varphi/2} \sqrt{\widehat{\operatorname{var}}(\hat{\rho})}, \qquad \hat{\psi} \pm Z_{\varphi/2} \sqrt{\widehat{\operatorname{var}}(\hat{\psi})}, \qquad \text{and} \qquad \hat{\eta} \pm Z_{\varphi/2} \sqrt{\widehat{\operatorname{var}}(\hat{\eta})}.
Here, Z φ / 2 denotes the upper φ / 2 quantile of the standard normal distribution.
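In practice, the observed information matrix can also be approximated numerically. The following generic sketch (our own helper, not from the paper, which uses the analytic second derivatives) builds a central-difference Hessian of the negative log-likelihood at the MLE, inverts it, and returns Wald-type intervals:

```python
import numpy as np

def wald_intervals(neg_loglik, theta_hat, eps=1e-5, z=1.959964):
    # Approximate the observed Fisher information by a central-difference
    # Hessian of the negative log-likelihood at the MLE, then form
    # theta_hat +/- z * sqrt(diag(inverse information)).
    theta_hat = np.asarray(theta_hat, float)
    k = len(theta_hat)
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            e_i = np.zeros(k); e_i[i] = eps
            e_j = np.zeros(k); e_j[j] = eps
            H[i, j] = (neg_loglik(theta_hat + e_i + e_j)
                       - neg_loglik(theta_hat + e_i - e_j)
                       - neg_loglik(theta_hat - e_i + e_j)
                       + neg_loglik(theta_hat - e_i - e_j)) / (4.0 * eps ** 2)
    cov = np.linalg.inv(H)             # estimated covariance of the MLEs
    se = np.sqrt(np.diag(cov))
    return np.column_stack([theta_hat - z * se, theta_hat + z * se])
```

For a quadratic negative log-likelihood the finite-difference Hessian is exact, which gives a convenient correctness check.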

4. Bayesian Estimation

Bayesian inference is a powerful statistical approach that integrates prior knowledge with observed data to estimate unknown parameters and make predictions. Unlike frequentist methods, Bayesian inference explicitly incorporates uncertainty in parameter estimates through the use of probability distributions. This is achieved by applying Bayes’ theorem, which updates prior beliefs based on new evidence to produce posterior distributions. The approach is particularly useful in complex models, small sample sizes, or when prior information is available, as it provides a coherent framework for decision-making under uncertainty. Bayesian inference has widespread applications across fields, including medicine, engineering, and machine learning, where understanding uncertainty and making probabilistic predictions are crucial. Its flexibility and ability to handle hierarchical models and incorporate expert knowledge make it an essential tool in modern statistical analysis. For more information, see Geman and Geman [30] and Gamerman [31].

4.1. Prior Information and Posterior Distribution

Prior information and posterior distribution are foundational concepts in Bayesian inference. Prior information represents the knowledge or beliefs about an unknown parameter before observing any data, expressed as a probability distribution called the prior. This allows the incorporation of expert opinions, historical data, or subjective judgment into the analysis. When new data are observed, Bayes’ theorem combines the prior distribution with the likelihood of the observed data to produce the posterior distribution, which represents the updated beliefs about the parameter. The posterior distribution provides a comprehensive view of parameter uncertainty, enabling probabilistic interpretations and predictions. The importance of priors and posteriors lies in their ability to formalize and quantify uncertainty, integrate past and current information, and offer a flexible framework for modeling complex systems. These features make Bayesian methods invaluable in applications requiring adaptive learning, decision-making, or robust estimation under uncertainty.
The Bayes estimates for the RC G-II distribution’s unknown parameters are obtained in this section. The parameters ρ , ψ , and η are assumed to be independent, each following a gamma prior distribution. Each prior is characterized by its corresponding PDF and specific hyperparameters: ( a 1 , d 1 ) , ( a 2 , d 2 ) , and ( a 3 , d 3 ) .
\pi_1(\rho) \propto \rho^{a_1 - 1} e^{-d_1 \rho}, \qquad \rho > 0, \; a_1, d_1 > 0,
\pi_2(\psi) \propto \psi^{a_2 - 1} e^{-d_2 \psi}, \qquad \psi > 0, \; a_2, d_2 > 0,
\pi_3(\eta) \propto \eta^{a_3 - 1} e^{-d_3 \eta}, \qquad \eta > 0, \; a_3, d_3 > 0.
Assuming that ρ , ψ , and η have independent prior distributions, we construct a combined prior distribution. The joint posterior distribution is then derived by combining this prior with the likelihood function, as outlined below:
\pi(\rho, \psi, \eta \mid h, g) \propto \rho^{b + a_1 - 1} \, \psi^{b + a_2 - 1} \, \eta^{b + a_3 - \sum_{i=1}^{b} g_i - 1} \, e^{-(\psi + 1) \sum_{i=1}^{b} \ln h_i} \, e^{-\rho \sum_{i=1}^{b} h_i^{-\psi}} \, e^{\eta \sum_{i=1}^{b} \ln \left( 1 - e^{-\rho h_i^{-\psi}} \right)} \, e^{-d_1 \rho - d_2 \psi - d_3 \eta}.
As a result, the Bayes estimator for the function U ( ρ , ψ , η ) under the SE loss function is expressed as follows:
\hat{U}_{BSE}(\rho, \psi, \eta) = E\left[ U(\rho, \psi, \eta) \mid h, g \right] = \int_{0}^{\infty} \int_{0}^{\infty} \int_{0}^{\infty} U(\rho, \psi, \eta) \, \pi(\rho, \psi, \eta \mid h, g) \, d\rho \, d\psi \, d\eta.
The GE loss function is a versatile and widely used tool in statistical estimation and decision-making. It is designed to penalize estimation errors asymmetrically, allowing for greater flexibility in applications where overestimation and underestimation have different consequences. The GE loss function is particularly important in scenarios such as reliability analysis, risk assessment, and economics, where the cost of errors varies depending on the direction of deviation from the true value. By incorporating a parameter that controls the relative weights of overestimation and underestimation, the GE loss function provides a tailored approach to minimizing the expected loss. This adaptability makes it a critical component in Bayesian analysis, where it is often used to derive posterior-based estimates that align with specific practical objectives. Its importance lies in enabling more accurate and context-sensitive decision-making across diverse fields.
The Bayes estimator for the function U ( ρ , ψ , η ) can be expressed using the GE loss function in the following form:
\hat{U}_{BGE}(\rho, \psi, \eta) = \left( E\left[ U(\rho, \psi, \eta)^{-\tau} \mid h, g \right] \right)^{-1/\tau} = \left[ \int_{0}^{\infty} \int_{0}^{\infty} \int_{0}^{\infty} U(\rho, \psi, \eta)^{-\tau} \, \pi(\rho, \psi, \eta \mid h, g) \, d\rho \, d\psi \, d\eta \right]^{-1/\tau}.
Because closed-form expressions for the marginal posterior distributions of the parameters cannot be derived, Equations (19) and (20) cannot be solved analytically. To approximate these estimators under the two loss functions, the MCMC method is recommended as a suitable approach.

4.2. MCMC Techniques

MCMC techniques are powerful computational methods used to approximate complex probability distributions when direct analytical solutions are infeasible. By constructing a Markov chain whose stationary distribution is the target distribution, MCMC produces samples from that distribution. This is particularly useful in Bayesian inference, since posterior distributions often involve high-dimensional integrals that are difficult to evaluate analytically. Algorithms such as Metropolis–Hastings and Gibbs sampling allow MCMC to estimate parameters, credible intervals, and predictive distributions. Its flexibility and efficiency make it indispensable in fields such as statistics, machine learning, genetics, and physics, where exact solutions would otherwise be computationally prohibitive. CRIs, a key component of Bayesian estimation, provide a range of values for a parameter based on the posterior distribution. Unlike CIs in frequentist statistics, which rely on repeated sampling, CRIs represent the range within which the parameter lies with a specified probability, given the observed data and prior information. For example, a 95% CRI means that there is a 95% probability that the true parameter value falls within this range, conditioned on the data and prior. This probabilistic interpretation enhances the flexibility of Bayesian analysis by integrating prior knowledge, enabling more context-specific conclusions. The width of a CRI reflects both the data’s influence and the strength of the prior, with a narrower interval indicating greater precision. CRIs are vital in decision-making and risk assessment, offering a more intuitive understanding of uncertainty and enabling more informed, context-aware decisions based on the range of plausible parameter values. For more details about MCMC methods, see, for example, Chen and Shao [32].
Samples are generated using the MCMC approach based on Equation (18). The process is described in this section. After sampling the parameters ρ , ψ , and η , the Bayes estimates are calculated and the CRIs are obtained. Working up to proportionality, Equation (18) yields the conditional posterior density of each of ρ , ψ , and η . To simplify the presentation and avoid cumbersome conditional expressions, these densities are denoted π 1 ( ρ ) , π 2 ( ψ ) , and π 3 ( η ) ; they are given as follows:
\pi_1(\rho) \propto \rho^{b + a_1 - 1} \, e^{-\rho \left( d_1 + \sum_{i=1}^{b} h_i^{-\psi} \right)},
\pi_2(\psi) \propto \psi^{b + a_2 - 1} \, e^{-\psi \left( d_2 + \sum_{i=1}^{b} \ln h_i \right)},
\pi_3(\eta) \propto \eta^{b + a_3 - \sum_{i=1}^{b} g_i - 1} \, e^{-\eta \left( d_3 - \sum_{i=1}^{b} \ln \left( 1 - e^{-\rho h_i^{-\psi}} \right) \right)}.
Notably, the density function in Equation (21) is a gamma distribution with shape parameter ( b + a 1 ) and rate parameter ( d 1 + ∑ i = 1 b h i − ψ ) . Likewise, the density function in Equation (22) is a gamma distribution with shape parameter ( b + a 2 ) and rate parameter ( d 2 + ∑ i = 1 b ln h i ) . Furthermore, the density function in Equation (23) is a gamma distribution with shape parameter ( b + a 3 − ∑ i = 1 b g i ) and rate parameter ( d 3 − ∑ i = 1 b ln ( 1 − e − ρ h i − ψ ) ) . Samples for ρ , ψ , and η can therefore be drawn using standard gamma-generating techniques.
According to Equations (21)–(23), the full conditional posterior densities of ρ , ψ , and η are mutually independent in this situation. As a result, the Metropolis–Hastings (M-H) algorithm can be used to draw Bayesian samples from each density independently; see Metropolis et al. [33] and Hastings [34] for details. The following steps outline how samples are generated from these conditional densities:
  • Step 1: Initialize the parameters with the values ρ ( 0 ) , ψ ( 0 ) , and η ( 0 ) , set j = 1 , and choose the burn-in period K.
  • Step 2: Generate ρ ( j ) from a gamma distribution with shape parameter b + a 1 and rate parameter d 1 + ∑ i = 1 b h i − ψ ( j − 1 ) .
  • Step 3: Generate ψ ( j ) from a gamma distribution with shape parameter b + a 2 and rate parameter d 2 + ∑ i = 1 b ln h i .
  • Step 4: Generate η ( j ) from a gamma distribution with shape parameter b + a 3 − ∑ i = 1 b g i and rate parameter d 3 − ∑ i = 1 b ln ( 1 − e − ρ ( j ) h i − ψ ( j ) ) .
  • Step 5: Set j = j + 1 .
  • Step 6: Repeat Steps 2 through 5 until j = A , yielding the chain ( ρ ( j ) , ψ ( j ) , η ( j ) ) , j = 1 , … , A .
  • Step 7: Discard the first K draws and arrange the remaining values of each parameter in ascending order. For μ ∈ { ρ , ψ , η } , the 100 ( 1 − φ ) % CRI is given by the order statistics ( μ ( ( A − K ) φ / 2 ) , μ ( ( A − K ) ( 1 − φ / 2 ) ) ) .
  • Step 8: The Bayes estimate of μ under the SE loss function is determined using the following formula:
    \hat{\mu}_{BSE} = \frac{1}{A - K} \sum_{j = K + 1}^{A} \mu^{(j)}.
    Under the GE loss function, the estimate is obtained as follows:
    \hat{\mu}_{BGE} = \left[ \frac{1}{A - K} \sum_{j = K + 1}^{A} \left( \mu^{(j)} \right)^{-\tau} \right]^{-1/\tau}.
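The sampler outlined above can be sketched in a few lines, treating each full conditional as the gamma distribution stated in the text (the function name, default hyperparameters, and 95% quantile-based CRIs are our own assumptions):

```python
import numpy as np

def gibbs_gumbel2(h, g, a=(1.5, 1.5, 1.5), d=(2.5, 2.5, 2.5),
                  A=11000, K=2000, tau=1.0, rng=None):
    # Draw (rho, psi, eta) from the gamma full conditionals of Section 4.2,
    # discard a burn-in of K draws, and return SE/GE Bayes estimates and
    # equal-tailed 95% CRIs. Assumes d[1] + sum(log h_i) > 0.
    rng = np.random.default_rng(rng)
    b = len(h)
    sg = np.sum(g)
    psi = 1.0                                  # initial value psi^(0)
    draws = np.empty((A, 3))
    for j in range(A):
        rho = rng.gamma(b + a[0], 1.0 / (d[0] + np.sum(h ** -psi)))
        psi = rng.gamma(b + a[1], 1.0 / (d[1] + np.sum(np.log(h))))
        s = np.exp(-rho * h ** -psi)
        eta = rng.gamma(b + a[2] - sg, 1.0 / (d[2] - np.sum(np.log1p(-s))))
        draws[j] = (rho, psi, eta)
    post = draws[K:]                           # drop burn-in
    se_est = post.mean(axis=0)                 # Bayes estimate, SE loss
    ge_est = np.mean(post ** -tau, axis=0) ** (-1.0 / tau)  # GE loss
    cri = np.quantile(post, [0.025, 0.975], axis=0)         # 95% CRIs
    return se_est, ge_est, cri
```

This is a Gibbs-style sketch; since each conditional here is sampled exactly, no accept/reject step is needed, whereas the paper frames the sampling within the M-H algorithm.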

5. Application to Real Data

To demonstrate the application of the G-II distribution to censored data, we analyze two real-world datasets in this section. Below is a description of the datasets and their corresponding fits.

5.1. Wooden Toy Cost Dataset

The first dataset, sourced from The Open University (1993), represents the costs of 31 different children’s wooden toys sold in a Suffolk craft store in April 1991. This dataset was previously analyzed by Shafiei et al. [35]. The dataset is presented as follows:
4.2* 1.12* 1.39 2.0 3.99 2.15 1.74 5.81* 1.7 0.5 0.99* 11.5 5.12 0.9 1.99* 6.24 2.6 3.0 12.2* 7.36 4.75* 11.59 8.69 9.8 1.85* 1.99 1.35 10.0 0.65* 1.45

5.2. Ball Bearing Data

To evaluate the various estimation techniques discussed in this study, real-world data are introduced in this section. Meintanis [36] provides data from a life test, detailing the number of revolutions until failure for each of 23 ball bearings. The dataset includes the following values:
17.88* 28.92 33.00* 41.52 42.12 45.60 48.80 51.84 51.96 54.12 55.56* 67.80 68.64* 68.64 68.88* 84.12 93.12* 98.64 105.12* 105.84 127.92 128.04 173.40
Asterisks indicate which observations have been censored. To analyze the model’s efficacy, we applied several goodness-of-fit tests, including
  • the Anderson–Darling test;
  • the Kolmogorov–Smirnov (KS) test;
  • Pearson’s χ 2 test;
  • the Cramér–von Mises test.
To assess the model’s fit to the data, we examine both the distance statistics and p-values. If the p-value is less than 0.05, we reject H 0 (and accept H 1 ) at a significance level of γ = 0.05 . The test statistics are shown in Table 1. As indicated by the table, we cannot reject the null hypothesis that the data follows the G-II distribution because the p-values are relatively high for all tests, confirming the proposed model’s fit to the actual data. For visual confirmation, Figure 2 displays both the fitted and empirical survival functions of the G-II distribution for the two datasets. The figures show a close alignment between the empirical and fitted distributions. Furthermore, Figure 3 and Figure 4 present the quantile plots and the observed versus expected probability plots for the two datasets, respectively. Therefore, this distribution can be used to analyze the dataset efficiently. The box-and-whisker chart and smooth histogram for Data I and II are shown in Figure 5 and Figure 6, respectively. The profile log-likelihood function exhibits unimodal behavior, as shown in Figure 7 and Figure 8. It reaches a single maximum, as indicated by the contour plot and 3D plot of the log-likelihood function for the parameters ρ and ψ in Figure 9.
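As an illustration of how one of these checks can be run, the sketch below applies the KS test to data simulated from a G-II model (the parameter values and simulated data here are placeholders, not the fitted values reported in Table 1):

```python
import numpy as np
from scipy import stats

def gumbel2_cdf(x, rho, psi):
    # G-II CDF: F(x) = exp(-rho * x**(-psi))
    return np.exp(-rho * np.asarray(x, float) ** -psi)

# Simulate data from G-II(rho=2.1, psi=1.2) by inverse transform
rng = np.random.default_rng(7)
u = rng.uniform(size=200)
data = (-np.log(u) / 2.1) ** (-1 / 1.2)

# One-sample KS test against the hypothesized G-II CDF
stat, pvalue = stats.kstest(data, lambda x: gumbel2_cdf(x, 2.1, 1.2))
# A p-value above 0.05 means H0 (the data follow G-II) is not rejected.
```

In a real analysis the fitted MLEs would replace the placeholder parameters, with the caveat that plugging in estimated parameters makes the nominal KS p-value approximate.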
Based on Dataset I, we computed the MLEs of ψ , ρ , and η using the initial values ψ = 1.2 , ρ = 2.1 , and η = 0.5 . Along with the 95% ACI values, these findings are presented in Table 2. For the Bayesian estimate, we employed the MCMC approach with a total of 11,000 samples. During the “burn-in” phase, the first 2000 samples were discarded to ensure the stability of the estimation process. Bayesian estimates were obtained for ψ , ρ , and η under both the SE and GE loss functions, using non-informative priors with hyperparameters set to a i = 0 and d i = 0 for i = 1 , 2 , 3 . Table 2 provides a comprehensive display of the Bayesian estimation results, including the 95% CRIs for ψ , ρ , and η .
Based on Dataset II, the MLEs were computed using the initial values ψ = 2.0 , ρ = 1.6 , and η = 1.5 . These results are shown in Table 3, which also provides the corresponding 95% ACIs. Using the MCMC approach, we generated 12,000 samples for Bayesian estimation, discarding the first 2000 as burn-in to ensure the stability of the estimation process. With non-informative priors, Bayesian estimates for ψ , ρ , and η under both the SE and GE loss functions were obtained, with hyperparameters set to a i = 0 and d i = 0 for i = 1 , 2 , 3 . The results of the Bayesian estimation are presented in detail in Table 3, which also provides the 95% CRIs for ψ , ρ , and η .

6. Simulation Study

To evaluate the effectiveness and performance of various estimating techniques, a Monte Carlo simulation study is presented in this section. Using a predetermined procedure, a RC sample is generated from the G-II distribution. The parameters for the simulation are ρ = 2.5 , ψ = 1.5 , and η = 1.5 . The MLEs for ρ , ψ , and η are calculated, along with the ARB for various sample sizes. In the Bayesian analysis, a gamma prior is applied to all parameters, with the prior hyperparameters set as a 1 = a 2 = a 3 = 1.5 and d 1 = d 2 = d 3 = 2.5 . Bayesian parameter estimates are obtained under the SE and GE loss functions using the MCMC method. A chain of 11,000 iterations is generated using the M-H algorithm, with the first 2000 iterations discarded to minimize the influence of initial values.
The ARB is determined using the following formula:
\text{ARB}(\hat{\mu}_j) = \frac{\left| \frac{1}{1000} \sum_{i=1}^{1000} \hat{\mu}_{j,i} - \mu_j \right|}{\mu_j}, \qquad j = 1, 2, 3, \quad (\mu_1, \mu_2, \mu_3) = (\rho, \psi, \eta),
where \hat{\mu}_{j,i} denotes the estimate of \mu_j in the i-th of the 1000 replications.
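The ARB criterion above is straightforward to code; the helper below (our own name, with the absolute value assumed as is conventional for relative bias) averages the replicated estimates and compares them with the true value:

```python
import numpy as np

def arb(estimates, true_value):
    # Average relative bias over Monte Carlo replications:
    # ARB = |mean(estimates) - true_value| / true_value
    estimates = np.asarray(estimates, float)
    return abs(estimates.mean() - true_value) / true_value
```

For example, if 1000 replicated estimates of ρ = 2.5 average to 2.6, the ARB is 0.1/2.5 = 0.04.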
The classical and Bayesian estimates for the parameters ρ , ψ , and η , along with their respective ARB values, are presented in Table 4, Table 5 and Table 6.
From the findings of the simulation study, we conclude the following:
  • From Table 4, Table 5 and Table 6, it is observed that as the sample size increases, the ARB decreases.
  • Bayesian methods demonstrate superior performance compared to traditional methods, as the ARB values obtained using MCMC are lower than those obtained with MLE.
  • When using the GE loss function, performance under underestimation is better than under overestimation. Specifically, the ARB values at τ = 4 are smaller than those at τ = − 4 .
  • The performance of MCMC at τ = 4 is also better than that under the SE loss function in terms of achieving smaller ARB values.
  • The credible intervals obtained with the Bayesian methods are shorter than the confidence intervals obtained with the conventional method.
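The SE and GE point estimates compared above can be obtained directly from the retained MCMC draws. The sketch below assumes the standard estimator forms, i.e., the posterior mean under SE loss and [E(θ^(−ζ))]^(−1/ζ) under GE loss with parameter ζ; the Gamma-distributed chain is only a stand-in for a real posterior sample, and `bayes_estimates` is a hypothetical helper:

```python
import numpy as np

def bayes_estimates(draws, zeta=4.0, burn_in=2000):
    """Point estimates from an MCMC chain: the posterior mean under the
    SE loss, and [E(theta**-zeta)]**(-1/zeta) under the GE loss."""
    kept = np.asarray(draws[burn_in:], dtype=float)
    se = kept.mean()
    ge = np.mean(kept ** (-zeta)) ** (-1.0 / zeta)
    return se, ge

# stand-in posterior sample: 11,000 draws, first 2000 treated as burn-in
chain = np.random.default_rng(0).gamma(4.0, 0.5, size=11000)
se, ge_pos = bayes_estimates(chain, zeta=4.0)    # GE with zeta = 4
_, ge_neg = bayes_estimates(chain, zeta=-4.0)    # GE with zeta = -4
```

By the power-mean inequality, the GE estimate at ζ = 4 never exceeds the SE estimate, which in turn never exceeds the GE estimate at ζ = − 4; this is the ordering visible across the GE columns of the tables.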

7. Conclusions

This study presents key findings on the statistical inference of the G-II distribution under an RC scheme. The MLE method provides reliable point estimates, with the ARB decreasing as the sample size increases, indicating greater accuracy in parameter estimation. The Bayesian approach, using MCMC techniques, outperforms the frequentist methods, offering higher precision through shorter CRI lengths. The practical value of these methods is demonstrated through applications to real-world data from fields such as reliability engineering, underscoring the importance of selecting statistical tools suited to the characteristics of the data. The research also identifies opportunities for future work, such as examining other distributions under similar censoring conditions, enhancing the computational approaches, and incorporating prior information in Bayesian methods to improve estimates when data are limited. These findings offer valuable insights into the G-II distribution, highlight the need for robust methodologies in reliability analysis, and pave the way for further advances in the field.

8. Directions for Future Research

  • Extend to other lifetime distributions
    Apply the same inference framework to other two- or three-parameter lifetime distributions, such as the Weibull, log-logistic, or Burr Type XII, under random censoring.
  • Handle more complex censoring mechanisms
    Investigate models under progressive, interval, or informative censoring schemes. These reflect more realistic experimental settings in survival and reliability studies.

Author Contributions

Conceptualization, M.M.H.; methodology, M.M.H. and M.M.A.; software, M.M.H.; validation, M.M.A.; formal analysis, M.M.H. and M.M.A.; resources, M.M.A.; data curation, M.M.A.; writing—original draft, M.M.H.; writing—review & editing, M.M.H.; funding acquisition, M.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2502).

Data Availability Statement

All datasets are reported within the article.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) for funding this work through Research Group IMSIU-DDRSP2502.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. (a) Density plot of the G-II distribution for different values of ρ . (b) The CDF of the G-II distribution for various values of ρ .
Figure 2. (a) Empirical and fitted survival functions for data-I. (b) Empirical and fitted survival functions for data-II.
Figure 3. (a) Probability plot for data-I. (b) Probability plot for data-II.
Figure 4. (a) Quantile plot for data-I. (b) Quantile plot for data-II.
Figure 5. (a) Smooth histogram for data-I. (b) Smooth histogram for data-II.
Figure 6. (a) Box whisker chart for data-I. (b) Box whisker chart for data-II.
Figure 7. (a) Profile log-likelihood function of ρ for data-I. (b) Profile log-likelihood function of ψ for data-I.
Figure 8. (a) Profile log-likelihood function of ρ for data-II. (b) Profile log-likelihood function of ψ for data-II.
Figure 9. (a) 3D plot of the log-likelihood function. (b) Contour plot of the log-likelihood function.
Table 1. Statistical measures for goodness-of-fit analysis of the observed data.

| Dataset | A–D Statistic | A–D p-Value | CvM Statistic | CvM p-Value | Pearson χ² Statistic | χ² p-Value | K–S Statistic | K–S p-Value |
|---|---|---|---|---|---|---|---|---|
| Dataset I | 0.4163 | 0.8311 | 0.0548 | 0.8463 | 1.4666 | 0.9834 | 0.1023 | 0.8804 |
| Dataset II | 0.5781 | 0.6665 | 0.0784 | 0.7008 | 5.8695 | 0.5550 | 0.1328 | 0.7635 |
Table 2. Estimates for censored Dataset I using classical and Bayesian methods.

| Parameter | MLE | ACI Length | MCMC SE | MCMC GE (τ = −4) | MCMC GE (τ = 4) | CRI Length |
|---|---|---|---|---|---|---|
| ψ | 0.6746 | 0.4631 | 0.92887 | 0.9760 | 0.8497 | 0.3707 |
| ρ | 3.3605 | 1.9848 | 1.9362 | 2.0455 | 1.7501 | 1.4811 |
| η | 2.4216 | 3.7625 | 0.9489 | 1.1339 | 0.7207 | 1.2759 |
Table 3. Estimates for censored Dataset II using classical and Bayesian methods.

| Parameter | MLE | ACI Length | MCMC SE | MCMC GE (τ = −4) | MCMC GE (τ = 4) | CRI Length |
|---|---|---|---|---|---|---|
| ψ | 1.1006 | 0.8115 | 0.2410 | 0.2562 | 0.2146 | 0.1940 |
| ρ | 162.6646 | 9.962 | 2.7646 | 3.169 | 2.2328 | 3.2405 |
| η | 2.6153 | 4.6647 | 1.7678 | 5.9376 | 0.9432 | 4.3552 |
Table 4. MLE and MCMC estimates for the parameter ρ , with the corresponding ARB values in bold.

| b | MLE Mean | MLE Length | MCMC SE | MCMC GE (ζ = −4) | MCMC GE (ζ = 4) | CRI Length |
|---|---|---|---|---|---|---|
| 20 | 2.9346 | 2.1735 | 1.5229 | 1.6319 | 1.3380 | 1.3060 |
| | **0.5738** | | **0.3908** | **0.3473** | **0.4648** | |
| 30 | 2.5733 | 1.5961 | 1.5249 | 1.5982 | 1.4003 | 1.0787 |
| | **0.4593** | | **0.3900** | **0.3307** | **0.4399** | |
| 40 | 3.3643 | 1.9056 | 1.8653 | 1.9485 | 1.7290 | 1.2581 |
| | **0.4457** | | **0.2539** | **0.2206** | **0.3084** | |
| 50 | 2.5349 | 1.2073 | 1.5810 | 1.6282 | 1.5018 | 0.8701 |
| | **0.4140** | | **0.2476** | **0.2187** | **0.3013** | |
| 60 | 2.9744 | 1.3019 | 1.7239 | 1.7697 | 1.6470 | 0.9024 |
| | **0.4097** | | **0.2105** | **0.2121** | **0.2912** | |
| 70 | 2.5467 | 1.0239 | 1.5809 | 1.6144 | 1.5246 | 0.7377 |
| | **0.3987** | | **0.2076** | **0.2042** | **0.2902** | |
| 80 | 2.9456 | 1.1217 | 1.7825 | 1.8197 | 1.7204 | 0.8294 |
| | **0.3782** | | **0.2070** | **0.2021** | **0.2818** | |
| 90 | 2.6377 | 16.2483 | 1.8427 | 1.8766 | 1.7858 | 0.8017 |
| | **0.3551** | | **0.1929** | **0.1793** | **0.2757** | |
| 100 | 2.6869 | 0.9027 | 1.7477 | 1.7750 | 1.7017 | 0.7015 |
| | **0.2448** | | **0.1809** | **0.1600** | **0.2193** | |
Table 5. MLE and MCMC estimates for the parameter ψ , with the corresponding ARB values in bold.

| b | MLE Mean | MLE Length | MCMC SE | MCMC GE (ζ = −4) | MCMC GE (ζ = 4) | CRI Length |
|---|---|---|---|---|---|---|
| 20 | 1.3716 | 1.0946 | 1.6522 | 1.7645 | 1.4606 | 1.0866 |
| | **0.4856** | | **0.2014** | **0.1963** | **0.4648** | |
| 30 | 2.0664 | 1.3210 | 2.4353 | 2.5505 | 2.2397 | 1.7023 |
| | **0.4770** | | **0.1936** | **0.1903** | **0.4231** | |
| 40 | 1.9421 | 1.1234 | 1.6681 | 1.7280 | 1.5667 | 1.0192 |
| | **0.4647** | | **0.1921** | **0.1820** | **0.4145** | |
| 50 | 1.3639 | 0.6845 | 1.7855 | 1.8379 | 1.6968 | 0.9850 |
| | **0.4607** | | **0.1903** | **0.1752** | **0.4012** | |
| 60 | 1.3816 | 0.9112 | 1.6975 | 1.738 | 1.6287 | 0.8464 |
| | **0.4589** | | **0.1817** | **0.1687** | **0.3858** | |
| 70 | 1.3785 | 0.5861 | 1.9144 | 1.9542 | 1.8475 | 0.8859 |
| | **0.2510** | | **0.1763** | **0.1528** | **0.2316** | |
| 80 | 1.5702 | 0.6234 | 1.8020 | 1.8349 | 1.7469 | 0.7824 |
| | **0.2468** | | **0.1613** | **0.1502** | **0.2146** | |
| 90 | 1.5457 | 17.3011 | 1.6925 | 1.7201 | 1.6463 | 0.6927 |
| | **0.2305** | | **0.1484** | **0.1468** | **0.1976** | |
| 100 | 1.4231 | 0.5175 | 1.7495 | 1.7754 | 1.7059 | 0.6842 |
| | **0.2113** | | **0.1263** | **0.1206** | **0.1373** | |
Table 6. MLE and MCMC estimates for the parameter η , with the corresponding ARB values in bold.

| b | MLE Mean | MLE Length | MCMC SE | MCMC GE (ζ = −4) | MCMC GE (ζ = 4) | CRI Length |
|---|---|---|---|---|---|---|
| 20 | 1.9024 | 3.5259 | 0.7470 | 0.8930 | 0.5432 | 1.0342 |
| | **0.6683** | | **0.5020** | **0.4047** | **0.6379** | |
| 30 | 1.2744 | 1.8249 | 0.6677 | 0.7643 | 0.5262 | 0.7987 |
| | **0.6504** | | **0.5019** | **0.4004** | **0.6192** | |
| 40 | 1.0430 | 1.2680 | 0.7309 | 0.8314 | 0.5874 | 0.8499 |
| | **0.6447** | | **0.4127** | **0.3957** | **0.6084** | |
| 50 | 1.2638 | 1.3819 | 0.6350 | 0.6934 | 0.5474 | 0.6116 |
| | **0.6074** | | **0.4067** | **0.3377** | **0.5351** | |
| 60 | 1.5063 | 1.5373 | 0.7227 | 0.7814 | 0.6336 | 0.6481 |
| | **0.6042** | | **0.3582** | **0.2791** | **0.5176** | |
| 70 | 1.3843 | 1.2938 | 0.6437 | 0.6850 | 0.5787 | 0.5181 |
| | **0.5771** | | **0.4708** | **0.2433** | **0.5142** | |
| 80 | 1.3324 | 1.1611 | 0.7151 | 0.7611 | 0.6437 | 0.5756 |
| | **0.5117** | | **0.4233** | **0.1926** | **0.4709** | |
| 90 | 1.3086 | 45.4285 | 0.6929 | 0.7340 | 0.6291 | 0.5343 |
| | **0.4076** | | **0.3381** | **0.1106** | **0.3806** | |
| 100 | 1.2364 | 0.9588 | 0.6736 | 0.7066 | 0.6217 | 0.4731 |
| | **0.3058** | | **0.2509** | **0.1089** | **0.2855** | |
