Bayesian Estimation of a New Pareto-Type Distribution Based on Mixed Gibbs Sampling Algorithm

Abstract: In this paper, a Bayesian estimation procedure based on the mixed Gibbs sampling algorithm is proposed for a new Pareto-type distribution in the case of complete and type II censored samples. Simulation studies show that the proposed method is consistently superior to maximum likelihood estimation in the context of small samples. An analysis of some real data is also provided to test the Bayesian estimation.


Introduction
As an important statistical inference method, the Bayesian method has been widely applied and developed in the fields of statistics and artificial intelligence. Especially with the rise of big data and machine learning in the early 21st century, the Bayesian method has received more and more attention. Researchers have improved the efficiency and accuracy of the Bayesian method by developing new algorithms and tools and applying them to various fields, such as computer vision, natural language processing, and bioinformatics. Recently, many scholars [1][2][3] have studied it and obtained some valuable results.
The Pareto distribution [4] has a decreasing failure rate function and has been widely applied in reliability, personal income, stock price fluctuation, and other models. The origin and some applications of the Pareto distribution can be found in Johnson et al. [5]. Bourguignon et al. [6] proposed a new Pareto-type (NP) distribution, which generalizes the Pareto distribution and is often applied to income and reliability data analysis. Raof et al. [7] used the NP distribution to model the income of upper-class groups in Malaysia. Karakaya et al. [8] proposed a new generalization of the Pareto distribution incorporating a truncation parameter. Sarabia et al. [9] obtained further significant properties of this distribution and derived simpler expressions for relevant inequality indices and risk measures. Nik et al. [10] estimated the parameters of the NP distribution from two different perspectives, namely classical and Bayesian statistics, where the Bayesian estimation used the importance sampling method. Nik et al. [11] studied parameter estimation as well as prediction of the failure times of the removed units in multiple stages of a progressively censored sample from the NP distribution, where a Bayesian method (i.e., Markov chain Monte Carlo, MCMC) was applied to estimate the unknown parameters of the model. Soliman [12] studied the estimation of the reliability function in a generalized life model. Bayesian estimations under a symmetric loss function (i.e., squared error loss) and asymmetric loss functions (i.e., LINEX loss and GE loss) were obtained. These estimations were compared with the maximum likelihood estimation (MLE) for the Burr-XII model using the Lindley approximation. Furthermore, Soliman [12] studied the Bayesian estimation of the parameter of interest on the premise that some other parameters are known. However, all of the parameters are often unknown in practice. Soliman [13] gave an approximate calculation formula for Bayesian estimation; however, the derivation of the formula is complicated and difficult to follow. Soliman's work shows that the Bayesian method may encounter high-dimensional, complex models and big data in practical use. Thus, the efficiency and accuracy of the Bayesian method can be greatly improved by improving the MCMC algorithm.
The Gibbs sampling algorithm was originally proposed by Geman [14]. In recent years, with the development of technologies such as big data analysis, artificial intelligence, and machine learning, the application of the Gibbs sampling algorithm to Bayesian estimation problems has become increasingly common. Ahmed [15] proposed two methods for predicting the censored units with progressive type II censored samples, where the lifetimes under consideration are set to follow a new two-parameter Pareto distribution. Moreover, point and interval estimation of the unknown parameters of the NP distribution can be obtained using the maximum likelihood and Bayesian estimation methods. Since the Bayesian estimators cannot be expressed explicitly, the Gibbs sampling and MCMC techniques are utilized for the Bayesian calculation. Ibrahim et al. [16] proposed a new discrete analogue of the Weibull class and gave a Bayesian estimation procedure under the squared error loss function. Furthermore, they compared the non-Bayesian estimation with the Bayesian estimation using Gibbs sampling and the Metropolis-Hastings algorithm.
For the NP distribution, Nik et al. [10] proposed some classical estimation methods and a Bayesian estimation based on the importance sampling method. However, the sampling efficiency of importance sampling is often low in high-dimensional situations, so more efficient methods are preferred. In our work, we use the mixed Gibbs sampling algorithm to estimate the parameters of the NP distribution with complete samples and type II censored samples.
The Gibbs sampling algorithm can solve some complex Bayesian analysis problems directly and has become a useful tool in Bayesian statistics. By establishing a Markov chain and iterating repeatedly to the equilibrium state, the Gibbs sampling algorithm can obtain a sample from the posterior distribution. However, the Gibbs sampling algorithm also has some shortcomings. For instance, the fully conditional posterior distribution of a parameter may be difficult to sample directly for some special prior distributions. Thus, the Gibbs sampling algorithm can be combined with other sampling methods to achieve stable efficiency. There are several such sampling methods, e.g., the Metropolis algorithm, importance sampling, and accept-reject sampling. Among them, the Metropolis algorithm is a relatively easy and flexible sampling method. The Metropolis algorithm was improved by Hastings [17], and the result is called the Metropolis-Hastings algorithm. However, the Metropolis-Hastings algorithm is time consuming and inefficient in high-dimensional cases; a sample may even be rejected after time-consuming calculations. A mixture of the Gibbs sampling algorithm and the Metropolis-Hastings algorithm, i.e., the mixed Gibbs sampling algorithm, can overcome these shortcomings and allows both algorithms to play fully to their own advantages. Thus, in our work, we use the mixed Gibbs sampling algorithm to propose a Bayesian estimation procedure for the NP distribution with complete and type II censored samples.
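The accept/reject mechanics described above can be made concrete with a short sketch. The following Python function (the paper's experiments use R; this translation, the function names, and the toy standard-normal target are our own assumptions, not the authors' code) implements one random-walk Metropolis step:

```python
import math
import random

def metropolis_step(x, log_target, sigma, rng=random):
    """One random-walk Metropolis step with a symmetric normal proposal.

    log_target: unnormalized log-density of the target distribution.
    Returns the proposal if accepted, otherwise the current state x.
    """
    proposal = x + rng.gauss(0.0, sigma)
    # With a symmetric proposal, the Metropolis-Hastings acceptance ratio
    # reduces to a ratio of target densities.
    if math.log(rng.random()) < log_target(proposal) - log_target(x):
        return proposal
    return x

# Toy usage: sample from a standard normal via its log-density (up to a constant).
random.seed(1)
chain = [0.0]
for _ in range(6000):
    chain.append(metropolis_step(chain[-1], lambda v: -0.5 * v * v, sigma=1.0))
post = chain[1000:]
post_mean = sum(post) / len(post)
post_var = sum((v - post_mean) ** 2 for v in post) / len(post)
```

In the mixed Gibbs scheme of Section 3, a step of exactly this form replaces the direct draw from each fully conditional distribution that lacks an explicit expression.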
The rest of this paper is organized as follows. We state the MLE of the NP distribution in Section 2. In Section 3, we propose a Bayesian estimation procedure for the NP distribution with complete and type II censored samples using the mixed Gibbs sampling algorithm. Simulation studies and real data analysis are presented in Section 4. In Section 5, we present a brief discussion and conclusion of the results and methods.

Maximum Likelihood Estimation of NP Distribution
Suppose X is a random variable that follows the NP distribution with a shape parameter α and a scale parameter β, that is, X ∼ NP(α, β). The cumulative distribution function (CDF) F(x|α, β) and the probability density function (PDF) f(x|α, β) of X take the simple closed forms given by Bourguignon et al. [6], and their diagrams are shown in Figure 1. It should be highlighted that, compared with the Pareto distribution, the NP distribution has the following advantages: (1) it can exhibit either an upside-down bathtub or a decreasing hazard rate function, depending on the values of its parameters; (2) it offers mathematical simplicity, as its probability and distribution functions have a simple form, in contrast to some generalizations of the Pareto distribution that involve special functions such as log, beta, and gamma functions; (3) it has only two parameters, unlike some generalizations of the Pareto distribution with three or four parameters. Thus, in our work, it is assumed that the life X of the product follows the NP distribution. We present the MLE of the NP distribution below for comparison with the Bayesian estimation.

Complete Samples Case
In the case of complete samples, assume that there are n products for a life test, and that the order failure data x_1 ≤ x_2 ≤ ⋯ ≤ x_n can be obtained. From the log-likelihood function ℓ(α, β), it can be seen that ℓ(α, β) increases monotonically with β; that is to say, the greater the value of β, the greater the value of ℓ(α, β). Since x ≥ β, we obtain the MLE β̂ = x_1. Thus, the MLE α̂ of α can be obtained as the solution of Equation (3).
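The same two-step logic (maximize over β first, then solve for α) can be sketched numerically. Since the NP log-likelihood of Equation (3) is not reproduced here, the sketch below uses the classical Pareto(α, β) likelihood as a stand-in; for that model β̂ = x_1 as well, and α̂ then has a closed form. Function names and tolerances are our own choices:

```python
import math
import random

def pareto_mle(sample):
    """Complete-sample MLE for the classical Pareto(alpha, beta), as a stand-in.

    The likelihood increases in beta subject to beta <= min(x), so beta_hat is
    the smallest observation; alpha_hat then solves the score equation in
    closed form: n / sum(log(x_i / beta_hat)).
    """
    n = len(sample)
    beta_hat = min(sample)
    alpha_hat = n / sum(math.log(x / beta_hat) for x in sample)
    return alpha_hat, beta_hat

# Usage: recover (alpha, beta) = (4, 1) from simulated data.
random.seed(7)
# Inverse CDF of Pareto(4, 1): x = beta * u**(-1/alpha) for uniform u.
data = [1.0 * random.random() ** (-1.0 / 4.0) for _ in range(2000)]
alpha_hat, beta_hat = pareto_mle(data)
```

For the NP distribution, the score equation for α has no closed-form solution and would be solved numerically instead.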

Type II Censored Samples Case
In the case of type II censored samples, assume that there are n products for a life test; the test stops when there are r product failures, and the order failure data can be obtained. MLE is used below to estimate the unknown parameters α and β. In order to obtain the likelihood function, it is necessary to know the probabilities of the following events.
(1) The probability of a product failing in (x_i, x_i + dx_i) is proportional to f(x_i|α, β) dx_i. (2) The probability that the lifetimes of the remaining n − r products all exceed x_r is [1 − F(x_r|α, β)]^(n−r). Thus, the probability of the above observations {x_1, x_2, ⋯, x_r} is proportional to the product of these factors. Ignoring a constant factor does not affect the MLE of α and β, so the likelihood function can be taken as L(α, β) ∝ f(x_1|α, β) ⋯ f(x_r|α, β)[1 − F(x_r|α, β)]^(n−r), and the log-likelihood function ℓ(α, β) follows by taking logarithms. Similarly, we can obtain the MLE β̂ = x_1 of β. Therefore, the MLE α̂ of α can be obtained as the solution of Equation (5).
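The construction above — r observed densities times the survival probability of the n − r unfailed units — can also be sketched with the classical Pareto as a stand-in (the NP version would replace the density and survival function by those of the NP distribution and solve the score equation numerically). The helper below and its censoring fraction are our own illustrative choices:

```python
import math
import random

def pareto_mle_type2(order_stats, n):
    """Type II censored MLE for the classical Pareto, as a stand-in sketch.

    order_stats: the r smallest observed failure times, sorted ascending.
    n: total number of units on test.  The likelihood is the product of the
    r densities times the survival function at x_r raised to the power n - r,
    which again yields closed-form estimators for this stand-in model.
    """
    r = len(order_stats)
    beta_hat = order_stats[0]
    x_r = order_stats[-1]
    total = sum(math.log(x / beta_hat) for x in order_stats)
    total += (n - r) * math.log(x_r / beta_hat)  # contribution of censored units
    alpha_hat = r / total
    return alpha_hat, beta_hat

# Usage: keep the smallest 20% of n = 1000 simulated Pareto(4, 1) lifetimes.
random.seed(11)
n = 1000
data = sorted(random.random() ** (-1.0 / 4.0) for _ in range(n))
alpha_hat, beta_hat = pareto_mle_type2(data[:200], n)
```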

Bayesian Estimation of NP Distribution
The Bayesian estimations of the NP distribution under complete samples and type II censored samples are given below.

Complete Samples Case
In the case of complete samples, assume that there are n products for a life test, and that the order failure data x_1 ≤ x_2 ≤ ⋯ ≤ x_n can be obtained. When the parameters α and β are unknown, we assume that the joint prior distribution of α and β is π(α, β) ∝ 1/(αβ), denoted as (6). Then, the posterior distribution π(α, β|x_1, x_2, ⋯, x_n) is given as (7). Thus, the fully conditional posterior distribution π(α|x_1, x_2, ⋯, x_n, β) of α is given as (8), and the fully conditional posterior distribution π(β|x_1, x_2, ⋯, x_n, α) of β is given as (9). According to (8) and (9), neither fully conditional posterior distribution has an explicit expression in the case of complete samples. Therefore, it is difficult to conduct the sampling directly in practice. However, the mixed Gibbs sampling algorithm does not require an explicit expression for the fully conditional posterior distribution, and can be used to sample under different prior distributions. Thus, in our work, we consider the prior distribution (6) using the mixed Gibbs sampling algorithm in Section 3.1. For the Bayesian estimation of the parameters of the NP distribution in the case of complete samples, we state the iterative steps of the mixed Gibbs sampling algorithm as follows.
(3) Generate a random number u from the uniform distribution U(0, 1), and then obtain α_{k+1} according to conditions (11). The resulting {α_k} form a Markov chain; Step 1 ends when the Markov chain reaches equilibrium, and we then obtain α^(i+1).
It is worth noting that the sample trajectory of the Markov chain can be used as a sample from the posterior distributions of α and β once the Markov chain reaches equilibrium, which further leads to the Bayesian estimation of α and β. In order to ensure the approximate independence of the posterior distribution samples, subsequent determinations are made based on the sample autocorrelation coefficients.
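The loop described above — one Metropolis-Hastings update per parameter per sweep, with out-of-support proposals (α ≤ 0, or β outside (0, x_1]) rejected — can be sketched as follows. Since the NP conditional posteriors (8) and (9) are not reproduced here, this sketch uses the classical Pareto likelihood with the same improper prior π(α, β) ∝ 1/(αβ) as a stand-in target; all function names, starting values, and proposal scales are our own assumptions:

```python
import math
import random

def mixed_gibbs_pareto(data, n_iter=6000, seed=3):
    """Metropolis-within-Gibbs sketch for (alpha, beta) under prior 1/(alpha*beta).

    Each sweep updates alpha and then beta by one random-walk
    Metropolis-Hastings step; proposals outside the support are rejected
    because the log-posterior returns -inf there.
    """
    rng = random.Random(seed)
    x_min = min(data)
    sum_log = sum(math.log(x) for x in data)
    n = len(data)

    def log_post(a, b):
        # Stand-in log-posterior: Pareto likelihood times the 1/(alpha*beta) prior.
        if a <= 0 or b <= 0 or b > x_min:
            return -math.inf
        return (n - 1) * math.log(a) + (n * a - 1) * math.log(b) - (a + 1) * sum_log

    a, b = 1.0, 0.5 * x_min  # arbitrary starting state inside the support
    chain = []
    for _ in range(n_iter):
        for idx, sigma in ((0, 0.3), (1, 0.05)):  # update alpha, then beta
            prop = [a, b]
            prop[idx] += rng.gauss(0.0, sigma)
            if math.log(rng.random()) < log_post(*prop) - log_post(a, b):
                a, b = prop
        chain.append((a, b))
    return chain

# Usage: posterior means from a simulated Pareto(4, 1) sample of size 200.
random.seed(5)
data = [random.random() ** (-1.0 / 4.0) for _ in range(200)]
post = mixed_gibbs_pareto(data)[1000:]  # discard burn-in sweeps
post_alpha = sum(a for a, b in post) / len(post)
post_beta = sum(b for a, b in post) / len(post)
```

The structure carries over to the NP case unchanged: only `log_post` needs to be replaced by the log of the NP posterior (7).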

Type II Censored Samples Case
In the case of type II censored samples, assume that there are n products for a life test; the test stops when there are r product failures, and the order failure data x_1 ≤ x_2 ≤ ⋯ ≤ x_r can be obtained. We still take the joint prior distribution of α and β as π(α, β) ∝ 1/(αβ); then, the posterior distribution π(α, β|x_1, x_2, ⋯, x_r) is given as (14). Thus, the fully conditional posterior distribution π(α|x_1, x_2, ⋯, x_r, β) of α is given as (15), and the fully conditional posterior distribution π(β|x_1, x_2, ⋯, x_r, α) of β is given as (16). As can be seen from the above, in the case of type II censored samples, neither (15) nor (16) has an explicit expression, so they are difficult to sample from directly. Similarly, as previously discussed, the mixed Gibbs sampling algorithm does not require an explicit expression for the fully conditional posterior distribution, so we continue to use the mixed Gibbs algorithm with the same prior distribution as in Section 3.1. The steps of the mixed Gibbs sampling algorithm for the Bayesian estimation of the NP distribution parameters in the case of type II censored samples are stated as follows.
Step 2 Obtain a sample β^(i+1) from π(β|x_1, x_2, ⋯, x_r, α^(i+1)) using the Metropolis-Hastings algorithm in this step. (1) Generate a proposal β′ from a normal distribution with mean β_k and standard deviation σ_β, where β_k is the current state; if β′ ≤ 0 or β′ > x_1, a new sampling is required. (2) Calculate the acceptance probability P_accept(β_k, β′) as (19). (3) Generate a random number u from the uniform distribution U(0, 1), and then obtain β_{k+1} according to conditions (20). Similarly, the resulting {β_k} form a Markov chain. The specific approach to obtaining the Bayesian estimation of the target parameters is similar to that in the case of complete samples and will not be further elaborated here.

Numerical Studies
All computations in Section 4 were carried out with the R 4.1.3 software.

Simulation Studies
Case 1. Denote by X the life of a product, and set X ∼ NP(α, β) with three groups of values of α and β. A Monte Carlo simulation was carried out in the complete samples case and in the 20% type II censored samples case. In each simulation, 100 NP random variables were generated using the inverse sampling method.
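The inverse sampling method mentioned above solves F(x) = u for a uniform draw u. When the CDF inverts in closed form this is a one-liner; the bisection version below works for any continuous increasing CDF and is shown here with the classical Pareto(4, 1) CDF as an illustrative target (our own example in Python, not the paper's R code):

```python
import random

def sample_inverse_cdf(cdf, lo, hi, u, tol=1e-10):
    """Invert a continuous, increasing CDF numerically by bisection.

    Solves cdf(x) = u for x in [lo, hi]; useful when no library sampler
    is at hand but the CDF is cheap to evaluate.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage with the classical Pareto(4, 1) CDF as a stand-in target.
cdf = lambda x: 1.0 - x ** (-4.0)
random.seed(2)
draws = [sample_inverse_cdf(cdf, 1.0, 1e6, random.random()) for _ in range(500)]
emp_mean = sum(draws) / len(draws)
```

For Pareto(4, 1) the theoretical mean is α/(α − 1) = 4/3, which the empirical mean of the draws should approximate.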
To illustrate the complete implementation process of the mixed Gibbs sampling algorithm, we first present the Bayesian estimation and 95% credible intervals for three groups of simulated parameters in both the complete samples case and the 20% type II censored samples case (refer to Tables 1 and 2). The parameters are estimated by Bayesian estimation separately under the square loss function and the LINEX loss function (with loss parameter 1). Furthermore, for the parameters (α, β) = (4, 1), we demonstrate the process of mixed Gibbs sampling in both the complete samples case and the 20% type II censored samples case, including the trace plot, histogram, and autocorrelation coefficient diagram of the corresponding parameters (refer to Figures 2 and 3).
As stated in Tables 1 and 2, the Bayesian estimation of the parameters using the mixed Gibbs sampling algorithm produces relatively satisfactory results for the three groups of selected parameters. The estimated values are close to the true values, which means that the proposed method has high accuracy and reliability in parameter estimation. In addition, the credible intervals of the parameters cover the true values, and the interval lengths are relatively short, indicating high precision in the estimation of the parameters.
From Figures 2 and 3, the trace plots show that the values of α and β are randomly scattered around their averages. Furthermore, the traversal mean of the parameters (α, β) tends to stabilize after 200 iterations. It can be considered that the Gibbs sampling process reaches the equilibrium state at this point, and the remaining samples can be regarded as observations of the target parameters. As a precaution, the first 500 iterations are discarded, and a further 4500 iterations are performed to obtain 4500 samples of each parameter. Additionally, the autocorrelation function indicates that the autocorrelation coefficient of all samples approaches 0 at lag 1. Therefore, the Bayesian estimation of the parameters is ultimately obtained from these 4500 samples.

Case 2. It is assumed that the life X of a product follows the NP distribution with parameters (α, β) = (4, 1). To investigate the strengths and weaknesses of the two estimation methods with small sample sizes, 20 random variables from the NP distribution were generated in each simulation in the complete samples case. In the type II censored samples case, 100 random variables of the NP distribution were generated for each simulation, and the smallest 20% of the order failure data were retained. The mean value and mean square error (MSE) of the MLE and the Bayesian estimation (BE) are given in Table 3. The simulations show no significant difference between maximum likelihood estimation and Bayesian estimation in the large sample case, so we do not report those results in our article. As shown in Table 3, the MSEs for Bayesian estimation are consistently smaller than those for MLE. Consequently, we can infer that the BE method is evidently superior to the MLE in the context of small samples. In summary, the findings highlight the superior accuracy and robustness of Bayesian estimation utilizing the mixed Gibbs sampling algorithm compared to MLE.
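The burn-in and autocorrelation checks described above are straightforward to reproduce. The helper functions below (our own sketch, with a synthetic AR(1) series standing in for the Gibbs output) compute a lag-k sample autocorrelation and an equal-tailed percentile credible interval:

```python
import random

def lag_autocorr(chain, lag=1):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws (percentile method)."""
    s = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo = s[int(tail * (len(s) - 1))]
    hi = s[int((1.0 - tail) * (len(s) - 1))]
    return lo, hi

# Usage on a synthetic AR(1) chain (lag-1 autocorrelation 0.5 by construction).
random.seed(9)
chain = [0.0]
for _ in range(5000):
    chain.append(0.5 * chain[-1] + random.gauss(0.0, 1.0))
burned = chain[500:]  # discard burn-in, as in the paper
rho1 = lag_autocorr(burned)
lo, hi = credible_interval(burned)
```

Applied to the retained Gibbs samples of α or β, `credible_interval` yields exactly the 2.5th/97.5th-percentile intervals reported in Tables 1, 2, and 4.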
DataSet I. For DataSet I, Bourguignon [6] provided the MLEs (i.e., α̂_MLE = 0.433, β̂_MLE = 1) of the parameters α and β. Subsequently, Bayesian analysis was conducted for both the type II censored data (r = 24) and the complete data (r = n = 30). In our work, 2000 sampling iterations were performed for each case. Testing the sampling mean revealed that the traversal mean of the parameters stabilized within 100 iterations, and that the mixed Gibbs sampling algorithm converged in both cases. To mitigate the influence of the transition process, the sampled data from iterations 201 to 2000 were used as the mixed Gibbs sampling values for the Bayesian estimation (BE), and the posterior estimation of the distribution parameters was performed using these sampling values (refer to Table 4). In Table 4, the posterior expectation and median represent the Bayesian point estimates of the parameters under square loss and absolute loss, respectively. The 95% credible intervals of the parameters are determined by the 2.5th and 97.5th percentiles. The results in Table 4 demonstrate the high accuracy of Bayesian estimation using the mixed Gibbs sampling algorithm for both complete data and type II censored data. Compared to frequentist estimation methods, Bayesian estimation proves more effective in handling small sample sizes and provides probability distribution estimates of the parameters, rather than just point estimates, for reliability data with limited observations. Consequently, Bayesian estimation offers a more precise reflection of parameter uncertainty, particularly when dealing with small datasets.
DataSet II. In the case of complete data, we conducted a Bayesian analysis for DataSet II with 2000 sampling iterations. After testing the sampling mean, we found that the traversal mean of the parameters tended to be stable within 100 iterations, and that the mixed Gibbs sampling algorithm converged. To eliminate the influence of the transition process, the sampled data from iterations 201 to 2000 were used as the mixed Gibbs sampling values for Bayesian estimation, and the posterior estimation of the distribution parameters was performed using these sampling values. Then, the Bayesian estimation (BE) results under square loss were compared with the MLE results of Bourguignon [6] (refer to Table 5). It can be seen from Table 5 that the Bayesian estimation is very close to the MLE result, which further verifies the stability and reliability of the mixed Gibbs sampling algorithm.

Discussion and Conclusions
Some scholars have conducted valuable research on the NP distribution, such as [10,11,15]. However, Bayesian methods may encounter challenges with high-dimensional, complex models and big data in practice. In contrast to the low efficiency of the importance sampling method [10] and the complexity of the Lindley approximation [12], a Bayesian estimation procedure is proposed for the NP distribution using the mixed Gibbs sampling algorithm. When the fully conditional distributions in the Gibbs sampling steps lack an explicit form, direct sampling from these distributions becomes challenging, and alternative sampling methods must be employed. Since each step within a loop of Gibbs sampling can itself be a Metropolis-Hastings iteration, Gibbs sampling can be combined with the Metropolis-Hastings algorithm when necessary.
In this paper, we proposed a Bayesian estimation procedure for the NP distribution with complete and type II censored samples using the mixed Gibbs sampling algorithm. In particular, the means and MSEs of the resulting parameter estimates are given, and the stability and effectiveness of the mixed Gibbs sampling algorithm are demonstrated with simulation studies and real data analysis. It can be seen that the mixed Gibbs sampling algorithm is stable, feasible, and of high precision.
In reliability-related product tests, it is often challenging to obtain a sufficient number of samples to assess the quality of the products. Bayesian estimation, however, can yield reliable results even with limited sample sizes. By incorporating prior information, Bayesian estimation can improve the accuracy of parameter estimation and provide valuable uncertainty information about the parameters, which is particularly crucial when sample data are scarce. Therefore, in the context of product reliability tests, Bayesian estimation can serve as an effective tool for reliably evaluating and predicting product quality in situations with small sample sizes. In view of these merits, the Bayesian method with the mixed Gibbs sampling algorithm presented in this paper can become an important Bayesian inference method.
We can then obtain β^(i+1) when the Markov chain reaches equilibrium. Step 3 Let i = i + 1, and repeat Step 1 and Step 2 in sequence until the Markov chains reach equilibrium.

Figure 2. Trace plots, histogram, and ACF of NP distribution parameters with complete samples case.

Figure 3. Trace plots, histogram, and ACF of NP distribution parameters with type II censored samples case.

Table 1. Bayesian estimation and 95% credible intervals for the parameters in the complete samples case under different loss functions.

Table 2. Bayesian estimation and 95% credible intervals for the parameters in the 20% type II censored samples case under different loss functions.

Table 3. Mean and mean square error of the Bayesian estimation and the maximum likelihood estimation.

Table 4. Numerical results of the real data analyses.

Table 5. Comparison of the two estimation methods.