Simultaneous Confidence Intervals for All Pairwise Differences between Means of Weibull Distributions

The Weibull distribution is a continuous probability distribution that finds wide application in various fields for analyzing real-world data. In particular, wind speed data often follow the Weibull distribution. In this study, our aim is to compare the mean wind speeds of datasets from different areas in Thailand. To achieve this, we propose simultaneous confidence intervals for all pairwise differences between the means of Weibull distributions. The generalized confidence interval (GCI), the method of variance estimates recovery (MOVER), and a Bayesian approach using both gamma and uniform prior distributions are used to construct the simultaneous confidence intervals. Through simulations, we find that the Bayesian highest posterior density (HPD) interval using a gamma prior distribution performs satisfactorily, while the GCI proves to be a viable alternative. Finally, we apply the proposed approaches to real wind speed data from northeastern and southern Thailand to illustrate their effectiveness and practicality.


Introduction
The Weibull distribution is widely used in various fields owing to its heavy-tailed nature, and it remains asymmetric regardless of the chosen parameter values. Consequently, transforming it into a perfectly symmetric distribution is a complex task. However, it is feasible to make the Weibull distribution more symmetrical or to approximate symmetry. Techniques explored include a close-to-normal power approximation, logarithmic transformation, and the Box-Cox transformation. The Weibull distribution finds applications in diverse fields, such as insurance (Kreer et al. [1], Hamza and Sabri [2]), food science (Mafart et al. [3], Uribete et al. [4]), ecology (Mikolaj [5]), and medical science (Carroll [6]). In addition, numerous studies have presented examples of wind speed data that follow a Weibull distribution. For instance, Waewsak et al. [7] analyzed the statistical wind data obtained from the Thasala district in Nakhon Si Thammarat province, southern Thailand. Chauhan and Saini [8] examined the wind speed data from the Harshnath site in the Sikar district of Rajasthan, India. Kidmo et al. [9] studied the wind speed distribution based on six Weibull methods for wind power evaluation in Garoua, Cameroon. Bidaoui et al. [10] assessed and discussed the wind energy potential in five major cities in northern Morocco. Shu and Jesson [11] estimated Weibull parameters, which are used to evaluate wind power density in the UK.

In the field of inferential statistics, researchers have shown interest in constructing confidence intervals for the mean and its functions in Weibull distributions. For example, Colosimo and Ho [12] used censored reliability datasets to determine the confidence interval of the mean lifetime in Weibull distributions. Muralidharan and Lathika [13] estimated the confidence interval for mean rainfall data by employing a modified version of the Weibull distribution. Krishnamoorthy et al. [14] used the generalized variable method to establish confidence intervals for the mean of the Weibull distribution and compared the results with Wald confidence intervals. La-ongkaew et al. [15] used the generalized confidence interval (GCI) and the method of variance estimates recovery (MOVER), along with the Wald confidence interval, to derive confidence intervals for the difference between means and the ratio of means of Weibull distributions.
Multiple comparisons of several parameters of interest correspond to the concept of simultaneous confidence intervals (SCIs). SCIs are sets of individual intervals, one for each component of the parameter, that hold jointly at the stated confidence level. They allow all treatments to be compared at the same time. This problem has been extensively discussed in the literature. Hannig et al. [16] introduced the fiducial generalized pivotal quantities (FGPQ) method, which allows for the construction of SCIs for all pairwise ratios of means of lognormal distributions. Sadooghi-Alvandi and Malekzadeh [17] developed a new parametric bootstrap method for constructing SCIs for the ratios of means of multiple lognormal distributions. Their results indicated that the parametric bootstrap procedure consistently outperforms other procedures based on the concepts of GPQ and FGPQ. In the same year, Zhang and Falk [18] introduced FGPQ-based SCIs for the ratios of means of multiple lognormal distributions, considering heteroscedastic variances and unequal group sizes. Li et al. [19] suggested a parametric bootstrap technique to construct SCIs for the differences of means from several two-parameter exponential distributions. Their outcomes showed that the coverage of the suggested SCIs is typically closer to the nominal level. Also, in 2015, Abdel-Karim [20] presented two two-step procedures, called FGCIs-MOVER and MOVER-MOVER, to estimate SCIs for ratios of means of lognormal distributions. In 2018, Thangjai et al. [21] introduced a procedure to estimate SCIs for the differences in means of multiple normal populations with unknown coefficients of variation, employing the GCI approach and the MOVER approach. Lastly, Maneerat et al. [22] proposed SCIs for pairwise comparisons of the means of delta-lognormal distributions. They used the parametric bootstrap approach, FGCI approach, MOVER approach, and Bayesian credible intervals with mixed and uniform priors. Their simulation results showed that the Bayesian credible interval with a uniform prior and the parametric bootstrap work well in different situations, even with large differences in variance.
As far as we know, there has not been any research on the development of SCIs for all differences between the means of Weibull distributions. In this study, we propose the GCI, MOVER, and Bayesian methods for establishing SCIs for all differences between the means of Weibull distributions. The remaining sections of this paper are structured as follows. Section 2 outlines the materials and methods used to construct SCIs for all the differences between the means of Weibull distributions. Simulation results for three and five samples (p = 3 and p = 5) are presented in Section 3. The methods are applied to wind speed datasets collected from three provinces in northeastern Thailand and three provinces in southern Thailand in Section 4. Lastly, we present the discussion and conclusion of this study.

Materials and Methods
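Suppose that $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1, 2, \ldots, p$, is a random sample from the $i$-th of $p$ independent Weibull distributions with scale parameters $a = (a_1, a_2, \ldots, a_p)$ and shape parameters $k = (k_1, k_2, \ldots, k_p)$, denoted as Weibull$(a_i, k_i)$. Waloddi Weibull proposed the Weibull distribution and defined the probability density function (pdf) of $X_i$ as

$$f(x_{ij}; a_i, k_i) = \frac{k_i}{a_i}\left(\frac{x_{ij}}{a_i}\right)^{k_i - 1} \exp\left[-\left(\frac{x_{ij}}{a_i}\right)^{k_i}\right], \quad x_{ij} > 0, \tag{1}$$

for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, n_i$. The respective mean and variance can be derived as

$$\theta_i = a_i \, \Gamma\left(1 + \frac{1}{k_i}\right) \tag{2}$$

and

$$\sigma_i^2 = a_i^2 \left[\Gamma\left(1 + \frac{2}{k_i}\right) - \Gamma^2\left(1 + \frac{1}{k_i}\right)\right]. \tag{3}$$

The methods described in this section aim to construct SCIs. Our focus lies in constructing the SCIs for all pairwise differences between the means,

$$\theta_{il} = \theta_i - \theta_l, \tag{4}$$

where $i, l = 1, 2, \ldots, p$ and $i \neq l$.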
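As a small numerical illustration of the mean, the variance, and the pairwise differences defined above, the following Python sketch evaluates them for given scale and shape values. The function names are hypothetical and only the standard library is used. Each unordered pair is reported once, with i < l, which is a presentation choice made here; the reversed pair only changes the sign.

```python
import math
from itertools import combinations


def weibull_mean(a, k):
    """Mean of Weibull(scale a, shape k): a * Gamma(1 + 1/k)."""
    return a * math.gamma(1.0 + 1.0 / k)


def weibull_variance(a, k):
    """Variance of Weibull(scale a, shape k)."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return a * a * (g2 - g1 * g1)


def pairwise_mean_differences(scales, shapes):
    """Return {(i, l): theta_i - theta_l} for all 1-based pairs with i < l."""
    means = [weibull_mean(a, k) for a, k in zip(scales, shapes)]
    return {
        (i + 1, l + 1): means[i] - means[l]
        for i, l in combinations(range(len(means)), 2)
    }


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print(pairwise_mean_differences([1.0, 2.0, 5.0], [2.0, 2.0, 2.0]))
```

For p groups this yields p(p - 1)/2 differences, for example three when p = 3 and ten when p = 5.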

GCI
To construct the SCIs based on the GCI method, the GPQ concept is employed in the following manner.

Definition 1. Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ be a random sample from $p$ independent Weibull distributions, which are based on a parameter of interest $\phi_i$ and a nuisance parameter $\gamma_i$. Let $x_i = (x_{i1}, x_{i2}, \ldots, x_{in_i})$ be an observed value of $X_i$. Then, $R(X_i; x_i, \phi_i, \gamma_i)$ is called the GPQ when it fulfills the two conditions introduced by Weerahandi [23].
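The GPQs for the scale and shape parameters of the Weibull distributions, which fulfill the two conditions, were presented by Krishnamoorthy et al. [24]. Let $R_{k_i}$ be the GPQ for the shape parameter ($k_i$) and $R_{a_i}$ be the GPQ for the scale parameter ($a_i$), respectively. They can be obtained from the following equations:

$$R_{k_i} = \frac{\hat{k}_{i0}}{\hat{k}_i^{*}} \tag{5}$$

and

$$R_{a_i} = \left(\frac{1}{\hat{a}_i^{*}}\right)^{\hat{k}_i^{*} / \hat{k}_{i0}} \hat{a}_{i0}. \tag{6}$$

Let $\hat{a}_i$ be the maximum likelihood estimator of $a_i$ from Weibull$(a_i, k_i)$, and $\hat{a}_{i0}$ be the observed value of $\hat{a}_i$. Similarly, let $\hat{k}_i$ be the maximum likelihood estimator of $k_i$ from Weibull$(a_i, k_i)$, and $\hat{k}_{i0}$ be the observed value of $\hat{k}_i$. The research conducted by Thoman et al. [25] showed that $\hat{k}_i / k_i$ and $\hat{k}_i \ln(\hat{a}_i / a_i)$ are pivotal quantities, distributed as $\hat{k}_i^{*}$ and $\hat{k}_i^{*} \ln \hat{a}_i^{*}$, where $\hat{a}_i^{*}$ and $\hat{k}_i^{*}$ are the maximum likelihood estimators based on a sample of size $n_i$ from Weibull$(1, 1)$.

Herein, we developed the GCI method to establish SCIs for $\theta_{il}$. Firstly, the GPQ for estimating $\theta_i$ is determined as

$$R_{\theta_i} = R_{a_i} \, \Gamma\left(1 + \frac{1}{R_{k_i}}\right). \tag{7}$$

From Equation (4), we can obtain

$$R_{\theta_{il}} = R_{\theta_i} - R_{\theta_l}. \tag{8}$$

Therefore, the $100(1 - \alpha)\%$ SCI based on GCI for $\theta_{il}$ is given by

$$SCI_{il(GCI)} = \left[L_{il(GCI)}, U_{il(GCI)}\right], \tag{9}$$

where $L_{il(GCI)}$ and $U_{il(GCI)}$ are the $100(\alpha/2)$-th and $100(1 - \alpha/2)$-th percentiles of $R_{\theta_{il}}$, respectively.

Algorithm 1: GCI approach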
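The GCI construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the maximum likelihood estimates are obtained by a simple bisection on the profile score equation, the interval bounds are taken as empirical percentiles of the simulated GPQ differences, and all function names (`weibull_mle`, `gpq_mean_draws`, `gci_sci`) are hypothetical.

```python
import math
import random


def weibull_mle(x):
    """Return (scale, shape) MLEs; the shape solves the profile score equation."""
    logs = [math.log(v) for v in x]
    n = len(x)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def score(k):
        # Weights are x**k rescaled by max(x)**k to avoid overflow.
        w = [math.exp(k * (lg - max_log)) for lg in logs]
        return sum(wi * lg for wi, lg in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:      # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(50):         # plain bisection; score is increasing in k
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    w_sum = sum(math.exp(k * (lg - max_log)) for lg in logs)
    a = math.exp(max_log + math.log(w_sum / n) / k)
    return a, k


def gpq_mean_draws(x, n_draws, rng):
    """Simulate the GPQ of the mean, R_a * Gamma(1 + 1/R_k), for one sample."""
    a0, k0 = weibull_mle(x)
    draws = []
    for _ in range(n_draws):
        star = [rng.weibullvariate(1.0, 1.0) for _ in x]   # sample from Weibull(1, 1)
        a_s, k_s = weibull_mle(star)
        r_k = k0 / k_s
        r_a = a0 * (1.0 / a_s) ** (k_s / k0)
        draws.append(r_a * math.gamma(1.0 + 1.0 / r_k))
    return draws


def gci_sci(samples, alpha=0.05, n_draws=2500, seed=1):
    """Return {(i, l): (lower, upper)} for all 1-based pairs with i < l."""
    rng = random.Random(seed)
    draws = [gpq_mean_draws(x, n_draws, rng) for x in samples]
    sci = {}
    for i in range(len(samples)):
        for l in range(i + 1, len(samples)):
            d = sorted(ri - rl for ri, rl in zip(draws[i], draws[l]))
            lower = d[int((alpha / 2.0) * (n_draws - 1))]
            upper = d[int(round((1.0 - alpha / 2.0) * (n_draws - 1)))]
            sci[(i + 1, l + 1)] = (lower, upper)
    return sci
```

A call such as `gci_sci([x1, x2, x3])`, where each argument is a list of positive observations, returns one interval per pair of groups.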

MOVER
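The MOVER method was introduced by Donner and Zou [26]. This technique allows for the construction of a confidence interval for a function of two parameters. Herein, we applied this method to construct a confidence interval for the difference between two parameters of interest. For the difference in means, the confidence interval for the parameter $\theta_1 - \theta_2$ is given by

$$L = \hat{\theta}_1 - \hat{\theta}_2 - \sqrt{(\hat{\theta}_1 - l_1)^2 + (u_2 - \hat{\theta}_2)^2} \tag{10}$$

and

$$U = \hat{\theta}_1 - \hat{\theta}_2 + \sqrt{(u_1 - \hat{\theta}_1)^2 + (\hat{\theta}_2 - l_2)^2}. \tag{11}$$

Consider the $p$ parameters of interest. For $i, l = 1, 2, \ldots, p$ and $i \neq l$, the lower bound $L_{il}$ and the upper bound $U_{il}$ become

$$L_{il} = \hat{\theta}_i - \hat{\theta}_l - \sqrt{(\hat{\theta}_i - l_i)^2 + (u_l - \hat{\theta}_l)^2} \tag{12}$$

and

$$U_{il} = \hat{\theta}_i - \hat{\theta}_l + \sqrt{(u_i - \hat{\theta}_i)^2 + (\hat{\theta}_l - l_l)^2}, \tag{13}$$

where $\hat{\theta}_i$ and $\hat{\theta}_l$ are the estimates of $\theta_i$ and $\theta_l$ defined in Equation (2). Let $[l_i, u_i]$ and $[l_l, u_l]$ be the confidence intervals for $\theta_i$ and $\theta_l$ of the Weibull distributions. In our work, the Wald method was used to construct them. According to the Wald confidence interval, the $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_i$ of the Weibull distribution is given by $[l_i, u_i]$, where $i = 1, 2, \ldots, p$.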
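The derivation is provided below:

$$l_i = \exp\left(\ln \hat{\theta}_i - z_{\alpha/2} \sqrt{\widehat{var}(\ln \hat{\theta}_i)}\right) \tag{14}$$

and

$$u_i = \exp\left(\ln \hat{\theta}_i + z_{\alpha/2} \sqrt{\widehat{var}(\ln \hat{\theta}_i)}\right), \tag{15}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. The estimate of the variance of $\ln \hat{\theta}_i$ follows from the delta method as $\widehat{var}(\ln \hat{\theta}_i) = \widehat{var}(\hat{\theta}_i) / \hat{\theta}_i^2$. The Fisher information matrix provides the estimates of the variances and the covariance of $\hat{a}_i$ and $\hat{k}_i$, from which $\widehat{var}(\hat{\theta}_i)$ is obtained. After substituting $l_i$, $l_l$, $u_i$, and $u_l$ into Equations (12) and (13), the $100(1 - \alpha)\%$ SCI based on the MOVER approach for $\theta_{il}$ is

$$SCI_{il(MOVER)} = \left[L_{il}, U_{il}\right],$$

where $L_{il}$ is defined in Equation (12) and $U_{il}$ is defined in Equation (13). Algorithm 2 is utilized to construct $L_{il}$ and $U_{il}$.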

4. Compute intervals $l_i$ and $u_i$ for $\theta_i$ from Equations (14) and (15).
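The combination step of MOVER can be sketched in Python as follows. The sketch assumes that the point estimates and the individual confidence limits for each mean have already been computed (for example by the Wald method); the function names and the numbers in the usage example are hypothetical.

```python
import math


def mover_difference(est_i, ci_i, est_l, ci_l):
    """MOVER interval for theta_i - theta_l from the two individual intervals."""
    l_i, u_i = ci_i
    l_l, u_l = ci_l
    diff = est_i - est_l
    lower = diff - math.sqrt((est_i - l_i) ** 2 + (u_l - est_l) ** 2)
    upper = diff + math.sqrt((u_i - est_i) ** 2 + (est_l - l_l) ** 2)
    return lower, upper


def mover_sci(estimates, intervals):
    """Return {(i, l): (L_il, U_il)} for all 1-based pairs with i < l."""
    p = len(estimates)
    return {
        (i + 1, l + 1): mover_difference(
            estimates[i], intervals[i], estimates[l], intervals[l]
        )
        for i in range(p)
        for l in range(i + 1, p)
    }


if __name__ == "__main__":
    # Illustrative estimates and limits only; they are not taken from the paper.
    print(mover_sci([5.0, 3.0, 4.0], [(4.0, 6.0), (2.0, 4.0), (3.5, 4.5)]))
```

Because the lower limit of one mean is paired with the upper limit of the other, the resulting interval need not be symmetric about the estimated difference when the individual intervals are asymmetric.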

Bayesian Inference
In Bayesian statistics, prior distributions play a pivotal role by enabling the integration of pre-existing knowledge or assumptions about the parameters of interest before observing the data. Combining this prior distribution with the likelihood function derived from observed data yields the posterior distribution in Bayesian inference. The posterior distribution effectively encapsulates refined beliefs about the parameters after considering the data. Here, we explore the Bayesian inference of the parameters of the Weibull distribution, utilizing both a gamma prior distribution and a uniform prior distribution.

Bayesian Gamma Prior
The gamma distribution is a versatile family that can take a variety of shapes and includes the exponential distribution as a special case. Notably, it serves as a conjugate prior to the exponential likelihood function, ensuring that its structure remains intact in the posterior. This conjugate property simplifies the computations required for Bayesian updating. Additionally, a gamma prior has positive support, which matches the parameter space of the Weibull distribution and makes it a natural choice for Bayesian inference in this context. In this part, we consider gamma priors for the scale and shape parameters, assuming that they are independently distributed. Then, the gamma priors for $a$ and $k$ are

$$\pi_1(a) \propto a^{v_1 - 1} \exp(-z_1 a) \tag{18}$$

and

$$\pi_2(k) \propto k^{v_2 - 1} \exp(-z_2 k), \tag{19}$$

where $a > 0$, $k > 0$, and the hyperparameters $v_1, z_1, v_2, z_2$ are assumed to be known real numbers. Let $L(a, k \mid x)$ be the associated likelihood function; then the joint density function of the data, $a$ and $k$, is

$$L(x, a, k) = L(a, k \mid x) \, \pi_1(a) \, \pi_2(k). \tag{20}$$

Therefore, the posterior density function, given the data, is

$$\pi(a, k \mid x) = \frac{L(x, a, k)}{\int_0^\infty \int_0^\infty L(x, a, k) \, da \, dk}. \tag{21}$$

The Bayes estimates cannot be obtained in a simple, closed form because the integrals in Equation (21) cannot be evaluated analytically. As a result, an alternative method for parameter estimation is needed, and the Markov chain Monte Carlo (MCMC) method is often employed. The MCMC method has proven successful in Bayesian computing, particularly through its ability to sample from full conditional distributions. One commonly used variant of MCMC is the Gibbs sampler. In this study, we propose using the Gibbs sampling procedure to generate MCMC samples and compute the Bayes estimate, as described by Geman and Geman [27]. The full conditional posterior distributions of $a$ and $k$ are

$$a \mid k, x \sim \text{Gamma}\left(n + v_1, \; z_1 + \sum_{j=1}^{n} x_j^{k}\right) \tag{22}$$

and

$$\pi(k \mid a, x) \propto L(a, k \mid x) \, \pi_2(k). \tag{23}$$

Equation (22) corresponds to a gamma density with parameters $(n + v_1)$ and $(z_1 + \sum x_j^{k})$. Generating samples of $a$ can be easily accomplished using any gamma-generating routine. However, Equation (23) does not correspond to a standard distribution, so the Gibbs sampling procedure alone is not sufficient to update $k$. Instead, we employ the Metropolis-Hastings algorithm to update the shape parameter. This algorithm is particularly useful when direct sampling from a distribution is challenging, making it the chosen method for generating samples of the shape parameter. In this research, we applied both the Gibbs sampling procedure and the Metropolis-Hastings algorithm to generate samples from the full conditional distributions using the R (version 4.1.2) programming software package. OpenBUGS (Bayesian inference Using Gibbs Sampling) is a software package designed for performing Bayesian analysis with MCMC methods. It offers a flexible and intuitive approach for specifying and fitting Bayesian statistical models, and it is particularly suitable for complex models. To use OpenBUGS from R, the "R2OpenBUGS" package must be installed, as it provides the interface between R and OpenBUGS.
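The sampling scheme described above (a Gibbs update for $a$ and a Metropolis-Hastings update for $k$) can be sketched in plain Python as follows. This is an illustration under explicit assumptions rather than the OpenBUGS model used in the study: the likelihood is written in the rate form $f(x \mid a, k) = a k x^{k-1} \exp(-a x^{k})$, which is the form under which the full conditional of $a$ is the gamma density stated in the text; the update for $k$ uses a Gaussian random-walk proposal with a hypothetical step size; and all function names are hypothetical.

```python
import math
import random


def log_full_conditional_k(k, a, x, sum_log_x, v2, z2):
    """Log of the unnormalized full conditional of k (rate-form likelihood, gamma prior)."""
    n = len(x)
    return ((n + v2 - 1.0) * math.log(k) - z2 * k
            + (k - 1.0) * sum_log_x - a * sum(v ** k for v in x))


def gibbs_mh_weibull(x, n_iter=20000, burn_in=1000,
                     v1=0.1, z1=0.1, v2=0.1, z2=0.1, step=0.2, seed=1):
    """Return post-burn-in draws [(a, k), ...] from the joint posterior."""
    rng = random.Random(seed)
    n = len(x)
    sum_log_x = sum(math.log(v) for v in x)
    a, k = 1.0, 1.0                      # arbitrary starting values
    draws = []
    for it in range(n_iter):
        # Gibbs step: a | k, x ~ Gamma(shape = n + v1, rate = z1 + sum(x**k)).
        rate = z1 + sum(v ** k for v in x)
        a = rng.gammavariate(n + v1, 1.0 / rate)   # second argument is a scale
        # Metropolis-Hastings step for k with a symmetric random-walk proposal.
        proposal = k + rng.gauss(0.0, step)
        if proposal > 0.0:               # proposals outside the support are rejected
            log_ratio = (log_full_conditional_k(proposal, a, x, sum_log_x, v2, z2)
                         - log_full_conditional_k(k, a, x, sum_log_x, v2, z2))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                k = proposal
        if it >= burn_in:
            draws.append((a, k))
    return draws


def mean_from_rate_form(a, k):
    """Weibull mean under the rate form assumed here: a**(-1/k) * Gamma(1 + 1/k)."""
    return a ** (-1.0 / k) * math.gamma(1.0 + 1.0 / k)
```

Applying `mean_from_rate_form` to every retained pair of draws gives posterior draws of the mean for one group; differences of such draws between two groups give posterior draws of $\theta_{il}$.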
Let us once again consider the $p$ parameters of interest. For $i, l = 1, 2, \ldots, p$, $i \neq l$, the differences between means were computed as in Equation (4). Therefore, the $100(1 - \alpha)\%$ SCI based on the Bayesian equal-tailed interval using the gamma prior distribution for $\theta_{il}$ is given by

$$SCI_{il(BAYE.g)} = \left[L_{il(BAYE.g)}, U_{il(BAYE.g)}\right],$$

where $L_{il(BAYE.g)}$ and $U_{il(BAYE.g)}$ are the lower and upper bounds of the $100(1 - \alpha)\%$ equal-tailed credible interval of $\theta_{il}$. In Bayesian statistics, alongside equal-tailed credible intervals, there exists another type known as the highest posterior density (HPD) interval. This interval delineates the most densely populated region of the posterior distribution, encapsulating the parameter's most probable values. The Bayesian HPD interval is the shortest among all Bayesian credible intervals for a given $(1 - \alpha)$: every point inside the interval has a higher posterior density than any point outside it [28]. Using the R statistical program with the HDInterval package, the $100(1 - \alpha)\%$ SCI based on the Bayesian HPD interval using the gamma prior distribution for $\theta_{il}$ is

$$SCI_{il(HPD.g)} = \left[L_{il(HPD.g)}, U_{il(HPD.g)}\right].$$

Algorithm 3 is utilized to construct $[L_{il(BAYE.g)}, U_{il(BAYE.g)}]$ and $[L_{il(HPD.g)}, U_{il(HPD.g)}]$.

2. Generate gamma prior for $a_i$ from Equation (18).
Compute the estimates of $a_i$ and $k_i$ using OpenBUGS in R.
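Given posterior draws of a difference $\theta_{il}$, the two kinds of Bayesian interval discussed above can be computed as in the following Python sketch. It is a plain standard-library analogue written for illustration, not the HDInterval routine used in the study; the function names and the choice of empirical quantile rule are assumptions. The HPD interval is taken as the shortest window of sorted draws that contains at least a fraction $1 - \alpha$ of them, which is appropriate for a unimodal posterior.

```python
import math


def equal_tailed_interval(draws, alpha=0.05):
    """Interval between the empirical alpha/2 and 1 - alpha/2 quantiles."""
    s = sorted(draws)
    m = len(s)
    lower = s[int(math.floor((alpha / 2.0) * (m - 1)))]
    upper = s[int(math.ceil((1.0 - alpha / 2.0) * (m - 1)))]
    return lower, upper


def hpd_interval(draws, alpha=0.05):
    """Shortest interval containing at least (1 - alpha) of the draws."""
    s = sorted(draws)
    m = len(s)
    # Number of draws the interval must cover; the tolerance guards against
    # floating-point noise in (1 - alpha) * m.
    width = max(1, int(math.ceil((1.0 - alpha) * m - 1e-9)))
    best = min(range(m - width + 1), key=lambda i: s[i + width - 1] - s[i])
    return s[best], s[best + width - 1]


if __name__ == "__main__":
    # A small right-skewed set of illustrative draws (not taken from the paper).
    demo = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 5.0]
    print(equal_tailed_interval(demo, alpha=0.1))
    print(hpd_interval(demo, alpha=0.1))
```

For a skewed posterior the HPD interval drops the long tail and is therefore shorter than the equal-tailed interval, which is the behavior the comparison of ELs relies on.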

Bayesian Uniform Prior
In the previous subsection, we explored how Bayesian estimation incorporates prior knowledge about the parameter. However, in situations where prior knowledge is lacking, we can employ a non-informative prior in Bayesian analysis. In this subsection, we assume a uniform prior distribution for the scale and shape parameters: $\pi(a) \sim \text{uniform}(0, 100)$ and $\pi(k) \sim \text{uniform}(0, 4)$.
To estimate the parameters based on the Bayesian approach with a uniform prior, the OpenBUGS (version 3.2-3.2.1) software in R is used again. By using Equation (4) together with the prior information, we estimate the differences in means for the $p$ parameters of interest.
Compute the estimate of a i , k i using OpenBUGS in R.

Results
A simulation study was conducted to evaluate the performance of the proposed approaches. The methods were compared in terms of the coverage probabilities (CPs), expected lengths (ELs), and standard errors (s.e.) of the confidence intervals. The simulation study consisted of 5000 simulation runs, with 2500 replications for the GCI. Furthermore, we used Gibbs sampling in conjunction with the Metropolis-Hastings algorithm, conducting 20,000 iterations with a burn-in of 1000. The hyperparameter values were fixed at $v_1 = z_1 = v_2 = z_2 = 0.1$. To be considered satisfactory, an SCI should exhibit CPs that are near or above the nominal confidence level $(1 - \alpha)$ of 0.95 while also having the shortest EL. The parameter combinations used in the simulation studies, some fixed and some varied across scenarios, are listed in Tables 1 and 2. The CPs of the SCIs are obtained by following Algorithm 5.

Algorithm 5: The CP of the SCI estimates for the differences between means of Weibull distributions

2. Construct the SCIs based on the GCI method described in Algorithm 1 and record whether $\theta_{il} = \theta_i - \theta_l$, $i, l = 1, 2, \ldots, p$, $i \neq l$, reside within their corresponding $SCI_{GCI}$.
3. Construct the SCIs based on the MOVER method described in Algorithm 2 and record whether $\theta_{il} = \theta_i - \theta_l$, $i, l = 1, 2, \ldots, p$, $i \neq l$, reside within their corresponding $SCI_{MOVER}$.
4. Construct the SCIs based on the Bayesian equal-tailed interval and the Bayesian HPD interval using the gamma prior distribution described in Algorithm 3 and record whether $\theta_{il} = \theta_i - \theta_l$, $i, l = 1, 2, \ldots, p$, $i \neq l$, reside within their corresponding $SCI_{BAYE.g}$ and $SCI_{HPD.g}$, respectively.
5. Construct the SCIs based on the Bayesian equal-tailed interval and the Bayesian HPD interval using the uniform prior distribution described in Algorithm 4 and record whether $\theta_{il} = \theta_i - \theta_l$, $i, l = 1, 2, \ldots, p$, $i \neq l$, reside within their corresponding $SCI_{BAYE.u}$ and $SCI_{HPD.u}$, respectively.
6. Compute the CP as the fraction of times that all $\theta_{il} = \theta_i - \theta_l$, $i, l = 1, 2, \ldots, p$, $i \neq l$, are in their corresponding SCIs.
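The final step of Algorithm 5 can be sketched in Python as follows. The sketch assumes that each simulation run has already produced a dictionary mapping a pair of group indices to an interval, as any of the SCI methods could. The function names are hypothetical, and averaging the interval lengths over pairs to obtain the EL is an assumption made here for illustration.

```python
def covers_all(true_means, sci):
    """True when every pairwise difference lies inside its interval.

    `sci` maps 1-based pairs (i, l) to (lower, upper).
    """
    return all(
        lower <= true_means[i - 1] - true_means[l - 1] <= upper
        for (i, l), (lower, upper) in sci.items()
    )


def coverage_and_expected_length(true_means, sci_runs):
    """Return (CP, EL) over a list of simulated SCI sets."""
    m = len(sci_runs)
    # CP: fraction of runs in which all intervals cover simultaneously.
    cp = sum(covers_all(true_means, sci) for sci in sci_runs) / m
    # EL: interval length averaged over pairs, then over runs (an assumption).
    el = sum(
        sum(upper - lower for lower, upper in sci.values()) / len(sci)
        for sci in sci_runs
    ) / m
    return cp, el


if __name__ == "__main__":
    # Two illustrative runs for true means (1, 2, 5); the numbers are made up.
    runs = [
        {(1, 2): (-2.0, 0.0), (1, 3): (-5.0, -3.0), (2, 3): (-4.0, -2.0)},
        {(1, 2): (-0.5, 0.5), (1, 3): (-5.0, -3.0), (2, 3): (-4.0, -2.0)},
    ]
    print(coverage_and_expected_length([1.0, 2.0, 5.0], runs))
```

Note that a run counts as a success only when all pairwise intervals cover at once, which is what distinguishes simultaneous coverage from the coverage of a single interval.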
The CPs and ELs of the proposed methods in Tables 1 and 2 are summarized in Figures 1 and 2. For p = 3, $SCI_{GCI}$, $SCI_{BAYE.g}$, and $SCI_{HPD.g}$ demonstrate satisfactory CPs. Nevertheless, when comparing the ELs and the standard errors, it becomes apparent that $SCI_{HPD.g}$ consistently yields smaller values than the other methods in all scenarios. One noteworthy result is that $SCI_{GCI}$ performs well in situations where the means and sample sizes of the groups are unequal. For $SCI_{BAYE.u}$ and $SCI_{HPD.u}$, the CPs are close to 1 when θ = 0.5, regardless of the sample sizes. However, when $\theta_i = (1, 2, 5)$, $SCI_{HPD.u}$ outperforms the other methods. Lastly, the CPs of $SCI_{MOVER}$ fall below the target across all scenarios. The findings for p = 5 are similar to those for p = 3. For equal sample sizes, $SCI_{HPD.g}$ showed the best performance in terms of CPs, ELs, and standard errors. $SCI_{GCI}$ performs well when the means of the populations are not equal ($\theta_i = (0.5, 1, 2, 2, 5)$). $SCI_{BAYE.u}$ had CPs higher than the target when $\theta_i = (1, 1, 2, 5, 5)$ for n = 10 and 30. When the sample sizes are not equal, both $SCI_{HPD.g}$ and $SCI_{GCI}$ are similarly effective in achieving CPs greater than or close to 0.95. Nevertheless, $SCI_{HPD.g}$ yielded shorter ELs and smaller standard errors than $SCI_{GCI}$. Thus, we recommend using the Bayesian HPD interval with a gamma prior distribution for constructing SCIs for the differences between the means of Weibull distributions, in cases of both equal and unequal sample sizes. This recommendation is based on the observation that the results for $n_i = n_l$ were similar to those for $n_i \neq n_l$.

Applications
Thailand, located in tropical Southeast Asia, can be categorized into five distinct regions: the North, Northeast, Central, East, and South. Each region has distinct geographical features. The North is characterized by valleys, while the Central region consists of lowlands. The Northeast region is renowned for its mountains, whereas the East has a combination of mountains and plains. The South, on the other hand, is a peninsula. A study conducted by the Department of Alternative Energy Development and Efficiency, Ministry of Energy, explored the potential sources of wind energy and found that most areas in Thailand have low wind speed potentials. However, specific regions stand out as promising sources of wind energy. The southern region of Nakhon Si Thammarat province, along with areas near the shores of Songkhla Lake, have been identified as good sources of wind energy. According to a report by the Electricity Generating Authority of Thailand, approximately 86% of the areas with significant wind energy potential are concentrated in the Northeastern region. Moreover, private sector studies have also identified the Lopburi, Nakhon Ratchasima, and Chaiyaphum provinces as having significant wind energy potential. Multiple reports from various agencies corroborate the notion that there are numerous areas in Thailand that could serve as excellent sources of wind energy.

Example 1
In this section, we used monthly wind speed data (in knots) from three provinces in northeastern Thailand, namely Surin, Nong Khai, and Sakon Nakhon, as an example for $n_i = n_l$ to examine the efficiency of the proposed methods. The data, collected over four years from 2018 to 2021, were obtained from the Meteorological Department of Thailand [29]. To assess the fit of candidate distributions to the collected data, we employed the Akaike information criterion (AIC). The AIC results are presented in Table 3 and show that the datasets from all three areas were well suited to Weibull distributions. Additionally, we generated Q-Q plots to visualize the datasets, as depicted in Figure 3. Summary statistics for the three datasets can be found in Table 4. Table 5 provides the 95% SCIs for all pairwise differences between the mean wind speeds of the three provinces in northeastern Thailand, computed using all the methods. In terms of the length of the intervals in Table 5, the MOVER method had the smallest length. However, in the simulation results, the CP of the MOVER method did not reach 0.95 in any case, so the MOVER method is not considered. Therefore, when considering both CP and the smallest length, it is recommended to use the Bayesian HPD interval with a gamma prior to estimate the SCIs for all pairwise differences between the means of wind speed data from the three provinces in northeastern Thailand. This supports the finding that the Bayesian HPD interval with a gamma prior is suitable for constructing SCIs for all pairwise differences between the means of Weibull distributions, particularly in cases of equal sample sizes.

Example 2
For $n_i \neq n_l$, we used monthly wind speed data (in knots) from three provinces in southern Thailand, namely Pattani, Chumphon, and Songkhla, as an example. The data for each province were collected from 2008 to 2012 and were obtained from the Meteorological Department of Thailand [29]. Wind speed data from Pattani Province were collected from one station, Chumphon Province from two stations, and Songkhla Province from three stations. This results in an unequal number of samples in each province. To evaluate the appropriateness of the distributions, we again used the AIC. The findings indicated that the Weibull distributions were well matched to the datasets from all three areas, as detailed in Table 6. Q-Q plots illustrating the datasets can be seen in Figure 4. The summary statistics for the wind speed datasets of the three provinces in southern Thailand are presented in Table 7. The 95% SCIs for all pairwise differences between the mean wind speeds of the three provinces in southern Thailand, reported in Table 8, indicated that the Bayesian HPD interval with a gamma prior was the shortest. Once again, this supports the finding that the Bayesian HPD interval with a gamma prior is well suited for establishing SCIs for all pairwise differences in the means of Weibull distributions, even when dealing with unequal sample sizes.

Discussion
La-ongkaew et al. [30] presented Bayesian methods relying on a gamma prior distribution to estimate the difference in parameter values of Weibull distributions. The results of their investigation demonstrated that both the Bayesian HPD interval and GCI methods outperformed others in different scenarios. Building on this concept, we extended our work to construct confidence intervals for the differences of means of Weibull populations simultaneously.
The concepts of GCI, MOVER, and Bayesian methods were used to estimate the SCIs for all pairwise differences between the means of Weibull distributions. This study presented a Bayesian approach utilizing gamma and uniform prior distributions. The Bayesian HPD interval using the gamma prior performed well in most cases, with CPs frequently meeting or closely approaching the set target while also yielding the shortest ELs. However, there are limitations for the Bayesian HPD interval using the gamma prior in the case of small sample sizes. It is possible that these limitations stem from the hyperparameter configuration in the gamma prior distribution. Additionally, there are other algorithms available to address the issue of a non-closed-form posterior distribution, such as the independent Metropolis-Hastings algorithm, slice sampling, or Lindley's approximation. These alternatives provide an interesting avenue for future research. Furthermore, an increase in sample size consistently led to a decrease in ELs, with the s.e. producing consistent outcomes.
Finally, the results were further supported by examples involving wind speed data from three provinces in northeastern Thailand and three provinces in southern Thailand. Armed with the knowledge of the difference in the means of wind speed between two areas, related agencies can more effectively comprehend, utilize, plan for, and even anticipate suitable wind speed levels.

Conclusions
Herein, six methods for constructing SCIs for all pairwise differences between the means of Weibull distributions are presented: GCI, MOVER, and the Bayesian equal-tailed and HPD intervals, each based on gamma and uniform priors. Our findings indicated that the HPD interval using the gamma prior had a reasonable CP, along with a satisfactory EL. Therefore, we recommend this method for constructing the SCIs for all pairwise differences between the means of Weibull distributions. Additionally, the GCI showed a CP higher than the target in many cases, making it a viable alternative method.

Figure 1. Comparison of method performance in terms of CPs and ELs according to the number of samples ($n_i$).

Figure 2. Comparison of method performance in terms of CPs and ELs according to the mean ($\theta_i$).

Figure 3. Weibull Q-Q plots of the dataset from three provinces in northeastern Thailand.

Figure 4. Weibull Q-Q plots of the dataset from three provinces in southern Thailand.

Table 1. The CPs and ELs of 95% SCIs for all differences between means of Weibull distributions for p = 3.
The bold type indicates the CP that exceeds the 0.95 goal value with the shortest EL, and N/A stands for not applicable.

Table 2. The CPs and ELs of 95% SCIs for all differences between means of Weibull distributions for p = 5.

Table 3. AIC values for the wind speed datasets representing three provinces in northeastern Thailand.
The bold type indicates the smallest AIC of the distribution.

Table 4. Summary statistics for the wind speed datasets from three provinces in northeastern Thailand.

Table 5. The 95% simultaneous confidence intervals for all pairwise differences between means of the wind speed data from three provinces in northeastern Thailand.
The bold type indicates the smallest length.

Table 6. AIC values for the wind speed datasets representing three provinces in southern Thailand.

Table 7. Summary statistics for the wind speed datasets from three provinces in southern Thailand.

Table 8. The 95% simultaneous confidence intervals for all pairwise differences between means of the wind speed data from three provinces in southern Thailand.