1. Introduction
In statistics, the Weibull distribution is a continuous probability distribution that is positively skewed. It is related to several other probability distributions; for instance, it interpolates between the exponential distribution and the Rayleigh distribution. In addition, a simple close-to-normal approximation can be applied to a Weibull random variable. Suppose that a random variable $X$ follows a two-parameter Weibull distribution $Weibull(a,k)$, with $a$ as the scale parameter and $k$ as the shape parameter; then the probability density function (pdf) of $X$ can be defined as
$$f(x;a,k)=\frac{k}{a}{\left(\frac{x}{a}\right)}^{k-1}\exp\left[-{\left(\frac{x}{a}\right)}^{k}\right],\quad x>0,\ a>0,\ k>0, \tag{1}$$
where
$E\left(X\right)=\mu =a\Gamma (1+\frac{1}{k})$ and
$V\left(X\right)={\sigma}^{2}={a}^{2}[\Gamma (1+\frac{2}{k})-{(\Gamma (1+\frac{1}{k}))}^{2}]$. Kulkarni and Powar [
1] proposed the transformation $Y={X}^{p}$, where the power $p$ is chosen so that the distribution of the transformed variable $Y$ has only a very small deviation from symmetry and, simultaneously, tail behavior very close to that of a normal distribution with the same mean and variance. To approximate the distribution of $Y$ by a normal distribution, $p=k\theta $ is chosen so that the distribution of $Y$ is exactly symmetric, where the value of $\theta $ is the solution of the skewness equation $t(\theta )=0$ of the distribution of $Y$ as follows:
$$t(\theta )=\frac{\Gamma (1+3\theta )-3\Gamma (1+2\theta )\Gamma (1+\theta )+2{(\Gamma (1+\theta ))}^{3}}{{[\Gamma (1+2\theta )-{(\Gamma (1+\theta ))}^{2}]}^{3/2}}=0. \tag{2}$$
The skewness function is obtained by substituting $\mu $, ${\sigma}^{2}$, and $p=k\theta $, and thus no longer depends on the scale parameter
a. The Weibull distribution has been used in many fields, including engineering, industry, insurance claims and weather forecasting. For example, it has been used to analyze the survival time of guinea pigs injected with different doses of tubercle bacilli [
2] and the lifetime of front and rear brake pads [
3]. Xu et al. [
4] analyzed the effect of laser treatment in delaying the onset of blindness in patients with diabetic retinopathy. Wang et al. [
5] investigated inverse estimators for the parameters of the Weibull distribution and applied them to data on the times to breakdown of an insulating fluid. Zhang et al. [
6] studied the reliability estimation of the multicomponent stress-strength model involving one stress and two correlated strength components from a parallel system. Zhuang et al. [
7] analyzed progressive-stress accelerated life tests with a group effect under progressive censoring. Pang et al. [
8] approximated the Weibull distribution’s parameters using wind speed data from a Hong Kong observatory. Yingni et al. [
9] estimated the wind energy potential of 15 wind farms in China using the Weibull distribution. It has also been used to estimate the distance traveled by a vehicle before throttle failure [
10,
11,
12]. One interesting application of the Weibull distribution is the analysis of wind speed data. In Thailand, a number of research studies have been conducted on the potential of wind energy in order to find suitable sites for wind turbine installations [
13,
14,
15,
16,
17,
18].
Wind is a significant source of renewable energy for electricity generation that is clean and ecologically friendly. Over the years, Thailand has been unable to utilize wind power as efficiently as it should due to the high cost of building wind turbines. Therefore, selecting suitable sites for the installation of wind turbines and wind potential measurement stations is very important. If an area has poor wind speed potential, a station may have to be dismantled and installed at a new site with better wind energy potential. A site with sufficient wind power should have wind speeds that are not too low and that are consistent throughout the year, and the mean of the wind speed distribution would ordinarily be used for statistical inference. However, since the distribution of the data is skewed, the mean may not be the best measure of central tendency, as it is very sensitive to extreme values in a small sample. Instead, the dispersion in the wind speed data over time is a better measure, with low dispersion being preferable.
The coefficient of variation (CV) is a measure of the degree of dispersion in a distribution. It is the ratio of the standard deviation to the mean, and, unlike the variance and standard deviation, the measurement unit of the original data is not involved, which makes it very useful for comparing the dispersion in multiple datasets with different units or very different means. Utilization of the CV is widespread, including in the fields of science, engineering, and medicine. For example, Billings et al. [
19] estimated the CV for the impact of socioeconomic status on paying hospital bills. Kim et al. [
20] analyzed variations in the cycle of hydrogen-fueled engines. Romano et al. [
21] analyzed the shear and tensile bond strength of the tooth structure. Saelee et al. [
22] examined the variability of agricultural production using an approximate confidence interval for the CV. Ospina and Marmolejo-Ramos [
23] studied the stability of a robust CV estimator for psychological and genetic data. Estimating the CV can be achieved via point or interval estimation. Of the two, the confidence interval is more meaningful and provides better information on the parameter of interest than a point estimator [
24]. For example, Yosboonruang et al. [
25] estimated the confidence interval for the CV of rainfall data from Songkhla province in Thailand, while Laongkaew et al. [
26] estimated the confidence interval for the difference between the CVs of wind datasets from Trad and Chonburi provinces in Thailand. Either the difference between or the ratio of the parameters of interest can be used to compare two populations. However, if the difference between the CVs is small, the conclusions drawn from the statistical inference may not be accurate. Therefore, the ratio of the parameter values is usually more appropriate than the difference between them. In the present study, we are interested in comparing the dispersion of wind speed data at two locations in Thailand. Since the dispersion of wind speed in similar areas or provinces may not be that different, we constructed estimators for the confidence interval for the ratio of the CVs of two wind speed datasets. Estimating the confidence interval for the ratio of the CVs of two populations has been studied in many instances.
For example, Verrill and Johnson [
27] introduced the confidence bounds for the ratio of the CVs of normal distributions using both asymptotic and simulation procedures. Buntao and Niwitpong [
28] estimated the confidence bounds for the ratio of the CVs of two independent delta-lognormal distributions using the concepts of the generalized variable approach (GVA) and the method of variance estimates recovery (MOVER). Sangnawakij et al. [
29] provided two new estimators based on MOVER, Wald, and Score intervals for the confidence interval of the ratio of CVs of gamma distributions. Niwitpong and WongKhao [
30] provided estimators for the confidence bounds for the ratio of the CVs of normal distributions with a known ratio of variances based on GVA and MOVER. Moreover, estimating the confidence interval for the ratio of the CVs of two two-parameter exponential distributions was achieved using MOVER and the generalized confidence interval (GCI) method by Sangnawakij et al. [
31]. Hasan and Krishnamoorthy [
32] improved the confidence interval estimation for the ratio of the CVs of two lognormal distributions based on MOVER and the fiducial approach. Based on the GCI, Puggard et al. [
33] estimated the confidence interval for the ratio of the CVs of Birnbaum–Saunders distributions and compared it with the bias-corrected percentile bootstrap and the bias-corrected and accelerated approaches. Using the ideas of GCI and MOVER, Yosboonruang and Niwitpong [
34] proposed a new confidence interval estimator for the ratio of the CVs of delta-lognormal distributions. However, to the best of our knowledge, estimating the confidence interval for the ratio of the CVs of two Weibull distributions has not previously been considered.
Herein, we propose estimators for the confidence interval of the ratio of the CVs of two two-parameter Weibull distributions. The methods used are GCI, MOVER based on Hendricks and Robey’s confidence interval, and Bayesian methods using the gamma and uniform priors, the details of which are given in
Section 2. The details of a simulation study and the results thereof are covered in
Section 3. Application of the methods to two wind speed datasets to illustrate their efficacy is provided in
Section 4. Finally, a discussion and conclusions are presented in the last section.
2. Methods
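Suppose that ${X}_{i}=({X}_{i1},{X}_{i2},\dots ,{X}_{i{n}_{i}})$, $i=1,2$, are independent random samples from two-parameter Weibull distributions, denoted as $Weibull({a}_{i},{k}_{i})$, where the positive constants ${a}_{i}$ and ${k}_{i}$ are the scale and shape parameters, respectively. The pdf of ${X}_{i}$ is given by
$$f({x}_{ij};{a}_{i},{k}_{i})=\frac{{k}_{i}}{{a}_{i}}{\left(\frac{{x}_{ij}}{{a}_{i}}\right)}^{{k}_{i}-1}\exp\left[-{\left(\frac{{x}_{ij}}{{a}_{i}}\right)}^{{k}_{i}}\right],\quad {x}_{ij}>0. \tag{3}$$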
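Maximum likelihood estimation can be used to estimate the parameters ${a}_{i}$ and ${k}_{i}$. Since the maximum likelihood estimators lack a closed form, they must be obtained numerically (see the results of Cohen [35] and Lemon [36]). The maximum likelihood estimator ${\widehat{k}}_{i}$ of ${k}_{i}$ can be obtained as the solution to the following equation:
$$\frac{\sum_{j=1}^{{n}_{i}}{x}_{ij}^{{\widehat{k}}_{i}}\ln {x}_{ij}}{\sum_{j=1}^{{n}_{i}}{x}_{ij}^{{\widehat{k}}_{i}}}-\frac{1}{{\widehat{k}}_{i}}-\frac{1}{{n}_{i}}\sum_{j=1}^{{n}_{i}}\ln {x}_{ij}=0. \tag{4}$$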
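Similarly, the maximum likelihood estimator ${\widehat{a}}_{i}$ of ${a}_{i}$ is defined by
$${\widehat{a}}_{i}={\left(\frac{1}{{n}_{i}}\sum_{j=1}^{{n}_{i}}{x}_{ij}^{{\widehat{k}}_{i}}\right)}^{1/{\widehat{k}}_{i}}. \tag{5}$$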
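Furthermore, the mean, variance, and CV of ${X}_{i}$ are, respectively, obtained as
$$E({X}_{i})={a}_{i}\Gamma \left(1+\frac{1}{{k}_{i}}\right), \tag{6}$$
$$V({X}_{i})={a}_{i}^{2}\left[\Gamma \left(1+\frac{2}{{k}_{i}}\right)-{\left(\Gamma \left(1+\frac{1}{{k}_{i}}\right)\right)}^{2}\right], \tag{7}$$
$$CV({X}_{i})=\frac{\sqrt{\Gamma \left(1+\frac{2}{{k}_{i}}\right)-{\left(\Gamma \left(1+\frac{1}{{k}_{i}}\right)\right)}^{2}}}{\Gamma \left(1+\frac{1}{{k}_{i}}\right)}. \tag{8}$$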
Let
${X}_{1}$ = (
${X}_{11}$,
${X}_{12}$, …,
${X}_{1{n}_{1}}$) and
${X}_{2}$ = (
${X}_{21}$,
${X}_{22}$, …,
${X}_{2{n}_{2}}$) be random samples of size
${n}_{1}$ and
${n}_{2}$ from Weibull distributions with the parameters
${a}_{1}$,
${a}_{2}$,
${k}_{1}$ and
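${k}_{2}$, respectively. Thus, the ratio of their CVs can be derived as
$$\beta =\frac{CV({X}_{1})}{CV({X}_{2})}=\frac{\sqrt{\Gamma (1+\frac{2}{{k}_{1}})-{(\Gamma (1+\frac{1}{{k}_{1}}))}^{2}}\,/\,\Gamma (1+\frac{1}{{k}_{1}})}{\sqrt{\Gamma (1+\frac{2}{{k}_{2}})-{(\Gamma (1+\frac{1}{{k}_{2}}))}^{2}}\,/\,\Gamma (1+\frac{1}{{k}_{2}})}. \tag{9}$$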
Next, the six methods for estimating the confidence interval for $\beta $ are derived.
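The maximum likelihood step described in this section can be illustrated with a short numerical sketch. It is written in Python for illustration only (the study itself used R), the function names `weibull_mle`, `cv_from_shape`, and `cv_ratio_estimate` are hypothetical, and bisection is just one reasonable root-finding choice rather than the procedure used by the authors. The sketch solves the likelihood equation for the shape parameter, recovers the scale estimate, and forms the plug-in estimate of $\beta $.

```python
import math
import random


def weibull_mle(xs, iters=80):
    """Return (a_hat, k_hat) for a two-parameter Weibull sample via bisection."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    top = max(logs)  # subtract the largest log so exp() never overflows

    def score(k):
        # sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x); increasing in k
        w = [math.exp(k * (l - top)) for l in logs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k_hat = 0.5 * (lo + hi)
    mean_pow = sum(math.exp(k_hat * (l - top)) for l in logs) / n
    a_hat = math.exp(top) * mean_pow ** (1.0 / k_hat)
    return a_hat, k_hat


def cv_from_shape(k):
    """CV of a Weibull distribution; it depends on the shape parameter only."""
    log_ratio = math.lgamma(1.0 + 2.0 / k) - 2.0 * math.lgamma(1.0 + 1.0 / k)
    return math.sqrt(math.exp(log_ratio) - 1.0)


def cv_ratio_estimate(x1, x2):
    """Plug-in estimate of beta = CV(X1) / CV(X2)."""
    _, k1 = weibull_mle(x1)
    _, k2 = weibull_mle(x2)
    return cv_from_shape(k1) / cv_from_shape(k2)


if __name__ == "__main__":
    rng = random.Random(1)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape
    x1 = [rng.weibullvariate(2.0, 1.0) for _ in range(50)]
    x2 = [rng.weibullvariate(2.0, 2.0) for _ in range(50)]
    print(weibull_mle(x1), weibull_mle(x2), cv_ratio_estimate(x1, x2))
```

Because the CV is free of the scale parameter, only the shape estimates enter the plug-in ratio; the scale estimate is returned for completeness.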
2.1. The GCI Approach
The GCI, introduced by Weerahandi [37], is based on the concept of the generalized pivotal quantity (GPQ).
Let $X=({X}_{1},{X}_{2},\dots ,{X}_{n})$ be a random sample from a distribution with parameters $(\phi ,\gamma )$, where $\phi $ is the parameter of interest and $\gamma $ is possibly a nuisance parameter, and let $x$ be the observed value of $X$. The GPQ $R(X;x,\phi ,\gamma )$ for confidence interval estimation must satisfy the following conditions.
(GPQ1) The probability distribution of $R(X;x,\phi ,\gamma )$ is free of unknown parameters.
(GPQ2) The observed value of $R(X;x,\phi ,\gamma )$ at $X=x$ is the parameter of interest.
Afterward, the $100(1-\alpha )\%$ confidence interval using the GCI for $\phi $ is provided by $[{R}_{\phi}(\alpha /2),{R}_{\phi}(1-\alpha /2)]$, where ${R}_{\phi}(\alpha /2)$ is the $100(\alpha /2)$th percentile of ${R}_{\phi}(X;x)$.
The GPQs of the parameters of a Weibull distribution were presented by Krishnamoorthy et al. [38]. Let ${\widehat{a}}^{*}$ and ${\widehat{k}}^{*}$ be the maximum likelihood estimators based on a sample from $Weibull(1,1)$, and let ${\widehat{a}}_{0}$ and ${\widehat{k}}_{0}$ be the observed values of $\widehat{a}$ and $\widehat{k}$, respectively. They showed that $\frac{\widehat{k}}{k}$ is distributed as ${\widehat{k}}^{*}$ and that $\widehat{k}\ln \frac{\widehat{a}}{a}$ is distributed as ${\widehat{k}}^{*}\ln {\widehat{a}}^{*}$, neither of which depends on the parameters, and so these quantities can be used to construct the GPQs of $a$ and $k$. Thus, the GPQs of the shape and scale parameters are, respectively, obtained by
$${R}_{k}=\frac{{\widehat{k}}_{0}}{{\widehat{k}}^{*}} \tag{10}$$
and
$${R}_{a}={\widehat{a}}_{0}{\left(\frac{1}{{\widehat{a}}^{*}}\right)}^{{\widehat{k}}^{*}/{\widehat{k}}_{0}}. \tag{11}$$
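Suppose that ${X}_{i}=({X}_{i1},{X}_{i2},\dots ,{X}_{i{n}_{i}})$, $i=1,2$, are random samples of size ${n}_{i}$ from Weibull distributions; then the respective GPQs of the parameters can be defined as
$${R}_{{k}_{i}}=\frac{{\widehat{k}}_{i0}}{{\widehat{k}}_{i}^{*}} \tag{12}$$
and
$${R}_{{a}_{i}}={\widehat{a}}_{i0}{\left(\frac{1}{{\widehat{a}}_{i}^{*}}\right)}^{{\widehat{k}}_{i}^{*}/{\widehat{k}}_{i0}}. \tag{13}$$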
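A useful feature of the GVA is that the GPQ of a function of $a$ and $k$ can be obtained by simply plugging their GPQs into the function: the GPQ for a function $g({a}_{i},{k}_{i})$ is $g({R}_{{a}_{i}},{R}_{{k}_{i}})$. Since $\beta $ in Equation (9) depends only on the shape parameters ${k}_{1}$ and ${k}_{2}$, its GPQ is obtained by replacing each ${k}_{i}$ with ${R}_{{k}_{i}}$. Thus, the GPQ of the ratio of the CVs of two Weibull distributions is given by
$${R}_{\beta}=\frac{\sqrt{\Gamma (1+\frac{2}{{R}_{{k}_{1}}})-{(\Gamma (1+\frac{1}{{R}_{{k}_{1}}}))}^{2}}\,/\,\Gamma (1+\frac{1}{{R}_{{k}_{1}}})}{\sqrt{\Gamma (1+\frac{2}{{R}_{{k}_{2}}})-{(\Gamma (1+\frac{1}{{R}_{{k}_{2}}}))}^{2}}\,/\,\Gamma (1+\frac{1}{{R}_{{k}_{2}}})}. \tag{14}$$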
Subsequently, the $100(1-\alpha )\%$ two-sided GCI for the ratio of the CVs of two Weibull distributions is
$$[{L}_{gci.\beta},{U}_{gci.\beta}]=[{R}_{\beta}(\alpha /2),{R}_{\beta}(1-\alpha /2)], \tag{15}$$
where ${R}_{\beta}(\alpha /2)$ is the $100(\alpha /2)$th percentile of
${R}_{\beta}$.
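The GCI procedure of this subsection can be sketched as a small Monte Carlo routine. The Python code below is an illustrative sketch, not the authors' R implementation: `shape_mle`, `cv_from_shape`, and `gci_cv_ratio` are hypothetical names, the shape estimate is found by bisection, and the percentiles are taken as simple order statistics of the simulated GPQ values.

```python
import math
import random


def shape_mle(xs, iters=60):
    """Bisection solve of the Weibull likelihood equation for the shape."""
    logs = [math.log(x) for x in xs]
    mean_log, top = sum(logs) / len(logs), max(logs)

    def score(k):
        w = [math.exp(k * (l - top)) for l in logs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def cv_from_shape(k):
    """Weibull CV as a function of the shape parameter."""
    log_ratio = math.lgamma(1.0 + 2.0 / k) - 2.0 * math.lgamma(1.0 + 1.0 / k)
    return math.sqrt(math.exp(log_ratio) - 1.0)


def gci_cv_ratio(x1, x2, m=2500, alpha=0.05, seed=0):
    """Percentile interval of m simulated values of the GPQ R_beta."""
    rng = random.Random(seed)
    k1_obs, k2_obs = shape_mle(x1), shape_mle(x2)
    r_beta = []
    for _ in range(m):
        # k*-hat: shape MLE from a Weibull(1, 1) sample of the same size
        k1_star = shape_mle([rng.weibullvariate(1.0, 1.0) for _ in x1])
        k2_star = shape_mle([rng.weibullvariate(1.0, 1.0) for _ in x2])
        r_k1, r_k2 = k1_obs / k1_star, k2_obs / k2_star  # GPQs of the shapes
        r_beta.append(cv_from_shape(r_k1) / cv_from_shape(r_k2))
    r_beta.sort()
    lower = r_beta[int(m * alpha / 2)]
    upper = r_beta[int(m * (1 - alpha / 2)) - 1]
    return lower, upper
```

A call such as `gci_cv_ratio(x1, x2, m=2500)` returns the lower and upper limits; only the GPQs of the shape parameters are needed because the CV does not involve the scale parameter.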
2.2. The MOVER Approach
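MOVER (Donner and Zou [39]) can be used to construct the confidence interval for the difference or the ratio of two distribution parameters. For the ratio of ${\lambda}_{i},i=1,2$, the confidence interval is identified as
$$[L,U], \tag{16}$$
where the lower bound and upper bound for ${\lambda}_{1}/{\lambda}_{2}$ are defined by
$$L=\frac{{\widehat{\lambda}}_{1}{\widehat{\lambda}}_{2}-\sqrt{{({\widehat{\lambda}}_{1}{\widehat{\lambda}}_{2})}^{2}-{l}_{1}{u}_{2}(2{\widehat{\lambda}}_{1}-{l}_{1})(2{\widehat{\lambda}}_{2}-{u}_{2})}}{{u}_{2}(2{\widehat{\lambda}}_{2}-{u}_{2})} \tag{17}$$
and
$$U=\frac{{\widehat{\lambda}}_{1}{\widehat{\lambda}}_{2}+\sqrt{{({\widehat{\lambda}}_{1}{\widehat{\lambda}}_{2})}^{2}-{u}_{1}{l}_{2}(2{\widehat{\lambda}}_{1}-{u}_{1})(2{\widehat{\lambda}}_{2}-{l}_{2})}}{{l}_{2}(2{\widehat{\lambda}}_{2}-{l}_{2})}. \tag{18}$$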
Suppose that ${l}_{i}$ and ${u}_{i}$ are the lower and upper limits of the confidence interval for ${\lambda}_{i}$ (Hendricks and Robey [40]); then the confidence intervals for ${\lambda}_{1}$ and ${\lambda}_{2}$ can, respectively, be defined as
and
where ${t}_{(\alpha /2,{n}_{1}-1)}$ and ${t}_{(\alpha /2,{n}_{2}-1)}$ are the $100(\alpha /2)$th percentiles of the t-distributions with ${n}_{1}-1$ and ${n}_{2}-1$ degrees of freedom, respectively.
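The way MOVER combines two single-parameter intervals into an interval for a ratio can be illustrated with a small helper. The Python sketch below uses the closed-form ratio limits of Donner and Zou; the function name `mover_ratio` is hypothetical, and the per-parameter limits $({l}_{i},{u}_{i})$ are taken as inputs rather than computed from Hendricks and Robey's formula, so the numbers in the usage line are made-up illustrative values, not results from the study.

```python
import math


def mover_ratio(est1, l1, u1, est2, l2, u2):
    """MOVER limits for lambda1 / lambda2 from point estimates and limits (l_i, u_i).

    Assumes positive estimates and u2 < 2 * est2, so that both denominators
    are positive; outside that range the closed form is not meaningful.
    """
    prod = est1 * est2
    lower_root = math.sqrt(prod ** 2 - l1 * u2 * (2 * est1 - l1) * (2 * est2 - u2))
    upper_root = math.sqrt(prod ** 2 - u1 * l2 * (2 * est1 - u1) * (2 * est2 - l2))
    lower = (prod - lower_root) / (u2 * (2 * est2 - u2))
    upper = (prod + upper_root) / (l2 * (2 * est2 - l2))
    return lower, upper


if __name__ == "__main__":
    # Hypothetical inputs: CV estimates 1.0 and 0.5 with illustrative limits
    print(mover_ratio(1.0, 0.8, 1.3, 0.5, 0.4, 0.7))
```

When both input intervals collapse onto their point estimates, the limits reduce to the plain ratio of the estimates, which is a convenient sanity check.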
Afterward, the $100(1-\alpha )\%$ two-sided MOVER confidence interval, based on Hendricks and Robey’s confidence intervals, for the ratio of the CVs of two Weibull distributions is
2.3. The Bayesian Approaches
In Bayesian methodology, the posterior density is obtained by combining the likelihood function with the prior distribution as follows:
Suppose that $X$ is a random variable following a Weibull distribution; then another form of the pdf of $X$ is provided by
$$f(x;{a}^{\prime},k)={a}^{\prime}k{x}^{k-1}\exp (-{a}^{\prime}{x}^{k}),\quad x>0, \tag{23}$$
where ${a}^{\prime}={\left(\frac{1}{a}\right)}^{k}$. In this study, we applied two priors for the parameters, which are defined in the following subsections.
2.3.1. The Gamma Prior
First, we consider the independent priors of the two parameters from a Weibull distribution as follows:
and
where
${v}_{1},{z}_{1},{v}_{2},{z}_{2}$ are the hyperparameters.
Accordingly, the joint posterior density function of
${a}^{\prime}$ and
k given
x can be written as
where
$L({a}^{\prime},k\mid x)$ is the likelihood function.
Assuming that the priors in Equation (
24) and Equation (
25) are independent, then the conditional posterior distributions of parameters are, respectively, given by
and
It can be seen that a sample of ${a}^{\prime}$ can be obtained from a gamma distribution. However, the conditional posterior $\pi (k\mid {a}^{\prime},x)$ does not have a closed form. To solve this problem, a Markov chain Monte Carlo (MCMC) method, namely Gibbs sampling (Geman and Geman [41]), was applied in the present study to generate a sample from the posterior density function. However, Gibbs sampling cannot be applied to the conditional posterior distribution of the shape parameter in a straightforward manner, so we combined it with the random walk Metropolis (RWM) algorithm. The combined algorithm used to obtain the Bayesian estimate ${\widehat{\beta}}^{\left(t\right)}$ is as follows.
1. Start with an initial value $({a}^{\prime \left(0\right)},{k}^{\left(0\right)})$.
2. Generate ${a}^{\prime \left(t\right)}$ from a gamma distribution with parameters $(n+{v}_{2},{z}_{2}+\sum {x}^{{k}^{(t-1)}})$.
3. Update ${k}^{\left(t\right)}$ using the RWM algorithm:
   (a) Generate $\epsilon $ from a normal distribution with parameters $(0,{\sigma}_{k}^{2})$.
   (b) Calculate ${k}^{*}={k}^{(t-1)}+\epsilon $.
   (c) Calculate ${A}_{k}=\frac{L({k}^{*},{a}^{\prime}\mid x)\pi \left({k}^{*}\right)}{L({k}^{(t-1)},{a}^{\prime}\mid x)\pi \left({k}^{(t-1)}\right)}$.
   (d) Generate $u$ from a uniform distribution with parameters $(0,1)$.
   (e) If $u\le \min (1,{A}_{k})$, set ${k}^{\left(t\right)}={k}^{*}$; otherwise, set ${k}^{\left(t\right)}={k}^{(t-1)}$.
4. Calculate the parameter of interest, ${\widehat{\beta}}^{\left(t\right)}$.
5. Repeat steps 2–4 for $t=1,2,\dots ,T$.
6. Discard the first 1000 values of ${\widehat{\beta}}^{\left(t\right)}$ as burn-in.
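The steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the hyperparameter values (`1e-4`) and the proposal standard deviation `sigma_k = 0.2` are assumptions chosen only for the example, and the gamma priors are taken in the shape-rate form implied by the gamma update in step 2.

```python
import math
import random


def gibbs_rwm_shape(xs, T=20000, burn_in=1000, v1=1e-4, z1=1e-4,
                    v2=1e-4, z2=1e-4, sigma_k=0.2, seed=0):
    """Posterior draws of the shape k (after burn-in) for one Weibull sample."""
    rng = random.Random(seed)
    n = len(xs)
    sum_log = sum(math.log(x) for x in xs)

    def log_target(k, a_prime):
        # log of L(k, a' | x) * pi(k), dropping terms that do not involve k
        sum_pow = sum(x ** k for x in xs)
        return ((n + v1 - 1.0) * math.log(k) + (k - 1.0) * sum_log
                - a_prime * sum_pow - z1 * k)

    k = 1.0  # initial value k^(0); a'^(0) is replaced by the first gamma draw
    draws = []
    for _ in range(T):
        # a'^(t) ~ Gamma(shape n + v2, rate z2 + sum x^k);
        # random.gammavariate expects a scale, hence 1 / rate
        rate = z2 + sum(x ** k for x in xs)
        a_prime = rng.gammavariate(n + v2, 1.0 / rate)
        # random walk Metropolis update for k
        k_star = k + rng.gauss(0.0, sigma_k)
        if k_star > 0.0:  # proposals outside the support are rejected
            log_a = log_target(k_star, a_prime) - log_target(k, a_prime)
            if rng.random() <= math.exp(min(0.0, log_a)):
                k = k_star
        draws.append(k)
    return draws[burn_in:]


def cv_from_shape(k):
    """Weibull CV as a function of the shape parameter."""
    log_ratio = math.lgamma(1.0 + 2.0 / k) - 2.0 * math.lgamma(1.0 + 1.0 / k)
    return math.sqrt(math.exp(log_ratio) - 1.0)


def posterior_cv_ratio(x1, x2, **kwargs):
    """Draws of beta built from two independent chains, one per sample."""
    k1 = gibbs_rwm_shape(x1, seed=1, **kwargs)
    k2 = gibbs_rwm_shape(x2, seed=2, **kwargs)
    return [cv_from_shape(a) / cv_from_shape(b) for a, b in zip(k1, k2)]
```

The acceptance ratio is evaluated on the log scale to avoid numerical underflow; the retained draws of $\beta $ can then be summarized by percentiles.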
For $i=1,2$, let ${X}_{i}=({X}_{i1},{X}_{i2},\dots ,{X}_{i{n}_{i}})$ be random samples from Weibull populations with parameters ${a}_{i}$ and ${k}_{i}$. Once the Bayesian estimates have been computed via the above algorithm, the confidence interval can be obtained from the percentiles of the estimates as follows.
The $100(1-\alpha )\%$ two-sided confidence interval for the ratio of the CVs based on the Bayesian method using the gamma prior is given by
$$[{L}_{gamma.\beta},{U}_{gamma.\beta}], \tag{29}$$
where ${L}_{gamma.\beta}$ and ${U}_{gamma.\beta}$ are, respectively, the lower and upper bounds of either the $100(1-\alpha )\%$ equal-tailed confidence interval or the highest posterior density (HPD) interval of $\beta $.
The HPD interval is the shortest interval that contains the specified posterior probability; every value inside it has a higher posterior density than any value outside it [42]. The HPD interval was calculated using the HDInterval package in R.
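To make the difference between the two Bayesian intervals concrete, the following Python sketch computes both from a set of posterior draws. It is an illustration under the assumption of a unimodal posterior and is not the code of the HDInterval package; the function names `equal_tailed_interval` and `hpd_interval` are hypothetical.

```python
def equal_tailed_interval(draws, cred=0.95):
    """Interval that cuts (1 - cred) / 2 of the sorted draws from each tail."""
    s = sorted(draws)
    n = len(s)
    tail = (1.0 - cred) / 2.0
    return s[int(n * tail)], s[min(n - 1, int(n * (1.0 - tail)))]


def hpd_interval(draws, cred=0.95):
    """Shortest interval spanning floor(cred * n) + 1 consecutive sorted draws."""
    s = sorted(draws)
    n = len(s)
    span = int(cred * n + 1e-9)  # small offset guards against float round-off
    if span < 1 or span >= n:
        raise ValueError("need more draws for this credibility level")
    # slide a window of fixed coverage over the sorted draws, keep the narrowest
    best = min(range(n - span), key=lambda i: s[i + span] - s[i])
    return s[best], s[best + span]


if __name__ == "__main__":
    skewed = [i * i for i in range(1, 101)]  # a right-skewed toy sample
    print(equal_tailed_interval(skewed), hpd_interval(skewed))
```

For a right-skewed set of draws the HPD interval shifts toward the denser left side and is no wider than the equal-tailed interval with the same coverage, which is why it tends to give shorter expected lengths.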
2.3.2. Uniform Prior Distribution
The non-informative uniform prior can be applied as follows:
and
A sample from a joint posterior distribution can be acquired via Gibbs sampling. For a Weibull distribution, Khan and Ahmed [
43] used the R2jags package in R, an interface to the JAGS Gibbs sampler, to summarize the posterior inference. They specified the model for a two-parameter Weibull distribution and provided the code for generating an MCMC sample.
Model specification:
cat("model{
  # Likelihood
  for (i in 1:length(x)){
    p[i] <- dweib(x[i], shape, theta);
    ones[i] ~ dbern(p[i]);
  }
  # Priors
  shape ~ dunif(0, 4)
  scale ~ dunif(0, 100)
  # theta = (1/scale)^shape, the transformed scale parameter
  theta <- pow(1/(scale), shape)
}", file = "weibullmodel.txt"),
where theta is a transformation of the scale parameter into another form. The R2jags function and the arguments used for fitting the parameters of a Weibull distribution are obtained from Su and Yajima [44] as follows:
jags.fit <- jags(data, inits, parameters.to.save, n.iter = 20000, model.file = "weibullmodel.txt", n.burnin = 1000)
From the sample obtained via R2jags (denoted as ${\widehat{\beta}}^{\left(j\right)}$), the $100(1-\alpha )\%$ Bayesian equal-tailed and HPD intervals based on the uniform prior for the ratio of the CVs of two Weibull distributions are given by
$$[{L}_{uni.\beta},{U}_{uni.\beta}], \tag{32}$$
where ${L}_{uni.\beta}$ and ${U}_{uni.\beta}$ are the lower and upper bounds.
3. Simulation Results
Using the R statistical program, the performances of the confidence interval estimators were compared in terms of their coverage probability (CP) and expected length (EL). The best-performing method in each scenario had a CP greater than or equal to the nominal confidence level and the shortest EL.
For the simulation study, the sample sizes were $({n}_{1},{n}_{2})$ = $(10,10)$, $(10,30)$, $(10,50)$, $(30,30)$, $(30,50)$, $(50,50)$, $(50,100)$, or $(100,100)$; the scale parameters were ${a}_{1}={a}_{2}=2$; and the shape parameters were ${k}_{1}=1$ and ${k}_{2}$ = 0.5, 1, 2, or 4, giving ratios of the CVs of 0.4472, 1, 1.9130, or 3.5645, respectively. Each scenario was replicated $M=$ 5000 times, with $m=$ 2500 for the GCI method. Furthermore, $T=$ 20,000 realizations of MCMC were generated using the Gibbs algorithm with a burn-in of 1000. The nominal confidence level was set as 0.95.
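The ratios of the CVs quoted for these settings follow directly from the shape parameters, since the Weibull CV does not depend on the scale. The short Python check below is an illustrative sketch with hypothetical function names (`weibull_cv`, `cv_ratio`); it evaluates the ratio for ${k}_{1}=1$ and each value of ${k}_{2}$, and the results agree with the quoted values to about three decimal places.

```python
import math


def weibull_cv(k):
    """CV of a Weibull distribution with shape k (free of the scale)."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return math.sqrt(g2 / g1 ** 2 - 1.0)


def cv_ratio(k1, k2):
    """beta = CV(X1) / CV(X2) for shapes k1 and k2."""
    return weibull_cv(k1) / weibull_cv(k2)


if __name__ == "__main__":
    for k2 in (0.5, 1.0, 2.0, 4.0):
        print(k2, cv_ratio(1.0, k2))
```

Because a larger shape parameter means a smaller CV, increasing ${k}_{2}$ with ${k}_{1}$ fixed increases the ratio.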
Algorithm 1 was used to obtain the CP and the EL of the confidence interval estimates.
Algorithm 1: The CP and EL of the confidence interval estimates for the ratio of the CVs

Set $M,m,T,{n}_{1},{n}_{2},{a}_{1},{a}_{2},{k}_{1},$ and ${k}_{2}$.
1. Generate ${X}_{i}=({X}_{i1},{X}_{i2},\dots ,{X}_{i{n}_{i}})$ from $Weibull({a}_{i},{k}_{i})$.
2. Construct the generalized confidence interval $({L}_{gci.\beta},{U}_{gci.\beta})$ from Equation (15).
3. Construct the MOVER confidence interval $({L}_{mover.\beta},{U}_{mover.\beta})$ from Equation (21).
4. Construct the equal-tailed and HPD intervals based on the Bayesian method using the gamma prior distribution $({L}_{gamma.\beta},{U}_{gamma.\beta})$ from Equation (29).
5. Construct the equal-tailed and HPD intervals based on the Bayesian method using the uniform prior distribution $({L}_{uni.\beta},{U}_{uni.\beta})$ from Equation (32).
6. If $L\le \beta \le U$, set $P=1$; otherwise, set $P=0$.
7. Repeat steps 1–6 $M$ times.
8. Determine the average of $P$ for the CP.
9. Determine the average of $(U-L)$ for the EL.
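Algorithm 1 can be sketched in Python for a single method. The code below is an illustration only: it evaluates the GCI alone, uses far fewer replications than the study, and relies on hypothetical function names. As a shortcut that is an assumption of this sketch, the pivotal shape estimates from $Weibull(1,1)$ are simulated once per sample size and reused in every replication, because they do not depend on the data.

```python
import math
import random


def shape_mle(xs, iters=60):
    """Bisection solve of the Weibull likelihood equation for the shape."""
    logs = [math.log(x) for x in xs]
    mean_log, top = sum(logs) / len(logs), max(logs)

    def score(k):
        w = [math.exp(k * (l - top)) for l in logs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def cv_from_shape(k):
    """Weibull CV as a function of the shape parameter."""
    log_ratio = math.lgamma(1.0 + 2.0 / k) - 2.0 * math.lgamma(1.0 + 1.0 / k)
    return math.sqrt(math.exp(log_ratio) - 1.0)


def simulate_cp_el(n1, n2, k1, k2, a1=2.0, a2=2.0, M=200, m=500,
                   alpha=0.05, seed=0):
    """Monte Carlo estimates of the CP and EL of the GCI for beta."""
    rng = random.Random(seed)
    beta = cv_from_shape(k1) / cv_from_shape(k2)  # true ratio of the CVs
    # pivots: shape MLEs from Weibull(1, 1) samples, simulated once and reused
    star1 = [shape_mle([rng.weibullvariate(1.0, 1.0) for _ in range(n1)])
             for _ in range(m)]
    star2 = [shape_mle([rng.weibullvariate(1.0, 1.0) for _ in range(n2)])
             for _ in range(m)]
    lo_idx, hi_idx = int(m * alpha / 2), int(m * (1 - alpha / 2)) - 1
    hits, total_len = 0, 0.0
    for _ in range(M):
        k1_obs = shape_mle([rng.weibullvariate(a1, k1) for _ in range(n1)])
        k2_obs = shape_mle([rng.weibullvariate(a2, k2) for _ in range(n2)])
        r_beta = sorted(
            cv_from_shape(k1_obs / s1) / cv_from_shape(k2_obs / s2)
            for s1, s2 in zip(star1, star2)
        )
        lower, upper = r_beta[lo_idx], r_beta[hi_idx]
        hits += lower <= beta <= upper  # P = 1 when the interval covers beta
        total_len += upper - lower
    return hits / M, total_len / M  # (CP, EL)
```

A call such as `simulate_cp_el(10, 10, 1.0, 2.0)` returns the estimated CP and EL for one scenario; the other interval methods would plug into the same loop in place of the GCI limits.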

The simulation results for
${n}_{1}={n}_{2}$ in
Table 1 show that the GCI method yielded CPs higher than or close to the nominal confidence level of 0.95 in all cases, whereas the CPs of the MOVER method were below the nominal level, except for
$({n}_{1},{n}_{2})$ = (10, 10). Of the Bayesian methods, the HPD interval based on the uniform prior outperformed the others for
${k}_{2}$ = 1 or 2 because its CPs were greater than the target. Furthermore, the CPs of the HPD interval based on the gamma prior were above or close to 0.95 in most cases. In particular, for
$({n}_{1},{n}_{2})$ = (100, 100) and
${k}_{2}$ = 0.5, it also provided the shortest EL. The Bayesian equal-tailed confidence intervals based on the gamma and uniform priors yielded CPs close to the target, except for the one based on the uniform prior when
${k}_{2}$ = 0.5 or 4. The results for
${n}_{1}\ne {n}_{2}$ in
Table 2 were similar to those for
${n}_{1}={n}_{2}$ in
Table 1 in that the Bayesian methods based on the uniform prior provided CPs greater than 0.95 and the shortest ELs in most cases. The CPs obtained with the equal-tailed confidence interval satisfied the target for
$({n}_{1},{n}_{2})$ = (10, 30) and (10, 50) and
${k}_{2}$ = 1 or 2;
$({n}_{1},{n}_{2})$ = (30, 50), and
${k}_{2}$ = 2. Meanwhile, the HPD interval obtained CPs that satisfied the target for
$({n}_{1},{n}_{2})$ = (30, 50) and
${k}_{2}$ = 1, and
$({n}_{1},{n}_{2})$ = (50, 100) and
${k}_{2}$ = 1 or 2. Finally, the CPs and ELs of the methods in
Table 1 and
Table 2 are summarized in
Figure 1 and
Figure 2.