Inference about the Ratio of the Coefficients of Variation of Two Independent Symmetric or Asymmetric Populations

Abstract: The coefficient of variation (CV) is a simple but useful statistical tool for comparing independent populations in many research areas. In this study, we first propose the asymptotic distribution of the ratio of the CVs of two separate symmetric or asymmetric populations. Then, we derive the asymptotic confidence interval and test statistic for hypothesis testing about the ratio of the CVs of these populations. Finally, the performance of the introduced approach is studied through a simulation study.


Introduction
According to the literature, three main characteristics are used to describe a dataset (random variable): central tendency, dispersion and shape. A measure of central tendency is a central or typical value around which the values of the random variable cluster. It may also be called the center or location of the distribution. The most common measures of central tendency are the mean, the median and the mode. Measures of dispersion, such as the range, variance and standard deviation, describe the spread of the values of a random variable; dispersion may also be called the scale of the distribution. Shape measures, such as skewness and kurtosis, describe the form (or pattern) of the distribution of the random variable.
The ratio of the standard deviation to the mean of the population, CV = σ/µ, is called the coefficient of variation (CV) and is a useful statistic for evaluating relative variability. This dimensionless parameter is widely used as an index of reliability or variability in many applied sciences such as agriculture, biology, engineering, finance, medicine, and many others [1][2][3]. Since it is often necessary to relate the standard deviation to the level of the measurements, the CV is a widely used measure of dispersion. CVs are often calculated on samples from several independent populations, and questions about how to compare them naturally arise, especially when the distributions of the populations are skewed. In real world applications, researchers may intend to compare the CVs of two separate populations to understand the structure of the data. ANOVA and Levene tests can be used to investigate the equality of the CVs of populations when the means or variances of the populations are equal. However, in many situations two populations with different means and variances may have an equal CV. For the normal case, the problems of interval estimation of the CV and of comparing two or several CVs have been well addressed in the literature. Because the difference between two small CVs may itself be very small and has no strong interpretation, the ratio of CVs is a more accurate basis for comparison than the difference of CVs. Bennett [4] proposed a likelihood ratio test. Doornbos and Dijkstra [5] and Hedges and Olkin [6] presented two tests based on the non-central t test. A modification of Bennett's method was provided by Shafer and Sullivan [7]. Wald tests have been introduced by [8][9][10]. Based on Renyi's divergence, Pardo and Pardo [11] proposed a new method. Nairy and Rao [12] applied the likelihood ratio, score test and Wald test to check that the inverse CVs are equal. Verrill and Johnson [13] applied one-step Newton estimators to establish a likelihood ratio test.
Jafari and Kazemi [14] developed a parametric bootstrap (PB) approach. Some statisticians improved these tests for symmetric distributions [15][16][17][18][19][20][21][22][23]. The problem of comparing two or more CVs arises in many practical situations [24][25][26]. Nam and Kwon [25] developed approximate interval estimation of the ratio of two CVs for lognormal distributions by using the Wald-type, Fieller-type, log methods, and the method of variance estimates recovery (MOVER). Wong and Jiang [26] proposed a simulated Bartlett corrected likelihood ratio approach to obtain inference concerning the ratio of two CVs for lognormal distribution.
In applications, it is usually assumed that the data follow symmetric distributions. For this reason, most previous works have focused on the comparison of CVs in symmetric distributions, especially normal distributions. In this paper, we propose a method to compare the CVs of two separate symmetric or asymmetric populations. Firstly, we propose the asymptotic distribution for the ratio of the CVs. Then, we derive the asymptotic confidence interval and test statistic for hypothesis testing about the ratio of the CVs. Finally, the performance of the introduced approach is studied through a simulation study. The introduced approach seems to have several advantages. First, it is powerful. Second, it is not computationally demanding. Third, it can be applied to compare the CVs of two separate symmetric or asymmetric populations. We apply a methodology similar to that which has been used in [27][28][29][30][31][32][33]. The comparison between the parameters of two datasets or models has been considered in several works [34][35][36][37][38][39][40].

Asymptotic Results
Assume that $X$ and $Y$ are uncorrelated variables with non-zero means $\mu_X$ and $\mu_Y$, and finite $i$-th central moments
$$\mu_{iX} = E(X - \mu_X)^i, \qquad \mu_{iY} = E(Y - \mu_Y)^i, \qquad i \in \{2, 3, 4\},$$
respectively. Also assume two samples $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$, drawn from $X$ and $Y$, respectively. From the motivation given in the introduction, the parameter
$$\gamma = \frac{CV_Y}{CV_X}$$
is of interest for inference, where $CV_Y$ and $CV_X$ are the CVs corresponding to $Y$ and $X$, respectively. The ratio of the two sample CVs,
$$\hat{\gamma} = \frac{\widehat{CV}_Y}{\widehat{CV}_X},$$
can reasonably estimate the parameter $\gamma$. For simplicity, let $m = n$. When $m \neq n$, use $n^* = \min(m, n)$ in place of $m$ and $n$ in the following discussions.
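As a minimal sketch of the point estimate described above, the following Python snippet computes the sample CVs and their ratio. The function names and the example data are hypothetical, and the use of the 1/n second central moment for the sample standard deviation is an assumption (the 1/(n-1) version is asymptotically equivalent).

```python
import math

def sample_cv(sample):
    """Sample CV: square root of the second sample central moment over the mean."""
    n = len(sample)
    mean = sum(sample) / n
    if mean == 0:
        raise ValueError("the CV is undefined for a zero mean")
    second_moment = sum((v - mean) ** 2 for v in sample) / n
    return math.sqrt(second_moment) / mean

def cv_ratio(x, y):
    """Point estimate of gamma = CV_Y / CV_X."""
    return sample_cv(y) / sample_cv(x)

def effective_sample_size(x, y):
    """n* = min(m, n), used in place of m and n when the sizes differ."""
    return min(len(x), len(y))

# Hypothetical samples, for illustration only.
x = [9.0, 10.0, 11.0, 12.0, 8.0]
y = [18.0, 25.0, 15.0, 22.0, 20.0, 20.0]
print(cv_ratio(x, y), effective_sample_size(x, y))
```

Rescaling a sample leaves its CV unchanged, so `cv_ratio(x, [2 * v for v in x])` is 1 up to floating-point error, which illustrates that populations with different means and variances can share the same CV.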

Lemma 1.
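If the above assumptions are satisfied, then
$$\sqrt{n}\left(\widehat{CV}_X - CV_X\right) \xrightarrow{L} N\left(0, \delta_X^2\right),$$
where
$$\delta_X^2 = \frac{\mu_{4X} - \mu_{2X}^2}{4\mu_X^2\mu_{2X}} - \frac{\mu_{3X}}{\mu_X^3} + \frac{\mu_{2X}^2}{\mu_X^4}$$
is the asymptotic variance.
Proof. An outline of the proof can be found in [41].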
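The asymptotic variance of the sample CV can be estimated by plugging sample moments into the standard delta-method expression, δ² = (µ₄ − µ₂²)/(4µ²µ₂) − µ₃/µ³ + µ₂²/µ⁴, where µ is the mean and µ₂, µ₃, µ₄ are the central moments. The sketch below assumes that this is the form intended by Lemma 1; the function names are hypothetical and the 1/n sample moments are an assumption.

```python
def central_moment(sample, order):
    """Sample central moment of the given order (1/n version)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((v - mean) ** order for v in sample) / n

def cv_asymptotic_variance(sample):
    """Plug-in estimate of the asymptotic variance of sqrt(n) * (sample CV - CV)."""
    mean = sum(sample) / len(sample)
    m2 = central_moment(sample, 2)
    m3 = central_moment(sample, 3)
    m4 = central_moment(sample, 4)
    return (m4 - m2**2) / (4 * mean**2 * m2) - m3 / mean**3 + m2**2 / mean**4

# Tiny hypothetical sample: mean 2, m2 = 2/3, m3 = 0, m4 = 2/3.
print(cv_asymptotic_variance([1.0, 2.0, 3.0]))
```

For the three-point sample the three terms are 1/48, 0 and 1/36, so the estimate equals 7/144.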
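The next theorem gives the asymptotic distribution of $\hat{\gamma}$. This theorem will be applied to construct the confidence interval and perform hypothesis testing for the parameter $\gamma$.
Theorem 1. If the previous assumptions are satisfied, then
$$\sqrt{n}\left(\hat{\gamma} - \gamma\right) \xrightarrow{L} N\left(0, \lambda^2\right),$$
where
$$\lambda^2 = \frac{1}{CV_X^2}\left(\gamma^2\delta_X^2 + \delta_Y^2\right),$$
and
$$\delta_Y^2 = \frac{\mu_{4Y} - \mu_{2Y}^2}{4\mu_Y^2\mu_{2Y}} - \frac{\mu_{3Y}}{\mu_Y^3} + \frac{\mu_{2Y}^2}{\mu_Y^4},$$
with $\mu_{iY}$ the $i$-th central moment of $Y$.
Proof. By using Lemma 1, we have
$$\sqrt{n}\left(\widehat{CV}_X - CV_X\right) \xrightarrow{L} N\left(0, \delta_X^2\right), \qquad \sqrt{n}\left(\widehat{CV}_Y - CV_Y\right) \xrightarrow{L} N\left(0, \delta_Y^2\right).$$
Slutsky's Theorem gives
$$\sqrt{n}\left(\begin{pmatrix}\widehat{CV}_X \\ \widehat{CV}_Y\end{pmatrix} - \begin{pmatrix}CV_X \\ CV_Y\end{pmatrix}\right) \xrightarrow{L} N_2\left(0, \Sigma\right), \qquad \Sigma = \begin{pmatrix}\delta_X^2 & 0 \\ 0 & \delta_Y^2\end{pmatrix},$$
for independent samples [41]. Now define the function $f(x, y) = y/x$. Then we have
$$\nabla f(x, y) = \left(-\frac{y}{x^2}, \frac{1}{x}\right)^{\top}.$$
Because $\nabla f$ is continuous in a neighbourhood of $(CV_X, CV_Y)$, Cramér's theorem (the delta method) gives
$$\sqrt{n}\left(f(\widehat{CV}_X, \widehat{CV}_Y) - f(CV_X, CV_Y)\right) \xrightarrow{L} N\left(0, \nabla f^{\top}\Sigma\,\nabla f\right) = N\left(0, \lambda^2\right),$$
where $\nabla f$ is evaluated at $(CV_X, CV_Y)$, and the proof ends.
Thus, the asymptotic distribution can be constructed as:
$$T_n = \frac{\sqrt{n}\left(\hat{\gamma} - \gamma\right)}{\lambda} \xrightarrow{L} N(0, 1).$$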
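The variance of the limiting distribution of the ratio can be worked out explicitly from the delta method. The following is a sketch under the assumption that the two sample CVs are independent with asymptotic variances $\delta_X^2$ and $\delta_Y^2$, that $f(x, y) = y/x$, and that $\gamma = CV_Y / CV_X$; the symbol $\Sigma$ for the diagonal covariance matrix is introduced here for illustration.

```latex
\[
\nabla f(CV_X, CV_Y) = \left(-\frac{CV_Y}{CV_X^{2}},\ \frac{1}{CV_X}\right)^{\top},
\qquad
\Sigma = \begin{pmatrix} \delta_X^{2} & 0 \\ 0 & \delta_Y^{2} \end{pmatrix},
\]
\[
\nabla f^{\top}\,\Sigma\,\nabla f
 = \frac{CV_Y^{2}}{CV_X^{4}}\,\delta_X^{2} + \frac{1}{CV_X^{2}}\,\delta_Y^{2}
 = \frac{1}{CV_X^{2}}\left(\gamma^{2}\delta_X^{2} + \delta_Y^{2}\right).
\]
```

The result depends only on $CV_X$, $\delta_X^2$, $\delta_Y^2$ and $\gamma$, which matches the dependence stated for the asymptotic standard deviation of the ratio.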

Constructing the Confidence Interval
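As can be seen, the parameter $\lambda$ depends on $CV_X$, $\delta_X^2$, $\delta_Y^2$ and $\gamma$, which are unknown in practice. The result of the next theorem can be applied to construct the confidence interval and to perform the hypothesis testing for the parameter $\gamma$.
Theorem 2. If the previous assumptions are satisfied, then
$$T_n^* = \frac{\sqrt{n}\left(\hat{\gamma} - \gamma\right)}{\hat{\lambda}} \xrightarrow{L} N(0, 1),$$
where
$$\hat{\lambda}^2 = \frac{1}{\widehat{CV}_X^2}\left(\hat{\gamma}^2\hat{\delta}_X^2 + \hat{\delta}_Y^2\right),$$
and $\hat{\delta}_X^2$ and $\hat{\delta}_Y^2$ are obtained from $\delta_X^2$ and $\delta_Y^2$ by replacing the population means and central moments with their sample counterparts.
Proof. From the Weak Law of Large Numbers, it is known that the sample means and the sample central moments converge in probability to the corresponding population quantities as $n \to \infty$. Consequently, by applying Slutsky's Theorem, we have $\hat{\lambda} \xrightarrow{p} \lambda$ as $n \to \infty$. Applying Theorem 1, the proof is completed.
Now, $T_n^*$ is a pivotal quantity for $\gamma$. In the following, this pivotal quantity is used to construct an asymptotic $100(1 - \alpha)\%$ confidence interval for $\gamma$:
$$\left(\hat{\gamma} - z_{\alpha/2}\frac{\hat{\lambda}}{\sqrt{n}},\ \hat{\gamma} + z_{\alpha/2}\frac{\hat{\lambda}}{\sqrt{n}}\right), \qquad (3)$$
where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution.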
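A Wald-type interval built from the pivotal quantity can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the interval has the form γ̂ ± z·λ̂/√n with λ̂² = (γ̂²δ̂²_X + δ̂²_Y)/ĈV²_X, uses 1/n sample moments, takes n = min(m, n), and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def cv_and_delta2(sample):
    """Return the sample CV and the plug-in asymptotic variance of the sample CV."""
    n = len(sample)
    mean = sum(sample) / n
    m2, m3, m4 = (sum((v - mean) ** k for v in sample) / n for k in (2, 3, 4))
    cv = math.sqrt(m2) / mean
    delta2 = (m4 - m2**2) / (4 * mean**2 * m2) - m3 / mean**3 + m2**2 / mean**4
    return cv, delta2

def cv_ratio_confidence_interval(x, y, alpha=0.05):
    """Asymptotic (1 - alpha) interval for gamma = CV_Y / CV_X.

    Returns (lower, estimate, upper).
    """
    n = min(len(x), len(y))  # n* when the sample sizes differ
    cv_x, d2_x = cv_and_delta2(x)
    cv_y, d2_y = cv_and_delta2(y)
    gamma_hat = cv_y / cv_x
    lam_hat = math.sqrt((gamma_hat**2 * d2_x + d2_y) / cv_x**2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * lam_hat / math.sqrt(n)
    return gamma_hat - half_width, gamma_hat, gamma_hat + half_width

# Hypothetical data, for illustration only.
x = [9.0, 10.0, 11.0, 12.0, 8.0, 10.5, 9.5, 11.5]
y = [18.0, 25.0, 15.0, 22.0, 20.0, 27.0, 13.0, 20.0]
print(cv_ratio_confidence_interval(x, y))
```

With samples this small the interval is only indicative, since the result is asymptotic.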

Hypothesis Testing
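In real world applications, researchers are interested in testing hypotheses about the parameter $\gamma$. For example, the null hypothesis $H_0: \gamma = 1$ means that the CVs of the two populations are equal. To perform the hypothesis test $H_0: \gamma = \gamma_0$, the test statistic
$$T_0 = \frac{\sqrt{n}\left(\hat{\gamma} - \gamma_0\right)}{\hat{\lambda}_0} \qquad (4)$$
is generally applied, such that
$$\hat{\lambda}_0^2 = \frac{1}{\widehat{CV}_X^2}\left(\gamma_0^2\hat{\delta}_X^2 + \hat{\delta}_Y^2\right).$$
If the null hypothesis $H_0: \gamma = \gamma_0$ is satisfied, then the asymptotic distribution of $T_0$ is standard normal.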
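The test can be sketched in a few lines of Python. The sketch assumes a statistic of the form T₀ = √n(γ̂ − γ₀)/λ̂₀, where λ̂₀² = (γ₀²δ̂²_X + δ̂²_Y)/ĈV²_X is evaluated at the hypothesized ratio, and a two-sided alternative; the function names, the two-sided p-value and the 1/n sample moments are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def cv_and_delta2(sample):
    """Return the sample CV and the plug-in asymptotic variance of the sample CV."""
    n = len(sample)
    mean = sum(sample) / n
    m2, m3, m4 = (sum((v - mean) ** k for v in sample) / n for k in (2, 3, 4))
    cv = math.sqrt(m2) / mean
    delta2 = (m4 - m2**2) / (4 * mean**2 * m2) - m3 / mean**3 + m2**2 / mean**4
    return cv, delta2

def cv_ratio_test(x, y, gamma0=1.0):
    """Two-sided asymptotic test of H0: CV_Y / CV_X = gamma0.

    Returns (test statistic, p-value).
    """
    n = min(len(x), len(y))
    cv_x, d2_x = cv_and_delta2(x)
    cv_y, d2_y = cv_and_delta2(y)
    gamma_hat = cv_y / cv_x
    # Standard deviation estimate evaluated under the null value gamma0.
    lam0 = math.sqrt((gamma0**2 * d2_x + d2_y) / cv_x**2)
    t0 = math.sqrt(n) * (gamma_hat - gamma0) / lam0
    p_value = 2 * (1 - NormalDist().cdf(abs(t0)))
    return t0, p_value

# Hypothetical data, for illustration only.
x = [9.0, 10.0, 11.0, 12.0, 8.0, 10.5, 9.5, 11.5]
y = [18.0, 25.0, 15.0, 22.0, 20.0, 27.0, 13.0, 20.0]
print(cv_ratio_test(x, y, gamma0=1.0))
```

The null hypothesis is rejected at level α when the returned p-value is below α, which is equivalent to |T₀| exceeding the upper α/2 standard normal quantile.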

Normal Populations
Many natural phenomena follow a normal distribution, which is very important in the natural and social sciences. Many researchers have focused on the comparison between the CVs of two independent normal distributions. Nairy and Rao [12] reviewed and studied several methods, such as the likelihood ratio test, score test and Wald test, that could be used to compare the CVs of two independent normal distributions. If the parent distributions $X$ and $Y$ are normal, then
$$\mu_{3X} = \mu_{3Y} = 0, \qquad \mu_{4X} = 3\mu_{2X}^2, \qquad \mu_{4Y} = 3\mu_{2Y}^2.$$
Consequently, for normal distributions, $\delta_X^2$ and $\hat{\delta}_X^2$ can be rewritten as
$$\delta_X^2 = CV_X^4 + \frac{CV_X^2}{2}, \qquad \hat{\delta}_X^2 = \widehat{CV}_X^4 + \frac{\widehat{CV}_X^2}{2},$$
respectively.
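The simplification for normal parents can be checked numerically. The sketch below assumes the general delta-method variance δ² = (µ₄ − µ₂²)/(4µ²µ₂) − µ₃/µ³ + µ₂²/µ⁴ and substitutes the normal moments µ₃ = 0 and µ₄ = 3σ⁴; the function names and the values µ = 10, σ = 2 are hypothetical.

```python
def delta2_general(mean, m2, m3, m4):
    """Asymptotic variance of the sample CV from the mean and central moments."""
    return (m4 - m2**2) / (4 * mean**2 * m2) - m3 / mean**3 + m2**2 / mean**4

def delta2_normal(cv):
    """Closed form for a normal parent: CV^4 + CV^2 / 2."""
    return cv**4 + cv**2 / 2

mu, sigma = 10.0, 2.0  # hypothetical normal population, CV = 0.2
general = delta2_general(mu, sigma**2, 0.0, 3 * sigma**4)
closed_form = delta2_normal(sigma / mu)
print(general, closed_form)
```

Both expressions evaluate to 0.0216 (up to floating-point rounding) for this choice of µ and σ, since 32/1600 + 16/10000 = 0.04/2 + 0.0016.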

Simulation Study
In this section, the accuracy of the given theoretical results is studied using different simulated datasets. For the populations $X$ and $Y$, we simulated samples from a symmetric distribution (normal) and from asymmetric distributions (gamma and beta) with different CV values, $(CV_X, CV_Y) \in \{(1, 1), (1, 2), (2, 3), (2, 5)\}$, which are equivalent to $\gamma \in \{1, 2, 1.5, 2.5\}$. Figures 1-3 show the plots of the probability density function (PDF) for the considered distributions.
To check the accuracy of Equations (3) and (4), we estimated the coverage probability (CP), that is, the proportion of simulated confidence intervals that contained the true $\gamma$, for each parameter setting. We also computed the value of the test statistic in Equation (4) for each run. Then we used the Shapiro-Wilk normality test and Q-Q plots to verify the normality assumption for the proposed test statistic. As Table 1 indicates, the CPs are very close to the nominal level ($1 - \alpha = 0.95$), especially as the sample size increases, and consequently the proposed method controls the type I error.
In other words, about 95% of the simulated confidence intervals contained the true $\gamma$, and consequently it can be accepted that Equation (3) is an asymptotic confidence interval for $\gamma$. The CPU times (in seconds) for the different parameter settings, given in Table 2, verify that this approach is not too time consuming. Furthermore, Figure 4 and Table 3 present the Q-Q plots and the p-values of the Shapiro-Wilk test, respectively, to study the normality of the introduced test statistic. Table 3 indicates that all p-values are greater than 0.05, so the Shapiro-Wilk test does not reject the normality of the proposed test statistic. The same conclusion can be drawn from the Q-Q plots: since the points form almost a straight line, the observed quantiles are very similar to the quantiles of the theoretical (normal) distribution. Therefore, the simulation results indicate that the asymptotic theoretical results are satisfactory for all parameter settings. Consequently, our proposed approach is a good choice for performing hypothesis tests and establishing a confidence interval for the ratio of the CVs of two separate populations.
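The coverage experiment can be imitated on a small scale with the standard library. This is a sketch under stated assumptions rather than a reproduction of the study: it uses gamma populations only (for which the CV is 1/√shape), hypothetical shape parameters, sample size, number of runs and seed, and assumes the interval γ̂ ± z·λ̂/√n with plug-in 1/n sample moments.

```python
import math
import random
from statistics import NormalDist

def cv_and_delta2(sample):
    """Return the sample CV and the plug-in asymptotic variance of the sample CV."""
    n = len(sample)
    mean = sum(sample) / n
    m2, m3, m4 = (sum((v - mean) ** k for v in sample) / n for k in (2, 3, 4))
    cv = math.sqrt(m2) / mean
    delta2 = (m4 - m2**2) / (4 * mean**2 * m2) - m3 / mean**3 + m2**2 / mean**4
    return cv, delta2

def interval(x, y, alpha=0.05):
    """Asymptotic (1 - alpha) confidence interval for gamma = CV_Y / CV_X."""
    n = min(len(x), len(y))
    cv_x, d2_x = cv_and_delta2(x)
    cv_y, d2_y = cv_and_delta2(y)
    gamma_hat = cv_y / cv_x
    lam_hat = math.sqrt((gamma_hat**2 * d2_x + d2_y) / cv_x**2)
    half = NormalDist().inv_cdf(1 - alpha / 2) * lam_hat / math.sqrt(n)
    return gamma_hat - half, gamma_hat + half

def coverage_probability(shape_x, shape_y, n, runs, seed=0):
    """Proportion of simulated intervals that contain the true gamma."""
    rng = random.Random(seed)
    # The CV of a gamma distribution is 1 / sqrt(shape), so gamma = sqrt(shape_x / shape_y).
    true_gamma = math.sqrt(shape_x / shape_y)
    hits = 0
    for _ in range(runs):
        x = [rng.gammavariate(shape_x, 1.0) for _ in range(n)]
        y = [rng.gammavariate(shape_y, 1.0) for _ in range(n)]
        lo, hi = interval(x, y)
        hits += lo <= true_gamma <= hi
    return hits / runs

# Hypothetical setting: CV_X = 0.25, CV_Y = 0.5, so gamma = 2.
cp = coverage_probability(shape_x=16.0, shape_y=4.0, n=200, runs=500)
print(f"estimated coverage probability: {cp:.3f}")
```

Increasing `n` should move the estimate closer to the nominal 0.95, in line with the asymptotic nature of the result; the number of runs controls the Monte Carlo error of the estimate itself.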

Conclusions
The coefficient of variation is a simple but useful statistical tool for comparing independent populations. In many situations two populations with different means and variances may have equal CVs. In real world applications, researchers may intend to study the similarity of the CVs of two separate populations to understand the structure of the data. Because the difference between two small CVs may itself be very small and has no strong interpretation, the ratio of CVs is a more accurate basis for comparison than the difference of the CVs. In this study, we proposed the asymptotic distribution, derived the asymptotic confidence interval and established hypothesis testing for the ratio of the CVs of two separate populations. The results indicated that the coverage probabilities are very close to the nominal level, especially as the sample sizes increase, and consequently the proposed method controlled the type I error. The CPU times also verified that this approach is not too time consuming. The Shapiro-Wilk normality test and Q-Q plots supported the normality of the proposed test statistic. Overall, the asymptotic approximations were satisfactory for all simulated datasets, and the introduced technique performed well in constructing confidence intervals and performing hypothesis tests.