Abstract
One of the most accessible and useful statistical tools for comparing independent populations in different research areas is the coefficient of variation (CV). In this study, first, the asymptotic distribution of the ratio of the CVs of two uncorrelated populations is investigated. Then, the results are used to construct a confidence interval and to establish a hypothesis test for the CV ratio of the populations. In simulations, the proposed approach compares favorably with an alternative method in terms of coverage probability, interval length and power.
MSC:
62E20; 62P12; 97K50
1. Introduction
According to the literature, three fundamental kinds of measures are used to describe a dataset (random variable): measures of central tendency, dispersion and shape. A measure of central tendency indicates the value around which a random variable is concentrated; the mean, median and mode are the most commonly used. Criteria such as the range, variance and standard deviation measure the dispersion of a random variable, which some authors call the scale of its distribution. Finally, statisticians often need to know how a random variable is distributed, that is, the shape of its distribution, which can be described by measures such as kurtosis and skewness.
The coefficient of variation (CV) is obtained by dividing the population standard deviation by the population mean, CV = σ/μ, and is a suitable statistic for evaluating relative variability. The CV is a unit-free measure that is used in many areas, such as agronomy, biology, engineering, finance, medicine and others, as an indicator of reliability or variability [1,2,3]. In many cases, relating the standard deviation to the level of the measurements is of great importance to researchers. For this reason, the CV is widely used to measure dispersion. When studying several independent populations, knowing how their CVs compare is essential. This becomes even more important when populations have skewed distributions. In practice, statisticians may be interested in comparing the CVs of two independent populations to better understand the data structure. When the populations share a common variance, equality of the CVs is equivalent to equality of the means, and when they share a common mean, it is equivalent to equality of the variances; in these situations, the analysis of variance and Levene’s test, respectively, can be employed to examine the equality of CVs. However, populations with different means and variances may still have equal CVs. There have been many studies comparing two or more CVs, which we discuss below. When two CVs are small, the difference between them is also small, which makes it difficult to draw powerful or definite conclusions from it. Therefore, the CV ratio is more informative than the CV difference.
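The two points made in this paragraph can be illustrated with a minimal Python sketch. The data and the function name `coefficient_of_variation` are hypothetical and chosen only for illustration: the first part shows two samples with different means and variances but equal CVs, and the second shows why a ratio can be more informative than a difference when both CVs are small.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV: sample standard deviation divided by the sample mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical data: b is a rescaled copy of a, so the means and variances
# differ while the CVs coincide.
a = [8.0, 9.0, 10.0, 11.0, 12.0]
b = [10.0 * v for v in a]
cv_a = coefficient_of_variation(a)
cv_b = coefficient_of_variation(b)
print(round(cv_a, 4), round(cv_b, 4))

# Two small (hypothetical) CVs: the difference looks negligible,
# whereas the ratio shows that one CV is twice the other.
cv_small_1, cv_small_2 = 0.01, 0.02
print(cv_small_2 - cv_small_1, cv_small_2 / cv_small_1)
```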
The likelihood ratio test was introduced by Bennett [4], and later modified by Shafer and Sullivan [5]. Doornbos and Dijkstra [6] and Hedges and Olkin [7] proposed tests that rely on the noncentral t distribution. In [8,9,10], some methods were introduced which are known as Wald tests. Pardo and Pardo [11] proposed a new method for comparing CVs based on the Rényi divergence. Some researchers, such as Nairy and Rao [12], investigated the inverse CV. Verrill and Johnson [13] proposed a likelihood ratio test based on one-step Newton estimators. The parametric bootstrap (PB) approach was also considered by Jafari and Kazemi [14]. Other statisticians continued these studies and improved them for symmetric distributions, as presented in [15,16,17,18,19,20,21,22,23]. However, comparing two CVs and, in more complex cases, multiple CVs is challenging in practice [24,25]. An approximate interval for the ratio of two CVs in the lognormal distribution was obtained by Nam and Kwon [25], who used Wald-type, Fieller-type and log methods, as well as the method of variance estimates recovery (MOVER). Wong and Jiang [26] used a simulated Bartlett-corrected likelihood ratio method to make inferences about the ratio of two CVs in the lognormal distribution.
There are many phenomena in the real world that have symmetrical distributions. For this reason, most previous studies have focused on a special type of distribution, especially the normal distribution. Yue and Baleanu [27] made inferences about the CV ratio of two independent symmetric or asymmetric distributions. They studied the asymptotic distribution of the ratio of the two populations’ CVs and then used the results to construct a confidence interval and to test hypotheses about the CV ratio.
In this paper, we adopt a procedure similar to those employed in other studies [28,29,30,31,32] to improve the method given by Yue and Baleanu [27]. The estimator of the ratio of the CVs of two populations given by Yue and Baleanu is considered and its asymptotic distribution is investigated. In computing the asymptotic variance, a formulation different from that given by Yue and Baleanu [27] is proposed. Then, using the asymptotic distribution of the estimator, the confidence interval and the hypothesis test for the ratio of the CVs of two populations are studied. The new formulation leads to a more powerful test and a more accurate confidence interval than those in [27].
The paper is organized as follows: In Section 2, first, a reasonable estimator of the ratio of the CVs of two populations is given. Then, the asymptotic distribution of this estimator is considered. Finally, using the asymptotic distribution of the estimator, the confidence interval and the hypothesis test for the ratio of the CVs of two populations are studied. In Section 3, various simulated datasets are analyzed to verify the accuracy of the theoretical findings. The applicability of the proposed technique in real situations is studied in Section 4. Section 5 includes the concluding remarks.
2. Materials and Methods
2.1. Asymptotic Outcomes
Let us consider two uncorrelated variables X and Y, with non-zero means μ_X and μ_Y. Their finite central moments are, respectively:
Furthermore, suppose that X_1, …, X_m and Y_1, …, Y_n are independent and identically distributed samples from X and Y, respectively. According to what was mentioned in Section 1, given that CV_X and CV_Y are the CVs of X and Y, respectively, the parameter γ, the ratio of the two CVs,
is of interest for statistical inference.
Suppose that
then, CV_X and CV_Y are consistently estimated [27,33] by the corresponding sample CVs. Therefore, it follows that
is a reasonable estimator of the parameter γ. For convenience, can be defined. At can be used instead of m and n.
Lemma 1.
Under the assumptions stated above, we have
Proof.
The proof sketch is given in [33]. □
In the subsequent lemma, Yue and Baleanu [27] used the asymptotic distribution of the estimator of γ to build the confidence interval and test the hypothesis about γ.
Lemma 2.
Under the previous assumptions, we have
Proof of Lemma 2.
The proof sketch is given in [27]. □
It can be observed that the asymptotic variance depends on the unknown parameters , , and γ. Yue and Baleanu [27] estimated by
where
and
They used Slutsky’s theorem and proved that as .
In this work, we rewrite the asymptotic variance in an alternative form and re-estimate it using a more robust estimator. The asymptotic variance can be represented by
Therefore, we can construct the asymptotic distribution as
Theorem 1.
Under the previous assumptions, we have
Proof of Theorem 1.
Due to the weak law of large numbers, it is clear that
As a result, by using Slutsky’s theorem, we know that as . With the help of Lemma 2 the proof is completed.
2.1.1. Building the Confidence Interval
To construct an asymptotic confidence interval for γ, we need a pivotal quantity for the parameter γ. Based on the previous results, T is a pivotal quantity for γ. Therefore, T can be used for this purpose, as follows:
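As an illustration of how such an interval can be computed in practice, the following Python sketch builds a Wald-type interval of the form estimate ± z × standard error, using the standard delta-method variance of a ratio of sample CVs. It is not the variance formulation proposed in this paper, whose expression is not reproduced here. The direction γ = CV_X/CV_Y, the function names `cv_terms` and `ratio_cv_ci`, and the simulated data are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean


def cv_terms(sample):
    """Return (CV_hat, relative asymptotic variance of CV_hat).

    The second value is delta^2 / CV^2, where delta^2 is the delta-method
    variance of sqrt(n) * (CV_hat - CV):
        delta^2 / CV^2 = m2/mean^2 - m3/(mean*m2) + (m4 - m2^2)/(4*m2^2)
    with m2, m3, m4 the plug-in (divide-by-n) central moments.
    """
    n = len(sample)
    mean = fmean(sample)
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    cv = math.sqrt(m2) / mean
    rel_var = m2 / mean**2 - m3 / (mean * m2) + (m4 - m2**2) / (4 * m2**2)
    return cv, max(rel_var, 0.0)  # guard against tiny negative rounding errors


def ratio_cv_ci(x, y, level=0.95):
    """Wald-type interval for gamma = CV_X / CV_Y (direction assumed)."""
    cv_x, rel_x = cv_terms(x)
    cv_y, rel_y = cv_terms(y)
    gamma_hat = cv_x / cv_y
    # Delta method for a ratio of two independent estimators.
    se = abs(gamma_hat) * math.sqrt(rel_x / len(x) + rel_y / len(y))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return gamma_hat, gamma_hat - z * se, gamma_hat + z * se


# Hypothetical samples: CV_X = 0.2 and CV_Y = 0.1, so the true ratio is 2.
rng = random.Random(2022)
x = [rng.gauss(50.0, 10.0) for _ in range(200)]
y = [rng.gauss(20.0, 2.0) for _ in range(150)]
gamma_hat, lower, upper = ratio_cv_ci(x, y)
print(f"estimate={gamma_hat:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Because the interval is symmetric around the estimate, its midpoint equals the estimated ratio, which offers a quick consistency check on reported intervals.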
2.1.2. Hypothesis Testing
In practical situations, scientists need to test hypotheses about the parameter γ. As a particular instance, the null hypothesis γ = 1 states that the CVs are the same in both populations. To test the hypothesis H0: γ = γ0, the test statistic T0
can be generally applied.
Under the null hypothesis, the asymptotic distribution of T0 is standard normal.
It should be noted that this method is not tied to a particular family of distributions: because it relies on asymptotic arguments, it covers exponential and non-exponential distributions alike, provided that the moments assumed above are finite. Moreover, the method is also appropriate if the two populations have different distributions.
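A two-sided test of this kind can be sketched in Python as follows. The statistic is a Wald-type quantity, (estimate − γ0) divided by a delta-method standard error evaluated at the estimate; this is an assumed, generic formulation and not necessarily the variance expression proposed in this paper. The direction γ = CV_X/CV_Y and the names `cv_terms` and `ratio_cv_test` are likewise hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def cv_terms(sample):
    """Return (CV_hat, delta^2 / CV^2) from plug-in central moments."""
    n = len(sample)
    mean = fmean(sample)
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    cv = math.sqrt(m2) / mean
    rel_var = m2 / mean**2 - m3 / (mean * m2) + (m4 - m2**2) / (4 * m2**2)
    return cv, max(rel_var, 0.0)


def ratio_cv_test(x, y, gamma0=1.0):
    """Two-sided asymptotic test of H0: CV_X / CV_Y = gamma0.

    Returns the Wald-type statistic and its p-value under the
    standard normal approximation.
    """
    cv_x, rel_x = cv_terms(x)
    cv_y, rel_y = cv_terms(y)
    gamma_hat = cv_x / cv_y
    se = abs(gamma_hat) * math.sqrt(rel_x / len(x) + rel_y / len(y))
    t0 = (gamma_hat - gamma0) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t0)))
    return t0, p_value


# Hypothetical samples with clearly different CVs (0.1 versus 0.3).
rng = random.Random(7)
x = [rng.gauss(10.0, 1.0) for _ in range(500)]
y = [rng.gauss(10.0, 3.0) for _ in range(500)]
t0, p_value = ratio_cv_test(x, y, gamma0=1.0)
print(f"T0={t0:.2f}, p-value={p_value:.4g}")
```

With γ0 = 1, rejecting the null hypothesis indicates that the two CVs differ; other values of γ0 test a specific ratio.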
3. Results
3.1. Simulation Study
In this section, various simulated datasets are reviewed and analyzed to verify the accuracy of the theoretical findings. To compare the results obtained with the proposed method and those from the method given by Yue and Baleanu [27], we consider the distributions used therein. To simulate different samples of one symmetric and two asymmetric distributions, populations X and Y are considered. For this purpose, we selected the normal, and the gamma and beta distributions, respectively, with a variety of values of CV, , that is, identical to . Figure 1, Figure 2 and Figure 3 depict the probability density function (PDF) for the intended distributions.
Figure 1.
The PDF of the distribution normal (μ, ) for different values of CV (black: CV = 1; red: CV = 2; green: CV = 3; blue: CV = 5).
Figure 2.
The PDF of the distribution gamma (α, λ) for different values of CV (blue: α = 0.04, λ = 0.001, CV = 5; green: α = 0.11, λ = 0.001, CV = 3; red: α = 0.25, λ = 0.001, CV = 2; black: α = 1, λ = 0.001, CV = 1).
Figure 3.
The PDF of the distribution beta (α, β) for different values of CV (blue: α = 0.009, β = 0.285, CV = 5; green: α = 0.08, β = 2.51, CV = 3; red: α = 0.21, β = 6.87, CV = 2; black: α = 0.94, β = 30.39, CV = 1).
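The parameter settings listed in the captions of Figures 2 and 3 can be checked against their stated CVs with the closed-form moments of the gamma and beta distributions. The short Python sketch below does this; the function names `gamma_cv` and `beta_cv` are hypothetical, and the formulas are the standard ones (for the gamma distribution the CV depends only on the shape parameter α, whether λ is read as a rate or as a scale).

```python
import math


def gamma_cv(alpha):
    """CV of gamma(alpha, lambda): mean and standard deviation share the
    same dependence on lambda, so CV = 1 / sqrt(alpha)."""
    return 1.0 / math.sqrt(alpha)


def beta_cv(alpha, beta):
    """CV of beta(alpha, beta).

    mean = a / (a + b), variance = a*b / ((a + b)^2 * (a + b + 1)),
    hence CV = sqrt(b / (a * (a + b + 1))).
    """
    return math.sqrt(beta / (alpha * (alpha + beta + 1.0)))


# (parameters, CV stated in the figure captions)
gamma_settings = [(0.04, 5), (0.11, 3), (0.25, 2), (1.0, 1)]
beta_settings = [((0.009, 0.285), 5), ((0.08, 2.51), 3),
                 ((0.21, 6.87), 2), ((0.94, 30.39), 1)]

for alpha, target in gamma_settings:
    print(f"gamma alpha={alpha}: CV={gamma_cv(alpha):.3f} (stated {target})")
for (a, b), target in beta_settings:
    print(f"beta alpha={a}, beta={b}: CV={beta_cv(a, b):.3f} (stated {target})")
```

The computed values agree with the stated CVs up to the rounding of the parameters.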
The simulations are performed using the software R 3.3.2, adopting 1000 repetitions.
For investigating the accuracy of the proposed method, we compute the coverage probability, by
Furthermore, the value of the test statistic T0 is calculated. Then, we use the Shapiro–Wilk test to check its normality, and Q–Q plots are drawn to confirm the normality assumption for the proposed test statistic. Table 1 summarizes the values of CP of the proposed method and those of the method by Yue and Baleanu [27], for various settings of parameters.
Table 1.
The values of CP of the proposed and the Yue and Baleanu [27] methods, for various settings of parameters.
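The coverage probability is the proportion of simulated confidence intervals that contain the true γ. A minimal Monte Carlo sketch in Python is given below. It uses a generic delta-method Wald interval rather than the exact variance formulation of either compared method, and a milder setting than those in the simulation study (both CVs equal to 0.2, with 300 instead of 1000 repetitions, to keep the run short). The direction γ = CV_X/CV_Y and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def cv_terms(sample):
    """Return (CV_hat, delta^2 / CV^2) from plug-in central moments."""
    n = len(sample)
    mean = fmean(sample)
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    cv = math.sqrt(m2) / mean
    rel_var = m2 / mean**2 - m3 / (mean * m2) + (m4 - m2**2) / (4 * m2**2)
    return cv, max(rel_var, 0.0)


def ratio_cv_ci(x, y, level=0.95):
    """Wald-type interval for gamma = CV_X / CV_Y (direction assumed)."""
    cv_x, rel_x = cv_terms(x)
    cv_y, rel_y = cv_terms(y)
    gamma_hat = cv_x / cv_y
    se = abs(gamma_hat) * math.sqrt(rel_x / len(x) + rel_y / len(y))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return gamma_hat - z * se, gamma_hat + z * se


def coverage_probability(draw_x, draw_y, gamma_true, m, n, reps, seed):
    """Fraction of simulated intervals that contain the true ratio."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [draw_x(rng) for _ in range(m)]
        y = [draw_y(rng) for _ in range(n)]
        lower, upper = ratio_cv_ci(x, y)
        hits += lower <= gamma_true <= upper
    return hits / reps


# X is normal with CV = 2/10 = 0.2; Y is gamma with shape 25, so that
# CV = 1/sqrt(25) = 0.2. The populations differ, but the true ratio is 1.
cp = coverage_probability(
    draw_x=lambda rng: rng.gauss(10.0, 2.0),
    draw_y=lambda rng: rng.gammavariate(25.0, 1.0),
    gamma_true=1.0, m=200, n=200, reps=300, seed=123,
)
print(f"estimated coverage probability: {cp:.3f}")
```

The same loop also yields the average interval length if `upper - lower` is accumulated alongside the hit count.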
3.2. Results
From Table 1, we conclude that both methods have been successful in controlling the type I error. However, the CP of our method is closer to the nominal level.
From Table 2, we verify that the average length of the confidence interval of our method is smaller than that of the method of Yue and Baleanu [27]. Additionally, as the sample size increases, the length of the confidence interval decreases.
Table 2.
The values of average lengths of the proposed and the Yue and Baleanu [27] methods, for various settings of parameters.
From Table 1, we verify that the CP of the proposed method is close to the nominal level (1 − α = 0.95), and more so as the sample size increases. As a result, the introduced approach controls the type I error. This means that about 95% of the simulated confidence intervals contain the true γ, so the proposed interval can be accepted as an asymptotic confidence interval for γ. Moreover, since the CP of the proposed method is closer to the nominal level than that of the alternative method [27], our approach is more accurate. Table 3 shows the CPU time needed for running the proposed and the Yue and Baleanu [27] methods, for various combinations of parameters. We verify that our approach is faster than the comparative one.
Table 3.
The CPU time needed for running the proposed and the Yue and Baleanu [27] methods, for various settings of parameters.
Table 4 summarizes the p-values for assessing the normality of the statistic T0. We verify that all p-values are greater than 0.05, meaning that the Shapiro–Wilk test does not reject the normality of the introduced test statistic. This is also supported by the Q–Q plots depicted in Figure 4: the points lie close to the straight line, so the observed quantiles closely match the quantiles of the normal distribution. Accordingly, the simulation outcomes indicate that the asymptotic theoretical findings hold across the parameter settings considered. As a result, our method is a suitable choice for performing a hypothesis test and building a confidence interval for the CV ratio of two independent populations.
Table 4.
The p-values to investigate the normality of the statistic T0.
Figure 4.
The Q–Q plots used to investigate the normality of the statistic T0, arranged in three columns of three panels each (up, middle and down), one panel per parameter setting.
From Table 5, it can be seen that the power increases with the sample size and that our method is more powerful than the one by Yue and Baleanu [27].
Table 5.
The values of powers of the proposed and the Yue and Baleanu [27] methods, for various settings of parameters.
4. Discussion
4.1. Application
The applicability of the proposed technique in real situations is studied. For this purpose, we analyze a dataset containing two features: the humidity (%) and the scale of hand, foot and mouth disease outbreak from 2010 until 2017 for two regions in Malaysia (Sarawak and Malacca). Details about the datasets can be found in [34,35]. Table 6 and Figure 5 report the descriptive statistics and histograms of the features. The results show that the CVs of the humidity in Sarawak and Malacca are 0.0107 and 0.0159, respectively. The CVs of the disease outbreak in Sarawak and Malacca are 0.8872 and 1.0788, respectively.
Table 6.
Descriptive statistics about the humidity and the scale of hand, foot and mouth disease outbreak in Sarawak and Malacca.

Figure 5.
The humidity and the scale of hand, foot and mouth disease outbreak in Sarawak and Malacca.
4.2. Discussion
The proposed approach is applied to provide 95% confidence intervals for the ratio of the CV of the two considered features in Malacca with respect to Sarawak. The computed confidence intervals are reported in Table 7. It can be observed that the intervals (0.973, 1.459) and (1.189, 1.783) are the 95% confidence intervals for the ratio of the CV of the disease outbreak and the ratio of the CV of the humidity, respectively, in Malacca with respect to Sarawak. Since the lower bound of the interval (1.189, 1.783) is greater than 1, it can be concluded that the CV of the humidity in Malacca is significantly higher than the CV of the humidity in Sarawak. Moreover, since the interval (0.973, 1.459) contains the value 1, the hypothesis of the equality of the CV of the disease outbreak in Sarawak and Malacca cannot be rejected.
Table 7.
The lower and upper bounds of the computed confidence intervals for the ratio of the CV of the humidity and the ratio of the CV of the scale of hand, foot and mouth disease outbreak in Malacca with respect to Sarawak.
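The decision rule used in this discussion, comparing a confidence interval for the CV ratio with the value 1, can be written as a few lines of Python. The function name `interpret_ratio_interval` is hypothetical; the intervals are those reported in the text for the ratio in Malacca with respect to Sarawak.

```python
def interpret_ratio_interval(lower, upper, gamma0=1.0):
    """Classify a confidence interval for the CV ratio against gamma0."""
    if lower > gamma0:
        return "ratio significantly greater than gamma0"
    if upper < gamma0:
        return "ratio significantly less than gamma0"
    return "equality not rejected"


# 95% confidence intervals as reported in the text.
intervals = {
    "humidity": (1.189, 1.783),
    "disease outbreak": (0.973, 1.459),
}
decisions = {
    feature: interpret_ratio_interval(lower, upper)
    for feature, (lower, upper) in intervals.items()
}
for feature, decision in decisions.items():
    print(f"{feature}: {decision}")
```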
The distributions of X and Y need not be the same; the results for that case are omitted from the article and the tables for brevity. Therefore, this method can be used when the X and Y distributions are different.
As can be seen in Table 7, the confidence interval of our method is slightly narrower than the confidence interval of Yue and Baleanu’s method. This is because we wrote the variance in a way that not only simplifies the calculations, but also increases the power.
5. Conclusions
To compare populations, the CV is a useful, convenient and simple tool. In many cases, two populations have the same CV, despite different means and variances. One way to understand the data structure is to study the equality of the CVs of two separate populations. However, when the difference between the two CVs is slight, it does not provide a solid and valuable interpretation. For this reason, the CV ratio, which is more informative, is used. In this paper, the asymptotic distribution of the estimator of the CV ratio was studied, and a hypothesis test and an asymptotic confidence interval for the CV ratio of two independent populations were then derived. The findings showed that the coverage probability is very close to the nominal level as the sample size increases. Based on this, it can be concluded that the introduced methodology controls the type I error. In addition, CPU times confirmed that the proposed method does not involve excessive computational burden. The normality of the introduced test statistic was confirmed using the Shapiro–Wilk normality test and Q–Q plots. According to the results, the asymptotic approximation performed very well for all simulated datasets. Moreover, the proposed method is more powerful than the comparative method given by Yue and Baleanu [27].
Author Contributions
Conceptualization, A.B., Z.A. and M.R.M.; Investigation, A.B., Z.A., M.R.M. and A.M.L.; Methodology, A.B., Z.A., M.R.M. and A.M.L.; Software, Z.A.; Writing—original draft, A.B., Z.A., M.R.M. and A.M.L.; Writing—review & editing, A.M.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Meng, Q.; Yan, L.; Chen, Y.; Zhang, Q. Generation of Numerical Models of Anisotropic Columnar Jointed Rock Mass Using Modified Centroidal Voronoi Diagrams. Symmetry 2018, 10, 618.
2. Aslam, M.; Aldosari, M.S. Inspection Strategy under Indeterminacy Based on Neutrosophic Coefficient of Variation. Symmetry 2019, 11, 193.
3. Iglesias-Caamaño, M.; Carballo-López, J.; Álvarez-Yates, T.; Cuba-Dorado, A.; García-García, O. Intrasession Reliability of the Tests to Determine Lateral Asymmetry and Performance in Volleyball Players. Symmetry 2018, 10, 416.
4. Bennett, B.M. On an approximate test for homogeneity of coefficients of variation. In Contribution to Applied Statistics; Ziegler, W.J., Ed.; Birkhauser Verlag: Basel, Switzerland; Stuttgart, Germany, 1976; pp. 169–171.
5. Shafer, N.J.; Sullivan, J.A. A simulation study of a test for the equality of the coefficients of variation. Commun. Stat. Simul. Comput. 1986, 15, 681–695.
6. Doornbos, R.; Dijkstra, J.B. A multi sample test for the equality of coefficients of variation in normal populations. Commun. Stat. Simul. Comput. 1983, 12, 147–158.
7. Hedges, L.; Olkin, I. Statistical Methods for Meta-Analysis; Academic Press: Orlando, FL, USA, 1985.
8. Rao, K.A.; Vidya, R. On the performance of test for coefficient of variation. Calcutta. Stat. Assoc. Bull. 1992, 42, 87–95.
9. Gupta, R.C.; Ma, S. Testing the equality of coefficients of variation in k normal populations. Commun. Stat. Theory Methods 1996, 25, 115–132.
10. Rao, K.A.; Jose, C.T. Test for equality of coefficient of variation of k populations. In Proceedings of the 53rd Session of International Statistical Institute, Seoul, Korea, 22–29 August 2001.
11. Pardo, M.C.; Pardo, J.A. Use of Rényi’s divergence to test for the equality of the coefficient of variation. J. Comput. Appl. Math. 2000, 116, 93–104.
12. Nairy, K.S.; Rao, K.A. Tests of coefficient of variation of normal population. Commun. Stat. Simul. Comput. 2003, 32, 641–661.
13. Verrill, S.; Johnson, R.A. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. Commun. Stat. Theory Methods 2007, 36, 2187–2206.
14. Jafari, A.A.; Kazemi, M.R. A parametric bootstrap approach for the equality of coefficients of variation. Comput. Stat. 2013, 28, 2621–2639.
15. Feltz, G.J.; Miller, G.E. An asymptotic test for the equality of coefficients of variation from k normal populations. Stat. Med. 1996, 15, 647–658.
16. Fung, W.K.; Tsang, T.S. A simulation study comparing tests for the equality of coefficients of variation. Stat. Med. 1998, 17, 2003–2014.
17. Tian, L. Inferences on the common coefficient of variation. Stat. Med. 2005, 24, 2213–2220.
18. Forkman, J. Estimator and Tests for Common Coefficients of Variation in Normal Distributions. Commun. Stat. Theory Methods 2009, 38, 233–251.
19. Liu, X.; Xu, X.; Zhao, J. A new generalized p-value approach for testing equality of coefficients of variation in k normal populations. J. Stat. Comput. Simul. 2011, 81, 1121–1130.
20. Krishnamoorthy, K.; Lee, M. Improved tests for the equality of normal coefficients of variation. Comput. Stat. 2013, 29, 215–232.
21. Jafari, A.A. Inferences on the coefficients of variation in a multivariate normal population. Commun. Stat. Theory Methods 2015, 44, 2630–2643.
22. Hasan, M.S.; Krishnamoorthy, K. Improved confidence intervals for the ratio of coefficients of variation of two lognormal distributions. J. Stat. Theory Appl. 2017, 16, 345–353.
23. Shi, X.; Wong, A. Accurate tests for the equality of coefficients of variation. J. Stat. Comput. Simul. 2018, 88, 3529–3543.
24. Miller, G.E. Use of the squared ranks test to test for the equality of the coefficients of variation. Commun. Stat. Simul. Comput. 1991, 20, 743–750.
25. Nam, J.; Kwon, D. Inference on the ratio of two coefficients of variation of two lognormal distributions. Commun. Stat. Theory Methods 2016, 46, 8575–8587.
26. Wong, A.; Jiang, L. Improved Small Sample Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions. J. Probab. Stat. 2019, 2019, 7173416.
27. Yue, Z.; Baleanu, D. Inference about the Ratio of the Coefficients of Variation of Two Independent Symmetric or Asymmetric Populations. Symmetry 2019, 11, 824.
28. Haghbin, H.; Mahmoudi, M.R.; Shishebor, Z. Large Sample Inference on the Ratio of Two Independent Binomial Proportions. J. Math. Ext. 2011, 5, 87–95.
29. Mahmoudi, M.R.; Mahmoodi, M. Inference on the Ratio of Variances of Two Independent Populations. J. Math. Ext. 2014, 7, 83–91.
30. Mahmoudi, M.R.; Mahmoodi, M. Inference on the Ratio of Correlations of Two Independent Populations. J. Math. Ext. 2014, 7, 71–82.
31. Mahmoudi, M.R.; Nasirzadeh, R.; Mohammadi, M. On the Ratio of Two Independent Skewnesses. Commun. Stat. Theory Methods 2019, 48, 1721–1727.
32. Mahmoudi, M.R.; Behboodian, J.; Maleki, M. Large Sample Inference about the Ratio of Means in Two Independent Populations. J. Stat. Theory Appl. 2017, 16, 366–374.
33. Ferguson, T.S. A Course in Large Sample Theory; Chapman & Hall: London, UK, 1996.
34. Nelson, B.R.; Edinur, H.A.; Abdullah, M.T. Compendium of hand, foot and mouth disease data in Malaysia from years 2010 to 2017. Data Brief 2019, 24, 103868.
35. Mahmoudi, M.R.; Tuan, B.A.; Pho, K.H. On kurtoses of two symmetric or asymmetric populations. J. Comput. Appl. Math. 2021, 391, 113370.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).