1. Introduction
In the health and social sciences, outcome variables usually show values of skewness and kurtosis that clearly deviate from the normal distribution [
1,
2]. Micceri [
2] analyzed the distributional characteristics of over 400 large-sample achievement and psychometric measures and found several classes of deviation from the normal distribution in addition to skewness and kurtosis, leading him to conclude that normal distributions are uncommon with real data. This was further illustrated in the study by Blanca et al. [
1], who analyzed 693 distributions corresponding to cognitive ability and other psychological variables derived from 130 different populations. They found that most distributions were non-normal. It should also be noted that variables with non-normal distributions are commonly encountered in several areas of psychological and social research, an example being the distributions for some dimensions of self-concept [
3,
4]. Further examples cited by Arnau et al. [
5] include reaction times or response latency in cognitive studies [
6], clinical assessment indices in drug abuse research [
7], physical and verbal violence in couples [
8], and labor income [
9] and health care costs [
10] in sociological studies.
In a recent systematic review, Bono et al. [
11] found that the most widely used distributions in health, education, and social sciences can be ranked in descending order as follows: gamma, negative binomial, multinomial, binomial, lognormal and exponential.
The detailed study of distribution shape requires the calculation of skewness and kurtosis. In the majority of studies these indices are estimated based on the third and fourth central moments of a distribution. However, their values are usually biased in the case of non-normal distributions, that is, there may be a discrepancy between the true (theoretical) values or accepted references of skewness and kurtosis measures, for a given distribution, and the observed (empirical) values. With the aim of reducing this bias, Hogg [
12] proposed alternative estimators, which were subsequently used in the study by Micceri [
2]. These estimators are referred to as elements leading to adaptive robust estimators of location [
13]. Hogg et al. [
14] and Redd and Stark [
15] complemented the earlier study by Hogg [
12], showing that these estimators are a good choice for a wide variety of distributions. The advantage of these estimators is that they provide more precise and accurate information about the data distribution, prior to choosing—on the basis of this information—the most suitable statistical analysis.
In the field of finance, Bonato [
16] and Kim and White [
17] proposed the use of other indices for estimating skewness and kurtosis, for example, those of Bowley [
18], Groeneveld and Meeden [
19] and Moors [
20]. Recently, Aytaçoğlu and Sazak [
21] analyzed all these indices, as well as both conventional and Hogg’s estimators, and made a series of recommendations for practitioners regarding the best estimator of skewness and kurtosis for certain distributions. Their conclusion was that the performance of each estimator depended upon the shape of the distribution and the sample size. For example, they found that conventional estimators performed quite well with large samples and for the Weibull and lognormal distributions, whereas the other estimators performed better with the Student distribution and were less affected by sample size. Bonato [
16] and Kim and White [
17] also pointed out that conventional estimators are influenced by outliers. For their part, Keselman et al. [
13] examined empirical methods based on data trimming in the presence of outliers.
The results of all the aforementioned simulation studies provide information about the different tests which may be used to determine the shape of a distribution, and the applied researcher therefore has something of a guide when it comes to choosing estimators of skewness and kurtosis. However, having to choose between several estimators is impractical, especially when the majority of statistical packages only include the conventional estimators. In the present study we therefore focus solely on Hogg’s estimators [
12], as these are well-known, and we provide a macro that applied researchers can use to calculate them.
The purpose of this paper is to compare the behavior of Hogg’s estimators of skewness and kurtosis with that of conventional estimators based on the third and fourth central moments of a distribution, with a view to their subsequent use in applied contexts.
To this end we conducted a Monte Carlo simulation study to estimate the skewness and kurtosis of the most frequent continuous distributions of the exponential family (gamma, lognormal and exponential), according to Bono et al. [
11]. The indicators used to assess the bias, precision and accuracy of the different estimators of skewness and kurtosis were, respectively, the relative bias (RB), the coefficient of variation (CV) and the scaled root mean square error (SRMSE). Finally, we developed a SAS macro for calculating skewness and kurtosis with both conventional and Hogg’s estimators. This macro will be highly useful for applied researchers.
In summary, this study had two objectives: (a) To assess the bias, precision and accuracy of conventional and Hogg’s estimators on the basis of RB (interpreted in absolute values), CV and SRMSE, and (b) to provide applied researchers with a macro for calculating kurtosis and skewness with both conventional and Hogg’s estimators. In order to address these objectives, we begin by reviewing the kurtosis and skewness estimators used in this study, detailing their theoretical values for the gamma, lognormal and exponential distributions. Extensive simulation work is then carried out to assess the estimators of skewness and kurtosis for each distribution and different sample sizes. Overall, the results indicated that for the non-normal distributions investigated, the estimators of skewness and kurtosis which best reflect the shape of the distribution are Hogg’s estimators.
It should be noted that as this is a simulation study, the results apply solely to the distributions and sample sizes analyzed, and thus they cannot be compared with those of other simulation studies that analyze different conditions.
2. Review of Kurtosis and Skewness Estimators
The procedure for assessing the nature of a data distribution includes two tests: skewness and kurtosis. Skewness is the degree of asymmetry of a distribution, that is, how much it is skewed to the left or right. Kurtosis refers to the nature of distribution tails, that is, their length and weight. On a practical level it is more useful to know the nature of the tails first, and hence we begin by discussing kurtosis.
The traditional calculation of kurtosis is based on the fourth central moment, defined by the SAS program as:
where
n is the number of values for a variable,
is the ith value of the variable,
is the sample average, and
s is the sample standard deviation (SD). Equation (1) yields a value close to zero when the distribution is normal. Kr1 < 0 represents a shorter distribution with light and thinner tails, whereas Kr1 > 0 implies a longer distribution with heavy and fatter tails.
The classical kurtosis coefficient is very sensitive to outlying values. Therefore, Hogg [
12,
22] developed two estimators of kurtosis, Q and Q
1, which in this paper are referred to as Kr2 and Kr3. The Kr2 estimator is as follows:
where U
0.05 is the mean of the upper 5% of the order statistics of the sample, L
0.05 is the mean of the lower 5% of the order statistics of the sample, and U
0.5 and L
0.5 are, respectively, the means of the upper and lower 50% of the order statistics. Thus, Kr2 is a ratio of two linear functions of the order statistics.
According to Hogg [
12] and Reed and Stark [
15], values of Kr2 < 2 imply a light-tailed distribution (uniform), 2 ≤ Kr2 ≤ 2.6 a medium-tailed distribution (normal), 2.6 < Kr2 ≤ 3.2 a heavy-tailed distribution (double exponential) and Kr2 > 3.2 a very heavy-tailed distribution (like the Cauchy). Othman et al. [
23] applied the following rule: a distribution is classified as normal-tailed if Kr2 < 3, as heavy-tailed if 3 ≤ Kr2 < 5, and as very heavy-tailed if Kr2 ≥ 5.
Hogg [
12] also developed another estimator of kurtosis, a modification of Kr2, defined as:
where U
0.2 and L
0.2 are, respectively, the means of the upper and lower 20% of the order statistics of the sample, and U
0.5 and L
0.5 are defined as in Equation (2).
The reason for this change is that the numerator of Kr3 is an excellent linear estimate of the SD of a normal distribution [
24], while the denominator is an excellent measure of the SD of a double exponential distribution. Both Kr2 and Kr3 are location-free statistics, and hence are uncorrelated with location statistics like the trimmed means [
12,
15].
Since the population characteristics corresponding to Kr3 associated with the normal and double exponential are around 1.75 and 1.93, respectively, Hogg [
12] and Reed and Stark [
15] suggested that the cut-offs for Kr3 are as follows: Kr3 < 1.81 (light-tailed), 1.81 ≤ Kr3 ≤ 1.87 (medium-tailed) and Kr3 > 1.87 (heavy-tailed).
As for skewness, the calculation of the conventional estimator is based on the third central moment, and is defined by the SAS program as:
where
n is the number of values for a variable,
is the
ith value of the variable,
is the sample average, and
s is the sample SD. The normal distribution is symmetric, and thus the value of Equation (4) is zero. If Sk1 < 0 the distribution has a longer tail to the left than the right, in other words, it is left skewed. If Sk1 > 0 the distribution is skewed to the right.
Hogg [
12] proposed the estimator Q2 as a measure of skewness, labeled here as Sk2 and defined as:
where U
0.05 and L
0.05 are the means of the largest and smallest 5% of the order statistics of the sample, and m
0.25 is the 25-trimmed mean, that is, the mean of the ordered observations trimmed by 25% at both the upper and lower ends. Keselman et al. [
13] and Reed and Stark [
15] referred to the 25-trimmed mean as T
0.25. The calculation of m
0.25 or T
0.25 coincides with the mean of the middle (MID) 50% of the sample. Hence, the mean of the middle 50% is referred to by Hogg et al. [
14] as M
0.5 and by Othman et al. [
23] as MID.
According to Hogg et al. [
14] and Othman et al. [
23], if 0.5 ≤ Sk2 ≤ 2 the distribution can be considered symmetric, if Sk2 < 0.5 the distribution may be skewed to the left, and if Sk2 > 2 the distribution is skewed to the right. Reed and Stark [
15], based on the work of Hertsgaard [
25], classified distributions as symmetric (0.7 ≤ Sk2 ≤ 1.4), left skewed (Sk2 < 0.7) and right skewed (Sk2 > 1.4).
4. Method
Simulations were performed using SAS 9.4 [
27], which features random number generators for known distributions. Thus, we constructed macros that enabled pseudorandom variables to be generated for the different distributions, and which could calculate kurtosis and skewness both with the conventional estimators that SAS uses by default (Kr1 and Sk1), and also with those of Hogg (Kr2, Kr3, and Sk2). The generated data were ordered from least to greatest, and we considered, in accordance with the corresponding equation, percentages above or below the order statistic of 5%, 20%, 25% and 50%.
In addition to the macros developed for the simulation study we constructed a macro for calculating, from a data file, the conventional estimators that SAS uses by default (Kr1 and Sk1), as well as those of Hogg (Kr2, Kr3, and Sk2). The first column of the data file is the subject identifier, while the second corresponds to the data. Before running the macro, the user must specify the sample size (N), and in the section “load a data file”, the path and name of the data file (see
Appendix A).
4.1. Study Variables
1. Sample sizes. The study considered sample sizes of N = 50, 100, 400, 1000 and 5000.
2.
Distributions. In accordance with the systematic review of Bono et al. [
11] we chose non-normal continuous distributions from the exponential family. Thus, data from the exponential distribution were generated using the function
rand(‘exponential’) in SAS. In order to generate data for the lognormal distribution we exponentialized the normal function with the parameters most commonly used in simulation studies, namely ζ = 1 and σ = 0.5
exp(rand(‘normal’,1,0.5)), since if data are normally distributed, then X = exp(data) is lognormally distributed [
26]. The SAS function that was used to generate data for the gamma distributions was
rand(‘gamma’, shape_parameter). In order to cover different shapes of the gamma distribution we chose shape parameters of α = 0.5, 2 and 4.
Each combination of sample size and distribution shape was replicated 10,000 times (5 × 5 × 10,000 = 250,000 simulations). For each replication the estimators of kurtosis (Kr1, Kr2 and Kr3) and skewness (Sk1 and Sk2) were computed in order to obtain the RB, CV and SRMSE of these estimators.
4.2. Evaluation Criteria
We analyzed which estimator best reflected the shape of the distribution, comparing the empirical values with the theoretical ones shown in
Table 1, based on three criteria: bias, precision and accuracy [
28].
Regarding the bias or deviation between the average empirical value and the theoretical value of kurtosis or skewness, we examined the RB because the amount of bias must be assessed in relative terms in order to be compared [
29,
30]. Thus, Equation (6) quantifies the percentage to which the estimated parameter values deviate from the true parameter values [
29,
31]:
where
β is the true value for the estimate of interest, and
(
S is the number of simulations performed and
is the estimate of interest within each of the
i = 1, …,
S simulations).
In relation to precision, we calculated SD and CV. The SD is related to the variability of the estimated values, and little variation indicates that the estimator is precise [
28]. In order to achieve greater comparability of precision among estimators, we also calculated the CV [
28], which is the SD expressed as a percentage of the mean [
32]:
where
is the empirical SD of the estimate of interest across all simulations. Note that the calculation of CV does not require the true value to be known.
Finally, in order to evaluate accuracy, we calculated the root mean square error (RMSE). The RMSE incorporates both the bias and SD of the estimated parameters, and it transforms the mean square error (MSE) onto the same scale as the parameter [
29,
33,
34]. The RMSE indicator represents the degree to which the estimated parameter values vary around the parameter value [
30], and it is defined as:
We calculated the scaled RMSE (SRMSE), dividing by the true value of the estimate of interest, in order to make comparisons between the results obtained with different estimators [
28]:
To sum up, we calculated the percentages of RB, CV and SRMSE of skewness and kurtosis across 10,000 simulated data sets from the gamma, lognormal and exponential distributions, in each case with respect to different sample sizes. A good criterion is choosing the estimators of skewness and kurtosis with the lowest percentages of RB, CV and SRMSE.
6. Discussion
The choice of which estimators best reflect the shape of the distribution depends on RB, CV and SRMSE. These are the main indicators that are generally used to evaluate the degree of bias, precision and accuracy of an estimator.
This study had two objectives. The first was to compare the bias, precision and accuracy of conventional and Hogg’s estimators of skewness and kurtosis by calculating their RB, CV and SRMSE for the most frequent continuous distributions used in health, education, and social sciences (i.e., gamma, lognormal and exponential distributions). Second, we constructed a macro with SAS for calculating both conventional and Hogg’s estimators. The first objective was addressed by performing a Monte Carlo simulation study.
Regarding kurtosis, the values estimated by Kr3, followed by those obtained with Kr2, show lower percentages of RB, CV and SRMSE than do the values estimated with the conventional estimator Kr1, regardless of sample size. Aytaçoğlu and Sazak [
21] found that with sample sizes of N = 300 and N = 1000, the Kr2 estimator was better than Kr1 for Student’s distribution. For the lognormal distribution, by contrast, these authors concluded that Kr2 was worse than Kr1. However, their results are not comparable with our own, as they used different parameters when simulating a lognormal distribution. That is, the shape of the lognormal distribution that was simulated by these authors is different to that considered here. In addition, these authors studied bias and MSE, and to make the results totally comparable, these criteria would need to be scaled.
With respect to skewness, and according to RB values, Sk2 is less biased than Sk1. In terms of the CV and SRMSE values, Sk2 is a little more precise and accurate than Sk1. In contrast to these findings, Joanes and Gill [
35] showed that Sk1 performs well and yields small MSE in samples from non-normal distributions. However, they used a different criterion to our own to evaluate accuracy (i.e., MSE vs. SRMSE in our study).
Regarding the impact of sample size, this was greater in the case of the Kr1 estimator, with the RB, CV and SRMSE values decreasing considerably as N increased. The observed influence of sample size is consistent with the results regarding bias and MSE that Aytaçoğlu and Sazak [
21] obtained with lognormal and Weibull distributions.
The simulations we performed confirm that the estimations obtained with the conventional measure of kurtosis deviate extremely from the theoretical values, especially with small samples. This is likely due to the presence of outliers in non-normal distributions. The calculation of conventional estimators uses all the data, whereas outliers are excluded, by means of trimming, when calculating Hogg’s estimators. In this context, Kim and White [
17] showed that the conventional measures are influenced by extreme observations because they are essentially based on sample averages. We also observed that changes in sample size greatly affect the estimation of Kr1. With the gamma distribution and Kr1 the effect of sample size varies depending on the shape of this distribution, it being most marked when the shape parameter is α = 4 for CV and SRMSE.
By contrast, Hogg’s estimators of kurtosis are less influenced by sample size and give results closer to the theoretical values. Our results also show that the conventional skewness estimator Sk1 is less affected by sample size than is the conventional kurtosis estimator Kr1. As Bonato [
16] points out, it is important that values of kurtosis and skewness are stable; that is, they do not vary too much depending on sample size and the presence of outliers. In this study we found that Hogg’s kurtosis estimators are more stable than is the conventional estimator.
In conclusion, this Monte Carlo simulation study shows that Kr3 and Sk2 are the best estimators of kurtosis and skewness, since they yield less bias and are more precise and accurate. In other words, they are better according to all three evaluation criteria (RB, CV and SRMSE), irrespective of the type of distribution (exponential, lognormal or gamma).
6.1. Utility for Researchers
This study has shown that the conventional indices of skewness and kurtosis based on the third and fourth central moments of a distribution yield biased values in a wide range of distributions commonly used in health, education, and social sciences. Consequently, researchers should also compute other estimators which better reflect the shape of the distribution. The results from our Monte Carlo simulation study indicate that Hogg’s estimators, specifically Kr3 and Sk2, are less biased and more precise and accurate than are conventional estimators. Although statistical packages include the conventional indices of skewness and kurtosis, they do not compute Hogg’s estimators, thus restricting their use. For this reason, we developed a SAS macro (available in
Appendix A) for calculating not only the conventional estimators, but also those proposed by Hogg. Our aim in doing so is to make accessible to a wide range of applied researchers and practitioners the computation of several estimators of skewness and kurtosis. Identifying the shape of the underlying distribution associated with data is a key aspect to consider when selecting a data analysis procedure. As Islam [
36] points out, non-normality may lead to incorrect statistical inferences, since many estimation techniques rely on the underlying distributional assumption of normality of the data.
6.2. Limitations and Future Directions
The present study has certain limitations that should be highlighted. First, the results are limited to the conditions examined in this study, and it is difficult to compare them with other simulation studies since the sample sizes and distribution shapes differ from one study to another. In addition, other simulation studies have used different evaluation criteria, such as unscaled bias or RMSE, which also makes comparison difficult. Second, our study is focused on Hogg’s estimators, although it can now be extended to other estimators of skewness and kurtosis in further research. Third, although the distributions studied are the most common non-normal distributions found in research, there may be studies whose data derive from other distributions. In this respect, it would be interesting, for example, to examine the performance of estimators with distributions that have negative values of kurtosis and skewness.
Finally, an interesting line of future research would be to review different trimming procedures in statistical analysis, and to carry out the corresponding simulation studies, manipulating sample size and the type of distribution. In fact, one kind of study that has been repeated over the years involves the analysis of the amount and type of data trimming that should be adopted in order to circumvent the biasing effects of outlying values on Type I error and/or power rates for classical tests [
13,
37]. For example, Othman et al. [
23], building on the study by Babu et al. [
38], suggested using all the sample data if Kr2 < 3; if 3 ≤ Kr2 < 5 they recommend trimming the top and bottom 10% of the sample points and using the middle 80%, while if Kr2 ≥ 5 the researcher should trim the top and bottom 20% of the sample points and use the middle 60%. Keselman et al. [
13] recommend that applied researchers could adopt the minimal percentage (10% total trimming), except when data are extremely non-normal.