Studies of EEG Asymmetry and Depression: To Normalise or Not?

Abstract: A brief review of 50 studies from the last 10 years indicated that it is often accepted practice to apply log transformation processes to raw EEG data. This practice is based upon the assumptions that (a) EEG data do not resemble a normal distribution, (b) applying a transformation will produce an acceptably normal distribution, (c) the logarithmic transformation is the most valid form of transformation for these data, and (d) the statistical procedures intended to be used are not robust to non-normality. To test those assumptions, EEG data from 100 community participants were analysed for their normality by reference to their skewness and kurtosis, the Kolmogorov–Smirnov and Shapiro–Wilk statistics, and shapes of histograms. Where non-normality was observed, several transformations were applied, and the data again tested for normality to identify the most appropriate method. To test the effects of normalisation from all these processes, Pearson and Spearman correlations between the raw and normalised EEG alpha asymmetry data and depression were calculated to detect any variation in the significance of the resultant statistic.


Introduction
Although previously considered to be found in humans alone [1], more recent research indicates that hemispheric symmetry is a characteristic of most vertebrate brains [2]. This division of the brain into two hemispheres occurred early in evolution and conferred some selective advantage [3]. However, the appearance of a symmetrical brain in humans does not extend to neurological structure and functioning, which is demonstrably asymmetrical in terms of such capabilities as motor control [4], cognitive capacity [5], speech [6], emotional processing [7], attentional processes [8] and other core aspects of the cognitive and behavioural characteristics of humans. Consequent to the importance of functional asymmetry in many cognitive activities, there is some evidence that alterations to this functional and/or structural asymmetry are associated with psychiatric disorders [9], including autism [10], schizophrenia [11], psychosis [12], dyslexia [13] and attention-deficit-hyperactivity disorder [14].
One particular psychiatric disorder that has been repeatedly associated with alterations in brain asymmetry is depression [15]. Much of the early research regarding depression and brain asymmetry used alpha wave (8-13 Hz) discrepancies across the frontal lobe hemispheres as its principal predictor variable [16][17][18] and, although many other brain regions and frequencies have been examined, most of the current research in this field remains focused upon alpha wave asymmetry in the frontal region [19]. Despite the initial and ongoing support for an association between alpha wave asymmetry and depression, not all results are uniformly supportive [20], and this field continues to draw the attention of researchers. As such, clarification of the methodological processes necessary for adequate testing of the association between frontal alpha asymmetry (FAA) and depression is an important step in clarifying the occurrence of some unconfirmed results in the previous literature. One of those methodological processes is the commonly-applied logarithmic transformation of EEG data prior to statistical analysis.
Normalisation of skewed data is argued as necessary because some statistical procedures may be confounded by non-normally distributed data [21]. This rule has been almost universally applied to EEG data, but the basis for that application is unclear, apart from a publication by Allen et al. (2004), which has been cited over 650 times in subsequent EEG research and which presented EEG data exhibiting skewness and kurtosis in a unique dataset. Those authors argued that this degree of skewness and kurtosis in their data necessitated logarithmic transformation to produce a reasonably normal distribution.
However, the requirement to apply a logarithmic transformation that is based on Allen et al. [22] depends upon a number of steps, including evidence that (a) the EEG data in question do not resemble a normal distribution, (b) the application of a transformation will produce an acceptably normal distribution, (c) the logarithmic transformation is the most valid form of transformation for these data, and (d) the statistical procedures intended to be used are not robust to non-normality. These steps were used as research questions and will be discussed below. The first research question was to ascertain the extent of log transformation of EEG data in studies of FAA.
The presence of log transformations in the FAA-depression literature. Table 1 presents summary data from 50 studies published during the period 2013-2022, using five studies per year selected on the basis of their citation scores in Google Scholar (GS). GS was chosen as the search engine because it is the most encompassing in terms of sources identified. The search criteria were "EEG/eeg, alpha asymmetry, depression" and the search was carried out by authors CFS and VB independently, with any disagreements decided by consultation until 100% agreement was reached on which papers should be included. As a general rule, this was achieved by the identification of relevant peer-reviewed papers that used alpha asymmetry as their primary dependent variable, then reference to the citation score, and examination of the abstracts to eliminate review or methodological discussion papers. These search criteria were relatively easily applied for the years 2013 to 2020. However, the comparative lack of citations in later years resulted in selections more focused on the content of the abstracts and the papers themselves. Thus, the selection criteria may be summarised as follows: the study (i) was peer-reviewed, (ii) focused on using EEG to measure alpha asymmetry as its primary dependent variable, usually also with a measure of depression, and (iii) had the highest citation scores for a chosen year, based on GS. Exclusion criteria were the absence of any of these three characteristics.
To inform this discussion of normalisation of EEG data, the following aspects of these papers were identified and appear in Table 1: author and year of publication (to demonstrate any variation over time), citations listed in GS (as an index of relative importance in the field according to peers), sample size, sex, age (to identify the range of these covered by the studies), whether EEG data were collected under eyes open or eyes closed conditions (the former has been found to be less reliable than the latter [67]), the number of EEG sites used to calculate FAA (as an indicator of possible reliability of the FAA index used), whether the EEG data were tested for normality prior to normalisation (a necessary step in rationalizing the use of normalisation), which form of normalisation was used (if any), whether the effect of normalisation was measured (i.e., was the normalisation process effective in producing normally-distributed data), and the statistical procedures used with these EEG data (because some statistical processes do not require normalisation).
The reports summarised in Table 1 included both sexes, a range of ages from 6 months to 80 years, and 42% of studies using both eyes open and eyes closed conditions, 32% only eyes closed, 24% only eyes open, and one study (2%) which did not report that information. There were varying numbers of EEG sites used to calculate FAA (from 1 pair up to 11 pairs), mostly in the frontal region alone, but sometimes with other regions also. These 50 papers may, therefore, be said to represent a range of methodological factors.

Summary of Findings from Table 1
Of the 50 papers summarised in Table 1, eleven did not report undertaking any normalisation procedures, one applied a non-log process, another normalised EEG data to their sum, and 37 (74%) reported using log transformations. Only one study tested for non-normality before normalising their data, and two reported that they checked the effects of normalisation to determine if non-normality was reduced or removed. These data suggest that the practice of applying log transformation to EEG data has been relatively widespread but not completely so. If the four steps described above regarding the decision to normalise are followed, the following emerges from this brief and selective review of the last decade's research on the FAA-depression association.
(a) The EEG data in question do not resemble a normal distribution. This is impossible to decide in 48 of the 50 studies because no attempt was made to determine if the EEG data under examination were non-normal.
(b) Application of a transformation will produce an acceptably normal distribution. This also cannot be determined in 48 of the 50 studies because only two of the studies that normalised their EEG data checked the effect of the transformation upon non-normality.
(c) The logarithmic transformation is the most valid form of transformation for these data. This was not established in any of the 50 studies. There are multiple methods of normalising data, and the preferred method can be identified by scrutinising the distribution of the data, but no studies reported taking this step. In fact, logarithmic transformation is recommended only when "the distribution differs substantially" from normal; by contrast, if the difference from normality is only "moderate", then "a square root transformation is tried first" ([21], p. 87). The onus is upon the researcher to interrogate the distribution of their data (preferably by observation of the histograms, the expected normal probability plots, and the detrended expected normal probability plots, rather than reference to formal inference tests such as the Kolmogorov-Smirnov statistic [21]), then apply the appropriate transformation depending on whether the data are positively or negatively skewed, and the degree to which they are skewed (i.e., moderately, more severely, or quite severely). However, without identification of the presence, form and severity of the departure from normality prior to transformation, there is no evidence from almost all of the studies reviewed in Table 1 that log transformation was the most appropriate method.
(d) The statistical procedures intended to be used are not robust to non-normality.
There is some argument that most or all of the statistical procedures used to test the major [...] Although Norris and Aroian ([73], p. 67) argued that "data transformation is not always needed or advisable" when using Pearson correlations with non-normal data, other simulation studies have not been so positively conclusive [74][75][76][77]. What is clearer from these studies is that the impact of non-normality on Pearson correlation coefficients is dependent upon sample size, the magnitude of the correlation coefficient (i.e., the effect size being detected), and the nature of the non-normal distribution, with highly kurtotic distributions and the presence of outliers being of greater concern than skewness alone [75,78,79]. Cohen et al. [80] noted that "Violations of the normality assumption do not lead to bias in estimates of the regression coefficients", but Serlin and Harwell [81] recommended using nonparametric tests of significance rather than the F test to evaluate regression outcomes when sample sizes are small. Tabachnick and Fidell ([21], p. 251) commented that "Univariate F is robust to modest violations of normality as long as there are at least 20 degrees of freedom for error" and that MANOVA also exhibits "robustness to nonnormality" when the sample size is at least 40. In all of the studies that used ANOVA, there were sufficient participants to produce more than 20 degrees of freedom, and in the two studies that used MANOVA, both had samples much greater than 40. Two studies used mixed linear modelling, which is robust to violations of normality [82]. One study used t-tests, which are argued to be "so robust against non-normality that there is nearly no need to use" non-parametric tests ([83], p. 175), one applied a Support Vector Machine (it is noted in [84] that only 2% of studies using this procedure also performed normalisation and that the necessity for normalisation when using this procedure is yet to be determined), and two used Laterality Coefficients.
As noted by Brumer et al. [85], there are several methods by which a Laterality Coefficient or Laterality Index may be calculated, but all depend upon the comparison of a number of indices (usually brain regions of interest) that meet a particular threshold indicative of neural activity, calculated across brain hemispheres and subject to further statistical analyses of hemispheric differences, depending upon the aims of the study. As shown in Table 1, one of the studies that used Laterality coefficients also applied regression analysis, and the other used ANOVA.
On the basis of the findings presented in Table 1, plus the discussion in the paragraphs above, there is no compelling reason for applying logarithmic normalisation procedures to these datasets. However, because the necessary number of steps in identifying the normality status of the data and then selecting the appropriate normalisation procedure (and checking for its efficacy in removing non-normality) have not been demonstrated in these 50 studies, it is relevant to follow that process for the purpose of evaluating the relative necessity and benefit of normalisation.
Therefore, the following sections of this paper apply the four steps described above with an EEG dataset collected across a number of frontal sites to address the research questions: (i) are these data non-normal, (ii) if so, what is the form and severity of that non-normality, (iii) what is the effect of application of the recommended normalisation procedure in terms of the normality of the data, and (iv) is there evidence of any meaningful difference in the outcomes of correlational analyses applied to the non-normalised versus the normalised data. Correlational analyses were selected because ANOVA models and other statistical procedures described above are relatively robust to non-normality. Discussion of these findings and their implications for statistical analyses of EEG data, specifically whether normalisation (particularly log transformation) is necessary or advantageous, will be based upon these analyses.

Data
EEG and depression data were collected from a previous study of FAA and depression in 100 community volunteers [86]. Descriptions of participants and procedures are presented there. The depression measure was the self-rated depression scale (SDS) [87,88]. Data for this study were those collected from EEG sites FP1, FP2, F3, F4, F7, F8, FT7, FT8, FC3, and FC4 during eyes-open (EO) and eyes-closed (EC) conditions because this was the methodology used by the largest proportion (42%) of studies reviewed in Table 1. From these, five alpha asymmetry indices were derived according to the following procedures: FP2-FP1, F4-F3, F8-F7, FT8-FT7, and FC4-FC3, consistent with the wider literature on the FAA-depression hypothesis [89].
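The derivation of the five asymmetry indices can be sketched as follows. This is a minimal illustration that assumes each index is the simple right-minus-left difference in alpha power suggested by the labels above; the function name, the input format, and the example values are hypothetical and are not taken from the study.

```python
# Homologous right/left frontal pairs named in the text (right site first).
PAIRS = [("FP2", "FP1"), ("F4", "F3"), ("F8", "F7"), ("FT8", "FT7"), ("FC4", "FC3")]


def asymmetry_indices(alpha_power):
    """Return right-minus-left alpha asymmetry for each homologous pair.

    alpha_power: dict mapping site name -> alpha power for one participant
    (hypothetical input format; units follow whatever the power values use).
    """
    return {
        f"{right}-{left}": alpha_power[right] - alpha_power[left]
        for right, left in PAIRS
    }


# Illustrative values only, not data from the study.
example = {
    "FP1": 2.0, "FP2": 2.5, "F3": 3.0, "F4": 2.0, "F7": 1.5, "F8": 1.5,
    "FT7": 1.0, "FT8": 1.75, "FC3": 2.25, "FC4": 2.0,
}
indices = asymmetry_indices(example)
```

Under this convention, a positive index indicates greater alpha power at the right-hemisphere site than at its left-hemisphere counterpart.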

Analyses
As recommended in the methodological literature pertaining to the impacts of outliers on assessments of normality (e.g., [79]), EEG frontal site data were screened for univariate and multivariate outliers. Univariate outliers were determined by converting EEG data to z-scores for each case, with values above or below a threshold of ±3.29 (i.e., 0.001 alpha level) considered to be problematic [21]. Multivariate outliers were assessed by computing Mahalanobis distance and analysing with a χ² test using an alpha level of 0.001 [21]. For the EC condition, two cases which were significant multivariate outliers and had z-scores above the threshold on all 10 frontal alpha sites were deleted. Five cases in the EO condition were deleted due to being significant multivariate outliers and having z-scores above the threshold on several frontal alpha sites. The resulting sample sizes were 98 participants for the EC and 95 participants for the EO conditions. Normality was assessed with frequency histograms, skewness and kurtosis values, plus their associated z-scores, also using a threshold of ±3.29, computed by dividing the skewness and kurtosis values by their respective standard errors [90], and the Shapiro-Wilk (SW) and Kolmogorov-Smirnov (KS) tests.
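The univariate screening and the skewness and kurtosis z-scores described above can be sketched in a few lines. The sketch assumes the bias-adjusted sample skewness and excess kurtosis with their conventional standard errors, as reported by common statistics packages; the text does not state which estimators were used, so that choice, like the function names, is an assumption. Multivariate (Mahalanobis) screening and the SW and KS tests are not shown.

```python
import math
import statistics

Z_CRIT = 3.29  # criterion used in the text for both outliers and normality


def univariate_outliers(values, z_crit=Z_CRIT):
    """Return indices of cases whose z-score lies beyond +/- z_crit."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [i for i, v in enumerate(values) if abs((v - mean) / sd) > z_crit]


def skew_kurtosis_z(values):
    """Return (z_skew, z_kurt): each statistic divided by its standard error.

    Assumes bias-adjusted skewness and excess kurtosis; needs n > 3.
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5          # moment skewness
    g2 = m4 / m2 ** 2 - 3.0      # moment excess kurtosis
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    se_skew = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2.0 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt
```

A variable would be flagged as non-normal under this criterion when either returned z-score exceeds 3.29 in absolute value.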
Following the recommendations set out for each form of histogram distribution and skewness by Tabachnick and Fidell ([21], pp. 79-80, 87-89), several transformation methods were applied to correct variables that were non-normal. These included nonlinear transformations (square root, logarithmic, and inverse) and the rank-based inverse normal (RIN) transformation using the rankit equation [91], as employed in previous simulation studies [74,78]. As recommended by Tabachnick and Fidell (2013), these normality statistics and histograms were again inspected following each transformation to determine the most appropriate method.
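For readers unfamiliar with the RIN approach, a minimal sketch of the rankit form is given here. It assumes the usual rankit definition, in which each value is replaced by the standard normal quantile of (rank - 0.5) / n, with tied values sharing their average rank; the function names are illustrative and the handling of ties is an assumption rather than a detail reported in the text.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rankit(values):
    """Rank-based inverse normal transform: Phi^-1((rank - 0.5) / n)."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in average_ranks(values)]
```

Because the transform depends only on ranks, it preserves the ordering of cases while discarding the original spacing between values, which is why it can normalise distributions that the square root, logarithmic, and inverse transformations cannot.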
To assess how each of these transformations affected the outcome of the calculation of the correlation between SDS and FAA, Spearman's rank order correlation (rho) was applied to the association between the raw data for SDS score and each of the pairs of frontal EEG asymmetry values under eyes open and eyes closed conditions. Because Spearman's rho is specifically designed to cater for the presence of non-normality, these results were used as the yardstick against which Pearson's r results could be compared when raw data and the logarithmic and RIN normalisation data were included. Due to the exploratory nature of this study, those correlation coefficients that reached the p-value of <0.1 are also shown in Table 2 in addition to those that reached traditional (p < 0.05) levels of statistical significance.
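The logic of using Spearman's rho as a yardstick can be illustrated with a short sketch. Spearman's rho is Pearson's r computed on ranks, so it is unchanged by any monotonic transformation such as the logarithm, whereas Pearson's r on the values themselves is not. The data below are made up for illustration and are not from the study; the simple ranking helper assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; this sketch assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = float(rank)
    return out


def spearman_rho(x, y):
    """Spearman's rho computed as Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Made-up values: a positively skewed variable and a questionnaire-like score.
eeg = [0.5, 0.8, 1.1, 1.6, 2.4, 3.9, 7.5, 40.0]
score = [30.0, 34.0, 33.0, 38.0, 41.0, 40.0, 45.0, 47.0]
log_eeg = [math.log(v) for v in eeg]

rho_raw = spearman_rho(eeg, score)      # identical to rho_log
rho_log = spearman_rho(log_eeg, score)
r_raw = pearson_r(eeg, score)           # pulled around by the extreme value
r_log = pearson_r(log_eeg, score)
```

In this toy example the two Spearman coefficients coincide, while the Pearson coefficient changes once the skewed variable is log-transformed.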

Research Question 1:
Are these data non-normal? If so, what is the form and severity of that non-normality, and what normalisation procedure is recommended?
For both the EC and EO conditions, all 10 frontal EEG sites' alpha data were markedly non-normal, such that: (i) all associated SW and KS tests were significant at the p < 0.001 level; (ii) the kurtosis z-score for FT7 (EC) was 1.85, but in all other cases the skewness and kurtosis z-scores exceeded the ±3.29 threshold (i.e., all p < 0.05) [90]; and (iii) visual inspections of frequency histograms were consistent with these findings in determining that the distributions were markedly non-normal.
Research Question 2: If the recommended normalisation procedure is applied, does it reduce the level of non-normality to acceptable levels?
For all 10 sites under both EC and EO conditions, square root and inverse transformations were ineffective in producing normal data and thus are unreported here. In the EC conditions and across all 10 sites, logarithmic transformation improved normality metrics; no skewness or kurtosis z-scores exceeded the ±3.29 threshold after log-transformation, and all associated SW and KS tests were also non-significant (all p > 0.05). However, while this was also the case for RIN transformations of these same 10 frontal sites, visual inspection of the frequency distribution histograms determined that RIN was consistently superior in normalising these data when compared to log-transformation for all 10 frontal sites.
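How the candidate transformations can differ in their effect on skewness is sketched here on synthetic data. The sample is constructed to be log-normal in shape, which is an assumption made purely for illustration and is not a claim about the distribution of the study's EEG data; the variable names are likewise hypothetical.

```python
import math
from statistics import NormalDist


def skewness(values):
    """Moment-based skewness (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic sample: exp of evenly spaced standard normal quantiles,
# so that the logarithm recovers a symmetric set of values.
n = 99
quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
raw = [math.exp(q) for q in quantiles]

candidates = {
    "raw": raw,
    "square root": [math.sqrt(v) for v in raw],
    "logarithmic": [math.log(v) for v in raw],
    "inverse": [1.0 / v for v in raw],
}
skew_by_transform = {name: skewness(vals) for name, vals in candidates.items()}

# Transformation whose output is closest to symmetric.
best = min(skew_by_transform, key=lambda name: abs(skew_by_transform[name]))
```

For data of this shape the square root under-corrects and the inverse leaves the skewness as large as in the raw values, which is one way a logarithmic transformation can succeed where the other two do not.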
Similarly, for the frontal site data under the EO condition, none of the skewness and kurtosis z-scores exceeded the ±3.29 threshold after both logarithmic and RIN transformation. However, three sites were still non-normal following logarithmic transformations according to the KS tests. For the RIN-transformed data under the EO condition, all 10 frontal sites were successfully normalised according to the KS tests, but the SW tests found two sites were still non-normal: F4 (W(95) = 0.973, p = 0.049) and FC4 (W(95) = 0.961, p = 0.006). Visual inspections of histograms were consistent with these findings, and overall, these normality tests determined that the RIN transformation was again superior to the logarithmic transformation for the EO data but also that these two transformations were less effective at normalising the data under EO conditions when compared to the EC condition.
Research Question 3: Are there any meaningful differences between the results of statistical tests using the non-normalised versus the normalised data?
Results of the correlation analyses between FAA and SDS data are reported in Table 2, with Spearman's rho results shown in column 3, followed by each of the Pearson correlation results. Coefficients that reached the p < 0.1 level are shown in light grey, and those that reached the p < 0.05 level are depicted in darker grey for ease of reading. Examining the column data (i.e., the outcomes of each specific correlation procedure), Spearman's rho produced the only significant result at the p < 0.05 level (F8-F7 × SDS, EC condition). Pearson's correlation coefficients based on the raw EEG data produced only a single p < 0.1 result (FC4-FC3 × SDS, EO condition) and none when using either the log-transformed data or RIN-transformed data. The p < 0.1 Pearson correlation coefficient found for FC4-FC3 (EO) using untransformed EEG data was not found with Spearman, nor when logarithmic or RIN-transformed EEG data were used. Finally, in both the EO and EC conditions, the FP2-FP1 × SDS analyses produced positive correlations for the Spearman analysis of raw data but negative coefficients for all three Pearson calculations.

Discussion
The primary aim of this study was to evaluate the necessity for applying logarithmic normalisation procedures to EEG data when testing for the association between FAA and depression. A brief review of 50 published papers during the last 10 years indicated that log transformation of EEG data was evident in almost 75% of those 50 studies but that only one of those studies had tested for nonnormality in their raw data, and two performed a post-normalisation test to determine the effectiveness of that transformation in resolving nonnormality. This finding suggests that the process of normalisation is widespread and unquestioned but not usually applied in a logical sequence. That is, only one of these 50 studies examined the presence of any nonnormality in their EEG data (i.e., the severity of skewness and kurtosis) prior to normalisation, which is a necessary step in choosing which normalisation process to apply. Finally, statistical procedures applied in these 50 studies varied in their robustness to nonnormality. Although it is true that, with reasonably large samples, many of these procedures are usually robust to nonnormality if it is not severe, thus removing the necessity for normalisation, some studies had only limited size samples (e.g., [38], n = 24). Further, the commonplace use of Pearson correlation analysis with raw or logarithmic transformed data may be questioned simply because of the limitations of that procedure to accommodate non-normality under some conditions. On the basis of this aspect of the current investigation, it is yet to be proven that normalisation was required in 49 of the 50 studies.
Therefore, to run a test case of the need for normalisation and to identify which kind of normalisation was most relevant, a dataset of 100 participants' EEG data from 10 frontal sites was examined. All of these 10 sites produced EEG data that showed nonnormality as identified by two statistical tests, examination of the severity of skewness and kurtosis, and inspection of the histograms. Two standard transformations recommended for skewed data (square root, inverse) [21] did not produce normality. The commonly-used logarithmic transformation did reduce non-normality to acceptable levels in most of the variables studied, but not as effectively as the rank-based inverse normal (RIN) transformation, which was consistently superior in normalising these data when compared to log-transformation for all 10 frontal sites' EEG data. None of the 50 reviewed papers shown in Table 1 applied this transformation method, and it is relevant for consideration in future EEG studies where data have been demonstrated to be non-normal.
However, it is the results of the third stage of these investigations that hold the most important implications for studies of EEG data: the effect of transformation upon statistical outcomes from Pearson correlation analyses. Using Spearman's rho with untransformed EEG data provided a yardstick against which the Pearson results for the raw, log-transformed, and RIN-transformed data could be compared. It is apparent from Table 2 that the use of Pearson correlation analyses with raw, logarithmic, and RIN data is open to some uncertainty in the outcomes. On that basis, the very commonly used process of logarithmic transformation of EEG data is not the most reliable data analysis method for detecting associations between EEG FAA data and other variables such as depression. Instead, the application of a nonparametric procedure such as Spearman's rho with the untransformed EEG data may be more likely to produce valid results due to its ability to handle non-normal data irrespective of the nature of the distribution of that data. Further, if Spearman's rho results are accepted as the yardstick because of the ability of that statistic to process untransformed data, then it is of importance that none of the Pearson procedures produced significant coefficients at either the p < 0.1 or p < 0.05 level for the F8-F7 × SDS association under the eyes closed condition, thus potentially contributing to a Type II error. On the basis of these comparisons between the yardstick of Spearman's rho on raw EEG data and Pearson's r using raw, log-transformed, and RIN-transformed EEG data, there is no evidence indicating that transforming these EEG data to meet the assumptions of a parametric test produced appreciably stronger or more valid correlations compared to a nonparametric test on untransformed data.
Several additional comments are relevant to this discussion. First, it is argued elsewhere that transformation can hinder the interpretation of data and that researchers should instead use untransformed data with procedures that are robust to non-normality [92]. We would agree with this sentiment in the current case of EEG data used to test the FAA-depression hypothesis. Second, Spearman's rho accounts not only for non-normality in the EEG data (as in this example) but also for skewness and kurtosis in the SDS data, which is not so when transformations are made to the EEG data alone, as was the case in many of the 50 studies reviewed in Table 1. The Spearman analysis may be a preferable process to follow because the Pearson correlation assumes that the two variables being examined "follow a bivariate normal distribution in the population from which they were sampled" ([93], p. 1764), but psychological data (such as depression, and EEG spectral power) are pervasively non-normal [94]. Third, despite these limitations upon the use of Pearson correlational procedures in general and particularly with smaller datasets, if statistical power is maximised (i.e., by utilising large enough samples to detect the effect size of interest) and outliers are dealt with, Pearson correlations may be relatively robust to violations of normality, although the issue of non-normality in the second variable (i.e., depression) also needs to be considered. Finally, ANOVA-based models are highly robust to violations of normality [21,95] when wishing to answer research questions based upon comparisons between sample means. Therefore, the necessity for data normalisation when employing such statistical procedures is questionable.

Conclusions
Although it is widespread in the FAA literature, the application of logarithmic transformations is not easily defended for EEG data. Similarly, other forms of EEG data transformation may be unnecessary, especially when the analysis performed is robust to violations of normality (e.g., ANOVA). Spearman's rho is but one nonparametric procedure and there are others, such as generalised estimating equations (GEE), which does not assume a