On the variability of functional connectivity and network measures in source-reconstructed EEG time-series

The idea of estimating the statistical interdependence among (interacting) EEG signals has motivated numerous researchers to investigate how the resulting networks may reorganize themselves under any conceivable scenario. Even though this idea is not at its initial stages, its application is still far from widespread. One contributing cause may be the proliferation of different approaches that promise to capture the underlying correlation among the (interacting) units. This issue has probably contributed to hindering the comparison among different studies. Not only do all these approaches go under the same name (functional connectivity), but they have also often been tested and validated using different methods, therefore making it difficult to understand to what extent they are similar or not. In this study, we aim to compare a set of different approaches commonly used to estimate functional connectivity on a public EEG dataset representing a possible realistic scenario. Our results show that source-level EEG functional connectivity estimates and the derived network measures display a substantial dependency on the arbitrary choice of the selected connectivity metric. The observed variability reflects an ambiguity that should always be discussed when reporting findings based on any connectivity metric.


Introduction
The idea of estimating the statistical interdependence among (interacting) EEG signals, generally referred to as functional connectivity (Fingelkurts et al., 2005; Lee et al., 2003; Sakkalis, 2011), has motivated numerous researchers to investigate how the resulting networks may reorganize themselves, in light of the importance of the whole (Sizemore et al., 2018), under any conceivable scenario. This phenomenon seems of particular relevance since brain function critically depends not only on functional segregation but also on functional integration, which relates to the pattern of interactions between brain regions (Schoffelen and Gross, 2009). In general, functional connectivity may be investigated both at scalp- and at source-level. Nevertheless, it has been extensively shown that the two approaches may lead to important differences in the reported results (Anzolin et al., 2019; Lai et al., 2018), as at scalp-level the EEG signals are more corrupted by the effects of field spread. Although this problem cannot be considered completely absent at source-level, it appears to be substantially attenuated in the latter case (Brookes et al., 2012).
Countless studies have reported how these correlation patterns can be associated with behavioral/clinical parameters or used to contrast different groups/conditions (Stam, 2014). Even though this idea is not at its initial stages, its application is still far from widespread. One contributing cause may be the proliferation of different approaches aimed at estimating the correlation among these signals (Hassan et al., 2014; Kida et al., 2016; Olejarczyk et al., 2017; Schoffelen and Gross, 2009). Despite their substantial differences, all these approaches promise to capture the underlying correlation among the (interacting) units. The tacit idea that all these metrics may measure the same connectivity may lead to inaccurate interpretations. Here, we want to investigate whether the arbitrary choice of the adopted connectivity metric may have a severe impact on the results in a realistic scenario. Indeed, different functions could measure different characteristics, elements or aspects of connectivity, even if always related to the 'true' functional connectivity. The use of the same name has probably contributed to generating confusion and to hindering the comparison among different studies. Indeed, not only do these approaches (based on linear or nonlinear relations, computed in the time or frequency domain, using amplitude or phase information) go under the same name (functional connectivity), but they have also been tested and validated using different methods, both simulation (Astolfi et al., 2007; Mahjoory et al., 2017) and empirical studies (Astolfi et al., 2007; Olejarczyk et al., 2017), therefore making it even more difficult to understand to what extent they are similar or not.
In this study, we aimed to compare a set of different approaches commonly used to estimate functional connectivity on a public EEG dataset (Goldberger Ary L. et al., 2000; Schalk et al., 2004) representing a possible realistic scenario. Ten different connectivity metrics were included in the analysis. Furthermore, five different thresholding approaches were used in order to investigate several network measures. The proposed scenario consists of contrasting two different resting-state EEG conditions, eyes-closed and eyes-open, on 109 subjects recorded with a 64-channel system, where the EEG signals were successively reconstructed at source-level and projected onto the Desikan-Killiany atlas (Desikan et al., 2006). Other than the inherent computational differences among connectivity metrics, it is relevant to highlight that other methodological issues could have an effect on the reported findings, such as the problem of field spread, volume conduction, and reference montages (van Diessen et al., 2015). For this reason, we decided to perform the analysis using both metrics that are prone to an erroneous estimate of connectivity and metrics that try to remove these effects prior to computing the connectivity, including phase-based metrics that are less sensitive to these spurious interactions (van Diessen et al., 2015). Moreover, since network density (the number of connections in a network) directly influences the estimated network measures (Wijk et al., 2010), we performed the analysis using three different densities (preserving 10, 15 and 20% of the weights) and two methods for filtering information in complex brain networks that help to overcome the problem of network density in network analytical studies, namely the minimum spanning tree and the efficiency cost optimization approach (De Vico Fallani et al., 2017).
All the reported results refer to the alpha (8-13 Hz) frequency band and the analysis was performed using the MNE-Python software (Gramfort et al., 2013) and the Brain Connectivity Toolbox for MATLAB (Rubinov and Sporns, 2010).
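The connectivity metrics themselves were computed with MNE-Python; as an illustration of how some of the phase-based metrics differ computationally, the following minimal NumPy/SciPy sketch (the function name and the synthetic signals are ours, not part of the original pipeline) estimates the PLI, the wPLI, and the imaginary part of coherency from band-limited analytic signals:

```python
import numpy as np
from scipy.signal import hilbert

def phase_metrics(x, y):
    """PLI, wPLI and imaginary part of coherency for two signals.

    x, y: 1-D arrays already band-pass filtered to the band of
    interest (e.g. alpha, 8-13 Hz).
    """
    sx, sy = hilbert(x), hilbert(y)
    cross = sx * np.conj(sy)            # cross-spectrum over time
    icross = np.imag(cross)
    # PLI: consistency of the sign of the phase difference.
    pli = np.abs(np.mean(np.sign(icross)))
    # wPLI: imaginary parts weighted by their magnitude.
    wpli = np.abs(np.mean(icross)) / np.mean(np.abs(icross))
    # Imaginary part of coherency: insensitive to zero-lag coupling.
    coh = np.mean(cross) / np.sqrt(
        np.mean(np.abs(sx) ** 2) * np.mean(np.abs(sy) ** 2))
    imcoh = np.imag(coh)
    return pli, wpli, imcoh

# Two noisy 10 Hz oscillations with a fixed pi/4 phase lag: all three
# metrics should detect the lagged coupling.
fs = 160
t = np.arange(0, 12, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t - np.pi / 4) + 0.1 * rng.standard_normal(t.size)
pli, wpli, imcoh = phase_metrics(x, y)
```

Note that all three metrics discard (PLI, wPLI) or attenuate (imcoh) zero-lag interactions, which is exactly why they are less sensitive to field spread than, for example, plain coherence.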

Dataset.
In order to test our hypothesis, we used a public and freely available EEG dataset (Goldberger Ary L. et al., 2000; Schalk et al., 2004) consisting of a set of recordings performed on 109 subjects, including resting-state eyes-closed and eyes-open recordings, each one lasting 1 minute. The EEG traces were recorded from 64 electrodes as per the international 10-10 system with a sampling frequency equal to 160 Hz. All the EEG recordings are available at the following link: https://physionet.org/content/eegmmidb/1.0.0/.

Preprocessing.
The EEGLAB toolbox (version 13_6_5b) (Delorme and Makeig, 2004) was used to re-reference the signals to common average reference. Subsequently, ADJUST (version 1.1.1) (Mognon et al., 2011), a fully automatic algorithm based on Independent Component Analysis (ICA), was used to detect and remove artifacts from the EEG signals. The source-level EEG signals were then reconstructed using the Brainstorm software (version 3.4) (Tadel et al., 2011), with the head model created using a symmetric boundary element method in OpenMEEG (version 2.3.0.1) (Gramfort et al., 2010) based on the ICBM152 brain anatomy. The whitened and depth-weighted linear L2 minimum norm estimate (wMNE) (Hämäläinen and Ilmoniemi, 1994) was used, with an identity matrix as noise covariance. The source-reconstructed EEG time-series were projected onto the Desikan-Killiany atlas (Desikan et al., 2006), which includes 68 regions of interest (ROIs); the time-series for voxels within a ROI were averaged after flipping the sign of sources with opposite directions. The subsequent analysis was performed using five non-overlapping epochs of 12 seconds, which is in line with what is reported in (Fraschini et al., 2016).
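The within-ROI aggregation step (sign flip, then average) can be sketched as follows. This is not Brainstorm's exact implementation: in this minimal NumPy illustration the flip signs are taken from the first left singular vector of the voxel time-series, which is one common way to identify sources with opposite orientation.

```python
import numpy as np

def roi_timeseries(voxel_ts):
    """Average voxel time-series within one ROI after sign flipping.

    voxel_ts: array (n_voxels, n_samples) of source time-series.
    Voxels anti-correlated with the dominant spatial component get
    their sign flipped before averaging, so that sources with
    opposite orientations do not cancel out.
    """
    # Dominant spatial pattern = first left singular vector.
    u, _, _ = np.linalg.svd(voxel_ts, full_matrices=False)
    signs = np.sign(u[:, 0])
    signs[signs == 0] = 1.0
    return (signs[:, None] * voxel_ts).mean(axis=0)

# Three voxels carrying the same 10 Hz source, one with flipped
# orientation: naive averaging attenuates the signal, sign flipping
# preserves it.
t = np.arange(0, 1, 1 / 160)
src = np.sin(2 * np.pi * 10 * t)
voxels = np.vstack([src, src, -src])
naive = voxels.mean(axis=0)      # amplitude reduced to one third
flipped = roi_timeseries(voxels) # full amplitude (up to global sign)
```

The global sign of the resulting ROI time-series is arbitrary, which is harmless for the connectivity metrics used here.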

Network measures.
The network analysis was performed using three different densities, WEI10 (10% of weights preserved), WEI15 (15% of weights preserved), and WEI20 (20% of weights preserved). Furthermore, two methods for filtering information in complex brain networks that help to overcome the problem of network density in network analytical studies, namely the minimum spanning tree and the efficiency cost optimization approach (De Vico Fallani et al., 2017), were also added to the analysis. This analysis was performed using the Brain Connectivity Toolbox for MATLAB (Rubinov and Sporns, 2010).
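The two kinds of network filtering used here, proportional density thresholding and the MST backbone, can be sketched as follows. The actual analysis used the Brain Connectivity Toolbox, so the function names and the synthetic matrix below are illustrative assumptions:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def threshold_density(W, density):
    """Keep only the strongest `density` fraction of off-diagonal weights."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    iu = np.triu_indices_from(W, k=1)
    n_keep = int(round(density * iu[0].size))
    cutoff = np.sort(W[iu])[::-1][n_keep - 1]
    W[W < cutoff] = 0.0
    return W

def mst_backbone(W):
    """Minimum spanning tree of the connectivity graph.

    Strong connections should be kept, so the tree is computed on
    inverted weights (1 / w) and then mapped back to the originals.
    """
    with np.errstate(divide="ignore"):
        dist = np.where(W > 0, 1.0 / W, 0.0)
    tree = minimum_spanning_tree(dist).toarray()
    tree = tree + tree.T               # symmetrise
    return np.where(tree > 0, W, 0.0)

rng = np.random.default_rng(1)
n = 68                                 # Desikan-Killiany ROIs
A = rng.random((n, n))
W = (A + A.T) / 2                      # symmetric weighted "connectivity"
np.fill_diagonal(W, 0.0)
W10 = threshold_density(W, 0.10)       # WEI10
T = mst_backbone(W)                    # MST: exactly n - 1 edges
```

Since the MST always contains exactly n - 1 edges regardless of the underlying weights, it removes network density as a confound entirely, at the cost of discarding all cycles.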

Statistical analysis.
In order to contrast the two conditions, namely eyes-closed and eyes-open resting state, the Wilcoxon signed-rank test was used. The statistical results are reported in terms of p-value, effect size, and direction of the effect. The alpha level, equal to 0.05, was corrected for the number of measures extracted for each analysis.
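For a single network measure, this procedure can be sketched as below. Note that the paper does not state which effect-size definition was used: the matched-pairs rank-biserial correlation computed here is our assumption, as are all names and the synthetic data.

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_test(ec, eo, n_measures):
    """Wilcoxon signed-rank test with a Bonferroni-corrected alpha.

    ec, eo: per-subject values of one network measure in the
    eyes-closed and eyes-open conditions. The effect size is the
    matched-pairs rank-biserial correlation (an assumption: the
    paper does not specify its effect-size definition).
    """
    ec, eo = np.asarray(ec), np.asarray(eo)
    stat, p = wilcoxon(ec, eo)
    d = ec - eo
    n = np.count_nonzero(d)            # zero differences are dropped
    total = n * (n + 1) / 2            # sum of all ranks
    es = abs(2 * stat / total - 1)     # rank-biserial, in [0, 1]
    direction = "EC > EO" if np.median(d) > 0 else "EC < EO"
    significant = p < 0.05 / n_measures
    return p, es, direction, significant

# Synthetic example: a measure systematically higher with eyes closed,
# tested across 109 "subjects" with Bonferroni correction for 4 measures.
rng = np.random.default_rng(2)
eo = rng.normal(0.5, 0.1, 109)
ec = eo + rng.normal(0.05, 0.05, 109)
p, es, direction, sig = paired_test(ec, eo, n_measures=4)
```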

Results
The global connectivity patterns (averaged over all the subjects) for each connectivity metric and for each condition (eyes-closed and eyes-open) are depicted in Figure 1. Different scales were used in order to allow a better visualization of the underlying connectivity patterns. The results from the WEI10 approach (where 10% of the weights were preserved) are summarized in Table 1. A significant difference between the two conditions in global efficiency was observed for two out of the ten connectivity metrics, namely cohy (p=1.28E-05, ES=0.42) and imcoh (p=1.14E-05, ES=0.24). All the connectivity metrics revealed a significant difference for the clustering coefficient, whilst five out of ten revealed a significant difference for the assortativity and seven out of ten for the modularity. The results from the WEI15 approach (where 15% of the weights were preserved) are summarized in Table 2. A significant difference between the two conditions in global efficiency was observed for seven out of the ten connectivity metrics. All the connectivity metrics revealed a significant difference for the clustering coefficient, whilst four out of ten revealed a significant difference for the assortativity and six out of ten for the modularity. In this case, four out of ten connectivity metrics, namely ciplv, pli, pli_unbiased and wpli_debiased, revealed differences between the two conditions for all the network measures. The results from the WEI20 approach (where 20% of the weights were preserved) are summarized in Table 3. A significant difference between the two conditions in global efficiency was observed for the same seven connectivity metrics as described using the WEI15 approach. All the connectivity metrics revealed a significant difference for the clustering coefficient, whilst five out of ten revealed a significant difference for the assortativity and seven out of ten for the modularity.
In this case, five out of ten connectivity metrics, namely ciplv, cohy, pli, pli_unbiased and wpli_debiased, revealed differences between the two conditions for all the network measures.
Table 3. Statistical results for the WEI20 (20% of weights preserved) approach where eyes-open and eyes-closed conditions were contrasted. For each connectivity metric and network measure, the p-value, the effect size (ES) and the direction (D) of the effect are reported.
The results from the ECO approach are summarized in Table 4. A significant difference between the two conditions in global efficiency was observed for only one connectivity metric, namely cohy (p=5.90E-04, ES=0.33). Three out of the ten connectivity metrics revealed a significant difference for the clustering coefficient, whilst six out of ten revealed a significant difference for the assortativity and none for the modularity. The results from the MST approach are summarized in Table 5. A significant difference between the two conditions in leaf fraction was observed for six out of the ten metrics. Six out of the ten connectivity metrics revealed a significant difference for the kappa parameter, whilst none for the diameter, eccentricity and hierarchy parameters.

Discussion and conclusions
In this study, we compared ten different connectivity metrics in a realistic scenario where two different resting-state conditions, namely eyes-closed and eyes-open, were contrasted. In order to assess the possible differences induced by these different connectivity metrics, we reported the results in terms of statistical significance, effect size and direction of the effect (eyes-closed - eyes-open), using three different densities (preserving 10, 15 and 20% of the weights) and two methods for filtering information in complex brain networks that help to overcome the problem of network density in network analytical studies, namely the minimum spanning tree and the efficiency cost optimization approach (De Vico Fallani et al., 2017). The main result of this study confirms the hypothesis that different connectivity metrics, as well as different thresholding approaches, lead to relevant differences in the final results, even in a very simple realistic scenario where the underlying effect, especially in the alpha band, is very well known (Barry et al., 2007; Li, 2010; Tan et al., 2013). In particular, the results show that only for the clustering coefficient is it possible to observe statistical significance for all the connectivity metrics, and only for the WEI10, WEI15, and WEI20 approaches. However, the effect size shows considerable variance among the different metrics and thresholding methods. A more evident consistency among the connectivity metrics can be observed as density increases: for WEI15 and WEI20 the number of metrics that show the same results is higher. In contrast, the use of efficiency cost optimization and minimum spanning tree tends to amplify the differences. It is, however, important to highlight that the direction of the effect is always consistent across all the connectivity metrics and all the thresholding approaches.
In our opinion, these findings confirm that the (generally) arbitrary choice of the adopted connectivity metric may have a severe impact on the conclusions reported in the current literature on functional connectivity in EEG. As a consequence, we suggest avoiding using the term functional connectivity interchangeably for any connectivity metric, since this may lead to an erroneous belief in the generalizability of the results. We would also like to stress that this problem, the generalization of results based on one arbitrary connectivity metric, may be even more relevant when the underlying effects are more subtle and less trivial (i.e., effects of treatment or comparisons between healthy and pathological groups) or when individual variability may have an even stronger effect (Fraschini et al., 2019; Rajapandian et al., 2020).
An important limitation of the present study is related to the possible influence of the source localization method. In fact, it has been previously shown, in a simulation study (Mahjoory et al., 2017), that the choice of the inverse method and source imaging package may induce considerable variability in the functional connectivity estimate. In any case, we may speculate that this possible effect adds even more variability and uncertainty to the reported findings. It is also important to highlight that several other issues may play a relevant role in network analysis (Hallquist and Hillary, 2018).
In conclusion, our results show that source-level EEG functional connectivity estimates and the derived network measures display a considerable dependency on the arbitrary choice of the selected connectivity metric. This variability introduces uncertainty and ambiguity into the final results, an issue that should always be discussed when reporting findings based on functional connectivity. Since it is very difficult to conclude on the superiority of one particular metric over the others, we suggest that researchers report functional connectivity results based on more than one connectivity metric.