Functional Connectome of the Human Brain with Total Correlation

Recent studies proposed the use of Total Correlation to describe functional connectivity among brain regions as a multivariate alternative to conventional pairwise measures such as correlation or mutual information. In this work, we build on this idea to infer a large-scale (whole-brain) connectivity network based on Total Correlation and show the possibility of using this kind of network as a biomarker of brain alterations. In particular, this work uses Correlation Explanation (CorEx) to estimate Total Correlation. First, we show that CorEx estimates of Total Correlation and its clustering results are reliable when compared to ground-truth values. Second, the large-scale connectivity network inferred from larger open fMRI datasets is consistent with existing neuroscience studies but, interestingly, also captures additional relations beyond pairwise regions. Finally, we show how connectivity graphs based on Total Correlation can be an effective tool to aid in the discovery of brain diseases.


Introduction
The human brain is a complex system comprised of interconnected functional units. Millions of neurons in the brain interact with each other at both a structural and functional level to drive efficient inference and processing in the brain. Furthermore, the functional connectivity among these regions also reveals how they interact with each other in specific cognitive tasks. Functional connectivity refers to the statistical dependency of activation patterns between various brain regions that emerges as a result of direct and indirect interactions [1,2]. It is usually measured by how similar neural time series are to each other, and it shows how the time series statistically interact with each other.
A variety of ways to analyze functional connectivity exist. A seed-wise analysis can be performed by selecting a seed driven by a hypothesis and analyzing its statistical dependencies with all other voxels outside its limits. It is a common tool for studying how different parts of the brain are connected to one another: connectivity is determined by calculating the correlation between the time series of each voxel in the brain and the time series of a single seed voxel. Another option is to perform a wide analysis at the voxel or region of interest (ROI) level, where the statistical dependencies among all voxels or ROIs are studied [3]. Structural connectivity refers to the anatomical organization of the brain by means of fiber tracts [4]. The sharing of communication between neurons in multiple regions is coordinated dynamically via changes in neural oscillation synchronization [5]. When it comes to the brain connectome, functional connectivity refers to how different areas of the brain communicate with one another during task-related or resting-state activities [6].

November 15, 2022

Information-theoretic metrics can efficiently detect interactions in dynamical brain networks, and they are widely used in the field of neuroscience [7]. For instance, they have been used to quantify information encoding and decoding in the neural system [8-11], to measure visual information flow in biological neural networks [12,13], and to study color information processing in the neural cortex [14]. However, although functional connectivity has already become a hot research topic in neuroscience [15,16], systematic studies on the information flow or the redundancy and synergy amongst brain regions remain limited. One extreme type of redundancy is full synchronization, where the state of one neural signal may be used to predict the state of any other neural signal; this concept of redundancy is thus viewed as an extension of the standard notion of correlation to more than two variables [17]. Synergy, on the other hand, is analogous to those statistical correlations that govern the whole but not its constituent components [18]. High-order brain functions are assumed to require synergies, which give simultaneous local independence and global cohesion; such functions are less well supported under highly synchronized situations, such as epileptic seizures [19].

Most functional connectivity approaches until now have concentrated on pairwise relationships between two regions. The conventional approaches used to estimate indirect functional connectivity among brain regions are Pearson correlation (CC) [20] and Mutual Information (I) [8,21-23]. However, real brain network relationships are often complex, involving more than two regions, and the pairwise dependencies measured by correlation or mutual information cannot reflect these multivariate dependencies. Therefore, recent studies in neuroscience focus on the development of information-theoretic measures that can handle more than two regions simultaneously, such as the Total Correlation [24,25].
Total Correlation (TC) [26] (also known as multi-information [27-29]) describes the amount of dependence observed in the data and, by definition, can be applied to multiple multivariate variables. Its use to describe functional connectivity in the brain was first proposed as an empirical measure in [24], and in [25] the superiority of TC over mutual information was proved analytically. Considering low-level vision models makes it possible to derive analytical expressions for the TC as a function of the connectivity. These analytical results show that pairwise I cannot capture the effect of different intra-cortical inhibitory connections while the TC can. Similarly, in analytical models with feedback, synergy can be shown using TC, while it is not so obvious using mutual information [25]. Moreover, these analytical results make it possible to calibrate computational estimators of TC.
In this work we build on these empirical and theoretical results [24,25] to infer a larger-scale (whole-brain) network based on TC for the first time. As opposed to [24,25], where the number of considered nodes was limited to the range of tens and focused on specialized subsystems, here we consider wider recordings [30,31], so we use signals coming from hundreds of nodes across the whole brain. Additionally, we apply our analysis to data of the same scale for regular and altered brains, and we show the possibility of using this kind of wide-range network as a biomarker. From the technical point of view, here we use Correlation Explanation (CorEx) [32,33] to estimate TC in these high-dimensional scenarios. Furthermore, graph theory and clustering [15,16] are used here to represent the relationships between the considered regions.
The rest of this paper is organized as follows: Section 2 introduces the necessary information-theoretic concepts and explains CorEx. Sections 3 and 4 present two synthetic experiments that indicate that CorEx results are reliable. Section 5 estimates the large-scale connectomes with fMRI datasets that involve more than 100 regions across the whole brain. Moreover, we show how the analysis of these large-scale networks based on TC may indicate brain alterations. Sections 6 and 7 give a general discussion and the conclusion of the paper, respectively.
2 Total Correlation as neural connectivity descriptor

Definitions and Preliminaries
Mutual Information: Given two multivariate random variables $X_1$ and $X_2$, the mutual information between them, $I(X_1; X_2)$, can be calculated as the difference between the sum of individual entropies, $H(X_i)$, and the entropy of the variables considered jointly as a single system, $H(X_1, X_2)$ [34]:

$$I(X_1; X_2) = H(X_1) + H(X_2) - H(X_1, X_2) \tag{1}$$

where for each (multivariate) random variable $v$, the entropy is $H(v) = \langle -\log_2 p(v) \rangle$ and the brackets represent expectation values spanning random variables. The mutual information can also be seen as the information shared by the two variables or the reduction of uncertainty in one variable given the information about the other [35].
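As a minimal illustration of the entropy-based definition of mutual information, the sketch below uses a plug-in (histogram) estimator for discrete samples. The function names `entropy` and `mutual_information` are hypothetical helpers introduced here; this is not the estimator used in the experiments, and plug-in estimates are known to be biased for small samples.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(x1, x2):
    """I(X1; X2) = H(X1) + H(X2) - H(X1, X2) for paired discrete samples."""
    joint = list(zip(x1, x2))  # each pair is one symbol of the joint variable
    return entropy(x1) + entropy(x2) - entropy(joint)


a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
print(mutual_information(a, a))  # a variable shares all of its entropy with itself
print(mutual_information(a, b))  # empirically independent pair
```

In this toy case the first value equals the entropy of `a` and the second vanishes, which matches the reading of mutual information as shared information.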
Mutual information is better than linear correlation: For Gaussian sources mutual information reduces to linear correlation because the entropy factors in Eq. 1 just depend on $|X_1 \cdot X_2|$. However, for more general (non-Gaussian) sources mutual information cannot be reduced to covariance and cross-covariance matrices. In these (more realistic) situations I is better than the linear correlation because I captures nonlinear relations that are ruled out by a purely covariance-based description. For an illustration of the qualitative differences between I and linear correlation see the examples in Section 2.2 of [24].
As a result, mutual information has been proposed as a good alternative to linear correlation for estimating functional connectivity [8,21]. However, mutual information cannot capture dependencies beyond pairs of nodes, and this may be a limitation in complex networks [36].
Total Correlation: This magnitude describes the dependence among n variables, and it is a generalization of the mutual information concept from two parties to n parties. The Venn diagram in Fig. 1 qualitatively illustrates this for three variables. The definition of total correlation from Watanabe [26] can be written as

$$TC(X) = \sum_{i=1}^{n} H(X_i) - H(X) = D_{KL}\left( p(X) \,\Big\|\, \prod_{i=1}^{n} p(X_i) \right) \tag{2}$$

where $X \equiv (X_1, \ldots, X_n)$, and TC can also be expressed as the Kullback-Leibler divergence, $D_{KL}$, between the joint probability density and the product of the marginal densities. From these definitions, if all variables are independent then TC will be zero.
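For intuition, the total correlation has a closed form when the variables are jointly Gaussian: the sum of marginal entropies minus the joint entropy reduces to half the difference between the sum of log-variances and the log-determinant of the covariance. The sketch below evaluates that expression with NumPy; the helper name `gaussian_total_correlation` is hypothetical, and the Gaussian assumption is ours for illustration only, since real neural signals need not be Gaussian.

```python
import numpy as np


def gaussian_total_correlation(cov):
    """Total correlation (in bits) of a Gaussian with covariance matrix `cov`.

    Uses TC = sum_i H(X_i) - H(X); the (2*pi*e) factors of the Gaussian
    entropies cancel, leaving 0.5 * (sum_i log var_i - log det cov).
    """
    cov = np.asarray(cov, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    tc_nats = 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)
    return tc_nats / np.log(2.0)


independent = np.eye(3)
coupled = np.array([[1.0, 0.5], [0.5, 1.0]])
print(gaussian_total_correlation(independent))  # independent variables
print(gaussian_total_correlation(coupled))      # two correlated variables
```

For two variables this coincides with the Gaussian mutual information, which is consistent with TC being a generalization of I from two parties to n parties.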
The conditional total correlation is defined like the total correlation but with a condition appended to each term; equivalently, it can be defined through the Kullback-Leibler divergence of the two conditional probability distributions. The estimation method used in this work (CorEx, presented in the next subsection) uses the TC after conditioning on some other variable $Y$, which can be defined as [34],

$$TC(X|Y) = \sum_{i=1}^{n} H(X_i|Y) - H(X|Y) = D_{KL}\left( p(X|Y) \,\Big\|\, \prod_{i=1}^{n} p(X_i|Y) \right) \tag{3}$$

Total correlation is better than mutual information: This superiority is not only due to the obvious n-wise versus pairwise definitions in Eqs. 1 and 2. It also has to do with the different properties of these magnitudes. To illustrate this point let us recall one of the analytical examples in [25]. Consider the following feedforward network:

$$X_1 \longrightarrow X_2 \longrightarrow e \xrightarrow{\; f(\cdot) \;} X_3 \tag{4}$$

where the nodes $X_1$, $X_2$, $e$, and $X_3$ can have any number of neurons, the first two transforms, $X_1 \rightarrow X_2 \rightarrow e$, are linear and affected by additive noise, and the last transform, $f(\cdot)$, is nonlinear but deterministic. Imagine that in this network one is interested in the connectivity between the neurons in the hidden layer, $e$, but the nonlinear function $f(\cdot)$ is unknown and one only has experimental access to the signal in the regions $X_1$, $X_2$, and $X_3$. In this situation one could think of measuring $I(X_1, X_3) = I(X_1, f(e))$ or $I(X_2, X_3) = I(X_2, f(e))$. However, the invariance of I under arbitrary nonlinear re-parametrization of the variables [35] implies that these measures are insensitive to $f$ and the connectivity therein. On the contrary, as pointed out in [25], using the expression for the variation of TC under nonlinear transforms [13,37], the variation of H under nonlinear transforms [34], and the definition in Eq. 2, one obtains an expression for $TC(X_1, X_2, X_3)$ in which the term in the bracket does not depend on $f(\cdot)$, but the last term definitely does, which proves the superiority of TC over I in describing connectivity. In [25] the network in Eq. 4 specifically refers to the flow from the retina, $X_1$, to the LGN, $X_2$, and finally to the visual cortex, $e$ and $X_3$. However, the superiority of TC over I in describing the connectivity in the hidden layer is totally general for every network with the generic properties listed after Eq. 4.

Total Correlation estimated from CorEx
Straightforward application of the direct definition of TC is not feasible in high-dimensional scenarios, and alternatives are required [28,29]. A practical approach to estimate total correlation is via latent factor modelling. A latent factor model is a statistical model that relates a set of observable variables to a set of latent variables. The idea is to explicitly construct latent factors, $Y$, that somehow capture the dependencies in the data. If we measure dependencies via total correlation, $TC(X)$, then we say that the latent factors explain the dependencies if $TC(X|Y) = 0$. We can measure the extent to which $Y$ explains the correlations in $X$ by looking at how much the total correlation is reduced:

$$TC(X) - TC(X|Y) = \sum_{i=1}^{n} I(X_i; Y) - I(X; Y) \tag{5}$$

The total correlation is always non-negative, and the decomposition on the right in terms of mutual information can be verified directly from the definitions. Any latent factor model can be used to lower bound total correlation, and the terms on the right-hand side of Eq. 5 can be further lower-bounded with tractable estimators using variational methods; Variational Autoencoders (VAEs) are a popular example [38].
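The statement that the reduction in total correlation decomposes into mutual information terms can be checked numerically on a small discrete example. The sketch below draws a random joint distribution over two binary observed variables and one three-state latent variable (an arbitrary toy choice of ours) and compares $TC(X) - TC(X|Y)$ with $\sum_i I(X_i; Y) - I(X; Y)$, using only entropies of marginals.

```python
import numpy as np


def H(p):
    """Shannon entropy in bits of a probability array of any shape."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


rng = np.random.default_rng(0)
p = rng.random((2, 2, 3))  # axes: X1, X2, Y (toy joint distribution)
p /= p.sum()

h_x1 = H(p.sum(axis=(1, 2)))
h_x2 = H(p.sum(axis=(0, 2)))
h_y = H(p.sum(axis=(0, 1)))
h_x = H(p.sum(axis=2))    # H(X1, X2)
h_x1y = H(p.sum(axis=1))  # H(X1, Y)
h_x2y = H(p.sum(axis=0))  # H(X2, Y)
h_xy = H(p)               # H(X1, X2, Y)

# Left-hand side: TC(X) - TC(X|Y), with H(A|Y) = H(A, Y) - H(Y).
tc_x = h_x1 + h_x2 - h_x
tc_x_given_y = (h_x1y - h_y) + (h_x2y - h_y) - (h_xy - h_y)
lhs = tc_x - tc_x_given_y

# Right-hand side: sum_i I(X_i; Y) - I(X; Y).
rhs = (h_x1 + h_y - h_x1y) + (h_x2 + h_y - h_x2y) - (h_x + h_y - h_xy)

print(lhs, rhs)  # the two sides agree up to floating-point error
```

The check is purely algebraic, so it holds for any joint distribution; the random draw only avoids hand-picking a special case.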
Although latent factor models do not give a direct total correlation estimate, as Rotation-based Iterative Gaussianization (RBIG) [28,29] and the matrix-based Rényi's entropy [39] do, the approach can be complementary because the construction of latent factors can help in dealing with the curse of dimensionality and in interpreting the dependencies in the data. In contrast to CorEx, the main goal of RBIG is to convert any non-Gaussian data into Gaussian data through marginal Gaussianization and rotation, and to obtain TC from that transformation. The matrix-based Rényi's entropy is mainly used for estimating multivariate information through Rényi's α-order entropy [40], a generalization of Shannon's entropy. With these goals in mind, we now describe a particular latent factor approach known as Total Cor-relation Ex-planation (CorEx) [32].
CorEx constructs a factor model by reconstructing latent factors using a factorized probabilistic function of the input data, $p(y|x) = \prod_{j=1}^{m} p(y_j|x)$, with $m$ discrete latent factors, $Y_j$. This function is optimized to give the tightest lower bound possible for Eq. 5.
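The factorization of the latent leads to the terms $I(X; Y) = \sum_j I(Y_j; X)$, which can be directly calculated. The term $\sum_i I(X_i; Y)$ can be lower-bounded in terms of the pairwise terms $I(X_i; Y_j)$, which gives the bound

$$TC(X) \geq \sum_{j=1}^{m} \left( \sum_{i=1}^{n} \alpha_{i,j} I(X_i; Y_j) - I(Y_j; X) \right) \tag{6}$$

Each $I(X_i; Y_j)$ can then be tractably estimated [32,33]. There are free parameters $\alpha_{i,j}$ that must be updated while searching for the latent factors and optimizing the objective function. The $\alpha_{i,j}$ are initialized at $t = 0$ and then updated according to:

$$\alpha_{i,j}^{t+1} = (1 - \lambda)\, \alpha_{i,j}^{t} + \lambda\, \alpha_{i,j}^{**} \tag{7}$$

The second term is $\alpha_{i,j}^{**} = \exp\left( \gamma \left( I(X_i : Y_j) - \max_j I(X_i : Y_j) \right) \right)$, and $\lambda$ and $\gamma$ are constant parameters. This decomposition allows us to quantify the contribution to the total correlation bound from each latent factor, which can aid interpretability.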
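To make the role of $\alpha^{**}_{i,j}$ concrete, the following sketch applies one smoothed update of the form $\alpha^{t+1} = (1-\lambda)\,\alpha^{t} + \lambda\,\alpha^{**}$, with $\alpha^{**}_{i,j} = \exp(\gamma (I(X_i : Y_j) - \max_j I(X_i : Y_j)))$. This is only a toy illustration under our own assumptions: the function name `update_alpha`, the values of `lam` and `gamma`, and the mutual information matrix are all hypothetical, and the snippet is not the CorEx implementation.

```python
from math import exp


def update_alpha(alpha, mi, lam=0.3, gamma=5.0):
    """One smoothed update of the alpha weights.

    alpha, mi: lists of rows indexed [i][j] (observed variable i, latent factor j).
    lam and gamma are illustrative constants, not values taken from the paper.
    """
    new_alpha = []
    for alpha_row, mi_row in zip(alpha, mi):
        best = max(mi_row)  # max over latent factors j for this variable
        target = [exp(gamma * (value - best)) for value in mi_row]
        new_alpha.append(
            [(1 - lam) * a + lam * t for a, t in zip(alpha_row, target)]
        )
    return new_alpha


mi = [[0.8, 0.1],   # variable 0 is mostly informative about factor 0
      [0.2, 0.2]]   # variable 1 is equally informative about both factors
alpha = [[0.5, 0.5], [0.5, 0.5]]
for _ in range(50):
    alpha = update_alpha(alpha, mi)
print(alpha)
```

With these toy numbers the weight linking variable 0 to factor 1 decays towards a small value, while ties (variable 1) keep both weights near one, which is the soft assignment behaviour that encourages tree-like structure.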
CorEx can be further extended into a hierarchy of latent factors [33], helping to reveal the hierarchical structure that we expect to play an important role in the brain. The latent factors at layer k explain the dependence of the variables in the layer below.
Here k gives the layer and $Y^{0} \equiv X$ denotes the observed variables. Ultimately, we have a bound on TC that gets tighter as we add more latent factors and layers, and for which we can quantify the contribution of each factor to the bound. We exploit this decomposition for interpretability [41], as illustrated in Fig. 2. CorEx prefers to find modular or tree-like latent factor models, which are beneficial for dealing with the curse of dimensionality [42]. For neuroimaging, we expect this modular decomposition to be effective because functional specialization in the brain is often associated with spatially localized regions. We explore this hypothesis in the experiments.

As we mentioned above, the simulation data is entirely Gaussian distributed. Therefore, their dependency should be zero. We find that CorEx and RBIG both perform very well and are very stable, that the matrix-based Rényi entropy improves as the dimension increases, and that the Shannon discrete entropy becomes more accurate as the number of samples increases. All of this is as expected, and it supports the accuracy of total correlation estimation with CorEx. Compared to the other estimators, the main goal of CorEx is to cluster statistically dependent variables based on total correlation, whereas the other estimators mainly focus on directly obtaining the total correlation value and do not supply comparable visualizations. CorEx also gives us a natural connection with graph theory to visualize the functional relationships among variables.

Experiment 2: Clustering by Total Correlation for dependent and independent mixtures
To evaluate the performance of CorEx in clustering tasks, we consider two groups of variables. The elements in group X, namely X1, X2, and X3, follow Gaussian distributions and are completely independent from each other and from group Y, whereas the variables in group Y depend on each other. In Fig. 4, we found that CorEx based on total correlation has high accuracy in estimating their dependencies (Fig. 4e) compared to pairwise Pearson correlation (Fig. 4b), pairwise mutual information (Fig. 4c), and partial correlation (Fig. 4d). By construction, the elements in group Y should be clustered together, and the elements in group X should be completely independent of each other and of group Y. The ground truth is presented in Fig. 4a.
Then we estimated the clustering result using pairwise Pearson correlation with a threshold of 0.1, pairwise mutual information with a threshold of 0.4, and partial correlation without a threshold. We found that pairwise approaches have high errors in estimating the statistical dependencies; pairwise mutual information is better than pairwise Pearson correlation, but it still makes many errors in the clustering task. When we accounted for the confounding effect of third variables through partial correlation, we still did not get a better clustering result than with TC. Therefore, clustering with CorEx by total correlation gets the best performance compared to pairwise approaches. Moreover, we use purity as a criterion to quantify the clustering quality because it is a straightforward and transparent evaluation metric [43]. To calculate purity, each cluster is assigned to the class that occurs most frequently within it, and the accuracy of this assignment is determined by counting the number of correctly assigned elements and dividing by N (N = 6). Formally:

$$\mathrm{purity}(X, Y) = \frac{1}{N} \sum_{k} \max_{j} |X_k \cap Y_j| \tag{9}$$

where $X = \{X_1, X_2, X_3\}$ is the set of clusters and $Y = \{Y_1, Y_2, Y_3\}$ is the set of classes. Fig. 4f presents the clustering performance of pairwise approaches and CorEx with purity as a criterion. Poor clusterings have near-zero purity (lower bound). A perfect clustering has a purity of one (maximum value). Based on Eq. 9, we get purity values of 0.17 and 0.33 for pairwise approaches and partial correlation, and the purity value for CorEx is 0.83. All in all, we showed that CorEx based on total correlation has the best performance compared to pairwise approaches.
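The purity criterion described above (assign each cluster to its most frequent class, count the correctly assigned elements, divide by N) can be sketched in a few lines. The function name `purity` and the toy label vectors are hypothetical and only illustrate the computation; they are not the labels used in Fig. 4.

```python
from collections import Counter


def purity(cluster_labels, class_labels):
    """Fraction of elements that fall in the majority class of their cluster."""
    n = len(class_labels)
    by_cluster = {}
    for cluster, cls in zip(cluster_labels, class_labels):
        by_cluster.setdefault(cluster, Counter())[cls] += 1
    # For each cluster, count the elements of its most frequent class.
    return sum(max(counts.values()) for counts in by_cluster.values()) / n


classes = ["X", "X", "X", "Y", "Y", "Y"]
perfect = [0, 0, 0, 1, 1, 1]
one_mistake = [0, 0, 1, 1, 1, 1]  # one X element placed with the Y cluster
print(purity(perfect, classes))
print(purity(one_mistake, classes))
```

One known limitation of purity is that it grows trivially with the number of clusters (one element per cluster gives a purity of one), so it is most informative when the number of clusters is fixed, as in this experiment.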

Experiment 3: Brain functional connectivity analysis using Total Correlation
A network is a collection of nodes and edges, where nodes represent fundamental elements (e.g., brain regions) within the system of interest (e.g., the brain), and edges represent the weighted dependencies that exist between those fundamental elements. Typically, the threshold is chosen based on the visual effect on the functional connectivity graph; here, instead, we set the threshold that is optimal for community detection in brain connectivity networks, namely the one that maximizes information on the network modular structure, removes the weakest edges, and keeps the largest connected component. Fig. 5 illustrates the schematic representation of network construction using fMRI. First, the time series are extracted from fMRI data based on a selected structural atlas, and then functional connectivity is estimated with CC, I, and CorEx, respectively. The results are presented as a graph that includes both the brain nodes and their functional connectivity as weighted edges.
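The two mechanical steps mentioned above, removing the weakest edges and keeping the largest connected component, can be sketched as follows. The search for the threshold that maximizes modular information is not reproduced here: the function simply takes a threshold as given, and the name `largest_component_after_threshold` and the toy weight matrix are hypothetical.

```python
def largest_component_after_threshold(weights, threshold):
    """Drop edges below `threshold` and keep the largest connected component.

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    Returns (edges, nodes): kept edges as (i, j, weight) and sorted node ids.
    """
    n = len(weights)
    edges = [(i, j, weights[i][j])
             for i in range(n) for j in range(i + 1, n)
             if weights[i][j] >= threshold]
    neighbours = {i: set() for i in range(n)}
    for i, j, _ in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)

    seen, best = set(), set()
    for start in range(n):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first search from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        if len(component) > len(best):
            best = component

    kept = [(i, j, w) for i, j, w in edges if i in best and j in best]
    return kept, sorted(best)


toy = [[0.0, 0.5, 0.0, 0.0],
       [0.5, 0.0, 0.3, 0.0],
       [0.0, 0.3, 0.0, 0.1],
       [0.0, 0.0, 0.1, 0.0]]
edges, nodes = largest_component_after_threshold(toy, threshold=0.16)
print(edges, nodes)
```

In the toy matrix the weakest edge falls below the threshold, so the last node becomes isolated and is dropped from the retained component.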

First total correlation-based clustering example from fMRI data
The data was taken from a resting-state fMRI experiment in which a subject was watching and maintaining alert wakefulness but not performing any other behavioral task. Meanwhile, the BOLD signal was recorded. This data was downloaded from Nitime.

We also use weighted graph theory to cluster the dependence among ROIs, and for legibility we remove edges with a weight of less than 0.16 in the CorEx approach. As we mentioned above, mutual information only estimates a more robust relationship between ROIs compared to correlation. However, when we go beyond pairwise ROIs, CorEx captures richer information among all ROIs (see Fig. 7, bottom row). Here, we selected $m_1 = 10$, $m_2 = 3$, $m_3 = 1$ as the latent dimensions for each layer in our estimate of TC with CorEx; the corresponding convergence curves are plotted in Fig. 6, which shows that the total correlation lower bound stops increasing. Fig. 7 (bottom row) shows the overall structure of the learned hierarchical model. Edge thickness is determined by $\alpha_{i,j} I(X_i : Y_j)$. The size of each node is proportional to the total correlation that a latent factor explains about its children. The discovered structure captures several significant relationships among ROIs that are consistent with the correlation and mutual information results, e.g., LPCC and RPCC, LThal and RThal, LParaCing and RParaCing, LPut and RPut. Furthermore, TC discovered some previously unknown relationships beyond pairwise ones; for example, LCau, RCau, LFpol, and RFpol are clustered under node 0, which indicates that they have dense dependency during this cognitive task compared to other ROIs in the brain.

Bottom row: the figures show the total correlation estimated by CorEx with a threshold of 0.16. To display the statistical dependencies of brain regions more directly, we convert the circle graph to a tree graph. The weights are shown by the thickness of the edges, which indicates how strongly information is coupled between or among brain regions.

A selection of pre-defined atlas
We use the Automated Anatomical Labeling (AAL) atlas [44], a structural atlas with 116 ROIs identified from the anatomy of a reference subject (see Fig. 8).
Figure 8: Automated Anatomical Labeling (AAL) atlas. The graph shows the volume of AAL (116 regions) mapped to the smoothed Colin27 brain surface template. The different brain areas are labeled on the brain surface with different colors, and detailed ROI/purple node information can be found in Table 1 in the Appendix.

Time series signals extraction
HCP and ACPI provide access to raw and preprocessed data as well as phenotypic information about the data samples. The raw rs-fMRI data was preprocessed using the Configurable Pipeline for the Analysis of Connectomes, an open-source software pipeline that allows for automated rs-fMRI data preprocessing and analysis. We extract time series for each ROI in each subject after defining anatomical brain ROIs with the AAL atlas: we calculate the weighted average of the fMRI BOLD signals across all voxels in each region. Furthermore, the BOLD signal in each region is normalized and subsampled by repetition time. Finally, we average all of the subjects' time series signals in each ROI.
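The per-region averaging and normalization steps can be sketched with NumPy as below. This is a simplified illustration under our own assumptions: the data are already arranged as a voxels-by-time array with one integer atlas label per voxel, normalization is taken to mean z-scoring each ROI series, and the function name `roi_time_series`, its arguments, and the synthetic data are hypothetical. It does not reproduce the preprocessing pipeline.

```python
import numpy as np


def roi_time_series(bold, labels, weights=None):
    """Weighted-average BOLD signal per ROI, z-scored over time.

    bold:    array (n_voxels, n_timepoints)
    labels:  array (n_voxels,) of integer ROI ids, 0 meaning background
    weights: optional array (n_voxels,) of voxel weights (uniform if None)
    Returns (roi_ids, series) with series of shape (n_rois, n_timepoints).
    """
    bold = np.asarray(bold, dtype=float)
    labels = np.asarray(labels)
    if weights is None:
        weights = np.ones(len(labels))
    weights = np.asarray(weights, dtype=float)

    roi_ids = np.unique(labels[labels > 0])
    series = np.empty((len(roi_ids), bold.shape[1]))
    for row, roi in enumerate(roi_ids):
        mask = labels == roi
        series[row] = np.average(bold[mask], axis=0, weights=weights[mask])

    # Normalize each ROI series to zero mean and unit variance
    # (assumes no ROI series is constant over time).
    series -= series.mean(axis=1, keepdims=True)
    series /= series.std(axis=1, keepdims=True)
    return roi_ids, series


rng = np.random.default_rng(0)
toy_bold = rng.normal(size=(6, 50))        # 6 voxels, 50 time points
toy_labels = np.array([0, 1, 1, 2, 2, 2])  # voxel 0 is background
ids, ts = roi_time_series(toy_bold, toy_labels)
print(ids, ts.shape)
```

Averaging across subjects, as described in the text, would then amount to taking the mean of the resulting arrays over a stack of subjects with identical ROI ordering.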

HCP900
The Human Connectome Project contains imaging and behavioral data from healthy people [30]. To investigate resting-state functional connectivity, we used preprocessed rest-fMRI data from the HCP900 release [31]. Here, we selected $m_1 = 10$, $m_2 = 5$, $m_3 = 1$ as the latent dimensions for each layer in our estimate of TC with CorEx. We remove edges with a weight of less than 0.16 for legibility. Fig. 9 shows the whole-brain resting-state functional connectivity estimated with CorEx compared to Pearson correlation and mutual information. CorEx mostly captures relationships among brain regions: neighboring brain regions cluster together and communicate with other areas, e.g., node 0 has a bigger node size than other nodes.
From Fig. 9, we found that brain regions are functionally clustered together, which is also consistent with structural connectivity based on their physical connectivity distance. For example, under node 0, the cerebellum and vermis regions densely cluster together, while under node 1, the frontal lobes cluster together and are also densely functionally connected with the temporal lobe, and so on. The different colors indicate different brain regions, which are based on Table 1. In addition, Fig. 9 shows that both functional integration and separation exist in the brain.

Computational psychiatry applications with ACPI
The Addiction Connectome Preprocessed Initiative is a longitudinal study to investigate the effects of cannabis use among adults with a childhood diagnosis of ADHD. In particular, we use readily preprocessed rest-fMRI data from the Multimodal Treatment Study of Attention Deficit Hyperactivity Disorder (MTA). We attempt to use functional connectivity as a biomarker to discriminate whether individuals have consumed marijuana or not (62 subjects in the marijuana group vs 64 in the control group). In a comparison of whole-brain functional connectivity between the control and patient groups, we found altered functional connectivity in the patient group compared to the healthy group (see Fig. 10). We quantify the difference between the patient and healthy groups, and the purity of the patient group compared to the control group is 0.85 ± 0.23. The significantly altered functional connectivity occurred between the fronto-parietal and motor regions. In general, we found sparser functional connectivity in the patient group than in the control group. We also discovered that marijuana users had more interaction between neural time series in particular ROIs, such as the cerebellum, fronto-parietal, and default mode regions, than controls; e.g., cerebellum regions mainly cluster densely around node 0 compared to the control group. This may also explain differences in behavior in marijuana users, because the fronto-parietal network controls the execution of cognitive behavior and decision-making, the cerebellum is related to action, and the default mode network is dysfunctional in addiction users. All the above results are consistent with previous related research [45-47]. Moreover, we found some previously unknown disconnection between some visual regions and other brain areas. Based on related research [48,49], we suggest that visual perception may also be altered in marijuana users.

Discussion
This manuscript presents a higher-order information-theoretic measure to estimate functional connectivity. We estimated total correlation with CorEx under different situations. The approach has its own pros and cons, which we discuss below. We found that total correlation can serve as a metric to estimate functional connectivity in the human brain: it can identify some well-known functional connections and also capture a few previously unknown nonlinear relationships among brain regions. To the best of our knowledge, this is the first time that total correlation has been used to estimate larger-scale functional connectivity for the whole brain, using the AAL atlas with 116 structural ROIs. Total correlation can also be a tool to find biomarkers that help diagnose brain-related diseases.
Here, we discuss some advantages and limitations of this research. First, given the curse of dimensionality of fMRI, we need to find a low-dimensional representation that helps us characterize the connectivity. Traditional general linear model (GLM) approaches, such as expert-defined ROIs or the AAL atlas, are frequently used to find ROIs in resting-state experiments. However, we should be able to do better with a data-driven approach. Sample sizes and statistical thresholds are known to have a major impact on the statistical power and accuracy of GLM-based ROI selection. Previous research has revealed that GLM has limited statistical power when inferring from fMRI data [50,51]. Nevertheless, we used GLM-based ROI selection in the real fMRI datasets, which may affect the final result when we estimate functional connectivity.
Second, CorEx is model-independent, which means no anatomical or functional prior knowledge is required to estimate ROIs. The method is entirely data-driven; this way, it is possible to analyze networks that have not been investigated, which could be a future extension of this work. It is also possible to use total correlation as a pre-analysis for other techniques, like dynamic causal modeling, which need constraints about the underlying network [52]. What differentiates the CorEx algorithm is that it tries to break the variables into clusters with high TC. In other words, CorEx finds a tree of latent factors that explain the total correlation, so this tree of clusters based on TC is a more data-driven way to define regions, and then connectivity, than ROIs predefined by hand. This prioritization of "modular" solutions in CorEx was not realized or emphasized in the original research. The second reason why we used CorEx to estimate functional connectivity on larger-scale fMRI datasets is that it is a clustering approach via TC. Furthermore, CorEx estimates total correlation by hierarchically maximizing the correlation between the variables of the previous layer and those of the current layer with a tight information bound, which pushes the model to estimate more accurate relationships among variables in real neural signals.
Third, TC is an undirected information-theoretic measure, so it cannot determine the direction of information flow between brain regions. Meanwhile, it revealed some previously unknown functional connectivity in the real fMRI dataset.
Fourth, given the irregularity of neural time series and the difficulties in quantifying graph signals when brain networks are represented by graphs, we should avoid quantifying too many graph signals. However, there is a metric called permutation entropy that gives us the possibility to quantify the graph signal in complex systems [36]. It could be very interesting to apply this metric to brain networks to check how much information could be obtained from the complex graph signals, which could then help us understand brain networks more deeply in the future. Moreover, regarding the complexity of neural time series, one important potential problem, in addition to the dimensionality problem, is the length of the time series. Processing long time series is a significant challenge, but it could be addressed by transforming the time series into an embedding space or by segmenting the long time series into specific time windows [53].
Finally, we applied TC to estimate large-scale functional connectivity with the real fMRI datasets from HCP and ACPI. The functional connectivity obtained with HCP900 suggests that a full brain atlas could be estimated with TC in the future: our results show that TC can capture known functional connectivity and, beyond this, could also reveal some previously unknown functional connectivity. Therefore, this could be a future extension of the project. Furthermore, we used TC as a possible method to find biomarkers of brain disease with the ACPI dataset. We compared whole-brain functional connectivity between the control and patient groups. We found altered functional connectivity in the patient group compared to the healthy group, and we quantified this difference with the purity metric because it is a simple and transparent evaluation measure. The purity of the patient group compared to the control group is not too large, and it shows that there is some altered functional connectivity in the patient group, for instance in the brain networks in the cerebellum, fronto-parietal, and default mode regions mentioned above. However, this was examined with only one dataset with a small number of subjects and without considering within-subject variability, and it could be extended to larger datasets in the future.

Conclusions
We have introduced total correlation to capture multivariate large-scale interactions among brain regions. The proposed steps have been experimentally verified as effective for reconstructing multivariate relationships in the brain. In this study, CorEx was adopted to estimate total correlation. The CorEx approach can capture functional connectivity characteristics when going beyond pairwise brain regions. We also evaluated the method with resting-state fMRI datasets. We found that multivariable relationships cannot be detected if we use only pairwise correlation and mutual information quantities. More generally, multivariable relationships can be clustered only if we use total correlation. Therefore, total correlation measures are important for finding complicated functional connectivity among brain regions. We have also shown that total correlation can estimate functional connectivity in real neural datasets and help find biomarkers for diagnosing brain diseases.
In the future, we plan to use the functional connectivity relationships discovered by total correlation as an input to existing graph neural networks (GNNs) [54] for the purpose of interpretable brain disease diagnosis, such that practitioners or doctors can identify the subgraphs (or modules) that are most informative for the decision (e.g., autism patients versus healthy controls). In this regard, quantitative measures to define differences between graphs [55], and an extension of the analytical results in [25] to a larger number of nodes, will be critical to assess and improve the qualitative results presented here. The recently proposed approaches (e.g., [56,57]) all rely on pairwise relationships estimated by the linear correlation coefficient as the input, which essentially ignores higher-order dependence. In this sense, we believe our approach has the potential to improve the explanation performance of existing GNN explainers on brain data.

Figure 1: Conceptual scheme of information-theoretic measures of neural information flow. In the left panel, circle areas represent amounts of information, and intersections represent information shared among the corresponding variables X_0, X_1, X_2. Examples of the entropies H(X_0), H(X_1), H(X_2) and of the total correlation TC[X_0, X_1, X_2] (red color) are given. The middle panel shows neural time series extracted from brain regions, which correspond to the nodes in the right panel. The right panel illustrates large-scale time series in the brain and how the coupled information is transmitted among brain regions. The blue and green lines show linear correlation (CC) and mutual information (I), respectively, between different parts of the brain. The modules represent the lobes of the human brain. Each module contains specific brain regions, and each module works with the others.

Figure 2: CorEx learns a hierarchy of latent factors, as illustrated above. Edge thickness indicates the strength of the relationship between factors, and node thickness indicates how much total correlation is explained by each latent factor.

Figure 3: The estimated total correlation values for three independent variables. Several total correlation estimators, namely matrix-based Rényi entropy (black line), Shannon discrete entropy (cyan line), RBIG (magenta line), and CorEx (green line), are compared with the ground truth value (red line). See the main text for more information.

Figure 4: Clustering performance for dependent and independent mixtures. Top row: (a) displays the ground truth of the variable clustering into two groups, and (f) shows the purity value of each approach. Second row: (b) shows the clustering result based on Pearson correlation, (c) the result based on pairwise mutual information, (d) the result based on partial correlation, and (e) the result obtained by CorEx based on total correlation.

Figure 5: A flowchart for the construction of a functional brain network from fMRI. (1) Time series extraction from fMRI data within each anatomical unit (i.e., network node). (2) Estimation of functional connectivity with CC, I, and TC (CorEx), respectively. (3) Visualization of functional connectivity as tree and circle graphs (i.e., network edges and network nodes).
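The pairwise (CC) branch of steps (1) and (2) of this flowchart can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the ROI names, the toy time series, and the threshold of 0.8 are hypothetical and are not taken from the datasets or settings used in this work, and the TC (CorEx) branch is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def connectivity_edges(roi_series, threshold):
    """Return (roi_a, roi_b, r) for every ROI pair with |r| >= threshold."""
    names = sorted(roi_series)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(roi_series[a], roi_series[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges


# Step (1): one time series per node (toy values, hypothetical ROI names).
roi_series = {
    "roi_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "roi_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "roi_c": [3.0, 1.0, 4.0, 1.0, 5.0],
}

# Step (2): pairwise connectivity with the correlation coefficient (CC).
edges = connectivity_edges(roi_series, threshold=0.8)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.3f}")
```

The resulting weighted edge list is the kind of input that step (3) would render as a circle graph.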

Figure 6: Convergence curves of the CorEx total correlation in layers 1, 2, and 3. From left to right, the panels correspond to the layer 1, layer 2, and layer 3 parameters selected in the event-related experiments; in each case, the total correlation lower bound stops increasing and tends to converge.

Figure 9: Large-scale functional connectivity with HCP900. Top row: A weighted threshold graph with a maximum of 86 edges showing the overall structure of the representation learned from AAL ROIs (a high-resolution version is given in the appendix, Fig. 11). Edge thickness is proportional to mutual information, and node size represents total correlation among children. Red nodes represent the frontal lobe, green nodes the insula and cingulate regions, blue nodes the temporal lobe, cyan nodes the central areas, gold nodes the occipital lobe, purple nodes the parietal lobe, and deep pink nodes the cerebellum and vermis. Bottom row: Two representative connectomes are presented as circular chord diagrams that show the connections of all 116 nodes with (b) correlation and (c) mutual information for the HCP dataset. Each lobe is labeled with a different color.

Figure 10: Functional connectivity between healthy groups and patient groups. A weighted threshold graph showing the overall structure of the representation learned from AAL ROIs. Edge thickness is proportional to mutual information, and node size represents total correlation among children. Here, we selected m_1 = 20,

First, we estimated the pairwise functional connectivity with Pearson correlation and mutual information, and visualized the resulting pairwise functional connectivity as a circle-weighted graph. In Fig. 7, top row (left) and (right), Pearson correlation and mutual information estimate the same pairwise dependencies, but the latter approach captures stronger weights between ROIs, such as LPCC and RPCC, LThal and RThal, and LAmy and RAmy.