1. Introduction
Schizophrenia is a complex and severe mental illness that affects approximately 1% of the global population. It is a chronic and disabling condition that can significantly impact an individual’s quality of life. Patients with schizophrenia often experience a range of symptoms, such as delusions, hallucinations, disorganized thinking and speech, and a lack of motivation and emotional expression. One of the most challenging aspects of schizophrenia is that it can be difficult to diagnose, particularly in its early stages. Some of the early warning signs of schizophrenia can include social withdrawal, reduced speech or difficulty communicating, and a decline in academic or occupational performance. However, these symptoms can also be attributed to other mental health conditions or stressors, making diagnosis a complex and time-consuming process. Early intervention and treatment are crucial for individuals with schizophrenia, as they can help manage symptoms and improve overall functioning. Treatment options may include antipsychotic medications, psychotherapy, and family support. With appropriate care and support, many individuals with schizophrenia are able to lead fulfilling and productive lives.
Functional magnetic resonance imaging (fMRI) has substantially advanced our understanding of psychotic disorders such as schizophrenia, providing a powerful tool for studying the brain function abnormalities associated with them. Brain-based results may also help provide insights into possible treatment targets [1]. fMRI studies commonly employ techniques like independent component analysis (ICA) to estimate intrinsic connectivity networks (ICNs) of the brain by analyzing temporal relationships between hemodynamic signals [2,3]. Understanding the brain’s complex patterns of connectivity and function can provide valuable insights into the mechanisms underlying cognitive processes. Identifying ICNs is an important part of this research, as ICNs are believed to represent functional sources that play significant roles in various psychological phenomena [4].
Multivariate decomposition techniques, such as ICA, offer a valuable approach to identifying and extracting imaging features. These features include independent components (ICs), the time courses (TCs) of those components, and functional network connectivity (FNC), all of which have proven highly useful in the study of mental disorders [5,6]. The TCs represent the temporal fluctuations of each IC, where each IC corresponds to a spatially distinct brain network. FNC characterizes temporal coherence across selected ICs by correlating their TCs, and thus represents the coupling among intrinsic connectivity networks. FNC can be used to distinguish individuals with schizophrenia (SZ) from healthy controls (HC) in various predictive clinical contexts, as shown by multiple large-scale meta-analyses. There is a significant body of literature supporting the SZ dysconnection hypothesis. This hypothesis posits that alterations in functional connectivity represent a central endophenotype of the disorder, resulting from various factors including neuromodulatory and synaptic pathogenesis. Most studies of functional connectivity (FC) are designed to estimate networks that reflect linear statistical relationships between different areas of the brain [7,8,9].
However, most neuroimaging analysis approaches fail to consider the complex nonlinear interactions that are inherent to neural systems. As a result, FC research has largely overlooked the importance of nonlinear interactions in understanding brain function. It is crucial to recognize the significance of nonlinear interactions in FC research to gain a comprehensive understanding of the workings of the brain [10]. He et al. [11] demonstrated that the analysis of nonlinear statistical relationships is crucial in providing a precise and comprehensive understanding of the organization and dynamics of neural ensembles at multiple scales. By delving into the functional significance of nonlinearity, Friston et al. [5] aimed to unravel the intricacies of information processing in the brain and identify the root causes of mental health disorders. It was found that the interplay of neurons in the brain is a crucial factor in shaping our thought processes and actions. Moreover, nonlinear interactions, in particular, can foster a dynamic and versatile environment that facilitates a range of neural computations.
The recognition of nonlinear discriminative patterns and the automatic acquisition of optimal representations from neuroimaging data have made deep learning (DL) methods increasingly attractive for the diagnosis of mental disorders based on fMRI [12,13,14]. Due to the ability of DL methods to learn meaningful and complex features from raw data, they have become a promising avenue for enhancing the accuracy and specificity of fMRI-based diagnostic approaches. As such, the application of DL in fMRI-based diagnosis of mental disorders is an area of active research and has the potential to significantly impact the field of psychiatry and neuroscience.
Functional connectivity (FC) and its network analog, FNC, are widely used input features in deep learning (DL) for analyzing brain function. FC is calculated via the cross-correlation among timecourses from a pre-defined brain atlas, whereas FNC is the coupling among overlapping whole-brain networks, such as those extracted via ICA. FNC enables the identification of functional relationships between different regions of the brain and can be used to investigate various neurological conditions [15]. Kim et al. [16] trained a deep neural network (DNN) on FNC data with an L1-norm penalty on the weights, that is, a term in the loss function that controls weight sparsity and encourages the model to use fewer features. This approach led to a substantial improvement in the model’s predictive performance. Zeng et al. [17] presented a sparse autoencoder to learn imaging-site-shared FCs, which was then used to guide SVM training on multi-site datasets for schizophrenia (SZ) diagnosis. Several methods have been proposed to extract and exploit the temporal dynamic information in fMRI time series. One such approach is based on recurrent neural networks (RNNs), which process sequences of inputs while maintaining an internal state that reflects the context of the previous inputs. RNN-based methods can therefore model the complex temporal dependencies in fMRI time series and capture patterns of brain activity over time. These methods have shown promising results in tasks such as brain decoding, brain state classification, and prediction of future brain states. Yan and colleagues [18] proposed a multi-scale RNN approach that operates on time courses (TCs). Dakka et al., in contrast, applied a recurrent convolutional neural network (R-CNN) to four-dimensional fMRI recordings at the whole-brain voxel level [19], with the aim of distinguishing patients with SZ from HCs. These approaches have proven useful in identifying differences between SZ and HC groups in neuroimaging data. Furthermore, dynamic FNC (dFNC) has become increasingly popular in recent years as a tool for discriminating brain disorders [20,21,22]. In some cases, dFNC is used on its own, without static FNC, to improve prediction accuracy. This approach analyzes how functional connectivity between different regions of the brain changes over time, and it can provide valuable insights into how the brain functions and how it is affected by various disorders. By identifying patterns of connectivity that are unique to specific disorders, researchers hope to develop more effective diagnostic tools and treatment strategies.
The intricate nonlinear interactions exhibited by neural systems are a plausible component of cognitive operations, and it is well-documented that such interactions are altered in schizophrenia. There is therefore a need for effective methods that capture integrated complex networks from measures sensitive to nonlinear relationships. Equally important is the need for rich visualization of the brain changes linked to the resulting model. Most prior work has focused on prediction or reconstruction accuracy, and its visual output is limited to poorly localized salience or attention maps.
Our work presents an approach that uses a deep convolutional neural network (DCNN) [23] to extract nonlinear discriminative heatmaps from the whole training dataset of FNC matrices. The objective of our study was to develop a deep learning model for classifying SZ using ResNets [11], a type of deep residual network, while also enabling rich visualization of the relevant FNC patterns within the brain. Our approach was designed to overcome the limitations of traditional methods, which often fail to capture the complex nonlinear relationships between different brain regions, while also providing a rich visual output of the relevant functional changes. By leveraging the DCNN, we were able to extract informative heatmaps that provide a deeper understanding of the functional connectivity patterns within the brain. Our technique has the potential to advance the field of neuroimaging by providing researchers with a new tool to analyze and interpret complex brain data. This method addresses the challenging task of analyzing brain functional networks using resting fMRI scans, which can vary due to age, sex, lab protocols, and short data acquisition times.
Our method involves a two-stage process. In the first stage, we employ a deep convolutional neural network to classify FNC training samples using two kinds of labels, which allows for a more comprehensive and practical classification approach. The first label distinguishes healthy controls from individuals with schizophrenia. The second label differentiates cognitive levels. By maximizing the distances between training FNCs from different classes, we can extract highly nonlinear discriminative features from the heatmaps of the DCNN. In the second stage, to enhance visualization, we apply two feature extraction methods to the generated heatmaps. The first method uses t-tests, which compare the means of two sets of data. The second method applies ICA, a computational technique that identifies and separates independent sources from a mixture of signals, to the output of the neural network, with the goal of extracting maximally independent features present in the generated heatmaps.
2. Methods
2.1. ResNet Networks
ResNets are a type of deep neural network designed to address the vanishing gradient problem. This problem arises when training a deep neural network because the gradient signal transmitted back through the network during backpropagation can become too small to be useful. When this happens, the weights in the network are not updated properly, leading to slower convergence and lower accuracy. ResNets address this problem with a “skip” or “residual” connection that allows data to bypass network layers. The connection acts as a shortcut through which the gradient signal flows more easily. Specifically, the layers inside a residual block learn the difference (the residual) between the block’s desired output and its input, and the input is then added back to produce the block’s output. By making much deeper networks trainable, ResNets have led to significant improvements in tasks such as image classification, object detection, and natural language processing. Our study utilized ResNets for training the deep learning model for SZ classification, and we observed improved performance in comparison to other models previously used for this task, which highlights their potential for enhancing the accuracy and efficiency of deep learning models.
ResNet includes a number of residual blocks (ResBlocks), convolutional layers (Conv), and fully connected layers (FC), as presented in Figure 1. ResBlocks learn residual functions with reference to the layer inputs. This “skip” connection approach means that gradients flow more easily through the network, preventing the vanishing gradient problem that can plague deep neural networks.
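The residual formulation can be written compactly as follows. This is the standard form of a block with an identity shortcut, not a description of the exact blocks in Figure 1; the symbols are introduced here for illustration, with x the block input, y its output, F the stacked layers with weights W_i, and L the training loss.

```latex
% Residual block with an identity shortcut
y = \mathcal{F}(x, \{W_i\}) + x

% Gradient through the block: the identity term passes the upstream
% gradient to x unchanged, even when \partial\mathcal{F}/\partial x is small
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( \frac{\partial \mathcal{F}}{\partial x} + I \right)
```

The identity term in the second expression is what keeps the gradient from vanishing as blocks are stacked.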
In this study, we utilized ResNet as a tool for capturing multi-level features of SZs. ResNet was employed for its ability to extract intricate details that indicate the similarity of SZs to healthy controls (HCs). Through ResNet, we were able to extract high-level features from heatmaps, which contained the most critical connectivities related to SZ. The utilization of ResNet allowed for a comprehensive analysis of the complex features present in SZs. Our hope is this brain-based approach will ultimately lead to a more thorough understanding of the disorder and its potential similarities to HCs.
2.2. Feature Selection
T-test-based feature selection is a statistical technique that is used to extract meaningful insights from a dataset. It involves analyzing each feature’s relationship with the target variable and selecting the ones with the strongest group effect. In our analysis, we used a two-sided t-test for selecting features. This test compares the means of two groups and determines whether there is a significant difference between them. In the context of feature selection, it helps to identify which features have a significant impact on the target variable, and which ones can be safely ignored.
In our study, we aimed to determine whether the population mean of SZs differed significantly from that of HCs in the FNC features extracted from the deep learning model. The t-value measures the size of the difference between the two group means relative to the variability in the data, and the associated p-value indicates its statistical significance. Our findings showed a highly significant difference between the population means of SZs and HCs in multiple FNCs of the brain, indicating that individuals with schizophrenia have altered functional connectivity. This information helps provide insights into the links between brain function and SZ.
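A minimal sketch of t-test-based feature ranking is given below, under several assumptions that are not stated in the text: it uses Welch's unequal-variance form of the two-sample t-statistic, it ranks features by |t| rather than thresholding p-values, and the data are synthetic. The function name `welch_t` and all array shapes are hypothetical.

```python
import numpy as np

def welch_t(sz, hc):
    """Per-feature Welch t-statistic.

    sz, hc: arrays of shape (n_subjects, n_features), one row per subject.
    A positive value means the SZ mean is larger than the HC mean.
    """
    sz = np.asarray(sz, dtype=float)
    hc = np.asarray(hc, dtype=float)
    mean_diff = sz.mean(axis=0) - hc.mean(axis=0)
    # Standard error of the difference, using unbiased sample variances
    se = np.sqrt(sz.var(axis=0, ddof=1) / sz.shape[0]
                 + hc.var(axis=0, ddof=1) / hc.shape[0])
    return mean_diff / se

# Synthetic example: 6 features, with a group difference planted in feature 2
rng = np.random.default_rng(0)
hc = rng.standard_normal((40, 6))
sz = rng.standard_normal((35, 6))
sz[:, 2] += 3.0

t = welch_t(sz, hc)
# Rank features by the magnitude of the group effect (two-sided)
ranked = np.argsort(-np.abs(t))
```

In practice, two-sided p-values would be obtained from the t distribution, for example with `scipy.stats.ttest_ind`, and corrected for multiple comparisons across FNC cells.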
2.3. Proposed Pipeline
In this section, we provide a comprehensive explanation of our approach to detecting schizophrenia using the proposed algorithm. The approach is outlined in the flowchart in Figure 2, which gives a visual representation of the steps involved. Our approach is built on four main steps, each of which plays a critical role in the success of our algorithm.
First, we preprocess the fMRI data and analyze it via a fully automated independent component analysis pipeline called NeuroMark [37], resulting in ICNs and their associated timecourses. Each ICN timecourse consists of 150 time points. From the timecourses, we compute functional network connectivity (FNC), which is a matrix of the correlations among timecourses.
In the second step, we augment the training data to increase the number of training FNCs beyond our initially limited training dataset. Augmentation generates new samples that are similar to the original data but with slight variations, which improves the accuracy and generalization of the model. A larger training set of this kind helps the residual network learn more robust and accurate features. In this work, we use the mixup technique, which is described in detail in Section 2.5.
In the third step, the augmented dataset is then used to train a residual network, a type of deep learning architecture that learns to approximate the underlying mapping between the input data and the output labels. Residual networks are known for their ability to train very deep neural networks, which can capture more complex features and patterns in the data. After the training process, we can extract the most powerful and distinguishing deep heatmaps of the model. These heatmaps can be used for various tasks such as classification, detection, or segmentation of SZs. Overall, the use of an augmented training dataset and a residual network can greatly improve the performance and effectiveness of machine learning models in various applications.
In the final step of our approach, we use two feature extraction methods that are based on the heatmaps we have generated. These methods were chosen for their ability to identify and extract key features from the data. The first method is based on t-tests, which identify features showing group differences. We also conduct a statistical analysis of the intrinsic networks depicted in various types of heatmaps. The second method applies a second ICA to the output of the neural network to extract maximally independent features present in the model-generated heatmaps. This is an approach we have recently introduced called source-based salience [38]. ICA is a computational technique that aims to identify and separate independent sources from a mixture of signals. In the context of heatmaps, ICA is particularly useful as it can help identify the independent sources that contribute to the observed patterns in the data. By decomposing the heatmap into its constituent sources, ICA can help uncover hidden patterns and relationships that may not be apparent from a simple visual inspection of the data.
By following these steps, we will create two effective deep learning models tailored to specific objectives. Our first objective is to use a deep convolutional neural network to classify individuals who have been diagnosed with schizophrenia from those who are healthy controls. By carrying this out, we hope to extract nonlinear discriminative features that can identify and differentiate between the two groups. This DCNN is trained on a large dataset of fMRIs from both SZs and HCs and learns to identify patterns and features that are unique to each group. Our second objective is to develop a deep convolutional neural network that can accurately classify each FNC as either a schizophrenia (SZ) patient or a healthy control (HC). This is a significant challenge because FNCs are complex and highly variable. By developing a highly accurate classification model, we can extract discriminative heatmaps that show specific functional connectivity that is predictive of SZ. This may lay the groundwork for future studies focused on tracking changes over the course of the illness. In addition, differentiating between different cognitive levels of SZ and HC can help us understand the relationship between cognition and mental illness along a dimensional scale, using a well-defined construct. Overall, our objective is to leverage the power of machine learning to gain new insights into the functional brain patterns linked to SZ.
2.4. Preprocessing of Distinct Functional Sources
The human brain is an incredibly complex organ that can be divided into different functional networks or sources, often called ICNs. These ICNs interact dynamically with one another to facilitate brain function. ICNs are thought to be responsible for a range of cognitive processes, including memory, attention, decision making, and language processing. Each ICN is made up of multiple regions of the brain that are functionally and anatomically connected. The functional activity of an ICN over time is measured by its time course, which shows how its activity changes over time. The contribution of spatial locations, on the other hand, is indicated by its spatial pattern. To accurately determine the FNC between ICN time courses, we employed Pearson’s correlation coefficient to compute the connectivity between each ICN pair for each individual in our study. This resulted in a 2D symmetric ICN × ICN FNC matrix that represented the functional connectivity between two ICNs. The FNC matrix consisted of cells that represented the strength of the functional connectivity between two ICNs. The greater the value of a cell, the stronger the connectivity between the two ICNs. We then use the FNC matrix as input into a deep learning model.
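The FNC computation described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming the time courses for one subject are stored as a (time points × ICNs) array; the function name `compute_fnc` is hypothetical, and the band-pass filtering and nuisance regression mentioned elsewhere in the paper are omitted.

```python
import numpy as np

def compute_fnc(timecourses):
    """Return the symmetric ICN x ICN matrix of Pearson correlations.

    timecourses: array of shape (n_timepoints, n_icns), one column per ICN.
    """
    timecourses = np.asarray(timecourses, dtype=float)
    # np.corrcoef treats rows as variables by default, so rowvar=False
    # tells it that each column is one ICN time course.
    return np.corrcoef(timecourses, rowvar=False)

# Synthetic stand-in for one subject: 150 time points, 53 ICNs
rng = np.random.default_rng(0)
tc = rng.standard_normal((150, 53))
fnc = compute_fnc(tc)
```

Each subject yields one such matrix. Because the matrix is symmetric with a unit diagonal, only the upper triangle carries unique information, although the full 2D matrix is what is passed to a convolutional model.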
2.5. Data Augmentation
The current application of deep models to study SZ remains a significant challenge due to the lack of sufficient training datasets. This is primarily due to the limitations posed by data accessibility concerns, particularly in light of strict privacy regulations that restrict the sharing of sensitive patient data. As a consequence, insufficient training data significantly affects the SZ classification performance. In cases where there is an inadequate amount of training data, the accuracy and reliability of the classification model will be significantly reduced, leading to incorrect diagnosis and treatment. Therefore, it is crucial to ensure that an adequate amount of high-quality training data is available to optimize the performance of SZ classification. When the original dataset is limited, it can be challenging to train a machine learning model. One effective technique to overcome this limitation is data augmentation. Data augmentation refers to the process of generating new training samples by applying various transformations to the original dataset. By generating new samples, data augmentation increases the size and diversity of the training dataset, which leads to better generalization and improved performance of the model. Enhancing the training dataset through data augmentation has been demonstrated to be a highly effective method to reduce over-fitting issues in model training. In this study, we applied mixup augmentation to each ICN. Mixup is a popular and simple data augmentation technique that is widely used in deep learning applications. It involves randomly combining two data points from the same label in the training dataset by linearly interpolating between them. This process results in synthetic data points that lie along the straight line connecting the original data points. By generating synthetic data points in the training set, the mixup method helps to increase the diversity of the data and reduce overfitting. 
One of the key benefits of mixup is that it is easy to implement and does not require any additional labeled data. It has been shown to be effective in a wide range of applications, including image classification, object detection, and natural language processing. The data augmentation steps of our proposed algorithms are presented in Figure 3.
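Same-label mixup as described in this section can be sketched as below. The sketch makes assumptions the text does not specify: the interpolation weight is drawn from a Beta(alpha, alpha) distribution as in the original mixup formulation, alpha = 0.4 is an arbitrary illustrative value, and the function name `mixup_same_class` and the synthetic inputs are hypothetical.

```python
import numpy as np

def mixup_same_class(fncs, labels, n_new, alpha=0.4, seed=0):
    """Create n_new synthetic FNC matrices by interpolating same-label pairs.

    fncs: array of shape (n_samples, n_icns, n_icns).
    labels: array of shape (n_samples,); each class needs at least 2 samples.
    """
    rng = np.random.default_rng(seed)
    fncs = np.asarray(fncs, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    new_x, new_y = [], []
    for _ in range(n_new):
        c = rng.choice(classes)                 # pick a class
        idx = np.flatnonzero(labels == c)       # samples with that label
        i, j = rng.choice(idx, size=2, replace=False)
        lam = rng.beta(alpha, alpha)            # interpolation weight in [0, 1]
        # Point on the straight line between the two original samples
        new_x.append(lam * fncs[i] + (1.0 - lam) * fncs[j])
        new_y.append(c)
    return np.stack(new_x), np.array(new_y)

# Synthetic symmetric matrices standing in for training FNCs
rng = np.random.default_rng(1)
raw = rng.standard_normal((10, 53, 53))
train_fnc = (raw + raw.transpose(0, 2, 1)) / 2
train_labels = np.array([0] * 5 + [1] * 5)

aug_fnc, aug_labels = mixup_same_class(train_fnc, train_labels, n_new=20)
```

Because each synthetic sample is a convex combination of two symmetric matrices with the same label, it remains symmetric and keeps that label unchanged.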
2.6. Training the ResNet Models
The ResNet34 network, introduced by He et al. in 2016 [11], is used in this study. The network is trained on the augmented dataset, and its objective is to classify each input as belonging to an individual with schizophrenia or to a healthy control. The classification layer is modified to produce a 512-dimensional feature vector, while the feature extraction layer of the ResNet network is kept unchanged. This approach allows for the effective extraction of relevant features from the input data, which can then be used to classify instances accurately. We train each network with the PyTorch (version 2.6) machine learning framework, using stochastic gradient descent on a GeForce GTX 1060 Ti. We set the initial learning rate to 0.001 and decay it every two epochs using an exponential rate of 0.90. Gradually reducing the learning rate in this way helps the optimization converge. The training process for the ResNet network is conducted over 50 epochs. During this process, the model is trained on the available training data, and its performance is evaluated using a validation set. Data augmentation is used during training to reduce the risk of overfitting. As part of our analysis, we visualized three convolutional layers of the model (layers 10, 26, and 32) and converted them into heatmaps. These specific layers are only examples, chosen to visualize the output at different levels of feature reduction: an early, an intermediate, and a later layer. These heatmaps allow us to visualize the most critical connections and features that are related to SZ.
By providing a detailed two-dimensional score grid, these heatmaps enable us to identify specific regions that play a crucial role in differentiating between SZ and HC classes. This is particularly useful because it allows us to understand how different regions of the brain contribute to the resulting classification. Moreover, by highlighting the regions that are most important for different classes, we can gain a better understanding of the brain regions that differentiate SZ from HC.
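The learning-rate schedule used in this section (initial rate 0.001, decayed by a factor of 0.90 every two epochs, 50 epochs) can be made concrete with a short sketch. It assumes a staircase decay and 0-indexed epochs, which the text does not state explicitly; the function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, base_lr=0.001, decay=0.90, step=2):
    """Learning rate at a 0-indexed epoch, multiplied by `decay` every `step` epochs."""
    return base_lr * decay ** (epoch // step)

# Learning rate for each of the 50 training epochs
schedule = [learning_rate(epoch) for epoch in range(50)]
```

Under these assumptions the rate stays at 0.001 for epochs 0 and 1, drops to 0.0009 for epochs 2 and 3, and so on. In PyTorch, the same staircase behavior is what `torch.optim.lr_scheduler.StepLR` with `step_size=2` and `gamma=0.9` provides.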
2.7. Feature Extraction and Visualization
In order to compare the means of intrinsic networks in heatmaps, we use two-sample t-tests to determine whether the mean of SZ samples in each intrinsic network is significantly greater or significantly less than the mean of HC samples in the same intrinsic network. In analyzing the neural network’s performance, we selected specific convolutional layers for further investigation and tested the heatmaps corresponding to these layers, as seen in Figure 1. We also incorporate a novel postprocessing step to further enhance visualization of the model results by performing what we call “source-based salience”. To carry this out, we apply independent component analysis (ICA) to the output layers to reduce them to a smaller set of maximally independent FNC patterns.
We next take the heatmaps from selected convolutional layers and combine them to generate a comprehensive heatmap that contains all the most prominent deep features. Figure 4 shows the visualization of combined heatmaps. Brighter areas in the heatmaps indicate a stronger association with the corresponding class. Columns 1, 2, and 3 show the mean heatmaps of SZs, while columns 4, 5, and 6 show the mean heatmaps of HCs. The first row provides examples of heatmaps, while the second row depicts the corresponding inputs of the network.
Next, we apply ICA to this combined heatmap. ICA is particularly useful here because it can reveal underlying factors that are not directly observable but still contribute to the observed patterns in the data. The combination of heatmaps and ICA allows us to gain a more comprehensive understanding of the deep features in the data and the underlying patterns that they reveal. The architecture of the ICA-based feature extraction is illustrated in detail in Figure 5.
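For reference, the generic linear ICA model applied to heatmaps can be written as below. The notation and the arrangement of the data matrix are assumptions made here for illustration and are not taken from Figure 5: X stacks one vectorized combined heatmap per subject, S holds the maximally independent FNC patterns, and A holds the subject-specific loadings.

```latex
% X: N x P data matrix (N subjects, P heatmap cells)
% A: N x K mixing matrix (subject loadings on K sources)
% S: K x P source matrix (maximally independent FNC patterns)
X = A S

% ICA estimates an unmixing matrix W (K x N) so that the rows of
% \hat{S} are as statistically independent as possible
\hat{S} = W X, \qquad W \approx A^{+}
```

Under this arrangement, each row of the estimated source matrix can be reshaped back into an ICN × ICN pattern for visualization, and the columns of A can be compared between groups.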
2.8. Dataset
In order to evaluate the efficacy of our method, we utilized resting-state fMRI data from a large sample of healthy controls and patients diagnosed with schizophrenia across three datasets: FBIRN (Function Biomedical Informatics Research Network), MPRC (Maryland Psychiatric Research Center), and COBRE (Centers for Biomedical Research Excellence). The combined dataset included 708 healthy controls and 537 individuals with schizophrenia. All subjects signed informed consent to participate in the respective studies. Resting-state fMRI data were obtained from the participants while they were in a relaxed state with their eyes closed. Recruitment and scanning details for the COBRE, FBIRN, and MPRC studies can be found in Aine et al. [33], Damaraju et al. [39], and Adhikari et al. [40], respectively. Groups were matched on sex, age, and motion parameters. Medication effects were evaluated using a regression of olanzapine equivalency to FNC, and no significant differences were found. The data were then preprocessed and analyzed using statistical and machine learning techniques to identify patterns of brain activity that were predictive of schizophrenia.
Individuals in the COBRE dataset who had been diagnosed with schizophrenia (SZ) underwent a systematic diagnostic process. Two research psychiatrists worked together to determine a diagnosis of SZ using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID), which took into account the patient’s specific circumstances and symptoms. The patient version of the SCID-DSM-IV-TR was used to ensure a standardized approach to the diagnosis. Additionally, the SZ individuals were evaluated for comorbidities and were assessed for both retrospective and prospective clinical stability, to ensure that the diagnosis was accurate and that the patients were receiving appropriate treatment. Participants with SZ in the FBIRN study were diagnosed using the SCID-DSM-IV-TR and were required to be clinically stable for at least two months before scanning. For MPRC SZ subjects, a diagnosis of schizophrenia was confirmed via the SCID-DSM-IV (Table 1).
2.9. Cognitive Scores
Schizophrenia is a complex mental disorder that can impact an individual’s cognitive abilities. An additional aim of the study was to explore the cognitive abilities of individuals diagnosed with schizophrenia in comparison to healthy individuals with similar demographic characteristics. Cognitive scores were computed based on neurocognitive domain z-scores derived from computerized neuropsychological tests. We used scores collected via the Computerized Multiphasic Interactive Neurocognitive System (CMINDS) [41] as well as the Measurement and Treatment Research to Improve Cognition in Schizophrenia Consensus Cognitive Battery (MCCB) [42]. The study involved 175 patients diagnosed with schizophrenia as well as 169 healthy volunteers. The neurocognitive tests were administered to both groups, and the scores based on the results were analyzed. The findings indicated that the order of the schizophrenia domain profile, based on effect size, was as follows: speed of processing, attention/vigilance, working memory, verbal learning, visual learning, and reasoning/problem solving. Moreover, the study revealed that women showed higher scores in attention/vigilance, verbal learning, and visual learning. Men showed higher scores in reasoning/problem solving. No significant group differences were found in terms of sex interactions.
This portion of the study is focused on enhancing our understanding of the cognitive abilities of individuals with schizophrenia and the possibility of sex differences in cognitive profiles. We specifically focus on the use of functional neuroimaging data to predict cognitive scores. The findings can potentially contribute to the development of targeted interventions and therapies aimed at improving the cognitive abilities of individuals diagnosed with schizophrenia.
2.10. Experiments and Analysis
All studies collected 5 min of resting-state fMRI data using a standard echo planar imaging sequence. A standard preprocessing pipeline including slice timing correction, motion correction, spatial normalization, and spatial smoothing (6 mm) was implemented using the SPM software (version 25.01). Subjects with motion > 3 mm were removed from the analysis. We used the fMRI data to extract FNC features for each participant using the NeuroMark spatially constrained independent component analysis pipeline with the NeuroMark_fMRI_1.0 template [
37]. This pipeline is a fully automated and standardized approach which extracts 53 components and their associated timecourses, from which we compute the FNC features via cross-correlation after bandpass filtering at [0.01–0.15 Hz]. Nuisance effects including age, gender, head motion, and site effects were regressed from the data. By using the NeuroMark pipeline and removing nuisance effects from the FNC features, we aimed to ensure that the results we obtained were robust and minimized any potential biases.
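The FNC computation described above can be sketched in a few lines. This is a minimal illustration rather than the NeuroMark implementation: it assumes an FFT-based band-pass filter, zero-lag Pearson correlation as the cross-correlation measure, and ordinary least squares for nuisance regression. The function names, the repetition time of 2 s, and the random timecourses are all hypothetical.

```python
import numpy as np

N_COMPONENTS = 53  # number of components stated in the text


def bandpass_fft(timecourses, tr, low=0.01, high=0.15):
    """Zero out Fourier coefficients outside [low, high] Hz.

    timecourses: array of shape (n_timepoints, n_components).
    tr: repetition time in seconds (assumed to be supplied by the caller).
    """
    timecourses = np.asarray(timecourses, dtype=float)
    n_timepoints = timecourses.shape[0]
    freqs = np.fft.rfftfreq(n_timepoints, d=tr)
    spectrum = np.fft.rfft(timecourses, axis=0)
    keep = (freqs >= low) & (freqs <= high)
    spectrum[~keep, :] = 0.0
    return np.fft.irfft(spectrum, n=n_timepoints, axis=0)


def fnc_matrix(timecourses, tr):
    """Pearson correlation between every pair of band-passed timecourses."""
    filtered = bandpass_fft(timecourses, tr)
    return np.corrcoef(filtered, rowvar=False)


def regress_nuisance(features, covariates):
    """Return residuals of each feature column after removing covariates.

    features: (n_subjects, n_features); covariates: (n_subjects, n_covariates).
    An intercept column is added, so the residuals also have zero mean.
    """
    features = np.asarray(features, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta, *_ = np.linalg.lstsq(design, features, rcond=None)
    return features - design @ beta


# Hypothetical example: 150 volumes (5 min at an assumed TR of 2 s).
rng = np.random.default_rng(0)
example_timecourses = rng.standard_normal((150, N_COMPONENTS))
example_fnc = fnc_matrix(example_timecourses, tr=2.0)
# The unique component pairs form the FNC feature vector for one subject.
upper = np.triu_indices(N_COMPONENTS, k=1)
example_features = example_fnc[upper]
```

With 53 components, the upper triangle holds 53 × 52 / 2 = 1378 unique connectivity values per subject; `regress_nuisance` would then be applied across subjects to the stacked feature vectors.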
To ensure the reliability of our model, we conducted training and validation using an iterative methodology. The model was trained on the MPRC and COBRE training datasets and subsequently tested on the FBIRN dataset. This also controls for potential site/study differences in the results. We expanded the training dataset by adding more augmented training samples and then conducted the test. Following the training phase, we evaluated the model’s performance on the test data.
In order to evaluate how well our two-class classification model performed, we compared it with other classifiers including the support vector machine (SVM), random forests (RF), kernel SVM (KernelSVM), and decision trees (DT), which are commonly used in the field. To obtain an accurate assessment of the model’s performance, we used widely accepted evaluation metrics including precision, recall, accuracy, and F1 score.
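For reference, the four evaluation metrics can be computed from the counts of true and false positives and negatives. The sketch below assumes a label coding of 1 for SZ and 0 for HC, which is our choice for illustration and is not specified in the text; the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a two-class problem.

    y_true, y_pred: sequences of labels of equal length.
    positive: label treated as the positive class (assumed: 1 = SZ).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Ratios with an empty denominator are returned as 0.0 here; other conventions (for example, returning an undefined value) are equally defensible.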
3. Results
Figure 6 presents our findings, which show that our method’s classification accuracy surpasses that of the baseline methods by a significant margin. Our method utilizes ResNet, which facilitates the transformation of the nonlinear features of FNCs into a learning space that is discriminative for classification purposes. We show that this approach enables us to achieve an accuracy rate of 92.8% when classifying patients and controls. The reason for the high performance is likely related to the highly nonlinear feature space that represents the entire FNC training dataset. As a result, simple linear classifiers like SVM may not be effective in classifying these nonlinear features of FNCs. To achieve more precise and reliable results, it is crucial to employ more advanced classification methods that are better suited to the complexity of these nonlinear features.
We also utilized a deep convolutional neural network to categorize the FNC training samples into multiple classes effectively. This method involved considering two distinct types of labels, which enabled a more thorough and practical approach to classification. The first label differentiated between healthy individuals and those with schizophrenia. The second label characterized three levels of cognitive performance. Our approach showed an accuracy rate of 86.5% in the context of a multi-class classification.
3.1. Examples of Extracted Features
The visualization of the mean heatmaps at different convolution layers is presented in
Figure 7. The figure is divided into three columns. The first column shows the mean heatmaps of HCs (healthy controls). The second column presents the mean heatmaps of SZs (individuals with schizophrenia). The third column illustrates the difference between the mean heatmaps of HCs and SZs. This difference helps in identifying the regions of the brain where the activity is significantly different between the two groups. The figure has three rows, each depicting mean heatmaps at different layers of convolution. The first row shows examples of mean heatmaps in layer 10, which is a shallow layer. The second row depicts the mean heatmaps in layer 23, which is a mid-level layer. The third row depicts the mean heatmaps in layer 32, which is a deeper layer. The figure provides insight into how the activity patterns in the brain change across different convolution layers.
3.2. Group Differences in Enhanced Heatmaps
The mean heatmaps of healthy controls and individuals with schizophrenia are displayed in
Figure 8, with separate columns for each group (the first and second columns, respectively). In addition, the third column shows the difference between the two groups. Compared to the HC group, the SZ group mainly showed significantly weaker connections among three domains: VS, SM, and CC. This suggests that there may be disrupted neural communication between these regions in individuals with SZ, which could have implications for their cognitive and behavioral functioning. In contrast, the SZ group demonstrated a significant increase in connectivity between the SC network and other networks, such as the AUD, SM, and VS networks, when compared to the control group. Similarly,
Figure 8 indicates that there was a marked increase in the level of connectivity between the DM network and the CC network in the SZ group when compared to the control group. This suggests that individuals with schizophrenia may have altered functional connectivity patterns in these particular regions of the brain, which could potentially contribute to the clinical symptoms of the disorder.
The connectograms provide another perspective, showcasing the most prominent connections present within two distinct networks, VS and SM. The third connectogram, located in the third column, shows a significant difference in the strength of the connections between the SZ and HC groups. Specifically, the connections in the SZ group are weaker than those observed in the HC group, indicating a disruption in the network connectivity in individuals with schizophrenia.
3.3. Group Differences in Different Layers
Our research also aimed to investigate whether there are any significant differences between the means of two distinct population groups: the SZ and HC groups, at different levels of the nonlinear network. Specifically, we conducted two-sample t-tests on layers 10, 22, and 32 to compare the differences between the two groups. This analysis helps us to better understand any substantial changes in values and directions between the two groups and provides additional information regarding the underlying differences between SZ and HC populations.
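A per-connection two-sample t-test of this kind can be sketched as follows. The text does not state whether equal variances were assumed, so this example uses the Welch (unequal-variance) statistic as an assumption. The sign convention (SZ minus HC) is chosen so that negative values correspond to weaker connectivity in SZ; the function name and array layout are hypothetical.

```python
import numpy as np


def two_sample_t_map(sz_heatmaps, hc_heatmaps):
    """Welch t statistic for every heatmap cell (SZ minus HC).

    sz_heatmaps: (n_sz, rows, cols); hc_heatmaps: (n_hc, rows, cols).
    Negative values mean the SZ mean is lower than the HC mean.
    Cells with zero variance in both groups are returned as NaN.
    """
    sz = np.asarray(sz_heatmaps, dtype=float)
    hc = np.asarray(hc_heatmaps, dtype=float)
    mean_diff = sz.mean(axis=0) - hc.mean(axis=0)
    standard_error = np.sqrt(sz.var(axis=0, ddof=1) / len(sz)
                             + hc.var(axis=0, ddof=1) / len(hc))
    t_map = np.full(mean_diff.shape, np.nan)
    np.divide(mean_diff, standard_error, out=t_map, where=standard_error > 0)
    return t_map
```

Applying the function to the heatmaps of one layer at a time yields one t-map per layer. Converting the statistics to p-values additionally requires the t distribution and a correction for the number of cells tested, neither of which is shown here.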
The analysis presented in
Figure 9 comprises the results of the tests conducted for each connectivity within the heatmaps at three distinct layers, specifically layers 10, 22, and 32. The first row of the figure displays the heatmaps of two-sample
t-tests, represented by two different colors. The blue color denotes that the connectivities in the SZ group are weaker than those in the HC group, whereas the red color indicates that the connectivities in the SZ group are stronger than those in the HC group. Additionally, the figure includes the corresponding connectograms that depict the connectivity between regions in a graphical format.
The data displayed in this figure indicate that the mean values of the SZ group are significantly lower than those of the HC group in both between-network and within-network connections in the SM, VS, and CC regions, across the different network layers. The SZ group, compared to the HC group, exhibits stronger connections within the SC network. Additionally, the SZ group shows more robust connections between the SC network and several other networks, such as the AUD, SM, VS, and CC networks. In comparison to the HC group, the SZ group also displays significantly stronger connections between the DM network and other networks, including the CC, VS, SM, and AUD networks. These findings suggest that there are distinct differences in brain connectivity patterns between individuals with schizophrenia and healthy individuals. The connectograms highlight the most prominent changes in the connections within three distinct networks: the VS, CC, and SM networks. These changes indicate that the mean values of the connections in the SZ group are significantly lower than those in the HC group.
Figure 9 also shows a comparison between various layers of ResNet architecture. As per the analysis, it is observed that deep layers, such as layers 22 and 32, demonstrate more discriminative and nonlinear features. In contrast, the input and shallow layer (layer 10) include more linear features along with noise. This behavior is due to the ability of ResNet to eliminate redundant features and noise from the input and retain the most relevant features associated with the classification of SZs and HCs. The deep layers exhibit a high level of abstraction, which helps in capturing complex relationships between different features.
3.4. ICA-Based Summary of Network Layers
To further visualize the heatmaps at different nonlinear layers, we applied ICA to the heatmaps from the deep network. Our primary objective was to summarize the results while also capturing complex nonlinear information via maximally independent FNC maps. This was achieved by identifying a linear transformation of the observed FNCs that maximizes the statistical independence of the resulting components. The process enabled us to apply one-tailed two-sample t-tests on the independent FNCs, which yielded two significant benefits. First, it allowed us to identify which maximally independent FNC regions were most relevant for distinguishing between SZs and HCs, providing additional insights into the possible underlying neural mechanisms that differentiate SZs from HCs. Second, it allowed us to eliminate connectivities and noise that were less relevant to the SZ classification, which helped us to refine our analysis and improve its accuracy.
In particular, we utilized an ICA model to decompose the three selected ResNet layers into eight maximally independent basis functional network connectivity components. We then tested for significant group differences in the loading parameters for each of the eight basis FNCs. Results are shown in
Figure 10. The test for the basis FNC component 2 rejected the null hypothesis and showed that the mean of the SZ group was significantly lower than that of the HC group for this particular component.
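To make the decomposition step concrete, the following sketch applies a compact symmetric FastICA (tanh nonlinearity) to vectorized heatmaps and then compares the loading parameters between groups. It is an illustration under several assumptions: the specific ICA algorithm is not named in the text, the Welch statistic stands in for the group test, and the data are synthetic. All function and variable names are hypothetical.

```python
import numpy as np


def sym_decorrelate(w):
    """Return the closest orthogonal matrix to w (symmetric decorrelation)."""
    u, _, vt = np.linalg.svd(w)
    return u @ vt


def fast_ica(data, n_components, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity.

    data: (n_subjects, n_features) array of vectorized heatmaps.
    Returns loadings (n_subjects, n_components) and components
    (n_components, n_features) such that loadings @ components is the
    rank-n_components approximation of the row-centered data.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)  # center each map across features
    n_features = x.shape[1]

    # PCA whitening: keep the top n_components directions.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    u, s, vt = u[:, :n_components], s[:n_components], vt[:n_components]
    z = vt * np.sqrt(n_features)  # rows are uncorrelated with unit variance

    w = sym_decorrelate(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        g_prime_mean = (1.0 - g ** 2).mean(axis=1)
        w_new = sym_decorrelate(g @ z.T / n_features
                                - g_prime_mean[:, None] * w)
        change = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0))
        w = w_new
        if change < tol:
            break

    components = w @ z
    loadings = (u * s) @ w.T / np.sqrt(n_features)
    return loadings, components


def welch_t(group_a, group_b):
    """Welch t statistic per column (group_a minus group_b)."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a)
                 + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se


# Synthetic example: 30 HC and 30 SZ subjects, 1378 FNC cells, 8 sources.
rng = np.random.default_rng(1)
n_hc, n_sz, n_features = 30, 30, 1378
sources = rng.laplace(size=(8, n_features))
mixing = rng.standard_normal((n_hc + n_sz, 8))
mixing[n_hc:, 1] -= 1.0  # synthetic effect: SZ loads lower on one source
heatmaps = mixing @ sources

loadings, components = fast_ica(heatmaps, n_components=8)
t_per_component = welch_t(loadings[n_hc:], loadings[:n_hc])  # SZ minus HC
```

Each row of `components` plays the role of one basis FNC map, and each column of `loadings` gives the per-subject weights that are compared between groups. The order and sign of ICA components are arbitrary, so a recovered component need not keep the index or sign of the synthetic source that generated it.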
In
Figure 10b, we show a heatmap whose loading parameters highlight the FNC values that have the most significant impact on the ability to classify healthy controls and individuals with schizophrenia. The two-tailed two-sample test determines whether the mean of the SZ group is significantly greater or significantly less than the mean of the HC group. The FNC cells that exhibit the strongest positive differences (HC-SZ) are the SM and VS networks, as well as the DM and CC networks, as seen in
Figure 10d. These networks have been found to be highly interconnected and play a critical role in regulating various cognitive and affective processes, such as attention, self-awareness, decision making, and emotional regulation. Additionally, this figure also shows the existence of strong negative connectivity between the SC and VS networks, which may reflect the inhibitory influence of the SC network on visual processing.
3.5. Differences Between Mean Heatmaps of HCs and SZs at Each Cognitive Level
In
Figure 11, we show the heatmaps of healthy individuals versus those with schizophrenia at different cognitive levels. The heatmaps are divided into three groups: high cognitive level (scores ranging from 2 to 5), medium cognitive level (scores ranging from −2 to +2), and low cognitive level (scores ranging from −5 to −2).
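The three-level grouping can be expressed as a simple thresholding rule. The ranges above share their boundary values, so the sketch below makes an explicit assumption: a score of exactly 2 is assigned to the high level and a score of exactly −2 to the low level. The function name is hypothetical.

```python
def cognitive_level(score):
    """Map a cognitive z-score to 'high', 'medium', or 'low'.

    Assumed boundary handling: score >= 2 is high, score <= -2 is low,
    and everything strictly in between is medium.
    """
    if score >= 2:
        return "high"
    if score <= -2:
        return "low"
    return "medium"


# Example: group hypothetical subject scores by level.
example_scores = [3.1, 0.4, -2.6, 1.9, -0.2]
example_levels = [cognitive_level(s) for s in example_scores]
```

Combining this level with the diagnostic label (HC or SZ) gives the two kinds of labels used for the multi-class analysis.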
Results showed that the SZ group had significantly weaker connections between four different networks (the VS, AUD, SM, and SC networks) when compared to the HC group. Further analysis revealed a marked increase in the level of connectivity between the DM network and the CC and VS networks in the SZ group when compared to the control group.
3.6. Patient vs. Control Group Differences at Each Cognitive Level for Three Different Network Layers
Next, we show results for patient versus control two-sample t-maps at three different layers, namely 10, 22, and 32, for the high, medium, and low cognitive score levels. This approach allowed us to evaluate the differences between the two groups as a function of both the mean and the standard deviation. The heatmaps in
Figure 12 are represented by two different colors. The blue color denotes that the connectivity of the SZ group is weaker than that of the HC group, while the red color indicates that the connectivity of the SZ group is stronger than that of the HC group. The corresponding connectograms that depict the connectivity between regions are shown in
Figure 13.
The mean values of the SZ group are significantly stronger when compared to the HC group in terms of the connections between the SC network and other networks, such as AUD, SM, VS, CC, and DM. These strong connections are evident at different cognitive levels, indicating a consistent pattern across different levels of cognitive processing.
Further analysis revealed that at high and medium cognitive levels, the HC group also showed evidence of more robust connections between the DM, CC, and CB networks. However, at the low cognitive level, the data show that the mean values of the SZ group are significantly stronger compared to the HC group in these connections. This finding suggests that individuals with SZ may have a more heightened sensitivity to certain cognitive tasks, particularly those involving the DM, CC, and CB networks, which may contribute to the differences observed between the two groups. In
Figure 13, the connectograms highlight the most significant connections that exist in two separate networks, SC and CB.
Our study aimed to examine whether there exist any significant differences in means between two groups of individuals diagnosed with schizophrenia at different cognitive levels. To achieve this objective, we conducted two-sample
t-tests on three different layers, 10, 22, and 32, in high vs. low cognitive score groups within patients with schizophrenia, as illustrated in
Figure 14. The results displayed in this figure show that the mean values of the SZ group at the high cognitive level are significantly lower in the connections between SC, AUD, SM, and VS compared to the SZ group at the low cognitive level. In addition, our analysis revealed that the SZ group at the low cognitive level has a higher mean than the SZ group at the high cognitive level within the VS and DM networks.
4. Discussion
In recent years, there has been growing interest in utilizing feature selection techniques to better understand the altered brain functional connectivity that is linked to schizophrenia, using data obtained through fMRI. Despite the significant progress made in this field, the underlying neurobiological mechanisms that contribute to the development of this disorder remain poorly understood. In this work, we developed a deep-learning-based feature selection method that has demonstrated its ability to identify nonlinear and discriminative features in heatmaps with a high degree of accuracy. By combining approaches that leverage the power of machine learning while also focusing on visualization, we hope to provide an approach that can help shed new light on the complex neurobiological processes that underlie schizophrenia.
This paper outlines a new approach for extracting nonlinear heatmaps from functional network connectivity images. To do this, we utilize a deep convolutional neural network that effectively extracts these heatmaps from the input FNC data. By analyzing these heatmaps, we can derive nonlinear intrinsic connectivity networks that provide a robust framework for understanding the complex interactions between different regions of the brain. One of the key benefits of this approach is that it helps remove redundant information and noise from the heatmaps, leaving only the most important features related to identifying SZs. This means that these networks represent a significant improvement over previous approaches and offer a more accurate way of identifying SZs.
The heatmap generation process is crucial in identifying the most informative nonlinear functional connectivities that help distinguish between different groups. To achieve this, we have employed statistical comparisons of the model output at several layers in addition to the use of ICA to summarize the output at the level of maximally independent FNC patterns. In doing so, we are able to extract the most relevant and significant nonlinear functional connectivity that contributes towards the classification of different groups. The overall heatmaps shown in
Figure 8 are largely consistent with prior linear results [
39,
43,
44,
45], highlighting reductions in sensory regions and increased negative links in subcortical, default mode, and cognitive control regions. Overall, this new approach has the potential to greatly enhance our understanding of the brain and provide better insights into how different regions of the brain interact with each other. However, this is only part of the story. Our results show a complex cascade of differences, largely continuing to involve subcortical and cerebellar regions, but with more complex patterns than are found with linear methods. Here, we mainly highlight several large-scale observations, since our goal in this paper was to highlight the application of our new approach. However, we plan to focus on these in much greater detail in future work that also includes additional data for replication.
Our study also involved a rigorous evaluation of the classification performance of our proposed method as compared to existing state-of-the-art classification methods. Our deep learning model was able to improve the classification results significantly, which demonstrated the effectiveness of our approach in identifying a small but crucial number of discriminative features to accurately classify different groups. In particular, our method achieved an average accuracy of 92.8% in identifying SZs and HCs on the testing fMRI dataset, which is significantly higher than the competing methods, which had accuracies around 80%. While there is considerable variability in the classification of SZ vs. HC, due to many factors, our results compared with linear and even nonlinear SVM classifiers are significantly higher than what is typically found in the field [
17,
46].
Moreover, our method’s ability to identify SZs and HCs at certain cognitive levels is a unique feature that has proven much more challenging, with many studies not finding significant relationships with cognition or finding relationships that are not robust (cf. [
47]). This suggests the nonlinear relationship may be even more important in the relationship to cognitive scores and diagnosis, further highlighting the potential of our approach. Future studies should continue to advance our combined capture and visualize approach, which we hope can eventually have significant implications in the field of clinical psychology and neuroscience.
Through the process of identifying the most distinctive features from the heatmaps, we can delve into the dissimilarities in functional network connectivity between the patients and controls. Our results highlighted several notable differences in brain function. First, as mentioned above, individuals diagnosed with schizophrenia demonstrate stronger connectivity between the sub-cortical network and other networks such as auditory, visual, sensorimotor, and cognitive control networks than the group of healthy individuals. This suggests that there may be an increased interaction between the sub-cortical network and other brain regions in individuals with SZ, potentially indicating a distinct pattern of brain functioning in this population. Existing models of schizophrenia tend to implicate the cerebellum, subcortical, and cortical regions, consistent with our findings [
48,
49].
Second, individuals diagnosed with schizophrenia exhibit a higher degree of connectivity between the default-mode network and other networks, such as visual and cognitive control networks, compared to the group of healthy individuals. This increased connectivity indicates that the brain networks in individuals with schizophrenia are more tightly integrated, which could contribute to the symptoms associated with the disorder, such as hallucinations and delusions. This is consistent with but extends prior work focused on linear anticorrelations [
50,
51]. The default-mode network is a set of brain regions that are active when the mind is at rest and not involved in any specific task, while the visual and cognitive control networks are responsible for visual processing and cognitive control, respectively. Third, the average values of the SZ group reveal a significant decrease in comparison to the HC group in two types of brain connections, namely between-network connections and within-network connections, in the sensorimotor, visual, and cognitive control regions. This means that individuals with SZ showed lower connectivity levels than healthy individuals in these regions of the brain, as previously shown by linear approaches as well as a more recent gradient-based approach [
52,
53].
Finally, further analysis showed that at high and medium levels of cognitive function, the HC group displayed more pronounced connections between the default-mode, cognitive control, and cerebellum networks than the SZ group. However, at a lower level of cognitive function, the data showed that the mean values of the SZ group were significantly stronger in these connections when compared to the HC group. This finding suggests that individuals with SZ may have a heightened sensitivity to certain cognitive tasks, particularly those involving the default-mode, cognitive control, and cerebellum networks; moreover, it extends linear approaches, which show very mixed relationships between SZ, cognitive performance, and functional connectivity [
54,
55]. These differences in neural connectivity may contribute to the observed variations between the two groups in terms of cognitive abilities and warrant further studies to confirm and study these in more detail.
Limitations and Future Work
The visualization of deep learning models is complex. While we have advocated for expanding this visualization in several ways, including an approach to compress the outputs into a more compact representation using ICA, there is still much work to do in this space. Future work can explore the trade-off between enhanced visualization and condensed summary measures. Future models can also be trained to maximize visual output. We have applied our approach to a schizophrenia dataset; however, the clinical aspects can be studied much more deeply. Future work should also study relationships between continuous measures that are measured only in patients, such as symptom scores or duration of illness. We also plan to apply our approach to study individuals at an earlier disease stage or at clinical high risk for schizophrenia, and also to groups sharing clinical phenomenology, such as individuals with bipolar disorder or autism spectrum disorder. Finally, while we have used a relatively large dataset, and results were cross-validated, these results should also be replicated in future independent studies to confirm their reproducibility.
5. Conclusions
In this work, we have proposed a novel approach that harnesses the power of deep convolutional neural networks to extract nonlinear heatmaps from functional network connectivity images. Our approach is specifically designed to capture complex nonlinear interactions such as those that we speculate underlie cognitive operations and the changes observed in psychiatric conditions like schizophrenia. We believe that this research can lead to a better understanding of the workings of the brain and offer crucial insights into the mechanisms that drive various psychological and psychiatric conditions. We show three main benefits: (1) improved classification performance for diagnosis, (2) enhanced visualization of complex relationships, and (3) high accuracy in predicting cognitive performance and its relationship to diagnosis.
To enhance visualization, our method involves analyzing the heatmaps generated by the DCNN to derive nonlinear intrinsic connectivity networks from the input FNC data. These networks represent a significant improvement over previous approaches and provide a robust framework for understanding the complex interactions between different regions of the brain. To ensure optimal efficiency and effectiveness, our method incorporates two stages in the training process. In the initial stage, we train a deep convolutional neural network to create heatmaps from various convolution layers of the network. In the second stage, we use the ICA and t-test-based feature selection methods to effectively analyze each heatmap from different convolution layers. This allows us to extract the most important nonlinear FNCs from the heatmaps that play a crucial role in distinguishing different groups. Overall, we believe that our approach has the potential to provide powerful insights into the workings of the brain and help researchers gain a deeper understanding of complex psychological and psychiatric conditions.
Our approach was focused on the use of the proposed method to analyze FNC matrices. We found substantial benefits even in this case, including an improvement in accuracy and also a rich set of visualizations that highlight different aspects of the patient versus control differences. There are likely additional performance improvements to be gained by modeling nonlinear interactions prior to the computation of the FNC, especially those related to more complex relationships such as those between cognitive performance, diagnosis (including expansion to other neuropsychiatric disorders and to individuals at risk), and FNC, which we hope to explore in future work. In addition, we are planning to enhance the effectiveness of our deep learning network by incorporating self-attention transformers [
19,
56,
57,
58]. This specialized transformer architecture is capable of capturing nonlinear connectivity from dynamic functional network connectivity with greater accuracy and precision. By enabling the model to recognize and analyze relationships between distant elements in a sequence, self-attention transformers can effectively comprehend complex patterns and dependencies that would be otherwise difficult to detect. This allows for the extraction of the most significant features from windowed time courses of different brain networks (components), which are estimated via spatial ICA. We believe that this development will significantly improve the performance of our deep learning network and enable us to achieve more accurate results in our research.