Temporal and Spatial Analysis of Alzheimer’s Disease Based on an Improved Convolutional Neural Network and a Resting-State fMRI Brain Functional Network

Most current research on Alzheimer’s disease (AD) is based on cross-sectional measurements. Given the nature of neurodegeneration in AD progression, observing longitudinal changes in the structural features of brain networks over time may improve the accuracy of conversion prediction and provide a good measure of AD progression. There is currently no cure for patients with existing AD dementia, but patients with mild cognitive impairment (MCI), the prodromal stage of AD dementia, can be diagnosed early. Studying the early diagnosis of MCI and the prediction of MCI-to-AD conversion is therefore of great significance for monitoring the MCI-to-AD conversion process. Although the rate of MCI conversion to AD is high, the neuropathological cause of MCI is heterogeneous, and many people with MCI remain stable. Treatment options differ for patients with stable MCI and those with underlying dementia, so predicting whether patients with MCI will develop AD dementia is of great clinical significance. This paper proposes an improved algorithm based on a convolutional neural network (CNN) with residuals combined with a multi-layer long short-term memory (LSTM) network to diagnose AD and predict MCI conversion. Firstly, multi-time-point resting-state fMRI images were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database and preprocessed, and an AAL brain parcellation template was used to construct a 90 × 90 functional connectivity (FC) matrix over whole-brain regions of interest (ROIs). Secondly, the diversity of training samples was increased with a generative adversarial network (GAN). Finally, a CNN with residuals and a multi-layer LSTM model were constructed to automatically classify and predict the functional connectivity matrices.
This method can not only distinguish Alzheimer’s disease from normal controls (NC) at multiple time points, but can also distinguish progressive MCI (pMCI) from stable MCI (sMCI) at multiple time points. The classification accuracies for AD vs. NC and sMCI vs. pMCI reached 93.5% and 75.5%, respectively.


Introduction
Alzheimer's disease (AD) is a common, irreversible, and progressive neurological disease characterized by cognitive impairment, with patients' memory and thinking abilities gradually becoming impaired over time [1]. Mild cognitive impairment (MCI) is a transitional stage between normal aging and AD and is characterized by mild memory and intellectual impairment, with a degree of memory impairment not commensurate with age [2]. MCI is a preclinical risk factor for AD: conversion to AD is about ten times more common among MCI patients than in the general population. According to follow-up studies, the incidence of transition to AD in patients with MCI is 10 to 15% within 1 year, 40% within 2 years, and 20 to 53% within 3 years [3]. MCI patients can therefore be further divided into stable MCI (sMCI) and progressive MCI (pMCI) patients. Although there is

Data Selection
The sample data used in this study were from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (http://adni.loni.usc.edu, accessed on 12 January 2022). There were 312 subjects in this study, including 100 NC, 75 sMCI, 72 pMCI, and 65 AD subjects. fMRI images were collected at baseline (BL), 12 months, and 24 months in each group. Prior to the scan, the subjects passed cognitive and behavioral assessments [16]. Sample demographic information is shown in Table 1. As can be seen from Table 1, with the aggravation of the disease, MMSE scores showed a downward trend, while CDR scores showed an upward trend. A statistical analysis of basic information was performed with SPSS software [17]. The scans selected from ADNI were acquired on Philips Medical Systems scanners. The resting-state fMRI sequence (EPI) comprises 140 time points with 48 slices, a field strength of 3.0 T, a flip angle of 80.0°, a TE of 30.0 ms, a TR of 3000.0 ms, a 64 × 65 matrix, and 6720.0 images with a slice thickness of 3.31 mm. Resting-state fMRI images of NC, AD, sMCI, and pMCI subjects are shown in Figure 1.


Image Preprocessing
The fMRI preprocessing pipeline includes format conversion (DICOM to NIfTI), removal of the first ten unstable time points, slice-timing correction, head motion correction, spatial normalization, removal of linear drift, filtering, regression of covariates, and removal of time points with excessive head motion [18] (to reduce the influence of head motion and artifacts, subjects with FD > 0.5 over more than 2.5 min (50 frames) of data were excluded). This pipeline is basically the same as general MRI preprocessing, with one difference: no spatial smoothing is performed, because network analysis requires high spatial accuracy and smoothing would blur activation across adjacent regions of interest (ROIs). The SPM8 toolbox and the DPARSFA (version 2.2) toolkit were used for standard preprocessing [19,20]. The preprocessing flow chart is shown in Figure 2.
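The head-motion exclusion rule above (FD > 0.5 on more than 50 frames, about 2.5 min at TR = 3 s) can be sketched as follows. This is an illustrative numpy implementation that assumes a Power-style framewise displacement computed from six realignment parameters, with rotations projected onto a 50 mm sphere; the exact FD variant computed by DPARSFA may differ.

```python
import numpy as np

def framewise_displacement(motion, radius=50.0):
    """Power-style FD from a (T, 6) array of head-motion parameters:
    3 translations (mm) followed by 3 rotations (rad).
    Rotations are converted to arc length on a sphere of the given radius."""
    params = motion.astype(float).copy()
    params[:, 3:] *= radius                   # rad -> mm on a 50 mm sphere
    diffs = np.abs(np.diff(params, axis=0))   # backward frame-to-frame differences
    fd = diffs.sum(axis=1)
    return np.concatenate([[0.0], fd])        # FD of the first frame is defined as 0

def exclude_subject(motion, threshold=0.5, max_bad_frames=50):
    """True if more than max_bad_frames frames exceed FD > threshold."""
    fd = framewise_displacement(motion)
    return int((fd > threshold).sum()) > max_bad_frames
```

A perfectly still subject (all-zero motion parameters) is retained, while a subject with large frame-to-frame jitter is flagged for exclusion.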


Whole-Brain Functional Link Matrix FC
The functional connectivity of the human brain is complex, and the connectivity of functional brain networks has been widely used in the study of AD. The construction of brain functional networks from fMRI data was divided into the following steps [21,22]: (1) Nodes (brain regions) were obtained: the whole brain was divided into 90 ROI brain regions using the AAL (AAL90) template. Once the parcellation template was selected, the nodes were determined.
(2) The whole-brain functional connectivity matrix was obtained from the fMRI data and the nodes. We averaged the voxels within each ROI to obtain one fMRI time series per brain region and constructed each subject's brain functional network by calculating the Pearson correlation coefficient between every pair of ROIs. The Pearson correlation coefficient of two time series X and Y is given in Equation (1): ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y), i.e., the covariance of the two series divided by the product of their standard deviations. The coefficient lies in the range −1 ≤ ρ_{X,Y} ≤ 1. When ρ_{X,Y} > 0, the two time series are positively correlated; when ρ_{X,Y} < 0, they are negatively correlated; when ρ_{X,Y} = 0, the two series are uncorrelated. From this, it can be inferred whether the activities of two brain areas over a given period are synergistic or antagonistic. By calculating the average time series of each brain region and the pairwise correlation coefficients, the correlation matrix of the whole brain over that period, namely the functional connectivity matrix, is obtained [23][24][25]. Because the AAL90 template defines 90 brain regions, the connectivity matrix is 90 × 90. The brain network visualization and the functional connectivity matrix are shown in Figure 3.
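The two steps above — ROI-averaged time series, then pairwise Pearson correlation — can be sketched in a few lines of numpy (the shapes (T, 90) and the random demo data are illustrative assumptions):

```python
import numpy as np

def functional_connectivity(roi_ts):
    """Build a whole-brain FC matrix from ROI-averaged time series.
    roi_ts: array of shape (T, 90) -- one mean BOLD series per AAL90 region.
    Returns a symmetric 90 x 90 matrix of Pearson correlation coefficients."""
    fc = np.corrcoef(roi_ts.T)     # pairwise Pearson r between the 90 columns
    np.fill_diagonal(fc, 1.0)      # each region correlates perfectly with itself
    return fc

# Demo with synthetic data standing in for 140 preprocessed fMRI time points.
demo_ts = np.random.default_rng(0).normal(size=(140, 90))
demo_fc = functional_connectivity(demo_ts)
```

`np.corrcoef` internally subtracts each series' mean and divides by the standard deviations, which is exactly the cov/(σ_X σ_Y) form of Equation (1).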



The Improved Method Proposed in This Paper
At present, common brain image analysis methods manually extract specified features based mainly on prior knowledge, which greatly limits the representation of image features. Most brain image analysis studies focus on image data at a single time point, which is prone to interference from individual differences. Longitudinal image analysis at multiple time points can capture pathological changes over the course of the disease and achieve a more precise diagnosis of Alzheimer's disease [26]. To address these problems, this paper proposes an automatic analysis and diagnosis model for the multi-time-point brain functional network based on deep learning. The functional connectivity (FC) matrix (90 × 90) between brain ROIs was used as the original input feature of the CNNs. As deep neural networks generally require a large amount of training data to obtain ideal results, and the data in this paper are limited, a GAN was built to perform data augmentation on the samples [27]. A 1D-CNN model was then built to extract spatial features, and a three-layer LSTM model was built to extract and analyze FC features at multiple time points [28,29]. Finally, the validity of the model was verified on the ADNI data set.
(1) GAN-based data augmentation. The GAN model contains two networks: one is a generative network, and the other is a discriminative (adversarial) network. The role of the generative network is to generate new samples from given samples, such that the discriminative network cannot distinguish these new samples from the given ones. A GAN is therefore a model that can generate synthetic samples reflecting the target distribution behind the real data, achieving the purpose of data augmentation [30]. The principle of data augmentation by GAN is shown in Figure 4.
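To make the adversarial training idea concrete, here is a deliberately minimal, self-contained numpy sketch of a GAN training loop on one-dimensional toy data. The target distribution N(3, 1), the linear generator, and the logistic discriminator are illustrative assumptions standing in for FC features, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def sample_real(n):
    """Real data the generator must imitate (toy stand-in for FC features)."""
    return rng.normal(3.0, 1.0, n)

w_g, b_g = 1.0, 0.0        # generator: G(z) = w_g * z + b_g, z ~ N(0, 1)
w_d, b_d = 0.1, 0.0        # discriminator: D(x) = sigmoid(w_d * x + b_d)
lr, batch = 0.05, 64

for step in range(2000):
    # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
    x_real = sample_real(batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w_g * z + b_g
    d_real = sigmoid(w_d * x_real + b_d)
    d_fake = sigmoid(w_d * x_fake + b_d)
    ga_real, ga_fake = d_real - 1.0, d_fake     # grads of the logistic loss
    w_d -= lr * np.mean(ga_real * x_real + ga_fake * x_fake)
    b_d -= lr * np.mean(ga_real + ga_fake)

    # Generator update (non-saturating loss): push D(fake) toward 1.
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w_g * z + b_g
    d_fake = sigmoid(w_d * x_fake + b_d)
    gx = (d_fake - 1.0) * w_d                   # dL/dx_fake through D
    w_g -= lr * np.mean(gx * z)
    b_g -= lr * np.mean(gx)

synthetic = w_g * rng.normal(0.0, 1.0, 1000) + b_g   # augmented samples
```

After training, samples from the generator should concentrate near the real data's distribution; in practice the paper's GAN would operate on 90 × 90 FC matrices rather than scalars.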

J. Environ. Res. Public Health 2022, 19, x FOR PEER REVIEW
(2) Spatial feature extraction based on CNNs. Convolutional neural networks (CNNs) are very similar to common neural networks in that both are made up of neurons with learnable weights and biases. Each neuron takes some input and computes dot products, and the output is the score for each class [31]. The function of the convolution layer is feature extraction. For the brain's functional network at each time point, we built 1D-CNN models with the same structure to extract spatial features at a single time point [32]. The model structure is shown in Figure 5.
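As an illustration of this kind of 1D convolutional feature extractor — with a short (residual-style) connection added, as in the improved model — here is a hedged numpy forward-pass sketch. The kernel sizes, channel counts, and the (90, 8) feature-map shape are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def conv1d_same(x, kernels):
    """'Same'-padded 1-D convolution: x is (L, C_in), kernels is (k, C_in, C_out)."""
    k, c_in, c_out = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((x.shape[0], c_out))
    for i in range(x.shape[0]):
        # Dot product of a length-k window against every output filter.
        out[i] = np.tensordot(xp[i:i + k], kernels, axes=([0, 1], [0, 1]))
    return out

def relu(x):
    return np.maximum(x, 0.0)

def short_connection_block(x, k1, k2):
    """Two conv layers whose output is fused with the block input
    (the 'short connection'), so front-layer features are reused."""
    h = relu(conv1d_same(x, k1))
    h = conv1d_same(h, k2)
    return relu(h + x)      # element-wise fusion of front and rear layers
```

The element-wise addition is what lets gradients and early-layer features bypass the intermediate convolutions, which is the motivation the paper gives for adding short connection modules.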
The model includes operations such as convolution, max-pooling, and a short connection structure. In this paper, the traditional CNN model with a single-direction, vertical structure is improved: two short connection modules are added to fuse the features of the front and rear layers and enhance the utilization of front-layer features [33].
(3) RNN. Recurrent neural networks (RNNs) have achieved great success and are widely used in many natural language processing (NLP) applications. RNNs are mainly used to process sequence data. A simple RNN consists of an input layer, a hidden layer, and an output layer [34]. The RNN can be unrolled along the time axis, as shown in Figure 6.
In Figure 6, there is a one-way flow of information from the input units to the hidden units, and another one-way flow from the hidden units to the output units. In some RNNs, the latter restriction is broken, guiding information from the output units back to the hidden units; these connections are called "back projections." The input to the hidden layer can also include the previous state of the hidden layer, whose nodes can be self-connected or interconnected [35].
In Figure 6, after the network receives the input x_t at time t, the value of the hidden layer is s_t and the output value is o_t. The key point is that s_t depends not only on x_t but also on s_{t−1}. The computation of a recurrent neural network can be expressed as Equations (2) and (3):

s_t = f(U x_t + W s_{t−1})  (2)
o_t = g(V s_t)  (3)

It can be seen from Equations (2) and (3) that the difference between the recurrent layer and a fully connected layer is that the recurrent layer has an additional weight matrix W. If Equation (2) is repeatedly substituted into itself and the result into Equation (3), Equation (4) is obtained:

o_t = g(V f(U x_t + W f(U x_{t−1} + W f(U x_{t−2} + W f(U x_{t−3} + …)))))  (4)
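Equations (2)–(4) can be checked with a tiny numpy forward pass; f = tanh and an identity output function g, along with the small dimensions, are illustrative choices:

```python
import numpy as np

def rnn_forward(xs, U, W, V, f=np.tanh, g=lambda a: a):
    """Unrolled simple RNN: s_t = f(U x_t + W s_{t-1}), o_t = g(V s_t)."""
    s = np.zeros(W.shape[0])          # initial hidden state s_{-1} = 0
    outputs = []
    for x in xs:                      # one step per time point
        s = f(U @ x + W @ s)          # Eq. (2): state mixes x_t and s_{t-1}
        outputs.append(g(V @ s))      # Eq. (3): output from the current state
    return np.array(outputs)
```

Because each step feeds s back through W, expanding two steps by hand reproduces the nested form of Equation (4): o_1 = g(V f(U x_1 + W f(U x_0))).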

RNNs suffer from vanishing and exploding gradients during long-sequence training, i.e., information is lost over long-distance transmission.

(4) LSTM
The long short-term memory (LSTM) network successfully remedies these defects of the original recurrent neural network and has become the most popular RNN variant. It has been successfully applied in many fields, such as speech recognition, image captioning, and natural language processing. The hidden layer of the original RNN has only one state, h, which is very sensitive to short-term input. LSTM adds another state, c, to preserve the long-term state [36], as shown in Figure 7.
The forget gate is shown in Equation (5):

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)  (5)

where W_f is the weight matrix of the forget gate and b_f is its bias term. W_f can be written as in Equation (6), split into the parts acting on h_{t−1} and x_t:

W_f · [h_{t−1}, x_t] = W_fh h_{t−1} + W_fx x_t  (6)

The input gate is shown in Equation (7):

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)  (7)

In the above formula, W_i is the weight matrix of the input gate, and b_i is the bias term of the input gate [37,38].
Next, the candidate cell state c̃_t describing the current input is calculated from the previous output and the current input, as shown in Equation (8):

c̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c)  (8)

The cell state c_t at the current time is produced by multiplying the last cell state c_{t−1} element-wise by the forget gate f_t, multiplying the current candidate state c̃_t element-wise by the input gate i_t, and adding the two products, as shown in Equation (9); the output gate is given in Equation (10):

c_t = f_t ∘ c_{t−1} + i_t ∘ c̃_t  (9)
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)  (10)

The final output of the LSTM is determined by the output gate and the cell state, as shown in Equation (11):

h_t = o_t ∘ tanh(c_t)  (11)

In this paper, we set up a three-layer LSTM model to analyze these sequences and extract the temporal variation characteristics of the spatial features at different time points, so as to make comprehensive use of single-time-point and multi-time-point information to diagnose and predict AD [39,40]. The design of the CNN combined with the three-layer LSTM framework is shown in Figure 9. The overall framework for AD diagnosis and MCI prediction is shown in Figure 10.
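A single LSTM step following Equations (5)–(11) can be written directly in numpy. The dictionary of per-gate weights and the small dimensions are illustrative; in the paper these parameters are learned by the three-layer LSTM.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p holds weight matrices W_* and biases b_* per gate."""
    z = np.concatenate([h_prev, x_t])            # concatenated [h_{t-1}, x_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])       # forget gate, Eq. (5)
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])       # input gate, Eq. (7)
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])   # candidate cell state, Eq. (8)
    c_t = f_t * c_prev + i_t * c_tilde           # cell state update, Eq. (9)
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])       # output gate, Eq. (10)
    h_t = o_t * np.tanh(c_t)                     # hidden output, Eq. (11)
    return h_t, c_t
```

Because o_t lies in (0, 1) and tanh(c_t) in (−1, 1), the hidden output h_t is always bounded in (−1, 1), while the separate cell state c_t can accumulate information over long sequences.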
The overall program flow chart of this model is shown in Figure 11.

Experimental Results
After the format conversion and image preprocessing of the original fMRI data obtained from ADNI, the CNN and LSTM models in this algorithm were built in the Python environment with the deep learning libraries Keras and TensorFlow. The hardware configuration of this experiment is as follows: an 8-core, 16-thread AMD R7-4800U CPU with a 4.2 GHz boost frequency, 16 GB of memory, and a 512 GB hard disk. We performed an experimental test of the proposed multi-time-point resting-state fMRI brain functional network method on the ADNI database. We divided the whole data set into five parts, selected four at a time as the training set, with the remaining one as the test set, and randomly selected part of the training set as the validation set. We used the accuracy, precision, and recall to evaluate the classification. The accuracy, precision, and recall are given in Equations (12), (13), and (14), respectively.
where TP denotes a positive prediction with a positive ground truth; TN a negative prediction with a negative ground truth; FP a positive prediction with a negative ground truth; and FN a negative prediction with a positive ground truth. The loss curve is shown in Figure 12, and the experimental results based on the convolutional neural network and the resting-state fMRI brain functional network are shown in Table 2. The blue line represents the loss curve of the training set, and the orange line represents the loss curve of the validation set.

Accuracy = (TP + TN) / (TP + TN + FP + FN)  (12)
Precision = TP / (TP + FP)  (13)
Recall = TP / (TP + FN)  (14)
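The three metrics of Equations (12)–(14) follow directly from the TP/TN/FP/FN counts; a minimal helper (the example counts are made up for illustration):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall as defined in Eqs. (12)-(14)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical confusion counts for one fold of the 5-fold split.
acc, prec, rec = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

In the 5-fold scheme described above, these metrics would be computed on each held-out fold and then averaged.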
In this study, the accuracy, precision, and recall of the sMCI vs. pMCI and NC vs. AD groups at baseline (BL) and at 12 months (12 m) and 24 months (24 m) after baseline were compared. As can be seen from the experimental results, there are significant differences between the sMCI and pMCI samples, as well as between the NC and AD samples, over time.
A comparison of the ROC curves of different algorithms is shown in Figure 13.

Conclusions
In this paper, we proposed a multi-time-point model for the diagnosis and prediction of Alzheimer's disease based on a convolutional neural network and resting-state fMRI brain functional networks. The ADNI dataset was used to screen the original fMRI data, and whole-brain resting-state functional networks were built after format conversion and image preprocessing. A GAN was used to augment the data serving as the input features of the CNN + LSTM model, and the model was verified at multiple time points. Compared with other classical algorithms, the experimental results show that the algorithm is effective. Classification of AD vs. NC was superior to pMCI vs. sMCI at multiple time points. Diagnosis using only an SVM model performed worst, and classification using only a CNN was better than the SVM at multiple time points. The proposed model, based on a CNN combined with LSTM, was superior to the CNN and SVM methods alone in the temporal and spatial analyses. This indicates that the proposed spatiotemporal analysis algorithm is suitable for the diagnosis and prediction of Alzheimer's disease.

Institutional Review Board Statement:
The study was conducted in accordance with the guidelines of the Declaration of Helsinki. The study procedures were approved by the institutional review boards of the research centers in the ADNI.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu (accessed on 12 January 2022)).