Wavelet-Based Fractal Analysis of rs-fMRI for Classification of Alzheimer’s Disease

The resting-state functional magnetic resonance imaging (rs-fMRI) modality has gained widespread acceptance as a promising method for analyzing a variety of neurological and psychiatric diseases. It is established that resting-state neuroimaging data exhibit fractal behavior, manifested in the form of slow-decaying auto-correlation and power-law scaling of the power spectrum across low-frequency components. With this property, the rs-fMRI signal can be broken down into fractal and nonfractal components. The fractal nature originates from several sources, such as cardiac fluctuations, respiration and system noise, and carries no information on the brain’s neuronal activities. As a result, the conventional correlation of rs-fMRI signals may not accurately reflect the functional dynamic of spontaneous neuronal activities. This problem can be solved by using a better representation of neuronal activities provided by the connectivity of nonfractal components. In this work, the nonfractal connectivity of rs-fMRI is used to distinguish Alzheimer’s patients from healthy controls. The automated anatomical labeling (AAL) atlas is used to extract the blood-oxygenation-level-dependent time series signals from 116 brain regions, yielding a 116 × 116 nonfractal connectivity matrix. From this matrix, significant connections evaluated using the p-value are selected as an input to a classifier for the classification of Alzheimer’s vs. normal controls. The nonfractal-based approach provides a good representation of the brain’s neuronal activity. It outperformed the fractal and Pearson-based connectivity approaches by 16.4% and 17.2%, respectively. The classification algorithm developed based on the nonfractal connectivity feature and support vector machine classifier has shown an excellent performance, with an accuracy of 90.3% and 83.3% for the XHSLF dataset and ADNI dataset, respectively. 
For further validation of the proposed work, we combined the two datasets (XHSLF+ADNI) and still obtained an accuracy of 90.2%. The proposed work outperformed recently published work by margins of 8.18% and 11.2%, respectively.


Introduction
Alzheimer's disease (AD) is a brain disorder that causes progressive deterioration in brain functions, commonly affecting memory function, the thinking process and behavior. Eventually, symptoms become severe enough to affect daily activities. Therefore, early diagnosis of Alzheimer's offers a variety of advantages for diagnosed individuals and helps in their treatment planning. One of the effective methods to analyze brain functions is to observe the connectivity patterns of the brain. Brain connectivity refers to the way different brain units communicate with each other. Recent findings have shown that AD is strongly associated with alterations of network connectivity among different brain regions [1][2][3]. In Alzheimer's disease patients, brain areas are poorly associated, and cognitive dysfunction is related to a reduced functional integration [3]. Hence, the brain connectivity patterns can be a useful biomarker to distinguish AD patients from normal controls (NC).
Different methods have been introduced over the years to characterize connectivity among the brain regions, and one of the early methods is the seed-based approach [4]. The seed-based approach chooses a brain region as the seed and finds the temporal correlation between the selected seed and the rest of the brain. Although it is computationally simple and its results are intuitive to analyze, the seed-based approach depends on the selection of seeds, making it vulnerable to bias. Seed-based connectivity was applied to the classification of AD versus normal controls in [5][6][7][8], considering the specific brain regions (posterior cingulate cortex, middle temporal gyrus, entorhinal cortex and hippocampus) that are highly affected by AD. The findings from these studies show that the brain connectivity in these regions carries the key features for distinguishing AD from normal controls. Seed-based analysis also has the advantage of displaying the network of regions that are the most functionally related to the region of interest. This interpretation is simple and appealing to many experts. The fundamental constraint of this method is noise created by other structural spatial resting networks influenced by head movements or scanner-induced distortions.
In general, conventional methods for diagnosing AD and NC are developed using positron emission tomography (PET) and cerebrospinal fluid (CSF) [9]. The use of CSF as an AD biomarker is not economical, and the interpretation of results is challenging and complex [10]. Recent years have seen a tremendous increase in AD-related research utilizing other brain imaging modalities, including electroencephalography (EEG) [11,12], which suffers from a low spatial resolution due to the small number of electrodes used, functional magnetic resonance imaging (fMRI) and structural magnetic resonance imaging (sMRI) [13]. Compared with EEG, fMRI offers a better spatial resolution and quantifies brain activity based on changes in oxygenation, blood volume and flow. In contrast to structural MRI, which mostly reflects brain tissue information, fMRI focuses on functional brain activities and provides a more direct measurement of the involvement of different brain regions in certain brain activities [14].

Alzheimer's Disease Classification-Related Work
Early diagnosis of Alzheimer's disease (AD) can be beneficial for the diagnosed individuals and their caregivers. Among the benefits is a better prognosis, which can facilitate earlier treatment and allow for the arrangement of specialized social care and counseling for the patients and their family members. For this reason, different methods have been proposed over the years for the analysis and classification of AD patients versus NC using structural MRI, blood oxygen level-dependent (BOLD) time-series fMRI signals or a combination of brain imaging modalities.
The Alzheimer's Disease Neuroimaging Initiative (ADNI) database is one of the most commonly used datasets for the classification of AD, mild cognitive impairment (MCI) and normal controls (NC) [15][16][17][18]. In [16], Heung et al. developed a deep learning-based classification algorithm for the diagnosis of AD and MCI. The approach blends sparse regression models with a deep neural network: a convolutional neural network (CNN) takes the predictions of the various regression models as input for making the final clinical decision. The work used an MRI dataset of 805 subjects provided by the ADNI and reported a 91.02% classification accuracy. In [17], Esmaeilzadeh et al. proposed a 3D CNN for the diagnosis of MCI/AD that addresses the small number of available labeled subjects and the high dimensionality of neuroimaging data, reporting an accuracy of 94.1%. In another work, Jack Albright [18] suggested a neural network-based approach for forecasting the progression of Alzheimer's disease with a 0.866 multi-class area under the curve (AUC). The trained model was effective in assessing the progression of Alzheimer's disease both in patients who were cognitively stable at the start and in patients who had moderate cognitive dysfunction.
A technique based on structural and metabolic connectivity was used by Zheng et al. [15] to distinguish between AD and MCI patients. Using multi-modal images, both structural and metabolic connectivity was obtained, and MRI and positron emission tomography (PET) were used, which represent the high-order morphological and metabolic interactions in the network. The proposed method achieved a 79.37% accuracy in predicting MCI-to-AD progression and demonstrated the good potential of multi-modal connectivity biomarkers for early AD diagnosis. In [19], Castellazzi et al. proposed a method based on multiple regional metrics from rs-fMRI and diffusion tensor imaging (DTI) as input features to a classifier for the automatic identification of AD from vascular dementia (VD). The method was tested on 33 AD and 27 VD using an adaptive neuro-fuzzy inference system (ANFIS) and yielded a correct prediction rate of 77.33%. In another work, a tensor-based framework for rs-fMRI classification achieved an accuracy of 86% [20]. The proposed algorithm utilized a novel connectivity pattern and has boosted the classification of early-stage AD.
The analysis of the brain network is an effective way of defining brain topological organization, which has been extensively used in the investigation of mental disorders [15,21]. It is indicated in [21] by Frank et al. that functional connectivity can be used to classify Alzheimer's disease and to identify its distinguishing features. Another significant measure to observe and visualize brain functions is an efficient imaging modality to evaluate how structurally separated and functionally specialized brain networks are linked, especially using rs-fMRI [22], which reflects spontaneous BOLD time-series signal fluctuations when a subject is not undertaking any explicit tasks. The use of BOLD time series rs-fMRI signals and deep learning techniques for the diagnosis of AD is reported in [22,23]. In [22], Ju et al. used the functional connectivity of rs-fMRI as an input to an autoencoder network for distinguishing normal ageing from MCI, which is an early stage of AD. Wang et al. used permutation entropy to investigate the complexity of rs-fMRI signals in MCI and AD patients [23]. From MCI to AD, the permutation entropy has been shown to decrease. This finding shows that rs-fMRI signal complexity analyses can be used to characterise cognitive impairments in MCI and AD.
Several studies have shown that global artifacts coming from motion and other physiological factors influence brain connectivity [24][25][26], and several approaches have been undertaken to decrease the effects of these artifacts. Global signal regression is utilized in [24] to reduce global artifacts caused by motion and respiration. The proposed regression method strengthens the associations between the functional connectivity of resting state signals and most behavioral measures of young healthy adults. A regression technique was proposed by Rasmus et al. to remove breathing variation-related fluctuations from neuronal activity using a simultaneous recording of heartbeat signals [25]. Monofractal and multifractal dynamics in fMRI have been studied by Jang et al. [26], and Wink et al. [27] introduced a feature extraction method for task-based fMRI recording in classifying fMRI volumes using a deep neural network.

Fractal Behavior of rs-fMRI Signals
One of the main objectives of resting-state neuroimaging is to detect the physiological mechanisms of resting-state brain imaging data accurately. However, this is not straightforward, as non-neuronal physiological influences have a significant effect on the resting-state signals, and these influences need to be taken into consideration in the development of the classification algorithm for AD. It is established in [28][29][30][31][32][33][34][35] that the resting-state fMRI signal follows fractal behavior, also known as long-range dependence, and exhibits self-similarity and power-law scaling properties in the time and frequency domains, respectively. The self-similarity property manifests in the form of a slowly decaying autocorrelation of the resting-state fMRI time-series signal. In the frequency domain, fractal properties appear in the form of a 1/f power spectrum. This concept is illustrated in Figure 1. In neuroimaging signals, the fractal nature may originate from several sources, such as cardiac fluctuations [36], respiration [25], vascular changes [37] and system noise. It is established that fractal behavior affects the functional connectivity [35,38,39]; hence, extracting fractal-free signal connectivity matrices is paramount for the accurate classification of AD using rs-fMRI signals. This fractal behavior can also be observed in time-series prediction and optimization [40,41]. The connectivity matrix of fractal-free signals computed via the wavelet transform of the long memory process of rs-fMRI signals is known as nonfractal connectivity [35]. It provides a better depiction of the brain's neuronal activities, as it removes the contribution of fractal behavior to the functional connectivity. Nonfractal connectivity is a correlation of short-term memory signals and is independent of fractal behavior.
The concept of fractal and nonfractal functional connectivity of rs-fMRI was introduced in 2012 by You et al. [35] and was implemented in the analysis of an rs-fMRI recording of a rat. The paper provides a representation of the fractal behavior of neuroimaging signals based on the fractionally integrated process (FIP) model. Fundamentally, the neuroimaging signal is interpreted in the FIP model as the output of a long-memory (LM) filter whose input is a nonfractal signal. In other words, through long memory filtering, a nonfractal signal is translated into a neuroimaging signal with fractal behavior, as illustrated in Figure 2. To date, the nonfractal functional connectivity has not been used to distinguish AD from NC subjects. The contributions of the proposed research are summarized as follows. This study uses nonfractal connectivity to create a classification algorithm for AD vs. NC individuals. The method is expected to outperform standard correlation in classification because nonfractal connectivity gives a more accurate depiction of the brain's neural activity [35]. The connectivity of 116 AAL brain areas is estimated, yielding a 116 × 116 connectivity matrix per participant. p-value analysis is used to find significant connections, which are fed into machine learning classifiers. Nonfractal connectivity is compared with Pearson correlation and fractal connectivity on multiple classifiers to demonstrate its classification performance. It outperformed the fractal and Pearson-based connectivity techniques by 16.4% and 17.2%, respectively. The proposed method is unique because it uses the nonfractal functional connectivity of fractal-free rs-fMRI data to distinguish between AD and NC subjects. This is the first time fractal-free rs-fMRI data have been used to distinguish Alzheimer's patients from healthy controls. 
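The slowly decaying autocorrelation described above can be illustrated with a minimal sketch. The function name `sample_autocorrelation` and the toy signals are assumptions made for illustration; the cumulative sum is only a convenient stand-in for a persistent series and is not a model of real BOLD data.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Return the sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

rng = np.random.default_rng(0)
white = rng.standard_normal(160)      # memoryless toy signal
persistent = np.cumsum(white)         # strongly persistent toy signal (illustration only)

acf_white = sample_autocorrelation(white, 10)
acf_persistent = sample_autocorrelation(persistent, 10)
```

For the memoryless series the autocorrelation drops to roughly zero after lag 0, whereas for the persistent series it stays high over many lags, which is the qualitative signature the text attributes to fractal behavior.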
The value of the proposed approach is demonstrated using two independent datasets with distinct acquisition protocols, proving the high performance of the nonfractal connectivity measure. The nonfractal method has been used in the analysis of rs-fMRI data to date [35,38,39], but no investigation into neurological disorder classification has been performed.
The rest of the paper is structured as follows. Section 2 presents the long memory process for modeling the fractal nature of rs-fMRI data. The methods of the investigation, details of the datasets and the principles of the wavelet-based fractal analysis of brain connectivity are described in Section 3. Section 4 presents the results of classifying rs-fMRI signals for the detection of AD versus NC, including the statistical analysis, feature selection and performance of the classification algorithm tested on several machine learning classifiers.

Long Memory Model of rs-fMRI Signals
In the following section, we cover univariate and multivariate fractionally integrated processes (FIPs), which include fractionally integrated noise (FIN), fractional Gaussian noise (FGN) and the auto-regressive fractionally integrated moving average (ARFIMA).

Univariate Case
Let $r(t)$, the output of the linear LM filter, be a real-valued discrete process of length $N$. The input to the LM filter is the spontaneous neural activity $m(t)$, which exhibits short-term memory and has spectral density $S_m(f)$. The two are related by

$$(1 - P)^{a}\, r(t) = m(t), \qquad (1)$$

where $a \in \mathbb{R}$ is the Hurst exponent calculated from wavelet-based fractal estimation and $P$ is the back-shift operator. The parameter $a$ controls the fractal behavior: if $-1/2 < a < 0$, $r(t)$ is anti-persistent and exhibits short memory, whereas if $0 < a < 1/2$, $r(t)$ is persistent and exhibits long memory. A white noise process has $a = 0$ [42][43][44].
In essence, the output of the linear LM filter, $r(t)$, is the convolution of the spontaneous neural activity, $m(t)$, with the filter impulse response, $h(t)$, given as

$$r(t) = \sum_{k=0}^{\infty} h(k)\, m(t - k), \qquad (2)$$

where

$$h(k) = \frac{\Gamma(k + a)}{\Gamma(a)\,\Gamma(k + 1)}. \qquad (3)$$

If $-\tfrac{1}{2} < a < \tfrac{1}{2}$, the spectral density of $r(t)$ can be given as

$$S_r(f) = \left|1 - e^{-jf}\right|^{-2a} S_m(f). \qquad (4)$$

Here, the term $|1 - e^{-jf}|^{-2a}$ in Equation (4) represents the fractal component of $r(t)$, whereas $S_m(f)$ represents the nonfractal component of $r(t)$ [35].
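As a minimal sketch of the long-memory filtering idea, the snippet below convolves a short-memory input with a truncated impulse response. It assumes the standard power-series expansion of the fractional operator $(1 - P)^{-a}$, computed by the recursion $h(0) = 1$, $h(k) = h(k-1)(k - 1 + a)/k$; the function names and the white-noise input are hypothetical and not taken from the paper.

```python
import numpy as np

def lm_filter_coefficients(a, n_terms):
    """First n_terms coefficients of the expansion of (1 - P)^(-a)."""
    h = np.empty(n_terms)
    h[0] = 1.0
    for k in range(1, n_terms):
        h[k] = h[k - 1] * (k - 1 + a) / k
    return h

def long_memory_filter(m, a):
    """Convolve a short-memory input m(t) with the truncated LM filter."""
    m = np.asarray(m, dtype=float)
    h = lm_filter_coefficients(a, len(m))
    return np.convolve(m, h)[: len(m)]

rng = np.random.default_rng(1)
m = rng.standard_normal(160)        # hypothetical nonfractal input
r = long_memory_filter(m, a=0.3)    # output with long-memory behavior
```

With $a = 0$ the filter reduces to the identity, so the output equals the input, which matches the statement that white noise corresponds to $a = 0$.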

Multivariate Case
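The univariate long-memory model can be extended to the multivariate case. A real-valued $v$-vector process $R(t) = (r_1(t), \ldots, r_v(t))$ is given by

$$\operatorname{diag}\!\left((1 - P)^{a_1}, \ldots, (1 - P)^{a_v}\right) R(t) = M(t), \qquad (5)$$

where $M(t) = (m_1(t), \ldots, m_v(t))$ represents a multivariate stationary process with spectral density $S_M(f)$. For $-\tfrac{1}{2} < a_l < \tfrac{1}{2}$, the spectral density of $R$ is given as

$$S_R(f) = \Phi(f)\, S_M(f)\, \Phi^{*}(f), \qquad (6)$$

where

$$\Phi(f) = \operatorname{diag}\!\left((1 - e^{-jf})^{-a_1}, \ldots, (1 - e^{-jf})^{-a_v}\right). \qquad (7)$$

Provided that $0 < a_l < \tfrac{1}{2}$ for $1 \le l \le v$, $R(t)$ is a stationary long-memory process with memory parameter $a = [a_1, \ldots, a_v]$. If $M(t)$ is a vector auto-regressive moving average (ARMA) process, $R(t)$ becomes a multivariate ARFIMA process. If, on the other hand, $M(t)$ is a vector of i.i.d. random variables with covariance matrix $\Sigma_m$, i.e.,

$$S_M(f) = \Sigma_m, \qquad (8)$$

$R(t)$ becomes a multivariate FIN. The cross-spectral density of $r_x(t)$ and $r_y(t)$ is then given as

$$S_{x,y}(f) = \eta_{x,y}\,(1 - e^{-jf})^{-a_x}\,(1 - e^{jf})^{-a_y}, \qquad (9)$$

where $\eta_{x,y}$ is the $(x, y)$-th element of $\Sigma_m$.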

Methodology
Figure 3 shows the proposed AD vs. NC classification algorithm employing nonfractal connectivity of rs-fMRI data. Initially, the raw data were pre-processed using a conventional method. The AAL atlas was then used to extract time series from 116 regions, followed by the generation of connectivity matrices. Before being fed into the machine learning (ML) classifier, the connectivity matrices were statistically analyzed to reduce the length of the feature vector. The nonfractal classification algorithm was evaluated against fractal connectivity and Pearson correlation as baselines. A 10-fold cross-validation framework was used for parameter optimization and classification.

Description of Dataset
The rs-fMRI datasets used in this study were obtained from three sites, abbreviated as XH [45], SLF [46] and ADNI [47]. The demographic and physiological information of the subjects for each site is summarized in Table 1, and the data acquisition protocol of the MRI scanner is given in Table 2. The first is the XH dataset from the neuropsychological research facility at Xuanwu Hospital, Beijing, China [45], with a total of 56 subjects comprising 21 normal controls (NC) and 35 AD subjects. Among the 21 healthy volunteers, there were 7 males and 14 females, aged 65.0 ± 8.1 years, whereas among the 35 AD subjects, there were 17 males and 18 females, aged 65.8 ± 8.3 years. For a balanced class size of AD and NC, 31 AD and 21 NC subjects were selected from XH, with an additional 10 NC subjects taken from the second dataset, from the Santa Lucia Foundation (SLF) [46], making the total number of NC subjects 31. The combined rs-fMRI dataset of XH and SLF is denoted as the XHSLF dataset in the Result and Discussion section.
The third dataset was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database [47] and consists of 60 subjects, with 30 AD subjects (average age 72.81 years) and 30 NC subjects (average age 75.25 years). Subjects with AD have an MMSE score of 20-26 and a CDR of 0.5-1, whereas NC subjects have an MMSE score of 24-30 and a CDR of 0. Thirty subjects were taken from each class so that all recordings have an equal number of time points: ADNI contains fMRI data with different numbers of time points (e.g., 140, 197 and 200), and we took those subjects that had 140 time points. Subjects with other numbers of time points can be considered in future studies.

Data Pre-Processing
For the analysis of fMRI data, pre-processing is necessary. These steps are crucial for the validity of the analysis and greatly improve the power of the subsequent analyses, as they remove unwanted artifacts and transform the data into a standard format. Data were pre-processed using the connectivity toolbox (CONN) in the following order: realignment, slice time correction, co-registration, normalization and smoothing. After image pre-processing, time series were extracted using the Data Processing & Analysis for (Resting-State) Brain Imaging (DPABI) toolbox [48].

BOLD Time-Series Signals Extraction
After following the standard procedure of pre-processing, the connectivity matrix was computed using time-series signals from 116 regions described by the AAL atlas. With a signal length of 160 samples, the size of the extracted signal for each subject was 160-sample × 116-region.

Functional Connectivity of rs-fMRI Signals
Based on wavelet fractal analysis, two types of connectivity, namely fractal and nonfractal, were extracted as the feature vector for the classification problem. In addition, as a baseline comparison, Pearson correlation was also considered for the classification of AD vs. NC. The detailed calculation of the Pearson correlation and the principles of fractal and nonfractal connectivity are given in the subsequent sections.

Pearson Correlation Coefficient
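A standardized measure of covariance between two variables is the Pearson correlation coefficient [49]. The Pearson correlation between two variables is calculated as follows:

$$P_{A,B} = \frac{\operatorname{cov}(A, B)}{\sigma_A\, \sigma_B}, \qquad (10)$$

where $P_{A,B}$ is the Pearson correlation between time series $A$ and $B$, cov is the covariance and $\sigma_A$ and $\sigma_B$ are the standard deviations of $A$ and $B$, respectively. The covariance can be calculated by

$$\operatorname{cov}(A, B) = E\left[(A - \mu_A)(B - \mu_B)\right], \qquad (11)$$

where $E[\cdot]$ is the expectation operator and $\mu_A$ and $\mu_B$ are the means of $A$ and $B$, respectively. Equation (10) can be re-written as

$$P_{A,B} = \frac{E\left[(A - \mu_A)(B - \mu_B)\right]}{\sigma_A\, \sigma_B}. \qquad (12)$$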
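The Pearson connectivity matrix for one subject can be sketched as below. This is an illustrative NumPy version only; the array `bold` is random placeholder data with the 160 × 116 shape mentioned in the text, and the function name is hypothetical.

```python
import numpy as np

def pearson_connectivity(ts):
    """ts: array of shape (n_samples, n_regions).
    Returns the n_regions x n_regions Pearson correlation matrix."""
    ts = np.asarray(ts, dtype=float)
    centered = ts - ts.mean(axis=0)
    cov = centered.T @ centered / ts.shape[0]   # E[(A - mu_A)(B - mu_B)]
    std = np.sqrt(np.diag(cov))                 # sigma_A for each region
    return cov / np.outer(std, std)

rng = np.random.default_rng(0)
bold = rng.standard_normal((160, 116))   # placeholder time series, not real data
conn = pearson_connectivity(bold)        # 116 x 116 matrix
```

The result agrees with `np.corrcoef(bold, rowvar=False)`; the explicit version is shown only to mirror the covariance and standard deviation terms of the formula.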

Wavelet Analysis for Fractal Connectivity
The Pearson correlation provides a measure of similarity between raw BOLD time-series signals of different brain regions. It is one of the early measures of the functional connectivity of brain signals. Using the multivariate long memory model presented in Section 2, two new concepts of functional resting-state connectivity between brain regions, fractal connectivity and nonfractal connectivity, were introduced [35]. The former is defined as the asymptotic wavelet correlation, that is, the wavelet correlation as the wavelet scale approaches infinity, whereas the latter is described as the short memory covariance of BOLD time-series signals from a pair of brain regions.
Consider R(t) to be the multivariate FIN process, representing the BOLD time-series signal of interest, with the memory parameter a, and M(t) to be a short memory function of R(t) given in Equation (5). The nonfractal connectivity of r x (t) and r y (t) is described as where η x,y represents the covariance of spontaneous neural activity signals, m x (t), and m y (t); that is, η x,y := E[m x (1)m y (1)].

Wavelet Analysis for Nonfractal Connectivity
By means of the discrete wavelet transform (DWT), the variance of a discrete time series can be decomposed across several frequency bands. Let $V_p(q, s)$ denote the wavelet coefficient of the $p$-th process $r_p(t)$ at scale $q$ and time point $s$. The wavelet covariance at scale $q$ is given as $\upsilon_{x,y}(q) := \operatorname{cov}(V_x(q, s), V_y(q, s))$. In addition, the wavelet coefficients of an FIP are covariance stationary at scale $q$, so $\upsilon_{x,y}(q)$ is independent of the time point $s$.
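A minimal sketch of scale-wise wavelet covariance and correlation follows. The Haar wavelet is an assumption chosen for brevity (the text does not name the wavelet family), the covariance estimator assumes zero-mean wavelet coefficients, and all function names are hypothetical.

```python
import numpy as np

def haar_details(x, n_scales):
    """Return Haar detail-coefficient arrays for scales 1..n_scales."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(n_scales):
        if len(approx) % 2:              # drop the last sample if the length is odd
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details

def wavelet_covariance(x, y, n_scales):
    """Mean product of detail coefficients at each scale (zero-mean assumption)."""
    dx, dy = haar_details(x, n_scales), haar_details(y, n_scales)
    return [float(np.mean(vx * vy)) for vx, vy in zip(dx, dy)]

def wavelet_correlation(x, y, n_scales):
    """Wavelet covariance normalised by the two wavelet variances at each scale."""
    cxy = wavelet_covariance(x, y, n_scales)
    cxx = wavelet_covariance(x, x, n_scales)
    cyy = wavelet_covariance(y, y, n_scales)
    return [c / np.sqrt(vx * vy) for c, vx, vy in zip(cxy, cxx, cyy)]

rng = np.random.default_rng(0)
x = rng.standard_normal(160)                 # placeholder regional signal
y = x + 0.5 * rng.standard_normal(160)       # correlated placeholder signal
corr_by_scale = wavelet_correlation(x, y, 4)
```

With 160 samples, four dyadic scales leave 80, 40, 20 and 10 coefficients respectively, so estimates at coarser scales are based on few values and are correspondingly noisier.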
Finally, the wavelet covariance of $r_x(t)$ and $r_y(t)$ at scale $q$ is associated with the cross-spectral density [50] of the wavelet coefficients at scale $q$ and the BOLD time-series signals through Equation (14). The wavelet correlation at scale $q$ and time point $s$, $\delta_{x,y}(q) := \operatorname{cor}(V_x(q, s), V_y(q, s))$, is written as

$$\delta_{x,y}(q) = \frac{\upsilon_{x,y}(q)}{\sqrt{\upsilon_{x,x}(q)\,\upsilon_{y,y}(q)}}. \qquad (15)$$

Theorem 1 (Asymptotic wavelet covariance). Assume that $R(t)$ is a multivariate FIN process driven by an i.i.d. random vector meeting the criteria in Equation (8). Then, the asymptotic wavelet covariance of $r_x(t)$ and $r_y(t)$ computed at scale $q \to \infty$ is given by Equation (16), with the coefficient $\beta_{x,y} := 2\cos$ … defined in Equation (17).

From (16) and (17), the wavelet correlation of $r_x(t)$ and $r_y(t)$ as the scale $q$ approaches $\infty$ is the fractal connectivity of the multivariate FIN. In essence, the fractal connectivity of $r_x(t)$ and $r_y(t)$ is the asymptotic wavelet correlation, $\delta^{\infty}_{x,y}$. Notably, both nonfractal connectivity in Equation (19) and fractal connectivity in Equation (13) require estimation of the memory parameters $a_x$ and $a_y$, as well as calculation of the short memory covariance, as in Equation (13). A method of estimating the Hurst exponent and the short memory covariance matrix of a multivariate FIN via the univariate maximum likelihood method was proposed by You et al. in [35]. The likelihood function for the memory parameter $a_x$ is given by Equation (20), in which the matrix $\Sigma_x$ denotes the covariance matrix of $r_x(t)$. The optimal memory parameter $\hat{a}_x$ is obtained via a minimization problem derived from Equation (20) with respect to $a_x$ [35].
The short-memory covariance for nonfractal connectivity is likewise estimated by using the linearity of the wavelet covariance over scales. Figure 4 shows a sample of 116 × 116 AD and NC connectivity matrices for nonfractal, fractal and Pearson connectivity.

Statistical Analysis, Feature Reduction and Flattening of Functional Connections
The 160 × 116 time-series signal of one subject will result in a total number of 116 × 116 = 13,456 connections. For accurate classification between the AD and NC, only highly meaningful features were selected using the one-way analysis of variance (ANOVA) of the connectivity values and expressed in terms of p-value. Apart from reducing the length of the input feature vector to ML classifiers, this will also reduce the computational cost of modeling and improve the performance of the model. Since the conventional ML classifiers only accept 1D input type, the reduced feature vector is flattened prior to being input to ML classifiers.
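The selection and flattening step can be sketched as follows. This is a minimal illustration under stated assumptions: the one-way ANOVA F-statistic is computed by hand for two groups, the conversion of F to a p-value (which needs the F-distribution with 1 and n − 2 degrees of freedom) is not shown, and the p-value matrix in the demo is random placeholder data. Function names are hypothetical.

```python
import numpy as np

def anova_f_two_groups(ad, nc):
    """One-way ANOVA F-statistic per feature for two groups.
    ad, nc: arrays of shape (n_subjects, n_features)."""
    ad, nc = np.asarray(ad, dtype=float), np.asarray(nc, dtype=float)
    n1, n2 = len(ad), len(nc)
    grand = np.vstack([ad, nc]).mean(axis=0)
    # Between-group sum of squares (1 degree of freedom for two groups)
    between = n1 * (ad.mean(0) - grand) ** 2 + n2 * (nc.mean(0) - grand) ** 2
    # Within-group sum of squares (n1 + n2 - 2 degrees of freedom)
    within = ((ad - ad.mean(0)) ** 2).sum(0) + ((nc - nc.mean(0)) ** 2).sum(0)
    return between / (within / (n1 + n2 - 2))

def select_and_flatten(conn, p_values, alpha=0.05):
    """conn: (n_subjects, n_regions, n_regions); p_values: (n_regions, n_regions).
    Returns a 2D array (n_subjects, n_selected) of significant connections."""
    mask = np.asarray(p_values) <= alpha
    return np.asarray(conn)[:, mask]

rng = np.random.default_rng(0)
conn = rng.standard_normal((8, 116, 116))       # placeholder connectivity matrices
p_values = rng.uniform(size=(116, 116))         # placeholder p-values, illustration only
features = select_and_flatten(conn, p_values)   # one 1D feature vector per subject
f_stats = anova_f_two_groups(conn[:4].reshape(4, -1), conn[4:].reshape(4, -1))
```

Each row of `features` is the flattened vector that a conventional classifier would receive; `f_stats` has one entry for each of the 116 × 116 = 13,456 connections.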

ML Classifier
The final stage in the classification process is to choose a classifier. The support vector machine (SVM) is a supervised machine learning technique that performs well even when feature vectors have many dimensions. The SVM separates the classes using a line in the 2D case and a hyperplane in higher dimensions, obtained by solving a constrained optimization problem [51]. When the classes are not linearly separable, a kernel mapping transforms the feature vectors into a new feature space, where a separating boundary is found that corresponds to the decision boundary in the original feature space. A linear kernel function was used in this study.

The equation of a linear hyperplane is given as follows:

$$w^{T} x + b = 0, \qquad (25)$$

where $w$ is the normal to the hyperplane, $x$ is the feature vector and $b$ is the bias. The SVM obtains the values of $b$ and $w$ from the training data. For a classifier with this decision boundary and class labels $y_i \in \{+1, -1\}$, the training samples $x_i$ satisfy

$$w^{T} x_i + b \ge +1 \ \text{ for } y_i = +1, \qquad w^{T} x_i + b \le -1 \ \text{ for } y_i = -1. \qquad (26)$$

Combining Equations (25) and (26), we obtain

$$y_i \left(w^{T} x_i + b\right) \ge 1. \qquad (27)$$

The constraint in Equation (27) is that all training data must lie on either side of the support hyperplanes. Support vectors are the points that are closest to the separating hyperplane.
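The linear decision rule and the margin constraint can be illustrated with a short sketch. It assumes class labels coded as +1 (AD) and −1 (NC) and a weight vector and bias that have already been learned; the toy values of `w`, `b`, `X` and `y` are hypothetical and the training step itself is not shown.

```python
import numpy as np

def decision_values(X, w, b):
    """Signed score w.x + b for each row of X."""
    return np.asarray(X, dtype=float) @ np.asarray(w, dtype=float) + b

def predict(X, w, b):
    """Assign +1 (AD) when the score is non-negative, otherwise -1 (NC)."""
    return np.where(decision_values(X, w, b) >= 0, 1, -1)

def satisfies_margin(X, y, w, b):
    """Check the hard-margin constraint y_i (w.x_i + b) >= 1 for all samples."""
    return bool(np.all(np.asarray(y) * decision_values(X, w, b) >= 1))

# Toy, hypothetical values for illustration only
w, b = np.array([1.0, -1.0]), 0.0
X = np.array([[2.0, 0.0], [0.0, 2.0]])
y = np.array([1, -1])
labels = predict(X, w, b)
margin_ok = satisfies_margin(X, y, w, b)
```

Samples for which the product in `satisfies_margin` equals exactly 1 lie on the support hyperplanes and would be the support vectors.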
Other classifiers, including KNN, decision trees and bagged trees, were also tested in addition to SVM, and the findings of the top four classifiers are reported in the result section. The best machine learning hyperparameters for SVM, KNN, decision trees and bagged trees were chosen using Bayesian optimization, with cross-validation loss as the objective function.

Performance Evaluation
Let AD be the positive and NC be the negative class. The following performance measures were calculated using the confusion matrix:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{FPR} = \frac{FP}{FP + TN},$$

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. In addition to the above metrics, we use the area under the curve (AUC) of the receiver operating characteristic (ROC) curve to visualise the classifier's performance. The ROC curve is a plot of the true positive rate (TPR), commonly called sensitivity, versus the false positive rate (FPR). The higher the AUC, the better the model distinguishes between AD and NC, implying a high level of separability.
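The confusion-matrix measures can be computed with a few lines of code. The sketch below uses the standard definitions; the counts in the example are hypothetical and are not the paper's confusion matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix measures, with AD as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),           # equals 1 - specificity
    }

# Hypothetical counts for 62 subjects, chosen only for illustration
metrics = classification_metrics(tp=28, tn=28, fp=3, fn=3)
```

With these illustrative counts, 56 of 62 subjects are classified correctly, which corresponds to an accuracy of about 90.3%.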

Result and Discussion
The performance of the three connectivity matrices (nonfractal, fractal and Pearson) in classifying AD versus NC is evaluated in this section. Results from the best-performing classifiers, generated using a 10-fold cross-validation framework, are presented for comparison.
The two datasets, XHSLF and ADNI, are independently evaluated, since the data were recorded using different protocols and have slightly different demographics. Data from 62 subjects from the XHSLF dataset and 60 subjects from the ADNI dataset are used to generate the 116 × 116 connectivity matrix for each subject, where only the significant connections with p ≤ 0.05 are selected using p-value analysis as the input features to ML classifiers. The coding and training of the classifier models were run in MATLAB 2020a on a ninth-generation Intel i7-9700 CPU operating at 3 GHz.

Statistical Analysis
To distinguish between AD and NC, p-value analysis is used to identify the significant connections for the three connectivity matrices: nonfractal, fractal and Pearson correlation. Figure 5 shows the p-values of the 13,456 connections for the nonfractal, fractal and Pearson correlation for the XHSLF and ADNI datasets. Using a confidence level of 95%, the total number of informative connections for XHSLF is 820, 3115 and 6054, whereas for the ADNI dataset the values are 630, 2168 and 7066, for the nonfractal, fractal and Pearson correlation, respectively. These functional connections are used as the input vectors to ML classifiers. Relative to nonfractal connectivity, the fractal and Pearson correlation contain more significant connections, producing higher-dimensional feature vectors.

Classification of AD vs. NC Based on Nonfractal Connectivity
The results presented in this section are arranged as follows. First, classification based on nonfractal, fractal and Pearson connectivity is tested with several classifiers and evaluated in terms of accuracy. Second, using the best classifier from the first experiment, the classification model is further evaluated in terms of sensitivity, specificity, precision, FPR and AUC, in addition to accuracy. Both experiments are conducted using a 10-fold cross-validation framework.
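A 10-fold split of the subjects can be sketched as below. Stratifying by class, so that every fold holds both AD and NC subjects, is an assumption made for this illustration; the text only states that 10-fold cross-validation was used. The function name and seed are hypothetical.

```python
import numpy as np

def stratified_kfold_indices(labels, k=10, seed=0):
    """Split sample indices into k folds, dealing each class out in turn."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        shuffled = rng.permutation(np.flatnonzero(labels == cls))
        for i, sample in enumerate(shuffled):
            folds[i % k].append(int(sample))
    return [np.array(sorted(f)) for f in folds]

# 31 AD (label 1) and 31 NC (label 0), matching the XHSLF class sizes
labels = np.array([1] * 31 + [0] * 31)
folds = stratified_kfold_indices(labels)

for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(len(labels)), test_idx)
    # A classifier would be fitted on train_idx and scored on test_idx here.
```

Each subject appears in exactly one test fold, so the accuracy averaged over the ten folds is computed from predictions on all 62 subjects.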

Selection of the Best ML Classifier
In the first experiment, after testing with several classifiers, the performances of the three best individual classifiers and the best ensemble classifier are listed in Table 3. The length of the feature vector for nonfractal, fractal and Pearson connectivity is based on the length of the reduced feature, as discussed in Section 4.1 and shown in Figure 5. The parameters for the classifiers are as follows: The KNN classifier is trained using 11 neighbors and the cosine distance metric. The decision tree employed Gini's diversity index split criteria. The bagged tree employed 30 learners, with the weighted average rule as the basis for the choice. Finally, employing the linear kernel function, the best performance of the SVM was achieved.
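The KNN configuration described above (11 neighbors, cosine distance) can be sketched in plain NumPy. This is a hedged illustration rather than the MATLAB implementation used in the study; it assumes labels coded as 1 (AD) and 0 (NC), non-zero feature vectors and simple majority voting.

```python
import numpy as np

def knn_cosine_predict(X_train, y_train, X_test, k=11):
    """Majority-vote k-nearest-neighbour prediction using cosine distance."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    # Normalise rows to unit length (assumes no all-zero feature vector)
    unit_train = X_train / np.linalg.norm(X_train, axis=1, keepdims=True)
    unit_test = X_test / np.linalg.norm(X_test, axis=1, keepdims=True)
    distance = 1.0 - unit_test @ unit_train.T          # cosine distance
    nearest = np.argsort(distance, axis=1)[:, :k]      # indices of the k closest
    votes = y_train[nearest]                           # shape (n_test, k)
    return (votes.mean(axis=1) > 0.5).astype(int)

# Tiny hypothetical example with k=3 for readability
X_train = [[1, 0], [2, 0.1], [1, 0.1], [0, 1], [0.1, 2], [0.1, 1]]
y_train = [1, 1, 1, 0, 0, 0]
pred = knn_cosine_predict(X_train, y_train, [[3, 0.2], [0.2, 3]], k=3)
```

An odd number of neighbors, such as the 11 used in the study, avoids tied votes in a two-class problem.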
Table 3 reports only the classification accuracy of the three best non-ensemble methods and the best ensemble method. For most of the classifiers and for both datasets, nonfractal connectivity gives a better classification accuracy than fractal and Pearson connectivity. The best accuracy achieved by nonfractal connectivity is 90.3%, whereas fractal and Pearson connectivity reach 82.3% and 72.6%, respectively. Since the SVM gives the best average performance for nonfractal connectivity across the two datasets, the subsequent investigations are based on the SVM classifier.

Evaluation of Significant Functional Connections
To further substantiate the good performance of the reduced nonfractal connectivity over fractal and Pearson connectivity, the SVM classifier is tested with the same feature-vector length for all three measures, equal to the number of significant nonfractal functional connections. This means that, for the XHSLF dataset, the length of the nonfractal, fractal and Pearson feature vectors is set to 820, and, for the ADNI dataset, it is set to 630.
The classification accuracy using the SVM classifier at a fixed feature-vector length is reported in Table 4. As expected, the nonfractal-based approach consistently gives the highest accuracy compared to the fractal and Pearson approaches. Notably, the highest accuracy of the nonfractal approach, recorded using 10-fold cross-validation, is 90.3%, which is approximately 1.24 and 1.75 times higher than the fractal and Pearson approaches, respectively, for the XHSLF dataset, and 1.31 times higher for the ADNI dataset. For the XHSLF dataset, the p-value of the top 820 connections is 0.007 for fractal and 0.009 for Pearson connectivity, whereas, for the ADNI dataset, the p-value of the top 630 connections is 0.0012 for fractal and 0.0006 for Pearson connectivity. This means that, at the same feature-vector length as the nonfractal connectivity, the selected fractal and Pearson connections have higher confidence levels (smaller p-values). However, even with higher confidence levels, the fractal and Pearson connectivities did not achieve an accuracy as good as that of the nonfractal connectivity. This is possibly due to the better representation of the brain's neuronal activity by the nonfractal connectivity, relative to the fractal and Pearson connectivities.
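The fixed-length selection described above can be sketched as follows: the connections are ranked by p-value and the k smallest are kept, so a measure with many significant connections ends up with a stricter effective cutoff at the same k. This is a minimal illustration, not the authors' code; the function name `select_top_k` and all p-values are hypothetical.

```python
def select_top_k(p_values, k):
    """Return the indices of the k smallest p-values and the largest p-value kept."""
    if not 0 < k <= len(p_values):
        raise ValueError("k must be between 1 and the number of connections")
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    kept = order[:k]
    cutoff = p_values[kept[-1]]  # p-value of the least significant kept connection
    return kept, cutoff

# Hypothetical p-values for six connections under two connectivity measures.
p_measure_a = [0.20, 0.04, 0.01, 0.60, 0.03, 0.049]   # few significant connections
p_measure_b = [0.002, 0.30, 0.001, 0.004, 0.008, 0.02]  # many significant connections

kept_a, cutoff_a = select_top_k(p_measure_a, k=4)
kept_b, cutoff_b = select_top_k(p_measure_b, k=4)
```

In this toy case, both measures contribute four features, but the cutoff for the second measure (0.008) is much smaller than for the first (0.049), mirroring the situation in which the fractal and Pearson connections are selected at smaller p-values than the nonfractal ones.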

Classification of AD vs. NC Using SVM
In this section, the results of a detailed evaluation using the best classifier, SVM, based on nonfractal, fractal and Pearson connectivity measures are presented. The length of the feature vector for each connectivity measure and dataset is determined as in Section 4.1.
The detailed evaluation of the classification algorithm using nonfractal connectivity and SubEn includes assessments on accuracy, sensitivity, specificity, precision, FPR and AUC. Using a 10-fold framework, the results of the assessment are presented in Table 5.
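The assessment measures listed above follow directly from the entries of a binary confusion matrix. The sketch below assumes that AD is the positive class; the counts are hypothetical and were chosen only because they give percentages close to those quoted for the XHSLF dataset in this section. They are not taken from the paper's confusion matrix.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),           # equals 1 - specificity
    }

# Hypothetical counts (assumption): 33 AD subjects and 29 NC subjects.
m = classification_metrics(tp=29, fn=4, tn=27, fp=2)
percentages = {name: round(100 * value, 1) for name, value in m.items()}
```

Note that the FPR is the complement of the specificity, so the two measures always carry the same information.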
Here, only the significant connections are used as the input vectors of the classifier. The SVM with a linear kernel is used as it performs well with high-dimensional functional connectivity data.
Table 5. Sensitivity, specificity, accuracy, precision, FPR (in %) and AUC of classification of AD vs. NC based on functional connectivity of rs-fMRI and support vector machine for the XHSLF and ADNI datasets.
For the XHSLF dataset, the nonfractal connectivity achieves the highest accuracy of 90.3%, with an 87.87% sensitivity and 93.1% specificity, an AUC value of 0.98 and an FPR of around 6.89%, demonstrating its overall good performance relative to the benchmark methods, fractal and Pearson connectivity. For the ADNI dataset, the performance of nonfractal connectivity is slightly lower but is maintained at 83.3% accuracy. Apart from the performance measures presented in Table 5, the confusion matrices for both the XHSLF and ADNI datasets are shown in Figure 6. The nonfractal AUC values of 0.99 and 0.95 for the XHSLF and ADNI datasets, respectively, indicate that the model yields a high true positive rate and a low false positive rate. The closer the AUC value is to 1, the better the model is at predicting AD as AD and NC as NC.
To further demonstrate the excellent performance of the classification using nonfractal connectivity, the receiver operating characteristic (ROC) curve is generated for the XHSLF and ADNI datasets, as shown in Figure 7. The ROC curve shows the trade-off between sensitivity and specificity. For the XHSLF dataset, the area under the curve (AUC) is 0.99 for the nonfractal, 0.86 for the fractal and 0.73 for the Pearson connectivity. Similarly, for the ADNI dataset, the AUC is 0.95 for the nonfractal, 0.72 for the fractal and 0.77 for the Pearson connectivity. The higher AUC for the nonfractal connectivity indicates the excellent classification performance of the proposed method.
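One way to read an AUC value is as the probability that a randomly chosen AD subject receives a higher classifier score than a randomly chosen NC subject. The following minimal sketch computes the AUC through this pairwise interpretation; the scores are made up, the function name is hypothetical, and AD is assumed to be the positive class.

```python
def auc_from_scores(scores, labels, positive="AD"):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for four AD and four NC subjects.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = ["AD", "AD", "AD", "AD", "NC", "NC", "NC", "NC"]
auc = auc_from_scores(scores, labels)
```

In this toy example, 15 of the 16 AD/NC pairs are ranked correctly, giving an AUC of 0.9375; a perfect ranking would give 1, and random scores would give about 0.5.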

Investigation on the Proposed AD Classification Using XHSLF+ADNI Dataset
In this section, the proposed approach is tested using a combined XHSLF and ADNI dataset with 122 subjects, 61 in each of the AD and NC classes. Using p-value analysis, significant connections for the nonfractal, fractal and Pearson connectivities are determined at a 95% confidence level. The plot of p-values for the 13,456 connections for the nonfractal, fractal and Pearson connectivities for XHSLF+ADNI is shown in Figure 8. The number of significant connections for each measure follows a similar trend to that in Section 4.1 for XHSLF and ADNI. Specifically, relative to the nonfractal connectivity, the fractal and Pearson connectivities contain more significant connections, producing higher-dimensional feature vectors as inputs to the ML classifiers.
The proposed nonfractal-connectivity-based method obtained the highest accuracy of 90.2% using KNN with a linear kernel. The two other methods, namely the fractal and Pearson connectivity, achieved their highest accuracies of 73.8% with KNN and 73% with linear SVM, respectively. An evaluation was conducted in terms of sensitivity, specificity, accuracy, precision, FPR and AUC using different ML classifiers, as presented in Table 6. For the detailed evaluation, the findings based on nonfractal, fractal and Pearson connectivities are presented in Table 7. When the number of subjects was increased to 61 per class, the proposed method still maintained a good classification accuracy of >90%, outperforming the two benchmark approaches by margins of 16.4% and 17.2%, respectively.
Notably, the rs-fMRI signal is strongly affected by the factors mentioned earlier, such as cardiac fluctuations, respiration and system noise. The oscillations and disruptions created by these factors affect the measured functional connectivity, so that it no longer reflects purely neuronal brain activity. As a result, existing approaches, such as seed-based and ICA-based methods, do not reach a good classification accuracy. Connectivity that is not free from fractal behavior yielded a poor accuracy, and the Pearson-based method also performed poorly because the rs-fMRI signal contains these oscillations and noise. Moreover, the proposed method showed promising results on both the individual and combined datasets, demonstrating its excellent potential.

Comparison with Related Works
In Table 8, the findings of this study are compared to previous studies in terms of accuracy, sensitivity and specificity. For the XHSLF dataset, the nonfractal approach outperformed the method of Frank de Vos et al. [21] by 16.16% in classification accuracy and by a margin of 0.14 in AUC. For the ADNI dataset, the accuracy of the proposed method is 4.3% higher than that of Frank de Vos et al. [21]. The lower accuracy for the ADNI dataset may arise from its smaller number of subjects compared with the XHSLF dataset. Furthermore, the work by Frank de Vos et al. [21] used 31 different features, comprising functional connectivity matrices of several brain networks and fast eigenvector centrality mapping of the amplitude of low frequency fluctuations (ALFF), as the input vector to logistic regression for the classification of AD from NC. Since that method combines the feature vectors from each resting-state measure, the combined feature vector is very large, which results in a low classification accuracy.
Another machine learning method for the early diagnosis of Alzheimer's disease was proposed by Kasani et al. [52], which was based on correlation of the demographic and neuropsychological information. Their reported accuracy of 82.75% for AD vs. NC classification was achieved using a bagging technique. The correlation of the demographic and neuropsychological information may not provide enough information for more promising results in AD classification.
In addition to this, the recently published work proposed by Zhu et al. [53] used seed correlation for the diagnosis of AD and showed that the functional connectivity between the hippocampus and other brain areas is altered in AD. Their proposed method achieved an accuracy of 82.02% for the classification of AD from NC subjects. The seed-based method suffers from the selection of the seed, which requires prior information and may lead to biases, hence not giving a good classification rate.

Conclusions
This study uses the connectivity of nonfractal components of rs-fMRI signals to develop a classification method for AD vs. NC individuals. The motivation is that nonfractal connectivity better represents spontaneous neural activity. In essence, nonfractal connectivity excludes the fractal components originating from system noise and other human physiological systems, such as respiratory functions, giving more accurate features of the brain signal for classifiers to learn in discriminating AD from normal control subjects. The limited number of training samples relative to the large number of features (116 × 116 = 13,456) makes the development of the classification algorithm challenging. Using p-value analysis, 820 of the 13,456 connections for the XHSLF dataset and 630 for the ADNI dataset are selected as significant and used as the input of an SVM classifier. Our experimental results indicate that the nonfractal-based method achieved accuracies of 90.3% and 83.3% for the XHSLF and ADNI datasets, respectively, and outperformed fractal connectivity and Pearson correlation by 8% and 17.7% for the XHSLF dataset and by 11.63% and 13.3% for the ADNI dataset, respectively. The proposed method also performs better in terms of accuracy, sensitivity, specificity and AUC values when compared to similar published research. The findings of this study indicate the great potential of using nonfractal connectivity as a biomarker for the diagnosis of AD. However, further investigation needs to be conducted with a larger dataset before it can be used in clinical applications.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: