Development of a Machine Learning Model to Discriminate Mild Cognitive Impairment Subjects from Normal Controls in Community Screening

Background: Mild cognitive impairment (MCI) is a transitional stage between normal aging and probable Alzheimer’s disease, and screening for MCI in the community is of great value. This study proposes a novel machine learning (ML) model that combines electroencephalography (EEG), eye tracking (ET), and neuropsychological assessments to identify MCI subjects from normal controls (NC). Methods: Two cohorts were used in this study. Cohort 1, the training and validation group, included 184 MCI patients and 152 NC subjects. Cohort 2, an independent test group, included 44 MCI and 48 NC individuals. EEG, ET, Neuropsychological Tests Battery (NTB), and clinical variables, including age, gender, educational level, MoCA-B, and ACE-R, were collected for all subjects. Receiver operating characteristic (ROC) curves were used to evaluate the ability of the tool to discriminate MCI from NC. The clinical model, the EEG and ET model, and the neuropsychological model were compared with the proposed model. Results: The classification accuracy of the proposed model was 84.5 ± 4.43% in Cohort 1 and 88.8 ± 3.59% in Cohort 2. The area under the curve (AUC) of the proposed tool was 0.941 (0.893–0.982) in Cohort 1 and 0.966 (0.921–0.988) in Cohort 2. Conclusions: The proposed model, which incorporates EEG, ET, and neuropsychological assessments, yielded excellent classification performance, suggesting its potential for future application in cognitive decline prediction.


Introduction
Alzheimer's disease (AD) is the most common neurodegenerative brain disease that affects 50-70% of patients with cognitive impairments over the age of 65 [1]. AD pathology leads to an irreversible deterioration in cognitive functions such as loss of memory, executive dysfunction, and attention disorders [2][3][4]. Mild cognitive impairment (MCI) refers to the intermediate period between the typical cognitive decline of normal aging and the more severe decline associated with dementia (e.g., AD) [5][6][7]. Because of the irreversibility of AD, it is of great value to screen MCI subjects at the community level [5,8,9].
Currently, biochemical tests (e.g., cerebrospinal fluid and blood tests) and neuroimaging tests (e.g., magnetic resonance imaging) are considered efficient screening tools for MCI [10][11][12]. However, these techniques are usually invasive and expensive, which restricts large-scale screening in the community [13,14]. Therefore, an effective and low-cost approach to detecting cognitive decline in MCI is urgently required.
Recently, MCI screening has attracted considerable interest. A Neuropsychological Tests Battery (NTB) is well recognized in the diagnostic pipelines of preclinical AD [15]. Multiple preclinical neuropsychological measures significantly predicted progression from MCI to AD and detected changes in verbal and visual memory, visuospatial processing, error control, and subjective neuropsychological complaints [16]. Paul et al. confirmed that the quick-MCI neuropsychological test can assess cognitive status in 3-5 min and can discriminate MCI accurately in primary care [17]. Neuropsychological tests are therefore appropriate for MCI community screening, as are emerging cognitive assessments that monitor cognitive function, such as electroencephalogram (EEG) and eye tracking (ET). Murty et al. found that stimulus-induced gamma rhythms from EEG were significantly lower in MCI/AD subjects than in their age- and gender-matched controls, suggesting that EEG gamma could be used as a potential screening tool for MCI or AD in humans [18]. Oyama et al. developed a brief cognitive assessment based on eye-tracking technology that enables quantitative scoring and sensitive detection of cognitive impairment in patients with mild cognitive impairment and dementia [19]. Nie et al. found that eye movement parameters are stable indicators for distinguishing patients with MCI from cognitively normal subjects and are not affected by different testing versions and numbers [20]. The incorporation of neuropsychological tests and physiological measurements warrants further study as a practical and cost-effective method for wide-scale screening to identify older adults who may be at risk for pathological cognitive decline. Neuropsychological tests alone may be limited in their effectiveness for MCI screening and are inadequate for making a definitive diagnosis.
To increase the precision and sensitivity of MCI screening, several researchers have combined NTB with objective physiological measures, such as prefrontal EEG [21] and ET [22]. For instance, our previous work validated the feasibility of physiological measures using EEG and ET in distinguishing MCI from healthy controls (HC), with a classification accuracy of 81.5% [23].
In addition, with the development of artificial intelligence techniques, machine learning (ML) methods have been widely used for the differential diagnosis of MCI [15,23–25]. For example, Lin et al. used non-invasive clinical variables and ML classifiers, including Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), to achieve over 75% accuracy in classifying subjects who converted from normal to MCI within four years [25]. Yim et al. proposed an ML algorithm to identify cognitive dysfunction based on neuropsychological tests, including the Montreal Cognitive Assessment (MoCA). The results showed good classification performance between cognitively impaired and normal subjects [15]. However, few models in previous studies combined neuropsychological tests, physiological tests, and ML algorithms.
This study aims to propose and validate a novel and low-cost screening model consisting of neuropsychological tests, physiological tests, and ML algorithms. Importantly, to evaluate the robustness of the model, two independent cohorts were used in this study. Figure 1 shows the flowchart of the proposed model, which is composed of four steps: data collection, data preprocessing, feature extraction and selection, and classification based on ML classifiers. These steps are described in detail below.

Data Collection
EEG, ET, neuropsychological test (Table S1), and demographic data (age, gender, and education) were selected as the inputs of the model. Details of the data collection step were described in our previous study [23] and are provided in Figure S1 of the Supplementary Material.

Data Preprocessing
This model included an automatic data preprocessing step for EEG, ET, and NTB.

EEG Preprocessing
Invalid EEG data were first removed according to whether the EEG electrode was offset. Next, power-frequency noise, electromyogram signals, electrocardiogram signals, and other external noise were removed using a band-stop filter and a band-pass filter. A second-order Butterworth filter was applied with a passband of 0.5-30 Hz. Finally, the EEG data were segmented using a 5 s moving window with 60% overlap, providing 15 overlapping segments for each subject. The EEG signal was preprocessed using the EEGLAB toolbox implemented in MATLAB 2018a (Math Works Inc., Sherborn, MA, USA).
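The segmentation step can be illustrated with a minimal Python sketch (the study itself used EEGLAB in MATLAB). It assumes a 5 s window with 60% overlap, that is, a 2 s hop. The sampling rate `FS` and the 33 s recording length are hypothetical values chosen only because they yield 15 windows; the source does not state either.

```python
def sliding_windows(n_samples, fs, win_sec=5.0, overlap=0.6):
    """Return (start, stop) sample indices of overlapping windows."""
    win = int(round(win_sec * fs))            # window length in samples
    step = int(round(win * (1.0 - overlap)))  # hop size: 2 s for a 5 s window at 60% overlap
    if win <= 0 or step <= 0:
        raise ValueError("window and hop must be at least one sample")
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]


FS = 250                 # hypothetical sampling rate in Hz (assumption)
DURATION_SEC = 33        # hypothetical recording length (assumption)
segments = sliding_windows(n_samples=DURATION_SEC * FS, fs=FS)
print(len(segments), segments[0], segments[-1])
```

Each returned pair can be used to slice one channel of the filtered signal, for example `channel[start:stop]`.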

ET Preprocessing
First, excessive noise in the ET data was eliminated. Next, the gaze position signal was normalized to the display coordinates to avoid interpolation bias. Finally, a low-pass Butterworth filter with a cut-off frequency of 5 Hz was applied, implemented in MATLAB 2018a (Math Works Inc., Sherborn, MA, USA).

NTB Data Preprocessing
NTB data were cleaned, and all abnormal values were eliminated. The neuropsychological test scores were then normalized to the range 0-1.
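Scaling scores to the 0-1 range is commonly done with min-max normalization. The sketch below assumes that method, which the source does not name explicitly, and uses hypothetical raw scores.

```python
def min_max_normalize(scores):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


raw_scores = [18, 22, 25, 30]   # hypothetical subtest scores (assumption)
normalized = min_max_normalize(raw_scores)
print(normalized)
```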

EEG Data
Frequency-domain (spectral) features of the EEG signal were extracted. The power spectral density was obtained by taking the Fourier transform of the autocorrelation function, which transforms the EEG signal from the time domain to the frequency domain. Four EEG frequency bands (delta 0.5-4 Hz, theta 4-8 Hz, alpha 8-13 Hz, and beta 13-30 Hz) were analyzed in this study. The power spectrum of each frequency band and specific spectral power ratios, such as the alpha/theta power ratio, were computed. The extracted linear features of the EEG were consistent with our preliminary work [23]. Nonlinear features of the EEG, including approximate entropy (ApEn) [26], multiscale entropy (MsEn) [27], and Lempel-Ziv complexity (LZC) [28], were also calculated. The calculation formulas of the EEG features are described in the Feature extraction and selection section of the Supplementary Material.
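As an illustration of one nonlinear feature, the following Python sketch computes Lempel-Ziv complexity on a binarized signal. Binarizing around the median and normalizing by n / log2(n) are common conventions assumed here; the exact formulas used in the study are given in its Supplementary Material and may differ.

```python
import math
from statistics import median


def lz76_complexity(symbols):
    """Count the components in the Lempel-Ziv (1976) parsing of a string."""
    i, count, n = 0, 0, len(symbols)
    while i < n:
        length = 1
        # Extend the candidate while it can still be copied from the text
        # that precedes its own last symbol.
        while i + length <= n and symbols[i:i + length] in symbols[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def normalized_lzc(signal):
    """Binarize a signal around its median and return normalized LZC."""
    threshold = median(signal)
    bits = "".join("1" if x > threshold else "0" for x in signal)
    n = len(bits)
    return lz76_complexity(bits) * math.log2(n) / n


# Textbook sequence parsed as 0|001|10|100|1000|101, i.e. six components.
print(lz76_complexity("0001101001000101"))
```

Applied to an EEG segment, `normalized_lzc(segment)` returns one scalar feature per channel and window.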

ET Data
ET data were divided into saccade data and gaze data. The association of gazes and saccades with specific regions of the visual stimuli was examined. Then, visual scan parameters such as blink frequency, blink time, fixation time, and sustained attention duration were calculated. The nonlinear features of ET were extracted using LZC.

NTB Data
NTB data, which are numerical, included subtest scores, total test scores, and response time. Meaningful numerical features were subsequently converted to z-scores using Z transformation.
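The Z transformation can be sketched as follows. This is a minimal illustration that assumes the population standard deviation; the source does not say whether the sample or population form was used, and the example values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


response_times = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical response times (assumption)
standardized = z_scores(response_times)
print(standardized)
```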

Feature Selection
The Minimum Redundancy-Maximum Relevance (MRMR) algorithm was used for feature selection [29]. In the MRMR algorithm, the correlation between a feature subset and the target classes is modeled as:

$$\max_{\Omega} \mathrm{Rel}(\Omega, m), \qquad \mathrm{Rel}(\Omega, m) = \frac{1}{|\Omega|} \sum_{f_i \in \Omega} M(f_i; m)$$

where the feature subset $\Omega$ is from the feature set $F$ and $F = \{f_1, \ldots, f_D\}$. In this tool, $m = \{+1, -1\}$ represents HC and MCI, respectively, and $M$ is the mutual information between a feature and the target classes, which is given by

$$M(X; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$

where $p(X)$, $p(Y)$, and $p(X, Y)$ are the marginal and joint probability distributions of the variables $X$ and $Y$, respectively. Clearly, the mutual information is zero when $p(X, Y) = p(X)p(Y)$, which means that the feature is independent of the target classes. The redundancy between the feature $f_i$ and the other features can be modeled as:

$$\min_{\Omega} \mathrm{Red}(\Omega), \qquad \mathrm{Red}(\Omega) = \frac{1}{|\Omega|^2} \sum_{f_i, f_j \in \Omega} M(f_i; f_j)$$

Thus, the features meeting the minimum redundancy-maximum relevance principle can be obtained via:

$$\max_{\Omega} \left[ \mathrm{Rel}(\Omega, m) - \mathrm{Red}(\Omega) \right]$$

In the above equation, the optimal features are obtained by maximizing the correlation between the features and the target classes while minimizing the redundancy among the features. By performing similar operations on different feature subsets, multiple optimal features can be found, which reduces complexity and improves the decision performance of the algorithm.
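The selection principle described above is often implemented greedily: at each step, add the feature whose mutual information with the labels, minus its mean mutual information with the already selected features, is largest. The Python sketch below is one such illustration, not the study's implementation. It assumes features have already been discretized, and the feature names and values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy mRMR: features maps a name to a list of discrete values."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, float("-inf")
        for name, col in features.items():
            if name in selected:
                continue
            # Mean redundancy with features already chosen (0 for the first pick).
            redundancy = (sum(mutual_information(col, features[s]) for s in selected)
                          / len(selected)) if selected else 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Hypothetical toy data: +1 = HC, -1 = MCI, following the paper's label convention.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
features = {
    "feat_a": [1, 1, 1, 0, 0, 0, 0, 0],   # most relevant
    "feat_b": [1, 1, 1, 0, 0, 0, 0, 1],   # near-duplicate of feat_a
    "feat_c": [1, 1, 0, 1, 0, 0, 1, 0],   # equally relevant as feat_b, less redundant
}
print(mrmr_select(features, labels, k=2))
```

In this toy case `feat_b` and `feat_c` are equally relevant to the labels, so the redundancy term alone decides the second pick.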

Classification
A support vector machine (SVM) was used as the ML classifier, implemented with Anaconda Spyder 3.7 (Anaconda Inc., Austin, TX, USA). As a classic supervised learning method, SVM has been widely used in statistical classification and regression analysis because it can map input vectors to a higher-dimensional space and construct a maximum-margin hyperplane there, achieving high classification performance.
The support vectors are the training points closest to the hyperplane; they determine its position and orientation, and the hyperplane is placed so that the margin to these points is maximized. Kernel functions, or the "kernel trick", were applied to handle data that are not linearly separable in the original feature space. The kernel trick implicitly maps the data from a lower-dimensional space to a higher-dimensional one. The amount of information remains the same, but in this higher-dimensional space it becomes possible to construct a linear classifier. The kernel function K is evaluated between each point and the support vectors, which helps determine the best-fit hyperplane in the transformed feature space. With a suitable kernel, a precise separation can be obtained.
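For a linear SVM classifier with a hard margin, the hyperplane is obtained by minimizing the norm of the weight vector $W$:

$$\min_{W,\, b} \; \frac{1}{2} \lVert W \rVert^2 \quad \text{subject to} \quad y_i \left( W^{\top} x_i + b \right) \ge 1, \quad i = 1, \ldots, N$$

where $x_i$ is the feature vector of subject $i$, $y_i \in \{+1, -1\}$ is its class label, and $b$ is the bias. With the kernel trick, the inner product $x_i^{\top} x_j$ is replaced by a kernel function $K(x_i, x_j)$.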
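To make the kernel idea concrete, the sketch below evaluates an RBF kernel and the resulting SVM decision function, f(x) = Σ αᵢ yᵢ K(xᵢ, x) + b, for a classifier that has already been trained. The support vectors, dual coefficients, bias, and gamma value are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_function(x, support_vectors, dual_coefs, bias, gamma):
    """Signed score of x; dual_coefs[i] is alpha_i * y_i for support vector i."""
    return sum(coef * rbf_kernel(sv, x, gamma)
               for sv, coef in zip(support_vectors, dual_coefs)) + bias


# Hypothetical trained model (assumption): one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [1.0, -1.0]
bias, gamma = 0.0, 0.5

for point in [(0.1, 0.2), (1.9, 2.1)]:
    score = decision_function(point, support_vectors, dual_coefs, bias, gamma)
    print(point, "+1" if score >= 0 else "-1")
```

The sign of the score gives the predicted class, and its magnitude can serve as the continuous output needed for ROC analysis.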

Subjects
We recruited two cohorts for this study. Cohort 1 was composed of 336 subjects from four communities in Jiading district, Shanghai, China, including 152 MCI patients and 184 normal control (NC) subjects. Cohort 2 was composed of 44 MCI patients and 48 NC subjects from one community in Baoshan district, Shanghai, China. All subjects underwent a battery of cognitive evaluations, including Addenbrooke's Cognitive Examination-revised (ACE-R) and Montreal cognitive assessment-basic (MoCA-B). Permission to use MoCA-B in the study was received via https://www.mocatest.org/permission (accessed on 28 June 2017).
All subjects signed an informed consent form before the examinations. This study was approved by the ethics committee of Long Hua Hospital in Shanghai University of Traditional Chinese Medicine (Ethical number: 2017LCSY345) and conducted in accordance with the principles of the Declaration of Helsinki. In this study, Cohort 1 was used as the training and validation group to train the SVM classifier. Cohort 2 was used as an independent test group to verify the robustness of the classification results.
MCI was defined using the actuarial neuropsychological strategy proposed by Jak and Bondi [30]: subjects were considered to have MCI if they met any of the following three criteria and did not meet the criteria for dementia. The inclusion criteria for MCI were as follows [31,32]: (1) right-handed, Mandarin-speaking subjects; (2) a subjective memory complaint; (3) memory impairment relative to age- and education-matched healthy elderly individuals confirmed by performance on neuropsychological assessments (below 1.5 standard deviations); (4) intact general cognitive function confirmed by MoCA-B scores ≥ 26; (5) intact activities of daily living; and (6) without dementia confirmed by a physician.
The exclusion criteria for MCI were as follows: (1) other neurological diseases including cerebrovascular disease, brain trauma, Parkinson's syndrome, brain tumor, and epilepsy; (2) current major psychiatric disease such as severe depression and anxiety; (3) other neurological conditions that could cause cognitive decline (e.g., brain tumors, Parkinson's disease, encephalitis, or epilepsy) rather than AD spectrum disorders; (4) systemic diseases that may lead to cognitive decline (thyroid dysfunction, severe anemia, syphilis, HIV, etc.); (5) other conditions such as a history of CO poisoning and general anesthesia; (6) severe visual or hearing impairment; (7) contraindication for MRI.
The inclusion criteria for NC included the following: (1) no subjective or informant-reported memory decline; (2) non-clinical depression (Geriatric Depression Scale scores < 6); (3) normal age-adjusted, gender-adjusted, and education-adjusted performance on standardized cognitive tests.

Data Acquisition
All data were collected from 1 September 2017 to 31 August 2018 in the communities in Shanghai, China. The data collection protocol is described in the Supplementary Material.

Validation Experiments for Optimal Parameters of the Classifier
The hyper-parameters of the SVM classifier, including the kernel function, the penalty factor C, and the kernel coefficient gamma, were tuned by 5-fold cross-validation to obtain good classification performance. Different kernels, including linear, polynomial, and RBF, were compared in this study. Cohort 1 was used to tune these parameters.
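The tuning procedure can be outlined as a grid search wrapped around 5-fold cross-validation. The sketch below shows only that control flow and is not the study's code. The grid values are hypothetical, and `toy_score` is a placeholder for training an SVM on the training fold and scoring it on the validation fold. In practice the subjects would usually be shuffled and stratified by class before splitting.

```python
from itertools import product


def k_fold_indices(n_samples, k=5):
    """Yield (train, val) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def grid_search(n_samples, grid, fit_and_score, k=5):
    """Return the parameter set with the highest mean validation score."""
    keys = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        scores = [fit_and_score(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, val_idx):
    # Placeholder only (assumption): a real scorer would fit an SVM with
    # `params` on train_idx and return its accuracy on val_idx.
    return 1.0 / (1.0 + abs(params["C"] - 1.0))


# Hypothetical search grid (assumption); 336 is the size of Cohort 1.
grid = {"kernel": ["linear", "poly", "rbf"],
        "C": [0.1, 1.0, 10.0],
        "gamma": [0.001, 0.01]}
best_params, best_score = grid_search(336, grid, toy_score, k=5)
print(best_params, best_score)
```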

Discriminative Analysis
The classification results from four models were compared by using the SVM classifier, including (1)

Statistical Analysis
Differences in demographic and cognitive performance between the NC group and the MCI group were evaluated by two-sample t-tests or chi-square (χ²) tests using the Statistical Package for the Social Sciences V24 (SPSS Inc., Chicago, IL, USA). The significance level was set at p < 0.05. Receiver operating characteristic (ROC) curves were used to evaluate the ability of the tool to distinguish MCI from NC. The areas under the curves (AUCs) with 95% confidence intervals (CIs) were calculated.
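The AUC has a simple interpretation: it is the probability that a randomly chosen MCI subject receives a higher classifier score than a randomly chosen NC subject, with ties counted as one half. The sketch below computes it directly from that definition on hypothetical labels and scores. It does not reproduce the study's analysis, and the confidence interval computation (for example by bootstrapping) is not shown.

```python
def roc_auc(labels, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data (assumption): 1 = MCI, 0 = NC.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```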

Demographic and Clinical Characteristics
The detailed demographic and clinical characteristics are reported in Table 1. The MoCA-B and ACE-R scores of MCI patients were significantly lower than those of NC subjects (p < 0.001, two-sample t-test). In Cohort 1, there were no significant differences in age (p = 0.875; two-sample t-test), gender (p = 0.541; chi-square test), or years of education (p = 0.071; Wilcoxon rank-sum test). In Cohort 2, there were likewise no significant differences in age (p = 0.783; two-sample t-test), gender (p = 0.492; chi-square test), or years of education (p = 0.068; Wilcoxon rank-sum test).

Validation Experiments for Optimal Parameters of Classifier
The best classification performance was obtained with the RBF kernel under the parameters C = 1.1 and gamma = 0.001. Table 2 shows the detailed performance of the different kernel functions and the corresponding parameters.
Tables 3 and 4 show the comparison results of the four models in Cohort 1 and Cohort 2, respectively. The proposed tool performed better than the other models in Cohort 1 (Accuracy: 84.5 ± 4.43%; Sensitivity: 81.9 ± 7.88%; Specificity: 86.8 ± 6.19%; AUC: 0.942 (0.893-0.982)) and in Cohort 2 (Accuracy: 88.8 ± 3.59%; Sensitivity: 86.2 ± 6.46%; Specificity: 91.0 ± 5.39%; AUC: 0.966 (0.921-0.988)). Figures 2 and 3 show the ROC results of the four models in both cohorts.
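For reference, accuracy, sensitivity, and specificity follow from the counts of true and false positives and negatives, with MCI treated as the positive class. The sketch below uses hypothetical predictions and is unrelated to the reported results.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of MCI subjects detected
        "specificity": tn / (tn + fp),   # share of NC subjects correctly cleared
    }


# Hypothetical labels (assumption): 1 = MCI, 0 = NC.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```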

Discussion
Cognitive decline remains highly underdiagnosed in the community despite extensive efforts to find novel approaches to detect MCI, and objective screening methods for cognitive decline could improve early MCI diagnosis. MCI screening in the community has therefore become an active research topic. Because of their excellent performance in detecting cognitive decline in MCI patients, multimodal detection approaches have been commonly used in computer-aided diagnosis for community screening. In this study, we proposed an ML model based on EEG, eye movement, and neuropsychological tests for MCI screening at the community level. Our model outperformed traditional models such as the EEG-based model, the ET-based model, and the NTB-based model in classification.
To date, many studies have focused on classifying NC and MCI using machine learning models for screening in primary care. For instance, Siuly [15]; and, Wang et al. developed a Random Forest (RF)-based model to optimize the content of cognitive evaluation and achieved an accuracy of 68% in the classification of MCI and NC [35].
Notably, our classification results were similar to those of previous studies, indicating the reliability of our results. As shown in Table 5, although previous studies based on EEG analysis achieved powerful discrimination for MCI detection (ACC = 98.8% in Siuly's model), these studies relied on expensive physiological signal collection devices that require long recordings and are seldom used in primary care. By contrast, the wearable EEG device used in our approach is more suitable for large-scale MCI screening. Compared with earlier studies based on ET and NTB, our method achieved better accuracy. The advantages of our method can be summarized as follows: (1) In terms of feature extraction, linear and nonlinear feature analysis has been successfully used to identify powerful biomarkers of neurophysiological diseases, such as Alzheimer's disease (AD). In this study, we applied both linear and nonlinear methods to extract EEG and eye movement features. For EEG, complexity analysis, as a nonlinear dynamic method, can represent the rate at which new patterns appear in a time series, and to a certain extent, details of the signal can be presented in the binarized sequence. (2) In terms of feature selection and classification, the SVM model was selected. As an ML model, the SVM is suitable for classifying the features obtained from neuropsychological assessments. (3) In terms of the clinical setting, we presented a machine learning framework for automated analysis of cognitive assessment data that precisely classifies healthy individuals and individuals with mild cognitive impairment. Our work opens the possibility of automated assessment of cognitive function in community screening.
Although our proposed method achieved good classification of MCI and NC, several limitations remain. First, the whole experiment is time-consuming, which reduces the completion rate and cooperation of patients. Second, the de-noising algorithm may influence the results of feature extraction and classification. Third, the sample size of NC and MCI individuals was limited, and larger samples should be considered in future studies. Fourth, longitudinal imaging studies are still absent; in subsequent research, ongoing follow-up observational studies of individuals will facilitate the investigation and validation of our results. Finally, only SVM was used as the classifier in this study. Alternative classifiers, such as extreme learning machines or deep learning models, might yield better classification results.

Conclusions
In this study, an automatic and non-invasive MCI detection model was proposed, which integrated EEG, eye movement techniques, and a neuropsychological test battery. The results indicate its potential application for MCI detection and for guiding referral for a more comprehensive evaluation, ultimately facilitating early intervention in primary care.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/brainsci12091149/s1. Figure S1. Data collection in this study. Table S1. Neuropsychological tests used in this study.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the subjects to publish this paper.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.