Enhancing Mild Cognitive Impairment Auxiliary Identification Through Multimodal Cognitive Assessment with Eye Tracking and Convolutional Neural Network Analysis

Li, Na; Wang, Ziming; Ren, Wen; Zheng, Hong; Liu, Shuai; Zhou, Yi; Ju, Kang; Chen, Zhongting

doi:10.3390/biomedicines13030738


by Na Li ¹,²,†, Ziming Wang ²,†, Wen Ren ², Hong Zheng ¹, Shuai Liu ¹, Yi Zhou ², Kang Ju ¹,* and Zhongting Chen ¹,²,*

¹ Shanghai Changning Mental Health Center, Affiliated Mental Health Center of East China Normal University, Shanghai 200335, China

² Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Biomedicines 2025, 13(3), 738; https://doi.org/10.3390/biomedicines13030738

Submission received: 22 January 2025 / Revised: 27 February 2025 / Accepted: 7 March 2025 / Published: 18 March 2025

(This article belongs to the Special Issue Biomedical and Biochemical Basis of Neurodegenerative Diseases)


Abstract

Background: Mild Cognitive Impairment (MCI) is a critical transitional phase between normal aging and dementia, and early detection is essential to mitigate cognitive decline. Traditional cognitive assessment tools, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), have practical limitations that can compromise their results in early-stage MCI detection. This study developed and tested a supportive cognitive assessment system for MCI auxiliary identification, leveraging eye-tracking features and convolutional neural network (CNN) analysis. Methods: The system employed eye-tracking technology in conjunction with machine learning to build a multimodal auxiliary identification model. Four eye movement tasks and two cognitive tests were administered to 128 participants (40 MCI patients, 57 elderly controls, 31 young adults as reference). We extracted 31 eye movement and 8 behavioral features and assessed their contributions to classification accuracy using CNN analysis. Models using eye movement features only, behavioral features only, and combined features were developed and tested to identify the most effective approach for MCI auxiliary identification. Results: Overall, the combined features model achieved a higher discrimination accuracy than models using a single feature set. Specifically, the model’s ability to differentiate MCI from healthy individuals, including young adults, reached an average accuracy of 74.62%. For distinguishing MCI from elderly controls, the model’s accuracy averaged 66.50%. Conclusions: The multimodal model outperformed the single-feature-set models in identifying MCI, highlighting the potential of eye-tracking for early detection. These findings suggest that integrating multimodal data can enhance the effectiveness of MCI auxiliary identification, providing a novel potential pathway for community-based early detection efforts.

Keywords: eye movements; MCI; CNN analysis; auxiliary identification

1. Introduction

Mild Cognitive Impairment (MCI) is a critical transitional phase between normal age-related cognitive decline and the onset of dementia [1]. Specifically, MCI is characterized by a noticeable deterioration in cognitive functions, including memory, executive functions, language, and visuospatial skills, while daily living capabilities remain largely intact [2]. According to Jia et al. (2020) [3], the prevalence of MCI in China has reached 15.5%, affecting approximately 38.77 million people. Given this increasing prevalence, early and accurate detection of MCI is essential for timely intervention to mitigate further cognitive decline.

Traditional cognitive assessments, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), are the most widely used methods for early MCI detection, although they have notable practical limitations [4,5,6]. For instance, the comprehensive nature of the MoCA necessitates clinical expertise for administration, which hinders broader detection efforts. Higher-educated individuals typically achieve better scores [7], which can lead to misinterpretation of cognitive impairment, especially among illiterate individuals, who tend to score significantly lower than literate individuals [8]. The MMSE is more accessible for individual use, but its sensitivity and stability are compromised by educational bias, which affects its diagnostic accuracy [3,9]. It also exhibits a ceiling effect: an optimal cutoff point of 29/30 yields only 54% correct diagnoses of cognitive impairment, indicating limited sensitivity for detecting MCI [10]. Moreover, both the MMSE and MoCA rely on subjective scoring, particularly for open-ended questions and tasks that require interpretation (e.g., drawing or construction tasks). Scores can therefore vary with the scorer’s judgment, which may affect the reliability of the results [11]. There is thus an urgent need for automated measurement methods that can efficiently and accurately assist in the identification of MCI, ideally being non-invasive, easy to administer, and capable of overcoming the limitations of current tools.

In this context, eye-tracking technology has emerged as a promising non-invasive alternative for identifying and monitoring cognitive decline among various population groups [12]. Recent research has underscored the effectiveness of eye tracking in distinguishing individuals with Alzheimer’s disease (AD) from MCI patients and healthy controls, utilizing indices such as reaction time and error rates from an anti-saccade task [13]. Another study has demonstrated a novel 3 min eye-tracking test’s efficacy in assessing cognitive function among normal controls, MCI, and AD subjects, particularly focusing on memory and deductive reasoning tasks [14]. Prior studies have validated the effectiveness of eye movement tracking techniques in differentiating cognitive impairments, highlighting its utility in the early identification and monitoring of cognitive decline [15,16].

In addition, the introduction of machine learning, particularly into the realm of cognitive impairments, enhances diagnostic precision and reliability of using biomarkers for early identification and treatment strategies [17]. For example, Convolutional Neural Network (CNN) models have demonstrated high accuracy in classifying Alzheimer’s disease and predicting the transition from MCI to Alzheimer’s dementia [18,19]. Combining eye-tracking data with machine learning has shown significant potential in improving diagnostic accuracy, with algorithms like Support Vector Machines (SVM) and Bayesian classifiers achieving over 80% accuracy in assessing cognitive disorders [20]. Likewise, Lin et al. (2024) [21] employed the Light Gradient Boosting Machine (LightGBM) to combine gait and eye movement features for accurately identifying cognitive impairments. Overall, advanced machine learning classifiers, including those that were trained with eye movement data, are a potential approach to distinguishing MCI and early Alzheimer’s with high precision [11,22].

In light of these findings, this study aims to develop a cognitive status assessment system based on eye-tracking features and deep neural network analysis using CNN. We further analyze the contribution of eye-tracking and behavioral data to classification accuracy. Our MCI dataset comprises 128 valid participants, including 57 elderly controls, 40 MCI patients, and 31 young adults. Ultimately, this research, building upon previous studies, validates the feasibility of using eye-tracking features for auxiliary identification of cognitive impairments and contributes to the development of more accessible community-based auxiliary identification systems, holding significant implications for future researchers in constructing broader identification models [23].

2. Materials and Methods

2.1. Participants

One hundred and forty-seven participants in total volunteered for the present study, comprising forty-seven MCI participants (range: 60–87 years, mean: 72.55 years, SD: 6.72 years), sixty-nine healthy control participants (range: 56–87 years, mean age: 69.7 years, SD: 7.88 years), and thirty-one healthy young participants (range: 17–27 years, mean: 21.71 years, SD: 2.60 years) as a reference group. Visual acuity was measured in all participants to ensure normal or corrected-to-normal vision. Participants in the control group and the young group had no history of neurological diseases, psychiatric disorders, or musculoskeletal dysfunctions. All participants and their guardians signed informed consent forms. This study was approved by the Ethics Committees of the Shanghai Changning Mental Health Center and East China Normal University, and it conforms to ethical standards for human research.

Participants in the healthy young control group were recruited from East China Normal University and did not undergo MCI diagnostic evaluations. MCI patients and healthy elderly participants were recruited through a collaboration between the Shanghai Changning Mental Health Center and local community committees. The MCI patients met the diagnostic criteria proposed by Petersen et al. (1999) [24]. The diagnosing process was as follows: For those with potential MCI risk, a standardized evaluation was conducted, including clinical symptoms of cognitive impairment, assessment of daily living abilities, and the severity of cognitive impairment. Specifically, (i) the clinical symptoms of cognitive impairment were confirmed by the patient, an informant, or an experienced clinician; (ii) the Activities of Daily Living (ADL, Barthel Test) scale was used to assess daily living abilities; and (iii) due to logistical constraints, the assessment of cognitive impairment severity was divided into two batches. The first batch used the Mini-Mental State Examination (MMSE) scale to assess 62 individuals, with 3 identified as MCI patients. The second batch used the Montreal Cognitive Assessment (MoCA) scale to assess 54 individuals, with 44 identified as MCI patients. In total, 47 MCI patients were selected.

2.2. Apparatus

Each participant was seated comfortably in a dim room facing an LED monitor (Lenovo, Beijing, China, D27-30, 1920 × 1080 pixels, 27-in., 60 Hz) positioned in the frontal plane 50 cm from the participant’s eyes. Head movements were restrained by a chin rest so that the eyes were directed toward the center of the screen. Movements of the left eye were recorded at a sampling rate of 60 Hz using a Gazepoint3 eye tracker (Gazepoint, Vancouver, BC, Canada). The experiment was programmed in MATLAB (version: R2018b) with the Psychtoolbox package [25,26,27].
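Because all stimulus sizes and positions in the tasks below are given in degrees of visual angle, it can help to see how they map onto this display. The following is a minimal sketch, assuming a flat 16:9 panel with square pixels and eccentricities measured from the screen center; the function names are illustrative and are not taken from the authors' MATLAB code.

```python
import math

# Values taken from the apparatus description.
DIAGONAL_IN = 27.0          # monitor diagonal, inches
RES_X, RES_Y = 1920, 1080   # screen resolution, pixels
VIEW_DIST_CM = 50.0         # eye-to-screen distance, cm


def pixels_per_cm():
    """Pixel density, assuming square pixels on a flat panel."""
    diagonal_px = math.hypot(RES_X, RES_Y)
    width_cm = DIAGONAL_IN * 2.54 * RES_X / diagonal_px
    return RES_X / width_cm


def eccentricity_to_px(deg):
    """Pixels from screen center to a point `deg` degrees off the line of sight."""
    return VIEW_DIST_CM * math.tan(math.radians(deg)) * pixels_per_cm()


def size_to_px(deg):
    """On-screen extent in pixels of a centered object subtending `deg` degrees."""
    return 2 * VIEW_DIST_CM * math.tan(math.radians(deg / 2)) * pixels_per_cm()


if __name__ == "__main__":
    print(f"9.7 deg eccentricity: {eccentricity_to_px(9.7):.1f} px")
    print(f"0.5 deg diameter:     {size_to_px(0.5):.1f} px")
```

Under these assumptions, a target at 9.7° lies about 275 pixels from the center, which fits within the 540-pixel half-height of the screen, and a 0.5° circle is about 14 pixels wide.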

2.3. Task and Procedure

2.3.1. Eye Movement Tasks

Four visual tasks were designed and performed in separate sessions: a pro- and anti-saccades task [28,29], a smooth pursuit task [30], a memory-guided saccades task [31,32,33], and a predictive saccades task [34,35] (see Figure 1).

Pro- and anti-saccades task: A drift correction procedure was performed before each trial. Participants were asked to look at a fixation dot at the center of the screen on a gray background and press the space bar to start. A black fixation cross (+; 1° × 1° of visual angle) was then presented at the screen center, together with four black circles (diameter: 0.5° of visual angle) displayed ±9.7° horizontally and vertically from the screen center. Participants first fixated the cross, which lasted between 1000 and 1500 ms. Then, in each trial, one of the black circles, chosen at random, turned blue or red. In the pro-saccade task, participants were required to look at the target (the blue circle). In the anti-saccade task, participants were required to look at the position opposite the target (the red circle). For trials that timed out (the participant failed to look at the color-changed circle within 1000 ms after target onset) and for trials in which the initial saccade went to the wrong position, participants received visual feedback (“wrong”) lasting 200 ms. Each participant first completed 16 practice trials (2 per direction for each of the pro- and anti-saccade conditions) and then started the formal experiment, which comprised 8 trials per direction across 2 blocks for each of the pro- and anti-saccade tasks, for a total of 64 trials (see Figure 1A).

Smooth pursuit task: Participants were asked to gaze at a red fixation circle (diameter: 0.5° of visual angle) at the center of the screen on a gray background. Once the participant had fixated the circle for more than 1200 ms, it turned blue and the trial started. Participants were instructed to track the blue target, which moved sinusoidally in a horizontal or vertical direction. The total amplitude of the target movement was 20° of visual angle (10° either side of the center). The target oscillated at a frequency of either 0.25 Hz or 0.4 Hz. Each participant first completed 2 practice trials (one horizontal and one vertical), each of which lasted 10 s, and then started the formal experiment, which comprised 2 trials for each combination of frequency and direction, for a total of 8 trials (see Figure 1B).
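The sinusoidal target motion described above can be sketched as follows. This is an illustration only: it assumes the target starts at the center with zero phase and is redrawn once per 60 Hz display frame, neither of which is stated in the text, and the function names are hypothetical.

```python
import math

AMPLITUDE_DEG = 10.0   # peak displacement from center (20 deg total excursion)
REFRESH_HZ = 60        # display refresh rate from the apparatus description


def target_position(t_s, freq_hz):
    """Target displacement (deg) at time t_s, assuming a zero-phase sine."""
    return AMPLITUDE_DEG * math.sin(2 * math.pi * freq_hz * t_s)


def peak_velocity(freq_hz):
    """Peak target speed (deg/s) of a sinusoid: 2 * pi * f * A."""
    return 2 * math.pi * freq_hz * AMPLITUDE_DEG


def trajectory(freq_hz, duration_s):
    """One displacement sample per display frame."""
    n_frames = round(duration_s * REFRESH_HZ)
    return [target_position(i / REFRESH_HZ, freq_hz) for i in range(n_frames)]


if __name__ == "__main__":
    for f in (0.25, 0.4):
        print(f"{f} Hz: peak speed {peak_velocity(f):.1f} deg/s")
```

With a 10° amplitude, the two frequencies correspond to peak target speeds of roughly 15.7 and 25.1 deg/s, so the 0.4 Hz condition is the more demanding one for pursuit.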

Memory-guided saccades task: A drift correction procedure was performed before each trial. Participants were asked to look at a fixation dot at the center of the screen on a gray background and press the space bar to start. A black fixation cross (+; 1° × 1° of visual angle) was then presented at the screen center, and participants fixated it for between 1000 and 1500 ms. Next, a peripheral target (diameter: 0.5° of visual angle) was presented for 100 ms at 5°, 10°, or 15° horizontally to either side of the screen center, giving a total of 6 target locations. Participants were instructed to keep gazing at the central cross until it disappeared after a variable delay of 4500 to 5000 ms. They were then asked to make a memory-guided saccade (looking at the remembered location of the peripheral target) while the screen remained blank for 1000 ms. Each participant first completed 12 practice trials (2 per target location) and then started the formal experiment, which comprised 10 trials per target location across 2 blocks, for a total of 60 trials (see Figure 1C).

Predictive saccades task: Participants were asked to gaze at a red fixation circle (diameter: 0.5° of visual angle), presented for between 1000 and 1500 ms at the center of the screen on a gray background. In each trial, the blue target circle (diameter: 0.5° of visual angle) alternated between two fixed locations (10° of visual angle either side of the screen center) on the horizontal plane for a total of 12 target steps. The target alternated in a square-wave pattern at one of five constant rates (0.66, 0.8, 1, 1.33, and 2 Hz), corresponding to interstimulus intervals (ISIs) of 1500, 1250, 1000, 750, and 500 ms, respectively. Each participant first completed 2 practice trials at two frequencies (0.66 and 2 Hz) and then started the formal experiment, which comprised 5 trials per frequency across 5 blocks, for a total of 25 trials (see Figure 1D).
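The relation between the target rates and the ISIs listed above is a simple reciprocal, which the short sketch below makes explicit. It assumes that "rate" counts target steps per second (as the listed ISIs imply) and that a trial lasts exactly 12 ISIs; the second point is an assumption, since the text does not give trial durations.

```python
ISIS_MS = [1500, 1250, 1000, 750, 500]   # interstimulus intervals from the text
STEPS_PER_TRIAL = 12                     # target steps per trial from the text


def rate_hz(isi_ms):
    """Target steps per second for a given interstimulus interval."""
    return 1000.0 / isi_ms


def trial_duration_s(isi_ms):
    """Trial length in seconds, assuming one ISI per target step."""
    return STEPS_PER_TRIAL * isi_ms / 1000.0


if __name__ == "__main__":
    for isi in ISIS_MS:
        print(f"ISI {isi} ms -> {rate_hz(isi):.3f} Hz, {trial_duration_s(isi):.0f} s per trial")
```

Computed this way, the slowest and second-fastest rates come out as 2/3 Hz and 4/3 Hz, so the 0.66 and 1.33 Hz values in the text appear to be truncated.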

2.3.2. Tests of Cognitive Capabilities

Participants’ cognitive capabilities were measured in two domains: inhibitory control and working memory.

Stroop Task: Participants first saw a cross fixation point. Subsequently, participants were presented with a color–word combination and were required to quickly and accurately judge the color of the word and press the corresponding color key. The task included three words (red, green, blue) and three presentation colors (red, green, blue). The practice phase consisted of 9 trials, with each stimulus level (word–color combination) presented once; the formal experiment consisted of 90 trials, with each stimulus level presented 10 times in random order. During practice, feedback on the correctness of each judgment was provided to help participants understand the task; no feedback was given in the formal experiment. Six measures were derived from the Stroop task: overall accuracy, overall reaction time, accuracy and reaction time under the word–color consistent condition, and accuracy and reaction time under the word–color inconsistent condition.
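The trial structure and the six Stroop measures described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the record fields (`congruent`, `correct`, `rt_ms`) are hypothetical names, and reaction times are averaged over all trials because the text does not say whether error trials were excluded.

```python
import itertools
import random

WORDS = ["red", "green", "blue"]
INK_COLORS = ["red", "green", "blue"]
REPEATS = 10  # each word-color combination appears 10 times


def build_stroop_trials(seed=None):
    """Return the 90 formal trials (9 combinations x 10 repeats) in random order."""
    levels = itertools.product(WORDS, INK_COLORS)
    trials = [
        {"word": w, "ink": c, "congruent": w == c}
        for w, c in levels
        for _ in range(REPEATS)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def stroop_features(records):
    """Accuracy and mean RT overall and per condition (six values)."""
    def summarize(rows):
        accuracy = sum(r["correct"] for r in rows) / len(rows)
        mean_rt = sum(r["rt_ms"] for r in rows) / len(rows)
        return accuracy, mean_rt

    records = list(records)
    groups = {
        "overall": records,
        "consistent": [r for r in records if r["congruent"]],
        "inconsistent": [r for r in records if not r["congruent"]],
    }
    features = {}
    for name, rows in groups.items():
        features[name + "_accuracy"], features[name + "_rt_ms"] = summarize(rows)
    return features
```

With three words and three ink colors, a fully crossed design contains 30 consistent and 60 inconsistent trials, so the two conditions are not balanced in trial count.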

Working Memory Tasks: Participants performed forward/backward visuospatial tasks and forward/backward digit span tasks.

In the forward/backward visuospatial task, participants saw a 16-cell grid on the screen. During each trial, a fixed number of apples appeared in the grid one after another at random locations. Participants needed to remember the order and location of each apple’s appearance. After the items were presented, a blank background appeared for 2000 ms. Then, a blank 16-cell grid was presented, and participants needed to light the apples at the corresponding locations sequentially on the computer, either in order (forward) or in reverse order (backward). The task’s difficulty increased with the number of items to be recalled. Each participant first completed 3 practice trials with 2 items and then started the formal experiment, which began with 3 items. If the items were not recalled correctly, “error” feedback was given; after 3 consecutive incorrect answers, the test was automatically terminated. The task was scored by awarding one point for each correct trial.

In the forward/backward digit span tasks, participants heard a series of numbers and needed to memorize the order in which the numbers appeared. After the numbers were presented, a blank background appeared for 2000 ms. Then, participants needed to recall the numbers they heard and enter them in the correct sequence in a blank input box on the computer, either in order (forward) or in reverse order (backward). The task’s difficulty increased with the length of numbers to be recalled. Each participant first finished 3 practice trials for 2 numbers and then started the formal experiment, which began with 3 numbers. If they were not answered correctly, “error” feedback was given; if the answers were incorrect for 3 consecutive times, the test was automatically terminated. The task was scored by awarding one point for each correct trial.
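The scoring and termination rule shared by the visuospatial and digit span tasks can be expressed compactly. The sketch below covers only what the text states (one point per correct trial, stop after three consecutive errors); how sequence length advances between trials is not specified, so it is not modeled, and the function name is illustrative.

```python
def score_span_task(outcomes, max_consecutive_errors=3):
    """Score a span task from per-trial outcomes (True = correct).

    One point is awarded per correct trial; the test stops as soon as
    `max_consecutive_errors` incorrect trials occur in a row.
    """
    score = 0
    consecutive_errors = 0
    for correct in outcomes:
        if correct:
            score += 1
            consecutive_errors = 0
        else:
            consecutive_errors += 1
            if consecutive_errors >= max_consecutive_errors:
                break
    return score
```

For example, the outcome sequence correct, correct, error, correct, error, error, error yields a score of 3, and any trials after the third consecutive error are never administered.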

In consideration of the potential impact of testing duration and fatigue on the results in elderly patients, we implemented several measures to ensure accurate and reliable data collection. During tasks such as the smooth pursuit task, rest periods were incorporated to prevent fatigue, and the task would only commence after the participant had fixated on the target for a certain duration. Rest periods were also arranged between different blocks of the task, with the duration adjusted according to the participant’s condition and well-being. Both behavioral and eye movement tests included longer rest periods to ensure participants were not fatigued at the start of the tests. The overall duration of the tests varied by individual, with eye movement tests taking approximately 30 min and behavioral tests around 20 min. Additionally, the testing time was individualized for each participant to ensure their comfort and to minimize the impact of fatigue on the results.

2.4. CNN Analysis

We used the PyTorch package (version: 2.2.0) in Python 3 to build Convolutional Neural Networks (CNNs) for classification. Eye-tracking and behavioral data were not provided to the CNN in video format; instead, scalar features were extracted using MATLAB from the four eye-movement tasks (pro- and anti-saccades task, smooth pursuit task, memory-guided saccades task, and predictive saccades task) and the two cognitive tasks (Stroop task and working memory tasks). The input data comprised two sets, eye-movement-related data (31 features) and behavioral data (8 features). We concatenated the eye-tracking features and behavioral features into a single feature vector, so that each participant had 39 features that were fed into the model. The features from specific tasks and measurements are listed in Table 1. For details on definitions of input features, please see Table S1.

The model consists of two one-dimensional convolutional layers and two one-dimensional pooling layers stacked alternately, followed by three fully connected layers for the output. The kernel size of each convolutional layer is 2, with a stride of 1. The number of output channels for the first convolutional layer is 16, while the number of output channels for the second convolutional layer is 32. This is carried out to enhance the model’s feature representation capability and to capture higher-level abstract features from the data. To ensure that the input and output features have the same dimensions during each convolution operation, padding was applied. We choose max pooling to reduce computational load. After the convolutional layers and pooling layers, a flattening layer was added to convert the multi-dimensional feature maps into a one-dimensional vector, making it suitable for subsequent fully connected layers.

The first fully connected layer outputs 4096 features; the expanded dimensionality helps the network capture more complex data characteristics. The second layer outputs 256 features, condensing the rich feature information from the previous layer into more representative features. The third fully connected layer is the final output layer, with the number of output features corresponding to the number of categories (2 for binary classification, 3 for triple classification), representing the final classification result. We incorporated Dropout layers in the network, a regularization technique that reduces overfitting by randomly dropping neurons during training. Given the relatively small sample size and data volume of the current study, larger kernels and more convolutional layers did not effectively improve classification accuracy, so we adjusted the number of convolutional layers, kernel size, stride, number of hidden layers, and batch size to suit the data size of the current study.
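To make the layer dimensions concrete, the sketch below traces the sequence length through the two convolution and pooling stages and counts trainable parameters. Several details are assumptions not given in the text: a single input channel, length-preserving ("same") convolution padding as described, max pooling with kernel size 2 and stride 2 without padding, and no parameters beyond weights and biases. It uses plain arithmetic rather than PyTorch so that it runs without dependencies.

```python
def pool_length(length, kernel=2):
    """Output length of unpadded pooling with stride equal to the kernel size."""
    return (length - kernel) // kernel + 1


def flattened_size(n_features=39, channels=(16, 32), pool_kernel=2):
    """Number of values entering the first fully connected layer."""
    length = n_features
    for _ in channels:
        # 'Same'-padded convolution with stride 1 keeps the length unchanged,
        # so only the pooling step shortens the sequence.
        length = pool_length(length, pool_kernel)
    return channels[-1] * length


def parameter_count(n_features=39, n_classes=3, channels=(16, 32),
                    conv_kernel=2, hidden=(4096, 256)):
    """Weights plus biases of the conv and fully connected layers."""
    total = 0
    in_channels = 1  # assumed: the feature vector is a single-channel sequence
    for out_channels in channels:
        total += out_channels * in_channels * conv_kernel + out_channels
        in_channels = out_channels
    in_dim = flattened_size(n_features, channels)
    for out_dim in tuple(hidden) + (n_classes,):
        total += in_dim * out_dim + out_dim
        in_dim = out_dim
    return total


if __name__ == "__main__":
    for name, n in (("combined", 39), ("eye movement", 31), ("behavioral", 8)):
        print(f"{name}: {n} features -> flattened size {flattened_size(n)}")
    print("parameters (39 features, 3 classes):", parameter_count())
```

Under these assumptions the 39-feature input shrinks from length 39 to 19 to 9, giving 32 × 9 = 288 values before the first fully connected layer, and the network has a little over two million parameters, almost all of them in the first two fully connected layers.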

Without activation functions, the relationship between a CNN’s outputs and inputs remains linear. To enable the network to express nonlinearities, activation functions are added between layers, which also enhances model stability. In this paper, the ReLU function (f(x) = max(0, x)) was used as the activation function. Compared to other activation functions such as Sigmoid, the ReLU function effectively mitigates the vanishing gradient problem and sets the output of some neurons to zero during computation, which helps alleviate overfitting in the small-sample dataset used in this paper.

A CNN is trained by computing a loss function and iteratively updating the model weights to reduce that loss. For the loss function, we adopted Cross Entropy Loss, which can be used in both binary and multiclass tasks. It calculates the cross-entropy between the model output and the true labels and serves as the objective function for model optimization. The smaller the cross-entropy, the better the model’s predictions match the true labels.
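For readers unfamiliar with the loss, the following dependency-free sketch computes softmax cross-entropy for a single sample from raw network outputs (logits). It mirrors the standard definition, which PyTorch's `CrossEntropyLoss` also applies to raw logits and averages over a batch by default; the example logits are made up for illustration.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(batch_logits, targets):
    """Average loss over a batch of samples."""
    losses = [cross_entropy(z, t) for z, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical three-class outputs (e.g., young / elderly control / MCI).
    print(cross_entropy([2.0, 0.5, -1.0], target=0))  # confident and correct: small loss
    print(cross_entropy([2.0, 0.5, -1.0], target=2))  # confident and wrong: large loss
```

A network that outputs equal logits for all three classes incurs a loss of ln 3 ≈ 1.10 per sample, which is a useful reference point when reading training curves for the triple classification.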

We selected the Adam optimizer with an initial learning rate of 0.001. The first-order momentum parameter β1 and the second-order momentum parameter β2 were set to their default values (β1 = 0.9, β2 = 0.999). Adam is an adaptive optimization algorithm that adjusts the learning rate based on historical gradient information, allowing the model to converge quickly in the early stages of training and to approach the minimum of the loss function faster in the later stages. In addition, its momentum terms can help the optimization avoid becoming trapped in local minima. To further optimize the model in the later stages of training, we used the StepLR learning rate scheduler, which reduces the learning rate by 10% after every 10 epochs, helping to prevent oscillations in the later stages of training that can be caused by large parameter updates.
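The step schedule described above has a simple closed form, sketched here without PyTorch. The decay factor is an assumption: "reduces the learning rate by 10%" is read literally as multiplying by 0.9 every 10 epochs; if the authors instead meant reducing it to 10%, the factor would be 0.1.

```python
INITIAL_LR = 0.001   # initial learning rate from the text
STEP_SIZE = 10       # epochs between decays, from the text
GAMMA = 0.9          # assumed decay factor (a 10% reduction per step)


def step_lr(epoch, initial_lr=INITIAL_LR, step_size=STEP_SIZE, gamma=GAMMA):
    """Learning rate at a given epoch under a step decay schedule."""
    return initial_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    for epoch in (0, 9, 10, 20, 50):
        print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.6f}")
```

With a factor of 0.9 the decay is gentle: after 50 epochs the learning rate is still about 59% of its initial value.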

For the binary and triple classification tasks, the ratio of the training set to the test set was set to 8:2. Each classification task was run 10 times independently, and each run output the classification accuracy, the loss, the IDs of misclassified participants, and the labels involved in each misclassification. While maintaining the same ratio, a different division into training and test sets was used in each run to assess model generalization. The mean and variance of the classification accuracy over the 10 runs were then calculated as the final results.
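The evaluation protocol above amounts to repeated random hold-out validation, which can be sketched as follows. The `evaluate` callback is a hypothetical stand-in for training the CNN on the training IDs and returning its test accuracy. Whether the splits were stratified by group, how the test-set size was rounded, and whether the reported variance is the population or sample variance are not stated, so the choices below (unstratified, rounded up, population variance) are assumptions.

```python
import math
import random
import statistics


def n_test_for(n_total, test_fraction=0.2):
    """Test-set size for a given split ratio, rounding up (assumed)."""
    return math.ceil(n_total * test_fraction)


def repeated_holdout(ids, evaluate, n_repeats=10, test_fraction=0.2, seed=0):
    """Mean and variance of accuracy over independent random 8:2 splits.

    `evaluate(train_ids, test_ids)` must return an accuracy in [0, 1].
    """
    rng = random.Random(seed)
    n_test = n_test_for(len(ids), test_fraction)
    accuracies = []
    for _ in range(n_repeats):
        shuffled = list(ids)
        rng.shuffle(shuffled)  # a different division on every repeat
        test_ids, train_ids = shuffled[:n_test], shuffled[n_test:]
        accuracies.append(evaluate(train_ids, test_ids))
    return statistics.mean(accuracies), statistics.pvariance(accuracies)
```

Under the rounding assumption, the triple classification (128 participants) would use 26 test participants per run and the binary classification (97 participants) would use 20, so each run's accuracy moves in steps of roughly 4 to 5 percentage points.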

3. Results

Eye movement data and behavioral data were collected from a total of 147 participants. After excluding participants who did not complete all tasks and those with recording issues, 128 valid participants remained, comprising 40 MCI patients (range: 60–87 years, mean: 72.60 years, SD: 6.22 years, 30 women), 57 healthy elderly adults (range: 56–87 years, mean: 69.39 years, SD: 5.10 years, 34 women), and 31 young adults (range: 17–27 years, mean: 21.71 years, SD: 2.60 years, 20 women).

3.1. Differences in Eye Movement and Behavioral Features Between Groups

To further investigate the characteristics of eye movement performance across groups, we extracted key features from a total of 31 eye movement parameters. Initially, we excluded ten indicators that did not show significant differences between young and elderly adults. From the remaining 21 features, we selected 11 that effectively reflect the overall characteristics of the task. These 11 features demonstrated significant differences between the young and elderly groups. These features reflect the accuracy, reaction speed, and stability in eye movement tasks (see Figure 2 and Table S2).

The analysis showed that the young group performed significantly better than the elderly group (including the control and MCI groups) on all 11 features, |ts| > 2.06, ps < 0.042, indicating that young participants outperformed the elderly in accuracy, reaction speed, and stability. Within the elderly group, the control group did not differ significantly from the MCI group (|ts (95)| < 1.49, ps > 0.14). The distributions of the eye movement variables in the four tasks illustrate the differences among the three groups (see Figure 2 and Figure S1). In the working memory tasks, the young group also performed best, followed by the elderly control group, with the MCI group performing worst. The young group performed significantly better than the elderly group (including the control and MCI groups), |ts (75)| > 11.66, ps < 0.001. The young group also performed significantly better than the elderly group on the response time but not the accuracy of the Stroop test. Within the elderly group, the MCI group performed significantly worse than the control group on the backward digit span task (t (94.25) = 3.42, p = 0.001), with no significant differences observed in the other cognitive tasks (|ts| < 1.97, ps > 0.052) (see Table S2 and Figure S2).

3.2. Relationship Between Eye Movement Variables and Behavioral Features

Pearson correlation analysis was conducted to investigate the relationships between the eye movement features and the behavioral features. The accuracy and variance features from the eye movement tasks were positively correlated with performance on the four working memory tasks and significantly negatively correlated with the response time of the Stroop test for inhibitory control function, with the exception of horizontal variance at 0.25 Hz and 0.4 Hz in the smooth pursuit task (ps > 0.05). For instance, pro-saccade accuracy was positively correlated with the forward visuospatial task (r = 0.45, p < 0.01) and negatively correlated with the consistent-condition response time of the Stroop test (r = −0.39, p < 0.01). Conversely, the latency, gain, and delay features in the eye movement tasks were negatively correlated with the four working memory tasks but positively correlated with the response time of the Stroop test. The accuracy of the Stroop test was not significantly correlated with the eye movement features (ps > 0.05). The fixation dispersion in the pro- and anti-saccades task and the smooth pursuit task was significantly correlated with some working memory tasks. For details of the correlation analysis results, please see Table S3 in the Supplementary Materials.
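As a reminder of what the reported r values measure, the following is a minimal implementation of the Pearson correlation coefficient. The sample data in the usage lines are invented for illustration and are not drawn from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical values: saccade accuracy vs. a working memory score.
    accuracy = [0.55, 0.70, 0.80, 0.90, 0.95]
    span_score = [3, 4, 4, 6, 7]
    print(round(pearson_r(accuracy, span_score), 2))
```

A positive r, as reported between pro-saccade accuracy and the forward visuospatial task, means participants with higher values on one measure tend to have higher values on the other; it says nothing about which measure drives the other.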

3.3. Power of Discriminating MCI from Healthy Individuals Including Young Adults

In the current study, seven feature sets (i.e., eye movement features only, behavioral features only, combined features, combined features without the pro- and anti-saccades task, combined features without the smooth pursuit task, combined features without the memory-guided saccades task, and combined features without the predictive saccades task) were used to train a CNN for MCI detection (see Table 2).

The model trained with the combined features achieved an average discrimination accuracy of 74.62%, with the CNN used both for feature extraction and as the classifier. The model based on behavioral features only (accuracy: 72.31%) outperformed the model based on eye movement features only (accuracy: 69.62%), which yielded the lowest accuracy of these three models. Notably, the accuracy of the combined model dropped to 68.85% when the pro- and anti-saccades features were excluded, and it increased to 78.48% when the predictive saccade features were excluded. The other two models that incorporated eye movement data achieved accuracies of 70% and above.

Compared with the single-feature-set models, the combined features model identified MCI participants best, with a hit rate of 60.77%, and it was also the best at identifying elderly control and young individuals, with hit rates of 73.08% and 100%, respectively. The next best model, which used only behavioral features, had hit rates of 67.84% for elderly control individuals and 57.50% for MCI participants. The eye movement features-only model identified the elderly control individuals (hit rate: 66.60%) better than the MCI participants, for whom it was almost at the random chance level (hit rate: 51.63%). All models could effectively identify young participants, with the lowest hit rate being 94.90%. Removing the pro- and anti-saccades features reduced the model’s hit rate for MCI from 60.77% to 51.50%, indicating the critical role of these features in identifying individuals with MCI. However, when the smooth pursuit, memory-guided saccade, or predictive saccade features were excluded, the hit rate for correctly identifying MCI individuals remained above 58% in every case, and removing the predictive saccade features even increased it to 62.64%.
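The text reports both an overall accuracy and a per-group "hit rate" without defining the latter. The sketch below takes hit rate to mean the fraction of participants in a given true group that the model labels correctly (per-class recall); this reading is an assumption, and the labels in the example are invented.

```python
def hit_rates(true_labels, predicted_labels):
    """Per-group fraction of correctly classified participants."""
    totals, hits = {}, {}
    for truth, pred in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + (truth == pred)
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(true_labels, predicted_labels):
    """Overall fraction of correctly classified participants."""
    pairs = list(zip(true_labels, predicted_labels))
    return sum(t == p for t, p in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical test set of 10 participants.
    truth = ["young"] * 2 + ["control"] * 4 + ["MCI"] * 4
    pred = ["young", "young",
            "control", "control", "control", "MCI",
            "MCI", "MCI", "control", "control"]
    print(accuracy(truth, pred))
    print(hit_rates(truth, pred))
```

Because the groups differ in size and difficulty, a model can reach a respectable overall accuracy while its MCI hit rate stays near 50%, which is why the two numbers are reported separately.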

3.4. Power of Discriminating MCI from the Healthy Elderly Individuals

For the binary classification, we used CNN analysis to discriminate MCI participants from healthy elderly individuals. As with the triple classification, the model utilizing combined features demonstrated the highest accuracy (66.50% for MCI detection). However, unlike the findings from the triple classification task, the model based solely on eye movement features outperformed the behavioral features-only model, with accuracies of 61% and 58.50%, respectively (see Table 3). The model using only eye movement features also achieved a better result (AUC: 0.61) than the behavioral features-only model (AUC: 0.51). Moreover, the combined features model achieved the best AUC at 0.64 (see Figure 3). The transition from triple to binary classification did not significantly disrupt the stability of the models, though the accuracy of the behavioral features-only model decreased markedly in the absence of young adults from the dataset. Notably, all four of the models that incorporated eye movement data achieved accuracies of 61.00% and above.

Among those models, the combined features model was relatively effective in distinguishing MCI participants from the normal population with similar age and gender characteristics, achieving a hit rate of 59.76% for MCI patients and 70.02% for healthy elderly individuals. The model that used only behavioral features yielded a hit rate of 52.79% for correctly identifying MCI patients, while the eye movement features-only model had the lowest hit rate at 49.31%. Excluding the predictive saccade task from the combined model yielded an MCI hit rate of 59.88%, almost the same as that of the full combined features model. The other ablated models incorporating eye movement data achieved hit rates between 52.77% and 56.97% in identifying MCI patients.
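The AUC values reported for the binary task can be read as the probability that a randomly chosen MCI participant receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below implements this pairwise definition directly; the function name `roc_auc` and the example scores are hypothetical, and the code is not the authors' evaluation pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class (here assumed to be MCI), 0 for the negative class.
    scores: model outputs where larger values mean "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented scores: one MCI case (0.4) is ranked below one control (0.6).
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect separation of the two groups.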

3.5. Comparison of CNN with Other Models

To further validate the advantages of the CNN in distinguishing MCI, we compared its classification performance with that of a Support Vector Machine (SVM) and a Random Forest. The SVM showed a weak ability to distinguish MCI patients from the elderly control group, with an average accuracy of 53.00%. Its accuracy for identifying MCI was only 51.15%, which is not much different from random chance (50%). However, the SVM was able to distinguish MCI patients from young individuals, with an accuracy of 99.44%. The classification performance of the Random Forest was similar to that of the SVM. In the binary classification task, the Random Forest’s average accuracy was 53.00%, and its accuracy for identifying MCI (47.80%) was below the chance level. In the triple classification task, the Random Forest’s average accuracy was 63.46%, with an accuracy of 100% for identifying young individuals but only 43.6% for identifying MCI patients. Both the SVM and the Random Forest performed considerably worse than the CNN, which achieved an average accuracy of 66.50% in binary classification and 74.62% in triple classification.

To explore the importance of the CNN convolutional operation in distinguishing MCI patients, we compared the classification performance of CNN and FCNN (Fully Connected Neural Network). In the binary classification task, the average accuracy of FCNN was 57.50%, with an MCI classification accuracy of 51.94%. In the triple classification task, the average accuracy of FCNN was 67.69%, with a classification accuracy of 100% for young individuals and 47.09% for MCI patients.

4. Discussion

Our study applies a multimodal analysis combining eye movement and behavioral features to identify MCI. By extracting 31 eye movement and 8 behavioral features, we constructed a CNN-based machine learning model to compare identification performance among single behavioral features, single eye movement features, and their fusion. Our findings indicate that the fusion model effectively distinguishes MCI patients, particularly from healthy individuals, underscoring its clinical value for early MCI diagnosis. This approach provides a valuable reference for AD prevention and complements prior research focused predominantly on diagnosing AD, contributing to a more comprehensive understanding of cognitive decline disorders.

Our results align with previous cognitive impairment research [36,37,38,39], confirming that MCI patients exhibit deficits in memory, particularly in spatial and digit recall, as well as poorer working memory performance compared to healthy elderly individuals, evidenced by lower backward digit span scores. Eye movement features, recognized as biomarkers for early AD stages and cognitive impairment assessment, have shown high accuracy in eye-tracking-based cognitive tasks [40,41]. However, our study reveals that while eye movement features can distinguish between young adults and older adults, they are less effective in differentiating MCI patients from elderly controls, highlighting diagnostic challenges specific to MCI.

Prior studies have focused mainly on MCI progression to severe cognitive impairments like AD, with visual attention models achieving high dementia prediction accuracy [42]. Eye-tracking data for cognitive impairments have demonstrated a strong link between eye movement and cognitive functions [43]. For example, dual-task eye-tracking and gait analysis achieved an AUC of 0.866–0.893 in distinguishing MCI from dementia [21]. Conversely, identifying MCI from the general population remains challenging. In our study, using only eye movement features yielded an accuracy of 61% and an AUC of 0.61; adding behavioral features improved the accuracy and AUC to 66.5% and 0.64, respectively. Studies combining gait and eye movement tasks reported similar AUCs (0.713–0.742) for distinguishing MCI from cognitively normal (CN) individuals [21]. Despite a trend toward improved accuracy and specificity with added features, overall model performance remained modest. In addition to the findings presented, our study suggests that the eye-tracking data captured in this research are primarily reflective of cognitive inhibition, a key component of executive function under high cognitive load. While saccadic movements have been hypothesized to relate to working memory through mechanisms that are still not fully understood, our results indicate that these movements are more closely associated with cognitive inhibition. This finding is consistent with studies showing the importance of cognitive inhibition in maintaining focus and filtering out irrelevant information, particularly in tasks requiring high cognitive load. Future research should explore the potential links between saccadic movements, cognitive inhibition, and direct cognitive load to further elucidate these mechanisms.

Our findings reveal that classification using only eye movement features performed poorly in distinguishing MCI patients from healthy elderly individuals, with an accuracy of 61.0%, although this was higher than the accuracy obtained using behavioral features only (58.5%). Conversely, the behavioral features-only model achieved a higher accuracy (72.31%) than the eye movement features-only model (69.62%) in the triple classification, which also involved distinguishing the young group from the older groups (MCI and elderly controls). This suggests that traditional assessments are more effective in differentiating between age groups due to their comprehensive evaluation of age-related cognitive declines [44]. For example, motor limitations among older adults may impede their performance in behavioral tasks, and the diverse educational backgrounds of the elderly are frequently neglected in conventional assessments, which may introduce bias [45,46]. Meanwhile, the sensitivity of eye movement metrics to subtle cognitive changes, especially in MCI patients, is supported by evidence showing that tasks like the anti-saccades paradigm can effectively differentiate MCI patients from healthy controls [13]. Consequently, eye movement tasks are more effective in identifying cognitive impairments associated with MCI, making them a valuable tool in the assessment of cognitive health in the elderly [13,47]. We observed that combining behavioral features with the full set of eye movement metrics, including pro- and anti-saccades, smooth pursuit, memory-guided saccades, and predictive saccades, achieved the highest accuracy of 66.5% in distinguishing individuals with MCI from healthy controls, indicating that a single metric may not fully capture an individual’s cognitive state, because different metrics reflect different aspects of cognitive function. Compared to traditional self-reporting methods, eye-tracking technology reduces subjective bias, enhancing the objectivity of detection. Furthermore, the application of machine learning algorithms enables the automatic identification of patterns related to MCI from large datasets. This offers a more convenient, objective, and automated means for cognitive assessment, facilitating preliminary auxiliary identification and evaluation of cognitive function.

In this study, by comparing the performance of the CNN model with other models (such as FCNN) in classification tasks, we find that the CNN model has a significant advantage in distinguishing MCI patients. Specifically, the classification accuracy of the CNN model (with an average accuracy of 66.50% in binary classification and 74.62% in triple classification) is significantly higher than that of the FCNN model (with an average accuracy of 57.50% in binary classification and 67.69% in triple classification). This suggests that the convolutional operation plays an important role in distinguishing MCI from the elderly control group. Although CNN is typically used for processing high-dimensional data (such as images or videos), the results of this study show that even when the input data consist of scalar features, the discriminatory power of CNN remains significantly stronger than that of other models. This phenomenon may be attributed to the fact that the convolutional layers of CNN can extract high-dimensional features from the data, and these features, compared to individual eye-tracking features (such as fixation dispersion), are more indicative of the participants’ cognitive states. Specifically, the convolutional operation performs local conjunctions of features in the input data through sliding convolution kernels, thus generating higher-dimensional features [48]. For example, in the smooth pursuit task, fixation dispersion and saccade compensation, when considered separately, may not provide strong discriminatory power, but their combination may be crucial for assessing cognitive state. The convolutional operation can learn the local patterns among these features, thereby extracting feature combinations that are more effective in distinguishing MCI patients.
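The idea of a convolution kernel forming local conjunctions of scalar features can be sketched in a few lines. The example below slides a width-2 kernel over a short feature vector and applies a ReLU; the feature values, kernel weights, and bias are invented for illustration, and the code is a simplified stand-in, not the network architecture used in this study.

```python
def conv1d_valid(x, kernel, bias=0.0):
    """Slide `kernel` over `x` (no padding, stride 1) and return the weighted sums."""
    k = len(kernel)
    return [
        sum(w * x[i + j] for j, w in enumerate(kernel)) + bias
        for i in range(len(x) - k + 1)
    ]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


# Hypothetical standardized features placed next to each other in the input vector,
# e.g. [fixation dispersion, saccade compensation, mean variance, mean delay].
FEATURES = [2.0, 1.0, 0.0, 1.0]
# Hypothetical kernel: responds only when two adjacent features are jointly high.
KERNEL = [1.0, 1.0]
BIAS = -2.0

if __name__ == "__main__":
    print(relu(conv1d_valid(FEATURES, KERNEL, BIAS)))
```

Because each output unit only combines neighbouring inputs, the conjunctions such a layer can form depend on how the features are ordered in the input vector.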

Previous research has highlighted the potential of microsaccades—fine involuntary eye movements—as a valuable metric for assessing cognitive function. Microsaccades are closely linked to attention and cognitive processing [49] and have been shown to be highly sensitive to cognitive load in tasks requiring rapid shifts of attention [50]. Additionally, they have been identified as potential indicators of neurodegenerative changes in conditions such as Alzheimer’s disease and MCI [51]. Incorporating microsaccade metrics into our current framework of eye-tracking features may improve the accuracy of early detection and provide deeper insights into the underlying mechanisms of cognitive decline. Future studies should consider the integration of microsaccade metrics to further explore their potential in enhancing diagnostic models for MCI.

Behavioral features were included to improve assessment accuracy, in recognition of the complexity of cognitive impairments. The modest accuracy observed in our study may be due to the small sample size, a common limitation in precision medicine. Nevertheless, our findings demonstrate the value of multi-dimensional parameters for early MCI detection. The heterogeneity of MCI may also explain the limited model accuracy, suggesting that a single approach may not be effective. For example, amnestic MCI (aMCI) patients exhibit eye movement abnormalities in memory-related tasks that are not typically observed in non-amnestic MCI (naMCI) patients, reflecting the distinct memory impairments characteristic of each subtype. Considering MCI subtypes has been shown to enhance diagnostic accuracy. As an auxiliary identification tool aiming to capture this complexity, our model incorporated a variety of eye movement tasks and behavioral measures to improve sensitivity. Future research on specific eye movement biomarkers and neuropsychological criteria that differentiate MCI subtypes could further aid in predicting dementia progression. A deeper understanding of MCI subtypes could facilitate targeted prevention strategies and support more effective testing of future therapeutic interventions [52,53,54,55].

5. Conclusions

Overall, this study validates the effectiveness of integrating eye movement and behavioral features for multimodal MCI auxiliary identification. Compared to previous studies, this research further demonstrates the advantages of multimodal features in capturing the complex cognitive impairment patterns in MCI, especially providing a more comprehensive perspective for MCI auxiliary identification. The support of eye-tracking technology also opens the possibility of examining the underlying neuronal connections, surpassing the limitations of traditional self-reported methodologies [56]. However, this study has its limitations. First, the sample size was relatively small, particularly for MCI patients, which may affect the model’s stability and generalizability. Second, some eye movement tasks may not fully capture the complexity of MCI, particularly in the predictive saccade task, where the effectiveness of MCI subtypes’ identification is limited. Future research could improve in the following areas: (1) expanding the sample size, particularly through multi-center data collection, to enhance the model’s generalizability and robustness; (2) exploring the integration of more physiological and behavioral signals, such as EEG and skin conductance response, to further improve the accuracy of early MCI identification; and (3) designing personalized cognitive tasks for different MCI subtypes to enhance the model’s sensitivity to subtype differences, to provide more practical auxiliary identification tools for clinical diagnosis and intervention.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines13030738/s1, Table S1: Definitions of Input Features; Table S2: Comparison of elderly and young adults; Table S3: Correlation between eye movement variables and behavioral features; Figure S1: Differences in eye movement features between groups; Figure S2: Differences in behavioral features between groups.

Author Contributions

Conceptualization, N.L., K.J. and Z.C.; methodology, Z.W.; software, Z.W.; validation, K.J., Y.Z. and S.L.; formal analysis, N.L. and Z.W.; investigation, Y.Z.; resources, K.J. and H.Z.; data curation, N.L. and Z.C.; writing—original draft preparation, N.L. and W.R.; writing—review and editing, N.L., Z.C., W.R. and S.L.; visualization, N.L. and Z.W.; supervision, Z.C.; project administration, N.L. and Z.C.; funding acquisition, N.L., K.J., H.Z. and Z.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Municipal Health Commission project (Grant No. 202040500), the High-quality Balance Project of Changning District Health Commission, the Postdoctoral Foundation of Shanghai Changning Mental Health Center (Grant No. SCMHC-PF001), the Shanghai Municipal Health Commission Health Industry Clinical Research Special Project (Grant No. 20234Y0011), the Medical Master’s and Doctoral Innovation Talent Base Project of Changning District (RCJD2022S07), and the Mental Health Project of Changning Mental Health Center Affiliated with East China Normal University in 2023.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Committee on Human Research Protection of the Shanghai Changning Mental Health Center (approval number: M202302, approval date: 3 April 2023) and East China Normal University (approval number: HR2-0044-2023, approval date: 4 April 2023).

Informed Consent Statement

The patients/participants provided their written informed consent to participate in this study.

Data Availability Statement

The data supporting the conclusions of this article are available in an online repository (https://osf.io/sfcqj/ (accessed on 19 January 2025)).

Acknowledgments

We thank the participants in this study, and Wei Zexuan for data collection.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Tábuas-Pereira, M.; Baldeiras, I.; Duro, D.; Santiago, B.; Ribeiro, M.; Leitão, M.; Oliveira, C.; Santana, I. Prognosis of Early-Onset vs. Late-Onset Mild Cognitive Impairment: Comparison of Conversion Rates and Its Predictors. Geriatrics 2016, 1, 11.
2. Adarsh, V.; Gangadharan, G.R.; Fiore, U.; Zanetti, P. Multimodal Classification of Alzheimer’s Disease and Mild Cognitive Impairment Using Custom MKSCDDL Kernel over CNN with Transparent Decision-Making for Explainable Diagnosis. Sci. Rep. 2024, 14, 1774.
3. Jia, L.; Du, Y.; Chu, L.; Zhang, Z.; Li, F.; Lyu, D.; Li, Y.; Li, Y.; Zhu, M.; Jiao, H.; et al. Prevalence, Risk Factors, and Management of Dementia and Mild Cognitive Impairment in Adults Aged 60 Years or Older in China: A Cross-Sectional Study. Lancet Public Health 2020, 5, e661–e671.
4. Chen, P.; Cai, H.; Bai, W.; Su, Z.; Tang, Y.-L.; Ungvari, G.S.; Ng, C.H.; Zhang, Q.; Xiang, Y.-T. Global Prevalence of Mild Cognitive Impairment among Older Adults Living in Nursing Homes: A Meta-Analysis and Systematic Review of Epidemiological Surveys. Transl. Psychiatry 2023, 13, 88.
5. Huo, Z.; Lin, J.; Bat, B.K.K.; Chan, J.Y.C.; Tsoi, K.K.F.; Yip, B.H.K. Diagnostic Accuracy of Dementia Screening Tools in the Chinese Population: A Systematic Review and Meta-Analysis of 167 Diagnostic Studies. Age Ageing 2021, 50, 1093–1101.
6. Yu, J.; Li, J.; Huang, X. The Beijing Version of the Montreal Cognitive Assessment as a Brief Screening Tool for Mild Cognitive Impairment: A Community-Based Study. BMC Psychiatry 2012, 12, 156.
7. Rossetti, H.C.; Lacritz, L.H.; Cullum, C.M.; Weiner, M.F. Normative data for the Montreal Cognitive Assessment (MoCA) in a population-based sample. Neurology 2011, 77, 1272–1275.
8. Maher, C.; Calia, C. The effect of illiteracy on performance in screening tools for dementia: A meta-analysis. J. Clin. Exp. Neuropsychol. 2021, 43, 945–966.
9. Jia, X.; Wang, Z.; Huang, F.; Su, C.; Du, W.; Jiang, H.; Wang, H.; Wang, J.; Wang, F.; Su, W.; et al. A Comparison of the Mini-Mental State Examination (MMSE) with the Montreal Cognitive Assessment (MoCA) for Mild Cognitive Impairment Screening in Chinese Middle-Aged and Older Population: A Cross-Sectional Study. BMC Psychiatry 2021, 21, 485.
10. Hoops, S.; Nazem, S.; Siderowf, A.D.; Duda, J.E.; Xie, S.X.; Stern, M.B.; Weintraub, D. Validity of the MoCA and MMSE in the detection of MCI and dementia in Parkinson disease. Neurology 2018, 73, 1738–1745.
11. Wolf, A.; Tripanpitak, K.; Umeda, S.; Otake-Matsuura, M. Eye-Tracking Paradigms for the Assessment of Mild Cognitive Impairment: A Systematic Review. Front. Psychol. 2023, 14, 1197567.
12. Bueno, A.P.A.; Sato, J.R.; Hornberger, M. Eye Tracking–The Overlooked Method to Measure Cognition in Neurodegeneration? Neuropsychologia 2019, 133, 107191.
13. Opwonya, J.; Doan, D.N.T.; Kim, S.G.; Kim, J.I.; Ku, B.; Kim, S.; Park, S.; Kim, J.U. Saccadic Eye Movement in Mild Cognitive Impairment and Alzheimer’s Disease: A Systematic Review and Meta-Analysis. Neuropsychol. Rev. 2022, 32, 193–227.
14. Tadokoro, K.; Yamashita, T.; Fukui, Y.; Nomura, E.; Ohta, Y.; Ueno, S.; Nishina, S.; Tsunoda, K.; Wakutani, Y.; Takao, Y.; et al. Early Detection of Cognitive Decline in Mild Cognitive Impairment and Alzheimer’s Disease with a Novel Eye Tracking Test. J. Neurol. Sci. 2021, 427, 117529.
15. Haque, R.U.; Pongos, A.L.; Manzanares, C.M.; Lah, J.J.; Levey, A.I.; Clifford, G.D. Deep Convolutional Neural Networks and Transfer Learning for Measuring Cognitive Impairment Using Eye-Tracking in a Distributed Tablet-Based Environment. IEEE Trans. Biomed. Eng. 2021, 68, 11–18.
16. Liu, Z.; Yang, Z.; Gu, Y.; Liu, H.; Wang, P. The Effectiveness of Eye Tracking in the Diagnosis of Cognitive Disorders: A Systematic Review and Meta-Analysis. PLoS ONE 2021, 16, e0254059.
17. Alvi, A.M.; Siuly, S.; Wang, H.; Wang, K.; Whittaker, F. A Deep Learning Based Framework for Diagnosis of Mild Cognitive Impairment. Knowl.-Based Syst. 2022, 248, 108815.
18. Grueso, S.; Viejo-Sobera, R. Machine Learning Methods for Predicting Progression from Mild Cognitive Impairment to Alzheimer’s Disease Dementia: A Systematic Review. Alzheimer’s Res. Ther. 2021, 13, 162.
19. Oh, K.; Chung, Y.-C.; Kim, K.W.; Kim, W.-S.; Oh, I.-S. Classification and Visualization of Alzheimer’s Disease Using Volumetric Convolutional Neural Network and Transfer Learning. Sci. Rep. 2019, 9, 18150.
20. Zhang, D.; Liu, X.; Xu, L.; Li, Y.; Xu, Y.; Xia, M.; Qian, Z.; Tang, Y.; Liu, Z.; Chen, T.; et al. Effective Differentiation between Depressed Patients and Controls Using Discriminative Eye Movement Features. J. Affect. Disord. 2022, 307, 237–243.
21. Lin, J.; Xu, T.; Yang, X.; Yang, Q.; Zhu, Y.; Wan, M.; Xiao, X.; Zhang, S.; Ouyang, Z.; Fan, X.; et al. A Detection Model of Cognitive Impairment via the Integrated Gait and Eye Movement Analysis from a Large Chinese Community Cohort. Alzheimer’s Dement. 2024, 20, 1089–1101.
22. Tamaru, Y.; Matsushita, F.; Matsugi, A. Tests of Abnormal Gaze Behavior Increase the Accuracy of Mild Cognitive Impairment Assessments. Sci. Rep. 2024, 14, 19512.
23. Jia, H.; Yu, S.; Yin, S.; Liu, L.; Yi, C.; Xue, K.; Li, F.; Yao, D.; Xu, P.; Zhang, T. A Model Combining Multi Branch Spectral-Temporal CNN, Efficient Channel Attention, and LightGBM for MI-BCI Classification. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 1311–1320.
24. Petersen, R.C.; Smith, G.E.; Waring, S.C.; Ivnik, R.J.; Tangalos, E.G.; Kokmen, E. Mild Cognitive Impairment: Clinical Characterization and Outcome. Arch. Neurol. 1999, 56, 303.
25. Brainard, D.H. The Psychophysics Toolbox. Spat. Vis. 1997, 10, 433–436.
26. Kleiner, M.; Brainard, D.; Pelli, D.; Ingling, A.; Murray, R.; Broussard, C. What's new in psychtoolbox-3. Perception 2007, 36, 1–16.
27. Pelli, D.G. The VideoToolbox Software for Visual Psychophysics: Transforming Numbers into Movies. Spat. Vis. 1997, 10, 437–442.
28. Alichniewicz, K.K.; Brunner, F.; Klünemann, H.H.; Greenlee, M.W. Neural Correlates of Saccadic Inhibition in Healthy Elderly and Patients with Amnestic Mild Cognitive Impairment. Front. Psychol. 2013, 4, 467.
29. Karatekin, C. Improving Antisaccade Performance in Adolescents with Attention-Deficit/Hyperactivity Disorder (ADHD). Exp. Brain Res. 2006, 174, 324–341.
30. Shakespeare, T.J.; Kaski, D.; Yong, K.X.X.; Paterson, R.W.; Slattery, C.F.; Ryan, N.S.; Schott, J.M.; Crutch, S.J. Abnormalities of Fixation, Saccade and Pursuit in Posterior Cortical Atrophy. Brain 2015, 138, 1976–1991.
31. Bucci, M.P.; Goulème, N.; Dehouck, D.; Stordeur, C.; Acquaviva, E.; Septier, M.; Lefebvre, A.; Gerard, C.-L.; Peyre, H.; Delorme, R. Interactions between Eye Movements and Posture in Children with Neurodevelopmental Disorders. Int. J. Dev. Neurosci. Off. J. Int. Soc. Dev. Neurosci. 2018, 71, 61–67.
32. Mostofsky, S.H.; Lasker, A.G.; Cutting, L.E.; Denckla, M.B.; Zee, D.S. Oculomotor Abnormalities in Attention Deficit Hyperactivity Disorder: A Preliminary Study. Neurology 2001, 57, 423–430.
33. Mostofsky, S.H.; Lasker, A.G.; Singer, H.S.; Denckla, M.B.; Zee, D.S. Oculomotor Abnormalities in Boys With Tourette Syndrome With and Without ADHD. J. Am. Acad. Child Adolesc. Psychiatry 2001, 40, 1464–1472.
34. Calancie, O.G.; Brien, D.C.; Huang, J.; Coe, B.C.; Booij, L.; Khalid-Khan, S.; Munoz, D.P. Maturation of Temporal Saccade Prediction from Childhood to Adulthood: Predictive Saccades, Reduced Pupil Size, and Blink Synchronization. J. Neurosci. 2022, 42, 69–80.
35. Stark, L.; Vossius, G.; Young, L.R. Predictive Control of Eye Tracking Movements. IRE Trans. Hum. Factors Electron. 1962, HFE-3, 52–57.
36. Borkowska, A.; Drożdż, W.; Jurkowski, P.; Rybakowski, J.K. The Wisconsin Card Sorting Test and the N-Back Test in Mild Cognitive Impairment and Elderly Depression. World J. Biol. Psychiatry 2009, 10, 870–876.
37. Kessels, R.P.C.; Meulenbroek, O.; Fernández, G.; Olde Rikkert, M.G.M. Spatial Working Memory in Aging and Mild Cognitive Impairment: Effects of Task Load and Contextual Cueing. Aging Neuropsychol. Cogn. 2010, 17, 556–574.
38. Saunders, N.L.J.; Summers, M.J. Attention and Working Memory Deficits in Mild Cognitive Impairment. J. Clin. Exp. Neuropsychol. 2010, 32, 350–357.
39. Saunders, N.L.J.; Summers, M.J. Longitudinal Deficits to Attention, Executive, and Working Memory in Subtypes of Mild Cognitive Impairment. Neuropsychology 2011, 25, 237–248.
40. Crutcher, M.D.; Calhoun-Haney, R.; Manzanares, C.M.; Lah, J.J.; Levey, A.I.; Zola, S.M. Eye Tracking During a Visual Paired Comparison Task as a Predictor of Early Dementia. Am. J. Alzheimer’s Dis. Dementias® 2009, 24, 258–266.
41. Zola, S.M.; Manzanares, C.M.; Clopton, P.; Lah, J.J.; Levey, A.I. A Behavioral Task Predicts Conversion to Mild Cognitive Impairment and Alzheimer’s Disease. Am. J. Alzheimer’s Dis. Dementias® 2013, 28, 179–184.
42. Chaabouni, S.; Benois-pineau, J.; Tison, F.; Ben Amar, C.; Zemmari, A. Prediction of Visual Attention with Deep CNN on Artificially Degraded Videos for Studies of Attention of Patients with Dementia. Multimed. Tools Appl. 2017, 76, 22527–22546.
43. Beltrán, J.; García-Vázquez, M.S.; Benois-Pineau, J.; Gutierrez-Robledo, L.M.; Dartigues, J.-F. Computational Techniques for Eye Movements Analysis towards Supporting Early Diagnosis of Alzheimer’s Disease: A Review. Comput. Math. Methods Med. 2018, 2018, 2676409.
44. Spreng, R.N.; Wojtowicz, M.; Grady, C.L. Reliable Differences in Brain Activity between Young and Old Adults: A Quantitative Meta-Analysis across Multiple Cognitive Domains. Neurosci. Biobehav. Rev. 2010, 34, 1178–1194.
45. Mueller, Y.K.; Monod, S.; Locatelli, I.; Büla, C.; Cornuz, J.; Senn, N. Performance of a brief geriatric evaluation compared to a comprehensive geriatric assessment for detection of geriatric syndromes in family medicine: A prospective diagnostic study. BMC Geriatr. 2018, 18, 72.
46. Parker, S.G.; McCue, P.; Phelps, K.; McCleod, A.; Arora, S.; Nockels, K.; Kennedy, S.; Roberts, H.; Conroy, S. What is Comprehensive Geriatric Assessment (CGA)? An umbrella review. Age Ageing 2018, 47, 149–155.
47. Wilcockson, T.D.W.; Mardanbegi, D.; Xia, B.; Taylor, S.; Sawyer, P.; Gellersen, H.W.; Leroi, I.; Killick, R.; Crawford, T.J. Abnormalities of saccadic eye movements in dementia due to Alzheimer’s disease and mild cognitive impairment. Aging 2019, 11, 5389–5398.
48. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
49. Rolfs, M. Microsaccades: Small steps on a long way. Vis. Res. 2009, 49, 2415–2441.
50. Dalmaso, M.; Castelli, L.; Galfano, G. Microsaccadic rate and pupil size dynamics in pro-/anti-saccade preparation: The impact of intermixed vs. blocked trial administration. Psychol. Res. 2020, 84, 1320–1332.
51. Kapoula, Z.; Yang, Q.; Otero-Millan, J.; Xiao, S.; Macknik, S.L.; Lang, A.; Verny, M.; Martinez-Conde, S. Distinctive features of microsaccades in Alzheimer’s disease and in mild cognitive impairment. Age 2014, 36, 535–543.
52. Busse, A.; Angermeyer, M.C.; Riedel-Heller, S.G. Progression of Mild Cognitive Impairment to Dementia: A Challenge to Current Thinking. Br. J. Psychiatry 2006, 189, 399–404.
53. Clark, R.; Blundell, J.; Dunn, M.J.; Erichsen, J.T.; Giardini, M.E.; Gottlob, I.; Harris, C.; Lee, H.; Mcilreavy, L.; Olson, A.; et al. The Potential and Value of Objective Eye Tracking in the Ophthalmology Clinic. Eye 2019, 33, 1200–1202.
54. Csukly, G.; Sirály, E.; Fodor, Z.; Horváth, A.; Salacz, P.; Hidasi, Z.; Csibri, É.; Rudas, G.; Szabó, Á. The Differentiation of Amnestic Type MCI from the Non-Amnestic Types by Structural MRI. Front. Aging Neurosci. 2016, 8, 52.
55. Wright, T.; O’Connor, S. Reviewing Challenges and Gaps in European and Global Dementia Policy. J. Public Ment. Health 2018, 17, 157–167.
56. Kaufer, D.I.; Williams, C.S.; Braaten, A.J.; Gill, K.; Zimmerman, S.; Sloane, P.D. Cognitive Screening for Dementia and Mild Cognitive Impairment in Assisted Living: Comparison of 3 Tests. J. Am. Med. Dir. Assoc. 2008, 9, 586–593.

Figure 1. (A) pro- and anti-saccades task; (B) smooth pursuit task; (C) memory-guided task; (D) predictive saccade task. Red dot: target point of the anti-saccade (A), fixation point (B,D); blue dot: target point.

Figure 2. Differences among groups in eye movement tasks. (A) fixation dispersion; pro- and anti-saccades tasks’ accuracy (%) and latency (s); (B) smooth pursuit task: mean variance, mean delay (s), and saccade compensation; (C) memory-guided task: mean gain; (D) predictive saccade task: mean variance and latency (s).

Figure 3. MCI patient detection power of models. Models based on eye movement features only (blue curve), behavioral features only (green curve), and combined features (including both eye movement and behavioral features; orange curve), with AUCs of 0.61, 0.51, and 0.64, respectively. The binary model based on combined features performs the best. AUC: area under the receiver operating characteristic (ROC) curve.

Table 1. Data Input to CNN.

The Pro- and Anti-Saccades Tasks (10 Features)	Smooth Pursuit Task (12 Features)	Memory-Guided Task (5 Features)	Predictive Saccade Task (4 Features)	Behavioral Tasks (8 Features)
fixation dispersion	fixation dispersion	fixation dispersion	fixation dispersion	forward visuospatial task
mean accuracy	saccade compensation	mean gain	mean variance	backward visuospatial task
pro-saccades accuracy	mean variance	gain at 5°	mean latency	forward digit span task
anti-saccades accuracy	variance vertical at 0.25 Hz	gain at 10°	mean gain	backward digit span task
mean latency	variance vertical at 0.4 Hz	gain at 15°		Stroop consistent accuracy
pro-saccades latency	variance horizontal at 0.25 Hz			Stroop inconsistent accuracy
anti-saccades latency	variance horizontal at 0.4 Hz			Stroop consistent response time
mean gain	mean delay			Stroop inconsistent response time
pro-saccades gain	delay vertical at 0.25 Hz
anti-saccades gain	delay vertical at 0.4 Hz
	delay horizontal at 0.25 Hz
	delay horizontal at 0.4 Hz
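
As a rough illustration of how the feature groups in Table 1 map onto the model variants compared in Tables 2 and 3, the Python sketch below records the per-task feature counts from the table header and derives the number of input features for each variant. The dictionary keys and the helpers `input_width` and `without` are hypothetical names introduced here; they do not describe the study's actual preprocessing code.

```python
# Feature counts per task, copied from the column headers of Table 1.
FEATURE_COUNTS = {
    "pro_anti_saccade": 10,
    "smooth_pursuit": 12,
    "memory_guided": 5,
    "predictive_saccade": 4,
    "behavioral": 8,
}

EYE_TASKS = (
    "pro_anti_saccade",
    "smooth_pursuit",
    "memory_guided",
    "predictive_saccade",
)


def input_width(groups):
    """Number of input features when only the given groups are used."""
    return sum(FEATURE_COUNTS[g] for g in groups)


def without(task):
    """Combined feature set with one task left out (ablation variant)."""
    return [g for g in FEATURE_COUNTS if g != task]


eye_only = input_width(EYE_TASKS)              # eye movement features only
behavioral_only = input_width(["behavioral"])  # behavioral features only
combined = input_width(FEATURE_COUNTS)         # all groups together
```

Under these counts, the eye-movement-only variant has 31 inputs, the behavioral-only variant has 8, and the combined variant has 39, matching the 31 eye movement and 8 behavioral features reported in the abstract.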

Table 2. MCI Triple Classification Results.

Model	Accuracy (%)	Hit Rate: Control (%)	Hit Rate: MCI (%)	Hit Rate: Young (%)
Combined features	74.62 (3.71)	73.08 (12.63)	60.77 (12.51)	100.00 (0.00)
Eye movement features only	69.62 (5.27)	66.60 (7.41)	51.63 (15.29)	98.75 (3.95)
Behavioral features only	72.31 (7.43)	67.84 (12.34)	57.50 (11.62)	96.39 (5.83)
Combined features without pro- and anti-saccades task	68.85 (7.57)	70.07 (11.82)	51.50 (20.56)	94.90 (8.32)
Combined features without smooth pursuit task	70.39 (5.46)	61.73 (11.18)	58.60 (8.57)	100.00 (0.00)
Combined features without memory-guided saccade task	73.08 (8.70)	69.41 (15.64)	58.81 (19.83)	100.00 (0.00)
Combined features without predictive saccade task	78.48 (6.86)	73.70 (15.09)	62.64 (16.94)	100.00 (0.00)

Note: Data are expressed as Mean (Standard Deviation).
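
The quantities in Tables 2 and 3 can be illustrated with a short Python sketch. It assumes that "hit rate" means the percentage of participants in a group who were assigned to their own group, and that the Mean (Standard Deviation) summary is taken over repeated evaluation runs using the sample standard deviation; neither assumption is confirmed by the tables themselves. The function names and the toy labels are hypothetical.

```python
from statistics import mean, stdev


def accuracy_and_hit_rates(y_true, y_pred, classes):
    """Overall accuracy (%) and per-class hit rate (%) for one evaluation run."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = 100.0 * correct / len(y_true)
    hit_rates = {}
    for c in classes:
        n_class = sum(t == c for t in y_true)
        n_hit = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        # Assumed definition: share of true members of class c labeled as c.
        hit_rates[c] = 100.0 * n_hit / n_class if n_class else float("nan")
    return accuracy, hit_rates


def summarize(values):
    """Format per-run values as 'Mean (Standard Deviation)' with two decimals."""
    return f"{mean(values):.2f} ({stdev(values):.2f})"


# Made-up labels for illustration only.
y_true = ["MCI", "MCI", "Control", "Control", "Young"]
y_pred = ["MCI", "Control", "Control", "Control", "Young"]
acc, hits = accuracy_and_hit_rates(y_true, y_pred, ["Control", "MCI", "Young"])
```

In this toy run, four of five labels are correct, so the accuracy is 80%, while the MCI hit rate is only 50% because one of the two MCI cases is labeled as a control. The same pattern appears in the tables, where the MCI hit rate is consistently lower than the overall accuracy.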

Table 3. MCI Binary Classification Results.

Model	Accuracy (%)	Hit Rate: Control (%)	Hit Rate: MCI (%)
Combined features	66.50 (8.51)	70.02 (11.15)	59.76 (13.90)
Eye movement features only	61.00 (6.15)	70.63 (10.56)	49.31 (13.15)
Behavioral features only	58.50 (4.12)	62.22 (13.77)	52.79 (15.88)
Combined features without pro- and anti-saccades task	61.00 (10.22)	62.08 (18.79)	56.97 (15.77)
Combined features without smooth pursuit task	62.50 (7.55)	68.21 (13.71)	54.58 (14.59)
Combined features without memory-guided saccade task	63.00 (6.75)	69.56 (11.03)	52.77 (21.40)
Combined features without predictive saccade task	66.00 (6.15)	70.78 (10.54)	59.88 (17.76)

Note: Data are expressed as Mean (Standard Deviation).

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
