Next Article in Journal
Infectious Complications and Prognostic Factors of Mortality in Patients with Lupus Nephritis Admitted to Intensive Care Units
Previous Article in Journal
Quantifying the Impact of Chronic Obstructive Sialadenitis on Quality of Life
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Education, Sex, and Age Shape Rey Complex Figure Performance in Cognitively Normal Adults: An Interpretable Machine Learning Study

1
Laboratory for Pathology Dynamics, Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
2
Alzheimer’s Disease Research Center, Department of Neurology, Emory University School of Medicine, Atlanta, GA 30329, USA
3
Department of Brain Health, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
4
Center for Machine Learning at Georgia Tech, Georgia Institute of Technology, Atlanta, GA 30332, USA
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
J. Clin. Med. 2025, 14(21), 7562; https://doi.org/10.3390/jcm14217562 (registering DOI)
Submission received: 9 September 2025 / Revised: 16 October 2025 / Accepted: 22 October 2025 / Published: 25 October 2025
(This article belongs to the Section Clinical Neurology)

Abstract

Background: Demographic factors such as education, sex, and age can significantly influence cognitive test performance, yet their impact on the Montreal Cognitive Assessment (MoCA) and Rey Complex Figure (CF) test has not been fully characterized in large, cognitively normal samples. Understanding these effects is critical for refining normative standards and improving the clinical interpretation of neuropsychological assessments. Methods: Data from 926 cognitively healthy adults (MoCA ≥ 24) were analyzed using supervised machine learning classifiers and complementary statistical models to identify the most predictive MoCA and CF features associated with education, sex, and age, while including race as a covariate. Feature importance analyses were conducted to quantify the relative contributions of accuracy-based and time-based measures after adjusting for demographic confounding. Results: Distinct patterns emerged across demographic groups. Higher educational attainment was associated with longer encoding times and improved recall performance, suggesting more deliberate encoding strategies. Sex differences were most apparent in the recall of visuospatial details and language-related subtests, with women showing relative advantages in fine detail reproduction and verbal fluency. Age-related differences were primarily reflected in slower task completion and reduced spatial memory accuracy. Conclusions: Leveraging one of the largest reported samples of cognitively healthy adults, this study demonstrates that education, sex, and age systematically influence MoCA and CF performance. These findings highlight the importance of incorporating demographic factors into normative frameworks to enhance diagnostic precision and the interpretability of cognitive assessments.

1. Introduction

Developed by André Rey [1], the Rey Complex Figure (CF), also sometimes referred to as the Rey–Osterrieth Complex Figure (ROCF), is a widely used neuropsychological test to assess visuospatial construction, visual memory, and executive functioning. The test involves copying and then recalling a complex geometric design following the copy and after a short delay. The CF is used across both clinical and research settings to assess cognitive functions across a variety of neurological conditions including dementia, traumatic brain injury, and epilepsy. The CF is scored based upon the accuracy and location of 18 individual items. Each unit can have a maximum of two points for a total of 36 points; two points for both an accurate figure and correct location; one point for an accurate figure with incorrect location or inaccurate figure with correct location; 0.5 points for recognizable but inaccurate figure and incorrect location; and 0 points for a missing figure [2]. Figure 1 shows the 18 CF items.
Demographic factors such as age, education, sex, and cultural background are known to significantly influence neuropsychological test performance, and failure to account for these variables can lead to misinterpretation of results and inaccurate classification of cognitive functioning [3,4]. However, few studies have systematically quantified how these demographic variables shape individual CF subscores in cognitively normal adults, limiting the ability to refine normative standards or diagnostic thresholds. Traditional regression approaches often assume linear relationships between demographic factors and test performance, whereas interpretable machine learning (ML) methods can flexibly model nonlinear associations while maintaining transparency in feature-level contributions. Such approaches offer clinical value by identifying which specific cognitive features drive demographic variability—information that can improve interpretation without compromising explainability.
The objective of the present study was to apply interpretable machine learning methods to examine differences in CF performance among cognitively healthy control participants. Previous machine learning CF research has primarily used non-clinically interpretable convolutional neural networks to examine differences between pathological and non-pathological states [5,6]. In this report, we apply interpretable, clinical feature-based machine learning algorithms to assess the impact of demographic features on neuropsychological assessment performance, including sex, education level, and race, in a large sample (n = 926) of cognitively normal participants. By characterizing how demographic factors influence cognitive test components, this study aims to support refinement of diagnostic cutoffs and normative adjustments currently under debate in the cognitive screening literature.

2. Materials and Methods

To investigate the influence of demographic variables on cognitive performance, we analyzed data from a large cohort of cognitively healthy adults. Standardized procedures were applied to extract performance metrics from the Rey Complex Figure (CF) and Montreal Cognitive Assessment (MoCA), and both statistical modeling and supervised machine learning approaches were used to identify key predictors. This dual approach ensured that findings were clinically interpretable while leveraging the sensitivity of data-driven methods.

2.1. Participant Selection

This was a secondary analysis of previously existing, fully de-identified CF data collected from an existing cohort of participants seen in the clinic at Emory University, Atlanta, GA, USA, as part of the Emory Healthy Brain Study. All participants provided consent according to the Declaration of Helsinki. The original EHBS study protocol was approved by the Internal Review Board of Emory University. The present secondary analysis used only de-identified data and was deemed exempt from additional ethical review.
All CF components were drawn by the participants with pen and paper; no electronic or tablet drawings were utilized. Included participants had no known neurological pathology at the time of CF testing and had Montreal Cognitive Assessment (MoCA) score greater than 24/30. Included demographic participant attributes included sex, age, race, and education. Study exclusion criteria included: (1) incomplete drawings for all three subtests of copy, immediate recall, and delay recall; (2) a MoCA cut-off of <24 utilized based on evidence indicating that the previously recommended threshold of <26 may be overly conservative for classifying normal cognitive functioning [7,8]. The final sample size for this study was 926 cognitively normal participants (n = 926).

2.2. CF Scoring

CF reproductions for copy, immediate recall, and delayed recall subtests were independently scored by three experienced research assistants. Immediate recall was obtained approximately 30 s after copy completion, with delayed CF recall obtained approximately 30 min later [7]. One senior clinician who specialized in the grading of the CF was used to consolidate minor differences in scoring and assure score veracity using the traditional Rey scoring system [7]. Although formal inter-rater reliability statistics were not computed, discrepancies were minimal and resolved through consensus to maintain scoring accuracy and standardization across all raters.
There are 18 CF scoring elements that are the same across copy, immediate memory, and delayed memory conditions (Table 1). The features used for grading included the presence or absence of specific features in the CF drawing in the copy subtest (Copy_), the immediate recall subtest (Imm_), or the delayed recall subtest (Del_). Additionally, the drawing times were recorded (in seconds) for each of the 3 subtests. Total CF scores for each CF condition were also used as predictive features.

2.3. Feature Inclusion and Data Processing

The primary features included CF copy, CF immediate, and CF delay including completion times for each condition. Additionally, the Montreal Cognitive Assessment and MoCA domain scores were used as predictive features [9]. Only the raw total MoCA score was used as part of the cognitively normal participant selection criterion. Demographic features modeled included sex, age, education years, and race. Sex (male, female) was modeled as a binomial. Age (in years) was binned into 3 groups using age percentile within the population to calculate three relatively equal bin sizes: Age group 1 (45–60 years), Age group 2 (60–67 years), and Age group 3 (67–80 years). Education years were classified based upon highest grade completed (No college, college graduates, post-graduate degree). Race was grouped into Asian, black, or white/Caucasian. Race was not used as an isolated classification target due to substantial class imbalance across groups, which created a high risk of overfitting and misleading performance estimates. However, race was retained as a covariate in regression preprocessing and feature evaluation to account for its potential influence on cognitive performance (see Section 4.5).

2.4. Machine Learning to Classify Participant Demographic Features

A combined statistical and machine learning pipeline (Figure 2) was employed to examine associations between Complex Figure (CF) performance and participant demographics (education, sex, and age). Initial significance testing was conducted using t-tests and ANOVA, followed by regression to isolate the unique contribution of each demographic attribute and to remove potential confounding. Post-regression p-values were checked to confirm that preprocessing effectively controlled for demographic overlap prior to machine learning analysis. To address potential collinearity, we computed pairwise correlations and variance inflation factors (VIFs) across predictors; all VIFs indicated acceptable independence.
Traditional supervised feature-based models—support vector machine (SVM), logistic regression (LR), and random forests—were implemented using scikit-learn with standard default parameters, as these algorithms are well established and require minimal hyperparameter tuning. To ensure generalizability, the data were randomly partitioned into an 80/20 training–testing split, with the test set held out as an independent validation sample. Model fitting, preprocessing, and feature scaling were conducted exclusively on the training data to prevent data leakage. Model performance was assessed using accuracy, precision, recall, and F1-scores to ensure robustness across metrics. Exploratory analyses using k-nearest neighbors (kNN) and principal component analysis (PCA) were conducted to visualize decision boundaries and confirm model separability; as these did not alter the main findings, detailed results are not presented.
To identify the most informative predictors, feature importance values were extracted from Random Forest models using Gini impurity reduction. This approach quantifies how much each feature contributes to decreasing classification error during model training, thereby ranking predictors by their relative contribution. Although alternative approaches such as SHAP or permutation methods were considered, Gini importance was selected for its direct interpretability in clinical contexts and its established use in cognitive feature analysis. Rankings of the top ten features for each demographic classification are reported. All analyses were performed in Python 3.14.0 using the Scikit-learn library.

3. Results

The goal of this study was to assess how well demographic features can predict CF performance in normal controls with no overt functional pathology. The end-to-end workflow (Figure 2) first extracted CF subscores and task times, then used classical statistics (t-tests/ANOVA and linear models) to quantify raw demographic associations and to regress out non-target demographics before prediction. In parallel, a supervised machine-learning branch trained classifiers on the adjusted feature sets to predict the target demographic and to estimate feature importance.

3.1. Cohort Characteristics and Demographic Performance

The cohort’s characteristics are shown in Table 2. Of the eligible controls with intact global cognition (MoCA ≥ 24), the analytic cohort comprised n = 926 participants. Women constituted ~71% of the sample. Most participants self-identified as White, with smaller Black and Asian subgroups. Age was evenly distributed across the three bins (45–60, 60–67, 67–80), and education skewed toward the two higher categories (≥16 years). These distributions demonstrate sufficient representation across the primary demographic variables used for modeling—age, sex, and education. Accordingly, education, sex, and age were retained as the primary independent variables in subsequent analyses. Race was not examined as an isolated variable due to class imbalance but was included as a covariate during regression preprocessing and feature evaluation to account for its potential influence (see Section 4.5).
The relationship between demographic targets was evaluated using Variance Inflation Factor (VIF) [10] and Spearman rank correlations. Table 2 shows the calculated VIF and tolerance values, while Figure 3 displays the Spearman rank correlations in a correlation matrix. Given that the VIF values are all less than 5 to 10, and the tolerance values are greater than 0.1 to 0.2 [10], multicollinearity does not exist amongst the demographic variables.

3.2. CF Performance by Demographic Group

Demographic performances are shown in Table 3. Across groups, CF Copy scores were narrowly distributed, while Immediate and Delayed recall showed greater spread. Women and men had comparable Copy performance; men showed slightly higher recall sums, whereas women exhibited marginally higher MoCA scores. Age-related trends were evident: earlier age bins showed higher recall sums, with modest variation in task completion times. Increasing education was associated with incrementally higher CF recall and MoCA scores (Table 4).

3.3. Statistical Associations of CF Features with Demographics

Statistical analysis (the upper branch of Figure 2) of data was analyzed to examine correlations with demographic attributes. Table 5 summarizes the results of statistical testing, examining the influence of primary demographic variables for which there was a sufficient sample size—education, sex, and age—on CF subscores (Copy_Sum, Imm_Sum, Del_Sum) and MoCA scores, both before and after demographic regression.
Statistical analysis was performed to evaluate associations in the raw data. Several significant associations emerged: education was strongly related to all CF measures and MoCA (all p < 0.01), age showed significant effects on immediate and delayed recall and MoCA (p < 0.001), and sex was associated with immediate recall, delayed recall, and MoCA (p < 0.05).

3.4. ML Classification of Demographic Targets

Machine learning (ML) models were developed to identify patterns associated with individual demographic targets—education, sex, or age—each chosen based on sufficient sample size and class balance. While the models generated demographic predictions, the primary objective was to examine feature importance profiles linked to each target. Race was retained as a covariate to account for potential confounding effects. Classifier performance varied by both target (education, sex, or age) and algorithm (SVM, Logistic Regression, Random Forest) as shown in Figure 4. Logistic Regression achieved the highest accuracy for Education and Age, while Random Forest yielded the best Sex predictions, with the strongest gains in precision. Support Vector Machines performed competitively but did not lead on any target. Together, these results indicate that (i) education- and age-related signal in CF features is captured most effectively by a linear decision boundary after deconfounding, and (ii) sex-related signal benefits from the non-linear partitions afforded by tree-based ensembles.

3.5. CF Feature Importance in Predicting Demographic Attributes

Feature-ranking analyses converged on temporal measures as the most consistently informative predictors across tasks. Table 6 shows the top 10 ranked CF features for each classification task: education, sex, and age. The full ranked CF feature importance lists for age, sex, and age are shown in Appendix A Table A1, Table A2 and Table A3, respectively. Copy time and Immediate recall time were the top two features overall, with MoCA close behind. Recall and copy sums contributed secondary information, and MoCA subdomains (e.g., Executive Function) provided additional, target-specific value. The prominence of timing features suggests that processing speed and task efficiency during figure copy and recall carry robust demographic signatures even after statistical deconfounding.
In a cognitively intact control cohort, CF performance exhibits expected demographic gradients at the descriptive level; however, targeted regression effectively removes these associations, enabling fair prediction tasks. Under these conditions, linear models best recover education and age, whereas non-linear ensembles best recover sex. Task-completion times emerge as the most generalizable features, with global cognition (MoCA) and quantitative recall/copy subscores providing complementary signal.

3.6. Overlap of Top Predictive Features Across Education, Sex, and Age

Figure 5 presents a Venn diagram illustrating the overlap of the top 15 predictive features across the three demographic classification tasks: education, sex, and age. Several CF and MoCA measures emerged as universally predictive, including Copy_Sum, Imm_Sum, Del_Sum, Del_17, MoCA total, MoCA subdomains (EF, Mem), and task completion times (Copy_time, Imm_time, Del_time). These features clustered in the central intersection, indicating broad predictive utility across all targets. In contrast, a subset of features demonstrated target-specific importance, such as MoCA_VS, Imm_6, and Imm_11 for education; Del_9 and MoCA_Lang for sex; and Del_12, Del_18, and Imm_8 for age. Features like Imm_18 and Imm_17 showed pairwise overlap, linking education with sex and sex with age, respectively. Together, the results highlight both a core set of generalizable predictors and specialized features that drive classification accuracy for demographic groups.

4. Discussion

This study investigated how demographic factors influence performance on the Complex Figure (CF) task and related MoCA subtests within a cognitively normal population. By integrating accuracy-based and time-based features, we identified patterns reflecting not only general cognitive ability but also meaningful variations associated with education, sex, and age. Regression analyses further demonstrated that these effects were not statistical artifacts, but rather distinct cognitive profiles across demographic groups. Understanding these differences is clinically relevant, as they provide insight into how demographic factors potentially shape visuospatial processing, encoding strategies, and memory function—ultimately informing both test interpretation and the refinement of normative standards. The following sections examine these demographic influences in detail, beginning with education, followed by sex and age.

4.1. Differences Based on Education

When classifying participants by education level, the most predictive features were all time-based or summative metrics: Imm_time, Copy_time, Imm_Sum, MoCA, and Del_time. These variables represent cognitive processing speed, executive planning, and memory consolidation. The time-based metrics, particularly Imm_time and Copy_time, suggest that longer encoding times during figure reproduction are not simply related to age-related slowing, but instead may reflect more deliberate encoding strategies in individuals with higher education, ultimately leading to stronger memory performance. This aligns with evidence showing that greater education enhances executive function and attention to structure in tasks such as CF [11]. On the other hand, summative metrics like Imm_Sum and Del_time emphasize the role of higher education in enhancing both short- and long-term visual memory performance. Previous studies have shown that higher levels of education supports organizational encoding, strategy use, and visual detail tracking, which are all factors that can contribute to stronger CF performance [11,12].
In addition to these predictors (which were found to have high importance when predicting sex and age), three features were uniquely important for predicting education: MoCA_VS, Imm_small_rectangle, and Imm_circle_with_three_dots. MoCA_VS, a measure of visuospatial skill, reflects how educational experience may fine-tune visual construction and mental manipulation skills, consistent with prior studies linking education with visuospatial and design fluency abilities [4]. Notably, previous research has found that individuals with higher levels of education scored higher on MoCA subtests specifically measuring visuospatial abilities and language [13]. This may be because general executive function capacity and working memory tend to be more efficient in individuals with higher levels of education [4,12]. It is important to note that visuospatial subtests are also highly sensitive to early cognitive changes and decline. Kaya et al. [13] found that visuospatial tasks such as clock drawing and trail making were strongly discriminative for mild cognitive impairment (MCI) compared to healthy controls and participants with Alzheimer’s disease (AD). The high importance of Imm_small_rectangle and Imm_circle_with_three_dots is supported by the idea that individuals with higher educational exposure perform better on memory tasks requiring structured encoding of geometric information [11]. These two immediate recall components also measure the early-stage encoding and recall of minor visual components, suggesting that formal education may sharpen the precision and stability of memory traces related to the spatial positioning of smaller shapes within a larger whole.

4.2. Differences Based on Sex

Machine learning classification identified CF drawing times as the most prominent difference between male and female participants. This observation is generally consistent with prior work suggesting that copy time may influence recall performance, particularly in delayed recall tests [11]. In contrast to Tremblay et al. [11], however, the current results did not show a clear gender-specific pattern, raising the possibility that time-on-task reflects general cognitive strategies rather than a sex-dependent encoding effect.
Feature importance analyses further indicated that detail-oriented shapes—specifically the vertical cross and the small triangle above the large rectangle—and language ability emerged as distinctive predictors in the sex classification task. These elements require a high degree of precision and visual discrimination, and their predictive importance aligns with prior findings that females often exhibit superior fine motor control and accuracy in visuospatial tasks, particularly those involving visual feedback and structured reproduction [14].
Notably, Del_small_triangle_above_large_rectangle, a delayed recall subscore from the CF, reflects the participant’s ability to retrieve a complex visual element from memory. In this study, Del_small_triangle_above_large_rectangle was more predictive of sex than simpler geometric components such as lines or circles. This pattern supports the hypothesis that high-complexity, centrally located CF features may be especially sensitive to strategic and memory-based sex differences, with females potentially demonstrating advantages linked to attention to fine detail and encoding strategies [11,15].
Similarly, Imm_vertical_cross, which measures immediate recall of one of the cross structures, suggests a sex-related difference in early-stage encoding of detailed visual-spatial relationships. Prior studies indicate that females often approach figure construction more holistically, emphasizing organization and fine detail, which can enhance recall performance for subcomponents such as the vertical cross [16]. This may also reflect differences in encoding strategies, where males tend to perform better on spatial orientation tasks, while females excel in fine motor and precision-based tasks [17].
Lastly, the MoCA_Lang score, reflecting language abilities such as verbal fluency and sentence repetition, was a sex-specific predictor in the classification model. Women frequently outperform men on language-based measures, including verbal memory and fluency [17,18] differences often attributed to both developmental and biological factors such as hemispheric lateralization and neuroendocrine modulation of language networks.
Consistent with prior work, males generally outperform females in gross visuospatial tasks such as mental rotation or global spatial organization, whereas females tend to show relative strengths in segmenting or sequential reproduction tasks [15,16]. These distinctions may partly arise from early social conditioning, with boys more often encouraged to engage in spatially demanding activities such as building or navigation, and girls in tasks emphasizing detailed planning and precision [19,20].

4.3. Differences Based on Age

Like the age and education classification tasks, all three drawing times were strongly associated with age. Imm_time, Copy_time, and Del_time reflect processing speed and encoding duration, which are known to decline gradually with age due to slower neural transmission and reduced executive control [11,21]. Older adults may compensate for these changes by allocating more time to encoding; however, this additional time does not always translate into preserved recall performance.
Demographic variables such as education and age have previously been shown to influence MoCA performance, which may help explain why MoCA, MoCA_EF, and MoCA_Mem were among the most predictive features across classification tasks. Standard MoCA cutoff scores can under- or over-estimate impairment depending on demographic characteristics [22]. Even within a cognitively normal sample, these MoCA subscores were closely associated with age classification, consistent with evidence that working memory capacity and episodic recall decline with advancing age [21].
Further feature importance analyses indicated that subscores from both the immediate and delayed recall phases of the CF were strongly age-related. Age has been shown to affect both immediate and delayed recall performance on the CF [11,21], with delayed recall typically showing progressive decline beginning in middle adulthood and becoming more pronounced in the late 50s and 60s [21]. The prominence of Del_five_parallel_lines and Del_square_attached_to_large_rectangle among age-sensitive features supports this pattern: as the brain ages, integrating global structure and relational components—such as aligning parallel lines or reproducing connected shapes—becomes increasingly effortful and error-prone. Meanwhile, Imm_four_parallel_lines as a key predictor highlights that early encoding strategies may begin to deteriorate with age, influencing memory consolidation even over short intervals. This observation aligns with prior research suggesting that older adults often exhibit reduced encoding efficiency, which can affect both working memory and subsequent recall [11]. Age has also been linked to a shift away from structured or hierarchical reproduction strategies, which may contribute to reduced accuracy on spatially complex figure elements, including clusters of lines [15].

4.4. Clinical Implications

Although the models in this study were developed for interpretability rather than clinical prediction, the identified demographic effects hold important implications for clinical research and cognitive assessment. Understanding how age, sex, and education influence performance on the CF and related MoCA subtests—and accounting for potential covariates such as race—can improve the interpretation of visuospatial and memory measures commonly used in diagnostic or preclinical screening contexts. Accounting for these demographic influences may also help reduce confounding when using CF performance as a behavioral endpoint in studies of Alzheimer’s disease (AD) or other neurodegenerative conditions.
For instance, demographic-adjusted CF metrics could improve the selection or stratification of participants in clinical trials targeting AD pathology, ensuring that performance differences are not misattributed to disease effects alone. Visuospatial and recall changes measured by the CF are frequently observed within the AD spectrum, but our results suggest that demographic variability must also be considered when interpreting these patterns [23].
Furthermore, integrating CF-derived metrics with other non-invasive biomarkers could enhance interpretability across diverse populations. Blood-based tau phosphorylated at threonine 217 (pTau217), for example, has been shown to distinguish amyloid-positive from amyloid-negative individuals [24], while transcranial magnetic stimulation (TMS) paradigms have detected cholinergic dysfunction associated with AD-related mild cognitive impairment [25]. Coupling such biomarkers with demographically informed CF analyses may strengthen the ability to disentangle disease-related cognitive changes from normative demographic effects, ultimately improving both diagnostic accuracy and trial design.

4.5. Limitations and Future Directions

A key limitation of this study is the exclusion of race as a standalone classification target. The sample was predominantly White (91.79%), with limited representation of Black (7.24%) and Asian (<1%) participants. Although class imbalance techniques were applied during preliminary analyses, the small sample sizes for underrepresented groups introduced a high risk of overfitting, inflated performance for the majority class, and misleading conclusions regarding race-based cognitive differences. For this reason, education, sex, and age were selected as the only isolated targets for classification. Race was retained as a covariate during regression preprocessing and feature evaluation to account for its potential influence; however, meaningful classification will require a more demographically balanced dataset. Future studies should prioritize recruitment strategies that enhance racial and cultural representation to enable more inclusive and generalizable analyses.
This study also emphasized clinically interpretable machine learning approaches, using feature-based rather than deep learning models to enhance transparency and clinical relevance [26,27]. While this strategy supports interpretability, it necessarily limits model complexity and potential predictive power. Furthermore, although feature analysis identified CF subscores and MoCA_VS as highly predictive of education, poor performance on these subtests should not be interpreted as impairment in the specific domains they measure, as results may also reflect task familiarity or test-taking strategies [28]. These findings should therefore be viewed as demographic-related performance differences, not diagnostic indicators. Additionally, MoCA subscores do not contribute equally to classification accuracy [29], reinforcing the need for cautious interpretation in clinical and research contexts.
Given the cross-sectional design of this study, it was not possible to determine whether the observed demographic influences on CF and MoCA performance remain stable over time. Future research should adopt longitudinal designs to assess this stability and to evaluate whether demographic characteristics might serve as early indicators of cognitive change or progression toward MCI.
Finally, CF performance was evaluated using traditional pen-and-paper drawings, which likely reduced potential confounding effects related to older participants’ lack of familiarity with digital media. However, as future digital tablet-based assessments become more common in clinical and research settings, future studies should investigate automated scoring methods, deep learning approaches, and pixel-level preprocessing (e.g., pen pressure, line trajectory) to improve measurement precision, sensitivity, and scalability. Future digital CF implementations—with real-time stroke capture and sequencing—would also allow finer-grained evaluation of strategic versus compensatory behavior.

5. Conclusions

This study leveraged one of the largest reported samples of cognitively healthy adults (n = 926) and applied clinically interpretable, supervised machine learning methods to examine how education, sex, and age influence performance on the MoCA and CF assessments. Feature importance analyses indicated that higher educational attainment was associated with more deliberate encoding strategies and enhanced memory consolidation, while sex differences were most apparent in the recall of complex visuospatial features and language-related tasks. Age-related effects were characterized primarily by slower processing speed and reduced spatial memory recall accuracy. Collectively, these findings highlight the importance of considering demographic influences when interpreting cognitive test performance, supporting ongoing efforts to refine normative standards and enhance the clinical interpretability of MoCA and CF outcomes across diverse populations.

Author Contributions

Conceptualization, A.J.B.L., B.Z., D.W.L., J.J.L., S.E.J. and C.S.M.; methodology, A.J.B.L., B.Z., D.W.L. and C.S.M.; software, A.J.B.L. and C.S.M.; validation, A.J.B.L., B.Z., J.J.L., S.E.J., D.W.L. and C.S.M.; formal analysis, A.J.B.L., B.Z. and C.S.M.; investigation, D.W.L. and C.S.M.; resources, C.S.M.; data curation, A.J.B.L.; writing—original draft preparation, A.J.B.L., B.Z. and C.S.M.; writing—review and editing, A.J.B.L., B.Z., J.J.L., S.E.J., D.W.L. and C.S.M.; visualization, A.J.B.L., B.Z. and C.S.M.; supervision, C.S.M.; project administration, D.W.L., J.J.L. and C.S.M.; funding acquisition, C.S.M. All authors have read and agreed to the published version of the manuscript.

Funding

The work was supported by National Institutes of Health grants R01AG070937 (J.J.L.), R35GM152245 (C.S.M.), and subawards from U19AG056169 to J.J.L. and C.S.M. Additional funding was provided by the National Science Foundation 1944247 (C.S.M.) and the Chan Zuckerberg Initiative 253558 (C.S.M.). The funding sources had no role in study design; in the collection, analysis, and interpretation of the data; in the writing of the report; or in the decision to submit the article for publication.

Institutional Review Board Statement

The dataset analyzed in this study was collected in accordance with the Declaration of Helsinki and as part of an Internal Review Board-approved protocol at Emory University for the Emory Healthy Brain Study. All participants provided informed consent at the time of data collection. The present secondary analysis used only de-identified data and was deemed exempt from additional ethical review.

Informed Consent Statement

This was a secondary analysis of previously existing, fully de-identified CF data collected from a cohort of participants seen in the clinic at Emory University, Atlanta, Georgia, USA, as part of the Emory Healthy Brain Study. All participants enrolled in the EHBS provided consent according to the Declaration of Helsinki, and the study protocol was approved by the Internal Review Board of Emory University.

Data Availability Statement

Original data requests may be made available upon reasonable request to D.L. and J.J.L. with appropriate approvals.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Ranked feature importance for predicting education classification. Effects of age and sex were regressed to enable isolated evaluation of education, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
Table A1. Ranked feature importance for predicting education classification. Effects of age and sex were regressed to enable isolated evaluation of education, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
FeatureImportance Rank
Imm_time_adj1
Copy_time_adj2
Imm_Sum_adj3
MoCA_adj4
Del_time_adj5
Copy_Sum_adj6
Del_Sum_adj7
MoCA_EF_adj8
MoCA_Mem_adj9
Imm_6_adj10
Imm_18_adj11
Imm_11_adj12
MoCA_VS_adj13
Del_8_adj14
Del_17_adj15
Del_12_adj16
Imm_17_adj17
Imm_1_adj18
MoCA_Lang_adj19
Imm_14_adj20
Imm_4_adj21
Imm_8_adj22
Del_6_adj23
Del_18_adj24
Imm_12_adj25
Del_9_adj26
Del_14_adj27
Imm_9_adj28
Del_11_adj29
Imm_16_adj30
Imm_3_adj31
Del_1_adj32
Del_4_adj33
Imm_5_adj34
copy_7_adj35
copy_1_adj36
copy_6_adj37
copy_9_adj38
Del_13_adj39
Imm_7_adj40
Imm_13_adj41
Del_16_adj42
Del_3_adj43
Del_5_adj44
copy_3_adj45
Del_7_adj46
Imm_15_adj47
Del_10_adj48
Del_15_adj49
Imm_2_adj50
copy_14_adj51
copy_18_adj52
copy_11_adj53
Hand_adj54
copy_13_adj55
copy_2_adj56
copy_5_adj57
Del_2_adj58
Race_adj59
Imm_10_adj60
copy_16_adj61
copy_8_adj62
copy_12_adj63
copy_17_adj64
copy_4_adj65
copy_15_adj66
copy_10_adj67
MoCA_Orient_adj68
Table A2. Ranked feature importance for predicting sex classification. Effects of education and age were regressed to enable isolated evaluation of sex, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
Table A2. Ranked feature importance for predicting sex classification. Effects of education and age were regressed to enable isolated evaluation of sex, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
FeatureImportance Rank
MoCA_adj1
Copy_time_adj2
MoCA_EF_adj3
Del_Sum_adj4
Imm_time_adj5
Del_time_adj6
Copy_Sum_adj7
Imm_Sum_adj8
MoCA_Lang_adj9
MoCA_Mem_adj10
Imm_1_adj11
Imm_17_adj12
Del_17_adj13
Imm_18_adj14
Del_9_adj15
Imm_9_adj16
Del_8_adj17
Imm_14_adj18
Imm_8_adj19
Del_3_adj20
Del_18_adj21
Del_1_adj22
Del_14_adj23
Imm_6_adj24
Del_11_adj25
Del_6_adj26
Del_12_adj27
copy_9_adj28
Imm_13_adj29
Imm_5_adj30
Del_2_adj31
Imm_11_adj32
Del_16_adj33
Del_7_adj34
copy_7_adj35
Del_13_adj36
MoCA_VS_adj37
Imm_15_adj38
Imm_12_adj39
Imm_2_adj40
Imm_3_adj41
Imm_16_adj42
copy_1_adj43
Imm_7_adj44
copy_6_adj45
copy_2_adj46
Del_5_adj47
Del_4_adj48
Imm_4_adj49
copy_3_adj50
Hand_adj51
copy_15_adj52
Imm_10_adj53
Race_adj54
copy_18_adj55
Del_15_adj56
copy_13_adj57
copy_5_adj58
MoCA_Orient_adj59
Del_10_adj60
copy_11_adj61
copy_8_adj62
copy_14_adj63
copy_17_adj64
copy_12_adj65
copy_16_adj66
copy_10_adj67
copy_4_adj68
Table A3. Ranked feature importance for predicting age classification. Effects of education and sex were regressed to enable isolated evaluation of age, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
Table A3. Ranked feature importance for predicting age classification. Effects of education and sex were regressed to enable isolated evaluation of age, but race is kept as a covariate. Features are ranked from most important (Rank 1) to least important (Rank 68) based on their contribution to model performance using a Random Forest classifier.
FeatureImportance Rank
Imm_time_adj1
Copy_time_adj2
Del_time_adj3
MoCA_adj4
Imm_Sum_adj5
Del_Sum_adj6
Copy_Sum_adj7
MoCA_EF_adj8
MoCA_Mem_adj9
Del_17_adj10
Imm_17_adj11
Del_18_adj12
Del_8_adj13
Del_12_adj14
Imm_8_adj15
Imm_6_adj16
Imm_12_adj17
Imm_11_adj18
Del_6_adj19
Del_9_adj20
Imm_18_adj21
MoCA_Lang_adj22
Imm_14_adj23
Del_3_adj24
Del_14_adj25
Imm_3_adj26
Imm_1_adj27
Imm_13_adj28
Del_7_adj29
Imm_9_adj30
Del_11_adj31
Del_16_adj32
MoCA_VS_adj33
Del_1_adj34
Del_13_adj35
copy_7_adj36
copy_9_adj37
copy_1_adj38
copy_6_adj39
Imm_7_adj40
Del_2_adj41
Del_5_adj42
Imm_2_adj43
Imm_16_adj44
copy_18_adj45
copy_3_adj46
Hand_adj47
Del_4_adj48
copy_2_adj49
Imm_4_adj50
Imm_5_adj51
Del_10_adj52
copy_15_adj53
Imm_15_adj54
copy_17_adj55
copy_11_adj56
Del_15_adj57
copy_8_adj58
Imm_10_adj59
copy_14_adj60
Race_adj61
MoCA_Orient_adj62
copy_13_adj63
copy_5_adj64
copy_16_adj65
copy_12_adj66
copy_4_adj67
copy_10_adj68

References

  1. Rey, A. L’examen psychologique dans les cas d’encéphalopathie traumatique. (Les problems.). [The psychological examination in cases of traumatic encepholopathy. Problems.]. Arch. Psychol. 1941, 28, 215–285. [Google Scholar]
  2. Meyers, J.; Meyers, K.R. Rey Complex Figure Test and Recognition Trial; Psychological Assessment Resources, Incorporated: Lutz, FL, USA, 1995. [Google Scholar]
  3. Heaton, R.K. Revised Comprehensive Norms for an Expanded Halstead-Reitan Battery: Demographically Adjusted Neuropsychological Norms for African American and Caucasian Adults, Professional Manual; Psychological Assessment Resources: Lutz, FL, USA, 2004. [Google Scholar]
  4. Rosselli, M.; Ardila, A. The impact of culture and education on non-verbal neuropsychological measurements: A critical review. Brain Cogn. 2003, 52, 326–333. [Google Scholar] [CrossRef] [PubMed]
  5. Dong, N.; Fu, C.; Li, R.; Zhang, W.; Liu, M.; Xiao, W.; Taylor, H.M.; Nicholas, P.J.; Tanglay, O.; Young, I.M.; et al. Machine Learning Decomposition of the Anatomy of Neuropsychological Deficit in Alzheimer’s Disease and Mild Cognitive Impairment. Front. Aging Neurosci. 2022, 14, 854733. [Google Scholar] [CrossRef]
  6. Cheah, W.T.; Hwang, J.J.; Hong, S.Y.; Fu, L.C.; Chang, Y.L.; Chen, T.F.; Chen, I.A.; Chou, C.C. A Digital Screening System for Alzheimer Disease Based on a Neuropsychological Test and a Convolutional Neural Network: System Development and Validation. JMIR Med. Inf. 2022, 10, e31106. [Google Scholar] [CrossRef] [PubMed]
  7. Loring, D.W.; Martin, R.C.; Meador, K.J.; Lee, G.P. Psychometric construction of the Rey-Osterrieth Complex Figure: Methodological considerations and interrater reliability. Arch. Clin. Neuropsychol. 1990, 5, 1–14. [Google Scholar] [CrossRef]
  8. Islam, N.; Hashem, R.; Gad, M.; Brown, A.; Levis, B.; Renoux, C.; Thombs, B.D.; McInnes, M.D. Accuracy of the Montreal Cognitive Assessment tool for detecting mild cognitive impairment: A systematic review and meta-analysis. Alzheimers Dement. 2023, 19, 3235–3243. [Google Scholar] [CrossRef] [PubMed]
  9. Goldstein, F.C.; Milloy, A.; Loring, D.W. Incremental Validity of Montreal Cognitive Assessment Index Scores in Mild Cognitive Impairment and Alzheimer Disease. Dement. Geriatr. Cogn. Disord. 2018, 45, 49–55. [Google Scholar] [CrossRef]
  10. Kim, J.H. Multicollinearity and misleading statistical results. Korean J. Anesth. 2019, 72, 558–569. [Google Scholar] [CrossRef]
  11. Tremblay, M.P.; Potvin, O.; Callahan, B.L.; Belleville, S.; Gagnon, J.F.; Caza, N.; Ferland, G.; Hudon, C.; Macoir, J. Normative data for the Rey-Osterrieth and the Taylor complex figure tests in Quebec-French people. Arch. Clin. Neuropsychol. 2015, 30, 78–87. [Google Scholar] [CrossRef]
  12. Lingo VanGilder, J.; Lohse, K.R.; Duff, K.; Wang, P.; Schaefer, S.Y. Evidence for associations between Rey-Osterrieth Complex Figure test and motor skill learning in older adults. Acta Psychol. 2021, 214, 103261. [Google Scholar] [CrossRef]
  13. Kaya, Y.; Aki, O.E.; Can, U.A.; Derle, E.; Kibaroglu, S.; Barak, A. Validation of Montreal Cognitive Assessment and Discriminant Power of Montreal Cognitive Assessment Subtests in Patients With Mild Cognitive Impairment and Alzheimer Dementia in Turkish Population. J. Geriatr. Psychiatry Neurol. 2014, 27, 103–109. [Google Scholar] [CrossRef]
  14. Liutsko, L.; Muinos, R.; Tous Ral, J.M.; Contreras, M.J. Fine Motor Precision Tasks: Sex Differences in Performance with and without Visual Guidance across Different Age Groups. Behav. Sci. 2020, 10, 36. [Google Scholar] [CrossRef]
  15. Rosselli, M.; Ardila, A. Effects of age, education, and gender on the Rey-Osterrieth Complex Figure. Clin. Neuropsychol. 1991, 5, 370–376. [Google Scholar] [CrossRef]
  16. Vlachos, F.; Andreou, G.; Andreou, E. Biological and environmental influences in visuospatial abilities. Learn. Individ. Differ. 2003, 13, 339–347. [Google Scholar] [CrossRef]
  17. Halpern, D.F. Sex Differences in Cognitive Abilities, 4th ed.; Psychology Press: New York, NY, USA, 2012. [Google Scholar]
  18. Herlitz, A.; Yonker, J.E. Sex Differences in Episodic Memory: The Influence of Intelligence. J. Clin. Exp. Neuropsychol. 2002, 24, 107–114. [Google Scholar] [CrossRef]
  19. Eccles, J.S. Gender roles and women’s achievement-related decisions. Psychol. Women Q. 1987, 11, 135–172. [Google Scholar] [CrossRef]
  20. Halpern, D.F. Sex Differences in Cognitive Abilities, 2nd ed.; Lawrence Erlbaum Associates, Inc: Mahwah, NJ, USA, 1992; p. 308. [Google Scholar]
  21. Gallagher, C.; Burke, T. Age, gender and IQ effects on the Rey-Osterrieth Complex Figure Test. Br. J. Clin. Psychol. 2007, 46, 35–45. [Google Scholar] [CrossRef] [PubMed]
  22. Carson, N.; Leach, L.; Murphy, K.J. A re-examination of Montreal Cognitive Assessment (MoCA) cutoff scores. Int. J. Geriatr. Psychiatry 2018, 33, 379–388. [Google Scholar] [CrossRef]
  23. Bradshaw, A.C.; Georges, J. Anti-Amyloid Therapies for Alzheimer’s Disease: An Alzheimer Europe Position Paper and Call to Action. J. Prev. Alzheimer’s Dis. 2024, 11, 265–273. [Google Scholar] [CrossRef]
  24. Antonioni, A.; Raho, E.M.; Di Lorenzo, F.; Manzoli, L.; Flacco, M.E.; Koch, G. Blood phosphorylated Tau217 distinguishes amyloid-positive from amyloid-negative subjects in the Alzheimer’s disease continuum. A systematic review and meta-analysis. J. Neurol. 2025, 272, 252. [Google Scholar] [CrossRef]
  25. Padovani, A.; Benussi, A.; Cantoni, V.; Dell’Era, V.; Cotelli, M.S.; Caratozzolo, S.; Turrone, R.; Rozzini, L.; Alberici, A.; Altomare, D.; et al. Diagnosis of Mild Cognitive Impairment Due to Alzheimer’s Disease with Transcranial Magnetic Stimulation. J. Alzheimer’s Dis. 2018, 65, 221–230. [Google Scholar] [CrossRef]
  26. Al-Hussaini, I.; White, B.; Varmeziar, A.; Mehra, N.; Sanchez, M.; Lee, J.; DeGroote, N.P.; Miller, T.P.; Mitchell, C.S. An Interpretable Machine Learning Framework for Rare Disease: A Case Study to Stratify Infection Risk in Pediatric Leukemia. J. Clin. Med. 2024, 13, 1788. [Google Scholar] [CrossRef]
  27. Al-Hussaini, I.; Mitchell, C.S. SeizFt: Interpretable Machine Learning for Seizure Detection Using Wearables. Bioengineering 2023, 10, 918. [Google Scholar] [CrossRef] [PubMed]
  28. Moafmashhadi, P.; Koski, L. Limitations for interpreting failure on individual subtests of the Montreal Cognitive Assessment. J. Geriatr. Psychiatry Neurol. 2013, 26, 19–28. [Google Scholar] [CrossRef] [PubMed]
  29. Aiello, E.N.; Gramegna, C.; Esposito, A.; Gazzaniga, V.; Zago, S.; Difonzo, T.; Maddaluno, O.; Appollonio, I.; Bolognini, N. The Montreal Cognitive Assessment (MoCA): Updated norms and psychometric insights into adaptive testing from healthy individuals in Northern Italy. Aging Clin. Exp. Res. 2022, 34, 375–382. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Rey Complex Figure (CF) scoring items. The left panel shows the Rey Complex Figure that participants reproduced during the copy, immediate recall, and delayed recall phases of the test. Each numbered element corresponds to a distinct structural component of the figure, summarized in the reference table on the right. The numbered labels are included to help readers identify component features and were not present on the version shown to participants.
Figure 1. Rey Complex Figure (CF) scoring items. The left panel shows the Rey Complex Figure that participants reproduced during the copy, immediate recall, and delayed recall phases of the test. Each numbered element corresponds to a distinct structural component of the figure, summarized in the reference table on the right. The numbered labels are included to help readers identify component features and were not present on the version shown to participants.
Jcm 14 07562 g001
Figure 2. Pipeline for feature extraction and predictive analysis of CF scores. The top branch represents the statistical analysis pipeline, which included significance testing via t-tests and ANOVA tests using the raw data and subsequent linear regression to isolate the primary demographic variables of interest (education or sex or age). The bottom branch illustrates the machine learning pipeline, which identifies the most important CF or MoCA features to assess the isolated effect of either education, sex, or age.
Figure 2. Pipeline for feature extraction and predictive analysis of CF scores. The top branch represents the statistical analysis pipeline, which included significance testing via t-tests and ANOVA tests using the raw data and subsequent linear regression to isolate the primary demographic variables of interest (education or sex or age). The bottom branch illustrates the machine learning pipeline, which identifies the most important CF or MoCA features to assess the isolated effect of either education, sex, or age.
Jcm 14 07562 g002
Figure 3. Correlation Matrix of Demographic Target Relationships. The correlation matrix shows near-zero associations: ρ (age, education) = 0.03, ρ (age, sex) = −0.05, and ρ (sex, education) = 0.01.
Figure 3. Correlation Matrix of Demographic Target Relationships. The correlation matrix shows near-zero associations: ρ (age, education) = 0.03, ρ (age, sex) = −0.05, and ρ (sex, education) = 0.01.
Jcm 14 07562 g003
Figure 4. Classifier Performance by Target and Metric. Panels (ac) show the performance of three classifiers—SVM (a), Logistic Regression (b), and Random Forest (c)—in predicting demographic targets. Each panel displays classifier performance across Accuracy, Precision, and Recall for three target variables: Education (Ed), Sex, and Age. Logistic Regression (b) achieved the highest accuracy for predicting both Education and Age, while Random Forest (c) outperformed other classifiers in predicting Sex, showing both the highest accuracy and a marked increase in precision.
Figure 4. Classifier Performance by Target and Metric. Panels (ac) show the performance of three classifiers—SVM (a), Logistic Regression (b), and Random Forest (c)—in predicting demographic targets. Each panel displays classifier performance across Accuracy, Precision, and Recall for three target variables: Education (Ed), Sex, and Age. Logistic Regression (b) achieved the highest accuracy for predicting both Education and Age, while Random Forest (c) outperformed other classifiers in predicting Sex, showing both the highest accuracy and a marked increase in precision.
Jcm 14 07562 g004
Figure 5. Overlap of Top 15 Predictive Features Across Classification Targets. The Venn diagram illustrates the overlap of the top 15 features contributing to the classification of Education, Sex, and Age. Each circle represents the set of features most important for a given demographic classification task, with overlapping regions indicating features shared across tasks. Features placed in the center are predictive across all three targets, while features located in the pairwise overlaps are shared between two demographic targets. The external legend provides color-coded annotations for individual targets (Education, Sex, Age) as well as their intersections (Education and Sex, Education and Age, Sex and Age, and all three combined). Key for abbreviations in this figure: Imm_1 = Imm_Vertical_Cross, Imm_6 = Imm_Small_Rectangle, Imm_8 = Imm_Four_Parallel_Lines, Imm_11 = Imm_Circle_with_Three_Dots, Imm_17 = Imm_Horizontal_Cross, Imm_18 = Imm_Square_attached_to_Large_Rectangle, Del_9 = Del_Small_Triangle_above_Large_Rectangle, Del_12 = Del_Five_Parallel_Lines, Del_18 = Del_Square_attached_to_Large_Rectangle.
Figure 5. Overlap of Top 15 Predictive Features Across Classification Targets. The Venn diagram illustrates the overlap of the top 15 features contributing to the classification of Education, Sex, and Age. Each circle represents the set of features most important for a given demographic classification task, with overlapping regions indicating features shared across tasks. Features placed in the center are predictive across all three targets, while features located in the pairwise overlaps are shared between two demographic targets. The external legend provides color-coded annotations for individual targets (Education, Sex, Age) as well as their intersections (Education and Sex, Education and Age, Sex and Age, and all three combined). Key for abbreviations in this figure: Imm_1 = Imm_Vertical_Cross, Imm_6 = Imm_Small_Rectangle, Imm_8 = Imm_Four_Parallel_Lines, Imm_11 = Imm_Circle_with_Three_Dots, Imm_17 = Imm_Horizontal_Cross, Imm_18 = Imm_Square_attached_to_Large_Rectangle, Del_9 = Del_Small_Triangle_above_Large_Rectangle, Del_12 = Del_Five_Parallel_Lines, Del_18 = Del_Square_attached_to_Large_Rectangle.
Jcm 14 07562 g005
Table 1. Copy, immediate recall, and delayed recall subscores with coded CF component identifiers. The table lists the conversion of Rey Complex Figure (CF) subscores across the three test phases: copy, immediate recall, and delayed recall. Numbers in parentheses correspond to the coded component identifiers shown in Figure 1. In the dataset and results, each CF subtest is denoted as X_, where X is replaced by Copy_, Imm_, or Del_, referring to the respective CF subtests.
Table 1. Copy, immediate recall, and delayed recall subscores with coded CF component identifiers. The table lists the conversion of Rey Complex Figure (CF) subscores across the three test phases: copy, immediate recall, and delayed recall. Numbers in parentheses correspond to the coded component identifiers shown in Figure 1. In the dataset and results, each CF subtest is denoted as X_, where X is replaced by Copy_, Imm_, or Del_, referring to the respective CF subtests.
Coded Variable X (Copy_, Imm_, Del_)Actual CF Subscore
X_1Vertical Cross
X_2Large Rectangle
X_3Diagonal Cross
X_4Horizontal Midline of Large Rectangle (2)
X_5Vertical Midline of Large Rectangle (2)
X_6Small Rectangle
X_7Small Horizontal Line above Small Rectangle (6)
X_8Four Parallel Lines
X_9Small Triangle above Large Rectangle (2)
X_10Small Vertical Line within Large Rectangle (2)
X_11Circle with Three Dots
X_12Five Parallel Lines
X_13Sides of Large Triangle attached to Large Rectangle
X_14Diamond
X_15Vertical Line within Sides of Large Triangle (13)
X_16Horizontal Line within Sides of Large Triangle (13)
X_17Horizontal Cross
X_18Square attached to Large Rectangle (2)
Table 2. VIF and tolerance values for demographic variables.
Table 2. VIF and tolerance values for demographic variables.
VariableVIFTolerance
Age1.1420.876
Sex1.1350.881
Education1.1100.901
Table 3. Demographic features and sample sizes.
Table 3. Demographic features and sample sizes.
Demographic FeatureSample Size (N)% of Study
Male27129.27
Female65570.73
White85091.79
Black677.24
Asian90.97
Age 1 (45–60 years)32735.31
Age 2 (60–67 years)32635.21
Age 3 (67–80 years)27329.48
Education 1 (10–16 years)15216.41
Education 2 (16–18 years)36739.63
Education 3 (18–22 years)40743.95
Total926
Table 4. Descriptive statistics of CF scores, subtest scores, and subtest drawing times (copy, immediate recall, and delayed recall) across demographic groups. The table presents the mean standard deviation for Copy Sum, Immediate Sum, Delayed Sum, and MoCA scores, along with average completion times (in seconds) for the copy, immediate recall, and delayed recall tasks. Data are stratified by sex, race, age group, and education level.
Table 4. Descriptive statistics of CF scores, subtest scores, and subtest drawing times (copy, immediate recall, and delayed recall) across demographic groups. The table presents the mean standard deviation for Copy Sum, Immediate Sum, Delayed Sum, and MoCA scores, along with average completion times (in seconds) for the copy, immediate recall, and delayed recall tasks. Data are stratified by sex, race, age group, and education level.
Demographic GroupCopy SumImmediate SumDelayed SumMoCACopy Time (s)Immediate Time (s)Delay Time (s)
Male32.45 ± 3.0818.70 ± 6.5817.67 ± 6.4826.79 ± 1.69164.48129.79105.58
Female32.21 ± 3.2217.91 ± 6.0516.78 ± 6.3027.31 ± 1.78160.00129.69103.14
Age 132.51 ± 3.1219.34 ± 6.1918.10 ± 6.3427.48 ± 1.77166.28135.64106.66
Age 232.23 ± 3.2218.00 ± 6.0616.99 ± 6.1927.01 ± 1.77156.53125.61102.65
Age 332.08 ± 3.2116.89 ± 6.2015.84 ± 6.3926.95 ± 1.73161.06127.53101.93
Education 131.43 ± 3.4216.26 ± 6.5315.47 ± 6.9526.55 ± 1.73161.53128.22102.30
Education 232.42 ± 3.0818.49 ± 6.1517.40 ± 6.3027.16 ± 1.77159.32129.29105.14
Education 332.48 ± 3.1418.54 ± 6.0417.30 ± 6.1127.39 ± 1.74163.02130.67103.27
Table 5. p-values from statistical tests assessing the relationship between demographic variables (education, sex, age) and CF subscores (Copy_Sum, Imm_Sum, Del_Sum, and MoCA) in the raw data. The raw data p-values show several statistically significant relationships (p < 0.05), particularly between education and all CF scores, as well as between age and immediate/delayed recall and MoCA.
Table 5. p-values from statistical tests assessing the relationship between demographic variables (education, sex, age) and CF subscores (Copy_Sum, Imm_Sum, Del_Sum, and MoCA) in the raw data. The raw data p-values show several statistically significant relationships (p < 0.05), particularly between education and all CF scores, as well as between age and immediate/delayed recall and MoCA.
Independent VariableDependent VariableRaw Data
p-Value
EducationCopy_Sum0.0027
Imm_Sum0.0006
Del_Sum0.0097
MoCA9.9 × 10−7
SexCopy_Sum0.2671
Imm_Sum0.0471
Del_Sum0.0331
MoCA0.0001
AgeCopy_Sum0.0747
Imm_Sum4.85 × 10−7
Del_Sum6.49 × 10−6
MoCA0.0001
Table 6. The table lists the ten highest-ranked features from random forest classifiers trained to predict education, sex, and age. Feature ranks were derived from model-specific importance scores, where lower rank values indicate greater importance. The “Sum of Ranks” column aggregates feature ranks across all three tasks to highlight features consistently important for prediction. Notably, Copy_time, Imm_time, and MoCA emerged as the three most universally influential features across demographic target classification.
Table 6. The table lists the ten highest-ranked features from random forest classifiers trained to predict education, sex, and age. Feature ranks were derived from model-specific importance scores, where lower rank values indicate greater importance. The “Sum of Ranks” column aggregates feature ranks across all three tasks to highlight features consistently important for prediction. Notably, Copy_time, Imm_time, and MoCA emerged as the three most universally influential features across demographic target classification.
FeatureRank in EdRank in SexRank in AgeOverall Rank
Copy_time2221
Imm_time1512
MoCA4143
Del_time5634
Imm_Sum3855
Del_Sum7466
MoCA_EF8387
Copy_Sum6778
MoCA_Mem91099
Del_1715131010
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lee, A.J.B.; Zhao, B.; Lah, J.J.; John, S.E.; Loring, D.W.; Mitchell, C.S. Education, Sex, and Age Shape Rey Complex Figure Performance in Cognitively Normal Adults: An Interpretable Machine Learning Study. J. Clin. Med. 2025, 14, 7562. https://doi.org/10.3390/jcm14217562

AMA Style

Lee AJB, Zhao B, Lah JJ, John SE, Loring DW, Mitchell CS. Education, Sex, and Age Shape Rey Complex Figure Performance in Cognitively Normal Adults: An Interpretable Machine Learning Study. Journal of Clinical Medicine. 2025; 14(21):7562. https://doi.org/10.3390/jcm14217562

Chicago/Turabian Style

Lee, Albert J. B., Benjamin Zhao, James J. Lah, Samantha E. John, David W. Loring, and Cassie S. Mitchell. 2025. "Education, Sex, and Age Shape Rey Complex Figure Performance in Cognitively Normal Adults: An Interpretable Machine Learning Study" Journal of Clinical Medicine 14, no. 21: 7562. https://doi.org/10.3390/jcm14217562

APA Style

Lee, A. J. B., Zhao, B., Lah, J. J., John, S. E., Loring, D. W., & Mitchell, C. S. (2025). Education, Sex, and Age Shape Rey Complex Figure Performance in Cognitively Normal Adults: An Interpretable Machine Learning Study. Journal of Clinical Medicine, 14(21), 7562. https://doi.org/10.3390/jcm14217562

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop