3. Results
Demographic distribution is shown in Table 1. In line with the initial norms, the independent-samples t test did not support stratifying the norms by gender [CWIT1 t(98) = −0.210, p = 0.834; CWIT2 t(98) = −2.051, p = 0.043; CWIT3 t(98) = −0.088, p = 0.930; CWIT4 t(98) = 0.644, p = 0.521; TMT1 t(98) = 1.626, p = 0.107; TMT2 t(98) = 0.673, p = 0.503; TMT3 t(98) = 1.505, p = 0.136; TMT4 t(98) = −1.155, p = 0.251]. Age, in contrast, was associated with both tests, and their norms were therefore stratified by age. Education was not correlated with CWIT D-KEFS performance, but it was significantly related to TMT D-KEFS performance, specifically in condition 4, which is the most complex and the most indicative of executive dysfunction; hence, the TMT norms were stratified by both age and education (Table 3 and Table 4). According to the results, CWIT errors (total errors, corrected errors, and uncorrected errors) did not differ by age or years of education; however, in agreement with the initial study of Delis et al. [41], we calculated separate norms for the three age groups of our study in order to provide a more concrete description of the error distribution in health experts.
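The gender comparison above can be illustrated with a minimal sketch of a pooled-variance independent-samples t statistic. The function name `independent_t` and the sample values are hypothetical and are not taken from the study; the sketch also assumes equal variances in the two groups, which the text does not state. It only shows how the statistic and its degrees of freedom (n1 + n2 − 2, which gives 98 when the two groups total 100 participants) are obtained.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical completion times in seconds (illustration only, not study data).
men = [28.0, 31.5, 30.2, 27.8, 33.1, 29.4]
women = [29.1, 30.8, 28.7, 32.0, 27.5, 30.3]
t_stat, df = independent_t(men, women)
print(f"t({df}) = {t_stat:.3f}")
```

The p value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch dependency-free.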
Norms were established using percentile scores (Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12 and Table 13). Specifically, we calculated the raw mean scores and their standard deviations, and then converted the raw scores into percentile scores. Cut-off scores were then calculated to identify performance below the 10th percentile, which is traditionally regarded as low performance [67]. Scores above the 95th percentile were regarded as superior performance.
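As a rough illustration of the percentile-based cut-offs described above, the following sketch converts a raw score into a performance percentile within a normative group and labels it with the 10th and 95th percentile thresholds. The function names, the `lower_is_better` flag, and the sample values are assumptions made for this example; in particular, the sketch assumes that for timed conditions a shorter completion time counts as better performance, and the percentile definition used by the authors may differ.

```python
def performance_percentile(norm_scores, score, lower_is_better=True):
    """Percentage of the normative group performing no better than `score`."""
    if lower_is_better:
        # Timed tasks: an equal or longer time is equal or worse performance.
        not_better = sum(1 for s in norm_scores if s >= score)
    else:
        not_better = sum(1 for s in norm_scores if s <= score)
    return 100.0 * not_better / len(norm_scores)


def classify(norm_scores, score, lower_is_better=True):
    """Label a raw score using the 10th and 95th percentile cut-offs."""
    pct = performance_percentile(norm_scores, score, lower_is_better)
    if pct < 10.0:
        return "low"
    if pct > 95.0:
        return "superior"
    return "average"


# Hypothetical normative completion times in seconds (not study data).
norm_times = [float(t) for t in range(20, 40)]
for raw in (20.0, 30.0, 39.0):
    print(raw, performance_percentile(norm_times, raw), classify(norm_times, raw))
```

In practice the normative group passed to these functions would be the age band (and, for the TMT, the education band) to which the examinee belongs.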
The CWIT norms for the four conditions, namely color naming (p > 0.05), word reading (p < 0.05), inhibition (p < 0.01), and inhibition/switching (p < 0.01), were stratified by age but not by gender or education, because no differences were observed between men and women or between the three educational classes. In contrast, TMT D-KEFS performance was strongly dependent on age and education according to the Pearson test; therefore, separate age- and education-related norms were derived for each condition of the test: visual scanning (p > 0.05), number sequencing (p < 0.05), letter sequencing (p < 0.05), letter–number switching (p < 0.01) and motor speed (p < 0.01).
4. Discussion
To our knowledge, apart from the original American age-adjusted norms presented in the D-KEFS manual by Delis et al. (2001) [41], no norms are available for clinicians and researchers in Greece. Although some earlier versions of the Stroop test and the TMT have been validated in the adult Greek population, this is the first study to measure the effect of demographic variables on CWIT D-KEFS and TMT D-KEFS performance and to provide normative data for a Greek sample. Moreover, there is a gap in the literature on normative data for neuropsychological tests in the age range from early to middle adulthood, where no particular focus has been given to socio-demographic effects on examinees’ performance. The current study aims to address this gap.
The CWIT D-KEFS was created to improve on the Stroop test by including an inhibition/switching trial, which was designed to be more difficult than the inhibition trial in terms of both completion time and number of errors. Overall, the results of the current study suggest that the CWIT D-KEFS can be regarded as a measure of processing speed/executive functioning; however, until now there have been no studies comparing the CWIT with the older Stroop test. The results of the current study showed that age was a predictive factor of CWIT D-KEFS performance across its conditions, in agreement with previous studies by Lippa and Davis (2010) [60], Zhao et al. (2020) [42], and Espenes et al. (2024) [68]. In fact, the three age classes of the sample differed in their CWIT D-KEFS performance mainly in the last two conditions. Based on the Pearson test, age was significantly correlated with performance in condition 3 (r = 0.489, n = 100, p < 0.001), condition 4 (r = 0.514, n = 100, p < 0.001), and condition 2 (r = 0.219, n = 100, p < 0.005). Additionally, regarding condition 2, it is worth mentioning that the average and upper percentile scores of the three age classes were nearly identical, with only subtle differences: those between 30 and 39 years old performed slightly better, followed by those between 20 and 29 and those between 40 and 49 years old. However, the slightly lower performance of the 20–29 group compared with the 30–39 group does not necessarily mean lower executive functioning, because various factors, such as fatigue, low motivation, or reduced attention, could affect performance [69].
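For readers who want to reproduce this kind of analysis, the sketch below computes a Pearson correlation coefficient and converts it into the t statistic (with n − 2 degrees of freedom) that is commonly used to test whether a correlation differs from zero. The function names and the sample values are hypothetical and do not come from the study; the conversion formula t = r·√((n − 2)/(1 − r²)) is the standard one, but the text does not state how the authors obtained their p values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def r_to_t(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical ages (years) and completion times (seconds), not study data.
ages = [22.0, 27.0, 31.0, 36.0, 41.0, 48.0]
times = [48.0, 52.0, 50.0, 57.0, 61.0, 66.0]
r = pearson_r(ages, times)
print(f"r = {r:.3f}, t({len(ages) - 2}) = {r_to_t(r, len(ages)):.3f}")
```

Calling `r_to_t` with a reported coefficient and sample size gives the corresponding t statistic, which can then be compared against a t distribution with n − 2 degrees of freedom.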
Kurniadi et al. (2021) [20] found a moderate correlation between education and CWIT D-KEFS performance in condition 3 (r = 0.212, n = 101, p < 0.005) and condition 4 (r = 0.319, n = 100, p < 0.001), whereas Karr et al. (2018; 2019) [70,71] found that performance in the D-KEFS was proportional to participants’ level of education, which is in contrast to the results of the current study. Finally, regarding gender, no statistically significant correlation with performance was detected, which is consistent with the study of Cutler et al. (2023) [24], in which gender did not play a significant role in the performance of the sample across all test conditions. Furthermore, in the longitudinal study by Adólfsdóttir et al. (2017) [18], a marginal correlation in performance was detected: for the inhibition condition, they predicted that completion time would increase per year by 0.61% for men and by 0.62% for women, while for the inhibition/switching condition, the rates would be 0.72% for men and 0.70% for women.
The primary variables of the CWIT D-KEFS were completion time, errors, and the three contrast measures mentioned previously. Therefore, norms at this stage of our study were calculated only for these variables, which were treated as primary scores. Norms for the optional scores, which include inhibition/switching versus color naming and inhibition/switching versus word reading using their scaled scores, were not included in the current study. It is worth mentioning that in the first two conditions total errors were almost zero for all age groups without exception. In condition 3, a limited number of errors were observed, most of which were self-corrected. This finding is probably related to the increase in completion time relative to the previous baseline conditions, as correcting errors costs the participant time. Finally, in condition 4, the number of errors was twice as high as in condition 3, while the means of uncorrected errors and self-corrections were almost equal, regardless of age. This finding agrees with the study of Lippa and Davis (2010) [60], which found that in people aged 14 to 69 years the mean number of errors was greater in condition 4 than in condition 3. However, it contrasts with the study of Barnett et al. (2022) [61], who argue that condition 3 functions as a practice test for condition 4, which requires inhibitory control and switching. In general, as the last two conditions are considered more complex, participants appeared to spend more time completing them in an effort to avoid mistakes.
The inhibition/switching trial was designed to be the most difficult in terms of completion time and number of errors. Of particular interest is the study by Lippa and Davis (2010) [60], which investigated the complexity of the fourth condition compared with the third in an adult population with an average of 14.8 years of schooling and a diagnosis of either neurological or psychiatric pathology. Among people between 14 and 69 years old, the mean number of errors in the inhibition/switching condition was greater than in the inhibition condition, whereas in those between 8 and 13 years old and between 70 and 89 years old, the mean number of errors was lower than or equal to that in the inhibition condition [60]. Moreover, the longitudinal study of Adólfsdóttir et al. (2017) [18] demonstrated age-related changes in performance on inhibition and on combined inhibition and switching in middle-aged and older adults; the lower performance in these populations persisted even after controlling for baseline measures of processing speed, gender, and years of schooling. Finally, norms for the three contrast variables (inhibition versus color naming, inhibition/switching versus combined naming plus reading, and inhibition/switching versus inhibition) were around 0 across all age groups, which is in line with the American norms.
A previous normative data study for the Stroop version known as Trenerry’s Stroop Neuropsychological Screening Test (SNST) was conducted by Zalonis et al. (2009) [44] in a Greek adult population aged 18 to 84 years with 6–18 years of schooling. Contrary to our results, their findings suggested that both age and education significantly contributed to SNST performance. However, no direct comparisons can be made between our findings and those of Zalonis et al. (2009) [44], because they included a broader sample in terms of age and education, which profoundly influenced examinees’ performance. Additionally, among those aged 20–29 years in the sample of this study, no one had fewer than 13 years of education, which seems to be representative of the educational level of young adults in Greece according to the official laws of the government. Nevertheless, according to their findings, age appeared to be a stronger predictor of SNST performance than education. Their study calculated four variables: the color task, or the time needed to read the 112 items in the color–word task; the number of errors; the number of self-corrections; and the interference score, which was calculated by subtracting the number of errors from the total number of items completed in 120 s. In contrast, our study followed a different scoring method, measuring completion time across the four conditions.
Regarding the effect of demographics on the TMT D-KEFS, it is worth mentioning that most studies used the traditional version of the test, so although useful comparisons can be made, this difference between versions could be a limitation [51,72]. More specifically, previous studies that used the traditional version of the test in different populations showed that TMT performance was related to age and years of schooling [48,49,51,53,73], which is also confirmed by the findings of the current study. With the exception of the study of Fine et al. (2011) [47], the studies that used the traditional version of the test showed that age and level of education were significantly related to TMT execution time [48,49,53,73]. A possible explanation is that motor speed typically declines gradually with age [52]; in addition, Cavaco et al. (2013) [51] found that performance improves as the level of education increases. Finally, Fine et al. (2011) [47] showed that the effect of higher educational level is stronger mainly in the letter sequencing and number–letter switching conditions, which is also confirmed by the results of the current study. Although many studies found that age has a greater impact on overall test performance than education [52], in our study we found that, across the three age classes, those with 16+ years of schooling performed better than their peers with a lower level of education, indicating better motor speed in those with a higher level of education, which is also confirmed by some previous studies [53].
Regarding gender, no differences in TMT D-KEFS performance were found in the current study. In the literature, however, previous studies observed differences between men and women. Specifically, Cavaco et al. (2013) [51] found differences in the number sequencing condition, while Cangoz et al. (2009) [72] reported a relationship between gender and TMT conditions. However, the study of Cangoz et al. (2009) [72] was conducted with people aged over 50, so no clear comparisons can be made. The heterogeneity of results regarding gender may be attributed to uncontrolled or unmeasured factors, such as sample characteristics, differences in educational level between men and women, and/or cultural differences [74].
Finally, according to the findings mentioned above, age and educational level may be predictive factors of TMT D-KEFS performance, because processing speed declines with age [51,53] and increases with years of schooling [75]. These results are supported by theories of cognitive reserve, which stress the protective role of education, among other factors, against cognitive decline, even in aspects of fluid intelligence [76,77].
Comparing the results of the current study with the first normative data study in a Greek adult population by Zalonis et al. (2009) [44], we found that they followed a different age clustering, dividing their sample into 12 groups (16–19, 20–29, 20–40, 25–45, 30–50, 35–55, 40–60, 45–65, 50–70, 55–75, 60–80, 65–85), and used three educational levels that differed from ours according to the number of years of schooling (<9, 10–12, 13<). At this point, it must be noted that, owing to the absence of participants in the 20–29 age group with an educational level of less than 10–12 years of schooling, this category was not included in the normative tables. However, it is worth noting that the normative scores for the number sequencing and switching conditions, which are common to the two TMT versions, were almost the same in the age class with 10–12 years of schooling. Finally, given that the age classes of the two studies were very different, any comparison between them is imprecise and insufficient.
When comparing the Greek norms of the CWIT with the American ones, which are also stratified by age, we found that our mean scores fell within the same range in terms of scaled scores, equivalent to the mean American scaled score of 10 or 1 SD above, which is acceptable according to what was previously mentioned. Although Greek adults had slightly lower completion times in some conditions compared with Americans, this difference can be attributed to higher educational levels across all age classes. Turning to the TMT D-KEFS, the Greek norms were equivalent to the American ones; however, in some conditions Greek adults scored 1 SD higher than the American sample, which can also be explained by more years of schooling. Finally, it can be assumed that the CWIT and TMT norms for the Greek adult population are equivalent to the original norms calculated by Delis et al. (2001) [41]; hence, they can be used by health professionals and researchers.