Higher-Level Executive Functions in Healthy Elderly and Mild Cognitive Impairment: A Systematic Review

Mild Cognitive Impairment (MCI) is a clinical syndrome characterized by a moderate decline in one or more cognitive functions with preserved autonomy in daily life activities. MCI presents with cognitive, behavioral, and psychological symptoms. The executive functions (EFs) are key functions for everyday life and for physical and mental health, and they allow behavior to adapt to external changes. Higher-level executive functions develop from basic EFs (inhibition, working memory, attentional control, and cognitive flexibility). They are planning, reasoning, problem solving, and fluid intelligence (Gf). This systematic review investigates the relationship between higher-level executive functions and healthy and pathological aging, assuming the role of executive function deficits as a predictor of cognitive decline. The systematic review was conducted according to the PRISMA Statement. A total of 73 studies were identified. The results indicate that 65.8% of the studies confirm significant EF alterations in MCI (56.8% planning, 50% reasoning, 100% problem solving, 71.4% fluid intelligence). These results seem to highlight a higher prevalence of higher-level executive function deficits in elderly people with MCI than in healthy elderly people.


Introduction
Mild Cognitive Impairment (MCI) is a syndrome characterized by a clinical profile intermediate between healthy aging and pathological aging. Individuals with MCI do not meet the diagnostic criteria for dementia, but their cognitive functioning is worse than that seen in normal, physiological aging [1]. The most common onset symptom is memory impairment, as in Alzheimer's disease (AD), followed by other impairments [1]. However, cognitive deficits can be detected in cognitive functions other than memory. Petersen et al. [2] divided MCI into four groups based on the number and the type of impaired functions.
The most studied type is amnesic MCI, in which the subject has a memory disorder that can involve a single domain (aMCI) or multiple domains (aMCI-md) [3]. In the latter case, there are other impairments in addition to the memory deficits. On the other hand, if the subject does not have a memory deficit, we speak of non-amnesic MCI, which can involve a single domain (naMCI) or multiple domains (naMCI-md), depending on the functions involved [2]. In 8-12% of cases, MCI evolves into Alzheimer's disease. Hence, studying this syndrome is fundamental to predicting AD progression [2].
People with aMCI exhibit a reduced thickness of the entorhinal cortex, fusiform gyrus, and hippocampus compared to naMCI and healthy elderly, and a reduced thickness of the cingulate gyrus and amygdala compared to healthy elderly. A decreased thickness of the precuneus is present in both MCI types [4]. These alterations are similar to those seen in Alzheimer's disease, thus confirming that MCI is a transitional phase between healthy and pathological aging from an anatomical point of view [5]. Patients with Mild Cognitive Impairment show behavioral and psychological symptoms, in addition to cognitive impairments involving memory and executive function deficits [6].
evaluate how higher-level EFs could affect the everyday life of healthy and pathological elderly, and how they affect independence in instrumental activities of daily living (IADL). IADL is one of the criteria for the diagnosis of Mild Cognitive Impairment [1], which is useful to distinguish it from dementia and could lead to an early diagnosis. This systematic review aims to investigate the relationship between higher-level executive functions and healthy and pathological aging, assuming the role of executive function deficits as a predictor of cognitive decline. An additional goal is to establish which tests best discriminate healthy elderly from those with Mild Cognitive Impairment. Moreover, the present review aims to evaluate whether higher-level executive functions are compromised in both MCI and healthy elderly, but with a worse outcome in pathological aging.

Materials and Methods
The review process was conducted according to the PRISMA Statement [24,25].

Research Strategies
A systematic search of the international literature was conducted in the following electronic databases, selecting articles published in peer-reviewed journals: PsycINFO, Scopus, MEDLINE, and Web of Science. The last search was conducted on 13 July 2021.
A list of keywords and MeSH terms was generated to identify studies: ("Mild Cognitive Impairment" AND "executive function*"); ("Mild Cognitive Impairment" AND "planning"); ("Mild Cognitive Impairment" AND "reasoning"); ("Mild Cognitive Impairment" AND "problem solving"); ("Mild Cognitive Impairment" AND "fluid intelligence"). The search was limited to academic publications with English or Italian full text, without restrictions regarding gender and ethnicity. Additionally, the bibliographical references of retrieved papers, reviews, and meta-analyses were screened manually to identify further relevant studies for inclusion in the review. The number of selected articles is shown in Table 1.
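The five search strings listed above share one pattern: the base term combined by AND with one executive-function term. As a minimal sketch only, the pattern can be generated programmatically. The names `BASE_TERM`, `FUNCTION_TERMS`, and `build_queries` are hypothetical and are not taken from the review, and each database has its own query syntax, so the strings may need adapting before use.

```python
# Illustrative sketch: rebuilds the five boolean search strings reported in the text.
# All identifiers are hypothetical; they are not part of the review's procedure.
BASE_TERM = "Mild Cognitive Impairment"
FUNCTION_TERMS = [
    "executive function*",
    "planning",
    "reasoning",
    "problem solving",
    "fluid intelligence",
]


def build_queries(base, terms):
    """Return one '("base" AND "term")' string per executive-function term."""
    return [f'("{base}" AND "{term}")' for term in terms]


queries = build_queries(BASE_TERM, FUNCTION_TERMS)
for query in queries:
    print(query)
```

Keeping the terms in a list makes it easier to run the same set of queries in every database and to document them in one place.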

Eligibility Criteria
A total of 11,269 articles were obtained from the search procedure. In the first step, 5337 duplicates were eliminated using the Mendeley software. Then, the list of potential articles produced by the systematic search was reviewed. Reading the titles and abstracts led to a first exclusion of 5198 non-pertinent studies. A further selection was made by reading the full text (see Figure 1).
The inclusion criteria were: adult population (age equal to or higher than 50 years), diagnosis of Mild Cognitive Impairment; healthy subjects; use of higher-level executive functions measurements.
The exclusion criteria were: participants with medical conditions that could potentially influence the investigated relationship (for example, metabolic disorders; cardiovascular disorders; chronic disorders; cancer); participants diagnosed with dementia (Alzheimer Disease; Parkinson's Disease; Vascular Dementia; Frontotemporal Dementia; Dementia with Lewy Bodies; Huntington's Disease), psychiatric disorders, neurological disorders, strokes; use of drugs that affect the nervous system and traumatic brain injury; methodological flaws; lack of essential data; assessment made by caregivers; MCI participants included in healthy elderly or AD groups; reviews, dissertations, editorials, comments, replies; trials; age < 50 years and animal models.
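The screening counts reported in this section can be chained into a simple tally. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes that no records were added after deduplication (for example, through manual reference screening) and that the 73 studies reported as included are the only records retained after full-text reading. The variable names are hypothetical, and the number of full-text exclusions is derived here rather than reported in the text.

```python
# Illustrative tally of the screening steps; variable names are hypothetical.
records_identified = 11_269          # reported: articles obtained from the search
duplicates_removed = 5_337           # reported: duplicates eliminated with Mendeley
excluded_on_title_abstract = 5_198   # reported: excluded after title/abstract reading
studies_included = 73                # reported: studies included in the review

# Records left after each step.
after_deduplication = records_identified - duplicates_removed
full_text_assessed = after_deduplication - excluded_on_title_abstract

# Derived value (assumption: no records were added at later stages).
excluded_on_full_text = full_text_assessed - studies_included

print("After deduplication:", after_deduplication)
print("Assessed in full text:", full_text_assessed)
print("Excluded on full text (derived):", excluded_on_full_text)
```

A tally of this kind is mainly useful as a consistency check against the flow chart in Figure 1.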

Data Collection
According to the PICOS approach [24], the following information was extracted from each study: authors and year of publication; characteristics of participants (including age, gender, Mini Mental State Examination-MMSE score); diagnostic criteria; experimental paradigm; results.
The extracted data are included in Table 2.

Quality Assessment
A quality assessment was carried out to analyze the eligibility of each article and to reduce the risk of bias. The analysis used five criteria to screen each study selected for the systematic review: sampling bias, executive function measurements, diagnostic criteria, selective reporting bias, and methodological bias. Each criterion is scored from 1 (low risk) to 3 (high risk). The overall quality was calculated by adding all the scores, giving a global score ranging from 5 to 15. A study was considered at low risk of bias if the score was 5, while a score in the 6-10 interval was considered an indicator of a moderate risk of bias. The quality assessment was subdivided into planning, reasoning, fluid intelligence, and problem-solving measurements. The risk of bias is reported in Figure 2.
Among the studies measuring planning, no study reports a moderate risk of bias in more than two items. A large percentage of the studies adopted valid and reliable tools to measure planning and included an appropriate sample size. Moreover, most studies were adequately controlled for confounding variables. The highest risk of bias was in "EFs measurements" and the lowest in "methodological bias". In the overall bias, the score ranged from 5 to 7 for every article included.
Figure 4 shows the percentage of articles adopting reasoning tests that fulfill each quality criterion of the risk of bias assessment. On average, the quality of the studies was good, since 28 out of 32 studies (87.5%) exhibited low scores on the risk of bias. The high percentage of studies with low or no risk of bias increases the validity of this systematic review. Although four studies (12.5%) showed moderate scores, no study reports a moderate risk of bias in more than two items. A large percentage of the studies used valid and reliable tools to measure reasoning and included an appropriate sample size. Moreover, most studies were adequately controlled for confounding variables. The highest risk of bias was in "methodological bias" and the lowest in "sampling bias", "EFs measurements", and "diagnostic criteria". In the overall bias, the score ranged from 5 to 7 for every article included.
Figure 5 shows the percentage of articles adopting a problem-solving task that fulfill each quality criterion of the risk of bias assessment. On average, the quality of the studies was good, since six out of six studies (100%) exhibited low scores on the risk of bias. The high percentage of studies with low or no risk of bias increases the validity of this systematic review. No study reports a moderate risk of bias in more than one item. A large percentage of the studies used valid and reliable tools to measure problem solving and included an appropriate sample size. Moreover, most studies were adequately controlled for confounding variables. The highest risk of bias was in "EFs measurements" and the lowest in "sampling bias", "methodological bias", and "diagnostic criteria". In the overall bias, the score ranged from 5 to 6 for every article included.
Figure 6 shows the percentage of articles adopting fluid intelligence measurements that fulfill each quality criterion of the risk of bias assessment. On average, the quality of the studies was good, since six out of seven studies (85.7%) exhibited low scores on the risk of bias. The high percentage of studies with low or no risk of bias increases the validity of this systematic review. Although one study (14.3%) showed moderate scores, no study reports a moderate risk of bias in more than two items. A large percentage of the studies used valid and reliable tools to measure fluid intelligence and included an appropriate sample size. Moreover, most studies were adequately controlled for confounding variables. The highest risk of bias was in "methodological bias" and the lowest in "EFs measurements" and "diagnostic criteria". In the overall bias, the score ranged from 5 to 7 for every article included.
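The scoring rule described at the start of this section (five criteria, each scored from 1 to 3, summed into a global score from 5 to 15) can be sketched as follows. This is a minimal illustration, not the authors' tool. The identifiers `CRITERIA` and `overall_risk` are hypothetical, and the text only defines the "low" (score of 5) and "moderate" (6-10) bands, so labelling scores of 11-15 as "high" is an assumption made here.

```python
# Illustrative sketch of the five-criterion risk-of-bias score.
# Identifiers are hypothetical; the "high" band (11-15) is an assumption,
# because the text only defines the low (5) and moderate (6-10) bands.
CRITERIA = (
    "sampling_bias",
    "ef_measurements",
    "diagnostic_criteria",
    "selective_reporting_bias",
    "methodological_bias",
)


def overall_risk(scores):
    """Sum the five criterion scores (1-3 each) and return (total, label)."""
    missing = set(CRITERIA) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in CRITERIA:
        if scores[name] not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3")
    total = sum(scores[name] for name in CRITERIA)
    if total == 5:
        label = "low"
    elif total <= 10:
        label = "moderate"
    else:
        label = "high"  # assumed band, not stated in the text
    return total, label


# Example: a study scoring 2 on one criterion and 1 on the others.
example = dict.fromkeys(CRITERIA, 1)
example["ef_measurements"] = 2
print(overall_risk(example))
```

Validating the inputs before summing guards against silently scoring a study on fewer than five criteria.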

Studies Selection
The flow chart shows the number of studies identified from the databases and the number of studies examined, assessed for eligibility, and included in the review with the reasons for possible exclusions (see Figure 1). A total of 73 studies were identified.
Of the 73 selected studies, 30 analyzed planning, 31 reasoning, six problem solving, and seven fluid intelligence. Nine studies used different executive function measures.
Results will be presented in two subsections, according to the higher-level executive functions and the MCI subtype.

Planning (N = 37)
Thirty-seven studies have measured planning in healthy elderly and MCI participants with an overall sample of 3491 participants (1919 HC and 1572 MCI) with a mean age that ranges from 60.7 years [82] to 79 years [90].
Nine studies [28,36,46,54,71,73,74,92,97] of the thirteen that analyzed planning abilities with tower tests ("Tower of London", "Tower of Hanoi", and "Tower Test (D-KEFS)") highlighted a poorer performance in MCI than in healthy groups. Metzler-Baddeley et al. [74] and Berlot et al. [33] observed more rule violations during task performance in MCI, while Bharath et al. [36] reported a longer time to complete the test. De Paula et al. [46] used two versions of the "Tower of London" (designed by Portella et al. [47] and Krikorian et al. [48]) and observed a lower planning ability in MCI subjects. Rainville et al. [92] pointed out higher rule-breaking and abandonment rates in MCI. Sánchez-Benavides et al. [96] observed in subjects with Mild Cognitive Impairment higher total moves, total initiation time, total exclusion time, and total solving time, and lower total correct rates, than in healthy subjects. Garcia-Alvarez et al. [54], Lindbergh et al. [71], and Lussier et al. [73] also found poor planning in subjects with Mild Cognitive Impairment.
Four studies [89,94,100,113] that used "CLOX-1" found a lower planning capacity in Mild Cognitive Impairment samples.
Three studies used the "Zoo Map Test" [64,98,99]. Sanders et al. [98] highlighted higher total errors in MCI than in healthy controls. Junquera et al. [64] analyzed the differences between healthy subjects, aMCI, naMCI, and aMCI multiple domains: aMCI multiple domains showed lower planning than healthy elderly and aMCI single domain, while naMCI subjects showed a poorer planning ability than healthy elderly. Sanders et al. [98] observed decreased planning ability in MCI compared to healthy elderly.
Espinosa et al. [53] analyzed the differences between healthy subjects and MCI using the "Zoo Map Test" and the "Action Program Test"; in both tests, MCI had lower planning than healthy controls. Nordlund et al. [81,82] found a poorer planning ability, assessed with the "Wisconsin Card Sorting Test-Computer Version (WCST-CV)", in MCI subjects than in healthy elderly. Papp et al. [88] used the "Groton Maze Learning Test" to evaluate planning and underlined that participants with MCI exhibit more exploratory errors, more rule-break errors, and reduced differences between trial 1 and trial 2 compared with healthy subjects. Zhang et al. [114] evaluated the differences between healthy subjects and MCI with the "Trail Making Test (B-A)", the "Porteus Maze Test", and "Verbal Fluency (fruits and animals)"; in each test, subjects with Mild Cognitive Impairment showed lower planning than healthy elderly.

Reasoning
Three studies [42,72,106] performed two tasks each and observed lower reasoning ability in MCI participants in only one of them.
Seven studies used "Similarities" [62,64,66,68,70,106,109] to assess reasoning in healthy elderly and Mild Cognitive Impairment samples and reported lower reasoning performance in MCI subjects. Junquera et al. [64] analyzed the differences between healthy subjects, aMCI, naMCI, and aMCI multiple domains: both aMCI multiple domains and naMCI showed lower reasoning than healthy elderly. Four studies used "Matrix Reasoning" [41,42,44,62] to evaluate MCI and healthy subjects, and each study highlighted decreased reasoning in subjects with Mild Cognitive Impairment. In particular, Chang [41] observed a higher performance in healthy subjects than in MCI with normal awareness for memory (MCI-na), who in turn performed better than MCI with poor awareness for memory (MCI-pa). Two studies used "Raven's Coloured Progressive Matrices-RCPM" [37,78] and observed a reduced reasoning ability in participants with Mild Cognitive Impairment, as did Benavides-Varela et al. [32], who used "Raven's Progressive Matrices-RPM". Sherod et al. [105] analyzed the differences between healthy subjects and MCI with the "DRS-2 Conceptualization" and the "Cognitive Competency Test"; in both tests, MCI had lower abstraction than healthy controls. Lui et al. [72] used "ACED Money Management" and reported a reduced reasoning ability in Mild Cognitive Impairment. Moreira et al. [75] used "Proverbs" to evaluate reasoning in healthy elderly and MCI participants and observed higher abstraction ability in healthy subjects than in MCI. García et al. [55] used "Abstraction (MoCA)" and pointed out a reduced abstraction ability in MCI subjects.

Problem Solving (N = 6)
Six studies have assessed problem solving in MCI and a control group, with an overall sample composed of 344 participants (236 MCI and 108 HC) and a mean age ranging from 62.6 years [63] to 82 years [39]. Each study [35,39,51,63,88,104] used a different task to evaluate problem solving, and they all showed differences between the samples.
Beversdorf et al. [35] used the "Matchstick Problem" to evaluate visuospatial problem solving and highlighted a lower capacity to solve problems in the MCI sample. Burton et al. [39] used "Block Design" to evaluate problem-solving ability in healthy elderly, aMCI single and multiple domain, and naMCI single and multiple domain participants. They observed better performance in healthy subjects than in amnesic Mild Cognitive Impairment multiple domains and in non-amnesic Mild Cognitive Impairment single and multiple domains, but not than in aMCI single domain subjects. Dwolatzky et al. [51] used "Pictorial Puzzles (2x2)" and reported a reduced accuracy in Mild Cognitive Impairment. Jin et al. [63] evaluated problem solving with "Sudoku (Nikoli Publishing)" and highlighted a decreased accuracy in complex tasks in aMCI subjects compared to healthy subjects. Papp et al. [88] used the "Groton Maze Learning Test" to evaluate problem solving and underlined that participants with MCI have more exploratory errors, more rule-break errors, and a lower difference between trial 1 and trial 2 than healthy subjects. Sheldon et al. [104] used the "Means-Ends Problem Solving Test" and observed a reduced problem-solving ability in MCI subjects.

Fluid Intelligence (N = 7)
Seven studies have measured fluid intelligence in healthy elderly and Mild Cognitive Impairment subjects, with an overall sample of 682 participants (341 HC and 341 MCI) and a mean age ranging from 67.37 years [103] to 75.3 years [69]. Five studies used "Block Design" [45,58,66,69,112], three studies used "Matrix Reasoning" [52,103,112], and one study used "Raven Coloured Matrices" to evaluate fluid intelligence.
Two studies [58,66] did not report any significant difference between samples, one study performed multiple tests and showed conflicting results [110], while the others [45,52,69,103] reported lower performance in fluid intelligence in MCI subjects.

Non-Amnesic Mild Cognitive Impairment (N = 4)
Four studies analyzed the differences between naMCI and healthy elderly in higher-level executive functions, with an overall sample of 495 participants (360 HC and 135 naMCI) and a mean age ranging from 63.4 years [102] to 79.57 years [39]. Three of these (75%) reported a significant difference between groups. One study [64] evaluated planning in naMCI and healthy elderly, highlighting a poorer performance in the MCI sample. The most analyzed higher-level executive function in non-amnesic MCI is reasoning, which was evaluated in three studies [64,70,102], but only one of them (33.3%) [64] highlighted a lower reasoning ability in naMCI. Finally, one study [39] analyzed problem solving and observed a worse performance in naMCI compared to healthy elderly. Only one study [39] distinguished non-amnesic Mild Cognitive Impairment multiple domain from naMCI single domain, showing no difference between the two groups. No study evaluated fluid intelligence in non-amnesic Mild Cognitive Impairment, and no study analyzed naMCI without also including an amnesic Mild Cognitive Impairment sample.

Discussion
The purpose of this systematic review was to investigate the relationship between higher-level executive functions and healthy and pathological aging, assuming the role of executive functions deficits as a predictor of the general cognitive decline. Results showed that not all the studies found a prevalence of higher-level executive functions deficits in individuals with Mild Cognitive Impairment diagnosis compared to healthy elderly; however, 64.4% of the studies confirm a significant presence of alterations in MCI (56.8% planning, 50% reasoning, 100% problem solving, 71.4% fluid intelligence).
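The domain percentages quoted above can be related to study counts wherever the text states the number of studies for a domain (planning N = 37, problem solving N = 6, fluid intelligence N = 7). The sketch below is a rough consistency check under that assumption. The helper names are hypothetical and the implied counts are derived here, not reported by the review. Reasoning is left out because its N is not stated in its own subsection.

```python
# Illustrative consistency check between reported percentages and study counts.
# Helper names are hypothetical; implied counts are derived, not reported.
def implied_count(percentage, n_studies):
    """Number of studies implied by a percentage of n_studies (nearest integer)."""
    return round(percentage * n_studies / 100)


def as_percentage(count, n_studies):
    """Percentage of n_studies represented by count, to one decimal place."""
    return round(100 * count / n_studies, 1)


# (reported percentage, N stated in the corresponding results subsection)
domains = {
    "planning": (56.8, 37),
    "problem solving": (100.0, 6),
    "fluid intelligence": (71.4, 7),
}

for name, (reported, n_studies) in domains.items():
    count = implied_count(reported, n_studies)
    recomputed = as_percentage(count, n_studies)
    print(f"{name}: about {count} of {n_studies} studies ({recomputed}%)")
```

If a recomputed percentage differs from the reported one, that points to a rounding or counting discrepancy worth checking against the data table.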
Although the small number of observations does not allow reliable conclusions, the evaluation of problem solving showed significant results. These data must be interpreted with caution because the studies [35,39,51,63,88,104] used different tasks to evaluate this ability. One interesting finding was reported by Burton et al. [39], who compared healthy subjects, amnesic Mild Cognitive Impairment single and multiple domains, and non-amnesic Mild Cognitive Impairment single and multiple domains to analyze problem solving. In line with the literature, the authors reported lower problem-solving capacity in participants with aMCI multiple domains and naMCI single and multiple domains compared to healthy elderly, while the aMCI single domain subjects did not differ significantly from the others. These results could be attributed to the nature of single-domain amnesic Mild Cognitive Impairment [1], in which the only impaired cognitive domain is memory. On the other hand, Jin et al. [63] found a significant difference between the aMCI and healthy control groups; the authors reported a positive linear correlation between blood oxygen levels in the posterior cingulate cortex (PCC) and precuneus in aMCI subjects during simple (r = 0.95) and complex (r = 0.90) problem-solving tasks. In addition, healthy elderly showed a deactivation of these areas, while the aMCI group showed an activation. These regions are included in the Default Mode Network (DMN) and, taking into account their close relationship with the hippocampus, these activations in aMCI may be explained as a compensatory memory mechanism.
The results of fluid intelligence must be interpreted with caution due to the small number of studies that measured this variable [45,52,58,66,69,103,112]. A possible source of error about fluid intelligence ability is linked to the type of assessment carried out: this review includes studies that evaluated the intelligence quotient employing tests commonly used to assess fluid intelligence.
Despite this, Li et al. [69], by means of regression and cluster analysis, observed that the "Block Design" test could predict conversion from a healthy state to amnesic Mild Cognitive Impairment. Another important finding was reported by Wu et al. [111], who studied, in amnesic Mild Cognitive Impairment and healthy elderly, the Resting State-Executive Control Network (RS-ECN), a network that is adjacent to the DMN and the other major attention networks and with which it shares some anatomical areas. The aMCI group showed a decreased functional connectivity of the anterior cingulate cortex (ACC), inferior parietal lobule (IPC), lateral parietal and anterior insula, precuneus, middle frontal gyrus, and left and right dorsal lateral prefrontal cortex (DLPFC); these regions are closely involved in the Ventral Attention Network and, more generally, in executive functions. Moreover, the author [112] also observed increased functional connectivity of different areas of the Default Mode Network, the Ventral Attention Network (VAN), and the Dorsal Attention Network: the right anterior prefrontal cortex (aPFC), left and right ventral lateral prefrontal cortex (VLPFC), superior parietal cortex, posterior parietal lobule, occipital and temporal. Even if these regions are not involved in fluid intelligence, they are still implicated in planning, reasoning, problem solving, abstract thinking, and other executive functions, with particular reference to the DLPFC. The overall results for fluid intelligence, although not uniform, showed a trend towards a higher prevalence of deficits in this ability in Mild Cognitive Impairment.
The planning results point to a negative trend in MCI, with lower ability than in healthy elderly. However, these data must be interpreted with caution because not all tests provided statistically significant results, and some studies used inappropriate tests. In particular, some studies used the "Clock Drawing Test" and "CLOX-1", which are not specific for planning evaluation but are instead typically used in neuropsychological batteries to investigate other cognitive functions. The "Clock Drawing Test" is commonly used to assess praxis and visuospatial skills, while "CLOX-1" is the version that evaluates the executive functions (e.g., goal selection, planning, selective attention, and motor sequencing) [120]. Despite this distinction, four studies [57,116,117,118] used the "Clock Drawing Test" to assess executive functions, and none of these reported any significant difference between MCI and healthy elderly. In addition, the studies that used "CLOX-1" [83,84,89,104,106] did not report significant differences between MCI and healthy subjects; only a few studies [89,94,100,113] highlighted lower planning ability in pathological aging. These results could be explained by the low sensitivity of this test in discriminating between MCI and healthy elderly. However, the studies that used the "Clock Drawing Test" and "CLOX-1" were also included, since both original validations [120,121] considered the test adequate to assess planning ability.
On the other hand, the "Zoo Map Test" and the tower tests ("Tower of London", "Tower of Hanoi", and "Tower Test (D-KEFS)") seem to discriminate well between healthy and MCI participants. Junquera et al. [64] analyzed the differences between healthy subjects and single and multiple domain aMCI and naMCI participants. Subjects with aMCI multiple domains and subjects with single and multiple domain naMCI exhibited lower planning ability than healthy elderly. Two studies [33,74] observed more rule violations during tasks in MCI; in addition, Rainville et al. [92] pointed out higher rule-breaking and abandonment rates in MCI than in healthy participants. Metzler-Baddeley et al. [74] also observed a correlation between the number of rule violations in the TOL and the variation of mean diffusivity in the bilateral anterior cingulum and the fornix.
Although not all results showed a statistically significant difference, many reasoning deficits can be observed in MCI. Most studies evaluated reasoning with the "Similarities" test, which requires identifying the relationship between a pair of words. Seven of these studies [62,64,66,68,70,106,109] reported significant differences between healthy and MCI elderly. Chang [41] compared healthy participants with MCI participants with and without awareness of memory problems and observed higher performance in healthy subjects than in MCI with normal awareness for memory (MCI-na), who in turn performed better than MCI with poor awareness for memory (MCI-pa). In addition, MCI-pa showed reduced white matter integrity of the left dorsal frontal-striatal tract, right dorsal frontal-striatal tract, left anterior thalamocortical radiations-ventral part, corpus callosum-inferior parietal lobule, and corpus callosum-ventral prefrontal regions. Nishi et al. [78] found a correlation between reasoning task execution and reduced glucose reuptake in the right middle frontal gyrus and higher activation in the same area.
Not all the studies that analyzed higher-level executive functions highlighted significant differences. In general, however, it may be concluded that elderly with MCI exhibit poorer performance than healthy elderly. Because of the small number of observations, the problem solving and fluid intelligence results do not allow reliable conclusions. Despite this, the results appear promising, showing deficits in higher-level executive functions in MCI. Although the planning and reasoning results are numerous and point to worse performance in MCI, they do not always show significant differences between groups. This could be related to the low sensitivity of the measures in discriminating MCI from normal aging.

Limits
Despite the encouraging results, this review has some limitations. The major limitation is the lack of a quantitative analysis (meta-analysis), which is difficult to carry out because of the large number of different tests and diagnostic criteria adopted by the studies. The absence of a standardized protocol to evaluate higher-level executive functions represents another limitation, leading to the administration of rarely used tests and, consequently, to results that are hard to generalize. An additional limiting factor of this review is task impurity and, therefore, the difficulty of evaluating each higher-level EF separately. For example, the "Matrix Reasoning" test is used to evaluate reasoning [42,44,85], visuospatial reasoning [58], non-verbal abstract reasoning [62], intelligence quotient [103,111], and fluid intelligence [52]. A further limitation may be publication bias. Lastly, this review is based on Diamond's model [7], and therefore it focuses on some executive functions while excluding others, such as decision-making.

Conclusions
The results of this systematic review seem to highlight a higher prevalence of higher-level executive function deficits in elderly with MCI than in healthy elderly, confirming results already observed with other executive functions, such as cognitive and motor inhibition, conflict control, and cognitive flexibility [122], although some of these EFs are also compromised in healthy elderly [123]. MCI shows alterations in every aspect investigated in this review, highlighting significant differences that could worsen the quality of life. As far as we know, this study is the first to evaluate these aspects in healthy and MCI elderly. A future goal will be to establish a standardized protocol to discriminate MCI from healthy elderly. Such a protocol should accurately measure reasoning, planning, problem solving, and fluid intelligence, since these functions are treated as a single construct included in executive functions. An important goal for future studies will be to determine whether deficits in higher-level executive functions are early symptoms of MCI or whether, conversely, MCI leads to poorer higher-level executive function abilities as a consequence of the more significant alterations of the nervous system that occur in pathological aging than in healthy aging.