Cognitive and Academic Outcomes of Fundamental Motor Skill and Physical Activity Interventions Designed for Children with Special Educational Needs: A Systematic Review

This systematic review aimed to investigate the methodological quality and the effects of fundamental motor skills and physical activity interventions on cognitive and academic skills in 3- to 7-year-old children with special educational needs. The review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) statement. A literature search was carried out in April 2020 (updated in January 2022) using seven electronic databases: ERIC, Scopus, Web of Science, PsycINFO, CINAHL, PubMed, and SPORTDiscus. The methodological quality of the studies was assessed with the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool. Cohen’s d effect sizes were calculated and post hoc power analyses were conducted for the included studies. Altogether, 22 studies (1883 children) met the inclusion criteria, representing children at-risk for learning difficulties due to family background (n studies = 8), children with learning difficulties (n studies = 7), learning disabilities (n studies = 5), and physical disabilities (n studies = 2). Two of the included 22 studies displayed strong, one moderate, and 19 weak methodological quality. The intervention effects appeared to be somewhat dependent on the severity of the learning difficulty; in cognitive and language skills, the effects were largest in children at-risk due to family background, whereas in executive functions the effects were largest in children with learning disabilities. However, due to the vast heterogeneity of the included studies and their rather low methodological quality, it is challenging to summarize the findings in a generalizable manner. Thus, additional high-quality research is required to determine the effectiveness of the interventions.


Introduction
Children's cognitive (e.g., executive functions) and academic skills (e.g., early numeracy and literacy skills) start to develop during the early years [1,2] which provide important grounds for later development [3]. During these years, in particular, the development of cognitive and academic skills is highly interrelated [4]. Thus, early childhood education has an important role in children's development, especially for children with special educational needs (SEN) [3] whose later academic success is at risk [5]. Children with SEN are not a homogeneous group, but rather include a wide range of children with various types and extents of learning difficulties or disabilities [4,6]; stemming, for instance, from biological, neurobiological, intellectual, genetic, or environmental factors [7]. While the challenges of children with SEN differ widely, in general, a requirement for customized special education is observed [4]. Early childhood education provides a valuable environment for the implementation of effective interventions to support the learning of children

Study Selection
Two authors performed the article selection according to the predetermined eligibility criteria. The articles were initially screened based on the abstract. In terms of inclusion, the abstracts were coded as "yes", "maybe" or "no". The inter-rater agreement was determined during the abstract rating process by calculating Cohen's weighted kappa. In the first literature search, both authors rated the first 40% (n = 2266) of the abstracts independently, after which the inter-rater agreement was 0.718 and the remaining abstracts were divided between the authors. In the updated literature search, both authors rated all of the abstracts (n = 2198) independently with an inter-rater agreement of 0.771. In both cases, the inter-rater agreement can be considered good [22]. Following the abstract screening, all of the eligible articles underwent full-text screening, where the authors independently decided whether to "exclude", "include" or "maybe" include each article.
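The agreement statistic described above can be illustrated with a minimal sketch of Cohen's weighted kappa for two raters. The article does not state which weighting scheme was used, so the linear weights and the ordinal order "no" < "maybe" < "yes" below are assumptions, and the function name and the ratings in the usage note are purely hypothetical.

```python
from collections import Counter

# Assumed ordinal order of the screening codes (not stated in the article).
CATEGORIES = ["no", "maybe", "yes"]


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Cohen's weighted kappa for two raters, using linear disagreement weights."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    # Joint and marginal counts of the codes given by the two raters.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row = Counter(index[a] for a in rater_a)
    col = Counter(index[b] for b in rater_b)
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 for "no" vs "yes"
            disagree_obs += weight * observed[(i, j)] / n
            disagree_exp += weight * (row[i] / n) * (col[j] / n)
    if disagree_exp == 0:
        # Both raters used a single identical category for every item.
        return 1.0
    return 1.0 - disagree_obs / disagree_exp
```

For example, `weighted_kappa(["yes", "no", "maybe"], ["yes", "no", "maybe"])` returns 1.0 for complete agreement, while values near 0 indicate agreement no better than chance.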

Methodological Quality
The methodological quality of the eligible studies was assessed with the Effective Public Health Practice Project Quality Assessment Tool for Quantitative Studies (EPHPP) [23,24]. The tool is suitable for evaluating the quality of a variety of study designs (e.g., randomized controlled trials (RCT) and pre-post designs (PPD)) [25] and has been used previously in systematic reviews in this particular field [16,26]. The inter-rater agreement has been shown to be more consistent with the EPHPP tool than, for instance, with the Cochrane Collaboration Risk of Bias Tool [25]. Three authors rated each study procedure as "strong", "moderate" or "weak". Final ratings were formed based on six sections (selection bias, study design, confounders, blinding, data collection methods, and withdrawals and drop-outs) with the following criteria: studies with no weak ratings and at least four strong ratings were considered as "strong"; studies with fewer than four strong ratings and one weak rating were considered as "moderate"; studies with two or more weak ratings were considered as "weak". For more detailed criteria, see the Quality Assessment Tool for Quantitative Studies Dictionary [23]. Any disagreements were resolved in consensus meetings with all authors.
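The global rating rule described above can be sketched as a small function. This is only an illustration of the stated criteria, not the EPHPP tool itself: the function name is hypothetical, and the handling of combinations the text does not cover (for instance, no weak ratings but fewer than four strong ratings) is an assumption, here mapped to "moderate".

```python
SECTIONS = (
    "selection bias",
    "study design",
    "confounders",
    "blinding",
    "data collection methods",
    "withdrawals and drop-outs",
)
VALID_RATINGS = {"strong", "moderate", "weak"}


def global_rating(section_ratings):
    """Derive a study's overall rating from its six section ratings."""
    missing = [s for s in SECTIONS if s not in section_ratings]
    if missing:
        raise ValueError(f"missing section ratings: {missing}")
    values = [section_ratings[s] for s in SECTIONS]
    if any(v not in VALID_RATINGS for v in values):
        raise ValueError("ratings must be 'strong', 'moderate' or 'weak'")
    n_weak = values.count("weak")
    n_strong = values.count("strong")
    if n_weak >= 2:
        return "weak"
    if n_weak == 0 and n_strong >= 4:
        return "strong"
    # One weak rating, or (assumption) no weak rating but fewer than four strong.
    return "moderate"
```

Under this sketch, a study rated "strong" in all six sections is "strong" overall, and a second "weak" section rating is enough to make the overall rating "weak".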

Data Extraction
Data were extracted from the eligible studies independently by three authors. Extracted data included the geographical location, study design, sample size, children's age, gender, and reason for SEN, cognitive and academic outcomes, intervention exposure, intervention details (fundamental motor skills (FMS) and/or physical activity (PA) only interventions and combined interventions), control conditions, and data for effect size calculations. If data were missing, the corresponding author was contacted in order to obtain the required information.

Effect Size Calculations
Cohen's d effect sizes [27] were calculated to allow for the quantification and comparison of the effects across the studies. Effect sizes were calculated for the studies that demonstrated significant effects and provided sufficient information (i.e., pre- and post-scores, as well as the associated standard deviations or standard errors). If a study demonstrated significant effects for multiple outcomes, all of them were included. Cohen's d effect sizes of <0.2, 0.2, 0.5, and 0.8 correspond to trivial, small, medium, and large effects, respectively [27].
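Between-group effects were calculated in accordance with the following:
$$d = \frac{(M_{post,E} - M_{pre,E}) - (M_{post,C} - M_{pre,C})}{SD_{pooled}}$$
And within-group effects were calculated as:
$$d = \frac{M_{post} - M_{pre}}{SD_{pooled}}$$
where $d$ = Cohen's d effect size, $M_{post}$ = mean post-score, $M_{pre}$ = mean pre-score, $E$ = experimental group, $C$ = control group, $SD_{pooled}$ = pooled standard deviation, and $n$ = sample size.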
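The between-group and within-group calculations, together with the interpretation thresholds given above, can be sketched as follows. The way the standard deviations are pooled is an assumption (a sample-size-weighted pooling of the two groups' standard deviations), since the exact pooling formula is not recoverable from the text; all function names and the numbers in the usage note are hypothetical.

```python
from math import sqrt


def pooled_sd_two_groups(sd_e, n_e, sd_c, n_c):
    """Sample-size-weighted pooled SD of two groups (assumed pooling rule)."""
    return sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2))


def d_between(m_pre_e, m_post_e, m_pre_c, m_post_c, sd_pooled):
    """Difference in pre-to-post change between groups, in pooled SD units."""
    return ((m_post_e - m_pre_e) - (m_post_c - m_pre_c)) / sd_pooled


def d_within(m_pre, m_post, sd_pooled):
    """Pre-to-post change within one group, in pooled SD units."""
    return (m_post - m_pre) / sd_pooled


def classify(d):
    """Label an effect size using the thresholds 0.2, 0.5 and 0.8."""
    size = abs(d)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```

With hypothetical scores, an experimental group improving from 10 to 14 and a control group improving from 10 to 11, each with SD = 4 and n = 20, gives a pooled SD of 4 and d = 0.75, which `classify` labels as "medium".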

Power Analyses
Power calculations were carried out with G*Power 3.1.9.6 [28]. If a study had multiple groups, the power calculations were conducted on a sub-group basis in order to determine the power of specific group comparisons. Type 1 error probability (α) was set at 0.05, corresponding to a significance level of 5%. A medium effect size (0.5) was used as the reference point to establish observed power for each outcome, and a type 2 error probability (β) of 0.2, corresponding to a power of 0.8 (1 − β), or 80%, was selected as the cut-off point for adequate power [29].
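The logic of the post hoc power check can be sketched with a normal approximation for a two-sided comparison of two independent groups. This is not how G*Power computes power (it uses the noncentral t distribution), so the values below are approximations only; the function names and the group sizes in the usage note are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(n1, n2, d=0.5, alpha=0.05):
    """Approximate power of a two-sided two-group comparison (normal approximation)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the assumed effect size d.
    ncp = d * sqrt(n1 * n2 / (n1 + n2))
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


def is_adequately_powered(power, threshold=0.8):
    """Apply the 80% cut-off for adequate power."""
    return power >= threshold
```

Under this approximation, two groups of 64 children reach a power of roughly 0.8 for a medium effect, whereas two groups of 10 children fall far below the cut-off.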

Search Results
The stages of the systematic selection of the studies are presented in detail in Figure 1. In the updated literature search, a total of 3211 articles were found, which became 2198 articles after removal of duplicates. Of these, 2128 articles were excluded due to not meeting the eligibility criteria and the remaining 70 articles underwent full-text screening. Finally, 2 and 20 articles from the updated and the previous literature search (i.e., studies which were identified in the previous systematic review [16] but excluded since the review focused on typically developing children), respectively, were included.

Figure 1. PRISMA flow diagram of the stages associated with the systematic selection of the studies. * Studies were identified in the previous systematic review [16], but excluded since the review focused on typically developing children.

Study Characteristics and Population
Study characteristics are presented in detail in Supplementary Table S1. The included 22 studies represented 1883 children with various types and extents of SEN. In order to compare the intervention effects, children were divided into four groups based on the assumed severity of the learning difficulty. The following groups were formed: children at-risk for learning difficulties due to family background (n studies = 8; e.g., low SES) [30], children with learning difficulties (n studies = 7; e.g., high risk of ADHD) [31], children with learning disabilities (n studies = 5; e.g., autism spectrum disorder) [32], and children with physical disabilities (n studies = 2; e.g., cerebral palsy) [33]. The mean ages of the participants ranged from 3.8 [34] to 7.4 years [35], and all of the studies included both boys and girls, apart from two studies that only included boys [36,37]. In terms of geographical location, the included studies were conducted in ten countries representing North America, Europe, Asia, and Africa, and were published between 1972 [38] and 2021 [35].

Methodological Quality
The methodological quality was determined based on the following factors: study design, selection bias, confounders, blinding, data collection methods, and withdrawals and drop-outs [23]. Only two of the included 22 studies (9%) demonstrated strong methodological quality, while one study (5%) had moderate quality, and 19 studies (86%) were considered methodologically weak. The rating for each section, as well as the overall quality, of the studies is presented in Table 1.
Of the included studies, ten (45%) were found to be underpowered to detect a medium effect size [31,34,36,38–40,43,44,46,48], while seven (32%) were confirmed to be adequately powered [3,21,30,32,35,41,45]; for the remaining five (23%), post hoc power could not be determined (i.e., within-group designs without the required information) [33,37,42,47,49]. Only three (14%) of the included studies reported conducting an a priori power analysis [32,36,37], and six studies stated small sample size as a limitation of the study [31,34,35,42,43,48]. It should be noted that in one study [36], while underpowered to detect a medium effect size, the authors conducted a priori power calculations with a large (0.74) estimated effect size, based on a pilot study, for which the study was adequately powered.
Note. Some modifications were made to the EPHPP tool to resolve misunderstandings between the raters. Study design: Studies that used a quasi-experimental design were coded as CCT. Confounders: The confounders of interest included age, gender, health status, and pre-intervention score. Blinding: In question 2, "Were the study participants aware of the research question?", we chose to code "no" if there was no mention that participants were aware of the research question. This decision was made based on the young age of the participants. Data collection methods: The outcome of interest (cognitive or academic measurement) was evaluated. Methods were coded as "valid" if the validity was mentioned in the article or if there was a citation to a test manual or another article where the validity was reported. Some well-known methods were regarded as valid without a separate mention (e.g., Wechsler Intelligence Scale for Children or Bayley Scales of Infant and Toddler Development). Methods were coded as "reliable" only if the reliability was measured and reported in that specific data set. Withdrawals and drop-outs: In question 1, "Were withdrawals and drop-outs reported in terms of numbers and/or reasons per group?", "yes" was coded if both numbers and reasons were reported; otherwise, "no" was selected. Withdrawals and drop-outs were defined as children who did not finish the intervention (i.e., not as missing data).

Effect Sizes
Individual effect sizes for each outcome and sub-group within the included studies are reported in Table 2. The effect sizes are presented in four groups based on the assumed severity of the participants' learning difficulty.
Children at-risk for learning difficulties due to family background. In total, seven of the included studies (one with two separate interventions [38]) investigated the effects of FMS and PA interventions in children with low SES [21,30,34,35,38,44,48], while one was carried out with immigrant children [45]. Two studies assessed cognitive skills as an outcome; one with an FMS only intervention [38] and one with a combined FMS intervention [21]. Both studies demonstrated a beneficial effect of the intervention. The effect was large (d = 3.0) for the latter, while an effect size could not be calculated for the former due to the lack of required data. Language skills were assessed in two studies [34,48], both of which demonstrated large beneficial effects of a combined PA intervention (d = 0.78–1.57; mean = 1.18). Three of the identified studies included executive functions as an outcome; one demonstrated a small beneficial effect of an FMS only intervention (d = 0.48) [30]; one found a significant benefit of a PA only intervention, but an effect size could not be calculated due to the lack of required data [35]; while one did not observe significant effects of an FMS/PA only intervention [45]. Finally, two studies investigated the effects of FMS only interventions on academic skills [38,44]; one of these demonstrated beneficial effects [38], but an effect size could not be calculated due to the lack of required data. The study with the null finding [44] was underpowered to detect a medium effect.
Children with learning difficulties. In total, three studies assessed the effects of FMS and PA interventions in children at-risk for learning difficulties with low SES backgrounds [36,37,43], two studies on children with learning and perceptual-motor difficulties [39,46], one study on children with delays in language development [49], and one study on children at high risk for ADHD [31]. Three studies assessed cognitive skills as an outcome; two with combined FMS interventions [36,37]; and one with two separate FMS only interventions [39]. Both combined FMS interventions found large beneficial effects on cognitive skills (d = 0.78–1.87; mean = 1.33); while in two FMS only interventions the effects were assessed on both cognitive and academic skills and no significant effects were found [39]. The null-findings [39] were underpowered to detect a medium effect. Three studies assessed language skills as an outcome [43,46,49]. One study with a combined FMS and PA intervention [49] and one study with an FMS only intervention [43] reported a beneficial effect on language skills; however, effect sizes could not be calculated due to the lack of required data. The combined FMS intervention observed no significant benefits on language skills [46]. The null-finding [46] was underpowered to detect a medium effect. Finally, one study [31] investigated the effects of an FMS only intervention on executive functions and demonstrated a large beneficial effect (d = 1.48).
Children with learning disabilities. Three of the included studies investigated the effects of FMS and PA interventions in children with autism spectrum disorder [32,42,47]. One study with two separate interventions involved children with significant delays in cognition, social, motor, speech, or language development [40] and one study was on children with global developmental delay, autism spectrum disorder, or speech development delay [3]. Two studies assessed the effects on cognitive skills with combined FMS interventions [3,47], and both demonstrated beneficial effects; one study with large effects (d = 1.15) [47], while the other demonstrated a medium beneficial effect (d = 0.52) [3]. Three studies (one with two interventions) [40] assessed the effects on language skills [3,40,42]. A trivial effect was found with the FMS and PA only intervention (d = 0.07) [40], whereas the effect was small with the combined FMS and PA intervention (d = 0.34) [40] demonstrating significantly greater benefits than the FMS and PA only intervention (d = 0.27) [40]. Medium beneficial effects were found with combined FMS intervention (d = 0.57) [3]. For an FMS only intervention, while reporting beneficial effects, the effect size could not be calculated [42]. One study assessed the effects of an FMS and PA only intervention on executive functions and demonstrated large beneficial effects (d = 1.40) [32]. For academic skills, a trivial beneficial effect was found with the combined FMS and PA intervention (d = 0.10) [40] and a small effect with the FMS and PA only intervention (d = 0.30) [40]; with no significant differences between the groups.
Children with physical disabilities. Two studies assessed the effects of FMS and PA interventions on children with physical disabilities; one of the studies included children with cerebral palsy and growth hormone deficiency [33], and one included children who had below average physical development at birth [41]. Both studies assessed the effects on cognitive skills and no significant effects were found, either with a combined FMS intervention [33] or with an FMS only intervention [41]. For the former study [33], post hoc power could not be determined.

Methodological Quality and Effect Sizes
Methodological quality and effect sizes are presented in Table 3. Large effects were found in children's cognitive skills [21,36,47], executive functions [31,32], and language skills [48]. Of these six studies, only one (17%) [36] had a strong methodological quality, while five (83%) [21,31,32,47,48] displayed a weak methodological quality. In addition, only one study that found large effects [32] used outcome measures that were shown to be valid, while three studies [21,36,48] used outcomes that were shown to be reliable. Two of these studies [31,47] used outcome measures that were neither shown to be valid nor reliable. The studies that received a strong rating in terms of data collection methods demonstrated small effects in two studies [30,40], and trivial effects in one study [40]. Five studies reported that the intervention effects were significant [35,38,42,43], or children's skills improved during the intervention [49]; however, effect sizes could not be calculated due to limited data availability.
[33] Cognitive skills: The Battelle Developmental Inventory Screening Test, cognitive subset; within-group analysis (pre-treatment period); ns; n/a.
* sign = significant effects were reported but effect sizes could not be calculated due to limited data availability. improved = beneficial effects were reported with no description of statistical analyses. n/a = not applicable; power analyses could not be conducted for within-group analyses. ns = non-significant differences.

Discussion
The present systematic review aimed to investigate the methodological quality and the effects of FMS and PA interventions on cognitive and academic skills in preschool-aged children with SEN. The results demonstrated that only 9% of the included 22 studies had strong methodological quality, while 86% of the studies were rated as methodologically weak. The most often used outcome measures were cognitive and language skills and the largest effect sizes were found for cognitive skills, executive functions, and language skills. The intervention effects appeared to be somewhat dependent on the severity of the difficulty; in cognitive and language skills, the intervention effects were largest in children with minor learning difficulties (i.e., children at-risk due to family background), whereas in executive functions the intervention effects were largest in children with more severe difficulties (i.e., children with learning disabilities). However, due to the vast heterogeneity of the included studies, and rather low methodological quality, it is challenging to summarize the findings in a generalizable manner.
The finding that most of the included studies were methodologically weak is in line with the findings from a previous systematic review that investigated the effects of FMS and PA interventions on cognitive and academic skills in typically developing preschoolers [16]. The low ratings were mostly a result of inadequate reporting practices, especially in participant selection processes, confounders, blinding, data collection methods, and withdrawals [23]. Inadequate reporting practices are a common limitation in other educational interventions as well [50], and, thus, the use of reporting guidelines is highly recommended in the future.
It is recognized that difficulties exist in recruiting adequate sample sizes in children with SEN, and, thus, a large portion of the included studies were underpowered. Nonetheless, the limitations and potential risks of conducting underpowered studies cannot be dismissed (i.e., studies may result in substantially inflated effects or lead to false negative findings) [51]. Thus, results from underpowered studies may lead to erroneous conclusions as per the efficacy of studied interventions, which can subsequently lead to misguided decision making.
When considering the efficacy of PA and/or FMS interventions in children with SEN, the benefits appear to be somewhat dependent on the severity of the difficulty. Indeed, while large improvements in language skills were found for children at-risk due to family background, the effects were trivial-to-medium in children with learning disabilities. Importantly, the intervention effects observed in children at-risk due to family background were comparable to the effects of children without at-risk conditions in the previous review [16]. In cognitive skills, while medium-to-large improvements were demonstrated among all children with SEN, except for children with physical disabilities, a similar trend was observed. Indeed, the effects were progressively smaller in magnitude with increasing severity of the difficulty. It should be noted, however, that the improvements in cognitive skills, regardless of the severity of the difficulty (apart from children with physical disabilities), were comparable to the ones experienced by typically developing children [16]. These findings indicate that it is easier to support children with more minor difficulties with FMS and/or PA interventions. Indeed, children who are at-risk due to family background usually lack the opportunities to develop their cognitive [8] and language skills [52] in their home environment, and, thus, they lag behind their average-performing peers. With the right kind of early education support, these children have the possibility to develop their skills, which can have a huge effect on their later success during formal schooling [5,8].
The improvements in executive functions also appeared to be contingent on the severity of the children's difficulties. In contrast, however, here the effects were large in children with learning difficulties and disabilities, whereas the improvements were small in children at-risk due to family background. In accordance, children with learning difficulties and learning disabilities improved executive functions to a greater extent than typically developing children [16]; which might be reflective of a greater potential to develop executive functions in children with a lower level of executive functions. In line with our findings, studies have demonstrated larger beneficial effects in older children with ADHD in comparison to their typically developing counterparts [53].
Notably, only one of the studies assessed numeracy as an outcome, which was further limited to only a few dimensions of numeracy (i.e., counting and number recognition). Thus, in addition to cognitive and language skills, more studies investigating the effects of FMS and PA interventions on numeracy in children with SEN are required.
In terms of the intervention type, evidence was found for the efficacy of combined interventions for cognitive skills and FMS and/or PA only interventions for executive functions. Due to an insufficient number of studies, comparison between intervention types was possible only for language skills, and in line with our previous findings in typically developing children [16], combined interventions appeared more effective than FMS and PA only interventions. It should be noted, however, that in the combined interventions, the outcome was typically related to the intervention content; thus, these differences might simply stem from the direct practice of the assessed outcome. Finally, the comparison between FMS and PA only interventions could not be done due to the small number of studies and the vast heterogeneity of the participants in the included studies.

Study Limitations and Strengths
One of the strengths of the present systematic review was that both FMS and PA, as well as combined FMS and PA, interventions were included. In addition, while it is increasingly common for systematic reviews to only include RCTs [54], we included all study designs apart from case studies. This is important, as the use of RCT designs in children with SEN is largely impossible and, thus, remains scarce [55]. Furthermore, the present effect size calculations allowed the quantification and comparison of the intervention effects between the studies. However, some limitations of the present study should be addressed. Namely, only studies that were published in English were included, and some populations were vastly underrepresented, as only two studies assessed children with physical disabilities; making the generalization of the findings unreasonable.

Conclusions
These results indicate that FMS and PA interventions may be beneficial in the support of cognitive and academic skills in children with SEN. The intervention effects appear to be somewhat dependent on the severity of the difficulty; in cognitive and language skills the intervention effects were largest in children with minor difficulties (i.e., children at-risk due to family background), whereas in executive functions the intervention effects appeared to be largest in children with more severe difficulties (i.e., children with learning disabilities). Moreover, in line with the findings from typically developing children, combined interventions appeared to be more effective compared to FMS and PA only interventions. However, the results should be treated with caution as most of the studies had low methodological quality and displayed vast heterogeneity. More studies including combined interventions as well as FMS and/or PA only interventions in children with SEN are required to confirm the present findings. Finally, adherence to reporting guidelines and the inclusion of a priori power analyses are strongly encouraged for future studies.