Effects of In-Classroom Physical Activity Breaks on Children’s Academic Performance, Cognition, Health Behaviours and Health Outcomes: A Systematic Review and Meta-Analysis of Randomised Controlled Trials

In-classroom physical activity breaks (IcPAB) are a promising way to promote children’s health behaviors, while contributing to the development of their academic and cognitive ability and health outcomes. Yet the effects of these activity breaks, which are exclusive to classroom settings, are still mixed and unclear. Hence, this review was conducted to identify the characteristics and the effects of IcPAB among primary school children. The review protocol was registered on PROSPERO (CRD42021234192). Following the Cochrane guidelines, PubMed, PsycINFO (ProQuest), MEDLINE (EBSCOhost), Embase/Ovid, SportDISCUS (EBSCOhost), Web of Science, Scopus and Academic Search Premier (EBSCOhost) databases were searched to collect data on randomised controlled trials without a time restriction. The final database search was conducted on 8 November 2021. Random effects models were used to calculate the effect sizes. The systematic review identified ten eligible studies, nine of which were also included in the meta-analysis. Few studies used theoretical frameworks and process evaluations. IcPAB showed mixed effectiveness on academic outcomes: i.e., IcPAB had effects on spelling performance (p < 0.05) and foreign language learning (p < 0.01) but not on mathematics and reading performance. Health behaviors such as moderate-to-vigorous physical activity levels were improved (p < 0.01), but IcPAB did not have an impact on cognition outcomes and health outcomes. Given these mixed results, further research underpinned by strong methodological quality, theoretical underpinnings and reliable process evaluation methods is needed.


Introduction
A large body of evidence shows that academic achievement throughout the early school years is closely associated with health-related behaviors such as children's physical activity [1]. Integrating physical activity within the school curriculum also contributes to reducing sedentary behaviour [1,2]. Furthermore, evidence suggests that the cognitive function of children in elementary school is associated with physical activity [3], which suggests that children may benefit from classroom-based physical activity [4,5].
Moreover, the cognitive stimulation hypothesis suggests that cognitively demanding physical activities would induce significant improvements in cognitive functioning such as problem solving, memorizing, and executive function [2,6,7], which also help in enhancing academic outcomes, such as in mathematics and reading, among primary school

Publications that were not written in English or that included special needs, differently abled and other disadvantaged children were excluded. In addition, interventions that were carried out both inside and outside the classroom, study protocols and interventions with no data on control groups, studies without original data, and studies with an age range below six years or above thirteen years were also excluded. Only randomized controlled trials were included in the systematic review; all other study designs were excluded.

Search Strategies
PubMed, PsycINFO (ProQuest), MEDLINE (EBSCOhost), Embase/Ovid, SportDISCUS (EBSCOhost), Web of Science, Scopus and Academic Search Premier (EBSCOhost) databases were searched by one author (DP) without a time restriction using the following search terms (strings adapted to different databases): "(acti* break OR brain break* OR exercise break* OR class* break* OR movement break* OR lesson break* OR bizzi break* OR energi*) AND (primary school OR elementary school) AND (children OR child OR kids OR kid OR adolescents OR adolescent) AND (physical acti* OR exercise OR movement)". The final database search was performed on 8 November 2021. Additional hand searches were performed to identify further papers following a snowball technique, by referring to the reference lists of the initially selected papers.

Study Selection
Citations from each database were downloaded into JabRef software, and one author (DP) removed the duplicates. Two authors (DP and YM) reviewed the results, first by title and then by abstract. Cochrane's COVIDENCE online software (Free Version) was used to review the articles. Where ambiguity arose over the title or abstract, DP and YM assessed the study's eligibility by reading the article's full text. In cases of doubt over whether to include or exclude a study, WL or DY acted as a third assessor to resolve the discrepancy.
The initial database search, including five hand-searched articles (Figure 1), provided 2618 publications. After removing 674 duplicates, the team reviewed 1944 papers by title and abstract. Of these, 106 articles were retrieved for full-text screening. The reviewers excluded 96 articles that did not meet the inclusion criteria. Ten articles [1,2,5,9-11,22,60-62] were included in the systematic review, and nine [1,2,5,9-11,22,60,61] in the meta-analysis.

Data Extraction
One author (D.L.I.H.K.P.) extracted data from the selected studies for qualitative synthesis. The variables extracted were: author, published year, geographical origin, participant characteristics (sample size, age), RCT design (number of study arms, duration, and dosage), theoretical framework and process evaluation methods used, academic outcomes, cognitive outcomes, health behaviour, and health outcomes. The review team categorized the primary studies' outcomes based on the methods of a previous review study [63]. Extracted data were recorded in an MS Excel sheet with reference to the PICO criteria. The data extraction table of the selected full papers was then independently reviewed by two authors (Y.D. and W.L.). Any discrepancies were resolved through face-to-face discussions among four authors (D.L.I.H.K.P., W.L., Y.D. and M.Y.).

Bias Assessment
According to PRISMA-P guidelines, two authors (D.L.I.H.K.P., M.Y.) independently and blindly assessed the risk of bias (RoB) of the studies using the revised Cochrane risk-of-bias tool for randomized trials (RoB 2; [64,65]). RoB 2 covers five domains for individually randomized trials: (1) bias arising from the randomization process, (2) bias due to deviations from intended interventions, (3) bias due to missing outcome data, (4) bias in the measurement of the outcome, and (5) bias in the selection of the reported result; it also yields an overall bias judgement for each study [64]. The judgement for each study (high, low or some concerns) was obtained by answering the signaling questions for each risk-of-bias domain with one of the options provided (Yes, Probably yes, Probably no, No or No information). Any disagreement in risk-of-bias judgements was resolved through face-to-face discussions, and two authors (W.L. and Y.D.) intervened as tiebreakers where necessary.

Meta-Analysis
When at least two studies investigated the same broad outcome and provided primary data on means and standard deviations, separate meta-analyses were conducted [66] for the outcome variables (academic outcomes, cognition outcomes, health behaviors, and health outcomes) by comparing pre- and post-intervention values or mean differences of each intervention (IcPAB) group and control group. Where there were no baseline data or data on mean differences, the reviewers used post-intervention values (adjusted for baseline differences) given for the intervention and control groups of specific studies [19,41].
The meta-analyses were performed using the Review Manager 5.4.1 software (Cochrane, London, UK). When studies reported intervention effects on multiple measures for an outcome, the reviewers included the one outcome measure that was comparable with other studies' outcome measures, to prevent duplication of studies under a single outcome [19,41,67]. When there was more than one intervention group in a single study, each intervention group's result was treated as a separate study [19]. The standardized mean difference (SMD) was used to calculate the effect size of each study by computing the difference between treatment and control means [19,41]. Graphic forest plots with effect estimates and 95% confidence intervals were considered for the meta-analysis and pooled effect size results. The reviewers used a random-effects model as per the guidelines of the Cochrane Handbook for Systematic Reviews of Interventions for the following reasons: (1) the number of investigated studies under each variable ranged from two to seven [66], (2) studies were substantially heterogeneous, and (3) there was wide variation in the health and academic outcomes employed in the different studies. To interpret the pooled effect sizes, Hedges's g was used with reference to Cohen's threshold levels: trivial <0.2, small ≥0.2 to <0.5, moderate ≥0.5 to <0.8, and large ≥0.8 [28,68].
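As an illustration of the effect-size step described above, the following minimal Python sketch computes a standardized mean difference with the Hedges small-sample correction and labels it using the Cohen thresholds quoted in the text. The function names, the input values and the standard-error approximation are assumptions made for illustration only; the review itself relied on Review Manager, and this is not a reproduction of its internal routines.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, se) for two independent groups.

    g is the standardized mean difference with the small-sample
    correction; se is a commonly used large-sample approximation
    (an assumption of this sketch, not taken from the review).
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd      # uncorrected difference in SD units
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    g = j * d
    n_total = n_t + n_c
    se = math.sqrt(n_total / (n_t * n_c) + g ** 2 / (2 * (n_total - 3.94)))
    return g, se


def classify_effect(g):
    """Label |g| with the thresholds quoted in the text."""
    size = abs(g)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Hypothetical group summaries, not data from any included study.
g, se = hedges_g(12.0, 4.0, 30, 10.0, 4.0, 30)
print(round(g, 3), round(se, 3), classify_effect(g))
```

The classification uses the absolute value of g, so a negative pooled estimate is labelled by its magnitude and its direction has to be read separately from the sign.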
To explore the impact of different decisions on the meta-analytic results, a sensitivity analysis was performed by excluding or including studies based on the methodological quality of the papers, wherever a meta-analysis contained more than two studies. If the results remained identical or similar across the different analyses, they were considered robust. If the results differed after the sensitivity analysis, this was taken as an indication that they should be interpreted with caution. As the meta-analytic review was conducted for continuous outcomes, Egger's test was performed in JAMOVI 2.0 software to identify publication bias. Publication bias was considered present where p < 0.10 [69].
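For readers unfamiliar with Egger's test, the sketch below shows its classical form: each study's effect divided by its standard error is regressed on its precision (1/SE), and an intercept far from zero suggests funnel-plot asymmetry. This is a hedged illustration; the software used in the review may implement a variant of the test, the input values are hypothetical, and the p-value (which requires a t distribution with k − 2 degrees of freedom) is deliberately left out because the standard library has no t distribution.

```python
import math


def egger_regression(effects, ses):
    """Classical Egger regression via ordinary least squares.

    Regresses effect/SE on 1/SE and returns the intercept, its
    standard error, the slope and the residual degrees of freedom.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in ses]                      # precision
    y = [e / se for e, se in zip(effects, ses)]     # standardized effect
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (k - 2)   # residual variance
    se_intercept = math.sqrt(s2 * (1 / k + mean_x ** 2 / sxx))
    return {"intercept": intercept, "se_intercept": se_intercept,
            "slope": slope, "df": k - 2}


# Hypothetical effects and standard errors, not data from the review.
result = egger_regression([0.40, 0.50, 0.80], [0.10, 0.20, 0.50])
print(result["intercept"], result["slope"])
```

With as few studies as were pooled per outcome in this review, such a regression has very little power, which is one reason a lenient threshold such as p < 0.10 is often used.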
To address clinical heterogeneity, the reviewers cross-checked their decisions against the PICO criteria and ensured that all intention-to-treat studies were RCTs. To test the robustness of the matching of studies for meta-analysis, statistical heterogeneity was analyzed using graphic forest plots and by calculating the I² statistic (the percentage of variance in effect estimates caused by heterogeneity rather than by sampling error). The threshold for substantial heterogeneity was set at I² ≥ 50%, with I² interpreted as follows: 0-40% not important, 30-60% moderate, 50-90% substantial, and 75-100% considerable heterogeneity [19,67,70]. All high I² values were reported with caution [19,41,67], as fewer than ten studies were meta-analyzed under each outcome.
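The random-effects pooling and the I² statistic described in this section can be sketched as follows. The example uses the DerSimonian-Laird estimator of between-study variance, which is one widely used choice; the text does not state which estimator was applied, so this is an assumption, as are the function name and the input values.

```python
import math


def random_effects_pool(effects, ses):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns the pooled effect, its 95% confidence interval,
    Cochran's Q, the between-study variance tau2 and I2 in percent.
    """
    k = len(effects)
    w = [1 / se ** 2 for se in ses]                  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"pooled": pooled, "ci": ci, "q": q, "tau2": tau2, "i2": i2}


# Two hypothetical studies with very different effects.
summary = random_effects_pool([0.0, 1.0], [0.2, 0.2])
print(summary["pooled"], summary["tau2"], summary["i2"])
```

Because I² is derived from Q and the number of studies, it is unstable when only two or three studies are pooled, which is consistent with the caution expressed above about high I² values.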

Study Characteristics
The characteristics of each study are summarized in Table 2. Overall, the studies were published from 2013 to 2021 and conducted in Australia [62,71]; Ireland [1,10,61]; Netherlands [9]; Switzerland [2,11]; and United States [5,22], which are all Western, high-income countries. The sample size varied from 40 [5] to 467 children [9]. Across included studies, participant ages ranged from seven [2] to twelve [1,9,60] years old. All the studies consisted of both male and female participants. None of the studies analyzed the effects of IcPAB by the ethnicity of the students, although Layne and colleagues mentioned that their sample consisted of African American students [5]. Moreover, only one study stratified its outcomes by gender [62]. The effectiveness of the IcPAB was measured after implementing either a two-arm [1,5,9,10,22,60-62] or a three-arm [2,11] RCT. Four out of ten RCTs were cluster randomized controlled trials (C-RCTs) [1,5,62,72]. The duration of the C-RCTs ranged from four [5] to nine [9] weeks, while the RCT interventions' span ranged from a day [60] to eight months [22]. Two IcPABs were implemented for less than a week [60,61], four were implemented for between two and six weeks [1,5,62,73] and three interventions were implemented for more than 12 weeks [2,10,22]. With the exception of two studies (five minutes for an activity break) [10,62], 80% of the interventions allocated ≥10 min for an IcPAB session. The total provision for an IcPAB per day ranged from 15 to 20 min [22]. In terms of the total intensity of the IcPAB intervention, three studies were between 10 and 50 min [1,60,61], four studies were between 140 and 630 min [5,9,11,62], and three studies were between 1260 and 4800 min [2,10,22].

Theoretical Foundations and Process Evaluation Methods
Two studies (20% of the selected studies) included theoretical support such as the ecological model [62], social cognitive theory [62], behaviour change theory [1] and the COM-B model encompassed in the behaviour change wheel framework [1,62]. Other studies did not report a theoretical underpinning for the intervention or a rationale behind the IcPAB activities. Five studies reported the fidelity and process evaluation mechanisms for the studies and used self-evaluated questionnaires by facilitators [1], self-completed daily logs for intensity and accuracy of IcPAB by facilitators [2,11,22,62], post-intervention discussions [62], awarding incentives such as memberships for successfully following intervention guidelines [22], and drop-in observation visits by student researchers [22]. However, in addition to reporting fidelity mechanisms, only two studies [2,9] discussed the teachers' compliance with the implementation of IcPAB. No studies addressed the students' compliance in attending the IcPAB.

Bias Assessments
The risk of bias assessment indicated some concerns over the methodological quality of seven of the studies. Two studies were assessed as having a high risk of bias [5,11] and one study was assessed as having a low risk of bias [2] (Table 3).
Effectiveness of the IcPAB interventions on spelling skills (Panel 3 in Figure 2) was evaluated through a single study [2] using two different intervention arms (n = 188 students). The statistically significant, large pooled effect estimate (SMD = 2.13, 95% CI [0.21 to 4.05], p = 0.03; I² = 11%, p = 0.45) suggested that classroom-based physical activity breaks might have an impact on primary school children's spelling performance. Two intervention samples (n = 137 students) from another single study [11] indicated that IcPAB was effective in improving the foreign language learning ability of students. The analysis reported a statistically significant, moderate to large effect size (SMD = 0.80, 95% CI [0.21 to 1.39], p = 0.008) for foreign language learning (Panel 4 in Figure 2) with a moderate to substantial amount of heterogeneity (I² = 64%, p = 0.09).
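As a reading aid for the estimates reported above, the following sketch shows how a pooled SMD, its 95% confidence interval and its p-value relate to one another under a normal approximation: the interval width gives the standard error, and the estimate divided by the standard error gives the z statistic. The function name is hypothetical and the normal approximation is an assumption of this sketch; it is a consistency check, not the procedure used by the authors.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z and the two-sided p-value from a 95% CI.

    Assumes a symmetric, normal-theory interval of the form
    estimate +/- z_crit * SE.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return se, z, p


# Values reported in the text for foreign language learning and spelling.
for label, est, lo, hi in [("foreign language", 0.80, 0.21, 1.39),
                           ("spelling", 2.13, 0.21, 4.05)]:
    se, z, p = z_and_p_from_ci(est, lo, hi)
    print(label, round(se, 3), round(z, 2), round(p, 4))
```

Under these assumptions both recovered p-values agree with the reported ones to rounding (approximately 0.008 and 0.03), so the reported intervals and p-values are internally consistent.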

Intervention Effects on Cognition
Four intervention samples from three studies [2,5,9] were meta-analyzed for effects on inhibition among primary school children. With a significant level of considerable heterogeneity (I² = 97%, p < 0.00001), it was found that the IcPABs did not have significant effects on inhibitory performance (SMD = −0.64, 95% CI [−1.85 to 0.56], p = 0.30) among the participants (Panel 1A in Figure 3). The sensitivity analysis confirmed that inhibition performance was not improved by the intervention, with a significant, moderate to large pooled effect size (SMD = −1.48, 95% CI [−2.33 to −0.64], p = 0.0006), a considerable level of heterogeneity (I² = 83%, p = 0.01) and no publication bias (Egger's regression = 0.417, p = 0.717; Panel 1B in Figure 3).
Two intervention samples from the same study [2] showed a non-significant, trivial to small pooled effect size favoring the control group (SMD = −0.07, 95% CI [−0.38 to 0.24], p = 0.65) with an unimportant level of heterogeneity (I² = 15%, p = 0.28) for updating (Panel 2 in Figure 3). However, the same study [2] reported that classroom physical activity breaks may influence the shifting performance of the executive function (SMD = 0.15, 95% CI [−0.14 to 0.44], p = 0.31; I² = 0%, p = 0.42), although the pooled effect was not statistically significant and was trivial in size by the thresholds stated in the Meta-Analysis section (Panel 3 in Figure 3). Three intervention samples from two studies showed positive impacts of IcPAB on children's attention performance [9,11].

Intervention Effects on Health Outcomes
Two intervention samples (n = 68 students) from a single study [60], which was meta analyzed for test-anxiety as a mental health outcome provided a moderate to large pooled effect size (SMD = 0. 16

Sample, Intervention Characteristics, Outcomes, Theory, and Process Evaluation
Data from 1538 primary school students (from seven to twelve years old) in 10 studies were analyzed to assess the characteristics of IcPAB-related interventions and evaluate their effectiveness in improving academic performance, cognition, health behaviors, and health outcomes. Interestingly, all the studies were conducted in high-income countries, and only one study reported whether the gender of the students would influence the results of the IcPAB studies. Effects of IcPAB by ethnicity could not be found. Such weaknesses not only illustrate the need for more studies in this area, but also indicate a need for studies in low- and middle-income countries. Only four of the studies used C-RCT designs [1,5,62,72], even though C-RCTs are recommended to evaluate the intervention effects of clusters such as classrooms [74]. Hence, C-RCT designs are encouraged for future IcPAB interventions. The intervention duration was less than 12 weeks in the majority of studies [1,5,9,60-62,73], and most allocated ≥10 min per activity break, which is consistent with previous research findings [41,75]. The total intensity of intervention varied significantly, ranging from 10 min to 4800 min. The suitable IcPAB intervention intensity (dosage) needs to be identified in the future.
Most studies (n = 8) demonstrated average methodological quality, with concerns around the randomization procedure, handling of missing data and the outcome evaluation. When the risk of bias for methodological quality is relatively high in RCTs, the results should be interpreted with caution [19,41]. Our results suggest that the methodological quality of RCTs examining in-classroom physical activity breaks should be improved in future studies [19,41].
Most of the studies focused on understanding the effects of IcPAB on academic achievement, cognition, health behaviour and health outcomes. This may be because there are theoretical assumptions and evidence for the relationships between such outcomes [2,28,35,76,77], even though these are under-researched among primary school children [5,41]. However, none of the studies focused on diet and intake of vitamin supplements, which are important aspects of child growth and education [78]. This emphasises the need for future studies that examine the contribution of diet to IcPAB interventions among elementary school children.
This study also suggests the need for theoretical frameworks [13,79] in designing IcPAB interventions, with well explained process evaluation and fidelity methods [40,80,81]. The use of self-completed daily logs [2,11,22,62] for intensity and accuracy of IcPAB intervention delivery by facilitators seemed to be a popular method for fidelity and process evaluation.

Effectiveness
Previous review studies [13,19,27,35,41,75,82,83] focused on physical activity breaks that were conducted both inside and outside the classroom, without limiting the focus to RCT designs. These studies found that physical activity breaks can have mixed effects on academic performance. In line with these findings, the current analysis also identified that IcPAB have mixed effects on academic performance. The mixed results for academic achievement could be due to the quality, evaluation content, and standardization of the test for each academic outcome [19,41]. The embodied cognitive load theory suggests that the intensity, load and extent of physical activity integration into the curriculum affect the academic performance of a student [25]. The type of IcPAB (curriculum-based or general physical activity breaks) and its duration might moderate the effects of IcPAB on academic performance [41,75]. Therefore, more studies should be conducted to identify the accurate effects on academic performance by comparing different types of IcPAB among primary school students [27,41]. Based on the results for executive functions such as inhibition, updating and shifting, as well as the results for mathematics and reading, it can also be assumed that there is a positive association between executive functions and the academic performance of children [8,26,84-86].
Current findings suggest that the effects of IcPAB on the cognitive function of primary school children are inconsistent and mixed. Previous reviews [19,41,75,87], which focused on school-based studies that incorporated physical activity breaks both in and outside the classroom and included child populations without an age restriction, reported similar results. Therefore, it can be suggested that the venue (in-classroom or outside the classroom) does not play a crucial role within the school setting in improving cognition through physical activities. According to the cognitive stimulation hypothesis, the cognitive demand of the physical activity influences the improvements in cognition [2,6,7]. Hence, it is possible that the IcPAB was not cognitively demanding enough, given that most of the subdomains of cognition did not show intervention effects. In addition, as explained in Watson's and Masini's studies [19,41], the intensity of activity breaks, the validity and reliability of the measurements used to evaluate cognitive performance, the smaller sample sizes, and the lack of consensus on the most appropriate amount of physical activity breaks for cognitive arousal have likely contributed to the conflicting results for cognitive performance among children.
Regarding the effects of classroom-based physical activity breaks on health behaviour, it was found that the MVPA levels of elementary school students improved. Hence, in line with Masini's meta-analysis [19], but contrary to another meta-analysis [41], this review confirmed the positive effects of IcPAB on physical activity levels. Yet, it should be noted that both of those meta-analyses [19,41] drew on all types of study designs, in contrast to the current review, which was restricted to RCTs. However, step count and sedentary behaviour did not show pooled effect estimates favoring the IcPAB interventions, contrary to Masini's findings [19]. Therefore, similar to a previous recommendation [41], this study also suggests that the results on health-related behaviors be interpreted with caution due to the small number of studies (fewer than three) included in the meta-analysis.
Finally, fewer than two studies were identified that examined the effects of IcPAB on health outcomes such as aerobic fitness and BMI. Even though a systematic review [13] reported positive effects on BMI, contrary to the current finding from the qualitative synthesis, further studies are warranted to measure pooled effects before providing a conclusive result. Only the effects on test anxiety as a mental health-related outcome could be analyzed in this review. Even though test anxiety did not yield statistically significant results, this finding should not be generalized, as the effect sizes were generated from a small sample based on a single three-arm RCT.

Limitations and Recommendations
There are several limitations of this review. Identifying eligible studies was limited to English-language publications. The study did not analyze the effects of IcPAB on children with special education needs. As all the studies were published with data from high-income countries, the outcomes cannot be generalized to the entire world. In terms of the outcomes, there were seven or fewer studies under each outcome. Hence, the small number of studies limited the possibilities of conducting sub-group analysis in the quantitative synthesis. In addition, 90% of the studies did not analyze the long-term effects of the intervention, as they did not include a follow-up stage [1,2,5,9-11,22,61]. Notably, none of the findings indicated publication bias except for attention performance. However, there were considerable levels of heterogeneity for some outcomes, as well as concerns related to the risk of bias. This limits the interpretation of the current study, as less rigorous studies might be biased toward overestimating or underestimating the intervention effects.
Yet, despite these limitations, this review clearly highlighted the existing gaps in classroom-based physical activity break interventions. This demonstrates that further rigorous and well-designed IcPAB programs are needed to enhance the intervention effects on elementary students' academic performance, cognition, health behaviors and health outcomes. In particular, theoretical underpinnings such as the COM-B behaviour model [1] can be integrated into these intervention designs to obtain positive results [13,35,40]. The COM-B model proposes that people need capability (C), opportunity (O) and motivation (M) to perform a particular behaviour (B) [1]. Two studies [38,88] used the COM-B model to identify the capabilities, opportunities and motivations of the IcPAB facilitators (e.g., teachers). Based on these findings, the authors then applied certain behaviour change techniques and intervention functions such as education, training, and enablement to improve intervention effects by ensuring the fidelity of their trials. In addition, many studies claimed that classroom teachers have difficulty implementing physical activity breaks [5,23,89] due to high curriculum demands. Pure educational time might also be shortened by the implementation of IcPAB. Therefore, it would be promising to use curriculum-related IcPAB in response to teacher concerns over a tight curriculum and insufficient education time. Furthermore, policy-level recommendations from the education authorities for teachers to implement compulsory daily IcPAB during lessons are also needed to promote activity breaks for improving the academic performance, cognition, and health outcomes of elementary school students. In addition, it was found that the majority of the studies in the review did not analyze the rate of compliance with IcPAB for either teachers or students. Therefore, compliance issues should be taken into consideration in the future.
Furthermore, the total intensity of the intervention may be correlated with the intervention effects on academic performance and health outcomes [75,90]. Therefore, moderation analyses of IcPAB intervention intensity are warranted in future meta-analyses. Finally, future studies should address the effects of IcPAB on differently abled children and children with special educational needs.

Conclusions
Our study demonstrated mixed effectiveness of IcPAB on academic outcomes (IcPAB had positive effects on spelling performance and foreign language learning but not on mathematics and reading performance) and health behaviors (moderate-to-vigorous physical activity levels were improved), but IcPAB did not have an impact on cognition outcomes and health outcomes. Moreover, few studies used theoretical frameworks and process evaluations. Importantly, our study generally included few studies examining the same outcomes, indicating that the effects of IcPAB interventions are under-researched, especially in relation to gender, low- and middle-income countries and the Asian region. A practical-knowledge gap was also found, as the time allocation for IcPAB sessions seemed to differ from what the classroom teachers desired. This study emphasizes the need for improved methodological quality of the RCT designs, specifically in relation to the randomization and blinding process, missing data handling and the outcome evaluation. Finally, this study demonstrates that more classroom-based physical activity break intervention studies with RCT designs are required for primary school children to generalize the current findings on academic achievement, cognition, health behaviors and health outcomes.