The Effectiveness of Educational Robots in Improving Learning Outcomes: A Meta-Analysis

Abstract: Numerous studies have investigated the potential effect of educational robots, but an up-to-date and thorough review of their learning effectiveness and the factors that influence it is still missing. In this study, a meta-analysis was conducted to systematically synthesize findings on the effects of educational robots on students' learning outcomes. A search for randomized studies describing educational robot interventions to improve learning outcomes identified 34 effect sizes, reported in 17 articles, that met the selection criteria. The results show a moderate but significantly positive effect of educational robots on learning outcomes (g = 0.57, 95% CI [0.49, 0.65], p < 0.00001). Moreover, moderator analyses were conducted to investigate important factors relating to the variation of the impact, including educational level and assessment type. Based on these findings, we provide researchers and practitioners with insights into which characteristics of educational robot interventions appear to benefit students' learning outcomes and how pedagogical approaches can be applied in various educational settings to guide the design of future educational robot interventions.


Introduction
Technological advancements have fundamentally transformed how people, society, and environments inter-relate. Mobilizing digital technology, such as robotics, could significantly facilitate the achievement of the Sustainable Development Goals (SDGs) [1]. As one of the great creative inventions of the 20th century, robots are playing an increasingly important role in industrial intelligent manufacturing, mass production, and public services. Robotics is likely to alter how the SDGs are achieved, by replacing and supporting human activities and fostering innovation [2]. In the 1970s, the first educational robot was created in an artificial intelligence laboratory at the Massachusetts Institute of Technology [3]. Early research on educational robots focused on the educational functions of robot kits, including simple kits designed to teach a single function (such as response to sound) as well as complex ones, such as Lego Mindstorms, that allowed users to build and program [4]. In general, robotic kits are computer-programmed automated machines that are able to perform a series of actions [5]. As more and more social assistance robots (SARs)/social assistance humanoid robots (SAHRs) become available, users will be able to interact with them using actions such as gestures, voice recognition, and emotional expression. SARs are treated as pet animals, toys, or human beings in the form of robots. Given the "uncanny valley" problem [6], SAHRs are robots with a non-threatening appearance, such as Pepper and its precursor, NAO. Because of their ability to talk and show facial expressions, these robots are able to participate in social interactions. For example, they can be used to teach language courses and can even interact with students [7]. Studies concerning the appearance of educational robots have examined the user's perception and the physical attributes of the robot (e.g., facial features) [8,9].
With the continuous improvement of robotics technology, educational robots have received great attention from the educational community globally. For example, robotic tutors with empathy have been used to assist elementary school students with learning tasks [10,11]. In South Australia, two schools introduced NAO robots developed and produced by Aldebaran Robotics in France to assist teachers and students [12]. The motivation behind these efforts is that robots can be used to address a variety of challenges facing education, including teacher shortages [13] and teacher workload [14].
The application of educational robots is consistent with several contemporary learning theories such as principles of active learning [15], social constructivism [16], and Papert's constructionism theory [17]. Some evidence is available that the use of robotics in education has a positive impact on student behavior and development, especially in problem-solving skills [18], collaboration [19], learning motivation [20], participation [21], and enjoyment and engagement in the classroom [22,23]. Overall, however, studies have drawn mixed conclusions about the effectiveness of robotics in education.
Researchers have been actively exploring the use of educational robots in a wide range of courses [24,25]. For example, Hong and colleagues [26] reported that the use of educational robots was beneficial for English learning. Similarly, Toh et al. [27] found that the use of robots helped improve the knowledge of mathematical concepts. McDonald and Howell [28] showed that educational robots could enhance students' interest in engineering and help them gain a better understanding of scientific processes. More broadly, Mathers et al. [29] reported that the use of robots enhances knowledge of physics-related topics.
Furthermore, a review of the literature shows that educational robotics is a constantly evolving field with the potential to be implemented in education at all levels from kindergarten to university. Chin et al. [30] indicated that educational robots can provide primary school teachers with tools to increase student achievement. Chang et al. [31] regarded the educational robot as a tool to assist elementary school language teachers. Specifically, educational robots (e.g., NAO) can assist staff in kindergarten by going through a nine-phase procedure [32]. Moreover, NAO and Robovie have also been used to teach children language [33]. Benitti's [34] study reported that the Lego robotics kit is recommended for children aged 7 and up. Similarly, Nugent et al. [35] stated that educational robots can teach middle-school students robot activities related to science and engineering processes by giving relatively specific guidance.
Previous systematic review studies have reported the potential contribution of educational robots in schools (e.g., Benitti [34]; Papadopoulos et al. [5]; Spolaôr and Woo et al. [36]; Woo et al. [37]). However, there has been growing criticism from the robotics community in recent years over the lack of empirical research on how robotics can be employed to improve student academic performance [5]. In an earlier study, Benitti's [34] results suggest that few of the empirical studies reviewed support the significance of using educational robots in the classroom. Likewise, Woo et al. [37] systematically reviewed studies exploring the possibility of using social robots in naturalistic school settings and identified multiple technical and procedural problems that might affect the successful implementation of such tools. Yet these studies have limitations when it comes to examining the overall effectiveness of educational robots in schools and the parameters of successful interventions. For this reason, a meta-analysis was conducted to explore the effectiveness of educational robots in formal learning environments in order to inform, motivate, and guide the use of such tools in future projects. In particular, the study aims to answer the following questions: Q1. Does the use of educational robots in the classroom improve student learning outcomes? Q2. Does the effect vary by the educational level (pre-school, primary school, secondary school, higher education)?

Literature Search and Inclusion Criteria
The procedure for selecting studies was based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement [38]. Our search process included two parts. We first searched the Web of Science and Scopus databases, applying the following search parameters to the titles and abstracts of documents published from 2005 to 2023: (robot* OR educational robot*) AND (learning outcome* OR learning achievement* OR academic performance*) AND (student* OR children* OR learner*). Then, we manually screened Google Scholar and additional records identified through citation checking. To qualify for inclusion, a study must examine the use of educational robots and meet the following additional criteria: (1) investigate the effect of an educational robot on student learning; (2) adopt a randomized experimental or quasi-experimental design; (3) include a control group; (4) provide sufficient statistical information for calculating the effect size; (5) be published in a peer-reviewed English-language journal; (6) involve courses taught in a kindergarten, primary school, secondary school, or university or college; and (7) be conducted in a natural school setting.

Coding Procedure
All records were uploaded to Mendeley. Two researchers independently coded the studies, and the following information was extracted for both the experimental group and the control group: the descriptive statistics (i.e., mean, standard deviation, and sample size) of each outcome, the courses involved, educational levels, robot types, type of assessment, and length of intervention. The inter-coder agreement was evaluated based on both Cohen's Kappa [39] and Gwet's benchmark [40]. All differences in codes were discussed until a consensus was reached. Data were standardized before they were analyzed. First, if a study provided only p-values, sample sizes, and means, Borenstein et al.'s [41] methods were adopted to estimate standard deviations (SDs). Second, in studies where pretest and post-test means were reported, data were combined to generate aggregated means.
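To make the inter-coder agreement step concrete, the following is a minimal sketch of Cohen's kappa for two coders who each assign one category per study. The function name and the example codes are hypothetical and are not taken from the coding sheets used in this study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one category per item."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category
    return (observed - expected) / (1.0 - expected)

# Hypothetical robot-type codes for six studies (illustration only).
coder_a = ["kit", "kit", "social", "humanoid", "social", "kit"]
coder_b = ["kit", "kit", "social", "humanoid", "kit", "kit"]
kappa = cohens_kappa(coder_a, coder_b)
```

In this illustration the coders agree on five of six studies, and kappa discounts the share of that agreement that would be expected by chance.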
When more than one effect size was reported in a study, dependent effect sizes in the meta-analysis were processed according to the characteristic of its dependency to avoid misestimation of standard errors. Various solutions have been proposed in the meta-analysis literature to deal with effect size dependency (e.g., Hedges et al. [42]). For example, one possible solution might be randomly choosing an effect size or taking a mean effect size. However, one problem with this method is that it could result in the loss of data and statistical power [43]. Another problem is that these outcome measures are conceptually different and they may be statistically unrelated [44]. Therefore, if multiple post-test results are reported according to different dimensions in a study, each effect size is considered individually. If several independent sample groups are used, the effect size of each sample group is separately included. In a similar way, in studies involving comparing multiple controls against a single experimental group, the estimated effect size for each pair was separately included. Finally, studies were subgrouped to investigate the possible influence of moderating variables (e.g., the length of intervention). Following previous research, random effects models were applied to investigate the variability of the results of different studies [41].

Quality Assessment
After preliminary screening of the obtained documents and elimination of duplicate documents and documents that did not match the subject matter, the quality of the remaining documents was examined. The six general sources of bias proposed by Higgins [45] were adopted in the quality assessment: adequacy of allocation sequence generation, allocation concealment, blinding procedures, incomplete outcome data, selective outcome reporting, and other sources of bias. These items were obtained from published reports, and if more information was needed, we contacted the author. The methodological quality of these items was also checked to determine the level of risk of bias. A low risk of bias is assigned if plausible bias is unlikely to seriously alter the results; an unclear risk of bias is assigned if plausible bias raises some doubt about the results; and a high risk of bias is assigned if plausible bias seriously weakens confidence in the results.

Statistical Analysis
First, we obtained effect size (ES) data from studies included in the meta-analysis. For studies with means and SDs, ESs were estimated using Review Manager 5.3 [46]. Following previous research (e.g., Tutal and Yazar [47]), we chose to incorporate all ESs into the meta-analysis separately. The standardized mean difference index proposed by Hedges [48] was adopted to calculate ESs: $g = (M_1 - M_2)/SD_{pooled}$, where $M_1$ refers to the mean score of the treatment group, $M_2$ refers to the mean score of the control group, and $SD_{pooled}$ refers to the weighted average of the SD values of both groups. ESs were interpreted based on Cohen's d, where 0.8 represents a large effect, 0.5 a medium effect, and 0.2 a small effect. A positive ES suggests that the experimental group outperformed the control group. Since continuous data from different scales were extracted, the standardized mean difference (SMD) of the effect size was calculated based on the sample size and 95% confidence interval of each study, and the summary study used analysis of variance. A significance level of 0.05 was set for all analyses (two tailed).
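The effect size calculation can be sketched as follows. This is an illustrative implementation, not the Review Manager routine itself: the function names are hypothetical, the input values are invented, and the small-sample correction factor J is an assumption added here because it is the usual companion of Hedges' index, although the formula quoted above omits it.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation, weighted by each group's degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def hedges_g(m1, sd1, n1, m2, sd2, n2, correct=True):
    """Standardized mean difference and its standard error.

    m1/sd1/n1 describe the treatment group, m2/sd2/n2 the control group.
    With correct=True the small-sample factor J is applied (an assumption
    here; the formula quoted in the text omits it).
    """
    d = (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1) if correct else 1.0
    return j * d, j * math.sqrt(var_d)

# Hypothetical summary statistics, not taken from any included study.
g, se = hedges_g(80.0, 10.0, 30, 75.0, 10.0, 30)
g_raw, _ = hedges_g(80.0, 10.0, 30, 75.0, 10.0, 30, correct=False)
ci_95 = (g - 1.96 * se, g + 1.96 * se)
```

With these invented inputs the uncorrected value is 0.5, a medium effect by the benchmarks above, and the correction shrinks it slightly because the groups are small.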
Following the suggestion of Borenstein et al. [41], in case of no heterogeneity, a fixed-effect model was chosen to calculate the mean effect size; otherwise, a random-effects model was used. To determine what part, if any, of the observed variation was real [41], the $I^2$ index was used to measure potential heterogeneity [49]. An $I^2$ value of 25% indicates a low level of heterogeneity, 50% a moderate level, and 75% a high level [50].
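The model choice described above can be illustrated with a short sketch that computes Cochran's Q, the $I^2$ index, and a random-effects mean. The between-study variance is estimated here with the DerSimonian-Laird method, which is an assumption for illustration; the text does not state which estimator was used. All input values are hypothetical.

```python
import math

def heterogeneity(effects, variances):
    """Return the fixed-effect mean, Cochran's Q, I^2 (in %), and DL tau^2."""
    w = [1.0 / v for v in variances]  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return fixed, q, i2, tau2

def random_effects_mean(effects, variances, tau2):
    """Pooled mean and standard error with tau^2 added to each variance."""
    w = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return mean, math.sqrt(1.0 / sum(w))

# Hypothetical effect sizes and sampling variances (illustration only).
effects = [0.2, 0.5, 0.9]
variances = [0.04, 0.04, 0.04]
fixed, q, i2, tau2 = heterogeneity(effects, variances)
pooled, pooled_se = random_effects_mean(effects, variances, tau2)
```

Because the three invented studies have equal variances, the fixed-effect and random-effects means coincide; the random-effects model differs only in its wider standard error.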

Sensitivity Analysis and Moderator Analyses
The robustness of the results was examined using the leave-one-out method. That is, if the removal of an individual study results in substantial changes, this is an indication of poor homogeneity and therefore the results are unreliable [51]. In meta-analysis, heterogeneity often exists between studies. When multiple moderators are present, they may amplify or attenuate each other's influence on the treatment effectiveness. Hence, moderator analyses were conducted to assess heterogeneity by comparing study subsets [52]. As discussed in the literature review section, several variables could potentially affect the effectiveness of educational robots on student learning (e.g., educational level, discipline, and robotic type).
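The leave-one-out procedure can be sketched as below: the pooled estimate is recomputed once for each study with that study removed. The pooling function uses a DerSimonian-Laird random-effects mean, which is an assumption for illustration, and the effect sizes are hypothetical.

```python
def dl_pool(effects, variances):
    """DerSimonian-Laird random-effects mean (illustrative pooling function)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

def leave_one_out(effects, variances):
    """Pooled estimate with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results.append(dl_pool(rest_effects, rest_variances))
    return results

# Hypothetical data in which the last study is an outlier.
effects = [0.4, 0.5, 0.6, 1.5]
variances = [0.02, 0.02, 0.02, 0.02]
full = dl_pool(effects, variances)
loo = leave_one_out(effects, variances)
largest_shift = max(abs(estimate - full) for estimate in loo)
```

A large value of `largest_shift` relative to the full estimate would flag a study that drives the pooled result; in this invented example, removing the outlier moves the estimate from 0.75 to 0.5.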

Search Results
As shown in Figure 1, the initial searches yielded 826 relevant articles. The number was reduced to 269 after duplicates were removed. After examining the title and abstract, another 223 residual references were removed. Full texts were retrieved for 46 articles. In total, 17 articles were retained for further analysis. Table 1 overviews the studies included in the meta-analysis. In one article [53], two studies were identified for inclusion and treated as separate studies. One article [54] had two independent control groups and one experimental group with multiple outcomes, and thus four effect sizes were computed. Four articles reported learning outcomes according to different language skills including listening, speaking, reading, and writing, and thus the effect size for each dimension was treated separately. Final coding led to the inclusion of 34 (k = 34) independent effect sizes from 17 articles. Studies examined the learning effectiveness of educational robots in different courses of two broad categories. Nine articles investigated the use of educational robots in science, technology, engineering, and math (STEM) courses (e.g., C programming) and eight articles analyzed social science courses (e.g., English). Eight articles involved primary children, and three studies had secondary school participants.
Two studies involved pre-school children, and four studies involved students in higher education. The length of intervention differed greatly, ranging from less than 1 h to 25 weeks. The selection included 11 effect sizes for theoretical examination scores, 16 effect sizes for skill examination scores, and 7 effect sizes for attitude towards the course. Seven articles measured the learning effectiveness of robotic kits, five articles examined the learning effectiveness of humanoid robots, and another five articles focused on the learning effectiveness of social robots.

Study Quality
The results of the risk-of-bias assessment in Figure 2 show that most of the included studies obtained satisfactory scores in the six areas, indicating a low risk of bias. Most studies mentioned that cluster randomized sampling was used or simply stated that "randomization" was used. Selective reporting bias was also examined. Except for one study with missing data, all the other studies fully reported study results.

Main Effect
The primary goal of this study is to understand the nature of the effect of educational robots on student learning. Figure 3 displays a forest plot of the included studies. The forest plot shows the 95% CIs of the ESs of individual studies. Given the high heterogeneity ($I^2$ = 82%, p < 0.00001), a random-effects model was applied. A significant overall effect size (g = 0.57, 95% CI [0.49, 0.65], p < 0.00001) indicates that teaching methods incorporating educational robots are conducive to learning outcomes. In general, teaching methods using educational robots can improve learning outcomes by 0.57 SD, a moderate but significantly positive effect according to Cohen and Lee [69]. In addition, the sensitivity analysis showed that the results were robust: the effect size did not vary considerably when the leave-one-out method was used. The funnel plot is a common method for qualitatively assessing publication bias; it rests on the assumption that the precision of the estimated intervention effect increases with sample size. Figure 4 shows the funnel plot for the 17 articles (k = 34), which suggests no significant publication bias.
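As a rough consistency check on the reported overall effect, the standard error and z statistic can be back-calculated from the confidence interval. The sketch below assumes a symmetric interval based on the normal approximation, which is an assumption rather than something stated in the text; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-theory confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z_crit)

def two_tailed_p(estimate, se):
    """z statistic and two-tailed p-value under the normal approximation."""
    z = estimate / se
    # erfc keeps precision for very small tail probabilities.
    return z, math.erfc(abs(z) / math.sqrt(2))

# Values reported for the overall effect: g = 0.57, 95% CI [0.49, 0.65].
se = se_from_ci(0.49, 0.65)
z, p = two_tailed_p(0.57, se)
```

Under these assumptions the implied standard error is about 0.04 and z is close to 14, which is in line with the reported p < 0.00001.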

Moderator Analyses
We conducted moderator analyses to identify possible explanations for the high heterogeneity among studies. The random-effects model was chosen to explore the effect of the potential moderator variables identified in this study, including educational level, subject area, treatment duration, assessment type, and robotic type. Table 2 shows that the SMD of each subset was positive and exclusive of zero, indicating that students who used educational robots achieved higher learning effectiveness than those who did not. In addition, educational level and assessment type were significantly related to the variability in the learning effectiveness. However, three moderators were not significant: the subject area, treatment duration, and the robotic type. The following sections present the results for each moderator.

To examine whether the influence of educational robot-based classroom instruction varies across educational levels, studies were divided into four subsets based on research setting: pre-school setting, primary school setting, secondary school setting, and higher education setting. As indicated in Table 2, educational robots were found to have positive effects on student learning at all educational levels. In terms of the strength of the effect, educational robots had quite a strong effect on student learning at secondary school level (g = 1.69) and higher education level (g = 1.42), and had a less strong effect at primary school level (g = 0.78) and pre-school level (g = 0.55). The included studies involved a diverse range of courses. To examine whether the use of educational robots in classroom instruction is more beneficial for some courses than for others, we separated the courses into science courses (e.g., mathematics and computer programming) and non-science courses (e.g., English and theater). Eligible studies were equally distributed between the two subsets. We also compared the treatment duration using moderator analysis. The results suggested that subject area ($I^2$ = 0%, p > 0.05) and treatment duration ($I^2$ = 0%, p > 0.05) had no significant moderating influence on the relationship between the use of educational robots and learning outcomes. In other words, the effects of educational robots did not differ significantly across disciplines, nor across treatment durations ranging from less than an hour to more than eight weeks.
The learning outcomes of educational robot-assisted learning were often assessed using an examination score or some skill-based measure. Further, self-report questionnaires were also used to elicit student attitudes toward the course. The effects of educational robots in the classroom differed significantly according to the type of assessment ($I^2$ = 83.7%, p < 0.05). The results show that the effect on student attitudes (g = 1.23) was significantly larger than the effect on examination marks (g = 0.97) or tests of skill (g = 0.49). Further, no significant heterogeneity was found across robotic types ($I^2$ = 0%, p > 0.05). The ESs produced by studies implementing robotic kits, social robots, and humanoid robots were +0.88, +0.71, and +0.91, respectively.
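The test for subgroup differences that underlies these moderator results can be sketched as follows. The subgroup means are the values reported above for assessment type, but the standard errors are hypothetical placeholders because Table 2 is not reproduced here, so the output is illustrative and is not expected to match the reported $I^2$ of 83.7%.

```python
def subgroup_difference(means, ses):
    """Between-subgroup Q statistic, its degrees of freedom, and I^2 (in %)."""
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    grand = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    q_between = sum(wi * (m - grand) ** 2 for wi, m in zip(w, means))
    df = len(means) - 1
    i2 = max(0.0, (q_between - df) / q_between) * 100 if q_between > 0 else 0.0
    return q_between, df, i2

# Reported subgroup means for attitude, examination score, and skill test.
means = [1.23, 0.97, 0.49]
# Hypothetical standard errors (assumption; not reported in the text).
ses = [0.20, 0.15, 0.10]
q_between, df, i2 = subgroup_difference(means, ses)
```

The Q statistic is compared against a chi-square distribution with `df` degrees of freedom; for two degrees of freedom the 0.05 critical value is 5.99, so a value above that indicates that the subgroups differ.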

The Learning Effectiveness of Educational Robots
This study meta-analyzed 17 articles that examined the learning effectiveness of educational robots. Educational robots were found to have a moderate but significantly positive effect on student learning outcomes. This finding suggests that educational robot-based classroom instruction tends to produce better learning outcomes than traditional lecture-style teaching. This finding corroborates the positive results reported in previous review studies [5,34,37]. Several possible explanations might contribute to why students who received robot-based classroom instruction had better learning outcomes than those who were taught by traditional teaching methods.
First, some advantageous features of educational robots including the possibility of performing repetitive tasks precisely [31] can be leveraged for teaching purposes if they match the instructional goals. Woo et al. [37] found that social assistance humanoid robots (e.g., Pepper and NAO) have been used to support classroom teaching by reducing time-consuming and tedious repetitive activities. As a teaching tool, the repeatability of educational robots would provide more in-class time for collaborative learning activities. Thus, teachers are allowed more time to monitor student learning progress and provide timely assistance when problems occur.
Another explanation for this finding may be that educational robots can facilitate learning [70]. The use of educational robots can help to inspire curiosity and an enjoyable learning environment via interesting activities and practical experiences [23]. In addition, Alemi et al. [56] argued that the pleasure of the robot brought interesting interactions and lowered students' anxiety during the learning activities. The robot's friendly attitude towards children added an element of non-judgment and comfort to the interaction while learning, which in turn lowered the fear of making mistakes.
Third, previous reviews (e.g., Cheung and Slavin [71]) suggested that a sample size of 250 or fewer can be treated as small. Small sample size is a common concern for nearly all studies included in the meta-analysis. Compared with those with large sample sizes, studies with small sample sizes tend to be more strictly controlled and thus there is a higher chance of obtaining positive results [72]. In addition, the file-drawer effect is more likely to occur in studies with small sample sizes where null effects are found.

Moderators for Educational Robots on Learning Effectiveness
Five factors were included in the moderator analysis. With regard to educational level, the difference between the summated ESs of the educational levels was statistically significant; however, given the small number of studies included, interpretation of the ESs must be carried out with caution [73]. There were quite strong effects for secondary school students (note that k = 3) and higher education students (note k = 8), a moderately strong effect for pre-school children, and a strong effect for primary school students. These divergent findings do not form an easily discernible pattern. However, one might tentatively speculate that robots are still limited in their capacities to perceive the human world [74]. Moreover, robots are typically used in situations where the lessons are short and well-structured, and delivered with little adaptation to the needs of an individual learner or a curriculum [75]. Therefore, the smaller effect of educational robots on student learning at the pre-school and primary levels might be attributed to factors such as the learning content, activity format, or course requirements at those levels.
In addition, Fernández-Llamas et al. [76] noted that younger students were more familiar with robots and were also more likely to believe that robots could think [77]. However, findings of some studies suggest otherwise. For example, Serholt [78] found that children expected that a robot could understand their intentions like a human teacher does. When the robot failed to do so, children would perform the task on their own or stop interacting with the robot. The study also pointed out the paradox that robots with a humanoid appearance actively participated in a learning activity yet were deficient in social interaction and lacked cooperation skills. Together, these findings suggest that the adoption of educational robots at the pre-school and primary levels requires a more careful instructional design. Moreover, looking ahead, we hope that robots will someday reach an adequate level of humanlike perception and communication abilities to interpret learners' intentions, much like human teachers do.
Second, with regard to the type of assessment, the difference between the summated ESs of the groups was found to be statistically significant. The meta-analysis revealed that educational robots had a stronger impact on student attitude and were slightly weaker in improving theoretical and skill examination scores. Chin et al.'s [30] study found that students' attitude toward the educational robot was positive. In another study, Benitti [34] reported that the young people with different interests could potentially benefit from educational robot-based instruction. It is possible that interaction with robots can increase student motivation, engagement, and attitude towards education. Given these findings, future research on the application of educational robots should consider exploring how these tools in school environments can be better employed to promote the learning of knowledge and skills.
Third, no statistically significant differences were found for the moderators of subject area and robot type. This seems to be an unsurprising result. The effectiveness of educational robots depends on various factors, and they must be adaptable in real time [7]. In general, there appears to be no discrepancy in learning outcomes if the robot chosen is suitable for the specific discipline. Prior studies showed that educational robots can help improve student learning outcomes in mathematics and science courses [27,79]. Chang et al. [31] reported that socially assistive robots can promote English language skills. However, to meet the learning purposes of incorporating robots in classroom settings, they must be deployed and studied in the context of their primary use. For example, it is unreasonable to use complex robots with pre-school children in a classroom setting. As such, future research should explore the use of different types of robots for different learning activities.
Finally, the differences among the three categories of treatment duration were not statistically significant. This finding is in line with results of recent reviews (e.g., Cheung and Slavin [71]) which have consistently found that technology has no effect in improving student learning in the long run. Moreover, another factor for the nonsignificant result might be learning intensity because not every study reviewed in this meta-analysis indicated whether students spent an equal amount of time using educational robots.

Conclusions
Undoubtedly, educational robots will be expected to take on a more vital role in schools in the future. Therefore, how educational robots can be best integrated into classroom instruction is a question that deserves greater attention. Despite the proposed advantages of incorporating educational robots in student learning [5,23,34,80], currently researchers have not given particular attention to the overall effectiveness and parameters of successful intervention aiming at the use of educational robots in school settings. To address this gap, this study, therefore, conducted a meta-analysis of the effects of educational robots in the classroom.
This study found supporting evidence for the positive effects of educational robot-based interventions on student learning (mean ES = +0.57), suggesting that educational robots can be leveraged to facilitate student learning. Moreover, the homogeneity test indicated that there was a high level of heterogeneity among the effect sizes of the studies reviewed in this meta-analysis. To further investigate this heterogeneity, five factors including course type, education level, treatment duration, assessment type, and robot type were quantitatively assessed using the moderator analysis. Our findings partially support and enrich the existing research in various ways. Some of the findings provide guidance and direction for the process of educational robot operation. In terms of treatment duration, the usefulness of robotics-based instruction remains stable as the duration of implementation is extended. What is clear from investigating the moderator effect of subject area and treatment duration is that the choice of robot usually depends on practical considerations.
Overall, course designers and teachers can use these results in course design and facilitation of learning to improve students' learning in educational robot-based courses and to differentiate their practices according to course level (e.g., children and undergraduate students), discipline, and robotic type. Finally, we hope that the results of this study can further advance our understanding of implementing educational robots in formal learning settings, and enhance the quality of education for students engaged in robot-based learning.

Limitations and Future Research
A first limitation was that this study only focused on several factors that might affect the effects of educational robot interventions. It did not consider other factors such as gender differences or socio-economic status which might also influence student learning outcomes. Future research should consider how these and other factors might be related to students' academic performance involving the use of robotic technology in education settings.
Another limitation was that most randomized-controlled studies included in the meta-analysis were conducted with children aged 8 to 12 years old. Further research should be conducted to explore how these findings can be applied to different age groups.
Finally, although it is central to analyze specific characteristics of the intervention's influence on the effect of educational robots in formal learning environments, in the future it will be valuable to explore interaction effects among moderators, which can provide valuable information to answer questions such as "do these intervention components amplify or attenuate each other's effectiveness?"

Data Availability Statement:
The datasets generated and/or analyzed during the current study are available from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.