1. Introduction
There is growing interest in integrating computational thinking (CT) into early childhood education, given its significant contribution to the development of 21st-century skills such as creativity, problem-solving, and digital competence [1]. One of the most widely used tools to promote CT is the screen-free programmable robot targeted at children from the age of 3 [2,3]. Robotics offers an appealing approach to developing CT, as it provides children with a tangible way to observe the immediate effects of their interactions with the robot through its behavior [2]. However, screen-free programmable robots aimed at preschool-aged children, such as BeeBot [4] or KIBO [5], offer limited activity options, which can quickly become repetitive for young learners. These screen-free robots were informed by the theory of Constructionism [6], which argues that children construct their knowledge and thinking by creating their own artifacts with all sorts of materials, especially a computer. In this study, we instead seek to develop a robot rooted in Sociocultural Theory [7], which argues that children’s cognitive functions are the products of their social interactions. To create a digitally mediated context that stimulates children’s positive social interactions with the robot as well as with other people (e.g., peers and teachers), new robotic platforms that differ from existing ones are needed, opening up innovative, playful, and appealing ways to foster student motivation and sustain interest in CT activities over the long term [1].
Furthermore, most existing coding kits use tangible or graphical interfaces, while alternatives such as voice user interfaces are not yet widely used or evaluated [8]. Evidence suggests that voice user interfaces can foster peer interactions among three- to four-year-old children [9]. Incorporating these interfaces into activities designed to enhance CT could merge social engagement with cognitive development. The design of the interface plays a crucial role in determining what users are able to do [10]. Exploring new forms of programming is therefore a vital endeavor to equip preschoolers with suitable tools for their initial journey into CT education.
To overcome the limitations of currently available coding kits for young children, we suggest introducing screen-free programming of high-capacity robots in early childhood as a valuable tool to promote CT and to improve or extend the traditional approach to educational robotics. The rapid progress in artificial intelligence technology, featuring interfaces that enable interaction through gestures, touch, and speech [11], has significantly expanded the ways in which children can engage with technology and interact with robots, thereby broadening the scope of concepts and skills that can be addressed. This study aims to investigate the effectiveness of a program based on interaction with a collaborative voice-controlled robot in improving children’s CT skills.
1.1. Computational Thinking Skills in Early Childhood
CT is a 21st-century skillset required for solving complex problems, with principles rooted in the field of computer science [12]. Wing [12] defines CT as the thought processes involved in understanding a problem and expressing its solutions in such a way that a computer can potentially perform them. It is considered as important as reading, writing, and arithmetic, and can be applied in both computing and non-computing disciplines, including everyday life [12]. Shute et al. [13] identified the key components of CT, namely, “decomposition, abstraction, algorithm design, debugging, iteration, and generalization” (p. 142). These key CT practices align with the powerful ideas proposed by Bers [14], who argued that CT for young children should include at least algorithms, modularity, control structures, representation, hardware/software, debugging, and the design process. To establish a more comprehensive framework, Zeng et al. [15] conducted a systematic literature review analyzing 42 studies. Their analysis revealed representation, debugging, decomposition, and algorithmic thinking as the most prominent CT components, which have been highlighted as key aspects in the understanding and practice of CT (e.g., [16,17]).
Representation refers to how computational ideas are abstracted and described. A total of 9 out of the 42 studies in the systematic review incorporated representation, teaching children to represent algorithms and computational concepts graphically through drawings, objects, or programming interfaces [15,18]. Being able to represent algorithms aids in communicating and explaining solutions [14].
Debugging involves identifying and fixing errors. It was a focus in 23 out of the 42 studies [15]. Young children were found capable of debugging simple programs or algorithms through trial and error [15,18]. Debugging improves as children progress from fixing concrete errors to restructuring logic and considering edge cases [18,19].
Decomposition supports breaking down problems into smaller, more manageable parts. A total of 16 out of the 42 studies featured decomposition through activities like modularizing programs, dividing a task among agents, or disassembling an object’s functions [15,18]. Decomposition is developmentally appropriate even for preschoolers and improves problem-solving abilities [14,18].
Algorithmic thinking refers to designing step-by-step procedures to solve problems. A total of 13 out of the 42 studies examined algorithmic thinking by having children design solutions and solve tasks sequentially [15]. Algorithmic design is an increasingly emphasized 21st-century skill and can be nurtured through play-based, scaffolded learning experiences [12,19].
Given the lack of consensus on which CT components to teach beginner coders in early childhood education (ECE), the framework developed by Zeng et al. [15] identifies key CT components suitable for ECE along three dimensions: concepts, practices, and perspectives. The concepts encompass control flow/structures, representation, and hardware/software. The practices include algorithmic design, pattern recognition, abstraction, debugging, decomposition, iteration, and generalizing. The perspectives focus on expression, creation, connecting, perseverance, and choices of conduct. This framework offers a comprehensive structure that supports the incorporation of CT into educational curricula, skill assessment, and educational research, promoting holistic and adaptive development during the early years of education.
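As a compact restatement (not part of the study’s materials), the taxonomy can be encoded as a small data structure. The component names below are taken verbatim from the text; the structure itself and the targeted set are our illustration, mirroring the components listed in the following paragraph (where the paper’s “algorithmic thinking” corresponds to the framework’s “algorithmic design”).

```python
# Illustrative encoding of Zeng et al.'s three-dimensional CT framework [15].
# Component names are verbatim from the text; the structure is our restatement.
CT_FRAMEWORK: dict[str, list[str]] = {
    "concepts": ["control flow/structures", "representation", "hardware/software"],
    "practices": ["algorithmic design", "pattern recognition", "abstraction",
                  "debugging", "decomposition", "iteration", "generalizing"],
    "perspectives": ["expression", "creation", "connecting",
                     "perseverance", "choices of conduct"],
}

# Components this study targets (see the next paragraph; "algorithmic thinking"
# there corresponds to "algorithmic design" in the framework).
TARGETED = {"representation", "hardware/software", "algorithmic design",
            "debugging", "generalizing", "connecting", "perseverance",
            "choices of conduct"}

all_components = {c for group in CT_FRAMEWORK.values() for c in group}
assert TARGETED <= all_components
print(f"{len(TARGETED)} of {len(all_components)} framework components targeted")
```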
This study adopts a broader perspective by utilizing Zeng et al.’s framework [15] to explore potential components that can be addressed during tasks involving a voice-controlled collaborative robot, such as representation, hardware/software, algorithmic thinking, debugging, generalizing, connecting, perseverance, and choices of conduct.
1.2. Effects of Interaction with Robotics on Computational Thinking among Young Children
The introduction of robotics into early childhood education has gained significant attention in recent years. A survey of the literature reveals a growing interest in the potential benefits of integrating robotics into the classroom setting to foster CT among young children.
Brennan and Resnick [19] argue that CT is not just about coding but also includes systematic problem solving, logical reasoning, and creativity. Tedre et al. [20] further argue that CT skills should be updated to include new concepts and problem-solving processes related to machine learning, which is strongly relevant to the current era of artificial intelligence (AI). The literature suggests that the use of robotics can have a significant impact on the development of CT skills among young children (e.g., [1,21,22]). Bers et al. [21] found that children who engaged in developmentally appropriate robotics programming activities showed enhanced CT skills and mastery of coding concepts. Similarly, Yang et al. [22] noted improvements in CT skills among children involved in robotics coding activities, especially when the children were engaged in a culturally responsive and embodied learning approach.
Several studies have examined the ways children interact with robotics. Sullivan [23] observed that children often engage in iterative processes of designing, testing, and refining their robotic creations, mirroring the CT process. Yang et al. [22] found that children’s interaction with coding robots can foster not only CT skills but also social skills, as children often work collaboratively on robotics projects.
Emerging research on robotics reveals promising effects on young children’s CT, with systematic reviews suggesting positive outcomes [8,24]. Bers et al. [21] reported that children as young as four years old can successfully program and control robots, demonstrating early computational thinking skills. Similarly, Yang [1] found that robotics activities can support young children’s development of CT skills. The reviewed literature thus suggests that interaction with robotics can positively influence the development of CT skills among young children. Moreover, Pugnali et al. [25] found that the user interface (the graphical coding app ScratchJr vs. the tangible programmable robotics kit KIBO) can affect children’s acquisition of CT. However, further research is needed to explore the effects of innovative user interfaces (e.g., voice user interfaces) for integrating robotics into early childhood education.
This study investigates how a voice-controlled collaborative robot influences the development of computational thinking (CT) skills in preschool children. To address this, we designed a program based on interaction with the robot and aimed to answer the following research questions:
RQ1: To what extent does a program centered on interaction with a collaborative voice-controlled robot foster computational thinking in preschool children?
RQ2: What CT components do children engage with while developing collaborative tasks through interaction with a collaborative voice-controlled robot?
To answer these questions, we measured changes in children’s computational thinking skills before and after participating in the robot-based program through pre- and post-tests. Additionally, we identified and classified the CT components demonstrated by children during collaborative task development with the robot.
2. Materials and Methods
In this study, we investigated the effects of a child–robot collaboration program based on interacting with a voice-controlled robot on the CT of preschool children. We used a quasi-experimental design with pre- and post-tests to evaluate the effects of the intervention on 34 preschoolers’ CT. This design was chosen for two reasons. First, because the study was conducted as part of a summer robotics workshop, random assignment to control and experimental groups was not feasible. Second, the quasi-experimental design allows for meaningful comparisons despite this limitation, providing a clear picture of the intervention’s impact [26].
A mixed-methods design was used to leverage the strengths of both quantitative and qualitative data collection, as it provides more informative, comprehensive, balanced, and useful research results [27].
The study took place in an extracurricular context in a small city in southern Chile. An open call was made for children aged 4 to 6 years to participate in a summer workshop on collaborative robotics. The workshop was free and open to the general community. The parents of the children interested in participating were informed about the objective of the study and its implications. Informed consent was obtained from all subjects involved in the study in accordance with the Declaration of Helsinki, and the study was approved by the Ethics and Bioethics Committee of Universidad Austral de Chile on 22 April 2022.
In total, 34 preschoolers participated (19 girls and 15 boys, ages 4–6, mean age 5.41 years, SD = 0.70), with 88.2% having no prior coding experience. Of these, 33 children attended all the sessions.
2.1. Voice-Controlled Collaborative Robot
The robotic arm setup, shown in Figure 1, is adapted to be voice-controlled, enabling children to engage with the robot directly through speech. Our system includes the Ufactory Lite 6 robot arm, equipped with a vacuum gripper for manipulating objects and a microphone–speaker combo for receiving vocal instructions and providing audible responses. The Ufactory Lite 6 is a collaborative robot designed to work safely alongside humans due to its inherent safety features.
Although the collaborative robot has inherent safety features for human interaction, we delineated its operational zone with four QR codes to prevent children from entering the workspace, while also allowing the robot to access four additional positions aligned with color-coded buckets for task-specific object placement. Moreover, a supervising adult was positioned between the child and the robot, and the robot’s speed was reduced, further strengthening the safety measures during the experimental phase.
Interacting with the robot follows an intuitive sequence of steps, beginning with the system’s activation through a voice-initiated wake-up word (see Figure 2). The wake-up word used was “Manito”, a Spanish word meaning “Little Hand”.
Upon recognition of the wake-up word, the system signals its readiness to receive instructions with an auditory cue, opening a six-second window for the child to issue a command. This fixed duration is a deliberate design choice, diverging from dynamic speech-detection systems to better accommodate the varied pacing at which children articulate their thoughts. It makes the system more robust, particularly for younger users, by preventing premature termination of command capture due to hesitations or pauses in speech.
After a command is given, the system employs a deep learning speech-to-text (STT) model to transcribe the spoken instructions. An automatic semantic analysis is then conducted on the transcribed text, using a Large Language Model (LLM) to interpret the intended action. This analysis considers the action to be performed, the object involved, and any specified attributes such as color or size, though not all attributes are required for every command.
The final step in the process utilizes computer vision to identify and determine the positions of objects within the robot’s working environment. These spatial data, combined with the insights from the semantic analysis, inform the robotic arm’s actions. The system requests clarification through an audio message if a command is unclear or unfeasible. Successful command execution is followed by a confirmation message indicating the task’s completion and the system’s readiness for further instructions.
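To make the pipeline concrete, the sketch below traces one utterance from transcription to execution in Python. It is a minimal illustration under stated assumptions: the study does not name its STT model, LLM, or robot SDK, so the wake-word detection and six-second audio capture are abstracted into a transcript string, the LLM semantic analysis is stood in for by a toy keyword parser over Spanish commands, and the vision and motion steps are reduced to a confirmation message.

```python
# Minimal sketch of the voice-command pipeline (wake word -> 6 s capture ->
# STT -> semantic analysis -> execution). All names are illustrative; the
# study's actual STT model, LLM, and robot SDK are not reproduced here.
from dataclasses import dataclass, field

@dataclass
class Command:
    action: str                                     # e.g., "pick_up", "release"
    target: str | None = None                       # object involved, e.g., "cube"
    attributes: dict = field(default_factory=dict)  # optional, e.g., {"color": "blue"}

# Stand-in for the LLM semantic analysis: a toy keyword parser over the
# transcribed (Spanish) command. Returns None when the command is unclear,
# which in the real system triggers an audio request for clarification.
def parse_command(transcript: str) -> Command | None:
    text = transcript.lower()
    actions = {"toma": "pick_up", "suelta": "release", "baja": "lower", "sube": "raise"}
    colors = {"azul": "blue", "rojo": "red", "blanco": "white"}
    action = next((v for k, v in actions.items() if k in text), None)
    if action is None:
        return None
    color = next((v for k, v in colors.items() if k in text), None)
    target = "cube" if "cubo" in text else None
    return Command(action, target, {"color": color} if color else {})

def handle_utterance(transcript: str) -> str:
    """Process one transcribed utterance and return the spoken reply."""
    cmd = parse_command(transcript)
    if cmd is None:
        return "No entendí, ¿puedes repetirlo?"     # clarification request
    # The real system queries computer vision for object positions here and
    # sends the resulting motion to the robot arm; we only confirm completion.
    parts = [cmd.action, cmd.attributes.get("color", ""), cmd.target or ""]
    return "Listo: " + " ".join(p for p in parts if p)

print(handle_utterance("toma el cubo azul"))        # -> Listo: pick_up blue cube
```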
2.2. The CT Program
We developed an intervention program to promote CT via the aforementioned voice-controlled robot. The program involved an early childhood teacher and one of the authors, an early childhood learning scientist. The researcher’s role was to conduct the learning experiences with the children, mediate their interaction with the robot during the execution of tasks, explore the students’ thinking, and apply the data collection instruments. The early childhood teacher’s role was to support the work with the children when they were not interacting with the robot, assist the children if they needed to leave the room, and maintain order within the group. Both offered suggestions for developing and revising the program’s activities, with a focus on enhancing children’s CT learning. The three-dimensional CT framework of Zeng et al. [15] guided the design of these learning activities. The program consisted of five one-hour workshop sessions in total. To accommodate parents’ schedules, each day’s workshop was offered in three consecutive time slots (4:00 PM, 5:00 PM, and 6:00 PM) with a fifteen-minute break between sessions. Children could be enrolled in any of the three groups based on their parents’ preference.
In the first session, the children’s CT was assessed at the beginning of the intervention, representing their prior knowledge. This session focused on the components of representation, hardware/software, and choices of conduct through activities aimed at familiarizing the children with the robot. The session opened with the researchers introducing themselves and leading a game to unite the group and help the children get comfortable in the laboratory. A pre-test was then administered to each child individually by the researcher who participated in the intervention. Following this, a brainstorming activity allowed the children to share their existing knowledge about robots and discuss what they wished to learn. Next, “Little Hand”, the robotic arm, was presented; its functionality was explained, and interaction guidelines were established, such as the robot being able to listen to only one person at a time and follow only one instruction at a time. Additionally, to illustrate the robot’s object-grasping capability, the children were taught to lift a piece of paper with a straw as a tangible analogy to the robot’s vacuum gripper mechanism. Each child then engaged with “Little Hand” for the first time, directing the robot to pick up and release objects on a table. Open-ended questions about the robot’s mechanisms followed, and the children concluded with coloring activities featuring robot illustrations.
In the second session, we focused on algorithmic thinking, debugging, perseverance, and connecting. We began with a recap of the first session by asking the children to recall the robot’s capabilities and operational mechanics. For instance, we posed questions like “What command should we use to instruct ‘Little Hand’ to pick up the blue cube and deposit it in the blue box?”. During this session, the children’s task involved using the robot to move a white cube across the workspace to a white box, starting from different positions. To facilitate this exercise, we paired the children into teams of two, fostering teamwork and mutual assistance in accomplishing the task. In most cases, the children’s joint efforts were enough to place the cube successfully into the specified box (see Figure 3); in other cases, this was achieved with scaffolding or adult guidance.
In the third session, the objective was for the children to sort objects of a specific color into their corresponding boxes. As in the previous session, the focus was on the components of algorithmic thinking, debugging, perseverance, and connecting. Before engaging with the robot, the children participated in a game featuring “Little Hand” in which an adult imitated the robot arm’s behavior, offering a hands-on reminder of how to give the robot instructions. Afterward, the children, in pairs, undertook various tasks with the robot; for example, they were asked to sort all white objects into a white box, among other similar activities.
The fourth session focused on the components of generalizing, debugging, perseverance, and connecting. At the beginning, to recall the voice commands the robot operated by, we repeated the activity in which we pretended to be “Little Hand”, involving boards and pieces in small-group interactions. Next, we outlined the session’s goal of constructing a toy robot from two distinct parts: a large cube featuring an illustration of the robot’s torso and a smaller cube with an illustration of its head. These cubes were initially placed separately, with the smaller cube intended to be positioned on top of the larger cube to complete the robot’s form. The children were then instructed to use voice commands to direct “Little Hand” to lift the small cube and accurately place it on the large cube, thereby assembling the toy robot. To engage the children, they were allowed to select and color the head’s illustration before instructing “Little Hand” to perform the assembly with their chosen head.
In the fifth and final session, the children revisited concepts from previous sessions, which led to a discussion about the commands “Little Hand” required for specific tasks. The components of representation, hardware/software, algorithmic thinking, debugging, and generalizing were intentionally addressed here. After this review, the post-test was administered. Following the test, the children were introduced to a set of the robot’s capabilities that had not been shown previously, demonstrating that the robot could do much more. Specifically, a cardboard hand was attached to the robot, allowing it to wave and clap with the children. The children waved back joyfully and concluded the session with a final high five with “Little Hand”.
2.3. Data Collection Tool and Analysis
The TechCheck-K assessment developed by Relkin et al. [28] was used to evaluate the children’s CT. This unplugged assessment measures CT through 15 multiple-choice challenges drawn from six CT domains, corresponding to six of Bers’ “seven powerful ideas of computer science”: algorithms, modularity, control structures, representation, hardware/software, and debugging [14]. The test includes questions such as “Which one functions more like a computer?” and “This seesaw does not go up or down. How can it be modified to work?”
In the assessment, each of the fifteen multiple-choice questions earns one point if answered correctly, with no penalty for wrong answers; the test begins with two practice questions that do not count toward the final score. Each student’s total score is the sum of the question points. We then checked whether the sample was normally distributed in order to select the appropriate statistical tests for comparing pre-test and post-test results. Lastly, we analyzed and synthesized the data to identify key themes.
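The scoring rule is simple enough to express directly. The following sketch is an illustrative implementation with a hypothetical answer key and responses, not the scoring script used in the study.

```python
# Illustrative TechCheck-K scoring (hypothetical answer key and responses).
# Two practice items are discarded; each of the 15 scored items is worth
# 1 point, with no penalty for wrong answers.
from typing import Sequence

def techcheck_score(responses: Sequence[str], answer_key: Sequence[str]) -> int:
    """Total score: one point per correct answer on the 15 scored items."""
    scored, key = responses[2:], answer_key[2:]   # drop the two practice questions
    assert len(scored) == len(key) == 15
    return sum(r == k for r, k in zip(scored, key))

# Example: a child answers 11 of the 15 scored items correctly.
key = list("ab" + "abcdabcdabcdabc")              # 2 practice + 15 scored items
resp = key[:13] + ["x"] * 4                       # first 11 scored items correct
print(techcheck_score(resp, key))                 # -> 11
```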
We also conducted a qualitative content analysis of the video-recorded episodes of the fourth-session challenge faced by the children. This challenge was undertaken by 33 of the 34 children who participated in the intervention. Since the children were divided into three groups with consecutive sessions, 15 pairs collaborated on the challenge, while three children worked individually. The three-dimensional CT framework proposed by Zeng et al. [15] was applied to identify the CT components employed by the children during the task development (see Table 1). Following Braun and Clarke [29], we transcribed, coded, and analyzed 18 task videos to identify patterns and define emergent themes.
3. Findings
We compared pre-test and post-test CT scores to preliminarily investigate the effectiveness of the program based on interaction with a voice-controlled robot in improving children’s CT skills. We first analyzed whether the sample followed a normal distribution to determine whether a paired-samples t-test could be used. Given our sample size of fewer than 50, the Shapiro–Wilk test was employed to assess normality. The results indicated non-normal distributions for both pre-test (W(33) = 0.175, p < 0.05) and post-test (W(33) = 0.506, p < 0.05) scores, violating the normality assumption. Since parametric test assumptions were not met for studying gains between pre- and post-test CT, we used the Wilcoxon signed-rank test to evaluate our intervention’s effects.
On average, students demonstrated improved results in the post-test (M = 10.12, SD = 2.08) compared to the pre-test (M = 8.76, SD = 2.59). The Wilcoxon test indicated significant differences post-intervention (W(33) = 0.006, p < 0.05), even within such a short intervention period. This implies that the intervention positively affected students’ performance in the CT assessment.
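For transparency, this analysis can be reproduced with standard tools. The sketch below uses SciPy on simulated placeholder scores (the study’s raw data are not reproduced here), mirroring the normality check followed by the paired nonparametric comparison.

```python
# Analysis sketch with simulated placeholder data (NOT the study's raw scores):
# Shapiro-Wilk normality check, then Wilcoxon signed-rank on paired scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
pre = rng.integers(4, 13, size=33).astype(float)          # illustrative pre-test (0-15 scale)
post = np.clip(pre + rng.integers(0, 4, size=33), 0, 15)  # illustrative post-test gains

for name, scores in (("pre-test", pre), ("post-test", post)):
    w, p = stats.shapiro(scores)                          # suitable for n < 50
    print(f"Shapiro-Wilk {name}: W = {w:.3f}, p = {p:.3f}")

# Normality violated -> use the nonparametric paired test instead of a t-test
w_stat, p_value = stats.wilcoxon(pre, post)
print(f"Wilcoxon signed-rank: W = {w_stat:.1f}, p = {p_value:.4f}")
```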
Figure 4 illustrates the distribution of pre-test and post-test scores. In comparison, a similar average score (M = 10.65, SD = 2.58) is reported in [28] following a 7-week intervention with children aged 5 to 9. Given the positive correlation between age and performance observed in [28], our results with a younger cohort (4–6 years old) are promising.
We also qualitatively analyzed the video recordings of the episodes in which the children faced the final challenge, exploring which components of CT the children applied during the task’s development. Our findings suggest that the children employed many of the key CT components described in the framework of Zeng et al. [15] that were intentionally addressed in the sessions. These components include CT concepts, practices, and perspectives (see Table 1). The number of children using each CT component is shown in Figure 5.
Our results suggest that the CT concepts of “representation” and “hardware and software” were engaged by all the children who carried out the task. All the participating children demonstrated an understanding that certain words expressed in verbal language can represent actions, showing the ability to use them to direct the robot to perform a specific action. This understanding of “representation” was evident, for example, when the children would say “Little Hand, go down” for the robot to lower, “go up” for it to rise, and “grab it” for it to take an object. Similarly, the understanding that software provides instructions to the hardware, which receives and executes them, was present throughout the task. All the children understood that the voice recognition software provided instructions to the robot’s hardware, allowing it to understand and execute the commands they spoke. This was evidenced, for example, when the children gave a specific instruction to “Little Hand” and waited for it to execute and respond before proceeding with the following instruction.
The children also engaged in many CT practices during the task development. “Generalizing”, the ability to transfer a specific problem-solving strategy to a different context, was employed by 28 of the 33 children who tackled the challenge. At the beginning of the challenge, the children were told that the mission was to help “Little Hand” assemble a toy, and they were asked if they wanted to do it without being given any hints on how to solve the problem. With this, we aimed to explore whether the children would be capable of generalizing (see Figure 6). Our results suggest that the children were able to transfer the strategies they had used to tackle problems in previous sessions (for example, placing the blue cube in the blue box) to help “Little Hand” assemble the toy without significant difficulty.
A total of 30 out of the 33 children who faced the problem demonstrated decomposition. Most of the children could identify and sequence the various actions needed to complete the task, such as moving to the blue cube and picking it up, then moving to the red cube and releasing the blue cube on top of it.
Algorithmic thinking was practiced by 26 of the 33 children who solved the task; they were able to sequence the commands in an orderly manner to help “Little Hand” assemble the two-piece toy. The children instructed “Little Hand” to go to the blue cube, lower down, take the blue cube, rise, go to the red cube, lower down again, and finally, release the blue cube on top of the red cube, successfully solving the challenge of assembling the robot.
During the task, 15 of the 33 children debugged when encountering errors. These children recognized and corrected their mistakes, eventually completing the task successfully. A common error occurred when they instructed “Little Hand” to release the blue cube directly over the red cube without lowering it first. This resulted in the cube falling instead of being placed on top of the red cube, thus not fulfilling the task’s objective. The following transcription exemplifies this error and illustrates the children’s process of realizing and rectifying their mistakes.
Girl 7: Little Hand.
Little Hand: Give me the next instruction.
Girl 7: Little Hand, release it.
Little Hand: Ok, let me think.
Little Hand: Ready (robot releases the blue cube over the red cube).
Girl 8: We forgot to lower it.
Researcher: Did we succeed?
Girl 7: Yes.
Girl 8: No.
Researcher: No? Why do you say no? (to Girl 8)
Girl 8: Because it did not assemble like that one.
Researcher: And what do you think happened?
Girl 7: It fell.
Ten children completed the challenge flawlessly on their first attempt, with no need for correction. The remaining eight either had to be made aware of their errors or required assistance to make the adjustments needed to complete the task.
Three CT perspectives were observed during the task development: “connecting”, “perseverance”, and “choices of conduct”. Connecting unfolded naturally in the interactions among the children solving the task and between the children and the robot. Most of the children (30 out of 33) communicated and cooperated with each other and with “Little Hand” to solve the task. Cooperation was evident in how they organized themselves, taking turns interacting with the robot, in their commitment and interest in helping each other when difficulties arose, and in assisting “Little Hand” in completing the task. Communication occurred both verbally and non-verbally. Verbal communication happened when the children interacted with the robot using voice commands and interpreted its responses, and also when they signaled to each other whose turn it was or pointed out that something was wrong or needed to be approached differently (for example, “You have to tell it to lower first”). Non-verbal communication occurred when children nodded to show agreement or disagreement or stepped aside to give their partner the chance to issue the next instruction.
“Perseverance” was demonstrated by 32 of the 33 children during the task development, both in overcoming the difficulties they experienced in getting “Little Hand” to understand them and in successfully completing the task. The children had to overcome various challenges associated with voice control of a robot, such as the limited time the robot waits to hear an instruction before declaring it unclear, requiring another attempt. This was a significant challenge in many cases because the younger children (ages 4 and 5) needed more time to think of an instruction and then verbalize it, which meant they had to try more than once before the robot executed it. Another challenge some children faced was language development: some had difficulty pronouncing certain words, especially those containing the letter “r”, which made it hard for “Little Hand” to understand them and necessitated multiple attempts.
The children also demonstrated that they could treat failure as a natural part of achieving a goal. When they made mistakes in the order of voice commands needed to complete the task, they showed a positive attitude, commitment, and enthusiasm to try again and achieve the objective.
Finally, “choices of conduct” was demonstrated by almost all the children who solved the challenge. Following the safety rules for interacting with the robot was a fundamental aspect of the experience: avoiding accidents, not touching the robotic arm, and respecting its workspace. Making conscious decisions about one’s behavior was evident when the children respected the interaction rules and used the available materials responsibly.
4. Discussion and Implications
This study investigated the effectiveness of a program involving interaction with a voice-controlled collaborative robot on the development of CT in children aged 4 to 6 years. Our findings suggest that integrating collaborative robots with voice interfaces into preschool education may be a promising strategy for enhancing CT and promoting cognitive, social, and emotional development in children at this early stage of their lives. Our analysis found statistically significant improvements in the children’s CT skills, as evidenced by the comparison between the initial and final assessments. Furthermore, we observed the successful application of various CT components while completing the tasks. This observational evidence indicates a considerable potential to bolster CT development and extend learning to concepts and skills beyond the established curriculum. Such an approach could significantly enrich the educational experience of preschool-aged children by broadening their cognitive horizons and encouraging the exploration of new ideas and problem-solving techniques.
The significant implications of our findings underscore the value of integrating high-capacity collaborative robots with voice interfaces into preschool education as a novel approach to enhancing CT from an early age. This addresses a notable gap in the extant literature, although previous reviews have indicated voice user interfaces as a promising strategy for engaging young children in CT learning with robots [8]. This period is critical for development, and voice interfaces play a pivotal role by enabling children to interact with technology intuitively, boosting their engagement and motivation for CT activities without the prerequisite of reading or writing skills, as demonstrated in the present study. This method offers a direct and engaging learning experience: it allows children to apply programming concepts through voice commands, giving them a tangible sense of how their instructions influence the robot’s actions. Such interactions bolster their comprehension of computational concepts and bypass the barriers posed by conventional robotics kits and programming languages, which are often not designed with this age group in mind [30,31,32].
In addition, the results of this study expand the application field of human–robot collaboration, bringing robot arms closer to the educational realm as a powerful tool for integrating knowledge and extending the conventional approach to educational robotics in early childhood. As an interdisciplinary field, robotics offers a valuable contribution to enriching educational curricula [2,24,33]. Unlike screen-free programmable robots aimed at preschool-aged children, voice-controlled robots provide children with a concrete, tangible experience in which to apply scientific concepts practically, extending the boundaries of the knowledge and skills that can be fostered from early childhood. Through such robotics, children can begin to acquire knowledge about data science, artificial intelligence, and engineering practices, and they gain a valuable opportunity to develop necessary skills such as human–robot collaboration. In this context, it has been noted that for effective collaboration between humans and robots, humans must develop higher-order thinking skills such as CT, which allow them to communicate effectively with robots and interpret their behaviors, thereby developing a higher level of trust in interacting with them [34,35]. Future research should investigate the feasibility and challenges of implementing this robotic intervention within formal educational settings. This would involve addressing logistical considerations, such as the need for multiple robots to accommodate simultaneous work by several groups. Specifically, research should examine how to integrate the intervention into existing curricula and foster the development of computational thinking skills in collaborative group settings in which multiple students interact with voice-controlled robots. Furthermore, exploring the robot’s potential to enhance learning in other curricular areas, such as mathematics and language, would be valuable. Larger-scale studies with more diverse participant groups are essential to validate these initial findings and assess their generalizability.