Augmenting Mobile App with NAO Robot for Autism Education

Abstract: This paper aims to investigate the possibility of combining humanoid robots, particularly the NAO robot, with a mobile application to enhance the educational experiences of children with autism spectrum disorder (ASD). The NAO robot, interfaced with a mobile app, serves as a socially assistive robotic (SAR) tool in the classroom. The study involved two groups of children aged three to six years old, exhibiting mild to moderate ASD symptoms. While the experimental group interacted with the NAO robot, the control group followed the standard curriculum. Initial findings showed that students in the experimental group exhibited higher levels of engagement and eye contact. However, certain limitations were identified, including the NAO robot's limited capacity for concurrent interactions, language difficulties, battery life, and internet access. Despite these limitations, the study highlights the potential of robots and AI in addressing the particular educational requirements of children with ASD. Future research should focus on overcoming these obstacles to maximize the advantages of this technology in ASD education.


Introduction
Autism spectrum disorder (ASD) is a multifaceted neurodevelopmental condition characterized by varying degrees of difficulty with social interaction and communication and is often accompanied by repetitive behaviors [1]. Currently affecting 1 in 68 children [2], it remains a prevailing concern that, to date, lacks any definitive treatment [3,4]. While the exact etiology of this disorder remains elusive, various therapies and interventions have been developed to support children with ASD in achieving their full potential and leading fulfilling lives [5][6][7][8][9][10][11].
In an era of widespread digital presence, children with ASD are often observed to exhibit a strong affinity for technological devices such as tablets and smartphones. While these devices can provide an engaging medium for learning and entertainment, excessive dependence on them may inadvertently intensify the children's social isolation [12]. However, when such devices adopt human-like characteristics, they can help bridge the social gap and strengthen social skills. This has prompted the exploration of humanoid robots, such as Kasper [13], NAO [14,15], FACE [16], Bandit [17], ZECA [18], Zeno R25 [19], Puffy [20], Ifbot [21], Ichiro [22], and Pepper [23], as interactive tools for children with ASD.
The literature abounds with studies examining the role of robots in enhancing the social and communicative abilities of children with ASD [33][34][35][36][37][38][39][40]. These studies consistently reveal positive outcomes, such as improved social skills and increased eye contact. Among the multitude of robots employed for this purpose, the NAO robot, produced by Softbank Robotics [41], has emerged as a particularly popular choice, accounting for over 30% of the studies conducted in the last decade.
The extensive utilization of robots in assisting children with ASD represents an exciting intersection between robotics and educational therapy, providing a platform for the exploration of new pedagogical approaches. This paper aims to contribute to this growing body of literature by presenting our findings from a recent study.
The remainder of this paper is structured as follows: Section 2 discusses materials and methods, followed by Section 3 on results. The discussion is then presented in Section 4, and conclusions are provided in Section 5.

Materials and Methods
This paper proposes an SAR augmented with a mobile application for autism education (AE) in preschool and kindergarten settings. The research question is as follows: Will the SAR augmented with the mobile application influence the frequency of eye contact among autistic students during AE? An instance of eye contact is recorded when the child's gaze meets the teacher's gaze during a working session. The teacher counts and records these instances during each session.
To achieve our goal, an effective SAR suitable for AE in preschool or kindergarten must meet the following requirements:
1. Children must have the ability to communicate with the robot.
2. The robot must be capable of endlessly repeating commands.
3. The robot must be easily portable.
4. Children with autism should comprehend the robot's dialogues and queries.
5. The robot must accurately recognize each child's face.
6. The robot should accurately receive answers to questions from the children.
7. The program must organize all lessons for autistic children into fields.
8. The application must ensure that children with autism complete and master each field.
9. The program must generate an individual profile for each child.
10. The program must save progress charts in a database to track the child's learning and social skills advancement.
11. The program must be simple to comprehend and learn for new users.

Key Performance Indicators and Engineering Requirements
The SAR must meet the five key performance indicators: reliability, performance, accuracy, availability, and usability.
Reliability refers to the robot's ability to consistently perform its tasks without breakdowns, malfunctions, or requiring excessive maintenance. It is often measured by the mean time between failures (MTBF) or the probability of failure on demand (PFOD). A more reliable robot experiences fewer interruptions due to repairs or modifications. We aim for a PFOD of less than 0.001.
Performance measures the robot's speed, effectiveness, and throughput. These metrics can be quantified by the number of jobs completed per unit of time, energy utilization, or the ratio of work accomplished to the robot's capability. They also consider the robot's ability to handle complex tasks or varying workloads. For our SAR, an appropriate response time would be no more than two seconds to reply to any order.
Accuracy measures how well the robot's actions or outputs align with the intended goals. For example, in a factory setting, this could mean that a robot always puts parts in the right place, within minimal tolerances. Accuracy is essential for precise work, such as robotic surgery or micro-assembly. In our SAR, accuracy entails correct recognition of localized Arabic words. Thus, we set our accuracy target to correctly recognizing at least 200 Arabic words in the kindergarten setting.
Availability is another KPI; it indicates the robot's readiness and capability to perform tasks over time. It typically excludes time spent on maintenance, repairs, or program updates. High availability means that the robot operates most of the time, with minimal interruptions due to technical problems or software changes. Our SAR robot needs to be available during kindergarten hours, at least two hours a day, five days a week.
Usability refers to how easily individuals can interact with, understand, and operate the robot. It considers factors such as the control interface, programming flexibility, communication of status or issues, and safety for human interaction. The SAR should be user-friendly for children aged four to twelve and require no training. Additionally, teachers should be able to learn how the mobile app works and how to use it within 15 min.
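The five targets stated above can be collected into a single acceptance check. The following is a minimal sketch, not the authors' evaluation procedure: the `KpiMeasurement` record, its field names, and the sample values are hypothetical and serve only to show how measured values would be compared with the stated thresholds.

```python
from dataclasses import dataclass


@dataclass
class KpiMeasurement:
    """Hypothetical field measurements for one evaluation period."""
    failures: int                  # failed responses to a demand
    demands: int                   # total demands placed on the robot
    max_response_s: float          # slowest observed reply, in seconds
    arabic_words_recognized: int   # distinct Arabic words recognized correctly
    hours_per_day: float
    days_per_week: int
    teacher_learning_min: float    # time a teacher needed to learn the app


def check_kpis(m: KpiMeasurement) -> dict:
    """Compare measurements with the targets stated in this section."""
    pfod = m.failures / m.demands  # probability of failure on demand
    return {
        "reliability": pfod < 0.001,
        "performance": m.max_response_s <= 2.0,
        "accuracy": m.arabic_words_recognized >= 200,
        "availability": m.hours_per_day >= 2 and m.days_per_week >= 5,
        "usability": m.teacher_learning_min <= 15,
    }


# Made-up sample values, for illustration only.
sample = KpiMeasurement(failures=1, demands=5000, max_response_s=1.6,
                        arabic_words_recognized=212, hours_per_day=2.5,
                        days_per_week=5, teacher_learning_min=12)
print(check_kpis(sample))
```

The usability target for children (no training required) is not captured by a single number here; only the teacher's 15 min learning time is checked.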
The five key performance indicators of SAR robots for AE that are suitable for preschool or kindergarten are summarized in Figure 1. After a thorough evaluation, we determined that the NAO robot produced by Softbank Robotics [41] is well suited to our application because it supports Arabic language localization. With its 25 degrees of freedom (DOFs), the NAO robot offers considerable flexibility and mobility. Standing 57 cm tall, it is designed to encourage interaction with children, either sitting or at table level. The robot is equipped with two cameras: one on its head for facial recognition and another on its chin for navigation and environmental recognition. Furthermore, NAO includes voice recognition capabilities, a speech synthesizer, and various sensors such as an accelerometer, gyroscope, and force-sensitive resistors. Its eyes, fitted with changeable LED lights, add a further dimension to its interactive capability and enhance its appeal to users. The robot looks like a human, so interacting with it helps autistic children interact with humans more easily. Additionally, it may link to an external server or mobile phone to expand its capabilities. Table 1 maps the engineering requirements to the selected NAO specifications to justify our selection from an engineering perspective.

Requirement no. | Engineering requirement | Justification
2 | The robot shall be available five days a week, two hours daily. | The teaching sessions are very tiring for humans, so using a robot is very helpful.
3 | The robot weighs 5.4 kg. | The robot should be easy to handle and transport.
1,4,6 | The robot should be able to speak at least 200 words in Arabic. | Autistic children only understand their native language.
5,9,10 | The robot should have face recognition ability, and the system should differentiate each child's face. | Each child has their own field, score, and progress. This will help teachers to track each child's progress separately.
6 | The robot's probability of failure on demand should be less than 1/1000. | The system should calculate the accurate score for the child.
7,8,11 | The application shall not proceed to the following field unless the child completes several tests of the previous field with a score higher than 75/100. | This percentage will confirm that the information provided in the lesson has been fully understood by the child.
10 | The application should have children's scores and a progress analysis database. | This can also be used to promote good behavior, reward children for their progress and behavior, and show their progress to the teacher.
11 | The application should have a simple interface and be learned in 15 min. | People of different age groups should be able to use it easily.

The Robot Program Structure
The front-end programming of the NAO robot involved the use of Python, C++, and Choregraphe modular programming, as shown in Figure 2. The backend smartphone application was developed using Java. The smartphone application serves as an interface between the user and the NAO robot, facilitating class management, monitoring student progress, and assigning new users to classes. Firebase is utilized to construct system databases that collect, organize, and analyze data. The system was designed to retain the progress of each child and to foster a stronger connection with the child by recognizing their strengths and weaknesses in each lesson.
To achieve this, we augmented the NAO robot with an external smartphone application that creates and stores a database, enabling a personalized profile and personalized education for each user. The application also enhances the interaction and connection with the end user. To implement more individualized training, the system must monitor each student's completed lessons and give the teacher more control over the child's development. This provides control elements that let a student advance to the next session only if they achieve satisfactory scores on the exam. The robot should save this information and access it once it recognizes the child's face, at which point it should engage with the child appropriately.
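The advancement rule can be illustrated with a short sketch. It uses the 75/100 threshold given in Table 1; the number of tests that must be passed is an assumption, since the requirement says only "several tests", and the function name `can_advance` is hypothetical rather than taken from the application.

```python
PASS_MARK = 75        # from Table 1: score must be higher than 75/100
REQUIRED_PASSES = 3   # assumption: the source says only "several tests"


def can_advance(test_scores, pass_mark=PASS_MARK, required=REQUIRED_PASSES):
    """Return True when enough tests of the current field exceed the pass mark."""
    passed = [score for score in test_scores if score > pass_mark]
    return len(passed) >= required


print(can_advance([80, 90, 76]))   # three scores above 75
print(can_advance([75, 90, 80]))   # 75 is not higher than 75, so only two pass
```

Keeping the threshold and the required count as parameters would let a specialist adjust them per child without changing the rule itself.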
Based on the recommendations of the social consultant, each student is allocated a class with individualized instructions and assessments. The robot will manage voice recognition, speech synthesis, and picture identification, making crucial decisions. The smartphone, on the other hand, will manage several accounts and keep track of exam results and completed courses. The relationship between NAO and the smartphone application is shown in Figure 3. Further details on the structure of the smartphone application are given in the next section.

Mobile Application Structure
The mobile application acts as a backend, connecting the instructor to the child through the NAO robot, as shown in Figure 3. On first launch, the mobile application allows the user to sign up for a new account, log in, or connect with an existing Google account. Once logged in, the user can access their profile to update personal information such as their name, password, and contact information. They can also start using the application by navigating to the Class Tab. The complete functional structure of the smartphone application is depicted in Figure 4. Within the Class Tab, the instructor can create a new class, add students to a class, search for a student using their civil ID, check the progress chart of any student in their classes, or initiate a new lesson at a specific level.

The System Classes' Structure
The proposed approach relies on object-oriented classes and methods [42][43][44][45]. A system class diagram provides a visual representation of the system's components, their interactions, and relationships. Figure 5 displays various classes, their methods, and the interconnections between them. Each block is divided into an upper part and a lower part. The upper part represents the class's variables, while the lower part represents the class's member functions. The system has three major components: Teacher, Classroom, and Child. A teacher may be allocated to more than one classroom, which is shown by a line connecting Teacher and Classroom with an asterisk on the Classroom side. Conversely, since each child can be assigned to only one classroom, the line connecting Child and Classroom has a number 1 on the Classroom side and an asterisk on the Child side. The relationship between the Child component and the Lesson component is many-to-many, which implies that each student may have several lessons, and a lesson can be assigned to multiple children. Therefore, asterisks are displayed on both sides of the edge that connects Child to Lesson. The same applies to the Child class and the Test class. The dotted line relation represents a class function that takes an instance of both connected classes as arguments to record a global variable in the database. In this case, we have the Takes class, which records which lesson is taken by which child, and the Test class, which records which test is taken by which child.
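The multiplicities described above can be sketched in code. This is an illustrative sketch only: the application itself was written in Java, and the attribute and method names below (such as `add_child`) are hypothetical rather than taken from Figure 5.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lesson:
    title: str


@dataclass(eq=False)   # compare by identity to avoid recursive comparisons
class Child:
    name: str
    classroom: Optional["Classroom"] = None   # at most one classroom per child


@dataclass(eq=False)
class Classroom:
    title: str
    children: List[Child] = field(default_factory=list)

    def add_child(self, child: Child) -> None:
        # Enforce the one-classroom rule: remove the child from any previous classroom.
        if child.classroom is not None:
            child.classroom.children.remove(child)
        child.classroom = self
        self.children.append(child)


@dataclass
class Teacher:
    name: str
    classrooms: List[Classroom] = field(default_factory=list)   # one teacher, many classrooms


@dataclass
class Takes:
    """Association record: which child took which lesson (many-to-many)."""
    child: Child
    lesson: Lesson


teacher = Teacher("T1")
room_a, room_b = Classroom("A"), Classroom("B")
teacher.classrooms += [room_a, room_b]
child = Child("C1")
room_a.add_child(child)
room_b.add_child(child)   # reassignment: the child now belongs only to room B
records = [Takes(child, Lesson("Colors")), Takes(child, Lesson("Animals"))]
```

Modeling Takes as a separate record, rather than a list inside Child or Lesson, mirrors the association class in the diagram and leaves room for per-attempt data such as dates.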
The Teacher class holds the teachers' credentials. The Child class, by contrast, stores no credentials; instead, a child is identified automatically by the identifyChild method, which compares the image supplied by the robot with the stored picture of the child to retrieve the child's profile. The Child class also stores additional child-specific information. The Classroom class stores the class's title, number, photo, start and finish dates, and duration, and has one method for creating a class. The Taking Test class stores the score and the test date and has two methods to set and retrieve the scores. The Test class keeps track of which child completed which test number. The Lesson class stores the lesson title and has one method to initiate the lesson by sending a unique code for the robot to start the corresponding routine. The Question class stores questions and answers and has two methods: one for asking the question and another for checking the answer. The Takes class acts as a control flag to track which student took which lesson.
Figure 6 shows the mobile application's user interface. Figure 6a displays the landing page, which offers teachers three options: to log in, create a new account, or connect using an existing Google account. Figure 6b provides a snapshot of the fields accessible within the application, with vertical scrolling to view all fields. Figure 6c shows the available lessons, again with vertical scrolling to view all lessons. Lastly, the child's progress chart is available from the Class Tab. This feature provides teachers with a quick overview of the child's previous lesson, highlighting their strengths and weaknesses. This information aids in tailoring the current lesson for a personalized educational approach.

Robot Identification Protocol
The auto-identification feature will identify the child once they appear in front of the robot, matching their image with the photo stored in the database. Upon successful identification, the robot initiates a personalized greeting, addressing the child by name to establish a social bond and set the stage for the upcoming lesson. Figure 7 provides a detailed illustration of this identification sequence protocol.
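The identification step can be sketched as a lookup followed by a greeting. The sketch below makes no claim about how the robot's face recognition works: `recognize` is a hypothetical callable standing in for it, the greeting strings are invented, and the fallback for an unrecognized face is an assumption, since the text describes only the successful case.

```python
def identify_and_greet(recognize, profiles, image):
    """Return (child_id, greeting) for a captured image.

    recognize: hypothetical callable mapping an image to a child ID or None.
    profiles:  dict of child ID -> profile dict with at least a "name" key.
    """
    child_id = recognize(image)
    profile = profiles.get(child_id)
    if profile is None:
        # Assumed fallback: unknown face, so no profile is loaded.
        return None, "Hello! What is your name?"
    return child_id, f"Hello, {profile['name']}! Are you ready for today's lesson?"


# Stub recognizer and made-up profile, for illustration only.
known_faces = {"img-007": "c7"}
profiles = {"c7": {"name": "Sara"}}
print(identify_and_greet(known_faces.get, profiles, "img-007"))
print(identify_and_greet(known_faces.get, profiles, "img-999"))
```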

Lesson Protocol
The authors successfully implemented a range of lessons, encompassing topics such as color, food, body parts, personal hygiene tools, transportation, movement, and animals. Each lesson adheres to a standard protocol, as demonstrated in Figure 8, which outlines the sequence for a "repeat the object name" lesson. Upon welcoming the child, the robot prompts them to name a presented object chosen by the teacher. The robot then pauses to listen for a response. If the child answers incorrectly within a specified timeframe, the robot offers three additional attempts. If these attempts are unsuccessful, the lesson is recorded as incomplete, and the robot moves on to the next topic. The child's progress chart records these incomplete lessons for future review and revision. The interaction between each lesson and the child varies based on the child's level and profile. Figure 9a shows that some lessons will require the child to emulate the robot, while other lessons will ask the child to select the correct card from a stack of cards, as shown in Figure 10 (an example for the animal lesson). The child will be instructed to hold the correct animal card in front of the robot to check the answer, as depicted in Figure 9b. In this case, the child is holding a bird card (card number 11 in Figure 10) in front of the robot, and the robot will perform image processing and check if the image matches the question asked. Depending on the difficulty level, the robot could imitate the animal's sound, mention the animal's name, or describe its features. See Supplementary Materials for more information at https://openmylink.in/NAOMedia.
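The retry logic of the "repeat the object name" protocol can be sketched as a small loop. This is a simplified reading of Figure 8, not the deployed Choregraphe routine: `listen` is a hypothetical callable standing in for the robot's speech recognition, and treating a timeout as a failed attempt is an assumption.

```python
def run_repeat_lesson(target, listen, max_extra_attempts=3):
    """Run one 'repeat the object name' item.

    listen: hypothetical callable returning the recognized word, or None on timeout.
    Returns a dict with the outcome and the number of attempts used.
    """
    total_attempts = 1 + max_extra_attempts   # first try plus three additional attempts
    for attempt in range(1, total_attempts + 1):
        heard = listen()
        if heard is not None and heard.strip().lower() == target.lower():
            return {"status": "complete", "attempts": attempt}
    # All attempts used: record as incomplete so it appears on the progress chart.
    return {"status": "incomplete", "attempts": total_attempts}


# Simulated child responses: wrong word, silence, then the correct word.
answers = iter(["dog", None, "cat"])
result = run_repeat_lesson("cat", lambda: next(answers))
print(result)
```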

Technical Description
The front-end solution was deployed on the NAO robot using Choregraphe 2.1.4. The backend software was developed using object-oriented programming (OOP) with the Java programming language, implemented using Android Studio 3.3 and Arduino IDE 1.8.8. These were deployed on an Android smartphone running version 7, equipped with 2 GB RAM and 32 GB storage capacity. The database was stored using a Firebase server [46] as a backend-as-a-service (BaaS) hosted with Google Cloud and integrated with the Bayt Alatfal cloud structure [47]. Firebase operates as a NoSQL database [48,49], employing JSON-like documents [50].
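To make the notion of a JSON-like document concrete, the sketch below shows what a per-child progress record might look like. The schema, key names, and values are entirely hypothetical and are not taken from the study's database; the example uses only Python's standard `json` module and makes no calls to Firebase.

```python
import json

# Hypothetical record layout; every key and value here is illustrative only.
child_record = {
    "childId": "c-001",
    "classroomId": "class-A",
    "lessons": {
        "animals": {"status": "complete", "attempts": 2},
        "colors": {"status": "incomplete", "attempts": 4},
    },
    "tests": [
        {"field": "cognition", "score": 82, "date": "2023-01-15"},
    ],
}

# A document store of this kind holds the record as nested keys and values.
encoded = json.dumps(child_record, indent=2, sort_keys=True)
decoded = json.loads(encoded)
print(encoded)
```

A nested layout of this kind lets the progress chart for one child be read with a single lookup by child ID, which fits the per-child profile requirement.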
Lessons and quizzes were designed in consultation with the teachers and the specialists from Bayt Alatfal special needs preschool for autism. The lessons are compatible with the Bexly curriculum.

Experiment Description
Each lesson required approximately eight to ten minutes for completion. The learning sessions were conducted under the supervision of one specialist and one IT technical support worker. Each session lasted between 20 and 30 min and covered three to four children. The robot was charged between sessions, and the IT support worker set up the program. On average, a working day consisted of six to eight sessions. The sessions covered various fields, including communication, cognition, social skills, and mobility.

Population Description
The study involved two distinct groups, each comprising 12 children aged three to six years old and exhibiting mild to moderate ASD symptoms, as shown in Tables 2 and 3. Table 4 presents the normality tests, performed using both the Kolmogorov-Smirnov and Shapiro-Wilk methods. The control group followed the standard class curriculum, establishing a baseline for comparison. The experimental group interacted with the NAO robot as an integrated part of their classroom activities.

Results from Eye Contact Data Analysis
To investigate the research question, we have composed the following null hypotheses:
a. Hypothesis 0a (H0a): The distribution of the number of eye contacts is the same across categories of autism levels.
b. Hypothesis 0b (H0b): The distribution of the number of eye contacts is the same across different age groups.
c. Hypothesis 0c (H0c): The distribution of the number of eye contacts is the same across the control group and experimental group.
Figures 11 and 12 plot the average number of eye contacts per student per working day for the control group and the experimental group, respectively. During a working session, when a child's gaze aligns with that of the teacher, it is considered eye contact. The teacher recorded this during each session.
The Shapiro-Wilk test and the Kolmogorov-Smirnov test are two of the most commonly used methods to test for normality. As shown in Table 4 above, the p-values (Sig.) for both age and autism level, as indicated by both the Kolmogorov-Smirnov and Shapiro-Wilk tests, are less than 0.05 (in fact, they are less than 0.001), which means neither age nor autism level is normally distributed. Consequently, non-parametric test methodologies will be applied to the data.
Table 5 presents a summary of the results from the independent-sample Mann-Whitney U Test, analyzing the number of eye contacts across different autism levels. Similarly, Figure 13 illustrates the independent-sample Mann-Whitney U Test for the number of eye contacts across various autism levels.
Figure 13. Independent-sample Mann-Whitney U Test graph for number of eye contacts across autism levels.
Given that the age variable encompasses more than two age groups and exhibits a non-normal distribution across all samples, we opted for the Kruskal-Wallis test as an alternative to the one-way ANOVA. Table 6 summarizes the results of the independent-sample Kruskal-Wallis test, which examines the variation in the number of eye contacts across age groups, and Figure 14 presents these results graphically.
Table 7 provides a summary of the independent-sample Mann-Whitney U test results, comparing the number of eye contacts between the control (Control = 1) and experimental (Control = 0) samples. Figure 15 depicts the same test graphically.
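For readers unfamiliar with the Mann-Whitney U test used for Tables 5 and 7, the sketch below computes the U statistic and a standardized test statistic from two samples in plain Python. It is an illustration of the technique, not a reproduction of the SPSS procedure: it uses the normal approximation with a tie correction and no continuity correction, reports U for the first sample, and the sample data are made up.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U for sample x, standardized z) via the normal approximation
    with a tie correction and no continuity correction."""
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))
    # Assign average (mid) ranks to tied values.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    w1 = sum(rank_of[v] for v in x)            # Wilcoxon rank sum of x
    u1 = w1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return u1, z


# Made-up eye-contact counts for two small groups, for illustration only.
group_a = [1, 2, 2, 3, 1]
group_b = [4, 5, 3, 6, 5]
u_stat, z_stat = mann_whitney_u(group_a, group_b)
print(u_stat, round(z_stat, 3))
```

A negative z here means the first sample's rank sum is lower than expected if both samples came from the same distribution.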

Discussion
The primary objective of our study was to investigate the impact of age, autism level, and the use of NAO robot intervention (denoted as "Control = 0" for the experimental group in the data) on the number of eye contacts demonstrated by the subjects. We sought to understand these relationships by using SPSS version 28.
In our study, the dependent variable is the "Number of Eye Contact", while the independent variables comprise "Age", "Autism Level", and "Control" (the latter designates whether the subject belongs to the control (Control = 1) or experimental (Control = 0) group).
As indicated in Table 5, the results reveal a statistically significant difference in the number of eye contacts between different autism levels. This conclusion is drawn from a p-value of less than 0.001, making the result significant at the 0.05 level, leading to the rejection of the null hypothesis H0a. In other words, this implies a meaningful difference in the number of eye contacts observed among various categories of autism levels.
Additionally, the "Standardized Test Statistic" value of −3.865, representing the z-score, indicates how many standard deviations the observed rank sum lies from the value expected under the null hypothesis. The negative value specifically suggests that the observed rank sum is less than what would typically be expected.
Turning to Table 6, the Kruskal-Wallis test yields a relatively high p-value (Asymptotic Sig. (2-sided test) = 0.635). Since this p-value exceeds the conventional threshold of 0.05, we do not have grounds to reject the null hypothesis H0b. In essence, these test outcomes suggest that there is not a statistically significant variation in the distribution of the dependent variable across different age groups. In simpler terms, age does not appear to exert a substantial impact on the frequency of eye contacts, according to these test findings.
The independent-sample Mann-Whitney U test was utilized to investigate the distribution of the number of eye contacts across the categories of the control variable within the dataset (N = 864). The null hypothesis for this non-parametric test posits that the distribution of eye contacts is identical across the control and experimental categories.
The test results revealed a statistically significant difference in the distribution of the number of eye contacts between the control and experimental groups (Mann-Whitney U = 50,070.500, Wilcoxon W = 143,598.500, standardized test statistic = −11.966, and asymptotic significance (2-sided) < 0.001, reported by SPSS as 0.000). At a significance level of 0.05, the null hypothesis H0c is consequently rejected.
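The reported statistics can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes that the 864 observations are split equally between the two groups (432 each), which the text does not state explicitly; under that assumption, the rank sum W and U of the same group should differ by n(n + 1)/2.

```python
import math

N = 864
n1 = n2 = N // 2    # assumption: equal group sizes (not stated explicitly)
U = 50070.5         # reported Mann-Whitney U
W = 143598.5        # reported Wilcoxon W

# The rank sum W and the U statistic of the same group differ by n(n + 1)/2.
w_from_u = U + n1 * (n1 + 1) / 2

# Standardized statistic without a tie correction.
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (N + 1) / 12)
z_no_ties = (U - mean_u) / sd_u

print(w_from_u == W)         # True
print(round(z_no_ties, 2))   # -11.79
```

The reconstructed W matches the reported value, which supports the equal-split assumption. The uncorrected z of about −11.79 is slightly smaller in magnitude than the reported −11.966; this would be consistent with a tie correction, which reduces the standard deviation when many observations share the same count.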
This finding indicates that the number of eye contacts is not identically distributed across the control and experimental subjects; there is a discernible distinction between the groups. Therefore, the incorporation of the SAR augmented with a mobile application does affect the frequency of eye contact within the examined population, which answers our research question. Further exploration of the specific categories and features of the subjects, along with the methodology used and its influence on eye contact, could offer deeper insight into the underlying mechanisms and implications of this finding for autism research.
We also noticed that maintaining a child's attention throughout the sessions proved difficult. A substantial portion of each session was devoted to capturing and sustaining the child's attention. We found that the NAO robot was effective at capturing the child's attention. The robot's capacity to personalize content, such as acknowledging the child's presence by saying their name when they entered the robot's field of view, proved highly beneficial. This individualized contact was further enhanced when the robot retained prior information about the child, fostering a sense of connection.
However, we encountered difficulties when the information did not correspond to the child's interests. To address this, we worked closely with the preschool's behavior modification specialist and psychotherapist (BMSP) to customize themes based on the child's interests. For example, we incorporated the child's favorite foods into food lessons and placed a mirror behind the robot during motion lessons to allow the child to observe and replicate the robot's motions. These alterations, implemented in both the control and experimental groups, significantly improved the children's attention.
In lessons focused on color, the NAO robot's ability to change its eye color added an engaging and interactive element. The child was tasked with identifying the robot's eye color by selecting a corresponding color card or verbally stating the color, depending on the lesson's level. This exercise necessitated direct eye contact and notably enhanced this skill compared to traditional lessons involving color identification from pictures.
Many challenges were faced while conducting these experiments, mainly in engaging the child and retaining their attention. There are also technical challenges and limitations that need to be addressed in future work or extensions. We outline them as follows:
1. The NAO robot can interact concurrently with a maximum of five children.
2. The NAO robot is proficient in the official Arabic language but does not comprehend the Kuwaiti dialect.
3. The NAO robot has an operational battery life of about 90 min and requires regular recharging to prevent automatic shutdown due to a depleted battery.
4. Internet connectivity is essential for the NAO robot to access cloud services and acquire accurate system data.
5. The mobile application is exclusive to Android users.
6. Teachers need a valid email address for application usage.
7. The user must connect to the internet to use the application.

Conclusions
In the last decade, robots have advanced significantly, transitioning from factory machines to sophisticated companions and assistants in many areas of human life, thanks to breakthroughs in artificial intelligence, including machine learning, image recognition, voice recognition, and speech synthesis. One such area is education, where robots have played a key role in engaging autistic children and reducing their isolation. Autistic children often find devices more appealing than people. The NAO robot was selected for its human-like appearance and was augmented with a backend application that increases the robot's intelligence, allowing it to recognize the child's profile and direct their learning path, and that provides the teacher with statistical charts to track the learner's progress and tailor future lessons based on historical data.
This study was implemented in an autism-specific private preschool. With the aid of a school consultant, all lessons were created and conducted using linguistic and cultural references specific to the learners. Twelve children were given the solution and showed improvement in their eye contact skills.
This research effectively demonstrated the potential of combining humanoid robots, specifically the NAO robot, with a smartphone application to improve autistic children's educational experiences. The implemented solution has increased eye contact and engagement in AE by providing a customized and individualized learning approach. Collaboration with specialists and educators ensured that the content was tailored to the specific requirements and cultural context of the children.
However, the study also revealed a number of limitations and obstacles, including the limited number of children with whom the NAO robot can interact simultaneously, language barriers, battery life, and internet connectivity requirements. Future research should strive to resolve these issues to enhance the efficacy and accessibility of these educational aids for autistic children.
Continued innovation and refinement in robotics and AI will contribute to bridging the divide between autistic children and their educational requirements by building upon the findings of this study. This technology has the potential to become a valuable resource for educators and specialists, promoting inclusivity and individualized learning for children with autism.

Informed Consent Statement:
Informed consent was obtained from the parents of all subjects involved in the study.

Data Availability Statement:
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author(s). The video and media for this study can be found in the OneDrive link at https://bit.ly/includeMe accessed on 18 July 2023.