The Limited Effect of Graphic Elements in Video and Augmented Reality on Children’s Listening Comprehension

There is currently significant interest in the use of instructional strategies in learning environments, thanks to the emergence of new multimedia systems that combine text, audio, graphics and video, such as augmented reality (AR). In this light, this study compares the effectiveness of AR and video for listening comprehension tasks. The sample consisted of thirty-two elementary school students with different levels of reading comprehension. First, the experience, instructions and objectives were introduced to all the students. Next, they were divided into two groups: one group watched an educational video story about the dog Laika and her space journey, available through the mobile app Blue Planet Tales, while the other used AR, with the contents of the same story visualized by means of the app Augment Sales. Once the activities were completed, participants answered a comprehension test. Results (p = 0.180) indicate no significant differences in test performance between lesson formats. However, within the AR group there are differences between participants according to their reading comprehension level. With respect to the time taken to perform the comprehension test, there is no significant difference between the two groups, but there is a difference between participants with high and low levels of comprehension. Finally, the SUS (System Usability Scale) questionnaire was used to measure the usability of the AR app on a smartphone. An average score of 77.5 out of 100 was obtained in this questionnaire, which indicates that the app has a fairly good user-centered design.


Introduction
Augmented Reality (AR) is one of the fastest-growing mobile technologies, both in terms of development and research [1]. AR combines real-world elements with computer-generated elements to create partially immersive experiences. The application of AR-based systems has been explored in various disciplines, including industry [2], medicine [3,4] and education [5]. It would appear that this technology is set to have a promising future, both in and outside the classroom, given that there are a great many reports from the field of education regarding high levels of satisfaction with this technology and its effectiveness [6][7][8].
In the educational setting it is vital that student motivation is fostered and that good performance is ensured [9]. Consequently, the importance of updating teaching methodologies has frequently been addressed. Several new multimedia technologies emerged following the digital boom, ranging from videos and animations [10] through to video games [11]. By employing these technologies it has been possible to update teaching methodologies and explore their effects on teaching and learning, including performance and motivation. What is more, it has never been easier to access high-capacity hardware [12,13], making it possible to create proposals using an even greater array of emerging technologies, such as augmented reality.
Nowadays, the combination of several technologies has great potential in terms of applicability that is still waiting to be discovered. Messaging services, video calls, augmented reality and chatbots are four very popular digital tools that have revolutionized the social media landscape. Chatbots are artificial intelligence software programs (or even, in some cases, hardware) that focus on understanding the messages and questions a person poses in order to respond to text or even voice messages naturally and efficiently [14].
AR and chatbots are exciting technologies whose combination opens up an interesting space to explore and develop [15]. In fact, e-commerce has integrated them to make the customer experience not only interesting and unique but also more efficient and satisfying [16]. This is not far from the educational field, in which the combination of AR and chatbots, supported by advances in artificial intelligence (AI), can provide more efficient, motivating and satisfactory training.
In harnessing these new technologies, new proposals can be put forward, such as remote learning; in doing so, it is possible to transform learning experiences into experiences that are more interactive and more flexible for mentors and students alike. The use of this technology for educational purposes has been researched for students of all ages and in different disciplines, ranging from teaching writing to children [17] to specialist medical training for university students [18] and, more recently, to the search for the most popular gestures among engineering students [19].
In education focus is not only placed on learner satisfaction and motivation but also on comprehension [20]. In a traditional environment, knowledge is transmitted from mentor to student face-to-face through the use of different tried-and-tested pedagogical techniques. An alternative to this that has recently emerged is that of online learning. This new technique encompasses online reading materials, educational videos, vodcasts (classes that teachers record to share with their students in video format) and podcasts. In the case of vodcasts, positive performance has been observed in university and high school students [21][22][23]; however, one observation that has been made suggests that the lack of teacher-student interaction affects the potential impact of vodcasts as a replacement for traditional education [24].
It is from here that the interest in listening comprehension stems. It has been found that incorporating visual elements into the audio of dialogues and lessons in listening comprehension tests usually improves student performance [25][26][27][28]. This would seem to weigh in favor of the use of vodcasts, thus giving rise to the question: what effect does the type of visuals that accompany audio lessons have on auditory comprehension, in particular video/image and AR elements?
Answering such a question would impact educators and students. As education through digital, and also mobile, devices continues to grow [29], it is important to make recommendations concerning learning material formats. These recommendations should establish which formats are the most effective for understanding and retaining knowledge while never losing sight of user experience. If we contrast the uninterrupted video format with the AR format, it appears the latter would imply a reduction in supporting visual content. However, when AR models are incorporated into audio lessons they offer the user contact with the real-world environment and interaction between the elements in a way that flat videos do not achieve so easily [30]. Similarly, the construction of 3D models to illustrate concepts that are independent of lessons would contribute to the strengthening of educational content repositories and allow the possible reuse of educational content [31] when updating audio lessons or, equally, when creating new material. As a consequence, this approach could represent a less costly alternative for educators that is more attractive in the long term [32].

Listening Comprehension
Listening comprehension has been studied extensively in people studying a second language. Often, the level of language proficiency is measured using internationally recognized standardized tests, such as TOEFL (Test of English as a Foreign Language) and IELTS (International English Language Testing System). As this type of test evaluates listening comprehension, research has been conducted to observe the effects of varying certain components of the test on performance. For instance, by means of TOEFL-like evaluations flat audio has been compared with audio that includes visual material related to the interlocutors. Findings suggest that video content favors student performance on these types of tests [25,28].
Another study involved Korean high school students and university students who possess knowledge of a second language (English). The main differentiator between these two groups was their level of English. Each group had to perform comprehension tests, after which researchers compared the results for both groups. Flat audio lessons were compared with audiovisual lessons in both Korean and English. It was found that visual content does contribute to better performance for the English test in individuals with less proficiency in the language (in this case, high school participants) but that the presence of images does not affect comprehension in the native language. Among the qualitative observations, it was highlighted that the participants felt that the level of effort was higher in comparison to that of taking the flat audio lesson [27], despite facing less difficulty in the test associated with the audiovisual lesson.
Following the idea that visuals favor the listening comprehension of foreign language learners, another study explored the relevance of visual content and its effect on listening comprehension. Three versions of the same lesson were compared: flat audio, audio with video containing related content and audio with video containing irrelevant content. Surprisingly, results showed that listening with video formats helped participants perform better on the comprehension test regardless of content, with no major differences detected between the two versions [33].
In another piece of research, researchers addressed the optimization of listening comprehension supported by visual elements for foreign language learners. The objective was to determine the ideal order in which to present visuals. Three proposals were tested: flat audio playback, the presentation of images before audio playback and the presentation of images during audio playback. The study concluded that the best results in comprehension tests are achieved when showing visual elements before introducing audio [34].
These effects seem to relate to top-down processing strategies when audiovisual material is used and to bottom-up strategies when the task is limited to audio [35]. In other words, the visual content provides background context instead of requiring users to identify keywords.

Augmented Reality and Comprehension in Children
Little research has been undertaken to explore the effects of augmented reality on reading comprehension. That said, some pioneering experiences have in fact proposed using interactive games based on AR to study how they affect reading comprehension in children [36]. Although there were no significant differences between this proposal and the traditional methods in terms of comprehension, greater interest and motivation were observed for AR games. A positive effect was also perceived among the participants in terms of problem solving, exploration and socializing with peers.
The study by Yilmaz, Kucuk and Goktas [37] focused on the attitude of children towards augmented reality picture books (ARPB). When looking at the relationship between attitude and reading comprehension, the researchers found that this type of reading material initially stimulates a sense of happiness, interest and fun in children and subsequently the attitude of happiness results in a thorough understanding of the content. Thus, it was concluded that ARPBs have significant potential to improve cognitive and listening skills in children.

Effects of Visual Content Characteristics on Comprehension
For reading and multimedia, researchers have experimented with the format of visual content in conjunction with pedagogical principles. In their study, Roohani, Jafarpour and Zarei [38] examined reading comprehension in children, comparing multimedia texts with two types of graphics (static and animated) in combination with two types of advance organizers (a pedagogical technique for connecting existing knowledge with new knowledge). They found that individuals who viewed the animated elements performed better in comprehension tests. Previous studies used a virtual speaker as visual support; they found that this audiovisual content improved the number of correct answers following listening [39] and that students received better listening comprehension scores than when no visual support was provided [40].
The effect of reading format and the role of interactive technologies on comprehension has also been analyzed. For secondary school children, research has been done to compare comprehension when reading with videos that require different levels of interaction versus reading from a traditional illustrated book. Results showed that in both formats effectiveness is similar for the comprehension of complex content. However, it was concluded that in the design of interactive material aimed at boosting learning it is advisable to opt for micro tasks [41].
Dünser witnessed through his study [42] the positive effect of AR on young, unskilled readers when he compared an early literacy book in two formats: AR format versus text format with images.
In an activity that tested information retention, the group of children who were skilled readers managed to recite a larger portion of the content from the text format; however, no major differences were found between the groups when using the AR format. This finding thereby revealed the potential learning benefits for children with less fluency. Table 1 provides an overview of the studies that are most relevant to this research topic in terms of (a) the number of tests with users, (b) the educational context in which they unfold, (c) their scope and (d) comparisons of digital material formats. The present study is also included in the table.
Table 1. Overview of the studies most relevant to this research topic.
• Scope: Listening comprehension in foreign language affected by flat audio and audio + video. Format: Audio and video on computer.
• Scope: Listening comprehension in foreign language affected by flat audio, audio + relevant visuals and audio + non-relevant visuals. Results: No statistical difference in second language test-takers' performance on an academic listening test in an audio-only mode versus an audio-video mode. Format: Audio and video on computer.
• Scope: Listening comprehension in foreign language affected by flat audio, audio + relevant visuals and audio + non-relevant visuals. Results: The animation type of visualization was more effective than the static one and embedding animations with question advance organizers improved reading comprehension significantly. Format: Text, animations and static images on computer.
• Scope: Listening comprehension in noise and silence conditions by audio and audiovisual. Results: The results are inconclusive regarding how seeing a virtual speaker affects listening comprehension. Format: Audio and audiovisual (virtual speaker video).
• Scope: Reading comprehension for traditional picture books and for AR picture books. Results: No statistical difference in the number of interactive lessons read between high and low reading comprehension groups. Readers with high reading comprehension remembered more events. Format: Text, images, AR on computer. A webcam is used to capture bookmarks.
• Present study. Participants: 32 children aged 9 to 11 (primary school students). Scope: Listening comprehension affected by visual elements of video and visual elements of AR. Results: See Section 4. Format: Audio, video on computer, AR on mobile device.

Objective and Hypotheses
The main objective was to evaluate whether AR content improves listening comprehension in primary school students to a greater extent than video content. For this, an applied exploratory inductive study was designed to support the learning process for listening comprehension using computer-based visual content, which led to the following hypotheses:
• H 1A : People who hear the AR lesson will obtain more correct answers in the comprehension test (CT).
• H 2A : People who hear the AR lesson will take longer to complete the comprehension test (CT).
The corresponding null hypotheses are H 0i = ¬H 1i . A multi-method approach was adopted, based on listening comprehension performance and time taken, usability interviews and observation.

Sample
The sample consisted of 32 primary school students in the fourth and fifth grades of a private elementary school in the city of Monterrey (Mexico), whose students are upper-middle class. The recruited students were aged between 9 and 11 (mean (M) = 9.75 and standard deviation (SD) = 0.672). The experience was carried out in April 2019 and the participants were divided into two groups of 16 participants each: one group would observe and test video visuals and the other would observe and test AR elements. These groups were then subdivided into two subgroups of equal size: students documented as having outstanding reading comprehension (8 participants) and students documented as having poor reading comprehension (8 participants). Thus, two groups (VIDEO and AR) and four sub-groups (HighVIDEO, LowVIDEO, HighAR and LowAR) were defined.
Participant recruitment was based on the results of the reading section of the Early Alert System (SISAT) (Sistema de Alerta Temprana (Gobierno de Mexico). http://www.sems.gob.mx/en_mx/sems/sistema_alerta_temprana_siat), a nationwide reading examination run in Mexico as part of a wider examination program. The program uses standardized testing for reading, writing and mathematics to ascertain students' level of comprehension and remediate any issues, as and when necessary.
For this study researchers used the most recent SISAT results. These results had been recorded no more than three months prior to the experiment. Students scoring 17 or 18 on the SISAT reading exam fall under the category Expected Level and were defined as participants possessing high reading comprehension. Participants scoring 15 or less fall under the category of In Development (or the lower limits of Expected Level) and were defined as participants possessing low reading comprehension.
The activity consisted of a listening comprehension task: viewing and listening to a lesson and demonstrating understanding of its content.

Software and Resources
A web application was built in PHP v. 5.5.37 and connected to a MariaDB database. This application consisted of two main interfaces: the lesson viewer and the listening comprehension questionnaire.
The application was displayed on a laptop with a touch screen. For the augmented reality lesson, the contents were created using the software Augment-Solution Field Sales [43]. Students used the app Augment-AR Viewer [43] on an iPhone 7 Plus device to scan the markers. This application allows AR models to be projected by scanning the QR codes associated with them.
The free educational app Blue Planet Tales (Blue Planet Tales App. https://bit.ly/2QDdGm7; https://bit.ly/37UVmuF) was used for audiovisual based learning. This free app, which is available in the Apple Store and Google Play Store, contains educational stories about science and history for children. The app contains a comprehension test in the same format as the didactic material, video and audio. For this experience, the original material was modified to last five minutes and adapted to cover solely the lesson that the students visualized.
Most of the augmented reality models used were obtained from online repositories under Creative Commons licenses, while a few were purchased by the author.

Procedure
Each participant interacted with only one of the two versions of the application, depending on the group to which he/she had been assigned. A moderator oversaw the activities of both groups. Following a script, the moderator explained the aim of the experience and the activities that the participant would carry out. The brief explanation informed participants that they should not be nervous, as they were not being examined. It also included information on the goal of the activity and explained that they would be asked to listen to and look at a lesson and then answer questions in a questionnaire and in an interview (the latter exclusively for the AR group). Each participant was encouraged to think aloud at all times.
Excerpts from the script are shown below: "I am [insert Monitor Name] and I will be with you throughout this activity. I'm studying computers and I like to learn how to use computers, tablets and cell phones to help people.
I asked you to be here today because I need your help. I'm trying to figure out how to make a reading app for kids like you. So, I want you to help me by doing an activity. It's going to take about 25 min.
Don't get nervous; we are testing the application, not you. Really, you're not going to make a mistake here. There are no right answers or wrong answers but if I see that you're paying attention I'll give you a prize (I'll give every participant a prize)."
A sound check was performed prior to commencing the lesson to give each participant the opportunity to adjust the volume to a suitable level. In the case of the AR group, participants were given a smartphone and asked to perform a marker reading test to familiarize themselves with the Augment AR viewer app and ensure they knew how to operate the device correctly. For this test the moderator provided a demonstration and projected images in augmented reality, after which the participant was asked to replicate the actions using a different marker. Next, the participant was presented with a brief demo of the proposed lesson format. The participant had to interact with the demo in the same fashion as they would in the actual lesson, meaning they had to project images and listen to audio snippets. Up until this point the images observed by the AR group were entirely unrelated to the lesson material. Once ready, the participant proceeded with the lesson. The theme of the lesson was the story of Laika, the canine crewmember who travelled on Sputnik 2. All participants listened to the same audio, an adaptation of the Spanish version of Laika the Little Astronaut Dog (Educational Video Story-Laika's Space Journey. https://bit.ly/2tKwzdI) by Blue Planet Tales. They also observed images relevant to the content of the lesson but in different formats.
For the VIDEO group, the interface used for the lesson was a video player. The participant was responsible for pressing play to start the video. The entire lesson was delivered in a single sitting without pauses. All audio and visuals were delivered via the aforementioned video player. The visuals consisted of static and animated images (see Figure 1). In contrast, the lesson that was delivered to the AR group was split into separate parts that were presented one at a time. From the adapted material, the main ideas had been highlighted and the times when video images originally changed were taken into consideration to define the parts that would make up the AR lesson. When defining the resulting parts steps were taken to ensure they made sense when listened to (i.e., showcased complete ideas) and that they were as long as possible in order to minimize the number of interruptions to the lesson. What is more, the long duration of the parts was expected to have a positive effect on comprehension, similar to that found in the length of text fragments in accelerated reading tests for high school students [44]. Evidently, the present experience did not involve accelerated reading but it did include the presentation of material at a pre-set pace, perhaps different from that of the test participants, which was considered important not to ignore.
Each part in the AR group test consisted of an AR image and audio (see Figure 2). Participants would scan the marker with the smartphone to project the image and press a 'Continue' button when they wanted to listen to the audio. Consequently, the AR group had to go through more steps (scan + projection and playback) than the VIDEO group (playback only). The AR lesson was divided into a total of eight parts. Although the length of the parts varied, one or more ideas from the lesson were initiated and concluded in each of the parts. The number of images shown in the video lesson was also taken as a reference. Table 2 shows the duration in seconds of each part in the AR lesson, along with a description of the main ideas and the three-dimensional model shown to the participants. The participants in the AR group were free to view the images before, during or even after they had finished listening to the audio. They were also able to interact with the models in any way they wanted and for as long as they wanted. Once the audio fragment ended, the next part would be displayed and highlighted in a different color from the preceding parts. The participants could then continue with the steps explained previously. Both groups were free to advance, rewind, pause or restart the lesson if they so wished.
Once the lesson was finished, the application would enable the option to start the comprehension test (CT). The examination consisted of nine ad hoc multiple-choice questions, which had been created to measure participants' learning. All questions were strictly related to what was mentioned in the lesson and were presented in the same order as the information in the lesson. The beginning of the exam triggered a stopwatch, not visible to the participants, that recorded the time it took each participant to answer the questions. The number of correct answers was also calculated upon the submission of the completed questionnaire. Below are some examples of the questions included:
• What was the goal of the experiment?
  - To experiment with a dog in order to test whether humans could survive in space.
  - To watch animal behavior in situations of extreme fear.
  - Build an intelligent ship to explore the Moon and to find aliens.
  - Send living beings into space for them to live in another galaxy.
• How did scientists know that Laika was alive in space?
  - By her barking.
  - By her pulse.
  - Because of movements in the ship.
  - They saw her from their lab's telescope.
• What was the name of the ship that Laika travelled in?
In addition to measuring listening comprehension, interviews were conducted to determine the level of satisfaction of the AR group participants with the new lesson format as part of a usability study.

Results
This section reports the experimental results and presents an analysis of findings. In Section 4.1 the differences between both groups with regard to the number of correct answers in the comprehension test are detailed. In Section 4.2 examination timings are discussed. In Section 4.3 the results of the usability interviews are presented. Lastly, in Section 4.4 the observational findings are discussed. Table 3 displays the descriptive statistics compiled for each of the experimental groups (questionnaire scores and timings).

Comparison of the Number of Successes in the Comprehension Test
After giving a verbal indication that the lesson and listening comprehension exercises were finished, the participants were asked to answer a multiple-choice questionnaire covering the content of the listening. In order to successfully complete the exam, the participant could not leave any questions unanswered. Listening comprehension was measured as an integer value according to the number of correct answers in the exam.
Levene's test revealed that the variances among the results from all subgroups were not significantly different (p = 0.104). In other words, the homogeneity-of-variance assumption required for ANOVA was met. Consequently, a two-factor ANOVA was conducted to test for statistically significant relationships between the test results and the independent variables. Any p-value < 0.05 was deemed indicative of significance. Comprehension test results were analyzed using Level of Reading Comprehension (high or low) and Lesson Format (AR group or VIDEO group) as factors. The resulting p-values are displayed in Table 4. The results indicate a significant difference between groups in the scores obtained for Level of Reading Comprehension (p-value = 0.005); students with high reading comprehension obtained the highest scores. No significant difference in participant performance was detected for Lesson Format (p-value = 0.254). In other words, the participants' scores were similar regardless of whether they followed the lesson in VIDEO format or AR format. Finally, the interaction between reading comprehension level (High or Low) and lesson format (VIDEO or AR) resulted in p-value = 0.180, meaning that the effect of lesson format on listening comprehension did not differ significantly between reading comprehension levels.
In addition, a Tukey grouping analysis was performed using the mean number of correct answers from each group. From this analysis new groupings were established, based on the extent to which the mean values differ statistically. Table 5 shows the two resulting groups. It should be noted that in the questionnaire participants from the AR group obtained the highest number of correct answers. From a statistical stance, the performance of the sub-group with high reading comprehension in AR (HighAR) was similar to that of the sub-groups HighVIDEO and LowVIDEO (Tukey group A). No difference was established between the sub-groups LowVIDEO and LowAR (Tukey group B). However, a statistically significant difference exists between the sub-groups HighAR and LowAR.
Given these results, hypothesis H 1A cannot be accepted from a statistical standpoint. Although a priori the ANOVA analysis indicates that there is no significant difference in learning between the groups when taking into account the instructional materials used for learning, the Tukey analysis provides a deeper analysis and indicates that there are in fact slight differences when taking into account participants' reading comprehension.
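The two-factor ANOVA used in this section can be sketched in a few lines. The following is a minimal illustration for a balanced design (the same number of scores in every sub-group), not the analysis script used in the study. The function name `two_way_anova_f` and the scores in `example_cells` are hypothetical and chosen only to make the arithmetic easy to follow. The sketch returns F ratios and degrees of freedom; converting them to p-values requires the F distribution (for example `scipy.stats.f.sf`).

```python
from statistics import mean


def two_way_anova_f(cells):
    """F ratios for a balanced two-factor design.

    cells maps (level_of_factor_a, level_of_factor_b) to a list of scores.
    Every combination of levels must be present with the same number of scores.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("balanced design required: equal cell sizes")

    grand = mean(x for scores in cells.values() for x in scores)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(scores) for key, scores in cells.items()}

    # Sums of squares for the two main effects, the interaction and the error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores)

    df_a = len(a_levels) - 1
    df_b = len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err

    return {
        "factor_a": ((ss_a / df_a) / ms_err, df_a, df_err),
        "factor_b": ((ss_b / df_b) / ms_err, df_b, df_err),
        "interaction": ((ss_ab / df_ab) / ms_err, df_ab, df_err),
    }


# Hypothetical scores (NOT data from the study): factor A is the reading
# comprehension level, factor B is the lesson format.
example_cells = {
    ("High", "AR"): [8, 9],
    ("High", "VIDEO"): [7, 8],
    ("Low", "AR"): [5, 6],
    ("Low", "VIDEO"): [6, 7],
}
anova_result = two_way_anova_f(example_cells)
```

Each entry of `anova_result` is a tuple of (F ratio, effect degrees of freedom, error degrees of freedom). With 8 participants per sub-group, as in this study, the error term would have 4 × (8 − 1) = 28 degrees of freedom.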

Comparison of Times for Comprehension Test Completion
In addition to recording the number of correct answers, researchers also observed the time taken by participants to complete the exam. The purpose of this was to establish whether a lesson format favored children with a certain level of reading comprehension.
As the variances in time among the groups were statistically unequal (p = 0.020 in Levene's test), Welch's t-test was deemed suitable for data analysis. With this test, any p-value < 0.05 was taken to indicate significance. The analysis studied time with respect to the level of reading comprehension and the format of the lesson, considering each factor separately. Table 6 details the p-values obtained from Welch's test.
Table 6. Welch's t-test on time taken to complete exam based on reading comprehension level and lesson format.
Once again, the average time taken on the exam did not differ significantly with respect to the format of the lesson taken by "All participants" (p = 0.731); thus hypothesis H 2A cannot be accepted, given that there is no statistical evidence that the AR group takes longer to complete the questionnaire. However, average times did vary significantly between the distinct reading comprehension levels (p = 0.017), and participants with high reading comprehension completed the questionnaire the fastest. On balance, the overall pattern matches that found when comparing the number of successes in the comprehension test.
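For readers unfamiliar with Welch's t-test, the sketch below shows how its statistic and the Welch-Satterthwaite degrees of freedom are computed for two samples with unequal variances. It is an illustration under stated assumptions, not the study's analysis code: the function name `welch_t` and the two lists of completion times are hypothetical. Obtaining the p-value additionally requires the t distribution; in practice `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical completion times (NOT data from the study).
times_high = [1, 2, 3, 4, 5]
times_low = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(times_high, times_low)
```

Unlike Student's t-test, the degrees of freedom here are generally not an integer and shrink as the two sample variances diverge, which is why the test suits the unequal variances reported by Levene's test.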

Usability
Participants were asked to complete the popular SUS (System Usability Scale) questionnaire [45,46] to establish their perceptions and measure usability for the AR app on a smartphone. To make the questionnaire more user-friendly pictograms were used on the response scale for each question, as suggested by Baumbartner et al. [47]. According to Lewis and Sauro [48], usability studies using the SUS should have sample sizes of at least 12; because of this the 16 participants of the AR Group answered the ten questions on the SUS (System Usability Scale) (see Figure 3). Interpreting scoring can be complex. "The participant's scores for each question are converted to a new number, added together and then multiplied by 2.5 to convert the original scores of 0-40 to 0-100. Though the scores are 0-100, these are not percentages and should be considered only in terms of their percentile ranking" [46]. SUS is a highly robust and versatile tool for usability professionals and based on research, a SUS score above a 68 would be considered above average and anything below 68 is below average, however the best way to interpret your results involves "normalizing" the scores to produce a percentile ranking [49,50].
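The SUS scoring procedure quoted above can be made concrete with a short sketch. The text only states that each answer is "converted to a new number"; the conversion rule used below (odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response) is the standard SUS rule and is assumed here rather than taken from the paper. The function names and the example responses are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant.

    responses: ten integers from 1 (strongly disagree) to 5 (strongly agree),
    given in questionnaire order.
    """
    if len(responses) != 10:
        raise ValueError("the SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: contribution is response - 1.
            total += response - 1
        else:
            # Even items are negatively worded: contribution is 5 - response.
            total += 5 - response
    # The converted items sum to 0-40; multiplying by 2.5 rescales to 0-100.
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score over a group of participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical responses from two participants (NOT data from the study).
example_responses = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
group_average = mean_sus(example_responses)
```

As the quoted passage stresses, the resulting value is not a percentage; a group average is usually interpreted against the benchmark of 68 or converted to a percentile ranking.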
An average score of 77.5 out of 100 was obtained from the SUS questionnaire administered to participants of the AR Group, which indicates that the app has a fairly good user-centered design. Interviews with the AR group also yielded important qualitative information about the experience associated with the new prototype. The term "application" will be used to refer to the complete prototype: this includes the lesson elements and the mobile device application Augment Sales used to view the 3D models.
The two sessions conducted with each of the experimental groups were recorded in full: the experience, the feedback and the completion of the SUS questionnaire.
In addition to the SUS questionnaire, four questions were asked of all AR group participants to obtain extra feedback.
Regarding the emotions associated with the application, fourteen participants considered the overall activity entertaining (Question #1). Among the many responses given to Question #2, one in particular stood out: a high-comprehension participant stated that the surprise element of the application (projecting unfamiliar images) kept her interested and attentive. However, this same element produced a different feeling in a participant with low reading comprehension, who stated that he did not like having to wait to see the image (referring to the time between scanning and projecting). Five participants indicated that they were not surprised by the way the 3D models were visualized, because they had already seen similar applications on social networks. However, nine participants considered it a novel way of watching and listening.
When asked about the most frustrating moment of the test (Question #3), all participants indicated that they did not encounter any particular difficulty when using the application. It should be noted that all AR participants tested the application prior to starting the experiment. Although some encountered complications when scanning the first models, they generally improved with practice. Only one participant did not scan all the models, because he did not locate the scanner feature in the smartphone application during the first parts of the lesson.
Interestingly, very different positions emerged when participants were asked for their opinion on the lesson format (Question #4). Nine participants stated that they would prefer to take the lesson with visuals in video format. The remaining seven participants expressed wonder at the augmented reality models, stating that they were like imaginary or fantasy elements brought into the real world. Even though they had never taken a lesson in the format proposed by the study, the novelty of the experience did not cause too much disruption and the initial surprise seemed to pass quickly. These users' responses are important to consider when designing applications and content in the future.
It also appeared that children were more accepting of smartphones than of computers. During the test, the participants were given the choice of using the touchscreen or the mouse, and many chose the touchscreen. This suggests that they are naturally more comfortable with the interactions offered by mobile devices. Additionally, in the interview, one participant associated taking the lesson on a smartphone with ubiquity and the freedom to do multiple tasks at once.

Observational Findings
In addition to making quantitative observations, researchers also closely monitored the behavior of the participants during the experiment.
An important difference between reading comprehension groups was whether they used their voice during the test. The high reading comprehension group (HighAR and HighVIDEO) tended to remain silent during most of the test. On the other hand, those with low reading comprehension (LowAR and LowVIDEO) made more verbal comments about what they observed and heard. It is worth mentioning that these comments were not always accurate; it seemed that the children interpreted the visuals to build a narrative that was not necessarily tied to the audio. Also, part of this last group read aloud (the instructions and the questionnaire), and on multiple occasions they showed difficulties with speed (slow and halting reading). Additionally, in the case of one specific participant, researchers observed omission or swapping of syllables (possible signs of dyslexia).
Regarding the lesson format, a difference was observed in the topics that participants spoke about out loud. The VIDEO group would usually talk about what they heard, while the AR group commented on both what they heard and what they saw.
As for the perceived interest in the lesson format, nearly all of the participants in the VIDEO group observed the visuals at all times (see Figure 4). In the AR group, the time spent viewing images varied considerably, but one trend was that the further participants progressed in the lesson, the less time they spent observing the models. About half of the participants in the AR group tended to observe the model and then listen to the lesson. They seemed to interpret the order in which the elements were shown in the application (marker scanning first and audio snippet player second) as the order they were expected to follow. Another common sequence was scan > project > play audio > observe image > finish part. Other interesting observations from the AR group trial concerned participants' attitudes towards AR. Contrary to what has been found in preschool children [35], in this study few primary school children showed visible excitement or great interest towards this technology, despite its novelty and unfamiliarity. This attitude was also apparent in the time spent exploring the image and in the children's initiative to interact with the object.
In terms of interactions, four participants did not show much interest in interacting with the models. These participants were usually students with high reading comprehension (HighAR group) or students who were already familiar with QR codes and smartphones. Those who did explore the models' response to different gestures mostly resorted to pinch, drag and rotation. Pinch caused frustration because the Augment AR viewer application does not recognize this gesture; consequently, after seeing there was no response, the participants would usually follow up with a double tap or drag. In the Augment app, drag is the gesture most similar to zoom.
Another relevant point has to do with the differences displayed in Table 4. Children with low reading comprehension (LowAR) were the most likely to interact with the AR models (see Figure 5), whereas the opposite was generally found for children with high comprehension (HighAR). This suggests that active interaction may have a counterproductive effect on comprehension in primary school children in general, acting as a distraction rather than as support that would help them to follow a reading or audio.

Discussion and Conclusions
In this pilot study, two lesson formats were contrasted to assess listening comprehension in two types of populations: fourth and fifth graders with high reading comprehension versus fourth and fifth graders with low reading comprehension. The effects of visuals (video and AR) on the number of correct answers in a comprehension test and on the time taken to complete said test were compared for a total of 32 participants, with the aim of finding another feasible alternative for creating educational material for children.
The results showed no significant difference in listening comprehension between visuals in video format and visuals in AR format. This finding opens many doors for teachers and creators of digital content for children, both in terms of creative solutions and in terms of offering alternatives based on available budgets and cost-benefit goals. As a result of the continuing growth of online repositories [28] and advancements in the technology supported by smartphones, AR is currently a viable option. Furthermore, given that children are now increasingly inclined towards using mobile devices [51], it is important that content directed towards children takes their preference into account.
A Tukey test showed that, in the AR format, a significant difference exists between participants with high reading comprehension and those with low reading comprehension. In terms of future research, it would be interesting to determine the effect of interactions with AR technologies in education on the performance of primary school students. A further line of research would be to explore these interactions specifically in students with low reading comprehension, in order to establish guidelines for the creation of digital content that favors and supports the acquisition of reading comprehension.
The time taken to assimilate the lesson is statistically similar for the video format group and the AR format group. However, there is a statistically significant difference in the time taken to assimilate the lesson between students with high and low reading comprehension. Students with high reading comprehension completed the evaluation questionnaire the fastest. On balance, the overall pattern matches that found for the number of correct answers in the comprehension test.
The results of this study should be taken with caution, since it is a pilot study that aims to obtain a first approximation to our hypotheses and relies on non-probabilistic sampling. To obtain conclusive results, a larger, representative sample of the population selected by probabilistic methods would be necessary.
Additionally, the observations in this pilot study revealed a lack of excitement among the children towards a new lesson format. The effect of fascination on children's academic performance has been explored [37], and it is important to investigate its counterpart (boredom and lack of motivation). Thus, another variant of this type of experiment could address the design of applications and content for education, paying attention to the over-saturation of digital content and the easy adaptation skills perceived in young students as engagement factors.
Another limitation of this study relates to the usability of the prototype, despite its good score on the usability evaluation. For the time being, it is thought that this lesson format could prove inconvenient, as it implies more tasks for the users. We consider that a dedicated AR application, like those used in similar experiences with children [52], would provide different results. It is possible that the more usable, easy and attractive the application, the better students' comprehension will be. Ideally, an application such as this should run on an AR headset display to give users as much freedom as possible. However, headsets are not commonplace in primary schools, so presenting such a solution at this time would not impact a large number of people. Nevertheless, it is an encouraging finding that AR can be introduced as a new type of visual in listening tests without hampering performance, despite the increased effort required of users. Interesting future lines of research could include the exploration of a new modality of podcasts with AR that can be consumed through headset devices.