A Systematic Review on Oral Interactions in Robot-Assisted Language Learning

Abstract: Although educational robots are known for their capability to support language learning, how actual interaction processes lead to positive learning outcomes has not been sufficiently examined. To explore the instructional design and the interaction effects of robot-assisted language learning (RALL) on learner performance, this study systematically reviewed twenty-two empirical studies published between 2010 and 2020. Through an inclusion/exclusion procedure, general research characteristics such as the context, target language, and research design were identified. Further analysis of oral interaction design, including language teaching methods, interactive learning tasks, interaction processes, interactive agents, and interaction effects, showed that the communicative or storytelling approach served as the dominant method, complemented by total physical response and audiolingual methods in RALL oral interactions. The review provides insights on how educational robots can facilitate oral interactions in language classrooms, as well as how such learning tasks can be designed to effectively utilize robotic affordances to fulfill functions that used to be provided by human teachers alone. Future research directions point to a focus on meaning-based communication and intelligibility in oral production among language learners in RALL.


Introduction
Educational robots are known as capable interactive pedagogical agents in language learning situations. Previous research has reported on educational robots' affordances for training skills in one's first, second, or foreign language [1][2][3]. Despite claims about the potential of educational robots for helping learners improve language skills [4], no previous review has focused on instructional design that leads to positive learning outcomes in robot-assisted oral interactions. This review study, therefore, aims to fill this gap by analyzing 22 empirical studies in terms of the interactive design of oral tasks, highlighting the teaching methods used, the oral task types, the roles served by the robot and the instructor/facilitator, and their effectiveness in improving oral competence.
the presentation of digital content, task repeatability, interactivity, flexibility for incorporating different learning theories, and embodied interactions conducive to learning [7,8]. In particular, interactions that enable oral communication between learners and robots serve as the core of robot-assisted language learning (RALL).
Defined as interactive language learning through systems that involve the physical presence of a robot, RALL provides learners with face-to-face communication opportunities that resemble real conversation situations [9]. In RALL, verbal (e.g., question-and-answer) and non-verbal modalities (e.g., gesturing, nodding, face tracking) can be used to facilitate language practice, leading to increased learning motivation, interest, and engagement, as well as cognitive gains [9]. Furthermore, based on principles of instructional design for technology-enhanced language learning, appropriate use of language teaching methods for designing learning activities [10], as well as the roles played by various interactive agents in RALL, need to be examined closely in order to yield insights on effective pedagogy [11]. This systematic review thus provides details about actions taken by various interacting agents (e.g., learner, robot, instructor/facilitator) in RALL and their effects on learning outcomes to help language practitioners develop interactive course design using robots in their classrooms.

The Review Study
This study aimed to conduct a systematic review, which is a type of review under the Search, Appraisal, Synthesis, and Analysis (SALSA) framework [12,13]. A systematic review adheres to a set of guidelines to address research questions by identifying reliable, high-quality data on a topic. Researchers who conduct this type of review (a) undertake exhaustive, comprehensive searching, (b) apply inclusion/exclusion to appraise the data, (c) synthesize the data through a narrative accompanied by tabular results, and (d) analyze what is known to provide recommendations for practice, or analyze what is unknown and state uncertainty around findings with recommended directions for future research [12].
Previous research has investigated the affordances of educational robots and analyzed the learning goals of robot use for different age groups [7]. However, one research topic that remains unexplored in RALL is the cooperation between the teacher and robot and the resulting language teaching and learning model in this cooperation mode [5]. It is therefore necessary to delve into the implementation of RALL in the classroom by focusing on the interactions, including the activity design, the interactive agents involved, and interaction processes. It is also important to identify how these interaction elements affect the learning outcomes and shape learners' experiences in RALL. Four research questions were therefore formulated as follows:
RQ1: What language teaching methods are incorporated in the design of oral interactions in RALL?
RQ2: Which types of oral interaction task design are employed in RALL?
RQ3: What roles do robots and instructors fulfill when facilitating oral interactions in RALL?
RQ4: What are the learning outcomes of RALL oral interactions in terms of learners' cognition, language skills, and affect?

Oral Interactions in Language Classrooms
Traditionally, interaction is the process of "face-to-face" action channeled either verbally through written or spoken words, or non-verbally through physical means such as eye contact, facial expressions, and gesturing [14]. In second or foreign language development, comprehensible input plays an important role [15]. That is, language learners must be able to understand the linguistic input provided to them in order to communicate authentically through spoken or written forms. In particular, classroom oral interaction involves listening to authentic linguistic output from others and responding appropriately to continue in a communicative event such as role play, dialogue, or problem-solving [16,17]. Classroom oral exchanges involve two interlocutors speaking and listening to each other in order to predict the upcoming content of the communicative event and prepare for a response [18]. As a consequence, providing the context for negotiation of meaning becomes a crucial part of facilitating classroom oral exchanges that range from formal drilling to authentic, meaning-focused communication such as information exchange [19,20]. Aside from establishing the context for oral interactions, creating intended communication behaviors among learners is another goal for language instructors. According to Robinson [14], two types of interaction can be found in a classroom: verbal and non-verbal interaction. Verbal oral interactions refer to communicative events such as speaking to others in class, answering and asking questions, making comments, and taking part in discussions. Non-verbal interaction, on the other hand, refers to interacting through behaviors such as head nodding, hand raising, body gestures, and eye contact [17]. As educational robots assume humanoid forms, they can help achieve various types of classroom oral interactions in RALL.

Affordances of Educational Robots for Language Learning
As [21] reported, educational robots began to emerge in North America, South Korea, Taiwan, and Japan in the mid-2000s. These robots took anthropomorphic forms and assumed the role of peer tutors, care receivers, or learning companions. They have an outer appearance of anthropomorphized robots with faces, arms, mobile devices, and tablet interfaces attached to their chests [21]. With different functions such as voice/sound, facial, gestural, and position recognition, RALL is perceived to be more fun, credible, enjoyable, and interactive than computer-assisted language learning, which relies on mobile devices (e.g., smartphones or tablets) only. Different stimuli can be provided as robots assume roles such as human or animal characters that speak, move, or make gestures [21] to tell stories. The various multimodal sources of input and interactions make RALL a promising field with numerous possibilities in interactive design for language learning. In addition, as the robot-assisted learning mode is still in its infancy, there remains great potential for researchers and educators to postulate language learning models for best practices.

Human-Robot Interaction in RALL
Prior research has shown that human-robot interaction (HRI) can lead to language development. In a review study [22], comprehensive insights were provided about the effects of HRI on language improvement, including robots' positive impact on learner motivation and emotions due to novelty effects, and the multifaceted robotic behaviors that provide social and pedagogical support to learners. By being immersed in real-life physical environments and manipulating real-life objects, learners can also experience embodied learning to improve their vocabulary, speaking, grammar, and reading. Whole body movements and gestures have been found conducive to vocabulary learning, for example.
Robots are capable of complementing humans in language learning scenarios that focus on specific language skills such as speaking, grammar, or reading. Studies have concluded that robots can help children gain vocabulary as well as human teachers can. Furthermore, the use of robots in language learning has a great impact on learners' affective state, including learning-related emotions. In the presence of a humanoid robot instead of a human teacher, learners' anxiety is reduced, and they are less afraid of making mistakes. Higher confidence has also been reported among teenage students when they practiced speaking skills in robot-assisted situations [22].

Applying Language Teaching Methods in Interactive Design in RALL
Cheng et al. [7] claimed that language education ranks as the top learning domain for the application of educational robots. The reported types of language learning varied from general, foreign, to second or additional language skills; and the popular age levels for applying RALL were between the ages of three and five (preschool), and prior to puberty (primary school), as these are two critical periods for language learning. Further connection needs to be made between language teaching methods and RALL instructional design. In this regard, the notion of Didaktik can be applied [23]. Didaktik is a German term comparable to the North American concept of instructional design that considers learner needs, task design, and learning materials. Jahnke and Liebscher [23] argued that an emphasis should be put on the role of the teacher and how his/her course design translates or connects to student learning and performance. The Didaktik system has three components: the instructor, the learner, and the course content or design. The design of second and/or foreign language learning activities involves the incorporation of teaching methods as a basis for the intended learning experience.
As outlined by [24], twentieth-century language instruction mainly employed a number of language teaching methodologies in second or foreign language learning settings. According to [24], language practitioners continuously swing between methodologies that are strictly managed and those that are more laissez-faire in terms of content and amounts. On one side of the pendulum swing stand the traditional methods developed in the early twentieth century; these include grammar translation, the direct method, and the reading method. By the mid-twentieth century, the audiolingual method (ALM) emerged mainly for teaching oral skills. Highlighting drill-based practice, ALM presents specific language structures (e.g., sentence patterns) to learners in a systematic and organized manner and helps them replace native language habits with target language habits. The method also includes pronunciation and grammar correction through drills.
Following ALM was the emergence of total physical response (TPR) and teaching proficiency through reading and storytelling (TPRS). As a method, TPR [25] directs learners to listen to commands in the target language and immediately respond with the commanded physical action. TPRS extended from TPR and aimed to develop oral and reading fluency in the target language. By having learners tell interesting and comprehensible stories in the classroom, TPRS has been perceived as a useful technique for fostering 21st-century speaking skills, connecting closely with the concept of comprehensible input and the natural approach [26].
As ALM gradually faded in the 1980s, communicative approaches such as communicative language teaching (CLT) became the dominant foreign and second language teaching paradigm, and have continued to gain popularity worldwide in the 21st century [27]. In a way, CLT makes up for shortcomings of ALM by focusing on the functional aspect of language rather than the formal aspect. Therefore, CLT mainly trains learners' communicative competence through authentic interactions (e.g., role-play scenarios) instead of ensuring pronunciation or grammatical accuracy [28]. CLT activities usually incorporate meaningful tasks such as interviews, role-play, and opinion giving [29].

Search Strategy
The authors employed a search strategy to retrieve articles published between 2010 and 2020 [30,31] in order to survey the development of RALL in the past decade. The databases included Web of Science, ERIC, and EBSCO, while journal sources included ten journals, most of which were from the Social Sciences Citation Index, in the field of educational technology and computer-assisted language instruction (e.g., Computers & Education, British Journal of Educational Technology, Computer-Assisted Language Learning, Educational Technology Research & Development, Interactive Learning Environments, System). The researchers conducted six searches using the following key terms: "Interactive robots AND language learning," "L1 learning AND robots," "L2 learning AND robots," "Educational robots," "Robot," and "Humanoid," which led to the retrieval of 1897 articles.

Study Selection
After the initial article retrieval, the researchers carried out a study selection process. The researchers first eliminated inaccessible, duplicate, and non-English articles, which reduced the number of articles to 1887. After these articles were removed, the remaining studies were screened by title, abstract, and type of study. Specifically, titles and abstracts that indicated the use of robots for language learning were selected. Also, only empirical studies were selected. Therefore, other article types such as review studies, book reviews, proceedings, and editorials were eliminated, leaving 1202 studies for further screening based on the Method, Results, and Discussion sections. In particular, the researchers evaluated the rigor of the Method section, evidence of learning outcomes in the Results, and pedagogical implications in the Discussion. This led to 49 eligible studies for inclusion/exclusion.

Eligibility: Inclusion/Exclusion Criteria
With a total of 49 studies eligible for assessment, rigorous inclusion/exclusion criteria were applied to obtain valid data on interactions in RALL. The criteria were as follows:

• The study must present physical use of robots;
• The study must focus on language learning;
• The study must employ rigorous methodology with sufficient details;
• The study must report about robot-learner interactions in detail, including the specific language input and output during the interactions.
As shown in Figure 1, articles that failed to meet the inclusion criteria were removed. For example, studies that used virtual robots or studies with a focus on subjects other than language learning were removed. Similarly, studies that did not provide thorough accounts of the instructional design for oral interactions (including the language input and output in RALL) were eliminated. The final number of selected articles was twenty-two with the publication period spanning from 2010 to 2020.
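The selection flow reported above can be checked with simple arithmetic. The following is a minimal sketch that takes the article counts stated in the Search Strategy, Study Selection, and Eligibility sections and derives how many records were removed at each stage. The stage labels are paraphrases and the function name `screening_flow` is a hypothetical helper, not part of the reviewed studies or of any library.

```python
# Article counts as reported in the text; the stage labels are paraphrases,
# and the helper below is illustrative only.
STAGES = [
    ("retrieved by the six searches", 1897),
    ("after removing inaccessible, duplicate, and non-English articles", 1887),
    ("after screening by title, abstract, and type of study", 1202),
    ("after screening Method, Results, and Discussion sections", 49),
    ("after applying the inclusion/exclusion criteria", 22),
]


def screening_flow(stages):
    """Return (label, remaining, removed_at_this_stage) for each stage."""
    rows = []
    previous = None
    for label, remaining in stages:
        removed = 0 if previous is None else previous - remaining
        if removed < 0:
            raise ValueError(f"count increased at stage: {label}")
        rows.append((label, remaining, removed))
        previous = remaining
    return rows


if __name__ == "__main__":
    for label, remaining, removed in screening_flow(STAGES):
        print(f"{remaining:>5}  {label} (removed: {removed})")
```

Under these counts, the largest reduction occurs at the full-text screening stage (from 1202 to 49 studies), with a further 27 studies removed by the inclusion/exclusion criteria.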

Data Extraction
The data extraction process involved close reading of the 22 selected studies. First, the general research profile (See Table A1) with its characteristics (e.g., country, target language, implementation duration, research design, technological components) was coded. Second, based on the Didaktik instructional design model, which includes three components (the instructor, the learner, and the course design), the researchers coded content on the learning activity, role of the robot as a pedagogical agent, interactive task design, language input and output, and learning outcome in terms of cognition, affect, and skill (see Tables A2 and A3). Table 1 provides the coding scheme for the interactive oral task design (See Table A3).

Table 1. Coding scheme for the interactive oral task design (category: definition; example).
Interactive Task Design: The type of task designed to engage learners in oral interactions (e.g., drill, question-and-answer, dialogue, role-play, action commands, acting out a story); example: Drill: Recite - Robot questioning - Total physical response storytelling [32]
Interaction Mode: The number of learners in the two-way robot-learner interaction (e.g., one-to-one or one-to-many); example: Robot-Learner Interaction: One-to-many [33]
Instructional Focus: Specific goal for learning the target language items: focus on form (e.

Tabulations
A series of tabulations were conducted by one of the co-authors and one experienced research assistant. First, general characteristics were identified. For example, the target language for each study was categorized as (a) a first language, (b) a foreign language, or (c) a second language (See Table A1). Another general characteristic identified was the major theoretical foundations in RALL and their benefits and drawbacks across the 22 studies. The last general characteristic concerned the technological affordances in RALL, including the type of robot and the sensors used (See Table A1).
Second, the distribution of major language teaching methods (e.g., audiolingual method, communicative language teaching) applied in the 22 reviewed studies was tabulated (See Table A2). Many studies employed more than one language teaching method in their activities. Third, oral interaction tasks that were considered effective in the selected studies were categorized into (a) storytelling, (b) role-play, (c) action command, (d) question-and-answer, (e) drills (e.g., repeating/reciting), and (f) dialogue (See Table A3). Fourth, the roles played by the robot and the support provided by the instructor/facilitator were coded (See Table A2). The robot's main roles included (a) role-play character, (b) action commander, (c) dialogue interlocutor, (d) learning companion, and (e) teacher assistant, while the support by human instructors/facilitators included (a) procedural support, (b) learning support, and (c) technical support. Fifth, the language input and output were coded (See Table A3). Specifically, the language input mode was categorized into (a) linguistic, (b) visual, (c) aural, (d) audiovisual, and (e) gestural/physical modes; and the language output was categorized into four levels based on linguistic complexity, including (a) phonemic level (referring to the smallest sound unit in speech, e.g., the phonetic entities /b/, /æ/, and /t/ in the word bat), (b) lexical level, (c) phrasal level, and (d) sentential level. During the entire inter-coding process, one of the researchers served as the first coder and created a coding scheme to train the second coder. Then, after initial coding trials on three studies, the two coders met and discussed the resulting discrepancies before engaging in another trial. After all the studies were coded, the inter-coder reliability in terms of percent agreement was calculated to be 87%.
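Percent agreement, the reliability index named above, is the share of coding decisions on which the two coders assigned the same code. The sketch below illustrates the calculation only; the function name `percent_agreement` and the two label lists are invented for illustration and do not reproduce the review's actual coding data or its 87% result.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same number of items")
    if not coder_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for ten coding decisions (task-type categories taken
# from the coding scheme; the assignments themselves are made up).
CODER_A = ["dialogue", "drill", "role-play", "storytelling", "dialogue",
           "question-and-answer", "drill", "action command", "dialogue",
           "storytelling"]
CODER_B = ["dialogue", "drill", "role-play", "role-play", "dialogue",
           "question-and-answer", "drill", "action command", "drill",
           "storytelling"]

if __name__ == "__main__":
    print(f"{percent_agreement(CODER_A, CODER_B):.1f}%")
```

In this invented example the coders agree on eight of ten decisions, so the function returns 80.0.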

Synthesis
Synthesis of the detailed instructional design for oral interactions in RALL was based on the type of task design and the actions performed by the robot, learners, and human facilitators/instructors. The researchers synthesized the coded data to connect the nature of each task type to the actual interactions induced by the task. For example, through storytelling, a robot could read a story aloud for the learner to listen and receive the linguistic input. The learners could then be asked to recite, repeat, or act out the story in a role-play task to produce language output following the robot's content delivery or action commands. Furthermore, the language input and output, as well as the type of teacher talk afforded by the robot in each oral interactive task among the 22 studies, were analyzed to help the researchers understand the mechanisms that enriched the oral interactions. The researchers sought evidence of stimulating and engaging elements in the designed oral tasks and were able to see that the oral interactive tasks were conducive to heightening the level of motivation, interest, and cognitive engagement, which in turn fostered the development of oral skills in language education.

General Characteristics
Several characteristics in the general profile of the 22 studies were worth noting: the geographic research settings, education levels, the target language for acquisition with the robot-assisted activities, the research design, theoretical bases, and technological affordances in RALL. The countries that implemented robot-assisted oral interactions for language learning included Taiwan (n = 6), Japan (n = 3), Sweden (n = 3), Iran (n = 3), South Korea (n = 2), United States (n = 2), Turkey (n = 2), and Italy (n = 1). In terms of the distribution of RALL by learners' education levels, the results showed that primary schools engaged their learners in RALL most frequently (n = 11), followed by preschools (n = 4), higher education (n = 4), and secondary schools (n = 3). This finding suggests that robots best serve children in formal, primary schooling years, as children between the ages of 7 and 12 (the primary schooling age in most countries) still find robots fun and appealing, as opposed to older teenagers who might find them somewhat childish or less intellectually engaging. The second age group that benefited most from RALL was preschoolers. Similarly, toddlers and young children still enjoy interacting with humanoid robots. Coincidentally, primary school children and preschoolers belong to the two critical periods for language development. It is possible that since learners from these two developmental stages benefit most from enriched language learning activities, language educators devote more effort to incorporating robot-assisted oral interactive learning activities to engage learners from these two age cohorts.
Target languages in the 22 RALL studies were primarily foreign languages: English as a foreign language occurred most frequently (n = 14), followed by Russian (n = 1) and Dutch (n = 1), while first and second language learning occurred less frequently, with three studies in each category. As for the research design, the majority of the studies employed either single-group (n = 7) or between-group (n = 6) experiments; some of these experiments adopted pre-/post-test instruments (n = 6), while others adopted a survey evaluation design (n = 2). Other research designs included quasi-experiments (n = 4), ethnographic study design (n = 1), and system design and implementation evaluation (n = 1). Overall, the research instruments revealed a trend of using quantitative, summative assessment in RALL. Specifically, over 70% of the studies employed tests such as listening, speaking, word-picture association, vocabulary, reading, and writing tests to measure learners' performance of target skills. Fewer than 15% used qualitative, formative assessment on skills such as storytelling and drawing artifacts. Although 29% of the studies did use video recording to collect data on learning performance, the assessment methods remained test-oriented in RALL.
Two major theoretical bases were identified among the RALL studies: technologies for creating human-robot relationships and embodied cognition through robot-based content design. The first theoretical basis was developing robots for forming human-robot relationships through HRI. Attempts to enable humanoid robots to autonomously interact with children using visual, auditory, and tactile sensors were realized [36]. Also, RFID tags enabled mechanisms such as identifying individual learners and adapting to their interactive behaviors to successfully engage learners in actual language use. Such findings support theoretical perspectives from social psychology by highlighting similarity and common ground in learning. Applying this perspective to RALL, it was imperative that robots bear attributes and knowledge similar to those of target users [36]. Doing so led to benefits such as engaged language use, improved oral skills, and higher motivation and interest in learning. However, novelty effects were reported [37]. Also, highly structured activities for autonomous robot responses led to little variation among learner responses. Recommendations were thus made about adapting robot behaviors to learners' responses.
The second theoretical basis was applying embodied cognition through robot-based content design. In contrast to computer-based content design, which consists of a static user model and two-dimensional visual and audio content displayed on screen, robot-based content design consists of dynamic user models with visual, audio, and tangible elements, delivered by human-like humanoids with an appearance and body parts that perform face-to-face interactions [37]. In addition to tangible, interactive design, RALL design provided bidirectional interactive content through installing e-book materials, reaping combined benefits of e-learning tools and embodied language learning to improve learners' reading literacy, motivation, and habits [38].
As for technological affordances in RALL, the general functionalities included identifying multiple learners, recalling interaction history, speech recognition and synthesis, body movements, oral interactions, teaching, explaining, song playing, dancing, face recognition, language understanding and generation, dialogue interactions, motions on wheels, and interaction event tracking. Sensors such as wireless ID tags, eye/stomach/arm LEDs, RFID readers/sensors, infrared sensors, tactile sensors, and sonars were used to support the various affordances.

Language Methods Used in RALL Oral Interactions (RQ1)
The language teaching methods that were used to create RALL oral interactions were based on language instruction theories that emerged during the 20th and 21st centuries. Moreover, some studies employed more than one language teaching method in their RALL oral interaction activity design. Figure 2 shows that the most popular method adopted was CLT (n = 13), followed by TPRS (n = 7) and TPR (n = 6). Other methods were also used, such as multimedia-enhanced instruction, learning by teaching, socio-cognitive conflict (n = 6), ALM (n = 4), and multimodal stimuli (n = 2). In addition, studies that adopted multiple language teaching methods employed combinations such as CLT plus TPR plus TPRS (n = 4), CLT plus TPR (n = 2), CLT plus TPR plus ALM (n = 1), ALM plus TPRS plus TPR (n = 1), and CLT plus TPRS (n = 1).
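Because a single study can be coded with several teaching methods, the per-method counts above add up to more than the 22 reviewed studies. The sketch below shows one way such a multi-label tally and the accompanying combination counts could be produced. The study labels, the code assignments, and the helper name `tally_methods` are all hypothetical and are not taken from Tables A2 or A3.

```python
from collections import Counter

# Hypothetical coding sheet: study label -> set of teaching methods coded.
CODED_METHODS = {
    "study_A": {"CLT", "TPR", "TPRS"},
    "study_B": {"CLT", "TPR"},
    "study_C": {"ALM"},
    "study_D": {"CLT"},
}


def tally_methods(coded):
    """Count studies per method and per multi-method combination."""
    per_method = Counter()
    combinations = Counter()
    for methods in coded.values():
        per_method.update(methods)  # each method counted once per study
        if len(methods) > 1:
            combinations[" plus ".join(sorted(methods))] += 1
    return per_method, combinations


if __name__ == "__main__":
    per_method, combinations = tally_methods(CODED_METHODS)
    print(per_method.most_common())
    print(combinations.most_common())
```

With these four invented studies, the per-method counts sum to seven, which exceeds the number of studies for the same reason the reported counts exceed 22.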

Task Design for Oral Interactions in RALL (RQ2)
The task design for oral interactions was analyzed through a learner-centered perspective. The instructional design elements included (a) the task itself, (b) the language input provided by the robot and received by the learner, as well as (c) the oral language output produced by the learner. In terms of the interactive task design, the task design that led to oral interactions included dialogue (n = 11), storytelling/story acting (n = 8), question-and-answer (n = 7), role-play (n = 5), drill (n = 4), and action commands (n = 3). The instruction embedded in the task design was more form-focused (n = 12) than meaning-focused (n = 8), with only a few studies that included both in the design (n = 2). Figure 3 presents the results on the interactive task design. The mode of language input provided by the robot served as input from the learner's perspective, and mainly consisted of aural input (n = 18), followed by visual (n = 11), linguistic (n = 4), and gestural/physical input (n = 3), as shown in Figure 4. Language output produced by the learners mostly consisted of sentential, closed answers (n = 11), followed by lexical, closed answers (n = 13), and others (See Figure 5).

Role of Robots and Instructors (RQ3)
From a design-based perspective, there were five possible roles the robots played in RALL oral interactions (Figure 6).The most common role was a dialogue interlocutor (n = 12).This referred to pre-determined dialogues where the robot conversed with the learners using fixed phrases or sentences.The second most frequent role fulfilled by the robot was a role-play character, where the robot acted out a story as one of the characters in the story (n = 9), followed by a companion that sings, dances, played with the learner, or showed pictures on its screen (n = 5), a teaching assistant that helped the teacher with any part of the instructional procedure (4), and action commander that acts out certain movements commanded by the learner during an activity (n = 1).In addition, the robot served a major function of providing teacher talk.Five kinds of teacher talk were provided, including skill training (n = 12), affective feedback (n = 11), knowledge teaching (n = 7), motivational elements (n = 3), and procedural prompts (n = 2).Finally, the instructor or facilitator would, in some studies, serve to provide additional support in RALL.The types of support included procedural support (n = 9), learning support (n = 7), and technical support (n = 1) for those studies that mentioned them.
The interactive oral task design allowed the robot, human facilitators, and learners to engage in a well-orchestrated speaking practice in a contextualized and meaningful way.Some example actions performed by the interacting agents are summarized in Table 2.It is evident that RALL oral interactive mechanisms can be multifarious, each specific to the oral communicative goal and context.In most cases, the interactions were based on robotic functions such as (a) speaking [32], (b) making gestures and movements [39], (c) singing [34], (d) object detections [40,41], (e) voice recognition functions [42], and (f) display of digital content on the accompanying tablets [43].While robots were used to facilitate bi-directional communication by initiating or engaging in verbal, gestural, and physical interactive processes to allow learners to practice receptive (e.g., listening and reading) and productive (e.g., speaking and writing) language use, human facilitators constantly provided procedural, learning, and technical support [34,38] to learners during the interactive tasks.
Learners engaged mostly in productive language practice such as asking questions [33], repeating or creating words or sentences orally [34,39], creating stories orally [44] or in writing [33], performing movements [39], and acting in role plays [45].They also relied on the guidance of human facilitators with various task needs such as game introduction [46] and provision of feedback [39].The cognitive learning outcome of engaging learners in RALL oral interactions was reflected by effective academic achievement [35], increased concentration [35], understanding of new words through pictures, animation, and visual aid [44], and significant improvement in word-picture association abilities [46].Children also gained the ability in picture naming [41].In terms of the acquisition of language skills, there was significant improvement in learners' speaking skills [45].Specifically, student-talk rate and response ratio increased [39], and the RALL system helped to significantly improve speech complexity, grammatical and lexical accuracy, number of words spoken per minute, and response time [43].Pronunciation also became more native-like [43].Efficient vocabulary gains [37,40,42] and retention [42] also occurred.
In terms of language skills, there was also significant improvement in listening and reading skills [39]. The slightly structured repetitive interaction pattern was perceived as beneficial for adult Swedish learners with low proficiency levels [47]. Evidence of the development of other skills, such as physical motor skills due to the use of the robot [33] and children's ability in teaching [40], was also reported. As for affective learning outcomes, increased satisfaction, interest, confidence, and motivation, as well as improved attitudes [34,45,47-49], were found toward the use of RALL and toward learning English [48,50]. In RALL, students became more active in a native-like setting [49]. The robots also reduced learner anxiety about making mistakes in front of native speakers [51]. Class atmosphere also improved due to RALL.
Moreover, positive emotional responses were identified in various studies. Of the coded emotional responses, over 91% were positive. Only a few negative responses were identified, which showed learners' dissatisfaction with the robot's synthesized voice and facial expressions, as well as feelings of anxiety and fear of making mistakes in RALL. The positive responses are summarized as bolded keywords, which reflect the affective states of learners during RALL (see Table 3). The positive affect included emotional states such as eagerness, enthusiasm, satisfaction, appreciation, motivation, and enjoyment.
[Table 3 excerpt] The learning is a fun and interesting experience. Motivation: Highly motivated to study English using a robot.

Discussion
The review identified recent efforts in the field of RALL that applied various types of robotic sensing technologies (e.g., personal identification mechanisms with RFID tags) to enrich robot-human interactive design. By integrating other tools such as e-books into robots, the field of RALL has advanced toward more diverse instructional design. Detailed findings concerning each research question are described below and summarized in Table 4.
With regard to the first research question, findings about the language teaching methods incorporated in RALL oral interactions revealed a heavy emphasis on communicative skill training with the use of Communicative Language Teaching and Teaching Proficiency through Reading and Storytelling. On the other hand, many studies also applied Total Physical Response and the Audiolingual Method to train bottom-up language skills such as word recognition. Through RALL interactions, learners were able to experience receptive language learning [52] of vocabulary and sentences by mimicking authentic scenarios, reading the storylines, or seeing pictures in word-association tasks. Moreover, they engaged in productive language use by giving robot commands or creating stories. Such interaction opportunities in RALL can effectively enhance both productive communication (e.g., oral skills) and creative skills, which are important for 21st century learners [53].
Although the dominant language teaching methods were communicative and storytelling approaches, existing affordances of educational robots such as giving commands and voice recognition have allowed traditional methods such as audiolingual and total physical response methods to complement the top-down, communicative approach in many of the studies reviewed. To a certain extent, the audiolingual and total physical response methods reflect a bottom-up approach that drills learners with simple instructional design (e.g., dialogues or question-and-answer). This implies that activity design using CLT, TPRS, ALM, and TPR may be easy for RALL practitioners to implement and is especially applicable to the majority of RALL research settings in East Asian contexts. Because many traditional English classrooms rely on grammar translation and audiolingual methods for English learning, drill-based practices that combine ALM or TPR with communicative approaches appear to be a feasible design combination.
To address the second research question on the types of oral interaction task design in RALL, the designed tasks were aligned to language teaching methods such as teaching proficiency through reading and storytelling to fulfill such goals as (a) learning the meaning of a set of vocabulary confined to the content of a story, (b) forming personalized questions through a spoken class story, (c) reading specific language structures in a story, and (d) acting out parts of a story by repeating certain language structures in the actors' lines [54]. The results showed that through communicative, meaning-based language teaching methods, RALL practitioners could create interactive language learning tasks such as storytelling and role play with robots acting as human- or animal-like characters. However, it is worth noting that the oral output produced by learners tended to be closed answers at lexical and sentential levels, which points to future efforts to develop tasks that highlight intelligibility to fulfill meaning-focused instruction.
The pedagogical implication for RALL instructional design therefore highlights oral and reading fluency as well as communicative competence instead of grammatical accuracy. Language teachers who integrate RALL can adopt a wide array of methods along the skill-training spectrum. On one end, the tasks can focus on communicating in situated dialogues, and on the other end, the tasks can aim to improve accuracy in pronunciation or word-picture association. The instructional design consisting of these methods allows educational robots to engage learners in a context-specific manner to appeal to learners at various educational levels. This further confirms previous researchers' arguments that RALL is a feasible and valuable language learning mode for oral language development [55]. Furthermore, robots are no longer perceived as merely machines that automatically carry out a sequence of programmed actions, but as interactive pedagogical agents with multi-sensory affordances conducive to language learners' oral communication development [56].
In response to the third research question concerning the roles played by the robots and instructors, the findings showed that the robot usually played the most essential role during oral interactions in RALL, with timely support by a human instructor or facilitator. The findings are in line with previous claims that, compared to books, audio, and web-based instruction, humanoid robots can best engage learners in language learning through human-like interactions [21]. The input-output process of comprehensible linguistic content that is vital in language learning [15] can be effectively fulfilled by oral interactions provided by robots.
As for the fourth research question, various learning outcomes in terms of cognition, language skills, and affect were identified. For cognitive learning outcomes, RALL effectively facilitated learners' understanding of vocabulary across all age levels. This echoed the findings of [57] that robot-assisted learning can effectively lead to cognitive gains in target subject domains (e.g., mathematics and science) with robots' complex, multi-sensorial content and interactions. In this study, the subject domain is language; therefore, the cognitive learning gain is mostly focused on vocabulary comprehension (e.g., closed answers at the lexical level), which was reported as a major focus in the RALL oral instructional design. For the skill-based learning outcomes, significant improvement in speaking abilities, including complexity, accuracy, and pronunciation, was evident in numerous studies. This suggests that oral interactions facilitated by robots are promising for improving oral proficiency among language learners. As put forth by Mubin et al. [58], robots have efficient information and processing affordances, which can reduce learners' cognitive workload and anxiety compared to traditional instructional modes. The review findings support the view that robots can foster speaking abilities without incurring anxiety or extra cognitive demands on the learners.
In terms of the affective learning outcome, which is an important aspect of language acquisition, the presence and affordances of educational robots made the learning experience more exciting, enjoyable, fun, and encouraging. The learners became more eager, enthusiastic, and confident in class under RALL conditions. These positive emotional states serve as advantages of incorporating educational robots in language education. In this respect, previous research has included emotional design as one of the instructional conditions in multimedia learning that enhanced learning [59] with increased motivation and better performance. According to [59], positive emotional states during learning can activate retention and comprehension. The review thus supports the positive impact of robot-assisted interactions in language learning scenarios.

RQ # | Corresponding Findings
1 | Communicative language teaching and teaching proficiency through reading and storytelling are often complemented by total physical response and audiolingual method, which train bottom-up oral interaction skills.
2 | Applying communicative, meaning-based language learning principles, interactive oral tasks (e.g., dialogue, storytelling, role play) with robots were used to provide speaking practice with a focus on communicative competence instead of grammatical accuracy.
3 | Robots' roles included a dialogue interlocutor, role-play character, learning companion, and teaching assistant; instructors' roles included providing additional support such as procedural support, learning support, and technical support.
4 | Learning outcomes in RALL consisted of cognitive gains in target subject domains, skill-based improvements in various aspects of speaking, and a more exciting, enjoyable, fun, and encouraging affective learning experience.

This review study had three limitations. The first limitation concerns the small sample size of the articles reviewed (n = 22). This limitation is mainly due to the currently limited number of studies on RALL oral interactions in existing databases, as RALL is a new research niche with gradually growing efforts focusing on the analysis of instructional design involving various interacting agents. However, with a narrow research focus and strict inclusion/exclusion procedures, the review did reach data saturation, since the studies provided rather rich data for answering the research questions. Other systematic reviews with relatively small sample sizes have also proven to be valuable when rigorous systematic review procedures were followed [60]. Secondly, the studies varied in terms of educational levels, which in part was also due to the constraint of a small sample size. Despite this limitation, the authors were able to obtain the expected patterns, as the focus was on analyzing instructional design for interactions in language learning with the use of educational robots. The third limitation was the duration of the 22 studies: most of them were not longitudinal, and therefore the researchers cannot make claims about learning outcomes in the long run.

Conclusions
This systematic review reported on general research trends for RALL and analyzed interactions among various agents (the robot, the learners, and the human facilitator) across educational levels. Specifically, the research questions focused on (a) the language teaching methods, (b) instructional design, (c) roles of robot and instructors/facilitators, and (d) cognitive, skill-based, and affective learning outcomes. The review findings suggested that RALL instructional design employs communicative language teaching and storytelling as the most dominant language teaching methods, and these two methods are often complemented by audiolingual and total physical response methods. The learning tasks are based on the principles of the identified language teaching methods, and the resulting interaction processes and effects proved to be conducive to language acquisition. Interaction effects from the learning tasks led to positive cognitive, skill-based, and affective outcomes in language learning.
By examining the benefits and drawbacks of RALL theoretical perspectives and design practices, the review contributes to the research field of robot-assisted language teaching and learning with an in-depth exploration of effective instructional design elements and their effects on interaction processes and language learning. The detailed analysis adds new insights and provides specific design elements to guide RALL practitioners, including teachers, instructional designers, and researchers.
Future research should aim to develop more sophisticated functions to improve the accuracy and adaptivity of mechanisms such as speech recognition, feedback giving, and personal identification, and to engage multiple learners in RALL interactions via collaborative oral tasks. In addition, as storytelling appears to be a recent trend of activity design in RALL, forming detailed and applicable storytelling rubrics that emphasize intelligibility in oral production via functions such as automatic speech recognition will help ensure the meaning-focused nature of interactive RALL. It will also be worthwhile to investigate innovative ways to design and assess interactions for learners at different educational levels using innovative teaching methods. Efforts should also aim to combine RALL with other emerging technologies such as the use of tangible objects and internet-of-things technology [61] to better facilitate authentic and embodied language learning for young learners.

Figure 2. Language teaching methods in RALL oral interactions. NOTE: CLT = Communicative Language Teaching. TPRS = Teaching Proficiency through Reading and Storytelling. TPR = Total Physical Response. ALM = Audiolingual Method.

Figure 3. Type of task design for RALL oral interactions.

Figure 4. The mode of input the robot provides to the learner.

Figure 5. Type of oral output produced by learners.

Figure 6. Roles played by robots in oral interactions.
Finally, specific emotional design in RALL that leads to socio-emotional development among young learners holds promise as a RALL research area.

Table 1. Coding Scheme for Task Design for Oral Interactions in RALL.

Table 2. Synthesis of Actions Performed by Interacting Agents in RALL.

Table 3. Positive Cognitive, Skill, and Affective Learning Outcome.

of Cognition | Contributing Factor to Learners' Cognitive Development
 | Humanoid robots have the advantage of creating scenarios similar to child-child social-cognitive conflict situations
Analysis | Students were intellectually curious when learning with the robot (e.g., generate questions about mathematics and science reasoning)
Application | Asking a robot to take action using action commands (e.g., drink, sweep, play, brush)

Table 4. Alignment of research questions to review findings on RALL.

Table A2. Instructional Design and Learning Outcome of RALL.