Redefining the Role of Avatar Chatbots in Second Language Acquisition

Kaplan, Gregory B.

doi:10.3390/histories6010009

by

Gregory B. Kaplan

Department of World Languages and Cultures, University of Tennessee, Knoxville, TN 38163, USA

Histories 2026, 6(1), 9; https://doi.org/10.3390/histories6010009

Submission received: 7 October 2025 / Revised: 26 December 2025 / Accepted: 14 January 2026 / Published: 20 January 2026

(This article belongs to the Section Digital and Computational History)

Abstract

During the past decade, chatbots have been integrated into commercial platforms to facilitate second language acquisition (SLA) by providing opportunities for interactive conversations. However, SLA learner progress is limited by chatbots that lack the contextualization typically added by instructors to college and university courses. The present study focuses on a collaborative Digital Learning Incubator (DLI) project dedicated to creating and testing a chatbot with a physical form, or avatar chatbot, called Slabot (Second Language Acquisition Bot), in two upper-level university courses at the University of Tennessee, asynchronous online Spanish 331 (Introduction to Hispanic Culture), and in-person Spanish 434 (Hispanic Culture Through Film). Students in these two courses believe that their oral skills would benefit from more opportunities to speak in Spanish. To provide the students with more practice and instructors with a tool for assessing Spanish oral skills in online and in-person courses, the DLI project objective was to advance current avatar chatbot platforms by enabling Slabot to elicit student responses appropriate for evaluation according to the American Council on the Teaching of Foreign Languages (ACTFL) standards. An initial test of Slabot was conducted, and the results demonstrated the potential for Slabot to achieve the project objective.

Keywords:

avatar chatbot; asynchronous university course; El laberinto del Fauno (Pan’s Labyrinth); online learning; Second Language Acquisition (SLA)

1. Redefining the Role of Avatar Chatbots in Second Language Acquisition

The chatbot, a computer program that employs voice or text commands to simulate dialogues with human users, is an essential component of commercial second language acquisition (SLA) platforms, such as Babbel and Duolingo. However, SLA progress is often limited by commercial chatbots that do not advance users beyond repetitive exchanges because they are unable to provide the contextualization necessary for inspiring more substantive discourse. Such contextualization is typically an integral component of college and university SLA courses, although U.S. institutions of higher education have made little progress toward integrating chatbots. Integrating chatbots into such courses may help instructors overcome a challenge faced in online and in-person SLA courses, namely the need to create regular opportunities for students to practice the second language orally. This study focuses on a collaborative Digital Learning Incubator (DLI) project at the University of Tennessee that created a chatbot with a physical form (more precisely, an avatar chatbot) called Second Language Acquisition Bot, or Slabot, which was tested in an upper-level SLA course, asynchronous Spanish 331 (Introduction to Hispanic Culture). Evidence suggests that students in asynchronous Spanish 331 believe their oral skills would benefit from more opportunities to practice speaking Spanish. Slabot was thus designed to draw on a course context shared by all students and to inspire them to produce substantive discourse beyond the repetitive level. The goal was not only to increase their opportunities for oral practice and to perfect their grammatical skills, but also to provide instructors with student responses of sufficient length for assessment according to the widely accepted American Council on the Teaching of Foreign Languages (ACTFL) standards.

2. Context

Spanish 331, which is taught completely in Spanish, develops language skills within cultural contexts that reflect the research interests of the instructor, as the catalog description of the course explains (University of Tennessee Undergraduate Catalog 2025):

Introduction to the fundamental historical, political, and demographic developments that led to the creation, geographic distribution, and distinctive character of Hispanic cultures with attention to those qualities that distinguish Hispanic culture from other cultures, as well as to ethnic and linguistic components of the Hispanic world in the present day.

The geographical focus of Spanish 331 varies from semester to semester and can center on Spain or Spanish America. Spanish 331 is offered in two formats: in-person and asynchronous. The present study involves an asynchronous offering of Spanish 331 focusing on the culture of Spain from prehistory to the present day, which was delivered during a 16-week semester in the spring of 2025.

One invariant in Spanish 331 is an emphasis on developing comprehension, writing, and oral skills in Spanish. Comprehension is a skill that develops through attentive reading, an activity in which all students can participate by completing the reading assignments. Similarly, writing skills develop through the completion of tasks that reinforce vocabulary and themes discussed in the readings.

At the same time, the development of students’ oral skills in Spanish 331 presents a pedagogical challenge to in-person instructors, whose ability to engage students consistently is limited by time constraints, and to asynchronous instructors, who do not engage with students orally.

Comments made by students in their course evaluations of asynchronous Spanish 331 indicate that they seek more opportunities to speak Spanish. These comments are recorded and housed by the University of Tennessee’s Office of Institutional Effectiveness (https://utk.campuslabs.com/faculty/#/ (accessed on 2 January 2026)). One of the most poignant comments was made by a student evaluating asynchronous Spanish 331 in the fall of 2019:

I do not think that the class should be online. In my opinion, the class is not just learning about the history of Spain, it is also about practicing your Spanish listening and speaking skills. If this was not a component of the course then it would not be required for a Spanish minor. I do not think it is possible to effectively practice and continue to learn a language through a hybrid or online class.

A comment by a student after the spring of 2025 reveals a similar desire for oral practice in asynchronous Spanish 331: “This is an online class, but I feel like, even so, there were no chances to practice conversational or speaking skills. 100% of this work is reading, writing, and listening, so I had to hire a tutor to practice speaking for my study abroad trip to Costa Rica.” At the core of these comments lies a measure of discontent over insufficient interaction with the instructor, which aligns with a widely accepted pedagogical theory advanced by Long (1985), according to which interaction is paramount in promoting SLA. The incorporation of an avatar chatbot into asynchronous Spanish 331 gave students a new outlet for the interaction essential for developing their language skills.

3. Review of Related Literature

During the decades following Long’s seminal study, research (Jdetawy 2011; Rabab’ah 2003) has revealed that the development of language skills is hindered when opportunities to speak a second language are sporadic rather than regular. Other scholars (Zrekat et al. 2016; Zrekat and Al-Sohbani 2022) have demonstrated that students frequently believe they do not receive sufficient opportunities to practice speaking the second language in SLA courses.

Research (Pan and Steed 2016; Peterson 2005) has indicated the effectiveness of avatar chatbots in improving SLA by facilitating interaction. Additionally, a link has been forged between the use of avatar chatbots and other AI tools and greater accuracy in language assessment (Aneja et al. 2021; Betal 2023). Scholars (Hyde et al. 2015; Schrader 2019; Steptoe et al. 2010; Tien Tan et al. 2025) have also found that the effectiveness of avatar chatbots is enhanced by incorporating animation techniques and lip-syncing into their design, by personalizing them (with the instructor’s voice, for example), and by linking avatar chatbots to learning contexts that draw on course content (Vladova et al. 2019).

4. Method

The learning context in the present study is a unit in asynchronous Spanish 331 on the Spanish Civil War (1936–1939) and the dictatorship of Francisco Franco (1939–1975), which requires viewing del Toro’s (2006) feature-length film, El laberinto del Fauno (Pan’s Labyrinth). Slabot was designed to complement the mythical post-Civil War world in El laberinto del Fauno, which is centered around an abandoned labyrinth and a mysterious creature, “Fauno.” For a discussion of the Fauno character, see https://screenrant.com/pans-labyrinth-movie-faun-creature-explained/#:~:text=The%20Faun%20Creature%20from%20Pan’s,traits%20mixed%20together%20more%20cohesively (accessed on 2 January 2026). Physically, Fauno is a visually striking character that combines human and animal characteristics, as can be seen in Figure 1.

To increase its appeal to students, Slabot was conceived at the University of Tennessee as a university-age avatar by the DLI collaborators, including Dr. Gregory B. Kaplan (Professor of Spanish) as faculty leader, and a technical production team led by Dr. Jason Johnston (Executive Director of Online Learning) and comprising Naomi Breeden (Executive Director of Solutions), Michael Eilers (Graphic Designer), Chris Emberton (Assistant Director of Interactive Learning Technologies), Jonathan Fuqua (Web Application Developer), and Dani Powers (Graphic Designer). An initial attempt to create the physical form of Slabot was made using Creatify (https://app.creatify.ai/home, accessed on 2 January 2026), a platform like ManyChat (http://manychat.com, accessed on 2 January 2026) and Tars (https://hellotars.com/, accessed on 2 January 2026). The process required entering text describing the anticipated physical features of Slabot, as well as descriptions of Slabot’s clothing and the background that would appear:

Physical features: A visually striking non-binary character that combines human and animal characteristics. Small horns on the head. Light green skin. Human torso. Goat legs and hooves. Narrow eyes slanted inward, with a hint of playfulness. A cheerful, non-threatening, and studious look on its face. White, human teeth. A face that is half human and half goat, a tall and lean body, and a deep, rough voice.

Clothing: A dark green cloak that covers the whole body.

Background: The entrance to a labyrinth made from thick hedges.

The avatar produced by Creatify is seen in Figure 2.

The physical appearance of the avatar in Figure 2 was ultimately deemed insufficiently user-friendly, so a second attempt was made to create a more approachable Slabot. As seen in Figure 3 and Figure 4, Slabot was endowed with congenial physical features. To achieve the physical form of Slabot, Adobe Character Animator was used in a process described as follows by Michael Eilers, the University of Tennessee graphic designer who accomplished the task:

First, the DLI team finalized the character design for Slabot, whose physical appearance consists of a youthful figure with a head half-human and half-goat, bright, playful eyes, and a cheerful, non-threatening expression on its face. I then used Adobe Character Animator to build the avatar. One of the more challenging aspects of building Slabot was isolating the mouth visemes and making sure that these visemes correlated with the phonetics of the audio track. A viseme is a visual representation of a phoneme, which is a basic unit of speech sound. In other words, a viseme comprises the mouth shape and facial expression that correspond to a particular sound or word when lip-reading or animating a character. Visemes are important for lip reading, animation, and audio–visual speech recognition. During the isolation process, constantly checking the timing between mouth visemes and the audio was crucial. In sum, I was able to generate a “rough” animation with the audio files provided and clean up the timing of Slabot’s mouth by modifying visemes within the timeline.
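The relationship between phonemes, visemes, and audio timing described in the passage above can be illustrated with a minimal sketch. It assumes a hypothetical list of timed phonemes (such as a speech aligner might produce) and a hypothetical, simplified phoneme-to-viseme table; the names `PHONEME_TO_VISEME`, `Cue`, `visemes_from_phonemes`, and `timing_drift` are illustrative only and are not taken from the Slabot project or from Adobe Character Animator.

```python
from dataclasses import dataclass

# Hypothetical, simplified phoneme-to-viseme table (illustrative only).
PHONEME_TO_VISEME = {
    "a": "Ah", "e": "Ee", "i": "Ee", "o": "Oh", "u": "Oo",
    "m": "M", "b": "M", "p": "M",
    "f": "F", "l": "L", "s": "S", "d": "D", "t": "D", "n": "D",
}


@dataclass
class Cue:
    viseme: str   # mouth-shape label
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def visemes_from_phonemes(timed_phonemes):
    """Turn (phoneme, start, end) triples into viseme cues.

    Adjacent phonemes that share a mouth shape are merged into one cue.
    Unknown phonemes fall back to a neutral mouth.
    """
    cues = []
    for phoneme, start, end in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "Neutral")
        if cues and cues[-1].viseme == viseme and abs(cues[-1].end - start) < 1e-9:
            cues[-1].end = end  # extend the previous cue
        else:
            cues.append(Cue(viseme, start, end))
    return cues


def timing_drift(cues, audio_duration):
    """Seconds between the end of the last cue and the end of the audio."""
    return audio_duration - cues[-1].end if cues else audio_duration


# "hola": the "h" is silent, so only three phonemes are timed (made-up timings).
hola = [("o", 0.00, 0.12), ("l", 0.12, 0.20), ("a", 0.20, 0.35)]
hola_cues = visemes_from_phonemes(hola)
print([cue.viseme for cue in hola_cues])
```

A nonzero drift, or a cue that starts noticeably before or after its sound, is the kind of mismatch that would be corrected by hand on the animation timeline.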

The audio files to which Michael Eilers refers above contain the discourse that Slabot would speak to students. Considering evidence (Tien Tan et al. 2025) confirming that students prefer avatar chatbots with characteristics of the instructor, the instructor of Spanish 331 gave his voice to Slabot. Slabot’s discourse was designed to engage students in Spanish by always ending its replies to students with a question in Spanish. In addition, two basic functions for Slabot were established: 1. Greeting students; 2. Asking students to discuss themes related to El laberinto del Fauno.

During a stage of this project to be completed in the fall of 2025, the two basic functions will be manifested by a greeting and five questions asked by Slabot. Students will control the pace of the interview by manually pushing the stop and play buttons and will be able to hear each question more than once if needed. However, students will only be permitted one recorded response per question.
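The interview rules just described (student-controlled pacing, unlimited replays of each question, and a single recorded response per question) can be sketched as a small state object. This is a minimal illustration under stated assumptions, not the Slabot implementation; the class `InterviewSession`, its methods, and the recording file names are hypothetical.

```python
class InterviewSession:
    """Tracks question replays and enforces one recorded response per question."""

    def __init__(self, questions):
        self.questions = list(questions)
        self.play_counts = [0] * len(self.questions)
        self.responses = {}  # question index -> recording identifier

    def play(self, index):
        """Return the question text; students may replay it as often as needed."""
        self.play_counts[index] += 1
        return self.questions[index]

    def record(self, index, recording):
        """Store a response, rejecting any second attempt for the same question."""
        if index in self.responses:
            raise ValueError(f"Question {index + 1} already has a recorded response")
        self.responses[index] = recording

    def is_complete(self):
        """True once every question has exactly one recorded response."""
        return len(self.responses) == len(self.questions)


session = InterviewSession([
    "¿Cuáles son esas tareas?",
    "¿Cómo trata el capitán Vidal a Ofelia?",
])
session.play(0)
session.play(0)                      # a replay is allowed
session.record(0, "respuesta_1.wav")  # hypothetical file name
print(session.play_counts, session.is_complete())
```

Keeping the replay count separate from the response store makes it possible to allow repeated listening while still accepting only the first recording.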

The first five questions will be formulated to inspire students to progress through levels of profundity in their responses, from repetitive details to speculative discourse:

Hola, voy a hacerte cinco preguntas sobre la película El laberinto del Fauno.

(Hello, I will ask you five questions about the Pan’s Labyrinth movie.)

Primera pregunta: En la película, el papel de Fauno es guiar a Ofelia en sus tres tareas. ¿Cuáles son esas tareas?

(First question: In the movie, Fauno’s role is to guide Ofelia in her three tasks. What are those three tasks?)

Segunda pregunta: El director de la película, Guillermo del Toro, mezcla un mundo real con otro imaginario, ¿qué tienen en común ambos mundos?

(Second question: The director of the film, Guillermo del Toro, mixes a real world with an imaginary one: what do the two worlds have in common?)

Tercera pregunta: ¿Cuáles son algunas características del capitán Vidal?

(Third question: What are some of the characteristics of Captain Vidal?)

Cuarta pregunta: ¿Cómo trata el capitán Vidal a Ofelia?

(Fourth question: How does Captain Vidal treat Ofelia?)

Quinta pregunta: En tu opinión, ¿cuál es la moraleja de la película?

(Fifth question: In your opinion, what is the film’s moral?)

The final two questions will require students to respond to an unrehearsed context, specifically a video (https://www.youtube.com/watch?v=T-j-G2GiE6I (accessed on 2 January 2026)) about which they know nothing and which they will view only after their interaction with Slabot. In the video, Miguel Muñoz (1925–2022), who experienced the Spanish Civil War and Franco’s dictatorship firsthand, is interviewed and asked to explain some of the hardships that he endured during the 1940s (El laberinto del Fauno takes place in 1944). Slabot will ask the following two questions:

Sexta pregunta: “El laberinto del Fauno tiene lugar en 1944. En unos días verás un vídeo sobre Miguel Muñoz, un hombre que vivió en España durante los años 40. ¿Qué crees que Miguel dirá sobre cómo era su vida entonces?”

(Sixth question: “El laberinto del Fauno takes place in 1944. In a few days you’ll see a video about Miguel Muñoz, a man who lived in Spain during the 1940s. What do you think Miguel will say about what his life was like back then?”)

Séptima pregunta: “Miguel Muñoz nació en 1925. ¿Cómo le describirías a Miguel cómo su vida podría verse afectada por los acontecimientos mundiales (políticos, sociales, económicos o militares) entre 1925 y 1945?”

(Seventh question: “Miguel Muñoz was born in 1925. How would you describe to Miguel how his life might be affected by world events (political, social, economic, or military) between 1925 and 1945?”)

The capacity of Slabot to evoke substantive responses from students by guiding them through five contextualized questions may be evident in their recorded responses, which will be evaluated in the fall of 2025 according to the ACTFL ratings.

The ACTFL ratings are used at the University of Tennessee and at many other higher education institutions to assess the oral abilities of students in SLA courses. The ratings are assigned during an Oral Proficiency Interview (OPI):

The ACTFL Oral Proficiency Interview (OPI) is a valid and reliable means of assessing how well a person speaks a language. It is a 15–30-min one-on-one interview between you and a certified ACTFL tester. The OPI is an assessment that is carried out in the form of an interview, but follows an established structure and protocol in order to elicit a ratable speech sample.
(ACTFL OPI Examinee Handbook 2024, p. 4)

Based on an evaluation during the OPI of what ACTFL guidelines describe as “the ability to use language that reflects practical communication tasks and that has been learned and practiced in an instructional or other structured setting” (ACTFL Proficiency Guidelines 2024, p. 7), the assessor can award an ACTFL proficiency rating: Distinguished, Superior, Advanced High, Advanced Mid, Advanced Low, Intermediate High, Intermediate Mid, Intermediate Low, Novice High, Novice Mid, Novice Low. Descriptions of how each of these ratings is assigned are found in ACTFL Proficiency Guidelines (2024, pp. 15–23).

In addition to an in-person OPI, ACTFL also offers a virtual option:

The ACTFL Oral Proficiency Interview-Computer (ACTFL OPIc) is a proctored, internet-delivered test of oral communication. It imitates the experience of a “live” ACTFL Oral Proficiency Interview (OPI) in a virtual format. Interview questions are selected by a carefully designed computer program and delivered using a virtual avatar. On average, the ACTFL OPIc takes 20 to 40 min to complete.

The goal of the ACTFL OPIc is the same as that of the ACTFL OPI: to obtain a speech sample that a rater can evaluate in relation to the ACTFL Proficiency Guidelines (2024)—Speaking in order to assign a rating. The recordings of the test taker’s responses are made available electronically through a secure internet site to ACTFL-certified OPIc raters. The ACTFL OPIc measures a range of proficiency on the ACTFL scale from Novice to Superior.

The ACTFL OPIc was developed in response to increasing worldwide demand for oral language proficiency testing that is appropriate for both small-group and large-scale testing. It provides valid and reliable oral proficiency assessment in a format that allows hundreds of examinees to take the test online at the same time; it can be completed on demand from anywhere in the world, and at a time that is convenient for both the candidate and the proctor.
(ACTFL Oral Proficiency Interview-Computer 2024, p. 3)

The ACTFL OPIc involves interaction with an avatar, samples of which can be seen in Figure 5.

The ACTFL OPIc avatar substitutes for an in-person interviewer in the following manner:

When the test taker is ready, the ACTFL OPIc begins with the avatar stating: “Let’s start the interview now. Tell me something about yourself.” This serves as a warm-up and an opportunity for the test taker to begin using the language and to interact with the avatar before the main test begins. This warm-up activity is not rated.

The ACTFL OPIc then proceeds with the avatar asking randomly selected questions from within the predetermined pool of prompts and the test taker providing responses. After completing the last response, the test taker sees an ending screen with the message “Congratulations! You have successfully completed your test.” The test taker’s recorded speech sample is then automatically uploaded to a secure rater site.
(ACTFL Oral Proficiency Interview-Computer 2024, p. 5)

Like the in-person OPI, the ACTFL OPIc measures what ACTFL classifies as proficiency:

Proficiency describes an individual’s ability to use the language in all types of situations, with regard to topics that may or may not be familiar and in contexts that may or may not have been encountered previously. Proficiency refers to what an individual is able to do regardless of the setting, or where, when, and how the language was learned.
(ACTFL Proficiency Guidelines 2024, p. 6)

As in the case of the ACTFL OPIc, the seven questions that Slabot will ask include “topics that may or may not be familiar and in contexts that may or may not have been encountered previously.” Students in Spanish 434 will be required to discuss El laberinto del Fauno, which they have viewed, and Miguel Muñoz, with whom they are not familiar.

5. Results

An initial test of Slabot was performed in asynchronous Spanish 331 in late April 2025. The test was associated with two asynchronous classes and was an optional activity, whose successful completion counted for 1% of extra credit. During the two asynchronous classes, students viewed El laberinto del Fauno. Students were then given several days to respond to Slabot’s first question: “Primera pregunta: En la película, el papel de Fauno es guiar a Ofelia en sus tres tareas. ¿Cuáles son esas tareas? (First question: In the movie, Fauno’s role is to guide Ofelia in her three tasks. What are those three tasks?)”

Out of the 34 students in asynchronous Spanish 331, five completed the extra credit assignment. The durations of the five recorded responses, in chronological order of submission, were as follows: 33 s (2 May 2025); 17 s (5 May 2025); 41 s (8 May 2025); 11 s (9 May 2025); 1 min and 10 s (9 May 2025). The responses therefore ranged from 11 s to 1 min and 10 s, although the student whose response lasted 11 s failed to answer Slabot’s question.

Student responses were transcribed by the instructor, who provided each student with a transcription of their response and a version containing items needing correction indicated in bold type and corrected by the instructor. It should be noted that students were not provided with the English translations included in parentheses below.

The students, in chronological order of submission, received individualized feedback from the instructor by email, which was preceded by a generic explanation of what they would be receiving:

Gracias por su respuesta. Tenga en cuenta los puntos que necesitan corrección, los cuales se indican en la versión de su respuesta que incluye texto en negrita y que se corrigen posteriormente:

(Thank you for your response. Please be aware of items that need correction, which are indicated in the version of your response that includes bold type and which are corrected afterwards:)

Response 1, 33 s (2 May 2025)

Ofelia tiene tres tareas. Primero, ella necesita obtener una llave y una rana. Segundo, ella necesita usar la llave para obtener un cuchillo sin comiendo la cena. Tercero, ella necesita usar un cuchillo y robar a su hermano menor para sacrifisar la sangre de su hermano con el cuchillo.

(Ofelia has three tasks. First, she needs to obtain a key and a frog. Second, she needs to use the key to obtain a knife without eating dinner. Third, she needs to use the knife to steal from her younger brother and sacrifice his blood with the knife.)

Ofelia tiene tres tareas. Primero, ella necesita obtener una llave y una rana. Segundo, ella necesita usar la llave para obtener un cuchillo sin comer [comiendo] la cena. Tercero, ella necesita usar un cuchillo y robar a su hermano menor para sacrificar [sacrifisar] la sangre de su hermano con el cuchillo.

Verbal form error:

Use of the gerund “comiendo” instead of “comer”

Pronunciation error:

“sacrificar” not “sacrifisar”

Response 2, 17 s (5 May 2025)

En la película sus tres tareas eran conseguir una llave, robar un, un cuchillo y derramar un, una gota de sangre inocente.

(In the movie, her three tasks were to get a key, steal a knife, and spill a, a drop of innocent blood.)

Muy bien hecho. Tu pronunciación es excelente. Además, usas el género correcto en todos los casos.

(Very well done. Your pronunciation is excellent. Also, you use the correct gender in all cases.)

Response 3, 41 s (8 May 2025)

El primero es que debe que recuperar una llave del estómago de un gran sapo que vive abajo de un arból. El segundo es que tiene que recuperar una daga de una habitación de un hombre palído y debe no comer nada de la mesa. Y el tercero es debe llevar a su hermano al laberinto para completar el ritual último.

(The first is that she must retrieve a key from the stomach of a giant toad that lives beneath a tree. The second is that she must retrieve a dagger from a pale man’s room and must not eat anything from the table. And the third is that she must lead her brother into the labyrinth to complete the final ritual.)

El primero es que debe recuperar una llave del estómago de un gran sapo que vive abajo de un árbol [arból]. El segundo es que tiene que recuperar una daga de una habitación de un hombre pálido [palído] y debe no comer nada de la mesa. Y el tercero es [que] debe llevar a su hermano al laberinto para completar el último ritual [ritual último].

Errors involving conjunctions:

“debe recuperar” not “debe que recuperar”

Pronunciation errors:

“árbol” not “arból”

“pálido” not “palído”

Syntax error:

“último ritual” not “ritual último”

Response 4, 11 s (9 May 2025)

Para esta tarea yo tenía que mirar la película y yo respondí preguntas sobre la película.

(For this task I had to watch the movie, and I answered questions about the movie.)

Response 5, 1 min and 10 s (9 May 2025)

Hola, me llamo XXXXXX, en la película de El laberinto del Fauno, la Fauno le da a Ofelia tres tareas para completar. La primera tarea es ella necesita obtener un llave y, la llave, una llave. Y la llave está en el estómago de un sapo. Y el sapo está alrededor un arbol. La segunda tarea es ella necesita obtener un daga y, una daga, y la daga está con el hombre palido. El hombre palido es muy alta y tiene, tiene, el tiene ojos en sus manos. Y la tarea tercera es ella necesita mostrar su obediencia y pureza, pureza, porque, porque la Fauna pide que, pide que Ofilia mata, mata su hermano. Y porque él es un bebé ella, ella no mata su, su hermano y este es una manera para mostrar, para mostrar su obediencia y pureza. Gracias.

(Hello, my name is XXXXXX. In the movie Pan’s Labyrinth, the Faun gives Ophelia three tasks to complete. The first task is she needs to get a key, and the key, a key. And the key is in the stomach of a toad. And the toad is around a tree. The second task is she needs to get a dagger, a dagger, and the dagger is with the pale man. The pale man is very tall and has eyes on his hands. And the third task is she needs to show her obedience and purity, purity, because, because Fauna asks that Ophelia kill, kills her brother. And because he is a baby, she, she doesn’t kill her, her brother, and this is a way to show, to show her obedience and purity. Thank you.)

Hola, me llamo XXXXXX, en la película de El laberinto del Fauno, la Fauno le da a Ofelia tres tareas para completar. La primera tarea es [que] ella necesita obtener un llave y, la llave, una llave. Y la llave está en el estómago de un sapo. Y el sapo está alrededor [de] un árbol [arból]. La segunda tarea es [que] ella necesita obtener un daga y, una daga, y la daga está con el hombre pálido [palído]. El hombre pálido [palído] es muy alta y tiene, tiene, el tiene ojos en sus manos. Y la tarea tercera es [que] ella necesita mostrar su obediencia y pureza, pureza, porque, porque la Fauna pide que, pide que Ofelia [Ofilia] mata [mate], mata [mate] [a] su hermano. Y porque él es un bebé ella, ella no mata [a] su, su hermano y este es una manera para mostrar, para mostrar su obediencia y pureza. Gracias.

Gender disagreement:

“el Fauno” not “la Fauno”

“una llave” not “un llave”

“una daga” not “un daga”

“alto” (which modifies “hombre pálido”) not “alta”

“el Fauno” not “la Fauna”

“esta es una manera” not “este es una manera”

Missing conjunctions:

“es [que] ella necesita obtener”

“es [que] ella necesita mostrar”

Missing prepositions:

“[de] un árbol”

“[a] su hermano”

Mood error:

“pide que, pide que Ofelia mata [mate], mata [mate] [a] su hermano” (the third person present subjunctive form “mate” is required after the verb “pide”).

Pronunciation errors:

“árbol” not “arból”

“pálido” not “palído”

“Ofelia” not “Ofilia”

Syntax error:

“tarea tercera” instead of “tercera tarea.”

Slabot’s ability to assist students in developing their oral SLA skills is evident in a synthesized comparison of the lengths of their responses with their aggregated grammatical errors:

Response 4: 11 s, no errors.

Response 2: 17 s, no errors.

Response 1: 33 s, 2 errors.

Response 3: 41 s, 4 errors.

Response 5: 1 min, 10 s, 14 errors.

This comparison illustrates a phenomenon frequently observed by instructors, namely, that students commit more grammatical errors as their responses grow longer. At the same time, students were also able to correct themselves during their responses, as revealed in response 2 (in which the student switched from the incorrect masculine form “un” to the correct feminine form “una” in the phrase “derramar un, una gota de sangre inocente”[spill a, a drop of innocent blood]), response 3 (in which the student included the conjunction “que” [that] in the phrase “es que tiene que recuperar” [is that she must recover]), and in response 5 (in which the student switched to the correct feminine form of the indefinite article “una” instead of the incorrect masculine form “un” to modify the feminine noun “daga” [dagger] in the phrase “un daga y, una daga” [a dagger, a dagger]).
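The comparison of response length and error count can also be expressed as simple arithmetic. The following minimal sketch uses only the durations and error totals reported above; the helper names `errors_per_minute` and `pearson_r` are illustrative, and with only five responses (one of which did not address the question) the figures are descriptive rather than evidence of a statistical relationship.

```python
from math import sqrt

# (duration in seconds, aggregated errors), as reported in the comparison above.
responses = {
    "Response 4": (11, 0),
    "Response 2": (17, 0),
    "Response 1": (33, 2),
    "Response 3": (41, 4),
    "Response 5": (70, 14),
}


def errors_per_minute(seconds, errors):
    """Normalize an error count by response length."""
    return errors * 60 / seconds


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rates = {name: round(errors_per_minute(s, e), 1) for name, (s, e) in responses.items()}
durations = [s for s, _ in responses.values()]
error_counts = [e for _, e in responses.values()]
r = pearson_r(durations, error_counts)

for name, rate in rates.items():
    print(f"{name}: {rate} errors per minute")
print(f"Pearson r (duration vs. errors): {r:.2f}")
```

Normalizing by duration suggests that, in this small sample, the rate of errors as well as their total number rises with the length of the response.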

Assessments such as these would undoubtedly be more difficult to complete after in-person interviews without recorded responses, which underscores an advantage of using an avatar chatbot to conduct interviews. In other words, the fact that the interviewer is not present and occupies the singular role of observer, rather than serving as both observer and recorder, offers a perspective by which detailed grammatical corrections may be provided.

6. Discussion

A principal difference between the interview conducted by Slabot and the one conducted by the ACTFL OPIc avatar concerns the process of assessment, for which ACTFL establishes the following norms:

An assessment of performance determines whether an individual’s language use demonstrates the ability to meet the criteria for a particular level when completing a task type within familiar contexts and content areas. Performance assessment asks individuals to apply the language functions and vocabulary that they have learned and practiced during instruction.
(ACTFL Proficiency Guidelines 2024, p. 7)

The ACTFL “assessment asks individuals to apply the language functions and vocabulary that they have learned and practiced during instruction,” which is the task that Slabot encourages students to do by asking questions concerning the film they have recently viewed. Context is, then, essential for the determination of whether “language functions and vocabulary” have been mastered and whether a spoken “language” is being used.

However, scholars have identified a problem with the ACTFL method of contextualization for both the in-person OPI and the ACTFL OPIc. Specifically, the problem identified by Bachman is that the ACTFL process introduces a measure of “bias” by the interviewer:

If we permit method facets to vary from interview to interview in an uncontrolled manner, they are sources of random measurement error. Test method facets such as the specific content areas included and illocutionary acts, for example, may vary from interviewer to interviewer. One way to minimize random error is to standardize the procedures, thereby controlling the conditions under which the interview is given, so that they are the same for all interviews. In the OPI, this standardization involves carefully training interviewers so that they: (a) use similar elicitation procedures; (b) use the same definitions for rating the performances of different individuals; and (c) administer the interview in the same environment and in approximately the same amount of time. However, the problem created by standardization is that controlling these method facets causes them to have a systematic effect on test scores. That is, they affect the performance of all individuals who are interviewed, and thus constitute potential sources of test bias.
(Bachman 1988, p. 153)

Bachman asserts that the “performance” of interviewers leads to “test bias,” a concern echoed by Betal (2023), who believes that in-person “language assessments are often time-consuming, subjective, and limited in scope.” Reducing this bias is a factor that distinguishes the interview conducted by Slabot from the ACTFL OPI and OPIc. Slabot will diminish the “bias” in question by providing a consistent context, such as the viewing of a film, shared by all students in a course. Unlike a randomly generated avatar such as the one in the ACTFL OPIc, the physical features of Slabot relate directly to the context in which the students are immersed.

7. Conclusions

Although the initial test of Slabot did not involve many students, those who did respond demonstrated the usefulness of this avatar chatbot in documenting the status of students’ grammatical skills during oral discourse. In the future, it is anticipated that Slabot, rather than the instructor, will produce transcriptions of student responses, allowing the instructor to dedicate additional time to performing tasks such as explaining corrections.

During the fall of 2025, the objective will be to determine whether students who answer five questions produce more accurate indications of ratable ACTFL speech samples. Rather than being an extra-credit assignment, interaction with Slabot will be a required activity in the fall of 2025 in another course in which students view El laberinto del fauno, in-person Spanish 434 (Hispanic Culture Through Film). Spanish 434, also taught entirely in Spanish, is an upper-level in-person course that typically enrolls 20–25 students and usually meets twice per week (Tuesday and Thursday) for 75-min classes during a 16-week semester. As in the case of Spanish 331, the catalog description of Spanish 434 comprises generalized discourse to reflect the fact that the content of the course varies from one instructor to another (University of Tennessee Undergraduate Catalog 2025):

Analysis of selected films on subjects concerning life, culture, and artistic traditions in the Hispanic world; exploration of ideological, philosophical, social, and political implications of films and a comparison of them with treatments of related subjects in other types of artistic production.

The offering of Spanish 434 in the fall of 2025 will focus on the Spanish Civil War and Francisco Franco’s dictatorship. Students will read several texts in Spanish and view their cinematographic adaptations. Before interacting with Slabot, students will read Guillermo del Toro’s screenplay for El laberinto del Fauno (del Toro 2005) and then view the film. The fact that students will be required to interact with Slabot in Spanish 434 might add some length to their responses.

Although five students do not constitute a large sample, the fact that one student’s response in the spring of 2025 reached 1 min and 10 s indicates the potential for Slabot to perform a pioneering function for an avatar chatbot, namely, to elicit ACTFL ratable speech samples. It is reasonable to speculate that some students might exceed 1 min and 10 s and reach 1 min and 30 s, or even 2 min. In addition, if more questions were added, it might be possible to predict an approximate length of student responses. If this were possible, it would also be possible to estimate the number of questions needed to obtain the 15–30-min ratable speech sample required for an OPI and the assignment of an ACTFL rating.
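The estimate described above can be illustrated with a brief arithmetic sketch. It is not part of the Slabot project. It assumes, purely for illustration, that every response has the same length and that only student speaking time counts toward the sample. The function name `questions_needed` and the three response lengths (the observed 1 min and 10 s, plus the speculated 1 min and 30 s and 2 min) are chosen here as examples.

```python
import math


def questions_needed(target_minutes: float, mean_response_seconds: float) -> int:
    """Smallest number of questions whose responses fill the target time.

    Assumes (hypothetically) that every response lasts exactly
    mean_response_seconds and that only student speech is counted.
    """
    if mean_response_seconds <= 0:
        raise ValueError("mean_response_seconds must be positive")
    return math.ceil(target_minutes * 60 / mean_response_seconds)


# Illustrative response lengths in seconds: 1 min 10 s, 1 min 30 s, 2 min.
for seconds in (70, 90, 120):
    low = questions_needed(15, seconds)   # lower bound of the 15-30-min sample
    high = questions_needed(30, seconds)  # upper bound of the 15-30-min sample
    print(f"{seconds} s per response: {low} to {high} questions")
```

Under these assumptions, responses of 1 min and 10 s would require 13 to 26 questions, whereas responses of 2 min would require 8 to 15. Actual response lengths would vary from student to student, so any such figure is only a rough planning guide.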

Speculation that future responses might reach 15–30 min is warranted considering the pedagogical benefits of “[e]merging technologies” such as avatar chatbots, which, as St. Fountoulakis explains, “present new opportunities for teaching languages by creating immersive learning environments that significantly increase student engagement” (St. Fountoulakis 2024, p. 21). Moreover, conclusions reached by Thompson et al. (2016, p. 90), namely, that students who prefer the ACTFL OPIc interview process over the in-person interview believe “that the computer’s nonjudgmental presence and the opportunity to have questions repeated multiple times allowed them to be more relaxed and thus better able to demonstrate their actual speaking ability,” may be an important factor in contributing to responses from student interaction with Slabot that are appropriate for evaluation according to ACTFL standards.

Of course, accurate measurements of Slabot’s success will only be possible after several semesters of testing have taken place. However, initial evidence indicates that students benefit from interacting orally with Slabot in Spanish. At a time when higher education institutions, often out of economic necessity, are offering SLA courses that enroll larger numbers of students, the use of avatar chatbots like Slabot, which unify the interview process through contextualization, may prove to be the most effective method for reaching all students in a course. The Slabot experiment model may ultimately prove to be a versatile tool for instructors of SLA courses in any language and, more broadly, for any course in which interaction is crucial for student success.

Funding

The APC was funded by a University of Tennessee Distinguished Professor in the Humanities account.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflict of interest.

References

ACTFL OPI Examinee Handbook. 2024. Available online: https://www.languagetesting.com/pub/media/wysiwyg/PDF/opi-examinee-handbook.pdf (accessed on 2 January 2026).
ACTFL Oral Proficiency Interview-Computer Familiarization Guide. 2024. Available online: https://www.actfl.org/uploads/files/general/OPIc_Familiarization_Guide.pdf (accessed on 2 January 2026).
ACTFL Proficiency Guidelines. 2024. Available online: https://www.actfl.org/uploads/files/general/Resources-Publications/ACTFL_Proficiency_Guidelines_2024.pdf (accessed on 2 January 2026).
Aneja, Deepali, Rens Hoegen, Daniel McDuff, and Mary Czerwinski. 2021. Understanding Conversational and Expressive Style in a Multimodal Embodied Conversational Agent. Paper presented at the 2021 CHI Conference on Human Factors in Computing Systems, New York, NY, USA, May 8–13; pp. 1–10.
Bachman, Lyle. 1988. Problems in examining the validity of the ACTFL Oral Proficiency Interview. Studies in Second Language Acquisition 10: 149–64.
Betal, Asim K. 2023. Enhancing Second Language Acquisition through Artificial Intelligence (AI): Current insights and future directions. Journal for Research Scholars and Professionals of English Language Teaching 7.
del Toro, Guillermo. 2005. El laberinto del Fauno. Available online: https://julianwhiting.wordpress.com/wp-content/uploads/2014/04/panslabyrinthspanishscreenplay.pdf (accessed on 2 January 2026).
del Toro, Guillermo. 2006. El laberinto del fauno (Pan’s Labyrinth). Los Angeles: Warner Bros.
Hyde, Jennifer, Elizabeth J. Carter, Sara Kiesler, and Jessica K. Hodgins. 2015. Using an interactive avatar’s facial expressiveness to increase persuasiveness and socialness. Paper presented at the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Republic of Korea, April 18–23; pp. 1719–28.
Jdetawy, Loae. 2011. Problems encountered by Arab EFL learners. Language in India 11: 19–27.
Long, Michael H. 1985. Input and Second Language Acquisition Theory. In Input in Second Language Acquisition. Edited by Susan Gass and Carolyn Madden. Rowley: Newbury House, pp. 377–93. Available online: https://www.scribd.com/doc/179333902/Long-1985-Input-and-Second-Language-Acquisition-Theory-pdf (accessed on 2 January 2026).
Pan, Ye, and Anthony Steed. 2016. A Comparison of Avatar-, Video-, and Robot-Mediated Interaction on Users’ Trust in Expertise. Frontiers in Robotics and AI 3: 12.
Peterson, Mark. 2005. Learning interaction in an avatar-based virtual environment: A preliminary study. PacCALL Journal 1: 29–40. Available online: https://www.researchgate.net/publication/228652431_Learning_interaction_in_an_avatar-based_virtual_environment_A_preliminary_study (accessed on 2 January 2026).
Rabab’ah, Ghaleb. 2003. Communication problems facing Arab learners of English. Journal of Language and Learning 3: 180–97. Available online: https://www.researchgate.net/publication/228380118_Communication_problems_facing_Arab_learners_of_English (accessed on 2 January 2026).
Schrader, Claudia. 2019. Creating avatars for technology usage: Context matters. Computers in Human Behavior 93: 219–25.
Steptoe, William, Anthony Steed, Aitor Rovira, and John Rae. 2010. Lie tracking: Social presence, truth and deception in avatar-mediated telecommunication. Paper presented at the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA, April 10–15; pp. 1039–48.
St. Fountoulakis, Michail. 2024. Evaluating the Impact of AI Tools on Language Proficiency and Intercultural Communication in Second Language Education. International Journal of Second and Foreign Language Education 3: 12–26.
Thompson, Gregory L., Troy L. Cox, and Nieves Knapp. 2016. Comparing the OPI and the OPIc: The effect of test method on oral proficiency scores and student preferences. Foreign Language Annals 49: 75–92.
Tien Tan, Chek, Indriyati Atmosukarto, Budianto Tandianus, Songjia Shen, and Steven Wong. 2025. Exploring the Impact of Avatar Representations in AI Chatbot Tutors on Learning Experiences. Paper presented at the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, April 26–May 1; pp. 1–12.
University of Tennessee Undergraduate Catalog. 2025. Available online: https://admissions.utk.edu/why-ut/undergraduate-c (accessed on 2 January 2026).
Vladova, Gergana, Jennifer Haase, Leo Sylvio Rüdian, and Niels Pinkwart. 2019. Educational Chatbot with Learning Avatar for Personalization. Twenty-Fifth Americas Conference on Information Systems 28: 1–5. Available online: https://www.researchgate.net/profile/Jennifer-Haase/publication/334279582_Educational_Chatbot_with_Learning_Avatar_for_Personalization/links/5d666917299bf11adf2748c8/Educational-Chatbot-with-Learning-Avatar-for-Personalization.pdf?origin=publication_detail&_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InB1YmxpY2F0aW9uIiwicGFnZSI6InB1YmxpY2F0aW9uRG93bmxvYWQiLCJwcmV2aW91c1BhZ2UiOiJwdWJsaWNhdGlvbiJ9fQ&__cf_chl_tk=DzGPxRXbPhNStY8B6B02wLTnj97em.Gcn.kdph6dB0Q-1762982266-1.0.1.1-fPrnQnJjr4EbnukRih4OGgYwfTf3TqS_xzo7HqZcnl0 (accessed on 2 January 2026).
Zrekat, Yousef, and Yehia Al-Sohbani. 2022. Arab EFL University Learners’ Perceptions of the Factors Hindering Them to Speak English Fluently. Journal of Language and Linguistic Studies 18: 775–90. Available online: https://www.jlls.org/index.php/jlls/article/view/3642 (accessed on 2 January 2026).
Zrekat, Yousef, Nadzrah Abu Bakar, and Hafizah Latif. 2016. The Level of Anxiety among Jordanian EFL Undergraduates in Oral Communication Performance. Arab World English Journal 7: 188–202.

Figure 1. Fauno from the film El laberinto del Fauno.

Figure 2. The initial attempt to create Slabot using Creatify.

Figure 3. A moment in the process described by Michael Eilers.

Figure 4. The final version of Slabot.

Figure 5. Sample images of the ACTFL OPIc avatar (ACTFL Oral Proficiency Interview-Computer 2024, p. 5).
