Article

Generative AI-Enhanced Virtual Reality Simulation for Pre-Service Teacher Education: A Mixed-Methods Analysis of Usability and Instructional Utility for Course Integration

1 Department of Education, Seoul National University, Seoul 08826, Republic of Korea
2 Department of Educational Leadership, Policy, and Technology Studies, The University of Alabama, Tuscaloosa, AL 35401, USA
3 Department of Physics Education, Seoul National University, Seoul 08826, Republic of Korea
* Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(8), 997; https://doi.org/10.3390/educsci15080997
Submission received: 29 June 2025 / Revised: 25 July 2025 / Accepted: 1 August 2025 / Published: 5 August 2025

Abstract

Teacher education faces persistent challenges, including limited access to authentic field experiences and a disconnect between theoretical instruction and classroom practice. While virtual reality (VR) simulations offer an alternative, most are constrained by inflexible design and lack scalability, failing to mirror the complexity of real teaching environments. This study introduces TeacherGen@i, a generative AI (GenAI)-enhanced VR simulation designed to provide pre-service teachers with immersive, adaptive teaching practice through realistic GenAI agents. Using an explanatory case study with a mixed-methods approach, the study examines the simulation’s usability, design challenges, and instructional utility within a university-based teacher preparation course. Data sources included usability surveys and reflective journals, analyzed through thematic coding and computational linguistic analysis using LIWC. Findings suggest that TeacherGen@i facilitates meaningful development of teaching competencies such as instructional decision-making, classroom communication, and student engagement, while also identifying notable design limitations related to cognitive load, user interface design, and instructional scaffolding. This exploratory research offers preliminary insights into the integration of generative AI in teacher simulations and its potential to support responsive and scalable simulation-based learning environments.

1. Introduction

Teacher education increasingly demands innovative solutions to prepare future educators for the complexities of contemporary classrooms (Grossman et al., 2009). Traditional field experiences—long considered a cornerstone of teacher preparation—are often constrained by limited access, uneven quality, and logistical challenges, especially in under-resourced or geographically isolated contexts (Darling-Hammond, 2017). In response, simulation-based learning has emerged as a promising supplement to real-world practice (Dawson & Lignugaris/Kraft, 2017; L. A. Dieker et al., 2014). However, existing virtual reality (VR) simulations in teacher education remain limited by rigid, scripted scenarios that fail to capture the dynamic and relational nature of classroom teaching (Bautista & Boone, 2015).
This disconnect highlights a critical innovation gap: while VR platforms offer structured practice opportunities, they lack the adaptability and realism necessary to approximate authentic teaching (Dalgarno et al., 2016; Theelen et al., 2019). Emerging advances in generative artificial intelligence (GenAI), however, present new opportunities to reimagine what simulation-based teacher training can achieve. By enabling AI-enhanced agents to simulate responsive, lifelike student behaviors and interactions (Lim et al., 2025; Jeong et al., 2024; N. Zhang et al., 2025), GenAI has the potential to render educational simulations more dynamic, adaptive, and pedagogically aligned (Moon et al., 2025; Zawacki-Richter et al., 2019). Despite its promise, the integration of GenAI into teacher simulation remains in its infancy, with limited empirical understanding of how such technologies can be effectively designed and implemented within authentic instructional contexts (Moon et al., 2025).
This study accordingly represents an exploratory investigation into the integration of a GenAI-enhanced teaching simulation—TeacherGen@i—within a pre-service teacher education program. Rather than testing outcomes or efficacy in a strict experimental sense, the study seeks to examine how such a tool functions in practice, what design and usability challenges emerge, and how pre-service teachers perceive its instructional value (McKenney & Reeves, 2018). In doing so, it aims to generate formative insights into the pedagogical and technical considerations associated with the adoption of generative AI in immersive learning environments.
The study employs an explanatory case study design with a mixed-methods approach, situated within a summer instructional technology course at a public university in the southern United States. Two guiding research questions frame the inquiry: (1) How do multimedia design principles inform the identification of technical and usability challenges in the simulation’s implementation? and (2) How do pre-service teachers perceive the simulation’s instructional utility in supporting their readiness for technology-enhanced teaching and learning? By framing the integration of GenAI not as a solved problem but as a site of design experimentation and critical reflection, this study contributes to emerging research on the pedagogical and technological foundations of equitable innovation in teacher education.

2. Literature Review

2.1. Virtual Reality Simulation in Teacher Education

Simulation-based learning environments, particularly those leveraging a range of extended reality technologies—including virtual reality (VR) and mixed reality (MR)—have emerged as promising tools in teacher education to address long-standing challenges such as constrained access to diverse field placements, uneven mentorship quality, and the logistical difficulties of coordinating practice opportunities across institutions and school districts. VR simulations, defined as immersive, computer-generated three-dimensional environments, allow pre-service teachers to engage in realistic and interactive classroom scenarios without the risks associated with live instruction (Dawley & Dede, 2013; Ke & Xu, 2020).
The pedagogical affordances of VR environments are particularly salient in supporting experiential learning (Loke, 2015). Prior research has demonstrated that VR simulations can promote instructional decision-making, classroom management competencies, and teacher self-efficacy by allowing pre-service teachers to rehearse instructional techniques in controlled yet realistic settings (L. A. Dieker et al., 2014; Huang et al., 2023). These environments support repeated practice without the risk of negatively impacting real students, providing a safe space for experimentation and reflection (Southgate et al., 2019). Widely used platforms such as TeachLivE and Mursion have demonstrated the potential of simulation to operationalize core teacher education principles by integrating avatars or human-operated agents into semi-scripted teaching encounters (Dalinger et al., 2020).
Despite these benefits, the very mechanisms that once made VR simulations manageable—namely, scripted dialog trees and preprogrammed avatar behaviors—now constitute their primary limitation. Their designs often rely on linear, branching interactions, which inherently constrain improvisation, dialog fluidity, and authentic learner feedback (L. A. Dieker et al., 2014; Hixon & So, 2009). As a result, such simulations struggle to emulate the spontaneous, relational complexity of real-world classrooms (Hayes et al., 2013; L. A. Dieker et al., 2014). This interactivity gap limits opportunities for pre-service teachers to practice adaptive teaching strategies—skills that are increasingly highlighted in contemporary frameworks of responsive and culturally sustaining pedagogy (Fink et al., 2024).
Moreover, these systems often impose scalability bottlenecks. Platforms requiring real-time facilitation or manually operated student avatars are labor-intensive and cost-prohibitive, particularly for programs serving rural or under-resourced populations (Benedict et al., 2016; Kaufman & Ireland, 2016). Consequently, access to high-fidelity simulation experiences remains uneven, reinforcing inequities in teacher preparation.
In response to these limitations, scholars have called for a new generation of intelligent, autonomous simulations capable of modeling the improvisational, situated nature of classroom teaching (Dai et al., 2024). Recent breakthroughs in generative artificial intelligence (GenAI) offer a compelling trajectory forward (Choi et al., 2024). GenAI-powered agents can simulate student discourse that is both unscripted and context-sensitive, producing emotionally expressive responses and dynamically adapting to teacher input (S. Lee & Ahn, 2021; L. Dieker et al., 2023). Such capabilities signal a potential paradigm shift in how teacher preparation programs conceptualize practice, feedback, and reflection.
However, despite the technological readiness of GenAI, its pedagogical application within teacher education remains underexplored. Few empirical studies have examined how GenAI-enhanced simulations function in authentic instructional settings, or how their design choices impact usability, cognitive load, and perceived instructional value. Addressing this gap requires not only a technical inquiry but also a pedagogical one: how can we meaningfully integrate generative AI into simulations that support teacher learning, scaffold critical reflection, and promote transfer to real classrooms?

2.2. GenAI-Enhanced Teacher Simulation

Recent advances in generative artificial intelligence (GenAI) have expanded the potential of teacher education simulations by enhancing interactivity, adaptability, and personalization. Unlike traditional script- or rule-based branching simulations with limited, pre-programmed responses, GenAI-powered systems can interpret teacher inputs in real time and generate improvisational, contextually appropriate student reactions. These capabilities support more authentic and responsive learning environments, enabling pre-service teachers to practice dynamic instructional strategies within safe, realistic contexts. GenAI-enhanced simulations offer several key affordances: they enable context-sensitive interaction through adaptive student agents (Fink et al., 2024); provide personalized feedback by tailoring content and difficulty to individual learner needs; support high scalability and reusability through simple prompt modifications that allow for diverse student profiles and classroom scenarios (Bommasani et al., 2021); and reduce dependency on human facilitators, enabling repeated, self-paced practice without compromising instructional quality.
Reflecting these advantages, a growing body of research has begun exploring GenAI’s application in teacher training. For instance, SimClass (Z. Zhang et al., 2025) created a multi-agent classroom simulation where GenAI agents enacted roles such as quiet or active students, fostering organic interactions and dynamic classroom atmospheres in two university courses. Similarly, U. Lee et al. (2023) and Lim et al. (2025) embedded GPT-4 student agents in a 3D Roblox classroom, allowing pre-service teachers to safely engage in problem-solving lessons, with pilot studies indicating high usability and learner engagement. Extending this line of inquiry, Nygren et al. (2025) implemented a GenAI-driven mentoring system within a mixed-reality environment and found that AI-generated feedback effectively complemented expert guidance, enhancing preservice teachers’ reflective practice. Docter et al. (2024) designed an AI-enabled virtual-reality simulation that immerses preservice teachers in a controlled yet unpredictable classroom, enabling rich verbal interaction with virtual pupils and targeted practice in classroom-management strategies. Across these and other studies, GenAI-based simulations have been shown to promote improvisational decision-making, flexible instructional strategies, safe environments for trial and error, and deep reflection through automated log analysis (J. Zhang et al., 2024; Zheng et al., 2025). Nevertheless, despite these promising developments, most existing research remains in the exploratory phase, and few empirical studies have closely examined the usability, effectiveness, and educational impact of GenAI simulations in real-world teacher education settings. In response, this study adopts an explanatory case study approach to investigate the implementation and pedagogical implications of a GenAI-enhanced teaching simulation, contributing to a deeper understanding of how such technologies can meaningfully support teacher learning and development.

2.3. Instructional Design and Usability: A Dual-Theoretical Lens Using CTML and Gagné’s Nine Events of Instruction

The Cognitive Theory of Multimedia Learning (CTML; Mayer, 2024) offers a foundational framework for understanding how learners process information in multimedia-rich environments. Built upon the principles of dual-channel processing, limited working memory, and active knowledge construction, CTML emphasizes how instructional design can support meaningful learning through visual-verbal integration and cognitive load management (Clark & Mayer, 2016). In immersive simulations—particularly those that include generative AI interactions—CTML principles such as coherence, segmenting, and modality are critical in shaping user experience. These principles inform not only interface clarity but also instructional usability—that is, whether learners are cognitively supported in navigating complex, multimodal tasks. Moreover, CTML enables simulations to foster proactive reflection by structuring content in ways that help learners monitor, organize, and evaluate their instructional decisions in real time (Clark & Mayer, 2016; Johnson-Glenberg, 2018; Parong & Mayer, 2018).
While CTML focuses on optimizing how information is presented and processed, Gagné’s Nine Events of Instruction (Gagné et al., 2005) provide a complementary framework that addresses how learning tasks are sequenced and scaffolded to support internalization and performance. The model outlines nine systematic stages—from gaining attention and activating prior knowledge to eliciting performance and promoting retention—that guide the design of coherent instructional episodes. In the context of generative AI-enhanced simulations, Gagné’s model can be used to organize simulation activities as structured, goal-directed instructional events. For example, by embedding explicit cues for recall, feedback, and reflection, teacher simulations can support students in engaging with the simulation not merely as a technical environment but as a structured pedagogical space. The Nine Events framework thus ensures that immersive interactions are aligned with instructional goals and cognitive readiness.
Together, CTML and the Nine Events of Instruction offer a dual lens for both the design and evaluation of immersive, AI-supported learning environments. CTML ensures that information flow within the simulation is cognitively effective and aligned with multimedia processing principles, while Gagné’s framework ensures that this flow is organized into instructionally meaningful sequences that mirror real-world teaching tasks. Applying both perspectives reframes usability as more than operational efficiency—it becomes a question of whether the simulation enables learners to engage with content in ways that are instructionally coherent, reflectively grounded, and transferable to professional contexts. This dual-theoretical grounding supports the design of simulations that are not only engaging and immersive, but also instructionally principled and cognitively sustainable.

3. Methods

3.1. Research Design

This study adopted an explanatory case study design to investigate the implementation of a GenAI-enhanced teaching simulation (TeacherGen@i) in a teacher education context. An explanatory case study is particularly well-suited for capturing the contextual complexity of educational innovations and for analyzing both the process and outcomes of their integration in authentic learning environments (Yin, 2018).
To capture both subjective experiences and objective usability data, the study employed a convergent parallel mixed-methods approach (Creswell, 2021). This design involves the simultaneous collection and analysis of both qualitative and quantitative data, thereby enabling a holistic interpretation of the simulation’s educational affordances. Specifically, a usability survey was used to assess ease of use, satisfaction, and perceived effectiveness of the teaching simulation, while reflective journals captured in-depth cognitive and emotional responses from pre-service teachers engaging with the simulation. The two data sources were analyzed complementarily through triangulation, examining the extent to which survey findings aligned with journal reflections. Discrepancies were interrogated to uncover latent patterns and deepen understanding of the simulation’s efficacy. This design supports a nuanced evaluation of TeacherGen@i in a real-world instructional context and provides foundational insights for future GenAI-enhanced teacher education tools.
As illustrated in Figure 1, the study followed a formative design cycle consisting of five iterative phases: initial design, implementation, data collection, triangulated analysis, and redesign. First, the GenAI-based simulation (TeacherGen@i) was initially designed based on pedagogical principles and the intended learning context. This version was then implemented in an authentic teacher education setting, where pre-service teachers engaged with the simulation. Usability surveys and reflective journals were simultaneously collected to capture both objective and subjective user experiences. The data were analyzed through triangulation to identify patterns and inconsistencies, which informed the next phase of design improvement. Insights gained from this analysis were used to refine the simulation iteratively, enabling continuous enhancement of its usability and pedagogical alignment.

3.2. Research Context and Participants

A total of 23 undergraduate students participated in this study. The research was conducted within the teacher preparation program at a public university located in the southern United States. The simulation was implemented across two undergraduate-level educational technology courses—CAT 100 (n = 10; 4 male, 6 female), which focuses on the development and practical application of digital tools for the classroom, and CAT 200 (n = 13; 5 male, 8 female), which introduces core concepts of instructional technology. This study was conducted in accordance with ethical guidelines and received Institutional Review Board (IRB) approval from the University of Alabama (IRB Protocol #23-11-7113). Prior to their participation, all study participants were provided with a detailed consent form outlining the study’s purpose, procedures, potential risks and benefits, and their rights as participants, including their right to withdraw at any time. To ensure data protection and maintain participant privacy, several measures were implemented. All reflection responses collected from pre-service teachers were anonymized before analysis. Both courses were taught asynchronously online by the same instructor, allowing for controlled and consistent simulation integration across classes. After learning about instructional design theories—including Gagné’s Nine Events of Instruction and Mayer’s Multimedia Principles—preservice teachers participated in a 10–15 min VR-based simulation using TeacherGen@i. The instructor provided example scripts, and the study participants applied various teaching strategies and instructional scripts based on introductory lesson scenarios. Following the simulation, they completed a structured reflection assignment under the instructor’s guidance (Figure 2).
Specifically, in CAT 100, preservice teachers explored Richard Mayer’s principles of multimedia learning and used them as an analytical lens to assess the effectiveness of instructional communication and the integration of visual and verbal elements in the simulation. In contrast, CAT 200 focused on Gagné’s Nine Events of Instruction, framing the simulation as a tool to design and implement well-structured learning episodes. Both courses included modules on virtual and immersive learning environments, which provided pedagogical grounding for incorporating TeacherGen@i. Given the researcher’s dual role as both course instructor and investigator, several steps were taken to mitigate potential bias. First, all participants received standardized instruction sheets, reflection prompts, and simulation access protocols to ensure consistency in experience across students. The simulation activities were conducted asynchronously without real-time instructor interaction. Furthermore, to maintain objectivity in data interpretation, reflection responses were anonymized prior to analysis, and qualitative coding was triangulated with independent reviewers. These measures were implemented to minimize researcher influence and promote validity in findings.
Participants included two distinct groups. First, three graduate students in education participated in a pilot test to evaluate the simulation’s instructional usability, focusing not only on interface functionality but also on the clarity of content presentation, verbal interaction quality, and perceived coherence of multimodal instructional elements. Second, 23 undergraduate pre-service teachers in CAT 100 and CAT 200 completed the full study. They engaged with the simulation, applied theoretical frameworks to their learning experiences, and contributed to both qualitative and quantitative usability evaluation. By embedding the simulation in theory-supported coursework and capturing study participants’ responses to its communicative and pedagogical features, the current study positioned usability not simply as a measure of technical smoothness, but as an indicator of instructional effectiveness and media design fidelity.

3.3. Data Collection and Instruments

The data collection process involved the integration of the TeacherGen@i simulation into course modules designed to promote instructional communication and design literacy. After an initial orientation session, students engaged with the simulation environment at their own pace, completing tasks that required verbal interaction with AI-driven student agents, visual presentation of content, and reflection on their instructional strategies. This self-directed use of the simulation was intended to approximate real-time teaching decision-making in a low-risk, yet cognitively rich environment.
Upon completion of the simulation and accompanying reflections, all participants completed a usability evaluation survey. Unlike traditional interface-focused usability assessments, this survey was deliberately reframed to evaluate instructional usability—that is, how the simulation supported effective teacher communication, multimodal coordination, and clarity of instructional delivery. The instrument was adapted from established tools by Brooke (1996) and Hartstein et al. (2022), with supplemental items reflecting Mayer’s principles of cognitive load reduction and message design effectiveness. Key constructs assessed included: perceived ease of communication, support for instructional coherence and signaling, clarity of visual-verbal alignment, and satisfaction with learner-agent interaction dynamics.
To enhance the reliability of the research findings, several measures were considered and implemented to ensure the trustworthiness of the research: First, the instructor of the courses utilized standardized guidance documents across both courses to ensure consistent orientation, task instructions, and reflection scaffolding. Second, data triangulation was used to compare usability survey patterns with themes emerging from reflective journals, allowing for convergence, contradiction, or complementarity. Third, the research team conducted iterative member-checking sessions to review and verify emergent interpretations, ensuring the dependability and confirmability of thematic findings. This multifaceted data collection approach not only captured user interface experiences, but also assessed how the simulation shaped pre-service teachers’ ability to organize, present, and adapt instructional content—central competencies in contemporary teacher preparation.

3.4. Data Analysis

This study employed a comprehensive mixed-methods approach integrating qualitative thematic analysis with advanced computational linguistics to achieve robust triangulation of findings regarding pre-service teachers’ perceptions of instructional utility and readiness for classroom teaching. We analyzed reflective journals from 23 pre-service teachers using inductive thematic analysis guided by Gagné’s Nine Events of Instruction framework. Initial open codes captured meaning-rich units related to instructional utility, adaptive teaching practice, scaffolding effectiveness, and multimodal communication development, which were iteratively refined into higher-order themes through constant comparative analysis.
To validate and extend our qualitative findings, we employed Linguistic Inquiry and Word Count (LIWC) analysis (Kovanović et al., 2018) on the same reflection corpus (N = 1371 sentences from 23 documents). This quantitative text analysis provided objective measures of cognitive processing patterns including insight (analytical thinking), causation (cause-effect reasoning), discrepancy (critical evaluation), certainty markers, affective processes, educational language, temporal orientation, and social dynamics. We further implemented advanced NLP techniques, including Sentence-BERT embeddings for semantic clustering (k = 4 reflection types) and dependency parsing for causal reasoning structure extraction, with network visualizations of the resulting structures.
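LIWC operationalizes each category as the percentage of a document’s word tokens that match a validated dictionary. The actual LIWC dictionaries are proprietary; the sketch below illustrates only the underlying percentage computation, using tiny hypothetical word lists that are not the real LIWC categories:

```python
import re

# Toy stand-ins for LIWC's proprietary dictionaries (hypothetical word lists,
# not the validated LIWC categories).
CATEGORIES = {
    "insight":   {"think", "know", "realize", "understand"},
    "causation": {"because", "effect", "hence", "therefore"},
    "certainty": {"always", "never", "definitely", "clearly"},
}

def liwc_style_percentages(text: str) -> dict[str, float]:
    """Return, per category, the percentage of word tokens in that category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {cat: 0.0 for cat in CATEGORIES}
    return {
        cat: 100.0 * sum(t in words for t in tokens) / len(tokens)
        for cat, words in CATEGORIES.items()
    }

scores = liwc_style_percentages(
    "I think the agent responded clearly because my pacing improved."
)
```

Each score is a percentage of total tokens, which is why LIWC outputs (e.g., M = 2.77%) are comparable across reflections of different lengths.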
We employed systematic triangulation across multiple analytical levels: cross-validation between manual thematic coding and automated LIWC categorization. Each qualitative theme was validated against corresponding LIWC cognitive processing indicators to ensure authentic rather than performative responses. For example, the “Authentic Interaction” theme was corroborated by high social reference language (M = 2.77%) and present-focus temporal orientation (M = 1.76%), while “Scaffolding Appreciation” aligned with increased certainty markers and decreased tentativeness language. To support the interpretive trustworthiness of our thematic analysis, we conducted a convergence check between manually coded qualitative themes and linguistic categories derived from LIWC. Approximately 78% of participants’ reflection responses contained LIWC markers that aligned with one or more identified themes, indicating a moderate level of semantic correspondence between qualitative insights and cognitive-affective linguistic patterns. This convergence check served as a benchmarking metric to triangulate and enrich our interpretations through complementary analytical lenses.
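Such a convergence check can be operationalized as the share of reflections whose detected linguistic markers overlap the markers expected for at least one of their manually coded themes. The following is a minimal sketch under assumed data structures; the theme-to-marker mapping and function name are illustrative, not the study’s actual codebook:

```python
# Hypothetical theme -> expected-LIWC-marker mapping (illustrative only).
THEME_MARKERS = {
    "Authentic Interaction": {"social", "present_focus"},
    "Scaffolding Appreciation": {"certainty"},
}

def convergence_rate(reflections: list[dict]) -> float:
    """Share of reflections whose detected markers intersect the markers
    expected for at least one of their manually coded themes."""
    aligned = sum(
        any(THEME_MARKERS[t] & set(r["markers"]) for t in r["themes"])
        for r in reflections
    )
    return aligned / len(reflections)

sample = [
    {"themes": ["Authentic Interaction"], "markers": ["social"]},
    {"themes": ["Scaffolding Appreciation"], "markers": ["tentative"]},
]
rate = convergence_rate(sample)
```

Applied to the full corpus, a rate of this kind would correspond to the approximately 78% alignment reported above.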

3.5. Learning Environment: TeacherGen@i

TeacherGen@i was designed to create highly interactive and realistic teaching experiences in immersive VR by seamlessly simulating various classroom interactions (Hwang et al., 2024). Developed with the Unity 6 engine, the simulation runs on Meta Quest, PC, and Mac, offering a comprehensive and accessible learning experience. In this study, study participants used their personal laptops, which varied in specifications and operating systems, including both Windows-based and Mac devices. The system includes several key features: (1) verbal interaction with virtual student agents, (2) AI-generated real-time responses, (3) dynamic facial expressions of the agents, and (4) integration of course slides on a virtual display. Figure 3 below illustrates the overall architecture of the simulation environment:
User interactions in TeacherGen@i are supported by a combination of advanced technologies. All spoken inputs are transcribed in real time by the Whisper speech-to-text API, after which GPT-4 generates contextually appropriate responses based on the predefined persona of each AI student agent. To enhance realism, these agents also exhibit dynamic facial expressions, representing emotional states such as happiness, confusion, and frustration, offering nonverbal cues that support teacher decision-making and communication adjustments. The agent used in this study was designed to possess multidimensional attributes to simulate learner characteristics within the simulation environment. The agent was defined based on seven key elements: name, achievement level, personality, career aspiration, thinking, preference, and problem behavior. For example, the agent’s name was set as “Sophia” so that it could clearly recognize and respond when addressed by the user. The achievement level was designated as “Gifted”, assuming that the agent possessed a deep understanding of lesson content and extensive background knowledge. The personality traits were defined as intelligent, independent, and perfectionistic, which were reflected in the agent’s affective tone and overall disposition during the simulation. In addition, the career aspiration was set as “Scientist” to demonstrate strong interest and goal-oriented behavior in the field of science. The agent’s thinking traits were defined as having strengths in logical reasoning and abstract thinking, while personal preferences included enjoying challenging puzzles, conducting science experiments, and engaging in advanced reading activities. Finally, to reflect realistic learner behavior, the agent was programmed to occasionally display problem behaviors, such as a lack of patience with slower-paced learners.
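To make the agent design concrete, the seven persona elements can be serialized into the system prompt that conditions each generated response. The sketch below is a hypothetical illustration: the field values mirror the description above, but the dictionary structure, prompt template, and function name are our assumptions, not TeacherGen@i’s actual implementation:

```python
# Seven persona elements reported for the "Sophia" agent. Values follow the
# description in the text; the structure and template are illustrative assumptions.
SOPHIA = {
    "name": "Sophia",
    "achievement_level": "Gifted",
    "personality": "intelligent, independent, perfectionistic",
    "career_aspiration": "Scientist",
    "thinking": "strong logical reasoning and abstract thinking",
    "preference": "challenging puzzles, science experiments, advanced reading",
    "problem_behavior": "occasional impatience with slower-paced learners",
}

def build_system_prompt(persona: dict) -> str:
    """Render a persona dict into a system prompt for the student agent."""
    traits = "\n".join(f"- {k.replace('_', ' ')}: {v}" for k, v in persona.items())
    return (
        f"You are {persona['name']}, a student in a simulated classroom. "
        "Stay in character and respond to the teacher's spoken input.\n"
        f"Your profile:\n{traits}"
    )

prompt = build_system_prompt(SOPHIA)
```

In a deployment of this kind, a prompt like this would typically serve as the system message of a chat-completion request, with the Whisper transcript of the teacher’s utterance supplied as the user message.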
Before each simulation, users can pre-set instructional parameters—including subject, grade level, learning objectives, and content materials—via a connected web server. Instructional materials can be projected through an interactive slideshow panel, allowing fluid coordination between visual and verbal information.
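As a concrete illustration, the pre-set parameters could travel to the server as a small JSON payload. The schema below is hypothetical; the actual web-server API of TeacherGen@i is not documented here:

```python
import json

# Hypothetical session-configuration payload (field names are illustrative
# assumptions, not the documented TeacherGen@i server schema).
session_config = {
    "subject": "Physics",
    "grade_level": 8,
    "learning_objectives": [
        "Describe Newton's first law",
        "Relate inertia to everyday examples",
    ],
    "content_materials": ["newtons_laws_slides.pptx"],
}

payload = json.dumps(session_config, indent=2)
```

Keeping the configuration in a declarative payload like this is what allows new classroom scenarios to be defined without rebuilding the simulation itself.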
Grounded in Gagné’s Nine Events of Instruction, TeacherGen@i offers a pedagogically sound framework that supports pre-service teachers in developing essential teaching skills. Users can upload instructional materials, interact via keyboard or voice input, and access automatically recorded conversation logs for post-lesson review. Collectively, these features create an immersive, flexible, and realistic virtual teaching experience that mirrors real classroom dynamics and supports the rehearsal of evidence-based instructional practices.

4. Results

4.1. RQ1: Identification of Technical and Usability Challenges Through Multimedia Design Principles

To explore the technical and usability challenges encountered in the implementation of TeacherGen@i, we analyzed data from both pilot and main usability tests as well as reflective journals, employing Mayer’s multimedia learning principles as an interpretive lens.

4.1.1. Usability Evaluation—Graduate Pilot Study

Three graduate students with prior experience in instructional simulation participated in a pilot usability evaluation. As shown in Table 1, participants reported high overall satisfaction with both ease of use and instructional usefulness. All usefulness-related items, including goal alignment, user motivation, and scenario authenticity, received a mean rating of 4.00. While ease-of-use items such as task clarity and speed were positively rated, areas such as content quantity and technical reliability scored slightly lower (M = 3.00), suggesting room for further refinement. Qualitative feedback indicated positive user experiences with the information input features, yet also identified a technical limitation concerning the inability to upload PDF files. While the interface design was commended, some participants noted the lack of an easily discernible exit function. Notably, the GenAI integration was unanimously praised, with participants describing its responsiveness and contextuality as a marked improvement over previous simulation experiences.

4.1.2. Usability Evaluation—Undergraduate Main Study

A larger-scale evaluation was conducted with 23 pre-service teachers enrolled in two instructional technology courses. As presented in Table 1, a subset of the participants (n = 18) rated the simulation moderately positively overall, though the variance in ratings was higher than in the graduate group. In the ease-of-use category, content amount (M = 3.78) and text readability (M = 3.78) received the highest ratings, indicating that users perceived the simulation as visually manageable and sufficiently informative. Conversely, technical reliability (M = 2.28) was consistently rated the lowest, with several students noting lag issues and response failures under suboptimal internet conditions. In terms of instructional usefulness, familiar terminology (M = 3.61) and scenario realism (M = 3.56) were rated highly, reflecting the system’s success in aligning its virtual context with classroom expectations. However, future use intention (M = 2.89) was comparatively low, suggesting that while the simulation offered novelty and engagement, its long-term pedagogical utility may not have been fully realized.

4.1.3. Multimedia Design Principle Evaluation

To complement the survey data, we analyzed reflective journals written by undergraduate participants who had engaged with Mayer’s multimedia learning principles during their coursework. Students were asked to evaluate the simulation using these principles, providing a theory-informed assessment of its instructional affordances and constraints. The systematic analysis of student reflections across eight multimedia design principles revealed both successful implementations and critical areas for improvement (Table 2). To validate these qualitative perceptions, we conducted computational linguistic analysis using LIWC, which provided objective measures of cognitive processing patterns underlying participants’ multimedia principle evaluations.
Positive Multimedia Principle Adherence. As detailed in Table 2, participants acknowledged that several design principles were well reflected in the simulation. The coherence principle was successfully implemented through relevant text/image/animation integration and simplified UI elements (blackboard, desk), though students noted some overuse of animation. Signaling effectiveness was evident in headings, color codes, and bold text, with microphone use effectively guiding attention, although participants identified a lack of dynamic signaling options and vague visual focus cues. Spatial and temporal contiguity were generally well-maintained with consistent on-screen placement and synchronous narration with visuals, and the clear segmented structure (topic, objectives, process) supported systematic learning progression.
Critical Multimedia Challenges. Table 2 reveals significant violations of key multimedia principles that created substantial usability barriers. The modality principle emerged as the most problematic area, with participants noting over-reliance on text, weak auditory cues, and an absence of background/ambient sounds, which together constrained multimodal processing. Redundancy issues manifested through overlapping teacher speech and speech bubble overload, while temporal contiguity problems included speech delays due to lag. The pre-training principle showed mixed implementation, providing an input space for prior knowledge but no integration guidance for prior content. Personalization appeared limited, offering adaptive content suggestions but providing only generic feedback without tailored guidance based on learner performance. Technical issues compounded these principle violations: intermittent file display errors disrupted lesson continuity, posture tracking revealed students spending extended periods with heads down, and the simulation’s inability to process open-ended visual responses highlighted fundamental limitations in multimodal interaction capabilities.

4.1.4. Severity-Frequency Analysis and Design Response

The LIWC findings provide empirical validation for these multimedia principle concerns (Figure 4). Despite positive overall sentiment (M = 0.998), study participants exhibited elevated discrepancy detection language (M = 0.83%), indicating active identification of design inconsistencies rather than uncritical acceptance. Notably, the combination of high technology language use (M = 1.41%) with low certainty markers (M = 0.20%) suggests that while participants engaged enthusiastically with technological features, they experienced underlying uncertainty about optimal multimedia integration—precisely reflecting the modality and redundancy violations identified in qualitative analysis.
The severity-frequency analysis of usability issues reveals that Audio-Visual Integration challenges (severity: 2.89, frequency: 0.55%) fall into the “Moderate Priority” quadrant, validating participants’ concerns about modality principle violations. Content Overload issues (severity: 3.78, frequency: 0.15%) appear in the “Targeted Fix” quadrant, confirming that redundancy problems created significant cognitive burden for affected users. Technical Reliability issues (severity: 2.28, frequency: 0.83%) occupy the “Moderate Priority” space, demonstrating that technical constraints compounded multimedia design challenges.
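A quadrant assignment of this kind reduces to two threshold comparisons. The cutoffs and the fourth quadrant labels below are illustrative guesses, since the paper reports quadrant placements but not its exact thresholds; the three issue scores are taken from the analysis above.

```python
def classify_issue(severity: float, frequency: float,
                   sev_cut: float = 3.0, freq_cut: float = 0.5) -> str:
    """Assign a usability issue to a severity-frequency priority quadrant.

    The cutoffs (3.0 severity, 0.5% frequency) are assumptions chosen so the
    reported scores reproduce the quadrant labels given in the paper.
    """
    if severity >= sev_cut and frequency >= freq_cut:
        return "Critical Fix"       # severe and widespread (hypothetical label)
    if severity >= sev_cut:
        return "Targeted Fix"       # severe but affecting fewer users
    if frequency >= freq_cut:
        return "Moderate Priority"  # widespread but less severe
    return "Monitor"                # neither (hypothetical label)


# (severity, frequency %) pairs as reported in the study.
issues = {
    "Audio-Visual Integration": (2.89, 0.55),
    "Content Overload": (3.78, 0.15),
    "Technical Reliability": (2.28, 0.83),
}
labels = {name: classify_issue(s, f) for name, (s, f) in issues.items()}
```

Under these assumed cutoffs, the three reported issues land in the same quadrants the paper describes.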
In response to these principle-based findings, targeted improvements were implemented to restore multimedia learning effectiveness. Technical reliability improvements resolved PDF upload failures and lag-induced response interruptions that violated coherence by introducing extraneous distractions. To address modality principle violations, auditory feedback and processing status indicators were added to engage underutilized auditory channels, while subtitles were turned off by default to reduce visual overload. Speech bubble display time was dynamically adjusted based on text length to optimize temporal contiguity and reduce redundancy issues. Enhanced visual cues, clearer tutorial labeling, and racially diverse avatars were implemented to improve signaling and personalization principles. The convergence between qualitative multimedia principle evaluations, quantitative linguistic patterns, and severity-frequency analysis demonstrates that multimedia principle-based evaluation reflects genuine cognitive processing rather than theoretical compliance, providing robust guidance for VR simulation design refinement.
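Of these design responses, the length-based speech-bubble timing is simple to sketch. The paper states only that display time was “dynamically adjusted based on text length”; the linear formula and its constants below are illustrative assumptions, not the system’s actual parameters.

```python
def bubble_display_seconds(text: str, base: float = 1.5,
                           per_char: float = 0.05, max_s: float = 8.0) -> float:
    """Scale speech-bubble display time linearly with text length.

    base, per_char, and max_s are hypothetical constants: a short floor so
    brief utterances remain readable, and a cap so long utterances do not
    stall the conversational flow.
    """
    return min(base + per_char * len(text), max_s)
```

Capping the display time preserves temporal contiguity between the bubble and the agent’s spoken audio, which was one of the violations the adjustment targeted.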

4.2. RQ2: Pre-Service Teachers’ Perceptions of Instructional Utility and Readiness for Classroom Teaching

To explore how pre-service teachers view the instructional utility of the simulation in preparing for technology-enhanced teaching and learning, we analyzed reflective journals using Gagné’s Nine Events of Instruction (Gagné et al., 2005) as an analytic framework. Through thematic analysis of study participants’ reflections, we identified four key dimensions that illuminate their evolving perceptions of the simulation’s pedagogical values and challenges (Table 3). To validate these qualitative findings, we employed LIWC analysis across 1371 sentences (23 reflection documents) to compute a network model revealing participants’ underlying linguistic response patterns. The analysis revealed a complex picture of perceived utility, where participants recognized both transformative potential and significant constraints. Rather than uniform enthusiasm or rejection, the students demonstrated sophisticated understanding of how GenAI-enhanced teacher simulation could or could not support their professional development. The LIWC-based network analysis demonstrates how these four themes form an interconnected cognitive ecosystem, with Social References (M = 2.77%) and Insight Processing (M = 0.86%) serving as shared linguistic anchors across multiple themes, validating the integrated nature of participants’ reflective thinking.
The radial network visualization (Figure 5) represents a comprehensive network model that demonstrates how the four RQ2 themes are linguistically interconnected through LIWC categories across 1371 analyzed sentences. The network features a three-layer hierarchical structure: (1) a central dark blue node, representing “RQ2 Core” that serves as the cognitive anchor, (2) four intermediate theme nodes positioned at cardinal coordinates—Authentic Interaction, Structured Scaffolding, Multimodal Communication, and Future Improvement—connected to the core via thick dark lines, and (3) LIWC category nodes arranged in concentric circles where shared categories (appearing in multiple themes) occupy the inner circle at radius 2.5 with larger, more opaque nodes, while unique categories populate the outer circle at radius 4.5. The network employs three distinct edge types: solid dark lines connecting core to themes, lighter gray lines (weight = 1.5) linking themes to categories, and red dashed lines (weight = 2) highlighting shared categories’ direct connection to the core. Node sizes are proportionally scaled to mean LIWC scores, with Social References (2.77%) emerging as the largest node, validating its role as the primary linguistic hub across themes.
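The three-layer layout described for Figure 5 can be reconstructed as a polar-coordinate computation. The radii (2.5 inner, 4.5 outer), the cardinal theme placement, and the node names follow the description above; the theme-ring radius and the even angular spacing of category nodes are assumptions, as the figure’s exact coordinates are not given.

```python
import math


def radial_layout(themes, shared, unique, inner_r=2.5, outer_r=4.5):
    """Compute node positions for a three-layer radial network.

    Layer 1: core node at the origin. Layer 2: theme nodes at the four
    cardinal points (radius 1 is an assumption). Layer 3: shared LIWC
    categories on the inner circle, unique categories on the outer circle.
    """
    pos = {"RQ2 Core": (0.0, 0.0)}
    # Theme nodes at cardinal coordinates: E, N, W, S.
    for i, theme in enumerate(themes):
        angle = i * math.pi / 2
        pos[theme] = (math.cos(angle), math.sin(angle))
    # Category nodes spread evenly around their respective circles.
    for ring, radius in ((shared, inner_r), (unique, outer_r)):
        for i, cat in enumerate(ring):
            angle = 2 * math.pi * i / len(ring)
            pos[cat] = (radius * math.cos(angle), radius * math.sin(angle))
    return pos


themes = ["Authentic Interaction", "Structured Scaffolding",
          "Multimodal Communication", "Future Improvement"]
pos = radial_layout(
    themes,
    shared=["Social References", "Insight Processing"],
    unique=["Present Focus", "Technology Language", "Certainty"],
)
```

Node sizes and the three edge types (core-theme, theme-category, shared-category) would then be drawn on top of these positions, scaled to the mean LIWC scores.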

4.2.1. Authentic Interaction and Adaptive Teaching Practice

Study participants consistently cited the simulation’s capacity to generate realistic student responses as its most valuable feature for developing adaptive teaching skills. Unlike traditional role-playing or scripted scenarios, the GenAI-powered agents provided unpredictable yet contextually appropriate reactions that required real-time instructional adjustments. One participant noted how a seemingly disengaged virtual student would suddenly become responsive during specific lesson segments, forcing reconsideration of engagement strategies typically reserved for live classroom settings. The agents’ dynamic responsiveness enabled study participants to practice differentiated instruction in ways that felt consequential. They reported experimenting with Socratic questioning for academically capable but unmotivated students, and developing patience through repeated explanations—skills that traditional simulation environments rarely afford. The customization features, allowing input of student profiles and prior lesson content, further enhanced this sense of authentic practice by enabling contextually grounded instructional decision-making. LIWC analysis corroborates these qualitative findings through linguistic evidence of adaptive cognitive processing. The moderate positive correlation between Social References and Insight Processing (r = 0.425) demonstrates that participants’ discussions of interpersonal interactions were systematically linked to analytical thinking patterns. Furthermore, the high frequency of Social References (2.77%—the highest among all LIWC categories) coupled with Present Focus language (1.76%) validates participants’ emphasis on real-time adaptive responses. The network visualization reveals Social References as a central hub connecting multiple themes, confirming its role as the primary linguistic marker of authentic pedagogical engagement.

4.2.2. Structured Scaffolding and Confidence Building

Study participants mostly agreed that the simulation’s alignment with Gagné’s instructional framework provided essential scaffolding for novice teachers attempting to coordinate multiple pedagogical demands simultaneously. They appreciated how the mission-based structure maintained their focus on instructional goals while managing the cognitive complexity of virtual teaching. This structured approach appeared particularly valuable for building confidence through repeated, low-stakes practice opportunities. However, they also recognized the limitations of this scaffolding. While the framework supported systematic lesson delivery, some noted that it might inadvertently constrain creative or student-centered approaches that diverge from linear instructional models. This tension suggests that structured support, while necessary for skill development, may need to be gradually reduced to foster pedagogical flexibility. The LIWC evidence supports the confidence-building narrative through measurable linguistic shifts. Certainty language increased 44% (0.16% → 0.23%), while Learning Process language showed 12% growth (1.32% → 1.48%), indicating progressive pedagogical confidence. The correlation between Learning Process and Teaching Process language (r = 0.342) validates the integration of theoretical knowledge with practical application that participants described. Notably, the low frequency of Tentativeness language (M = 0.15%) across all reflections suggests that Gagné’s framework successfully reduced uncertainty and supported systematic instructional thinking.

4.2.3. Multimodal Communication Development

The simulation’s integration of verbal and non-verbal communication channels was perceived as essential for developing technology-mediated teaching competencies. Participants valued the nuanced facial expressions of virtual agents, which provided feedback cues typically absent in digital learning environments. This multimodal interaction enabled practice of sophisticated communication strategies, from reading student confusion to adjusting pace based on visual cues.
Yet participants also identified critical gaps in the communication experience. The absence of authentic student work submissions, limited file format support, and restricted interactive capabilities constrained their ability to practice comprehensive technology integration. These limitations highlight the ongoing challenge of balancing simulation fidelity with technical feasibility in educational technology development. The LIWC validation emerges in this theme through the exceptionally high correlation between Technology Language and Social References (r = 0.630), indicating that participants successfully integrated technical and interpersonal communication competencies rather than treating them as separate skill domains. The substantial presence of Technology Language (1.41%) combined with high Positive Emotion scores (0.73%) suggests that despite technical limitations, participants maintained optimistic engagement with multimodal communication tools. This linguistic pattern validates the qualitative finding that technology-mediated communication was perceived as enhancing rather than replacing traditional pedagogical interactions.

4.2.4. Areas of Future Refinement

Our analysis revealed critical design tensions that provide pathways for enhancing GenAI-enhanced teacher simulations. Study participants’ insights point to fundamental refinement priorities that extend beyond technical issues to core pedagogical considerations. The system’s limited ability to process visual responses with diverse multimedia formats may constrain its utility for developing student-centered teaching practices. Furthermore, the tension between structured scaffolding and pedagogical flexibility requires careful recalibration—while participants valued framework support for systematic instruction, they recognized its potential to limit creativity in teaching strategies. LIWC analysis reveals a sophisticated pattern of constructive criticism rather than simple dissatisfaction. The presence of Discrepancy Detection language (0.83%) combined with high Insight Processing (0.86%) and Future Focus orientation (0.63%) indicates that participants engaged in analytical problem-solving rather than mere complaint expression. Crucially, the minimal Negative Emotion language (0.15%) despite clear awareness of system limitations suggests that participants maintained a solution-oriented mindset. The multi-layer diffusion network model demonstrates how improvement-focused thinking flows from higher-level cognitive processing through pedagogical structures to concrete implementation concerns, validating the hierarchical nature of participants’ reflective analysis.

5. Discussion

The findings of this study reveal a fundamental tension in applying established multimedia learning principles to emerging GenAI-enhanced VR environments and in integrating teacher simulations into current teacher education courses. While these principles effectively served as both diagnostic frameworks and design heuristics, our analysis demonstrates that their conventional application may inadequately address the complex cognitive architecture of AI-enhanced teacher simulation. This tension suggests the need for a dialectical reconceptualization of multimedia design principles that acknowledges both their theoretical foundations and their contextual limitations in emerging educational technologies.

5.1. RQ1: Identification of Technical and Usability Challenges Through Multimedia Design Principles

5.1.1. Multimedia Principles as Diagnostic Frameworks

Data collected from usability surveys, reflective journals, and LIWC analysis indicate that Mayer’s multimedia learning principles function effectively as analytical frameworks for identifying usability challenges in complex educational technologies. The systematic evaluation across eight design principles (Table 2) revealed specific violations that directly corresponded to cognitive processing difficulties experienced by users. Most prominently, modality principle violations—characterized by over-reliance on visual text and insufficient auditory scaffolding—created measurable cognitive bottlenecks that were validated through both qualitative complaints and quantitative linguistic uncertainty markers (M = 0.20% certainty language).
The coherence principle proved particularly valuable in diagnosing technical reliability issues that disrupted instructional flow. When file display errors introduced extraneous cognitive load, participants’ elevated discrepancy detection language (M = 0.83%) provided objective evidence of principle violations beyond mere interface complaints. Similarly, temporal contiguity breakdowns due to speech delays were systematically identified through user reflections, demonstrating the principles’ utility in distinguishing between surface-level usability problems and deeper information processing disruptions.
This diagnostic capacity extends beyond traditional interface evaluation to encompass what we term “instructional usability”—the degree to which technological systems support rather than hinder pedagogical cognition. Unlike conventional usability metrics focused on task completion efficiency, multimedia principles enabled identification of cognitive architecture misalignments that could undermine learning effectiveness despite apparent functional success.

5.1.2. Contextual Limitations in Immersive Environment Design

Our findings reveal limitations when established multimedia principles encounter the unique perceptual and attentional affordances of VR environments. A contradiction emerged in the application of the redundancy principle, where participants consistently requested complementary auditory cues rather than viewing them as extraneous cognitive load. This inversion challenges fundamental assumptions about dual-channel processing in immersive contexts.
Study participants’ preference for “redundant” auditory feedback aligns with emerging research suggesting that VR’s spatial audio capabilities may fundamentally alter cognitive load dynamics (Baceviciute et al., 2022; Liu et al., 2021). The authentic, three-dimensional nature of VR interaction appears to increase rather than decrease learners’ capacity for processing simultaneous auditory and visual information streams. This finding contradicts traditional redundancy effects observed in 2D multimedia environments, suggesting that established principles may inadequately account for the enhanced cognitive integration possible in immersive spaces.
Furthermore, the segmenting principle—typically valuable for learner control—proved problematic in AI-driven interactions where conversational flow and social presence depend on temporal continuity. Participants valued the ability to pause and reflect, yet also noted that excessive segmentation disrupted the authentic interpersonal dynamics that made GenAI-enhanced agents effective teaching practice partners. This tension reveals how principles designed for static multimedia content may conflict with the dynamic, responsive nature of AI-mediated learning environments.

5.2. RQ2: Pre-Service Teachers’ Perceptions of Instructional Utility and Readiness for Classroom Teaching

Exploring Instructional Utility as Perceived by Pre-Service Teachers

This study explored how pre-service teachers perceive the simulation’s instructional utility in supporting their readiness for technology-enhanced teaching and learning. Our analysis employed Gagné’s Nine Events of Instruction as a theoretical framework, validated through LIWC analysis across 1371 sentences, revealing complex perceptions that both align with and challenge existing research on VR-based teacher education.
Study participants consistently perceived the simulation as enhancing their readiness for technology-enhanced instruction, particularly valuing the unpredictable yet contextually appropriate responses of GenAI agents. The prominence of Social References as the largest LIWC category validates that perceived utility centered on interpersonal pedagogical dynamics rather than technical features. This finding extends Stiell et al.’s (2022) research on VR confidence-building, as the correlation between Social References and Insight Processing (r = 0.425) indicates pedagogical realizations emerging from AI interactions. However, this perceived authenticity reveals tension with traditional field experience models—while L. A. Dieker et al. (2014) emphasized limitations of scripted VR simulations, study participants found value precisely in GenAI’s ability to generate responses that felt simultaneously artificial and pedagogically meaningful. LIWC analysis reveals measurable confidence development through an increase in Certainty language (0.16% → 0.23%), yet participants maintained substantial Discrepancy Detection (M = 0.83%), suggesting “confident uncertainty” rather than linear progression.
The simulation’s perceived utility extended to technology integration, challenging research that treats technological and pedagogical competencies as separate domains. The high correlation between Technology Language and Social References (r = 0.630) indicates that participants experienced technology not merely as a teaching tool but as integral to pedagogical relationship-building, extending J. Zhang et al.’s (2024) research on VR microteaching by suggesting qualitatively different integration possibilities. However, this integration remained incomplete—despite positive perceptions of AI agent interactions, study participants noted limitations in multimodal capabilities and personalization features. The combination of substantial Technology Language (M = 1.41%) with constructive criticism patterns (high Insight Processing with Future Focus) suggests participants could envision enhanced utility beyond current constraints, aligning with Theelen et al.’s (2019) observations about gaps between VR simulation potential and implementation.
Overall, study participants perceived the simulation as contributing meaningfully to their professional readiness by enabling engagement with contemporary teaching complexities rather than simplifying them. Under LIWC network analysis, the balanced distribution of Present Focus (1.76%) and Future Focus (0.63%) language indicates that participants used current simulation experiences as foundations for envisioning future pedagogical practice, supporting Gagné’s framework while extending it to encompass AI-mediated learning affordances. Contrary to existing concerns (Cowan et al., 2023; Hagan et al., 2024) about potential discomfort in immersive environments, minimal Negative Emotion language despite clear awareness of system limitations suggests that well-designed and authentic narrative-augmented GenAI simulations can foster productive rather than paralyzing uncertainty. The simulation’s perceived utility lay in providing unique preparatory practice that enabled safe experimentation with pedagogical strategies while developing comfort with technological mediation of teaching relationships.

6. Conclusions, Limitations, and Future Research

This study examined the usability, design challenges, and instructional value of TeacherGen@i, a GenAI-enhanced teacher simulation embedded within an authentic teacher education course. Implemented in a real classroom setting with pre-service teachers, the case study reveals that the efficacy of GenAI-based instructional simulations depends not only on technical sophistication but also on the intentional alignment with pedagogical design principles.

6.1. Limitations and Future Research

This study has several limitations. First, it relied on short-term usability assessments of the GenAI-based teaching simulation, drawing on brief observation and participant feedback. Although such assessments effectively capture individuals’ perceptions of intuitiveness, they offer limited insight into how perceived efficacy of the teacher simulation translates into teaching competency development. Second, since the study was conducted within a single authentic classroom context, the findings have limited generalizability across diverse subjects, instructional approaches, and learner populations. Usability can vary significantly depending on learners’ digital literacy, instructional design, and the learning environment. Therefore, future research should explore multi-case or cross-institutional studies to investigate how GenAI-enhanced simulations function across diverse educational settings, disciplines, and learner populations.
Looking ahead, research should move beyond surface-level evaluations of usability or user satisfaction to address deeper questions of pedagogical impact. Specifically, it is necessary to investigate whether sustained and scaffolded use of teacher simulations contributes to measurable growth in pre-service teachers’ instructional capacities such as lesson planning, adaptive communication, and reflective rehearsal. To this end, future research should employ multi-method research designs as well as teacher analytics system integrations (U. Lee et al., 2024). These approaches can incorporate various proxy behavioral indicators such as changes in teaching self-efficacy, learning curve analysis of in-simulation dialogue patterns, and rubric-based evaluations of instructional artifacts generated during or after simulation use.

6.2. Implications

This study contributes to the growing body of research at the intersection of generative AI, immersive simulation, and instructional design. By applying Mayer’s multimedia learning principles as an analytical lens, the findings demonstrate how established cognitive theories can be extended to evaluate emerging learning environments. The study illustrates that these principles are not merely prescriptive design heuristics but can serve as interpretive frameworks for identifying and understanding usability and pedagogical tensions in immersive learning. This expands the theoretical utility of multimedia principles in VR-based educational research and highlights the need for their contextual adaptation in multimodal and interactive settings. Moreover, the integration of reflective journaling and usability data exemplifies a mixed-methods approach for examining learner experience in technology-enhanced environments, offering a methodological contribution for future research in learning sciences.
From a practical perspective, the current study findings offer actionable insights for instructional designers, teacher educators, and curriculum developers aiming to integrate AI-enhanced VR simulations into teacher preparation programs. The usability analysis of the present study underscores that even technically advanced environments can fall short without deliberate attention to instructional coherence, balanced modality use, and accessible interface design. Over-reliance on text and limited auditory scaffolding, for instance, detracted from cognitive efficiency—emphasizing the importance of developing multimodal interaction strategies that align with cognitive processing limits in immersive spaces. Teacher preparation programs can leverage this model to train candidates in high-fidelity, situated environments that simulate real classroom dynamics, support iterative practice, and provide adaptive feedback. Specifically, programs might incorporate GenAI-powered VR simulations to enhance courses in instructional methods, classroom management, and technology integration—thereby offering scaffolded opportunities to practice teaching decisions in risk-free environments. Furthermore, aligning such simulations with core standards (e.g., ISTE) can bolster their relevance and applicability to certification pathways.
This study also recognizes and begins to explore the broader ethical implications of “opening up” VR simulations to generative AI, particularly concerning its role in teacher education. We acknowledge that the integration of powerful AI, like that in TeacherGen@i, introduces several important considerations that extend beyond traditional research ethics. One primary concern revolves around the “have/have not” nature of technology access and equity. While GenAI-enhanced VR simulations offer promising responses to challenges such as limited access to authentic field experiences in traditional teacher preparation, their cost, technical requirements, and digital literacy demands could inadvertently exacerbate existing educational inequities. Our study implicitly addresses this by using readily available personal laptops for accessibility, but future research should explicitly investigate strategies to ensure equitable access and effective integration for all pre-service teachers, regardless of their background or institutional resources. Furthermore, the very nature of AI-generated interactions in simulations raises questions about authenticity, transparency, and the potential for unintended biases. While TeacherGen@i is designed to create realistic and adaptive student behaviors, it is vital to consider how these simulated interactions might influence pre-service teachers’ perceptions of student behavior, classroom dynamics, or even their own pedagogical effectiveness. Future work will need to explore how to maintain pedagogical fidelity and avoid perpetuating biases that might be inherent in the AI’s training data. This includes considering the design of AI agents to ensure they represent diverse student populations fairly and realistically, and how the system transparently communicates its AI-driven nature to users to manage expectations and encourage critical reflection on the simulation experience itself.
Finally, integrating GenAI introduces questions about the nature of “practice” itself. While the simulation offers a safe space for experimentation, it is essential to critically examine how these controlled, yet unpredictable, environments prepare pre-service teachers for the full, nuanced complexity of real human interactions in diverse classrooms. This dimension prompts us to continually evaluate the transferability of skills learned in AI-enhanced simulations to actual teaching contexts in future research.

Author Contributions

Conceptualization, S.H. and J.M.; methodology, S.H. and J.M.; software, J.H.; validation, T.E.; formal analysis, S.H., J.M. and T.E.; investigation, S.H. and I.D.A.; resources, J.M. and J.H.; data curation, J.M., T.E., I.D.A. and J.H.; writing—original draft, S.H., J.M. and I.D.A.; visualization, J.M.; supervision, J.M.; funding acquisition, J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a CREATE award from the Office of Sponsored Programs at the University of Alabama (24-002440).

Data Availability Statement

The data cannot be made publicly available due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Baceviciute, S., Lucas, G., Terkildsen, T., & Makransky, G. (2022). Investigating the redundancy principle in immersive virtual reality environments: An eye-tracking and EEG study. Journal of Computer Assisted Learning, 38(1), 120–136. [Google Scholar] [CrossRef]
  2. Bautista, N. U., & Boone, W. J. (2015). Exploring the impact of TeachME™ lab virtual classroom teaching simulation on early childhood education majors’ self-efficacy beliefs. Journal of Science Teacher Education, 26(3), 237–262. [Google Scholar] [CrossRef]
  3. Benedict, A., Holdheide, L., Brownell, M., & Foley, A. M. (2016). Learning to teach: Practice-based preparation in teacher education (Special issues brief). Center on Great Teachers and Leaders. [Google Scholar]
  4. Bommasani, R., Lin, K., Narayan, S., & Le, Q. V. (2021). On the opportunities and risks of foundation models [CRFM Technical Report FM-TR-2021-1]. Stanford Center for Research on Foundation Models. [Google Scholar] [CrossRef]
  5. Brooke, J. (1996). SUS—A quick and dirty usability scale. In Usability evaluation in industry. Taylor and Francis. [Google Scholar]
  6. Choi, G. W., Kim, S. H., Lee, D., & Moon, J. (2024). Utilizing generative AI for instructional design: Exploring strengths, weaknesses, opportunities, and threats. TechTrends, 68(4), 832–844. [Google Scholar] [CrossRef]
  7. Clark, R. C., & Mayer, R. E. (2016). E-learning and the science of instruction: Proven guidelines for consumers and designers of multimedia learning (4th ed.). John Wiley & Sons. [Google Scholar] [CrossRef]
  8. Cowan, P., Donlon, E., Farrell, R., Campbell, A., Roulston, S., Taggart, S., & Brown, M. (2023). Virtual and augmented reality and pre-service teachers: Makers from muggles? Australasian Journal of Educational Technology, 39(3), 1–16. [Google Scholar] [CrossRef]
  9. Creswell, J. W. (2021). A concise introduction to mixed methods research. SAGE. [Google Scholar]
  10. Dai, C. P., Ke, F., Pan, Y., Moon, J., & Liu, Z. (2024). Effects of artificial intelligence-powered virtual agents on learning outcomes in computer-based simulations: A meta-analysis. Educational Psychology Review, 36(1), 31. [Google Scholar] [CrossRef]
  11. Dalgarno, B., Gregory, S., Knox, V., & Reiners, T. (2016). Practising teaching using virtual classroom role plays. Australian Journal of Teacher Education, 41(1), 126–154. [Google Scholar] [CrossRef]
  12. Dalinger, T., Thomas, K. B., Stansberry, S., & Xiu, Y. (2020). A mixed reality simulation offers strategic practice for pre-service teachers. Computers & Education, 144, 103696. [Google Scholar] [CrossRef]
  13. Darling-Hammond, L. (2017). Teacher education around the world: What can we learn from international practice? European Journal of Teacher Education, 40(3), 291–309. [Google Scholar] [CrossRef]
  14. Dawley, L., & Dede, C. (2013). Situated learning in virtual worlds and immersive simulations. In J. M. Spector, M. D. Merrill, J. Elen, & M. J. Bishop (Eds.), Handbook of research on educational communications and technology (pp. 723–734). Springer. [Google Scholar] [CrossRef]
  15. Dawson, M. R., & Lignugaris/Kraft, B. (2017). Meaningful practice: Generalizing foundation teaching skills from TLE TeachLivE™ to the classroom. Teacher Education and Special Education, 40(1), 26–50. [Google Scholar] [CrossRef]
  16. Dieker, L., Hughes, C., & Hynes, M. (2023). The past, the present, and the future of the evolution of mixed reality in teacher education. Education Sciences, 13(11), 1070. [Google Scholar] [CrossRef]
  17. Dieker, L. A., Rodriguez, J. A., Lignugaris/Kraft, B., Hynes, M. C., & Hughes, C. E. (2014). The potential of simulated environments in teacher education: Current and future possibilities. Teacher Education and Special Education, 37(1), 21–33. [Google Scholar] [CrossRef]
  18. Docter, M. W., De Vries, T. N. D., Nguyen, H. D., & van Keulen, H. (2024). A proof-of-concept of an integrated VR and AI application to develop classroom management competencies in teachers in training. Education Sciences, 14(5), 540. [Google Scholar] [CrossRef]
  19. Fink, M. C., Robinson, S. A., & Ertl, B. (2024). AI-based avatars are changing the way we learn and teach: Benefits and challenges. Frontiers in Education, 9, 1416307. [Google Scholar] [CrossRef]
  20. Gagné, R. M., Wager, W. W., Golas, K. C., & Keller, J. M. (2005). Principles of instructional design (5th ed.). Thomson/Wadsworth. [Google Scholar]
  21. Grossman, P., Hammerness, K., & McDonald, M. (2009). Redefining teaching, re-imagining teacher education. Teachers and Teaching: Theory and Practice, 15(2), 273–289. [Google Scholar] [CrossRef]
  22. Hagan, H., Fegely, A., Warriner, G., & Mckenzie, M. (2024). The teaching methods classroom meets virtual reality: Insights for pre-service teaching methods instructors. TechTrends, 68, 358–369. [Google Scholar] [CrossRef]
  23. Hartstein, A. J., Verkuyl, M., Zimney, K., Yockey, J., & Berg-Poppe, P. (2022). Virtual reality instructional design in orthopedic physical therapy education: A mixed-methods usability test. Simulation & Gaming, 53(2), 111–134. [Google Scholar] [CrossRef]
  24. Hayes, A. T., Straub, C. L., Dieker, L. A., Hughes, C. E., & Hynes, M. C. (2013). Ludic learning: Exploration of TLE TeachLivE™ and effective teacher training. International Journal of Gaming and Computer-Mediated Simulations (IJGCMS), 5(2), 20–33. [Google Scholar] [CrossRef]
  25. Hixon, E., & So, H. J. (2009). Technology’s role in field experiences for preservice teacher training. Journal of Educational Technology & Society, 12(4), 294–304. [Google Scholar]
  26. Huang, Y., Richter, E., Kleickmann, T., & Richter, D. (2023). Virtual reality in teacher education from 2010 to 2020. In Bildung für eine digitale Zukunft (pp. 399–441). Springer. [Google Scholar] [CrossRef]
  27. Hwang, J., Hong, S., Eom, T., & Lim, C. (2024, June 10–14). Enhancing pre-service teachers’ competence with a generative artificial intelligence-enhanced virtual reality simulation. 4th International Society of the Learning Sciences (pp. 24–27), New York, NY, USA. [Google Scholar]
  28. Jeong, Y., Lee, Y., Byun, G., & Moon, J. (2024). Navigating the creation of immersive learning environments in roblox: Integrating generative AI for enhanced simulation-based learning. Immersive Learning Research-Practitioner, 1(1), 16–19. [Google Scholar]
  29. Johnson-Glenberg, M. C. (2018). Immersive VR and education: Embodied design principles that include the body in cognition. Frontiers in Robotics and AI, 5(5), 81. [Google Scholar] [CrossRef]
  30. Kaufman, D., & Ireland, A. (2016). Enhancing teacher education with simulations. TechTrends, 60, 260–267. [Google Scholar] [CrossRef]
  31. Ke, F., & Xu, X. (2020). Virtual reality simulation-based learning of teaching with alternative perspectives taking. British Journal of Educational Technology, 51(6), 2544–2557. [Google Scholar] [CrossRef]
  32. Kovanović, V., Joksimović, S., Mirriahi, N., Blaine, E., Gašević, D., Siemens, G., & Dawson, S. (2018, March 5–9). Understand students’ self-reflections through learning analytics. 8th International Conference on Learning Analytics and Knowledge (LAK18) (pp. 389–398), Sydney, NSW, Australia. [Google Scholar] [CrossRef]
  33. Lee, S., & Ahn, T. (2021). Pre-service teachers’ learning experience of using a virtual practicum simulation with AI learners. Multimedia-Assisted Language Learning, 24(4), 107–133. [Google Scholar] [CrossRef]
  34. Lee, U., Jeong, Y., Koh, J., Byun, G., Lee, Y., Lee, H., Eun, S., Moon, J., Lim, C., & Kim, H. (2024). I see you: Teacher analytics with GPT-4 vision-powered observational assessment. Smart Learning Environments, 11(1), 48. [Google Scholar] [CrossRef]
  35. Lee, U., Koh, J., Jeong, Y., & Lee, S. (2023, December 15). Generative agent for teacher training: Designing educational problem-solving simulations with LLM-based agents for pre-service teachers. NeurIPS Workshop on Generative AI for Education, New Orleans, LA, USA. [Google Scholar]
  36. Lim, J., Lee, U., Koh, J., Jeong, Y., Lee, Y., Byun, G., Jung, H., Jang, Y., Lee, S., & Moon, J. (2025). Development and implementation of a generative artificial intelligence-enhanced simulation to enhance problem-solving skills for pre-service teachers. Computers & Education, 232, 105306. [Google Scholar] [CrossRef]
  37. Liu, T., Lin, Y., Wang, T., Yeh, S., & Kalyuga, S. (2021). Studying the effect of redundancy in a virtual reality classroom. Educational Technology Research and Development, 69, 1183–1200. [Google Scholar] [CrossRef]
  38. Loke, S. K. (2015). How do virtual world experiences bring about learning? A critical review of theories. Australasian Journal of Educational Technology, 31(1), 112–122. [Google Scholar] [CrossRef]
  39. Mayer, R. E. (2024). The past, present, and future of the cognitive theory of multimedia learning. Educational Psychology Review, 36(1), 8. [Google Scholar] [CrossRef]
  40. McKenney, S., & Reeves, T. C. (2018). Conducting educational design research (2nd ed.). Routledge. [Google Scholar]
  41. Moon, J., Lee, U., Koh, J., Jeong, Y., Lee, Y., Byun, G., & Lim, J. (2025). Generative artificial intelligence in educational game design: Nuanced challenges, design implications, and future research. Technology, Knowledge and Learning, 30, 447–459. [Google Scholar] [CrossRef]
  42. Nygren, T., Samuelsson, M., Hansson, P., Efmova, E., & Bachelder, S. (2025). AI versus human feedback in mixed reality simulations: Comparing LLM and expert mentoring in preservice teacher education on controversial issues. International Journal of Artificial Intelligence in Education. [Google Scholar] [CrossRef]
  43. Parong, J., & Mayer, R. E. (2018). Learning science in immersive virtual reality. Journal of Educational Psychology, 110(6), 785–797. [Google Scholar] [CrossRef]
  44. Southgate, E., Smith, S. P., Cividino, C., Saxby, S., Kilham, J., Eather, G., Scevak, J., Summerville, D., Buchanan, R., & Bergin, C. (2019). Embedding immersive virtual reality in classrooms: Ethical, organisational and educational lessons in bridging research and practice. International Journal of Child-Computer Interaction, 19, 19–29. [Google Scholar] [CrossRef]
  45. Stiell, M., Markowski, M., Jameson, J., Essex, R., & Ade-Ojo, G. (2022). A systematic scoping review and textual narrative synthesis of physical and mixed-reality simulation in pre-service teacher training. Journal of Computer Assisted Learning, 38(3), 861–874. [Google Scholar] [CrossRef]
  46. Theelen, H., van den Beemt, A., & den Brok, P. (2019). Using 360-degree videos in teacher education: Effects on teacher self-efficacy, instructional skills, and classroom management. Journal of Computer Assisted Learning, 35(5), 635–647. [Google Scholar]
  47. Yin, R. K. (2018). Case study research and applications: Design and methods (6th ed.). SAGE. [Google Scholar]
  48. Zawacki-Richter, O., Marín, V. I., Bond, M., & Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education–where are the educators? International Journal of Educational Technology in Higher Education, 16(1), 1–27. [Google Scholar] [CrossRef]
  49. Zhang, J., Pan, Q., Zhang, D., Meng, B., & Hwang, G. J. (2024). Effects of virtual reality based microteaching training on pre-service teachers’ teaching skills from a multi-dimensional perspective. Journal of Educational Computing Research, 62(3), 655–683. [Google Scholar] [CrossRef]
  50. Zhang, N., Ke, F., Dai, C. P., Southerland, S. A., & Yuan, X. (2025). Seeking to support preservice teachers’ responsive teaching: Leveraging artificial intelligence-supported virtual simulation. British Journal of Educational Technology, 56(3), 1148–1169. [Google Scholar] [CrossRef]
  51. Zhang, Z., Zhang-Li, D., Yu, J., & Zhou, H. (2025, April 29–May 4). Simulating classroom education with LLM-empowered agents. 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 10364–10379), Albuquerque, NM, USA. [Google Scholar] [CrossRef]
  52. Zheng, L., Jiang, F., Gu, X., & Li, Y. (2025). Teaching via LLM-enhanced simulations: Authenticity and barriers to suspension of disbelief. The Internet and Higher Education, 65, 100990. [Google Scholar] [CrossRef]
Figure 1. Formative design cycle of TeacherGen@i.
Figure 2. Example of an Undergraduate Student’s Reflection Journal.
Figure 3. Architecture of the TeacherGen@i VR simulation (Hwang et al., 2024).
Figure 4. Usability Issues: Severity vs. Frequency Analysis.
Figure 5. LIWC Category Integration Network.
Table 1. Undergraduate Usability Test Result (N = 18).

| Ease of Use / Usefulness | Usage Ease | Task Clarity | Speed | Content Amount | Text Clarity | Text Readability | Technical Reliability | Objective Alignment |
|---|---|---|---|---|---|---|---|---|
| M (SD) | 3.17 (1.38) | 3.33 (1.14) | 3.56 (1.10) | 3.78 (1.17) | 3.56 (1.38) | 3.78 (1.35) | 2.28 (1.45) | 3.06 (1.06) |

| Usefulness | Goal Support | User Motivation | Realistic Scenarios | Familiar Terminology | Broad Suitability | Class Preparation | Teacher Confidence | Future Use |
|---|---|---|---|---|---|---|---|---|
| M (SD) | 3.17 (1.10) | 3.06 (1.06) | 3.56 (1.20) | 3.61 (1.20) | 3.39 (1.46) | 3.56 (1.25) | 3.56 (1.20) | 2.89 (1.23) |
Table 2. Thematic analysis regarding students’ applications of multimodal design principles.

| Design Principle | Well-Reflected Multimedia Elements | Elements Needing Improvement |
|---|---|---|
| Coherence | Relevant text/image/animation; simple UI (e.g., blackboard, desk) | Overuse of animation |
| Signaling | Headings, color codes, bold text; microphone use to guide attention | Lack of dynamic signaling options; vague visual focus cues |
| Redundancy | Clear delivery mode (typed or spoken) | Overlapping teacher/speech; speech bubbles overload |
| Spatial Contiguity | Consistent on-screen placement | Explanatory text far from visual elements |
| Temporal Contiguity | Synchronous narration + visuals | Speech delays (e.g., lag) |
| Segmenting | Clear structure: topic, objectives, process | Uneven interactive exercise flow; lack of guidance for segment pacing |
| Pre-training | Input space for prior knowledge; goal reminders for simulations | No integration guidance for prior content |
| Modality | Text boxes for visual support; helpful for visually impaired learners | Over-reliance on text; weak auditory cues; no background/ambient sounds |
| Personalization | Adaptive content suggestions; conversational style support | Individualized learning paths; generic feedback; no tailored guidance based on learner performance |
Table 3. Summary of Pre-Service Teachers’ Perceptions of the Simulation’s Instructional Utility.

| Key Theme | Participant Perception | Detailed Description | LIWC Validation |
|---|---|---|---|
| Authentic Interaction and Adaptive Teaching Practice | Perceived as effective in enhancing readiness for tech-integrated instruction | Authentic interactions enabled by diverse virtual student personalities and dynamic facial expressions; customization by inputting student data and previous lesson content; allowed practice of differentiated instruction strategies | Social References: 2.77%; Insight Processing: 0.86%; Present Focus: 1.76% (r = 0.425, social–insight correlation) |
| Scaffolding and Confidence Building | Helped maintain instructional focus and support systematic lesson implementation | Mission-based format aligned with Gagné’s instructional events; repetitive practice enhanced confidence; enabled iterative teaching experiences focused on instructional goals | Learning Process: 1.32% → 1.48%; Teaching Process: 0.46%; Certainty: 0.16% → 0.23% (r = 0.342, learning–teaching correlation) |
| Multimodal Communication Development | Contributed to development of technology-mediated communication skills | Rich communication combining verbal and non-verbal elements; nuanced facial expressions informed adaptive teaching; examples: Socratic questioning for unmotivated but capable students, metacognitive strategies for those needing support | Technology Language: 1.41%; Social References: 2.77%; Positive Emotion: 0.73% (r = 0.630, technology–social correlation) |
| Areas for Future Improvement | Recognized limitations in fully replicating real classroom scenarios | File upload restrictions hindered use of diverse resources (e.g., videos, physical materials); lack of student work submission functionality; limited opportunities for student-centered, interactive learning | Discrepancy Detection: 0.83%; Future Focus: 0.63%; Insight Processing: 0.86% (constructive criticism pattern) |

Share and Cite

MDPI and ACS Style

Hong, S.; Moon, J.; Eom, T.; Awoyemi, I.D.; Hwang, J. Generative AI-Enhanced Virtual Reality Simulation for Pre-Service Teacher Education: A Mixed-Methods Analysis of Usability and Instructional Utility for Course Integration. Educ. Sci. 2025, 15, 997. https://doi.org/10.3390/educsci15080997

