Article

Enhancing Reflection in VR-Based Evacuation Training Through Synchronized Auditory Clue Presentation: A Pilot Study

1 Graduate School of Technology, Industrial and Social Sciences, Tokushima University, 2-1, Minami-Josanjima, Tokushima 770-8506, Japan
2 Research Center for Management of Disaster and Environment, Tokushima University, 2-1, Minami-Josanjima, Tokushima 770-8506, Japan
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(6), 3048; https://doi.org/10.3390/app16063048
Submission received: 19 February 2026 / Revised: 18 March 2026 / Accepted: 19 March 2026 / Published: 21 March 2026

Abstract

Virtual reality (VR)-based evacuation training provides a safe and immersive environment for participants to experience disaster scenarios. However, existing systems often prioritize the experience itself, leaving the critical stage of reflection—essential for refining and stabilizing evacuation knowledge—under-supported. This study presents a qualitative pilot investigation into an extended reflection support function for a VR-based evacuation training system. Unlike traditional replay functions that only visualize avatar movements, our system synchronizes spatialized environmental sounds and recorded verbal utterances, i.e., voices of the user and non-player characters (NPCs), with the visual replay. A preliminary experiment involving eight university students was conducted to evaluate how these auditory clues influence the reflection-on-action process. Qualitative results indicate that audio clues help participants recall their internal decision-making processes and provide essential context for understanding the actions of others (NPCs). The findings suggest that the integration of auditory information facilitates evacuation knowledge refinement, i.e., the transition from mere experience to the formulation of concrete survival concepts. Although limited by a small sample size, this study highlights the potential of multi-modal reflection support in VR-based evacuation training.

1. Introduction

Virtual reality (VR) is widely adopted in education to enhance learning outcomes through immersive, domain-specific experiences [1,2,3,4,5,6]. Recently, the integration of artificial intelligence (AI) has further advanced these environments, enabling highly interactive and personalized training such as simulations involving AI-driven agents [7,8]. A primary advantage of educational VR is the ability to facilitate repeated trial and error within a safe environment. This is particularly valuable in safety education—such as construction and chemical training—where real-world failures would entail significant danger [9,10,11,12]. By simulating hazardous scenarios, VR allows trainees to recognize and learn from risks without physical consequences, making it an ideal medium for critical safety training.
Disaster training, a key component of safety training, focuses on survival strategies for natural and man-made disasters, aiming to mitigate risks such as injury, long-term health effects, and fatality. VR-based disaster simulations allow users to refine their survival knowledge and skills through immersive experience [12,13,14]. For example, a serious game has been used to promote public awareness of safety behaviors during natural disasters in urban spaces [15]. Additionally, a desktop VR system has been utilized to train healthcare workers for active shooter incidents by simulating tense, high-pressure environments with scenario-controlled avatars [16]. Another immersive VR system provides mining safety knowledge by visualizing warning signs of danger via head-mounted displays (HMDs) [17].
In many communities, evacuation training is conducted regularly to ensure people can reach safety during a disaster. For example, fire drills provide simulated experiences of calm evacuation upon hearing an alarm. However, traditional real-world training struggles to achieve high realism; while harmless smoke can be used, real flames cannot be utilized for safety reasons. In contrast, VR-based evacuation training can simulate various disaster scenarios with high realism (notwithstanding sensory limitations such as smell and temperature). Trainees can safely repeat evacuations while testing different strategies and actions. Due to these advantages, VR has been actively applied to building and fire evacuation training [18,19,20,21].
Despite these advancements, most VR-based evacuation training systems do not sufficiently emphasize the process of reflection. In educational theory, reflection is a critical stage in self-regulated learning [22] and experiential learning [23]. Incorporating reflection into VR evacuation training can significantly enhance its effectiveness. Specifically, it encourages participants to identify factors that lead to success or failure, thereby refining their evacuation knowledge. By repeating this cycle of participation and reflection, trainees can stabilize their knowledge and increase their self-efficacy.
Against this background, we previously developed a VR-based evacuation training system that allows participants to evacuate as avatars and subsequently reflect on their performance by replaying their actions [24,25]. However, the replay functionality was limited to the movement of avatars (i.e., location and orientation), which provided insufficient clues for deep reflection. To further enhance the educational impact of reflection, we have extended the system’s support by introducing synchronized auditory clue presentation.
This study adopts a qualitative-dominant pilot approach. Based on the purpose of this pilot study, we formulated the following research hypothesis (H1) and research questions (RQs):
H1. The presentation of synchronized auditory clues during VR replay enhances the participant’s ability to identify factors leading to evacuation success or failure compared to visual-only recall.
  • RQ1: How do synchronized auditory clues assist participants in recalling their internal decision-making processes?
  • RQ2: To what extent do these clues contribute to the formulation of concrete survival concepts?
Existing systems predominantly rely on visual data, leaving a research gap regarding the role of synchronized sound in evacuation knowledge refinement. Therefore, the purpose of this article is to propose a novel reflection support function and to present a qualitative pilot investigation into its effectiveness in assisting trainees to internalize survival concepts.

2. Theoretical Framework

Evacuation training aims to enhance survival capabilities through simulated experience. Unlike traditional drills that often focus on rote memorization of routes, VR-based training facilitates complex decision-making through high-fidelity simulations [26,27,28].

2.1. Evacuation Knowledge Hierarchy and Training Effect

In this study, the evacuation training effect is conceptualized as the process through which participants refine and stabilize their evacuation knowledge. We posit that while basic information can be acquired through passive media, simulated experience is indispensable for transforming that information into actionable knowledge.
As illustrated in Figure 1, we categorize evacuation knowledge into a three-layer hierarchy. The foundational layer is Disaster Knowledge, which encompasses the physical phenomena and inherent threats of disasters. Building upon this is Local Knowledge, specific to the geographical and social characteristics of a community, such as shelter locations and hazardous areas. The apex is Survival Knowledge, which represents the direct concepts required for successful evacuation. This is further subdivided into local, situational, and individual survival knowledge.
Within this framework, refinement refers to the cognitive update where a participant corrects or enhances their knowledge layers based on feedback from the simulation. Stabilization occurs when these refined insights are finalized into “confident concepts”—mental models that allow for rapid, decisive action under the psychological pressure of a real disaster. The training effect is thus measured by the number and clarity of these concepts formed through experience.
In this research design, the constructs of knowledge refinement and stabilization are operationalized through the qualitative analysis of post-questionnaire responses. Refinement is identified when a participant recognizes a gap between their prior knowledge and the simulated experience (e.g., “I tend to get disoriented by concentrating on evacuation even in my community”). Stabilization is identified when a participant formulates a confident concept—a specific, actionable rule for future survival (e.g., “I will prioritize wide routes to avoid disorientation”).

2.2. Cyclic Training Model Based on Experiential Learning

Regardless of the environment—real or virtual—evacuation training must provide both simulated experiences and opportunities for reflection. Such opportunities enable participants to refine their evacuation knowledge effectively. According to [29], reflection is categorized into reflection-in-action, where trainees adjust their behavior during the simulation (e.g., accelerating movement to compensate for delayed decision-making), and reflection-on-action, which involves a comprehensive post-simulation analysis of one’s decisions and their consequences. This latter form of reflection is essential for the significant refinement and stabilization of evacuation knowledge.
For higher training effects, we propose a cyclic evacuation training model [30] based on Kolb’s experiential learning theory [23]. As illustrated in Figure 2, this model views learning as a four-stage cycle where knowledge is created through the transformation of experience:
  • Concrete Experience (CE): The participant is immersed in a high-fidelity disaster scenario, requiring real-time decision-making against dynamic hazards.
  • Reflective Observation (RO): After the simulation, the participant observes their performance from multiple perspectives to identify factors that led to successful or failed outcomes.
  • Abstract Conceptualization (AC): Based on these observations, the participant formulates or modifies their survival concepts (e.g., “I must prioritize wide routes over familiar narrow ones during a fire”).
  • Active Experimentation (AE): The newly formed concepts are tested in subsequent simulations, initiating a new cycle of learning.
This cyclic process identifies the RO and AC stages as the “reflection-on-action” phase, which serves as the critical bridge for knowledge refinement.

2.3. Necessity of Multi-Modal Reflection Support

In highly realistic VR simulations, participants often cannot perform reflection-in-action due to intense immersion and stress (e.g., fear or panic). Therefore, they must perform reflection-on-action calmly after the simulation. However, recalling details from a high-stress experience is difficult. This necessitates reflection support, such as replaying the participant’s actions, to facilitate the RO stage. In the AC stage, participants are expected to create their concepts spontaneously based on the outcomes of the RO stage (i.e., the factors behind their successful or failed evacuation).
While visual replays facilitate objective observation of avatar movements, they may not fully trigger the recall of the participant’s subjective mental state at the time of decision-making. By introducing auditory clue presentation into this framework, we aim to provide a multi-modal stimulus that more effectively prompts participants to reconstruct their internal logic, thereby facilitating a more robust transition from concrete experience to abstract conceptualization.

2.4. Focused Literature Review

Reflection, encompassing both reflection-in-action and reflection-on-action, is essential for enhancing training effects; however, it is often difficult to complete effectively without external support. A practical investigation into computer-supported collaborative reflection pointed out that comprehensive instructions on how to reflect using a specific tool are required for sufficient reflection [31]. Another critical issue is how to motivate the reflective process. One effective approach is to induce failure; for instance, a VR-based safety training system grounded in productive failure theory deliberately leads trainees into failure by controlling the feedback content. Research has shown that failure-induced participants were more likely to acquire and retain knowledge within the experiential learning cycle by applying and reapplying prior knowledge through trial and error [32].
In real-world evacuation training, participants can reflect on their simulated experiences by reviewing video recordings of their performance [33]. Additionally, some mobile applications for evacuation training record participants’ routes via GPS and visualize them on digital maps to promote reflection [34,35].
In the context of VR-based evacuation training, while reflection has not always been the primary focus, many systems have incorporated it as a feedback functionality. Regarding reflection-on-action, for example, a VR-based fire safety training system allows participants to answer quizzes after tasks and retry completed ones [36]. Other VR systems provide integrated support; a training system for active shooter incidents provides immediate feedback for dangerous situations (reflection-in-action) along with a summary board for performance evaluation (reflection-on-action) [16]. An extended version of that system further provides positive or negative feedback based on actions, accompanied by videos demonstrating correct behaviors [37]. Similarly, a VR-based earthquake evacuation training system adopting a serious game approach presents various feedback styles, such as post-game assessments and “spiral feedback,” where participants must answer quizzes correctly to proceed [38]. Feng et al. developed an earthquake training system allowing for customized feedback methods, including immediate feedback and post-training assessments [39]. While studies showed that secondary school students improved their knowledge acquisition and self-efficacy compared to traditional leaflet-based instruction, the researchers noted that reflection time should be increased to further enhance training effects [40]. Furthermore, this system was extended to provide spiral and linear narrative feedback, which proved effective for university students and staff [41].
Despite these advancements, a critical synthesis of these existing mechanisms reveals a significant gap; they predominantly rely on abstract performance data or visual summaries. While visual replays and quizzes provide objective outcomes, they often fail to help trainees reconstruct the subjective, internal decision-making process—the “why” behind their actions—especially in high-stress scenarios where immersion and panic may cloud memory. The integration of synchronized auditory clues (e.g., environmental sounds and verbal interactions) as a catalyst for reflection remains largely unexplored. This lack of multi-modal reflection support limits the depth of experiential learning, as trainees may struggle to recall the specific sensory triggers that influenced their behavior.
Previous studies have demonstrated that reflection support is effective in VR-based evacuation training, but as noted above, several issues regarding the modalities of reflection clues remain. Consequently, we believe that reflection support should be further improved and diversified to maximize its educational impact.

3. Reflection Support

The reflection support function was implemented within a VR-based evacuation training system characterized by metaverse features such as multi-user access and avatar interaction [24].

3.1. Base System Overview

The system was developed using the Unity game engine. Multi-user synchronization and voice-based communication are handled via the Photon Fusion and Photon Voice frameworks, while gesture-based interactions are managed through the Final IK kinematics asset. An administrative user hosts the virtual environment, which is simultaneously accessible by multiple clients using HMDs (e.g., Meta Quest, HTC VIVE) or standard computers.
The training focuses on sudden earthquake scenarios and is structured into three distinct phases: Normal Time (NT), Emergency Time (ET), and Reflection Time (RT). Figure 3 shows the relationship between these phases and the learning stages, along with the user interfaces provided in each phase. The ET phase corresponds to the Concrete Experience (CE) and Active Experimentation (AE) stages of Kolb’s model, while the RT phase covers Reflective Observation (RO) and Abstract Conceptualization (AC).

3.1.1. Normal Time (NT)

In the NT phase, users interact within the metaverse through basic functions. Activities include attending digital lectures delivered by a teacher avatar or engaging in social games such as darts and bowling.

3.1.2. Emergency Time (ET)

The ET phase is triggered by a simulated earthquake that shakes the virtual environment and generates disaster-related objects (e.g., flames and debris) based on a pre-selected scenario. Each instance of an ET phase is termed an “incident.”
To maintain realism, the system does not force evacuation or display instructional prompts; users must decide whether to evacuate based on the situation. Those who attempt to reach safety within the time limit are designated as “participants.” During this phase, the system records “incident logs”—text files containing time-stamped data on avatar positions (x, y, z) and eye-direction vectors. To ensure data consistency and manage file size, the host computer records these movements at regular intervals using Unity’s FixedUpdate method.
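The fixed-interval incident log described above can be illustrated with a minimal sketch. This is hedged Python rather than the system’s actual Unity/C# code; the field order and text format of a log line are assumptions for illustration only:

```python
RECORD_INTERVAL = 0.02  # seconds per record; an assumed fixed-update rate

def append_record(log_lines, frame_index, pos, eye):
    """Append one time-stamped movement record as a text line.

    pos is the avatar position (x, y, z); eye is the eye-direction vector.
    The timestamp is derived from the frame index and the fixed interval.
    """
    t = frame_index * RECORD_INTERVAL
    log_lines.append(
        f"{t:.2f},{pos[0]},{pos[1]},{pos[2]},{eye[0]},{eye[1]},{eye[2]}"
    )
    return log_lines
```

Recording at a fixed interval (rather than every rendered frame) keeps file sizes bounded and makes each line’s timestamp trivially computable from its index, which matters later for synchronized replay.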
The system also includes an NPC (non-player character) mode for sessions with few human participants. In this mode, NPCs replicate the movements recorded in previous incident logs to simulate human-centric disaster challenges such as crowd congestion or reckless behavior driven by cognitive biases.

3.1.3. Reflection Time (RT)

Participants can transition from the NT to the RT phase at any time to review their performance. In this phase, they are referred to as “observers.” The system supports reflection-on-action by replaying the incident through the following sequence:
  • Log loading: The system loads the observer’s selected incident log and associated logs of other participants.
  • Environment reconstruction: Disaster objects are reallocated according to the original scenario.
  • Avatar replay: Avatars are placed at their initial coordinates and moved/rotated at the recorded intervals.
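The avatar-replay step above can be sketched as follows. This is an illustrative Python fragment, not the system’s Unity implementation; the comma-separated record layout (timestamp, position, eye direction) is an assumption carried over from the logging description:

```python
def parse_log(lines):
    """Parse time-stamped movement records: t, x, y, z, dx, dy, dz."""
    frames = []
    for line in lines:
        vals = [float(v) for v in line.split(",")]
        frames.append({"t": vals[0], "pos": tuple(vals[1:4]), "eye": tuple(vals[4:7])})
    return frames

def frame_at(frames, t):
    """Return the latest recorded frame at or before replay time t.

    A replay loop calls this each tick and moves/rotates the avatar
    to the returned position and eye direction.
    """
    current = frames[0]
    for f in frames:
        if f["t"] <= t:
            current = f
        else:
            break
    return current
```

Because records are stored at a known fixed interval, a production implementation could index directly into the list instead of scanning, but the scan makes the replay semantics explicit.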
Observers can toggle between first-person and third-person views. In the third-person view, the camera is positioned diagonally above the avatar and can be rotated using controller sticks.
While a preliminary experiment [25] confirmed that this visual replay is helpful, participants reported difficulty in deeply observing their experiences. This difficulty was attributed not only to UI issues but also to a lack of sufficient clues for reflection. Specifically, replaying only the physical movements of avatars proved insufficient for understanding the underlying decision-making process. To address this and enhance the training effect, we identified a need to provide additional observational clues—leading to the integration of auditory clue presentation.

3.2. Auditory Clue Presentation

In real-world evacuations, individuals make critical decisions based on sensory input, particularly visual and auditory cues. Research indicates that verbal guidance from staff is the most effective form of emergency communication for improving response times [42]. The verbal cues of others can profoundly influence decision-making. For instance, if a perceived leader confidently suggests staying in a building after an earthquake, others may follow despite the risk of aftershocks. Conversely, hearing a warning about a potential gas leak may prompt immediate evacuation.
VR-based simulations are highly effective for investigating these behaviors. Studies using VR fire evacuation simulators have shown that participants tend to choose familiar routes or become “followers” in group dynamics [43]. Furthermore, providing spoken messages or signage explaining situational hazards such as crowd congestion has been shown to mitigate anxiety and reduce total evacuation time [44]. Other research emphasizes that individual roles—and the communication associated with them—largely determine evacuation actions [45].

3.2.1. Overview

In our base system, the reflection support function relied solely on visual cues (avatar movements), which made it difficult for observers to accurately recall or infer the reasons behind their own or others’ actions. For example, an observer might see an avatar remaining in a building but cannot know if that avatar was staying because they believed the building was safe (“This building is sturdy”) or because they were paralyzed by indecision (“I have no idea what to do”).
To enable more effective reflection-on-action, we extended this function to include auditory clue presentation. The extended function replays all auditory information synchronized with the visual replay of the ET phase. By replaying environmental sounds (e.g., fires, collapsing structures) alongside the voices of the observer and surrounding avatars, the system provides the necessary context to clarify the “why” behind evacuation behaviors during the RO stage.

3.2.2. Module Composition

Figure 4 illustrates the system’s module composition, including the extended reflection function. To ensure the high-fidelity presentation of localized auditory clues, the extended function operates in a standalone mode for each user. Specifically, to maintain synchronization and audio spatialization, the multi-user network connection is deactivated during the sound-enabled reflection, and the system utilizes the NPC mode.
During the ET phase, the user’s voice is captured via a microphone and Photon Voice and then saved as a time-stamped WAV file (e.g., 202601010000.wav). In this extended system, an “incident log” is defined as a synchronized pair consisting of a movement log (text) and a voice log (audio). Currently, the system utilizes pre-recorded voice files for NPCs that correspond to specific disaster scenarios.

3.2.3. Sound Play

To ensure a seamless, non-stressful experience, the system initiates audio playback and visual movement simultaneously. Environmental sounds are triggered based on the distance between the observer and the sound-emitting disaster objects.
A key technical challenge is maintaining synchronization during playback control (e.g., pausing or rewinding). When an observer rewinds the replay, the system calculates the exact corresponding line in the movement log and the precise timestamp in the voice log. For example, with a recording interval of 0.02 s, a 500-frame rewind prompts the system to jump back exactly 10 s in both the movement and audio data.
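The synchronization arithmetic can be made concrete with a short sketch, assuming the 0.02 s recording interval stated above (illustrative Python, not the system’s actual code):

```python
RECORD_INTERVAL = 0.02  # seconds per movement-log line (value given in the text)

def rewind(current_line, frames_back):
    """Map a rewind of N movement-log lines to the matching audio position.

    Returns (new_line_index, audio_time_seconds) so that the movement
    replay and the voice log jump to the same moment.
    """
    new_line = max(0, current_line - frames_back)
    audio_time = new_line * RECORD_INTERVAL
    return new_line, audio_time
```

For example, a 500-frame rewind moves both streams back by 500 × 0.02 s = 10 s, matching the example in the text; clamping at zero handles rewinds past the start of the incident.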
To ensure “situation-congruent” playback, volume levels are adjusted using Unity’s Linear Rolloff model. This provides realistic spatialization, where sounds attenuate with distance. Furthermore, the system accounts for acoustic insulation as follows:
  • If an NPC is behind a wall, the volume is reduced by 50%.
  • If an NPC is on a separate floor, the volume is muted (0%).
While environmental and NPC sounds are spatialized, the observer’s own recorded voice is replayed at a constant, clear volume. This allows the observer to objectively review their own verbal reactions and decisions while remaining immersed in the simulated auditory environment. The mechanism for synchronized playback and volume adjustment is conceptualized in Figure 5.
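The volume rules above can be summarized in a small sketch. This is hedged illustrative Python: the linear rolloff mirrors the shape of Unity’s Linear Rolloff model (volume falling linearly to zero at a maximum audible distance), and the `max_distance` parameter is an assumption, as the actual value is not specified in the text:

```python
def replay_volume(distance, max_distance, behind_wall=False, separate_floor=False):
    """Situation-congruent replay volume for a spatialized sound source.

    Linear rolloff: 1.0 at distance 0, falling linearly to 0.0 at
    max_distance. The occlusion rules from the text are then applied:
    behind a wall -> reduced by 50%; separate floor -> muted.
    """
    if separate_floor:
        return 0.0
    volume = max(0.0, 1.0 - distance / max_distance)
    if behind_wall:
        volume *= 0.5
    return volume
```

The observer’s own recorded voice would bypass this function entirely and play at a constant volume, as described above.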

4. Preliminary Experiment

We conducted a preliminary experiment to evaluate whether the extended reflection support function (auditory clue presentation) operates as intended. To verify the formulated research hypothesis (H1) and research questions (RQ1–2), we employed a descriptive qualitative method supplemented by basic descriptive statistics. Thematic analysis was used to interpret open-ended responses, while a Likert scale provided a preliminary measure of user perception.

4.1. Settings

4.1.1. Participants

We recruited eight students from Tokushima University (aged 18–24; one female, seven males) who volunteered to participate in the VR-based evacuation training and the subsequent reflection session.

4.1.2. Scenario and NPCs

The experiment was based on a scenario where students encounter a major earthquake and subsequent building fire while in a six-story campus lecture building. Participants were required to navigate their avatars to a first-floor exit as quickly as possible. Disaster elements, including flames, smoke, and debris, were strategically placed throughout the floors.
To simulate physical risk, we introduced a “health status” metric for avatars. If an avatar remained near fire or smoke, its health decreased, accompanied by a coughing sound effect. If health reached zero, the avatar was declared “dead,” simulating the fatal consequences of reckless actions. All participants were already familiar with the building’s layout (e.g., exits and stairwells), as they regularly attend lectures there, which enhanced the spatial realism of the training.
A total of 10 NPCs were placed within the building:
  • Unconscious NPCs (N = 3): Remained stationary and silent, except for occasional faint moans.
  • Active NPCs (N = 7): Simulated uninjured evacuees, providing approximately three minutes of pre-recorded verbal utterances tied to their actions and surroundings. For example, an NPC near fire would exclaim, “It’s too hot to get close!”
Table 1 details the NPC settings, including their specific actions and dialogue. Figure 6 illustrates the building’s composition, disaster object locations, and NPC placement.

4.1.3. Procedure

Figure 7 outlines the experimental procedure. Initially, participants completed a pre-questionnaire regarding their prior VR experience. They were then briefed on avatar operations using the Meta Quest 2 HMD (Meta Platforms, Inc., CA, USA) and controllers. To minimize VR sickness, participants were seated in a rotating chair and informed that they could stop the training at any time. A brief exploration period followed to allow them to habituate to the controls.
The core procedure for the ET and RT phases was as follows:
  • ET phase: When a participant reached the sixth-floor hall, a massive earthquake was triggered. To prevent VR sickness caused by visual processing delays and perceptual discrepancies, the earthquake was represented by an Earthquake Early Warning alert and 10 s of rumbling and glass-breaking sounds, rather than shaking the visual display. Following a fire alarm, the participant evacuated the building. The system recorded their voice and avatar movements (incident logs), and the screen was video-recorded for data analysis.
  • Transition: The phase ended when the participant exited the building (within a 20 min limit) or was declared dead. In the latter case, the cause of death was displayed before transitioning to the RT phase.
  • RT phase: Participants were instructed on how to use the extended reflection function. They could choose to view the replay via the HMD or a 2D monitor and were free to toggle between first- and third-person views. While they were required to watch the replay until the end, there were no limits on the duration of reflection or the frequency of playback operations (pause, rewind, view change). All user operations during this phase were logged.
Finally, participants completed a post-questionnaire to evaluate their experience.

4.2. Results

Prior to the experiment, seven participants reported experience with immersive VR, while one had only experienced desktop-based VR.

4.2.1. Behavioral Data and Scores in the ET Phase

All participants successfully completed the evacuation. One participant (P6) temporarily suspended the session due to VR sickness but resumed after a brief rest. The mean evacuation time, excluding P6’s break, was 142.8 s (SD = 52.4, range: 79–235 s). Table 2 summarizes the participants’ actions in representative situations, as determined through video analysis. Interaction with NPCs varied depending on the chosen evacuation route.
During the ET phase, three participants spontaneously uttered phrases such as “Wow,” “It’s dangerous!”, or “What should I do?”, reflecting surprise or anxiety upon encountering flames, debris, or NPCs. The other participants remained silent during this phase.
Behavioral patterns toward NPCs and hazards were diverse. No participants followed distant NPCs, even if they were within the field of view. While all participants generally detoured around flames and smoke, three participants (P1, P4, and P5) passed through fire at least once. Seven participants (except for P2) passed through an unconscious NPC (N10) near the exit to reach the outside via E1. Individual characteristic actions included the following:
  • Auditory-driven decisions: P6 and P8 immediately detoured upon hearing N2’s voice near a hazard (FS1). P8 also approached N9 after hearing another NPC (N8) ask, “Are you okay?”
  • Social/NPC interaction: P2 and P7 followed nearby NPCs without overtaking them and paused to check unconscious NPCs. In contrast, P3 and P4 moved more aggressively, often overtaking NPCs.
  • Risk-taking: P1 and P5 chose to pass through fire (FS1 and FS2, respectively) despite hearing NPC warnings.

4.2.2. Replay Interaction During the RT Phase

For safety, P6 conducted the RT phase using a desktop monitor rather than the HMD. The mean numbers of pause, rewind, and view-change operations were 3.25, 2.13, and 5.62, respectively. Seven participants (all except P8) utilized the pause or rewind functions, and seven (all except P4) actively toggled between viewpoints.
Common behaviors during the RT phase included:
  • Environmental scanning: Many participants (P1, P2, P5, and P7) paused in the hall or near hazards to observe their surroundings from a third-person perspective.
  • Reviewing interactions: P2 focused her reflection on the area around N5, switching views while rewinding to re-examine the interaction. P6 specifically reviewed his decisions near flames (FS1 and FS3&D2) and NPCs.
  • Linear viewing: P8 was the only participant who watched the replay until the end without any control operations.

4.2.3. Post-Questionnaire

Table 3 presents the questionnaire items and responses. Participants evaluated Q1 and Q5–Q7 using a five-point Likert scale and provided open-ended responses for the remaining questions. Notably, Q4 was designed to capture what participants gained through reflection without explicitly prompting them for “concepts,” allowing for a more organic assessment of the training’s impact.

4.3. Discussion

4.3.1. Behavioral Analysis of Decision-Making in the ET Phase

The diversity in the participants’ actions during the ET phase appears to stem from individual differences in cognitive tendencies, mental states, and evacuation strategies. While P1, P4, and P5 made independent decisions regardless of NPC behavior, P2 and P7 followed nearby NPCs as a social heuristic. P8 demonstrated a particular sensitivity to auditory cues, relying heavily on NPC voices for decision-making.
The median for Q1 (4.0) suggests that the ET phase achieved a favorable level of realism. Notably, five participants explicitly highlighted that NPC voices enhanced the immersive experience. However, P2 and P7 noted a discrepancy between the NPCs’ visual appearances and their voices. This suggests that improving the congruence between NPC appearance and voice is crucial for maximizing training effects across a broader range of participants.
Common behavioral patterns were also observed. No participants followed distant NPCs, but many detoured around hazards based on perceived danger. Interestingly, P1, P4, and P5 chose to pass through flames (FS1 and FS2), likely perceiving gaps in the visual effects as navigable spaces. This type of risk-taking—often observed in VR where physical safety is guaranteed—highlights the importance of the RT phase, where such reckless actions can be critically re-evaluated. The fact that P1 and P5 ignored auditory warnings (N2) while P6 and P8 heeded them suggests that while sound enhances realism, its impact on behavioral change is moderated by individual cognitive traits and evacuation strategies.

4.3.2. Cognitive Impact of Multi-Modal Reflection in the RT Phase

The frequent use of replay operations in the initial hall suggests that participants were familiarizing themselves with the interface. The medians for Q5 (4.0) and Q6 (4.0) indicate that participants transitioned smoothly into the RT phase and could operate the system with ease, likely due to the intuitive nature of the pause, rewind, and view-change functions. Nevertheless, the risk of VR sickness remains a challenge; future iterations must balance immersiveness with user comfort for prolonged sessions.
Responses to Q2 show that the replay functionality successfully prompted participants to observe not only their physical actions but also their internal thought processes. Participants used the reflection period to evaluate their decisions and search for an “ideal” evacuation strategy. The NPCs also played a vital role in this phase (Q3); P1 and P2 realized how the presence of others could complicate or delay decision-making, while P6 recognized that NPC voices had actually prevented him from taking reckless risks. This “belated realization” of auditory influence is a key effect of the extended function. However, the fact that few participants mentioned their own recorded voices suggests that brief exclamations may not be perceived as valuable data for reflection compared to environmental or NPC cues.
Regarding Q4, although we did not explicitly ask for “concepts,” several participants successfully formulated specific lessons or survival concepts (concepts for successful evacuation):
  • P3 and P6: Emphasized the importance of speedy evacuation and early hazard detection.
  • P4 and P5: Recognized that prior knowledge of multiple routes is essential to avoid indecision during fire events.
  • P8: Developed a concept of calm, independent decision-making while maintaining situational awareness.
  • P1, P2, and P7: While they gained insights into their susceptibility to social influence, their concepts remained more tentative, suggesting that some participants may require more structured guidance or multiple cycles to finalize their evacuation knowledge.
To evaluate the depth of reflection, we performed a thematic analysis on the open-ended responses for Q3 and Q4. Rather than treating each response in isolation, we categorized them into two dominant themes.
Auditory hazard recognition: Participants reported that auditory clues acted as a catalyst for recognizing hazards they had missed during the simulation. For example, P6 realized the danger of broken glass only after hearing an NPC’s warning during the replay.
Social influence awareness: Participants noted that the NPCs’ movements and/or voices made them realize their own susceptibility to social influence. P8 noted that recognizing others’ confusion reinforced the importance of independent decision-making.
We observed notable individual differences. P7 reported “None” for NPC influence, which may be attributed to a visual-dominant cognitive style or a high degree of task-focus that filtered out secondary auditory information. Furthermore, P8 did not utilize the playback control interface (pause/rewind). This could suggest a high level of immersion and successful recall without the need for manual intervention or, conversely, a lack of engagement with the reflective tools. These findings suggest that future iterations of the system should offer adaptive reflection support tailored to individual cognitive profiles (e.g., visual- vs. auditory-dominant learners).

4.3.3. Summary

The median for Q7 (4.0) confirms the perceived utility of reflection in VR-based evacuation training. Although the specific weight of auditory clues versus visual cues requires further quantitative isolation, the combination of movement visualization and auditory clue presentation appears to be a robust element for VR-based evacuation training.
Regarding RQ1, the synchronized auditory clues substantially assisted participants in recalling their internal decision-making processes. The playback of NPCs’ anxious dialogue served as a powerful sensory trigger, allowing them to reconstruct the “why” behind their actions, even when their memories were clouded by the stress of the simulation.
Regarding RQ2, the results indicated that auditory clues contributed to the refinement and stabilization of hierarchical evacuation knowledge. By identifying hazards (e.g., fire or broken glass) and social influences (e.g., others’ panic) through auditory information, participants formulated more concrete and actionable survival concepts such as prioritizing independent decision-making over following a crowd.
Consequently, these findings provide preliminary verification of our hypothesis (H1): the presentation of auditory clues during VR replay enhances a participant’s ability to identify critical factors of evacuation success or failure more effectively than visual-only recall.
To ensure the rigor of our qualitative analysis, the theoretical constructs were mapped to specific empirical indicators as follows:
Knowledge Refinement: Identified when participants explicitly recognized a gap between their prior knowledge and their simulated experiences (e.g., P6 stated, “I realized I didn’t begin evacuating immediately despite my knowledge”).
Knowledge Stabilization: Identified when participants shifted their behavioral intent from hesitation to a clear rule-based decision, formulating a concrete and actionable survival concept during reflection (e.g., P8 stated, “I learned to stay calm and scan the environment to identify safe routes”).

4.4. Limitations

Despite the insights gained, several significant limitations must be addressed. This study lacked a comparative experiment between the previous system (visual-only reflection) and the extended system (visual–auditory reflection). To isolate the specific effects of auditory clue presentation, multiple participants would ideally need to complete the ET phase simultaneously and then be divided into control and experimental groups for the RT phase. However, technical constraints currently prevent the centralized collection of individual WAV files onto a host computer, limiting the function to localized playback. Consequently, the current findings remain qualitative and exploratory. Further investigation is required to quantify how auditory cues specifically contribute to the refinement and stabilization of evacuation knowledge.
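Centralized collection of the kind described here mainly requires each client to ship its recordings to the host in a self-describing format. As a hypothetical sketch (not part of the current system; the framing scheme and function names are assumptions), a length-prefixed record that could be sent over any transport might look like:

```python
import struct

def pack_wav_record(participant_id: str, wav_bytes: bytes) -> bytes:
    """Frame one recording as [id_len][id][data_len][data] (network byte order)."""
    pid = participant_id.encode("utf-8")
    return (struct.pack("!I", len(pid)) + pid
            + struct.pack("!I", len(wav_bytes)) + wav_bytes)

def unpack_wav_record(blob: bytes) -> tuple:
    """Inverse of pack_wav_record; returns (participant_id, wav_bytes)."""
    (id_len,) = struct.unpack_from("!I", blob, 0)
    pid = blob[4:4 + id_len].decode("utf-8")
    (data_len,) = struct.unpack_from("!I", blob, 4 + id_len)
    data = blob[8 + id_len:8 + id_len + data_len]
    return pid, data
```

A host that can reassemble such records per participant would be one route toward the between-group comparison outlined above.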
Another significant limitation of this study is the small and homogeneous sample size (N = 8), consisting primarily of university students (seven males and one female). Consequently, the findings presented here must be treated as preliminary and interpreted within the context of a qualitative pilot investigation. The limited number of participants reduces the statistical power of the Likert-scale data and restricts our ability to generalize the results to a broader population, such as different age groups, occupations, or individuals with varying levels of disaster experience.
Finally, the effectiveness of the reflection phase may have been influenced by the participants’ varying levels of understanding regarding the underlying pedagogical model. In future studies, providing participants with a more comprehensive orientation on the importance of reflection-on-action within the experiential learning cycle may help them better utilize the auditory clues provided.
As an initial exploration of Metavearthquake’s novel reflection support function, this pilot study prioritized the in-depth qualitative analysis of the user experience and the internal cognitive processes of reflection. The rich verbal feedback and detailed behavioral logs obtained from the participants provided valuable insights into how auditory clues facilitate the transition from concrete experience to abstract conceptualization. To address these limitations and enhance the generalizability of our findings, future research will be designed as a full-scale quantitative study. This will involve:
  • Expanding the sample size: Increasing the number of participants to allow for rigorous statistical inference and hypothesis testing.
  • Diversifying the demographics: Recruiting a more representative sample, including elderly individuals and professionals, to evaluate how different life experiences influence the perception of auditory cues in disaster scenarios.
  • Implementing a control group: Conducting a comparative experiment between visual-only and visual–auditory reflection groups to quantitatively isolate the training effect of auditory clue presentation.

5. Conclusions

In this pilot study, we developed and evaluated an extended reflection support function for VR-based evacuation training that incorporates synchronized auditory clue presentation. Based on Kolb’s experiential learning model, we designed the system to facilitate the “Reflective Observation” and “Abstract Conceptualization” stages by replaying not only the physical movements of avatars but also the auditory environment encountered during the simulated disaster. The preliminary experiment provided early indications that participants could utilize auditory clues to re-examine their evacuation strategies and the social influences exerted by surrounding NPCs. The auditory clues appeared to provide a layer of interpretability that visual data alone might not offer, enabling participants to identify factors leading to success more objectively. Consequently, several participants were able to formulate specific, individual concepts for successful evacuation, such as the importance of independent decision-making and pre-disaster route familiarity. Regarding RQ1 and RQ2, our qualitative analysis suggested that synchronized auditory clues could aid in the recall of internal decision-making and the formulation of survival concepts. These findings provide a preliminary indication towards H1, suggesting that multi-modal replay may be more effective for reflection than visual-only recall.
Despite the promising results, this study has notable limitations: the small and homogeneous sample size (N = 8) and the absence of a control group (i.e., visual-only recall). As a qualitative pilot study intended for initial exploration, its findings cannot be generalized to broader populations. Because no control group was included, a direct comparative evaluation of the specific effect of auditory clues versus visual cues remains beyond the scope of this paper, and the results should be interpreted as preliminary insights rather than definitive evidence of training efficacy. Future research will focus on a full-scale quantitative experiment with a larger, more diverse population to statistically validate the training effects. Nevertheless, this study suggests that multi-modal reflection support—integrating both visual and auditory information—offers a potential framework for enhancing disaster education through immersive technology.
Future work will focus on addressing the current technical limitations, particularly the centralized collection of audio data to enable large-scale comparative experiments. We also aim to enhance the audiovisual realism of NPCs and explore adaptive reflection support tailored to individual cognitive tendencies. By refining these reflection mechanisms, we hope to contribute to more effective and resilient disaster preparedness through immersive technology.

Author Contributions

Conceptualization, H.M.; methodology, H.M.; software, H.M.; validation, H.M.; formal analysis, H.M.; investigation, H.M.; resources, H.M.; data curation, H.M.; writing—original draft preparation, H.M.; writing—review and editing, H.M.; visualization, H.M.; supervision, R.Y., M.M. and Y.K.; project administration, H.M., R.Y., M.M. and Y.K.; funding acquisition, H.M., R.Y., M.M. and Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI (Grant Number JP23K25701) and the Research Clusters of Tokushima University (Grant Number 2202007).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of TOKUSHIMA UNIVERSITY (protocol code 23001, 3 July 2023).

Informed Consent Statement

Informed consent was obtained from all participants involved in the study.

Data Availability Statement

All data are contained within the article.

Acknowledgments

I would like to express my sincere gratitude to Yusaku Ichino and everyone who collaborated in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this article:
CE: Concrete Experience
RO: Reflective Observation
AC: Abstract Conceptualization
AE: Active Experimentation
NT: Normal Time
ET: Emergency Time
RT: Reflection Time
FS: flame and smoke
D: debris
N: NPC (non-player character)
E: exit
P: participant
NE: no encounter
PT: pass through
DT: detour
SL: stop and look

References

  1. Fares, O.H.; Aversa, J.; Lee, S.H.; Jacobson, J. Virtual reality: A review and a new framework for integrated adoption. Int. J. Consum. Stud. 2024, 48, e13040. [Google Scholar] [CrossRef]
  2. Marougkas, A.; Troussas, C.; Krouska, A.; Sgouropoulou, C. Virtual Reality in Education: A Review of Learning Theories, Approaches and Methodologies for the Last Decade. Electronics 2023, 12, 2832. [Google Scholar] [CrossRef]
  3. Conrad, M.; Kablitz, D.; Schumann, S. Learning effectiveness of immersive virtual reality in education and training: A systematic review of findings. Comput. Educ. X Real. 2024, 4, 100053. [Google Scholar] [CrossRef]
  4. Schott, C.; Milligan, A.; Marchall, S. Immersive VR for K-12 experiential education—Proposing a pedagogies, practicalities, and perspectives informed framework. Comput. Educ. X Real. 2024, 4, 100068. [Google Scholar] [CrossRef]
  5. Stracke, C.M.; Bothe, P.; Adler, S.; Heller, E.S.; Deuchler, J.; Pomino, J.; Wölfel, M. Immersive virtual reality in higher education: A systematic review of the scientific literature. Virtual Real. 2025, 29, 64. [Google Scholar] [CrossRef]
  6. Samala, A.D.; Rawas, S.; Rahmadika, S.; Criollo-C, S.; Fikri, R.; Sandra, R.P. Virtual reality in education: Global trends, challenges, and impacts—Game changer or passing trend? Discov. Educ. 2025, 4, 229. [Google Scholar] [CrossRef]
  7. Tan, A.; Dorneich, M.C.; Cotos, E. Developing and evaluating the fidelity of virtual reality-artificial intelligence (VR-AI) environment for situated learning. Front. Virtual Real. 2025, 6, 1587768. [Google Scholar] [CrossRef]
  8. Hong, S.; Moon, J.; Eom, T.; Awoyemi, I.D.; Hwang, J. Generative AI-Enhanced Virtual Reality Simulation for Pre-Service Teacher Education: A Mixed-Methods Analysis of Usability and Instructional Utility for Course Integration. Educ. Sci. 2025, 15, 997. [Google Scholar] [CrossRef]
  9. Guo, X.; Liu, Y.; Tan, Y.; Xia, Z.; Fu, H. Hazard identification performance comparison between virtual reality and traditional construction safety training modes for different learning style individuals. Saf. Sci. 2024, 180, 106644. [Google Scholar] [CrossRef]
  10. Sabir, A.; Hussain, R.; Pedro, A.; Park, C. Personalized construction safety training system using conversational AI in virtual reality. Autom. Constr. 2025, 175, 106207. [Google Scholar] [CrossRef]
  11. Fracaro, S.G.; Tehreem, Y.; Toyoda, R.; Gallagher, T.; Glassey, J.; Bernaerts, K.; Wilk, M. Benefits and impact of emergency training in a VR environment. Educ. Chem. Eng. 2024, 48, 63–72. [Google Scholar] [CrossRef]
  12. Scorgie, D.; Feng, Z.; Paes, D.; Parisi, F.; Yiu, T.W.; Lovreglio, R. Virtual reality for safety training: A systematic literature review and meta-analysis. Saf. Sci. 2024, 171, 106372. [Google Scholar] [CrossRef]
  13. Faiz, T.; Tsun, M.T.K.; Mahmud, A.A.; Sim, K.Y. A Scoping Review on Hazard Recognition and Prevention Using Augmented and Virtual Reality. Computers 2024, 13, 307. [Google Scholar] [CrossRef]
  14. Scippo, S.; Luzzi, D.; Cuomo, S.; Ranieri, M. Innovative Methodologies Based on Extended Reality and Immersive Digital Environments in Natural Risk Education: A Scoping Review. Educ. Sci. 2024, 14, 885. [Google Scholar] [CrossRef]
  15. De Fino, M.; Cassano, F.; Bernardini, G.; Quagliarini, E.; Fatiguso, F. On the user-based assessments of virtual reality for public safety training in urban open spaces depending on immersion levels. Saf. Sci. 2025, 185, 106803. [Google Scholar] [CrossRef]
  16. Liu, R.; Becerik-Gerber, B.; Lucas, G.M.; Busta, K. Impact of behavior-based virtual training on active shooter incident preparedness in healthcare facilities. Int. J. Disaster Risk Reduct. 2025, 118, 105225. [Google Scholar] [CrossRef]
  17. Li, X.; Song, S.; Liu, S.; Yin, D.; Wang, R.; Gong, B. Application of Virtual Reality Technology in Enhancing the Teaching Effectiveness of Coal Mine Disaster Prevention. Sustainability 2025, 17, 79. [Google Scholar] [CrossRef]
  18. Gagliardi, E.; Bernardini, G.; Quagliarini, E.; Schumacher, M.; Calvaresi, D. Characterization and future perspectives of virtual reality evacuation drills for safe built environments: A systematic literature review. Saf. Sci. 2023, 163, 106141. [Google Scholar] [CrossRef]
  19. Menzemer, L.W.; Ronchi, E.; Karsten, M.M.V.; Gwynne, S.; Frederiksen, J. A scoping review and bibliometric analysis of methods for fire evacuation training in buildings. Fire Saf. J. 2023, 136, 103742. [Google Scholar] [CrossRef]
  20. Hung, M.-C.; Lin, C.-Y.; Hsiao, G.L.-K. Virtual Reality in Building Evacuation: A Review. Fire 2025, 8, 80. [Google Scholar] [CrossRef]
  21. Liu, Q.; Liu, R. Virtual reality for indoor emergency evacuation studies: Design, development, and implementation review. Saf. Sci. 2025, 181, 106678. [Google Scholar] [CrossRef]
  22. Zimmerman, B.J. Becoming a self-regulated learner: Which are the key subprocesses? Contemp. Educ. Psychol. 1986, 11, 307–313. [Google Scholar] [CrossRef]
  23. Kolb, D.A. Experiential Learning: Experience as the Source of Learning and Development; Prentice Hall: Englewood Cliffs, NJ, USA, 1984. [Google Scholar]
  24. Mitsuhara, H. Metaverse-Based Evacuation Training: Design, Implementation, and Experiment Focusing on Earthquake Evacuation. Multimodal Technol. Interact. 2024, 8, 112. [Google Scholar] [CrossRef]
  25. Mitsuhara, H.; Yamanaka, R.; Matsushige, M.; Kozuki, Y. Reflection Support Function in a Metaverse-Based Evacuation Training System. In Proceedings of the 2024 9th International Conference on Business and Industrial Research (ICBIR), Bangkok, Thailand, 23–24 May 2024; pp. 1570–1575. [Google Scholar] [CrossRef]
  26. Keya, R.T.; Heldal, I.; Patel, D.; Murano, P.; Wijkmark, C.H. Implementing Virtual Reality for Fire Evacuation Preparedness at Schools. Computers 2025, 14, 286. [Google Scholar] [CrossRef]
  27. Fu, Y.; Li, Q. A Virtual Reality–Based Serious Game for Fire Safety Behavioral Skills Training. Int. J. Hum. Comput. Interact. 2024, 40, 5980–5996. [Google Scholar] [CrossRef]
  28. Zhang, Y.; Paes, D.; Feng, Z.; Scorgie, D.; He, P.; Lovreglio, R. Comparative analysis of fire evacuation decision-making in immersive vs. non-immersive virtual reality environments. Autom. Constr. 2025, 179, 106441. [Google Scholar] [CrossRef]
  29. Munby, H. Reflection-in-action and reflection-on-action. Educ. Cult. 1989, 9, 31–42. [Google Scholar] [CrossRef]
  30. Mitsuhara, H.; Tanimura, C.; Nemoto, J.; Shishibori, M. Location-based game for thought-provoking evacuation training. Multimodal Technol. Interact. 2023, 7, 59. [Google Scholar] [CrossRef]
  31. Renner, B.; Prilla, M.; Cress, U.; Kimmerle, J. Effects of Prompting in Reflective Learning Tools: Findings from Experimental Field, Lab, and Online Studies. Front. Psychol. 2016, 7, 820. [Google Scholar] [CrossRef]
  32. Lu, S.; Feng, Z.; Lovreglio, R.; Wang, F.; Yuan, X. Comparing the productive failure and directive instruction for declarative safety knowledge training using virtual reality. J. Comput. Assist. Learn. 2024, 40, 1040–1051. [Google Scholar] [CrossRef]
  33. Sun, Y.; Yamori, K.; Kondo, S. Single-person drill for tsunami evacuation and disaster education. J. Integr. Disaster Risk Manag. 2014, 4, 30–47. [Google Scholar] [CrossRef]
  34. Leelawat, N.; Suppasri, A.; Latcharote, P.; Abe, Y.; Sugiyasu, K.; Imamura, F. Tsunami evacuation experiment using a mobile application: A design science approach. Int. J. Disaster Risk Reduct. 2018, 29, 63–72. [Google Scholar] [CrossRef]
  35. Yamori, K.; Sugiyama, T. Development and social implementation of smartphone app Nige-Tore for improving tsunami evacuation drills: Synergistic effects between commitment and contingency. Int. J. Disaster Risk Sci. 2020, 11, 751–761. [Google Scholar] [CrossRef]
  36. Shiradkar, S.; Rabelo, L.; Alasim, F.; Nagadi, K. Virtual World as an Interactive Safety Training Platform. Information 2021, 12, 219. [Google Scholar] [CrossRef]
  37. Liu, R.; Becerik-Gerber, B.; Lucas, G.M. Effectiveness of VR-based training on improving occupants’ response and preparedness for active shooter incidents. Saf. Sci. 2023, 164, 106175. [Google Scholar] [CrossRef]
  38. Ahmadi, M.; Yousefi, S.; Ahmadi, A. Exploring the most effective feedback system for training people in earthquake emergency preparedness using immersive virtual reality serious games. Int. J. Disaster Risk Reduct. 2024, 110, 104630. [Google Scholar] [CrossRef]
  39. Feng, Z.; González, V.A.; Mutch, C.; Amor, R.; Rahouti, A.; Baghouz, A.; Li, N.; Cabrera-Guerrero, G. Towards a customizable immersive virtual reality serious game for earthquake emergency training. Adv. Eng. Inform. 2020, 46, 101134. [Google Scholar] [CrossRef]
  40. Feng, Z.; González, V.A.; Mutch, C.; Amor, R.; Cabrera-Guerrero, G. Instructional mechanisms in immersive virtual reality serious games: Earthquake emergency training for children. J. Comput. Assist. Learn. 2021, 37, 542–556. [Google Scholar] [CrossRef]
  41. Feng, Z.; González, V.A.; Mutch, C.; Amor, R.; Cabrera-Guerrero, G. Exploring spiral narratives with immediate feedback in immersive virtual reality serious games for earthquake emergency training. Multimed. Tools Appl. 2023, 82, 125–147. [Google Scholar] [CrossRef]
  42. van der Wal, C.N.; Robinson, M.A.; Bruine de Bruin, W.; Gwynne, S. Evacuation behaviors and emergency communications: An analysis of real-world incident videos. Saf. Sci. 2021, 136, 105121. [Google Scholar] [CrossRef] [PubMed]
  43. Wang, B.; Ren, G.; Li, H.; Zhang, J.; Qin, J. Developing a Framework Leveraging Building Information Modelling to Validate Fire Emergency Evacuation. Buildings 2024, 14, 156. [Google Scholar] [CrossRef]
  44. Tucker, A.; Marsh, K.L.; Gifford, T.; Lu, X.; Luh, P.B.; Astur, R.S. The effects of information and hazard on evacuee behavior in virtual reality. Fire Saf. J. 2018, 99, 1–11. [Google Scholar] [CrossRef]
  45. Lin, J.; Peng, Z.; Zhu, R.; Xue, Y. Formation and evolution of individual evacuation roles in building emergencies: A role-playing immersive virtual reality study. Int. J. Disaster Risk Reduct. 2025, 126, 105632. [Google Scholar] [CrossRef]
Figure 1. Evacuation training effect and hierarchical evacuation knowledge.
Figure 2. Evacuation training model.
Figure 3. Relationship between evacuation training phases and learning stages based on Kolb’s model.
Figure 4. System architecture and module composition for synchronized auditory clue presentation.
Figure 5. Mechanism for synchronized movement replay and spatialized sound playback.
Figure 6. Distribution of disaster objects and NPCs. Participants could pass through any fire source (FS), although doing so resulted in a deterioration of their health status. Debris D1 completely blocked access to the lower floors, whereas D2 and D3 remained navigable. FS3 was positioned nearly identically to D2. NPCs N1 and N7 remained stationary at their initial positions. NPCs N5, N9, and N10 were rendered as unconscious and did not move; note that the system did not support rescue operations. NPCs N2, N3, N4, and N6 actively moved toward the exits. Light-green lines represent the evacuation trajectories of the NPCs, while the bold orange line indicates the designated safest evacuation route.
Figure 7. Experimental procedure.
Table 1. NPC settings.
NPC | Floor | Action | Voice (Excerpts)
N1 | 6 | Remaining stationary | “We should wait for directions from the authorities.”
N2 | 6 | Approaching fire, then retreating in hesitation | “It’s hot! Can I even get through these flames?”
N3 | 6 | Moving slowly with disorientation | “Where is the exit?”
N4 | 5 | Evacuating calmly | “It is okay; we can evacuate without rushing.”
N5 | 5 | Unconscious | (Occasional faint moaning or heavy breathing)
N6 | 4 | Running along a safe route in panic | “Oh my gosh! This is so dangerous!”
N7 | 3 | Remaining stationary and indecisive | “I have no idea where to evacuate.”
N8 | 3 | Attempting to rescue N9 and then abandoning the effort | “Are you okay? We might need an AED.”
N9 | 3 | Unconscious | (Same as N5)
N10 | 1 | Unconscious | (Same as N5)
Table 2. Participants’ actions in representative situations.
Participant | FS1 | FS2 | FS3&D2 | N5 | N9 | FS5 | FS6 | N10
P1 | PT | DT | NE | NE | NE | NE | DT | PT
P2 | NE | NE | DT | SL | SL | NE | DT | SL
P3 | NE | NE | DT | PT | PT | DT | DT | PT
P4 | NE | PT | DT | PT | NE | NE | NE | PT
P5 | NE | PT | DT | PT | PT | NE | DT | PT
P6 | DT | DT | DT | SL | PT | NE | DT | PT
P7 | NE | NE | DT | SL | SL | DT | DT | PT
P8 | DT | NE | DT | PT | SL | DT | DT | PT
NE: no encounter; PT: pass through; DT: detour; SL: stop and look.
Table 3. Post-questionnaire results.
Question | Representative Replies and Statistics
Q1. Did you feel a sense of realism during the evacuation? (5-point Likert scale)
What specific aspects contributed to the realism?
P1 (4): The voices of others felt very realistic.
P2 (3): There were discrepancies between avatar appearances (e.g., estimated age) and their voices. There was also a lack of vocal variety.
P3 (4): The chaotic atmosphere of the panic was well represented.
P4 (4): Hearing NPC voices and coughing sounds created a strong sense of urgency.
P5 (4): The sight of everyone wandering combined with their anxious voices heightened the realism.
P6 (4): I understood the severity of the situation through the NPCs’ dialogue, which prompted me to run or turn back at dangerous spots.
P7 (3): Avatars speaking while evacuating improved immersion, but the mismatch between their looks and voices was distracting.
P8 (5): A few NPCs who seemed unsure of their next move or were heading upstairs felt somewhat out of place.
Median = 4.0 (MAD = 0.0)
Q2. What did you recall or observe while watching the replay?
P1: I recalled why I hesitated at certain points and what my internal thoughts were at the time.
P2: I observed my physical actions, such as gaze direction and where I decided to change my route, as well as the underlying reasons for those choices.
P3: I evaluated whether I took unnecessary or dangerous actions.
P4: I observed the overall situational context surrounding me during the evacuation.
P5: I reflected on whether I chose the optimal route and how I felt when encountering the unconscious NPCs.
P6: I observed the distance to hazards, the paths I missed while evacuating, and how others were moving.
P7: I assessed whether any of my actions were incorrect.
P8: I checked whether I remained calm and if I followed the evacuation route as remembered.
Q3. What did you realize regarding the NPCs during the reflection?
P1: I realized that since people react differently, I must decide my own disaster response in advance.
P2: The NPCs made me realize that other people might be slower to escape than expected.
P3: I obtained early information about fire and casualties through the NPCs’ verbal cues.
P4: I was focused on my own evacuation; I noticed the screaming and coughing but did not prioritize helping them.
P5: I realized that people wandering aimlessly leads to chaos and potential accidents.
P6: I realized that the absence of authority figures can lead to rushing. Seeing and hearing an NPC retreat from the heat helped me understand the risk of flames. I also realized the need to avoid debris after hearing an NPC mention broken glass.
P7: None.
P8: I realized the importance of independent decision-making and prioritizing my own safety without hesitation in a confused state.
Q4. What lessons did you learn through this reflection?
P1: I realized that unfamiliar situations can leave me momentarily paralyzed.
P2: The presence of others can be a distraction and may increase my own evacuation time.
P3: I noticed my movements were slow, which taught me the need to escape more rapidly.
P4: Knowing evacuation routes in advance is essential for survival.
P5: I need to prepare specifically for fire events; I was flustered because I hadn’t thought about alternative routes.
P6: I realized I didn’t begin evacuating immediately despite my knowledge. The replay showed I was following NPCs rather than using my own knowledge of the building, and the third-person view revealed I was stepping on debris without noticing.
P7: I realized that in a real disaster, the difficulty of rescue operations might result in me prioritizing my own evacuation over helping the injured.
P8: Hearing the anxiety in others’ voices made me anxious. I learned to stay calm and scan the environment to identify safe routes.
Q5. Did you feel immersed during the reflection phase?
P1–P8: (4, 5, 4, 4, 5, 3, 2, 4)
Median = 4.0 (MAD = 0.5)
Q6. Was the reflection process easy to perform?
P1–P8: (2, 4, 4, 4, 4, 4, 4, 4)
Median = 4.0 (MAD = 0.0)
Q7. Did the reflection increase your disaster awareness?
P1–P8: (4, 4, 4, 4, 5, 5, 3, 4)
Median = 4.0 (MAD = 0.0)
Note: Likert scale options: 1 = definitely no; 2 = no; 3 = neutral; 4 = yes; 5 = definitely yes. MAD: median absolute deviation.
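The medians and MADs reported in Table 3 can be reproduced directly from the listed scores. A short check in Python (scores taken from the Q5–Q7 rows above; `mad` is a local helper, not a library function):

```python
import statistics

def mad(scores):
    """Median absolute deviation around the median."""
    med = statistics.median(scores)
    return statistics.median([abs(s - med) for s in scores])

q5 = [4, 5, 4, 4, 5, 3, 2, 4]
q6 = [2, 4, 4, 4, 4, 4, 4, 4]
q7 = [4, 4, 4, 4, 5, 5, 3, 4]

for name, scores in [("Q5", q5), ("Q6", q6), ("Q7", q7)]:
    print(name, statistics.median(scores), mad(scores))
# Q5: median 4.0, MAD 0.5; Q6: median 4.0, MAD 0.0; Q7: median 4.0, MAD 0.0
```

MAD is used here rather than standard deviation because, with ordinal Likert data and N = 8, it is robust to the single outlying ratings (e.g., P7’s 2 for Q5).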

Share and Cite

MDPI and ACS Style

Mitsuhara, H.; Yamanaka, R.; Matsushige, M.; Kozuki, Y. Enhancing Reflection in VR-Based Evacuation Training Through Synchronized Auditory Clue Presentation: A Pilot Study. Appl. Sci. 2026, 16, 3048. https://doi.org/10.3390/app16063048
