Article

Analysis for Evaluating Initial Incident Commander (IIC) Competencies on the Fireground via VR Simulation: Quantitative–Qualitative Evidence from South Korea

1
Department of Education and Training, National Fire Service Academy, Gongju 32522, Republic of Korea
2
Department of Safety Policy Research, National Fire Research Institute, Asan 31555, Republic of Korea
*
Author to whom correspondence should be addressed.
Fire 2025, 8(10), 390; https://doi.org/10.3390/fire8100390
Submission received: 14 September 2025 / Revised: 29 September 2025 / Accepted: 30 September 2025 / Published: 2 October 2025

Abstract

This study evaluates the competency-based performance of Initial Incident Commander (IIC) candidates—fire officers who serve as first-arriving, on-scene incident commanders—in South Korea and identifies sub-competency deficits to inform training improvements. Using evaluation data from 92 candidates tested between 2022 and 2024—of whom 67 achieved certification and 25 did not—we analyzed counts and mean scores for each sub-competency and integrated transcribed radio communications to contextualize deficiencies. Results show that while a majority (72.8%) passed, a significant proportion (27.2%) failed, with recurrent weaknesses in crisis response, progress management, and decision-making. For example, “Responding to Unexpected or Crisis Situations 3-3” recorded 27 unsuccessful cases with a mean score of 68.8. Candidates also struggled with resource allocation, situational awareness, and radio communications. The study extends recognition-primed decision-making theory by operationalizing behavioral marker frameworks and underscores the need for predetermined internal alignment, scalability, and teamwork synergy. Practical implications recommend incorporating high-fidelity simulation and VR scenarios, competency frameworks, and reflective debriefs in training programs. Limitations include the single-country sample, reliance on predetermined scoring rubrics, and absence of team-level analysis. Future research should adopt multi-jurisdictional longitudinal designs, evaluate varied training interventions, assess skill retention, and explore the interplay between physical and cognitive training over time.

1. Introduction

Emergency-management scholars widely agree that modern incident command requires integrating procedural expertise with non-technical cognitive skills. National guidance documents codify core principles—structured decision-making, clear objectives, scalability and multi-agency coordination [1]. Behavioral marker systems, such as THINCS, operationalize these principles by evaluating decision quality, communication clarity and situational awareness [2]. Research on naturalistic decision-making and recognition-primed decision theory shows how experienced commanders use pattern recognition and mental simulation to act effectively under uncertainty [3,4]. Empirical studies from the UK fire and rescue service show that incident commanders routinely rely on RPD processes during in situ operations and exercises, with decisions shaped by SOPs and the deployment of operational discretion under time pressure. In particular, field and simulation evidence demonstrates how cue recognition and rapid mental simulation can both support and, at times, paradoxically deviate from SOP intent when conditions evolve unexpectedly (e.g., priority drift, mistimed tactical actions). We draw on these findings to frame our SA→PE mechanisms and to interpret the observed shortfalls in crisis response and progress management [5,6]. High-fidelity simulations and scenario-based drills improve adaptability and overall performance [7,8]. Scholars further emphasize that effective fireground commanders must exhibit strong leadership, adaptive decision-making and a commitment to continuous training [9]. Collectively, these studies underscore that incident command effectiveness depends on combining technical procedures with non-technical competencies, and that immersive training and competency-based assessment are essential preparation for complex emergencies.
Despite consensus on the importance of non-technical skills, significant research gaps remain. The U.S. Federal Emergency Management Agency argues that the Incident Command System must be systematically evaluated and demystified [1], yet rigorous assessment of command performance is sporadic. Many studies prioritize procedural compliance or technological innovation over demonstrable learning outcomes; for instance, VR training research often emphasizes user experience rather than cognitive or behavioral improvement [10]. Sub-competency deficits in crisis response and progress management are underexplored, and few evaluations integrate radio communications into training analysis. Fire departments facing command-capability gaps due to retirements report that standard certification courses and simulation exercises are insufficient to ensure consistent performance [11]. Additionally, while decision-making theory is well developed, little is known about the temporal dynamics of evolving incidents or how simulation-acquired skills transfer to real-world operations. These omissions highlight the need for research that appraises training efficacy, identifies specific performance gaps and links them to operational outcomes.
This study tackles the identified gaps by systematically evaluating crisis-response and progress-management competencies among initial incident commanders. Using a unique dataset of 2022–2024 evaluation results, it compares successful and unsuccessful candidates to pinpoint sub-competencies that consistently fall below performance thresholds. Building on behavioral marker frameworks and recognition-primed decision theory, it integrates quantitative metrics (counts and average scores) with qualitative analysis of radio communication transcripts to contextualize observed weaknesses. The study also examines the operational impact of these deficiencies and formulates evidence-based recommendations for improvement. By bridging theoretical models with empirical performance data, this research not only highlights which competencies require reinforcement but also proposes a targeted, scenario-based training regimen that emphasizes adaptive decision-making, situational awareness and effective communication. Through this mixed-method approach, the study aims to guide training designers, instructors and field commanders in refining curricula and assessment processes, thereby enhancing command readiness and resilience.
Addressing training and competency gaps is critical because inadequate incident-command capability leads to casualties and property loss. In northeast Ohio, 59 of surveyed fire departments reported declining command capability due to retirements, citing losses in hazard recognition and communication skills [11]. Recent South Korean disasters underscore the cost: the Jecheon sports-center fire killed 29 people and trapped victims in a sauna because exits were blocked and toxic smoke delayed rescue [12]; investigations later criticized firefighters for failing to deploy ladders or access upper floors [13]. One month later, a fire at Miryang’s Sejong Hospital left 41 dead and 153 injured [14], demonstrating how mismanagement can compound fatalities. Communication failures during 9/11—where responders lacked interoperable radios—also impeded decision-making and increased casualties [15]. Complex emergencies demand predetermined internal alignment, scalability and teamwork synergy [1]; without these attributes, evolving incidents overwhelm commanders. As disasters become more frequent and diverse, there is an ethical imperative to rigorously evaluate and improve incident-command training to prevent future tragedies and strengthen community resilience.
This study makes several contributions to the academic understanding of incident command and crisis management. First, it reinforces and extends theories of decision-making under uncertainty by showing how deficiencies in adaptive decision-making, including recognition-primed and naturalistic processes, manifest in measurable sub-competencies [3,16,17]. Second, it refines competency frameworks and behavioral marker systems by translating constructs such as situational awareness, communication, and task delegation into quantifiable evaluation outcomes observable in VR simulations [2,18]. Third, it advances simulation-based learning theories by revealing persistent weaknesses in progress management and underscoring the need for temporal adaptability and continuous reassessment in experiential training [7,8,19,20]. Fourth, it contributes to the debate on standardization versus flexibility by demonstrating empirically that rigid adherence to SOPs without adaptive judgment correlates with poor performance [21,22,23,24]. Finally, it broadens discourse on organizational resilience by highlighting the enabling role of training culture, feedback, and institutional systems [25,26,27,28].

2. Literature Review

Modern incident command is now understood as a complex interplay between procedural expertise and non-technical cognitive skills, rather than a series of rote steps. The United Kingdom’s National Operational Guidance [29] provides a doctrinal backbone by codifying essential principles such as structured decision-making, clear objectives, stringent safety management and coordinated multi-agency response [1]. These guidelines clarify that mere compliance is inadequate and call for the adoption of competency-based assessments. Behavioral marker systems such as THINCS were developed to operationalize these principles, enabling evaluators to judge decision quality, communication clarity and situational awareness in practice [2]. Consequently, scholars argue that effective fireground commanders must blend technical expertise with strong leadership, adaptability and continuous training [9]. This holistic perspective marks a clear shift from a purely procedural paradigm to a competency-driven approach.
At the theoretical level, research into naturalistic decision-making (NDM) provides the foundation for understanding the non-technical skills demanded of incident commanders. NDM illustrates how experienced responders make rapid, high-stakes decisions amid ambiguity and uncertainty [3]. Within this paradigm, recognition-primed decision theory describes how commanders integrate current cues with prior experience and mentally simulate actions to guide responses [4]. Targeted training interventions have shown that goal-oriented practice in virtual fire scenarios can enhance decision strategies and situational awareness [17]. These advances underscore the importance of cognitive training that addresses mental workload and time pressure inherent in fireground command. Robust measurement tools are crucial for capturing such cognitive elements; ref. [16] developed widely used scales to assess situational awareness at both individual and team levels. Across other high-risk domains, studies consistently highlight the centrality of teamwork, situational awareness and decision-making to safety, reinforcing the need for integrated cognitive and perceptual skills [18]. Foundational situation awareness (SA) theory clarifies how perception–comprehension–projection supports rapid RPD decisions [16]. In fire/IC contexts, goal-oriented decision-control training and remote/immersive VR for incident commanders have shown measurable benefits in decision-making and SA [17].
Building on these theoretical insights, simulation-based training has evolved into a cornerstone of incident command development. Early computer-based simulations were shown to improve situational assessment, tactical decision-making and overall performance [7]. Subsequent research has confirmed that high-fidelity scenario-based drills foster adaptability and enhance safety outcomes [8]. Reflective learning models such as Introspect combine scenario-driven exercises with structured self-assessment to refine decision-making and leadership skills [30]. Regular participation in realistic, high-fidelity simulations reinforces coordination and decision-making proficiency, promoting enduring preparedness [31]. Distributed virtual simulations allow geographically dispersed teams to train together and have been shown to improve situational awareness, adaptability and decision-making [20]. Multi-agency, high-fidelity exercises expose participants to the complexities of large-scale emergencies and enhance operational command performance [32]. Importantly, the benefits of immersive training translate into real fireground operations, where firefighters trained with these methods demonstrate superior situational awareness and tactical execution [33]. Such training not only hones technical and cognitive skills but also fosters reflective practice and coordination. Recent evaluations further report gains in leadership/decision competence in firefighter training [34] and effective IC readiness maintenance via VR-enabled programs during COVID-19 [27].
Rapid technological advances are expanding the horizons of command training. Fire services worldwide now integrate virtual reality (VR), augmented reality (AR) and mixed reality (MR) systems to enhance realism and engagement [35]. For example, immersive VR modules focusing on hazardous materials have significantly improved hazard recognition, tactical decision-making and overall preparedness [36]. Training institutions are embracing these innovations, as evidenced by the UK Fire Service College’s adoption of advanced simulation technologies [37]. Despite these developments, scholars warn that many VR studies prioritize technological novelty and user experience over demonstrable learning outcomes, underscoring the need for outcome-focused research [10]. Evidence from other high-risk sectors illustrates VR’s potential but also highlights its challenges: emotionally immersive simulations have been used both for combat decision training and therapeutic purposes [38], mission rehearsal for preparedness [39] and stress-management programs that enhance resilience [40]. Some claimed benefits remain contested, stressing the importance of rigorous evaluation.
Alongside these technological trends, organizations are formalizing and professionalizing command training. The Korean National Fire Agency has introduced a Strategic On-Scene Commander certification to systematically evaluate and strengthen command competence [28]. Leadership competency assessments help identify strengths and gaps in training programs [34], while efforts to validate job-specific physical fitness tests ensure that firefighters’ capabilities meet the demands of emergency operations [26]. Research-driven initiatives in the UK have focused on behavioral assessments and evidence-based improvements to incident command [41]. Mixed-methods evaluations—combining quantitative performance metrics with qualitative observations—yield comprehensive insights into training effectiveness [42]. The COVID-19 pandemic forced further innovation in training delivery; fire departments successfully leveraged remote learning to maintain commander readiness during lockdowns [27]. Experts caution against complacency, emphasizing that training must evolve through increasingly realistic scenarios and rising standards [43]. Analyses of major fire disasters reveal systemic deficiencies in enforcement, resource allocation, communication and oversight, reinforcing the need for continuous improvement in fire safety management and training [24].
Collectively, the literature makes clear that effective incident command rests on the integration of procedural knowledge, cognitive adaptability and organizational learning. Simulation and immersive training programs, bolstered by evolving VR/AR technologies, offer promising avenues for enhancing command competence. Formal certification frameworks and institutional initiatives underscore the growing professionalism of incident command roles. Nevertheless, important gaps remain: many VR studies emphasize user experience over demonstrable learning outcomes; few training programs address the temporal complexities of evolving crises; and questions persist about the optimal balance between standardized procedures and improvisational flexibility. Addressing these challenges will require interdisciplinary research that combines cognitive theory, behavioral assessment and advanced simulation technology to prepare adaptable, resilient incident commanders. Future studies also need to investigate long-term retention of simulation- and VR-acquired skills across diverse incident types and cultural contexts.

3. Materials

This study analyzed archival results from the National Fire Service Academy’s (NFSA) Initial Incident Commander (IIC)—fire officers who serve as first-arriving, on-scene incident commanders—certification assessments conducted between 2022 and 2024. Across these three years, multiple assessment sessions were held, involving a total of 92 unique IIC candidates (some individuals who did not pass initially retook the assessment in subsequent sessions). All participants were active fire service officers who met the NFSA’s eligibility criteria for this advanced certification program (Appendix A.3).
All assessments were scenario-based simulations of complex fire incidents, including immersive VR-based fireground scenarios with live radio communication to mimic real emergency conditions, with scenario narratives and difficulty parameters calibrated to ensure a consistent challenge for all candidates (Appendix A.5). Each candidate’s performance was documented using a standardized evaluation form spanning five major competency domains (e.g., situational assessment, operational response management, crisis leadership and communication, action planning/briefing, and media briefing), which together encompassed 20 specific sub-competency items (the full competency framework and behavioral indicators for these domains are provided in Appendix A.2).
The scoring for each sub-competency was expressed as a percentage of the maximum possible score for that item. An 80% performance score was defined as the minimum acceptable standard for each sub-competency (Appendix A.2), consistent with the certification’s pass/fail criteria. Candidates were required to achieve an overall average score of at least 80% across all competencies and demonstrate no critical deficiencies in any core sub-competency (those designated as essential in the assessment criteria; see Appendix A.2) in order to pass the certification. A score below 80% on any given item indicated that the expected standard for that skill was not met. The pass/fail outcomes in each session reflected these rigorous criteria: many candidates failed to meet the 80% threshold in one or more critical sub-competencies during the first assessment, necessitating additional training and a second attempt (specific pass rates and score distributions for each session are provided in Section 5).
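The pass/fail rule described above can be expressed compactly in code. The sketch below is an illustrative reconstruction, assuming only what is stated in the text (an 80% threshold applied both to the overall mean and to each critical sub-competency); the function name, item codes, and data layout are hypothetical, not NFSA's actual schema.

```python
# Illustrative sketch of the certification rule: pass requires
# (a) mean of all sub-competency percentages >= 80, and
# (b) no "critical" sub-competency below 80.
# Codes and scores below are mock data, not NFSA records.

THRESHOLD = 80.0

def certify(scores: dict[str, float], critical: set[str]) -> bool:
    """scores maps sub-competency code -> percentage (0-100)."""
    overall_ok = sum(scores.values()) / len(scores) >= THRESHOLD
    critical_ok = all(scores[code] >= THRESHOLD for code in critical)
    return overall_ok and critical_ok

candidate = {"2-4": 74.2, "3-4": 88.0, "2-1": 91.0, "1-4": 85.0}
# mean is 84.55, but critical item 2-4 is below 80 -> not certified
print(certify(candidate, critical={"2-4"}))  # prints False
```

Note that under this rule a candidate can fail despite a passing overall average, which matches the text's observation that many first-attempt failures stemmed from single critical sub-competencies.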
All assessment scenarios and scoring rubrics were designed by NFSA subject-matter experts as part of a standardized evaluation process (Appendix A.1). Detailed descriptions of the evaluation structure and criteria are provided in the appendices for reference. Appendix A.1 documents the overall evaluation process and sequence of activities, and Appendix A.2 details the specific competency domains, sub-competencies, and behavioral indicators, along with the scoring guidelines for each. The full simulation scenario narratives and trigger events (including their calibrated difficulty parameters) are presented in Appendix A.5. The certification’s candidate eligibility requirements and the composition of the evaluation panels are described in Appendix A.3, while Appendix A.4 explains the organization and functions of the certification committees that oversaw the assessment process to ensure fairness, objectivity, and transparency.
All data used in this research were fully anonymized and provided by NFSA for secondary analysis. Personal identifiers were removed by NFSA prior to data release to ensure candidate confidentiality. Because the study utilized retrospective, de-identified assessment records from routine certification activities, it did not constitute human subjects research and no formal IRB review was required. Informed consent from participants was waived under institutional policy, as obtaining consent at the time of the high-stakes evaluation was deemed impractical and could have influenced performance. Throughout our analysis, data security and privacy were strictly maintained in accordance with ethical guidelines (each candidate was referenced only by a coded ID, and no individual results can be traced to personal identities).

4. Methods

For a graphical overview of the study design and analytic streams, see Figure 1 (mixed-methods workflow for evaluating IIC competence).

4.1. Study Framework

We adopted a mixed-methods approach that integrated quantitative performance metrics with qualitative analyses of candidates’ communication behaviors. Our primary aim was to identify sub-competencies in which participants scored below the 80% threshold and to explore how these deficits surfaced in on-scene decision-making and radio exchanges. By triangulating numerical data with transcript evidence, we pinpointed recurring weaknesses and their operational impact, thereby aligning our methodology with the outcome patterns presented in Section 5. To ensure scenario realism, each of the five VR disaster simulations was explicitly mapped to a real fire incident in South Korea (see Table A5b in Appendix A.5). For example, the Goshiwon Fire scenario drew on the 2018 Jongno Goshiwon blaze, while the Residential Villa Fire scenario was based on the 2014 Goyang villa incident, both of which exhibited fatal evacuation barriers and rapid smoke spread. This mapping process verified that the simulated conditions—building layouts, hazards, victim placements, and fire behavior—were grounded in actual case data rather than hypothetical constructs. In addition, a structured VR Training Readiness and Debriefing Checklist (Table A5c in Appendix A.5) was implemented before each session to confirm instructor preparation, technical setup, fidelity of scenario triggers, and post-session reflective debrief prompts. These measures substantiate the external validity of the VR design and demonstrate clear, actionable pathways for integrating high-fidelity VR and reflective debriefing into the Korean fire training system.

4.2. Quantitative Evaluation

Across the 2022–2024 assessment datasets, we conducted a systematic review of each candidate’s performance on all 20 sub-competency items. Any sub-competency with a score under 80% for a given candidate was flagged as an “underperforming” item for that individual. Throughout the figures, “unsuccessful” = score < 80%; N = 92 candidates. Bars denote Unsuccessful Count, and the line denotes Mean Score. We then tallied the frequency of underperformance for each sub-competency in each session—essentially counting how many candidates fell below 80% on that item. This produced a profile of which sub-competencies were most commonly failed by the candidates. We tabulated the number and percentage of candidates not meeting the 80% mark for each item and calculated the mean score for each sub-competency as well as the average scores for each major competency domain within each session. These descriptive statistics allowed us to rank the sub-competencies by their failure rates and severity of underperformance. Sub-competencies with high failure frequencies (i.e., a large proportion of candidates scoring below 80%) were identified as notable weak areas in that assessment round. We then compared results across sessions over the 2022–2024 period to determine patterns of recurring weaknesses: any sub-competency that consistently appeared as a failure point across multiple sessions was noted as a persistent skill gap across cohorts. To aid interpretation, sub-competencies were grouped according to their parent competency domain, which helped reveal whether certain broader skill areas (encompassing multiple related sub-skills) tended to yield lower performance overall.
The analytical procedures described here directly informed the findings reported in Section 5—for example, the identification of multiple candidates struggling with communication-related sub-competencies in both sessions emerged from this frequency analysis and is reflected in the Results. Complete details of the scoring system and sub-competency definitions (including which sub-competencies were considered “critical” for pass/fail purposes) are available in Appendix A.2. The exact step-by-step procedures for performance scoring and data tabulation are documented in Appendix A.1.
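The frequency tabulation described in this subsection—counting below-threshold scores per sub-competency, computing mean scores, and ranking by failure frequency—can be sketched as follows. The records and item codes are mock data for illustration only, not the actual NFSA dataset.

```python
# Illustrative sketch of the descriptive tabulation: for each
# sub-competency, count candidates scoring below 80% and compute
# the mean score, then rank by failure count (mock data only).
from collections import defaultdict

THRESHOLD = 80.0

# each record: (candidate_id, sub_competency_code, score_percent)
records = [
    ("C01", "2-4", 74.0), ("C01", "3-4", 82.0),
    ("C02", "2-4", 71.0), ("C02", "3-4", 76.0),
    ("C03", "2-4", 88.0), ("C03", "3-4", 79.0),
]

scores = defaultdict(list)
for _cid, code, score in records:
    scores[code].append(score)

profile = []
for code, vals in scores.items():
    fails = sum(1 for v in vals if v < THRESHOLD)  # unsuccessful count
    mean = round(sum(vals) / len(vals), 1)         # mean score
    profile.append((code, fails, mean))

# rank: most failures first; break ties by lower mean score
profile.sort(key=lambda row: (-row[1], row[2]))
for code, fails, mean in profile:
    print(f"{code}: {fails} below threshold, mean {mean}")
```

The same two statistics (unsuccessful count and mean score) are exactly what the Results tables and figures report per sub-competency.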

4.3. Qualitative Communication Analysis

Alongside the quantitative scoring analysis, we performed a qualitative examination of fireground radio communication transcripts generated during the simulated incident scenarios. All radio exchanges were audio-recorded during the sessions, and our research team produced verbatim (Korean) transcripts and directly analyzed both the audio files and the transcripts. These transcripts, recorded for each candidate, were analyzed to determine how observed performance deficits were reflected in real command communications and decision-making processes. Particular attention was given to cases in which candidates scored below 80% on communication- or coordination-related sub-competencies. We systematically reviewed the transcripts and identified concrete examples of communication breakdowns and decision-making errors that corresponded to the weaknesses revealed in the quantitative analysis. Instances such as unclear or ineffective radio transmissions, delays or omissions in delivering critical updates, insufficient coordination with supporting agencies, and other communication problems were coded and documented. Coding was conducted in a structured Excel-based coding matrix (Microsoft Excel) using utterance-level segments as the unit of analysis, with a predefined codebook referenced in Appendix A.5. This was a census analysis of all candidates (N = 92). Two coders independently dual-coded all transcripts; disagreements were reconciled in scheduled consensus meetings documented in a coding-decision log, with an a priori ≥80% agreement threshold. Inter-coder reliability was quantified with Cohen’s κ, which we report by category along with agreement (%) and exemplar excerpts (see Appendix A.6). For example, when a candidate fell short on a sub-competency like “Radio communication control,” we examined whether the transcript displayed issues such as overlapping radio traffic, unanswered transmissions, or ambiguous orders. 
Detailed procedures for this transcript analysis—including transcription methods, the coding framework used to classify lapses, and illustrative excerpts for each category—are presented in Appendix A.5, which also includes an overview of the VR simulation design and scenario timeline. For contextual alignment between content and communications, Appendix A.5 (Table A5a) provides per-scenario operational factors (e.g., hazards, spread, victim layout) that informed expected radio content at key injects; Appendix A.5 (Table A5b) documents the real-incident anchors used to validate the scenario logic; and Appendix A.5 (Table A5c) lists the readiness/debrief items used to standardize instructor prompts, technical checks, and reflective questioning. In addition, Appendix A.3 outlines the expected communication benchmarks for each scenario segment, providing evaluators with reference points for anticipated performance at critical stages of the simulations.
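The inter-coder reliability check described above rests on Cohen's κ, which compares observed utterance-level agreement against agreement expected by chance. A minimal sketch, with placeholder category labels rather than the study's actual codebook:

```python
# Minimal sketch of Cohen's kappa for two coders labelling the same
# utterances: kappa = (observed - expected) / (1 - expected), where
# expected agreement comes from the coders' marginal label frequencies.
# Labels below are illustrative, not the study's codebook.
from collections import Counter

def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["unclear", "delay", "delay", "ok", "ok", "ok"]
b = ["unclear", "delay", "ok",    "ok", "ok", "ok"]
print(round(cohens_kappa(a, b), 2))  # prints 0.71
```

Values around 0.61–0.80 are conventionally read as substantial agreement, which is the range the study's reported κ = 0.74 falls into.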

4.4. Integration of Findings

We combined the quantitative and qualitative results to develop a comprehensive understanding of performance gaps. Sub-competencies with the highest failure rates were cross-referenced with the communication transcripts to determine whether the same deficiencies were evident during the simulations. This process enabled us to verify whether statistical weaknesses (e.g., poor performance on a decision-making task) were also manifested in practice as command problems such as confusion or delays in radio communication. By employing this mixed-methods approach, the study directly identified critical competency deficits and clarified their operational consequences. The patterns derived from this analysis are presented in Section 5, where we outline the sub-competencies most frequently falling below standard and illustrate how these shortcomings emerged during incident simulations. In essence, our methodological design aimed not only to measure the areas in which candidates underperformed but also to explain the underlying causes, thereby providing a foundation for the targeted improvements discussed later in the paper.
Appendix A.1 (Evaluation Process) documents the step-by-step evaluation workflow (briefing→VR on-scene command→Q&A), pass criteria (≥80% overall with ★-indicator rule), and the assessor/adjudication protocol.
Appendix A.2 (Behavioral Indicators & Scoring Framework) lists the five domains and 20 sub-competency indicators with High/Mid/Low anchors mapped to percentages (pass/fail <80% rule, ★ key indicators defined).
Appendix A.3 (Eligibility & Evaluation Panel) details candidate eligibility, panel composition (≥3 evaluators including ≥1 external), and any post-scenario interview procedures; Appendix A.4 (Committees & Oversight) describes the committee structure and adjudication/appeals oversight ensuring fairness and objectivity.
Appendix A.5 (Scenario Design, Validation, & Readiness) provides full scenario narratives and trigger timelines, the real-incident mapping for each VR scenario (Table A5b), and the VR Readiness & Reflective Debrief Checklist used before/after sessions (Table A5c).
Appendix A.6 (Qualitative Coding Reliability & Exemplars) provides the codebook excerpt, anonymized KR exemplar utterances (optional EN gloss), and inter-coder agreement metrics by category (Agreement % and Cohen’s κ; total 1,226 utterances, overall agreement 87.39%, κ = 0.74).

5. Results

5.1. Overall Candidate Performance

As presented in Figure 2, between 2022 and 2024, a total of 92 candidates participated in the Initial Incident Commander certification assessments. Scenario context for interpreting the observed performance patterns (e.g., common hazards, spread conditions, and communication demands) is summarized in Table A5a–c, which specifies the scenario structure, real-incident anchors, and the readiness/debriefing checklist. Of these, 67 candidates achieved certification, corresponding to an overall success rate of 72.8%, while 25 candidates did not meet the certification criteria (a 27.2% failure rate). These aggregate figures encompass all assessment sessions in that period and provide a general indication of candidate performance outcomes. The majority of participants thus attained the certification, but a substantial minority did not, underscoring the rigorous standards of the program. This section further details performance at the level of specific sub-competencies to identify which skill areas were most commonly problematic and to compare performance trends across different competency domains.

5.2. Sub-Competency Performance Breakdown

Table 1 illustrates the performance of candidates on each evaluated sub-competency, listing the domain, sub-competency code and description, the number of candidates who failed that sub-competency, and the mean score for that item. For clarity, the sub-competencies are presented here in descending order of failure frequency.

5.2.1. Highest Failure Sub-Competencies

The greatest number of unsuccessful performances was observed for three sub-competencies, each with 39 candidates not meeting the required standard. In the Response Activities domain, Crisis Response & Progress Management (2-4) was failed by 39 out of 92 candidates, with an average score of 74.2 (the lowest mean score among all sub-competencies). An equally high failure count (39) was recorded for Ventilation (3-4) in the Firefighting Tactics domain, which had a mean score of 75.6. Similarly, Apparatus Placement (2-1) in Response Activities saw 39 unsuccessful attempts, with an average score of 76.4. Just below these, Standard Response Tactics (2-2) in Response Activities had 38 candidates unable to complete the task successfully, with an average score of 78.2. These results indicate that several critical response and tactical tasks posed considerable difficulty for a large portion of the candidates, as reflected by both high failure counts and comparatively lower average scores (in the mid-70s range).
Several other sub-competencies also exhibited high failure frequencies. In the Response Activities domain, Unit Leader Task Execution (2-6) was not successfully executed by 36 candidates (mean score 79.9), and Identification and Management of Fireground Elements (2-5) was failed by 33 candidates (avg. score 78.9). Two sub-competencies each saw 32 failures: Firefighting Water Supply (3-1) in Firefighting Tactics (avg. score 78.2) and Victim Information Gathering & Dissemination (1-4) in Situation Assessment (avg. score 78.2). Just below these, Assigning Tasks to Later-Arriving Units (2-3) in Response Activities had 31 failures with a higher average score of 83.7, and Requesting Additional Firefighting Resources (1-5) in Situation Assessment had 30 failures (avg. 82.3). Each of the above sub-competencies was associated with roughly one-third or more of the candidates struggling to meet expectations, highlighting them as consistent challenge areas across multiple cohorts.

5.2.2. Moderate Failure Sub-Competencies

A second tier of sub-competencies showed moderate failure counts, generally between about one-fifth and one-quarter of the candidates. For instance, in the Communication domain, Information Delivery Effectiveness (4-2) was not successfully demonstrated by 27 candidates (average score 82.3). The Achievement of Core Objective domain’s Adequacy of Achieving Rescue Objectives (5-1) had 26 failures (80.7 avg.), indicating that over a quarter of the participants fell short in fully accomplishing the stated rescue goals. Two sub-competencies had 25 candidates fail to meet the standard: Crew Safety Management (5-2) with an average score of 82.3, and Initial Situation Report & Command Declaration (1-3) in Situation Assessment, which despite a failure count of 25 had a somewhat higher mean score of 84.2. These mid-ranking items suggest that while the majority of candidates performed them satisfactorily, a significant minority experienced difficulties.

5.2.3. Lower Failure Sub-Competencies

The remaining assessed sub-competencies had comparatively lower failure incidence, indicating stronger performance by most candidates in these areas. In Firefighting Tactics, Hose Deployment, Water Application & Nozzle Placement (3-3) resulted in 22 unsuccessful attempts (average score 82.2). The Communication sub-competency, Radio Communication Principles (4-1), likewise had 22 failures, with a mean score of 84.1. Another communication task, Situation Handover to Arriving Chief Officer (4-3), had 19 candidates (approximately 21%) who did not perform it successfully, with an average score of 86.7. In the Situation Assessment domain, Situation Dissemination en Route (1-2) was failed by 12 candidates, with a relatively high average score of 92.4, suggesting that most participants executed this task well aside from a handful of outliers. The Firefighting Tactics sub-competency, Forcible Entry and Interior Attack Initiation (3-2), had 10 failures (avg. score 88.7). Finally, the strongest performance was observed for Information Gathering and Mission Sharing en Route (1-1) in Situation Assessment, which had the lowest failure count—only 5 candidates did not meet the standard on this item—and the highest mean score of 93.1. This sub-competency represents the top end of performance, as virtually all participants mastered it.

5.2.4. Domain-Level Patterns

As shown in Figure 3, when results are aggregated by domain, distinct performance patterns emerge. The Response Activities domain (Domain 2) accounted for the largest share of sub-competency failures in total. This domain encompasses six sub-competencies, all of which ranked among the upper half by failure count; in fact, five of the six Response Activities tasks had failure counts of 33 or higher. Cumulatively, Response Activities sub-competencies contributed 216 failure instances (out of the total 542 recorded across all tasks), and their average scores tended to be on the lower end (mostly in the 75–80 range). In contrast, the Situation Assessment domain (Domain 1) included five sub-competencies with a wide range of outcomes: while one Situation Assessment task (1-1) had the best overall performance (only 5 failures, 93.1 mean), another (1-4) was among the more challenging (32 failures). In total, Situation Assessment tasks accounted for 104 failures combined. The Firefighting Tactics domain (Domain 3) comprised four sub-competencies and showed a similar total failure count (103 combined failures) to Situation Assessment. Firefighting tasks included one of the most failed items (Ventilation, 3-4) as well as one of the higher-performing ones (3-2, with only 10 failures), yielding domain average scores in the low 80s. The Communication domain (Domain 4), with three sub-competencies, had fewer overall failures (68 combined); none of its tasks exceeded 27 failures, and their mean scores were relatively high (roughly 82–87). Finally, the Achievement of Core Objective domain (Domain 5), containing two summary outcome measures, had the fewest total failures (51 combined). The Achievement of Core Objective sub-competencies’ average scores were moderate (around 81–82), and their failure counts (25 and 26) fell in the midrange of the distribution.
To address these differences, we conducted a linear mixed-effects model (LMM) analysis to compare sub-competency scores across all five categories (Situation Assessment, Response Activities, Firefighting Tactics, Communication, and Achievement of Core Objective) over the five sessions (from 2022 to 2024). The LMM results confirmed significant differences: there was a main effect of competency Category (χ2(4) = 38.20, p < 0.001) and a main effect of Session (χ2(4) = 23.57, p < 0.001). More importantly, the Category × Session interaction was statistically significant (χ2(16) = 37.30, p = 0.002), indicating that the pattern of competency scores across the five categories varied significantly by session cohort. In other words, the relative strengths and weaknesses in competency scores were not consistent across the different sessions. For example, in the first cohort (2022), Firefighting Tactics and Situation Assessment scores were among the highest (each ~0.83 on a 0–1 scale) and significantly higher than Response Activities (0.75, p < 0.01). By contrast, in later cohorts the pattern shifted—notably in the 2023-H2 and 2024-H2 sessions, Firefighting Tactics scores dropped relative to other competencies and were significantly lower than Response Activities (e.g., 2024-H2 means 0.747 vs. 0.787, p = 0.001). Meanwhile, Communication and Achievement of Core Objective remained consistently strong across sessions (with no significant decline). These results show where the competency patterns differ significantly: certain categories (such as Firefighting Tactics) follow a different trend over time than others, providing statistical evidence of the differences across domains.
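The paper does not specify the software used for the LMM, but the reported likelihood-ratio statistics can be sanity-checked against the χ² reference distribution: with 5 categories and 5 sessions, the main effects have (5 − 1) = 4 df and the interaction (5 − 1) × (5 − 1) = 16 df. Both are even, so the χ² survival function has a closed form computable in pure Python:

```python
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with EVEN df (closed form):
       exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!"""
    assert df > 0 and df % 2 == 0
    h = x / 2.0
    return math.exp(-h) * sum(h**k / math.factorial(k) for k in range(df // 2))

# Likelihood-ratio statistics as reported for the Category x Session LMM
p_category    = chi2_sf_even_df(38.20, 4)   # main effect of Category
p_session     = chi2_sf_even_df(23.57, 4)   # main effect of Session
p_interaction = chi2_sf_even_df(37.30, 16)  # Category x Session interaction
```

The interaction term evaluates to ≈0.002, matching the reported p = 0.002, and both main effects fall well below 0.001, consistent with the p < 0.001 values in the text.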

5.2.5. Descriptive Statistics

Across all 20 sub-competencies assessed, the number of candidates failing each task ranged from 5 (minimum) to 39 (maximum). On average, approximately 27 candidates (about 29% of the cohort) did not successfully complete a given sub-competency. The sub-competency mean scores spanned from 74.2 at the low end (for the most failed item) up to 93.1 (for the best-performed item). The arithmetic mean of the sub-competency average scores was ~82, indicating that, in aggregate, participants attained roughly four-fifths of the total points on a typical task. In summary, while overall certification success was achieved by nearly three-quarters of the candidates, the granular results reveal considerable variability in performance across different skill areas: certain operational response tasks consistently proved difficult for a large subset of candidates, whereas foundational situation assessment and communication tasks were performed strongly by the majority.
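These descriptive figures follow mechanically from the per-item failure counts and mean scores reported in Sections 5.2.1–5.2.3; the sketch below re-derives them from values transcribed out of the text (pure Python, no external dependencies):

```python
from statistics import mean

# (failure count, mean score) per sub-competency, transcribed from Section 5.2
SUB_COMPETENCIES = [
    (39, 74.2), (39, 75.6), (39, 76.4), (38, 78.2), (36, 79.9),
    (33, 78.9), (32, 78.2), (32, 78.2), (31, 83.7), (30, 82.3),
    (27, 82.3), (26, 80.7), (25, 82.3), (25, 84.2), (22, 82.2),
    (22, 84.1), (19, 86.7), (12, 92.4), (10, 88.7), (5, 93.1),
]

fails = [f for f, _ in SUB_COMPETENCIES]
scores = [s for _, s in SUB_COMPETENCIES]

total_failures = sum(fails)               # 542 failure instances overall
mean_failures = mean(fails)               # ~27 candidates per task
failure_rate_pct = 100 * mean_failures / 92   # ~29% of the 92-candidate cohort
mean_of_means = mean(scores)              # ~82 points on a typical task
```

These reproduce the totals quoted in the text: 542 failures across 20 items, per-item failure counts from 5 to 39, and item means from 74.2 to 93.1, confirming the internal consistency of the reported breakdown.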

6. Discussion

As shown in Figure 4, our results show that “Crisis Response & Progress Management (2-4)” recorded 39 unsuccessful cases with an average score of 74.2, remaining below the 80% pass threshold. In this cohort, lapses in prioritization under time pressure were pivotal: in one case, diverting crews from the dominant 3rd/4th-floor fire to a secondary hazard delayed knockdown and culminated in a thinner (solvent) explosion, exemplifying failure to maintain incident timelines and protect critical objectives. Quantitatively, Crisis Management & Leadership and Progress Management emerged as the strongest discriminators of pass/fail, showing the largest performance gaps and the highest predictive weights across correlation, logistic, and ensemble models, confirming that hazard dominance and temporal control are central to certification success [44]. These findings are consistent with Naturalistic Decision Making research, which shows that under high time pressure, ICs often default to reflexive SA→PE choices; brief decision-control checks and metacognitive prompts are required to realign actions with life-safety and hazard-dominance goals [45]. In this study, SA→PE denotes a causal chain from Situation Awareness—the perception of cues, comprehension of their meaning, and projection of near-term evolution—to Plan Execution under time pressure. Within RPD/NDM, expert commanders typically recognize a situation by matching cues to prototypical patterns, then mentally simulate a workable course of action rather than comparing multiple options. When SA is degraded (missed/ambiguous cues, weak projection), recognition becomes unreliable and mental simulation degrades, yielding brittle or mistimed execution.
Our data reflect this cascade: shortfalls in 2-4 Crisis Response & Progress Management indicate slippage in maintaining temporal control as conditions evolve (projection→execution), repeated ventilation mis-sequencing in 3-4 Ventilation suggests cue-recognition and simulation breakdown (e.g., opening before attack coordination), and initial 2-1 Apparatus Placement errors show early SA deficits that propagate into resource-flow constraints and delayed tactical execution. Framing the observed weaknesses as SA-to-execution breakdowns provides a mechanistic explanation for the patterns we report and motivates targeted decision-control prompts (goal check, consequence scan, time markers) during training and assessment. Training and assessment therefore need to incorporate multi-threat scenarios that demand explicit priority statements, progress markers (conditions–actions–needs), and scheduled decision-control prompts. Scoring should weight timely confinement of the principal fire/hazmat source and require documented progress reports; these measures are likely to reduce mis-prioritization and improve IIC pass rates, consistent with the competency-outcome analysis.
Our results show that “Ventilation (3-4)” recorded 39 unsuccessful cases with an average score of 75.60, remaining below the pass threshold. Delayed or uncoordinated ventilation was a recurrent failure pathway: crews frequently advanced before any vent was established, sustaining zero-visibility and high-temperature conditions that slowed search and knockdown. Fire-dynamics literature emphasizes that ventilation must be purpose-driven and sequenced with attack; premature opening can accelerate a ventilation-controlled fire, whereas coordinated tactical ventilation or anti-ventilation (e.g., door control) improves tenability and operational pace [46]. These observations align with incident-command decision research, which indicates that under time pressure, commanders tend toward intuitive SA→PE responses, risking priority drift, while incorporating brief decision-control checks (stating goals, anticipating consequences, weighing risk) yields more reflective choices and improved outcomes [44]. Training and assessment therefore require early declaration of a ventilation strategy (horizontal, vertical, positive pressure ventilation, or anti-ventilation), explicit sequencing with hoseline placement, assignment of a ventilation group, and evaluation of time-to-ventilation as a progress-management metric. Embedding these elements is essential to reduce misprioritization, shorten time to knockdown, and improve IIC pass rates, consistent with evidence from simulation-based command training.
The evaluation confirms that Apparatus Placement (2-1) recorded 39 unsuccessful cases with an average score of 76.40, remaining below the pass threshold. Suboptimal placement repeatedly precipitated downstream failures in crisis response and progress management. Engines positioned without first securing or verifying a reliable supply forced low-pressure operations and improvised shuttles, delaying knockdown and elevating firefighter risk. Best-practice guidance emphasizes that first-due placement must be driven by water supply and tactical access—engines should be positioned to connect to a hydrant and pump attack lines while preserving space for aerials and later-arriving units. When hydrant integrity is uncertain or lay lengths exceed practical limits, early relay or alternative water-supply configurations are indicated to maintain adequate intake pressure and avoid cavitation, thus protecting the incident timeline [47]. These findings show that apparatus placement is a decisive early factor in maintaining operational tempo. Training and assessment should embed “position-to-water” behaviors, including pre-arrival hydrant checks and contingency planning; a water-supply officer should be explicitly assigned, and time-to-secure-supply measured as a performance benchmark. Decision-control prompts should also be integrated so placement remains aligned with life safety and hazard dominance. Institutionalizing these practices makes initial apparatus placement function as a primary progress-management lever rather than a reactive logistical adjustment.
Evaluation indicates that “Standard Response Tactics (2-2)” accounted for 38 unsuccessful cases with an average score of 78.20, falling below the pass threshold. Across IIC evaluations, deviations from foundational SOPs—particularly delayed Rapid Intervention Team (RIT) establishment and omission of safety controls—were consistently associated with disorganization, slower task progression, and heightened operational risk. Contemporary doctrine underscores that an immediately available RIT is required at working structure fires and specifies a minimum four-member team, leader accountability, and regular 360° surveys. Jefferson Township guidance likewise calls for RIT staging near the command post, comprehensive 360° assessment, and use of accountability boards or passport systems to prevent oversight [48]. The analysis demonstrates that these prescriptions align with broader standards: OSHA 29 CFR 1910.134 (two-in/two-out) and NFPA 1500/1561 are cited in Executive Fire Officer research as the benchmark “standard of care,” explicitly recommending formalized SOPs, competency-based training, and the integration of on-scene accountability aids [49]. Programs should embed checklist-driven verification steps such as RIT establishment, two-in/two-out confirmation, and utilities/exposures control, and should audit metrics such as “time-to-RIT” and “time-to-PAR” as progress indicators. Institutionalizing these practices transforms SOP compliance into measurable decision-control behaviors, thereby stabilizing the incident timeline and improving IIC pass rates.
IIC evaluations showed that fragile water-supply practices—single-hydrant dependence, absent secondary lays, and slow changeover during failures—were decisive contributors to unsuccessful performances. These findings echo disaster lessons at the water–fire interface: when access, power, or policy coordination delay supply, system pressure collapses and suppression effectiveness degrades, underscoring the need for pre-planned redundancy and alternative sources [50]. In parallel, regulatory and engineering critiques emphasize operational and network redundancy (e.g., simultaneous draw from two adjacent hydrants for higher fire-flow demands), verification of capacity under peak-demand conditions, and recourse to auxiliary supplies when distribution limits are reached—principles directly aligned with our observed gaps [51]. Programmatically, training and assessment should benchmark time-to-secure-supply, require early designation of a water-supply officer, and drill rapid hydrant-failure switchovers (dual hydrants, LDH changeover, relay pumping). Progress management must include explicit radioed supply status and trigger points for transitioning to relay/tanker or alternative intakes to prevent flow interruptions. Embedding these behaviors into SOPs—and scoring them (e.g., secure sustainable supply ≤2–3 min after “water on fire”)—is likely to reduce attack suspensions, limit fire growth, and improve IIC pass rates by turning water-supply continuity into a measurable command competency.
Evaluation indicates that the sub-competency, “Obtaining & Disseminating Victim Information (1-4),” recorded 32 unsuccessful cases with an average score of 78.20, below the 80% pass threshold. Across IIC evaluations, failure to rapidly obtain and broadcast reliable occupant information was a consistent determinant of poor performance. Without early interviews with building staff and acquisition of master keys/rosters, search patterns became unguided, elongating time-to-contact and increasing exposure. ICS doctrine places life safety first and assigns the Planning Section (Situation Unit) to collect, manage, and disseminate incident-relevant data and to set information-reporting schedules; the ICS organizational chart (p. 2) and role descriptions (p. 8) show how these flows are integrated into the IAP and progress checks [52]. The IAFF Incident Command module likewise codifies the APIE cycle—analyze, plan, implement, evaluate—and requires initial “establish command,” “survey the scene,” and “assess vulnerable populations,” followed by continual progress evaluation; each step depends on timely victim intelligence shared to operating units [53]. Programmatically, a “victim-information protocol” needs to be mandated: immediate interview/roster capture, master-key retrieval, and radio dissemination within minutes of arrival; Planning/Situation Unit ownership of updates; and checklist/command-board prompts tied to PARs. Embedding these behaviors as graded metrics is expected to tighten search targeting, compress rescue timelines, and improve IIC pass rates.
These findings show that the sub-competency “Hazard/Utility Identification & Management (2-5)” registered 33 unsuccessful cases with a mean score of 78.90, which falls below the 80% pass threshold. In our IIC dataset, this sub-competency indicates inconsistent detection and control of critical scene elements (hazmats, gas/electric utilities, collapse cues). These lapses produced “surprise” events—including secondary explosions—and eroded temporal control of the incident, prolonging exposure for both civilians and crews. Consistent with operational guidance, early size-up should establish and continually reassess hazard controls (e.g., utility isolation, collapse/exclusion zones, defensive posture where indicated) and communicate them to all divisions [54]. The decision-science literature helps explain these errors: under high time pressure and uncertainty, IICs frequently default to recognition-primed, intuitive sequences (SA→PE), which elevates the risk of cue neglect; brief decision-control prompts (goal check, risk–benefit, consequence scan) increase alignment between actions and life-safety/hazard-dominance goals [45]. Training and assessment therefore need to mandate checklist-based hazard prompts on arrival, assign Safety/Planning to maintain a live hazard/utility log, and score time-to-hazard-control as a progress metric (with radio verification). Embedding these behaviors operationalizes hazard management as a measurable decision-control practice, which our evaluations suggest will reduce escalation and improve IIC pass rates.
The analysis demonstrates that the sub-competency “Execution of Unit Commander Duties (2-6)” registered 36 unsuccessful cases with a mean score of 79.90, marginally below the 80% pass threshold. Failures clustered around dual-hatting: officers tried to command the incident while directly supervising crew tactics, leading to supervision gaps (e.g., a firefighter operating alone) and delayed tactical adjustments after critical events. ICS doctrine emphasizes early assumption of command, span-of-control management, and delegation (sector/tactical supervisors or an aide) so the IIC can maintain a common operational picture, set priorities, and conduct regular accountability/PAR checks; when span of control is exceeded or handovers are informal, oversight and safety degrade [55]. These results align with research on hierarchical networks in crises: clear, unified command with explicit role assignment reduces “independent action,” accelerates decisions, and stabilizes progress; conversely, ambiguous command or delayed delegation under time pressure fragments supervision and slows incident timelines. Therefore, training should require immediate delegation upon assuming command, disciplined handover procedures, and measured spans of control, evaluated by time-stamped accountability and progress markers [56].
Table 2 illustrates the performance of candidates on each evaluated sub-competency, listing the domain, sub-competency code and description, the number of candidates who failed that sub-competency, and the mean score for that item. For clarity, the sub-competencies are presented in descending order of failure frequency.

7. Conclusions

7.1. Theoretical Implications

This study contributes to academia in several ways by deepening theoretical understanding of competency-based evaluation in incident command, with particular emphasis on crisis response and progress management. Whereas prior literature has consistently underscored the importance of adaptive decision-making under uncertainty [16,17], our findings extend these theoretical insights by demonstrating how such deficiencies manifest empirically in measurable sub-competencies. Specifically, the recurrent weaknesses observed among unsuccessful candidates confirm theoretical claims that recognition-primed and naturalistic decision-making are vulnerable to breakdown when situational complexity exceeds a commander’s experiential template [2,3,57]. In other words, NDM theory posits that experience enables rapid, effective decisions in familiar scenarios [58], so when an incident falls outside prior experience, intuitive decision strategies can fail—a pattern our results clearly identified.
A second implication lies in the refinement of competency frameworks and behavioral marker systems. Previous research has established behavioral markers such as situational awareness, communication clarity, and task delegation as proxies for command competence [17,18]. This study operationalizes those abstract constructs into quantifiable evaluation outcomes, thereby strengthening the theoretical linkage between frameworks like THINCS and observable performance in VR simulations. By doing so, the study advances theory beyond conceptual definitions and demonstrates how such competencies can be systematically identified, scored, and analyzed in high-fidelity environments.
Third, the results have implications for simulation-based learning theories. The literature has widely advocated VR and scenario-driven training as effective tools for preparing incident commanders [7,8,19]. Yet our analysis shows that deficiencies in progress management persist even after simulation exposure, suggesting that current theoretical models insufficiently account for the temporal dimension of decision-making. Crisis environments evolve dynamically, and theories of experiential learning must therefore integrate mechanisms for continuous reassessment and resource reallocation over time. By emphasizing temporal adaptability, this study extends existing simulation theories and underscores the need for integrating longitudinal complexity into training paradigms [20,32].
Another theoretical contribution arises in the debate between standardization and flexibility in high-risk operations. Previous studies have argued that strict adherence to standardized operating procedures ensures coordination and safety [21,34], whereas others contend that adaptability and improvisation are indispensable for managing non-routine crises [22,23,24,43]. Our findings empirically demonstrate that excessive reliance on rigid SOPs, without adaptive judgment, directly correlates with unsuccessful performance. This provides theoretical evidence supporting hybrid perspectives: incident command systems should be conceptualized not as prescriptive rulebooks but as adaptive frameworks enabling flexible decision cycles.
Finally, the study broadens theoretical discourse on organizational resilience and crisis leadership. Weaknesses in crisis response and progress management highlight that theoretical models must move beyond individual cognition to incorporate institutional enablers such as training culture, feedback loops, and accountability structures. This aligns with prior scholarship on resilience and adaptive leadership, which stresses the integration of organizational systems with individual decision-making capacities [25,26,27,28]. By situating individual-level deficiencies within broader organizational contexts, this research contributes to the evolving theory of resilience in emergency management.
In summary, this study advances theory in several domains: (1) Decision-Making Under Uncertainty: It reinforces and extends naturalistic decision-making theories by showing how and why expert intuition can break down in unfamiliar, high-complexity situations. (2) Competency Frameworks: It demonstrates how abstract competency frameworks (e.g., command skills models) can be translated into measurable indicators of performance. (3) Simulation-Based Learning: It refines experiential learning theories by introducing the importance of temporal complexity and continuous adaptation during simulation-based training. (4) Standardization vs. Flexibility: It clarifies the balance between strict SOP adherence and adaptive improvisation, suggesting that a flexible application of standardized frameworks yields better outcomes. (5) Organizational Resilience: It expands the theoretical discussion to include organizational-level factors (culture, feedback, accountability) that bolster individual decision-making in crises.

7.2. Practical Implications

The study’s outcomes highlight the need for targeted interventions across the training ecosystem. Persistent weaknesses in crisis response and progress management stem from gaps in both cognitive and procedural competence; addressing these shortfalls requires coordinated efforts by instructors, curriculum designers, practitioners, and institutions.
For lecturers and VR facilitators, the findings imply that training must emphasize realism, adaptability, and reflection. Scenario-driven exercises should simulate dynamic, resource-constrained environments, encouraging trainees to practice recognition-primed decision-making and situational monitoring. Debriefs should focus on decision rationale, communication effectiveness, and adaptability. Instructors should continually update simulations with lessons from recent incidents and collaborate with human-factors specialists to integrate psychological insights into training.
Curriculum designers should align programs with competency frameworks that operationalize incident-command principles such as predetermined internal alignment, scalability, and teamwork synergy (apps.usfa.fema.gov). Courses should embed non-technical skills—leadership, hazard recognition, and communication—alongside technical procedures, and use mixed assessments that capture both quantitative performance and qualitative indicators (e.g., radio communication quality). Scenarios should scale in complexity to develop mental models gradually and be tailored to local risk profiles.
For initial incident commanders and firefighters, early and continuous exposure to high-fidelity simulations can strengthen mental templates and enhance situational awareness. Practicing radio communications under pressure helps to reduce misallocation of resources and response delays, while reflective review of after-action reports fosters adaptive learning. Maintaining physical and cognitive readiness is essential, and agencies should provide refresher courses that incorporate emerging decision-support technologies and tools.
Finally, training institutions must invest in VR/AR infrastructure, analytics for performance evaluation, and professional development for instructors. Mixed-methods evaluation frameworks—combining quantitative metrics with qualitative feedback—can identify emerging competency gaps and support program improvement. Implementation within existing systems can follow a standard session script: (1) Pre-brief & readiness checks (Table A5c); (2) VR scenario with timed decision-control prompts at pre-set injects (e.g., T + 5′ ‘state priority’, T + 10′ ‘CAN progress report’); (3) Objective progress metrics logged during play (e.g., time-to-ventilation, time-to-RIT, time-to-secure-supply, CAN cadence); and (4) Reflective debrief using synchronized audio/transcript replay and score sheets to link decisions to outcomes. Instructor enablement (short faculty workshop), a common prompt list, and a simple analytics sheet allow integration into current academy and station-level drills without redesigning curricula. Partnerships with fire departments, research centers and technology developers are critical for ensuring curricula remain aligned with operational realities. Systematic evaluation of training efficacy will help to sustain a resilient and capable incident-command workforce.
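As an illustration only (the prompt texts and timings are the example injects named in the session script above; the scheduling code, names, and extra injects are hypothetical), the timed decision-control prompts in step (2) can be represented as a simple schedule that a facilitator dashboard or drill script could consume:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Inject:
    t_offset_min: int   # minutes after scenario start
    prompt: str         # decision-control prompt read to the candidate
    metric: str         # progress metric logged at this point

# Example session script following the pre-brief -> scenario -> debrief flow;
# the last two injects are hypothetical additions for illustration
SESSION_INJECTS = [
    Inject(5,  "State current priority",          "priority_declared"),
    Inject(10, "Give CAN progress report",        "can_report_cadence"),
    Inject(15, "Confirm RIT establishment",       "time_to_rit"),
    Inject(20, "Verify sustainable water supply", "time_to_secure_supply"),
]

def due_injects(elapsed_min, delivered):
    """Return injects now due, in time order, excluding any already delivered."""
    return [i for i in sorted(SESSION_INJECTS, key=lambda i: i.t_offset_min)
            if i.t_offset_min <= elapsed_min and i not in delivered]
```

Encoding the script this way keeps the prompt cadence identical across cohorts and sessions, which is what makes the logged metrics (CAN cadence, time-to-RIT, time-to-secure-supply) comparable between candidates.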

7.3. Limitations and Future Research Directions

This study has several limitations that should be considered when interpreting the findings. First, the analysis relied on evaluation data from the 2022–2024 IIC certification program in a single national context, which may limit generalizability. The sample size and distribution across sub-competencies might not fully capture the diversity of incident types or operational environments. Second, the evaluation framework emphasized individual performance metrics and did not systematically incorporate team-level interactions or organizational culture, both of which can affect incident command effectiveness. Third, while radio communication transcripts were analyzed, they may not represent the full range of communication challenges encountered in live incidents; further, some qualitative assessments were based on self-reported data, which can introduce bias. Moreover, the study’s exclusive reliance on predetermined scoring rubrics may overlook nuanced indicators of performance, such as stress management or ethical decision-making.
Future research should address these limitations by adopting multi-jurisdictional and longitudinal designs to test whether the observed performance patterns generalize across regions and over time. Comparative studies could explore how differing doctrine, resource availability, and organizational culture influence crisis-response competencies. Researchers should also evaluate the efficacy of various training interventions, such as adaptive simulation scenarios, integrated team drills, and VR-based cognitive training, in enhancing specific sub-competencies. Given that complex emergencies require predetermined internal alignment, scalability, and teamwork synergy, future work should investigate how training can foster these attributes across organizational levels. Longitudinal studies assessing the retention of skills acquired through simulation and VR training will likewise be essential for understanding the durability of learning, as will studies examining how physical and cognitive training interact over time. Finally, incorporating mixed methods, including qualitative interviews, physiological measurements, and performance analytics, could provide a richer understanding of the cognitive, emotional, and physical factors that underpin successful incident command, thereby informing more nuanced and effective training programs. In particular, future work should integrate physiological indicators of cognitive load and attention, e.g., heart rate/HRV (stress/arousal), eye-tracking (fixations/saccades), and pupillometry (effort), alongside performance and transcript data, to triangulate how workload and attentional control relate to sub-competency outcomes in VR command tasks (with appropriate ethics and data-protection controls).

Author Contributions

Conceptualization, J.-c.P.; methodology, J.-c.P.; software, J.-c.P.; validation, J.-c.P. and J.-c.Y.; formal analysis, J.-c.P.; investigation, J.-c.P.; resources, J.-c.P.; data curation, J.-c.P.; writing—original draft preparation, J.-c.P. and J.-c.Y.; writing—review and editing, J.-c.P.; visualization, J.-c.P.; supervision, J.-c.Y.; project administration, J.-c.Y.; funding acquisition, J.-c.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by a grant from the National Fire Research Institute.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are available from the corresponding author upon reasonable request.

Acknowledgments

This study was supported by a grant from the National Fire Research Institute. All individuals mentioned in this section have provided their consent to be acknowledged.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Evaluation Process for Initial Incident Commander Certification

Table A1. Evaluation Process for Initial Incident Commander Certification.
Random assignment of five types of virtual disaster scenario (5 min) | Virtual Reality On-Scene Command (20 min) | Interview (Q&A) (5 min)
  • Sequence & duration: Briefing (5 min) → VR On-Scene Command (20 min) → Q&A (5 min).
  • Components
    • On-scene command: Evaluates situation reporting, response activities, life rescue, and crew safety.
    • Q&A: Provides additional assessment for competencies not fully verified during the VR segment.
  • Evaluation indicators: 5 evaluation areas comprising 20 behavioral indicators in total.
  • Pass criteria: A candidate passes with a total score of 80/100 or higher and no “Low” rating on any key behavioral indicator from two or more assessors.
  • Assessor/Evaluator Procedures & Adjudication. Each evaluator scores independently using the standardized rubric. For ★ key indicators, a narrative justification is mandatory whenever a Low rating is assigned. If two or more evaluators assign Low on any ★ indicator, the candidate is automatically failed regardless of other scores. Such cases are re-reviewed in a panel consensus meeting to confirm rationales, ensure scoring integrity, and finalize the outcome. Adjudication notes (rationale, time stamps, final decision) are recorded and archived with the candidate’s evaluation record.
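As a reading aid, the pass/fail rule above can be expressed as a short decision function. This is an illustrative sketch of the published criteria (total score ≥ 80/100 and no ★ key indicator rated Low by two or more evaluators), using the ★ codes 1-3 and 5-1 from Table A2; it is not the academy’s actual scoring software.

```python
# Illustrative sketch of the Table A1 pass criteria; the star (key) indicator
# codes 1-3 and 5-1 follow Table A2. Not the academy's actual scoring software.
KEY_INDICATORS = {"1-3", "5-1"}

def adjudicate(total_score: float, ratings: dict) -> str:
    """ratings maps an indicator code to the list of H/M/L ratings
    assigned by the independent evaluators (three or more)."""
    for code in KEY_INDICATORS:
        if ratings.get(code, []).count("L") >= 2:
            # Automatic fail; the panel consensus meeting confirms the rationale.
            return "fail"
    return "pass" if total_score >= 80 else "fail"
```

Note that the Low-on-★ rule dominates: a candidate scoring above 80 overall is still failed when two or more evaluators rate a key indicator Low.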

Appendix A.2. Assessment Criteria and Behavioral Indicators

  • For any “Low” rating, the assessor must clearly state the reason, emphasizing its importance and implications for field command.
  • Record detailed opinions on the candidate’s judgment, communication clarity, tactical command, and leadership qualities.
  • Specifically describe areas where the candidate showed difficulty or hesitation and provide opinions that can assist in future training and improvement.
    • When a criterion is rated Low, the evaluator must record a specific, behavior-anchored reason (what was observed, when, and its operational impact).
    • For ★ Key Indicators, a reason for any Low rating is mandatory. If two or more evaluators assign Low, the candidate is automatically failed; such cases are re-discussed in a panel consensus meeting, and the final decision is documented.
    • Provide comprehensive comments on overall strengths and areas requiring improvement.
    • These comments will be included in the notification of individual results and serve as the primary feedback. Please provide constructive remarks that can help the candidate improve their capabilities.
Table A2. Behavioral Indicators and Scoring Framework.
Evaluation Category | Behavioral Indicators | Score (High | Mid | Low) | Observation & Notes
1. Situation Assessment
  1-1 Information Gathering & Mission Sharing en Route: H(3) | M(2) | L(1)
  1-2 Situation Dissemination en Route: H(3) | M(2) | L(1)
  1-3 Initial Situation Report & Command Declaration ★: H(10) | M(7) | L(4)
  1-4 Victim Information Gathering & Dissemination: H(5) | M(3) | L(1)
  1-5 Requesting Additional Resources: H(3) | M(2) | L(1)
2. Response Activities
  2-1 Apparatus Placement: H(5) | M(3) | L(1)
  2-2 Standard Response Tactics: H(5) | M(3) | L(1)
  2-3 Assigning Tasks to Later-Arriving Units: H(3) | M(2) | L(1)
  2-4 Crisis Response & Progress Management: H(5) | M(3) | L(1)
  2-5 Identification & Management of Fireground Elements: H(5) | M(3) | L(1)
  2-6 Unit Leader Task Execution: H(3) | M(2) | L(1)
3. Firefighting Tactic Skills
  3-1 Firefighting Water Supply: H(5) | M(3) | L(1)
  3-2 Forcible Entry & Interior Attack Initiation: H(5) | M(3) | L(1)
  3-3 Hose Deployment, Water Application & Nozzle Placement: H(5) | M(3) | L(1)
  3-4 Ventilation: H(5) | M(3) | L(1)
4. Communication
  4-1 Radio Communication Protocols: H(5) | M(3) | L(1)
  4-2 Information Delivery Effectiveness: H(5) | M(3) | L(1)
  4-3 Situation Handover to Arriving Chief Officer: H(5) | M(3) | L(1)
5. Achievement of Core Objective
  5-1 Effectiveness in Achieving Life-Safety Objectives ★: H(10) | M(7) | L(4)
  5-2 Crew Safety Management: H(5) | M(3) | L(1)
★: This is the most essential element of incident command at the fireground.

Appendix A.3. Certification Eligibility, Methods, and Evaluation Panel

Table A3. General Information on Eligibility and Evaluation Method.
Eligibility | Applicants must be IICs with at least one year of field experience.
Evaluation Method | Simulated disaster scenarios will be used to evaluate IICs’ capabilities.
※ Fire, explosion, and hazardous material accidents are simulated; scenarios may be adjusted to match the candidate’s real-world experience and rank.
1. Grant situational authority and responsibilities based on the scenario.
2. Assess the individual’s abilities through interviews on the commander’s decisions and response strategies.
3. Draw one of the five simulated disaster environments by lot before the evaluation begins.
4. Independent scoring: Each evaluator completes all ratings independently using the standardized rubric (H/M/L mapped to percentage).
5. Low-on-★ protocol: Any Low on ★ indicators requires a written rationale; if two or more evaluators assign Low on any ★ item, the candidate is automatically failed.
6. Consensus meeting: Following independent scoring, evaluators convene to review Low-on-★ cases, confirm rationales, and finalize the decision (pass/fail).
7. Documentation: The panel chair records adjudication notes (items discussed, final decision, time) and attaches them to the candidate’s evaluation record.
Evaluation Panel | At least 3 evaluators for the practical exam and 3 or more for the interview.
Panels include at least one external evaluator. The panel chair facilitates consensus meetings and ensures that Low-on-★ adjudications follow the protocol uniformly across sessions.
※ Must include both internal and at least one external evaluator.
1. A firefighter of the same or higher rank as the candidate.
2. A firefighter who holds a fire command qualification at the same or higher level.
3. A university professor with relevant research or teaching experience in fire science or command.
Evaluation Criteria | Behavioral indicators are assessed based on High–Medium–Low ratings.
Adjustable Evaluation Items | Excluding the 2 core behavioral indicators, up to 20 of the other indicators may be adjusted based on real-world applicability.
※ Adjustments may include deletion, reduction, or redistribution of scores across non-core items.
Five Types of Virtual Disaster Scenario | Karaoke fire | Goshiwon fire | Construction site fire | Residential villa fire | Apartment fire
★: This is the most essential element of incident command at the fireground.

Appendix A.4. Organization and Functions of the Certification Committees

  • The committee consists of the Operation Committee, Expert Committees, and the Working-level Council.
    • Operation Committee: Includes the Director of 119 Operations, the Head of the Disaster Response Division, and key section chiefs from each operating institution.
    • Expert Committees: Includes both the Competency Development Expert Committee and the Competency Evaluation Expert Committee.
    • Working-level Council: Composed of the Disaster Response Division (lead department) and staff from operating institutions.
  • Functions and Deliberation Methods of the Operation Committee
    • Function: The committee decides on key matters to ensure consistency in on-site commander competency training and certification assessment, and to maintain appropriateness, fairness, objectivity, and transparency in procedures and content.
    • Meeting Convening: As needed—convened by the committee chair if significant matters arise.
    • Deliberation Method: A majority vote of present members determines decisions.
    • Adjudication oversight: Reviews summary logs of Low-on-★ adjudications for consistency and fairness; issues corrective guidance where needed.
    • Appeals handling: Defines a simple, time-bound process to review formal appeals (scope limited to procedure and scoring integrity) and records outcomes for quality assurance.
    • Tie-breaking: In case of a tie, the chairperson makes the final decision.
Table A4. Composition of the Operation and Expert Committees.
Operation Committee
- Chairman: Director of 119 Operations
- Members: Head of Disaster Response Division, Key Section Chiefs
Competency Development Expert Committee
- Chair: Head of HR Development, Central Fire School
- Members: 10 internal/10 external
Competency Evaluation Expert Committee
- Chair: Commander, Seoul Fire HQ
- Members: 10 internal/10 external
Working-level Council
- Staff from Disaster Response Division
- Operational institution staff

Appendix A.5. Simulation Scenarios and Difficulty Parameter

Table A5. (a) Scenario Structure and Difficulty Parameters. (b) Real Fire Incidents Underpinning VR Scenarios. (c) VR Training Readiness and Debriefing Checklist.
(a)
1. Goshiwon Fire
  Dispatch trigger: Fire reported on 2nd and 3rd floors.
  Initial conditions: Upon arrival, one person is hanging from a 2F window requesting rescue. Deployment of an air rescue cushion is possible; use of an aerial ladder truck is not feasible.
  Victim distribution: 6 people total; 2F–1 hanging from window, 1 in hallway, 2 inside rooms 218 and 227; 3F–1 trapped on floor; Roof–1 person taking refuge.
  Fire locations: 2 (two ignition points, on 2F and 3F).
  Fire spread: Fire extends upward from 2F to 3F.
  Challenges: Delayed arrival due to heavy traffic congestion.
  Hazard notes: City gas supply not shut off (explosion risk); one victim fell from 2F window during escape; Room 227 door locked (entry delayed); thick smoke causing near-zero visibility.
2. Karaoke Room Fire
  Dispatch trigger: Fire outbreak on 2nd floor.
  Initial conditions: Upon arrival, heavy smoke is billowing from the 2F windows. One person is visible at a 3F window calling for help.
  Victim distribution: 5 people total; 3F–1 hanging from window, 1 in corridor; 2F–1 collapsed in hallway, 2 trapped inside karaoke rooms.
  Fire locations: 1 (ignition on 2F).
  Fire spread: Fire spreads upward from 2F to 3F.
  Challenges: Delayed response due to illegally parked cars.
  Hazard notes: Maze-like interior layout hinders search; flammable soundproofing materials produce dense toxic smoke; insufficient emergency exits make egress difficult.
3. Residential Villa Fire
  Dispatch trigger: Fire outbreak on 3rd floor.
  Initial conditions: Upon arrival, a resident is found on a 4F balcony awaiting rescue. Fire and smoke are spreading toward the 4th floor.
  Victim distribution: 5 people total; 4F–1 on balcony; 3F–1 trapped in the burning unit, 1 in stairwell; 2F–1 semi-conscious from smoke; Roof–1 person who fled upward.
  Fire locations: 1 (ignition in 3F unit).
  Fire spread: Fire spreads from 3F up to 4F.
  Challenges: None.
  Hazard notes: Rapid smoke spread through open stairwell; no sprinkler system in building; small LPG gas cylinder in use (potential explosion hazard).
4. Apartment Building Fire
  Dispatch trigger: Fire reported on 8th floor.
  Initial conditions: Upon arrival, flames are venting out of an 8F apartment. A resident is spotted on a 9F balcony shouting for help.
  Victim distribution: 5 people total; 9F–1 on balcony, 1 in apartment above fire; 8F–1 in burning apartment, 1 collapsed in hallway; 7F–1 overcome by smoke on the floor below.
  Fire locations: 1 (ignition in 8F apartment).
  Fire spread: Fire spreads from 8F up to 9F.
  Challenges: None.
  Hazard notes: One resident jumped from 8F before rescue (fatal injuries); broken windows create backdraft risk; high-rise height complicates evacuation and firefighting operations.
5. Construction Site Fire
  Dispatch trigger: Fire outbreak in building under construction.
  Initial conditions: Upon arrival, the unfinished structure is engulfed in flames on one side. Debris and construction materials litter the scene.
  Victim distribution: 4 people total; 3 at site–1 worker on upper scaffolding, 1 trapped under debris, 1 incapacitated at ground level; 1 missing (unaccounted for amid the chaos).
  Fire locations: 1 (ignition in scaffold/structure).
  Fire spread: Fire spreads through scaffolding and materials on site.
  Challenges: None.
  Hazard notes: Multiple fuel and gas cylinders on site (explosion hazard); structural integrity compromised (collapse risk); lack of on-site water source slows firefighting.
(b)
VR Scenario | Real Incident (Year) | Incident Description | Scenario Validation Note
Goshiwon Fire | Jongno District Goshiwon Fire (2018) [59] | Fire in a low-cost dormitory in Seoul; 7 killed, 11 injured. | Mirrors blocked exit and safety lapses from the real case (e.g., crucial exit obstructed, no sprinklers), trapping multiple occupants as in the actual incident.
Karaoke Room Fire | Busan Karaoke Bar Fire (2012) [60] | Blaze in a Busan karaoke lounge; 9 people died and 25 were injured. | Scenario replicates the complex interior layout and evacuation challenges of the real event—illegal interior alterations led to hard-to-find exits (one exit hidden inside a room), lack of sprinklers, and toxic smoke in narrow corridors.
Construction Site Fire | Icheon Logistics Warehouse Fire (2020) [61] | Massive blaze at an unfinished warehouse in Icheon; 38 construction workers killed, rapid flash fire spread. | Scenario includes flammable construction materials and explosion hazards (witnesses heard ~10 explosions in real case), reflecting the real incident’s fast spread and difficulties evacuating workers.
Residential Villa Fire | Goyang Villa Fire (2014) [62] | Fire in a four-story “villa” house in Seoul; 1 killed, 14 injured. | Scenario reflects the open stairwell/piloti design of the actual house—the open ground-floor columns allowed fire and smoke to spread rapidly and hampered evacuation in reality, which is mirrored in the simulation.
Apartment Fire | Uijeongbu Apartment Complex Fire (2015) [63] | Fire engulfed multiple apartment buildings in Uijeongbu; 4 killed, 124 injured. | Scenario captures high-rise fire dynamics observed in this incident—quick vertical fire spread and backdraft risks exacerbated by lack of sprinklers on mid-level floors, and a fatal fall occurred when a resident jumped to escape (a risk included in the simulation).
(c)
Readiness Category | Pre-Simulation Checklist Items
Instructor Preparation | Instructors oriented to scenario objectives; evaluation criteria/rubric prepared; safety and emergency protocols reviewed; briefing materials ready.
Technical Setup | VR hardware and software tested (headsets, controllers, sensors calibrated); communication systems (radios) checked; data recording enabled; backup power and technical support available.
Scenario Fidelity | Virtual scenario verified against real incident data for authenticity; fire and smoke behavior tuned to realistic levels; victim placements and hazard props (e.g., gas cylinders) confirmed; trigger events (timed injects) functioning correctly.
Debrief Prompts | Guided debrief plan outlined (key decision points and actions noted); reflective questions prepared to prompt discussion; performance metrics and observations logged for feedback; debrief aids (video/audio recordings, score sheets) ready.

Appendix A.6. Qualitative Coding Reliability and Exemplars

This appendix provides the full codebook excerpt, exemplar excerpts, and inter-coder agreement metrics (Agreement % and Cohen’s κ) for the qualitative analysis of radio communications. Coding was conducted at the utterance level using a structured Excel-based coding matrix. All candidates (N = 92) were included (census). Two coders independently coded all transcripts; disagreements were resolved in consensus meetings (a priori agreement ≥ 80%).
Table A6. Qualitative Coding Reliability and Exemplars.
Code Category | Definition (Brief) | N (Utterances) | Agreement (%) | Cohen’s κ | Exemplar (Anonymized) | Exemplar (EN Gloss, Optional) | Notes
Unclear/Ineffective Transmission | Message lacks clarity/standard format; receiver cannot act unambiguously | 210 | 89.05 | 0.78 | “Uh… can you do that… now?” (unclear addressee/unclear action) | — | Missing addressee/action verb
Delay/Omission of Critical Update | Late or missing transmission of time-critical information | 185 | 87.57 | 0.75 | “Delay/omission in providing fire spread and victim information updates.” | — | PAR; conditions–actions–needs missing
Overlapping/Blocked Traffic | Simultaneous transmissions block content | 162 | 86.42 | 0.72 | “Message lost due to overlapping transmissions.” | — | No radio discipline/priority
Ambiguous Order/Assignment | Order lacks task, location, or acknowledgment requirement | 198 | 90.40 | 0.80 | “Unit 3, proceed to location and handle task, report when complete.” | — | No read-back; no time marker
Inter-Agency/Division Coordination Failure | Handoff, boundary, or dependency not coordinated | 154 | 84.42 | 0.67 | “Conflict in operations between divisions/agencies.” | — | Duplicate/contradictory tasks
Missing Acknowledgment/Read-back | Receiver fails to acknowledge/read back key instruction | 176 | 88.07 | 0.76 | “No acknowledgment after task assignment.” | — | No closed-loop comms
Missing/Invalid Progress Report | Progress markers (conditions–actions–needs) absent or invalid | 141 | 85.82 | 0.70 | “Tactical change made without a progress report.” | — | No time-stamped updates
Total | — | 1226 | 87.39 | 0.74 | — | — | —
Note: Cohen’s κ (kappa) corrects observed agreement for chance; it ranges from −1 to 1 (higher is better). Excerpts are anonymized, with an optional brief English (EN) gloss; κ values are rounded to two decimals.
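For readers replicating the reliability analysis, per-category Cohen’s κ for two coders can be computed directly from the utterance-level code/no-code decisions. The following minimal sketch shows the standard formula; any statistics package (e.g., scikit-learn’s `cohen_kappa_score`) yields the same result.

```python
def cohens_kappa(coder_a: list, coder_b: list) -> float:
    """Two-coder Cohen's kappa: observed agreement corrected for the agreement
    expected by chance from each coder's marginal label frequencies."""
    assert len(coder_a) == len(coder_b) and coder_a, "need paired, non-empty codings"
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # observed agreement
    labels = set(coder_a) | set(coder_b)
    # chance agreement: product of the two coders' marginal proportions per label
    p_e = sum((coder_a.count(lab) / n) * (coder_b.count(lab) / n) for lab in labels)
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1 (identical constant codings)
```

For example, two coders who agree on 3 of 4 binary decisions, with these marginals, obtain κ = 0.5 despite 75% raw agreement, which is why the table reports both metrics.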

References

  1. Cole, D.; St Helena, C. Chaos, Complexity, and Crisis Management: A New Description of the Incident Command System; National Fire Academy: Emmitsburg, MD, USA, 2001.
  2. Butler, P.C.; Honey, R.C.; Cohen-Hatton, S.R. Development of a behavioural marker system for incident command in the UK fire and rescue service: THINCS. Cogn. Technol. Work 2019, 22, 1–12. [Google Scholar] [CrossRef]
  3. Lipshitz, R.; Klein, G.; Orasanu, J.; Salas, E. Taking stock of naturalistic decision making. J. Behav. Decis. Mak. 2001, 14, 331–352. [Google Scholar] [CrossRef]
  4. Klein, G.A.; Calderwood, R. Decision models: Some lessons from the field. IEEE Trans. Syst. Man Cybern. 1991, 21, 1018–1026. [Google Scholar] [CrossRef]
  5. Cohen-Hatton, S.R.; Butler, P.C.; Honey, R.C. An Investigation of Operational Decision Making in Situ: Incident Command in the U.K. Fire and Rescue Service. Hum. Factors 2015, 57, 793–804. [Google Scholar] [CrossRef]
  6. Butler, P.C.; Bowers, A.; Smith, A.P.; Cohen-Hatton, S.R.; Honey, R.C. Decision Making Within and Outside Standard Operating Procedures: Paradoxical Use of Operational Discretion in Firefighters. Hum. Factors 2023, 65, 1422–1434. [Google Scholar] [CrossRef] [PubMed]
  7. Hall, K.A. The Effect of Computer-Based Simulation Training on Fire Ground Incident Commander Decision Making; The University of Texas at Dallas: Richardson, TX, USA, 2010. [Google Scholar]
  8. Duczyminski, P. Sparking Excellence in Firefighting Through Simulation Training. Technology 2024. Available online: https://www.firehouse.com/technology/article/55237747/sparking-excellence-in-firefighting-through-simulation-training (accessed on 29 September 2025).
  9. Lewis, W. Commander Competency. 2023. Available online: https://www.firehouse.com/technology/incident-command/article/21288477/the-making-of-the-most-competent-fireground-incident-commanders (accessed on 29 September 2025).
  10. Radianti, J.; Majchrzak, T.A.; Fromm, J.; Wohlgenannt, I. A systematic review of immersive virtual reality applications for higher education: Design elements, lessons learned, and research agenda. Comput. Educ. 2020, 147, 103778. [Google Scholar] [CrossRef]
  11. Bower, C. Addressing Gaps in Command Capability and Experience in the Copley Fire Department; National Fire Academy: Emmitsburg, MD, USA, 2018.
  12. Lee, E.P. Analysis of causes of casualties in Jecheon sports center fire-Focus on structural factors of building and equipment. Fire Sci. Eng. 2018, 32, 86–94. [Google Scholar] [CrossRef]
  13. Wikipedia Contributors. Jecheon Building Fire. 2025. Available online: https://en.wikipedia.org/wiki/Jecheon_building_fire#:~:text=people%20and%20injuring%20another%2036.,2 (accessed on 7 September 2025).
  14. Wikipedia Contributors. Miryang Hospital Fire. 2025. Available online: https://en.wikipedia.org/wiki/Miryang_hospital_fire (accessed on 7 September 2025).
  15. U.S. Government Accountability Office. First Responders: Much Work Remains to Improve Communications Interoperability; U.S. Government Accountability Office: Washington, DC, USA, 2007.
  16. Endsley, M.R. Theoretical underpinnings of situation awareness: A critical review. Situat. Aware. Anal. Meas. 2000, 1, 3–21. [Google Scholar]
  17. Cohen-Hatton, S.R.; Honey, R.C. Goal-oriented training affects decision-making processes in virtual and simulated fire and rescue environments. J. Exp. Psychol. Appl. 2015, 21, 395–406. [Google Scholar] [CrossRef]
  18. Reader, T.; Flin, R.; Lauche, K.; Cuthbertson, B.H. Non-technical skills in the intensive care unit. Br. J. Anaesth. 2006, 96, 551–559. [Google Scholar] [CrossRef] [PubMed]
  19. Hayes, P.; Bearman, C.; Butler, P.; Owen, C. Non-technical skills for emergency incident management teams: A literature review. J. Contingencies Crisis Manag. 2021, 29, 185–203. [Google Scholar] [CrossRef]
  20. Wijkmark, C.H.; Metallinou, M.M.; Heldal, I. Remote Virtual Simulation for Incident Commanders—Cognitive Aspects. Appl. Sci. 2021, 11, 6434. [Google Scholar] [CrossRef]
  21. Bigley, G.A.; Roberts, K.H. The incident command system: High-reliability organizing for complex and volatile task environments. Acad. Manag. J. 2001, 44, 1281–1299. [Google Scholar] [CrossRef]
  22. Barton, M.A.; Sutcliffe, K.M. Overcoming dysfunctional momentum: Organizational safety as a social achievement. Hum. Relat. 2009, 62, 1327–1356. [Google Scholar] [CrossRef]
  23. Weick, K.; Sutcliffe, K. Managing the Unexpected Resilient Performance in an Age of Uncertainty; John Wiley & Sons: Hoboken, NJ, USA, 2007; Volume 8. [Google Scholar]
  24. Kwon, S.A.; Lee, J.E.; Ban, Y.U.; Lee, H.-J.; You, S.; Yoo, H.J. Safety Measure for Overcoming Fire Vulnerability of Multiuse Facilities—A Comparative Analysis of Disastrous Conflagrations between Miryang and Jecheon. Crisis Emerg. Manag. Theory Prax. 2018, 14, 149–167. [Google Scholar] [CrossRef]
  25. Duchek, S. Organizational resilience: A capability-based conceptualization. Bus. Res. 2020, 13, 215–246. [Google Scholar] [CrossRef]
  26. Cho, E.H.; Nam, J.H.; Shin, S.A.; Lee, J.B. A Study on the Preliminary Validity Analysis of Korean Firefighter Job-Related Physical Fitness Test. Int. J. Environ. Res. Public Health 2022, 19, 2587. [Google Scholar] [CrossRef]
  27. Lee, S.C.; Lin, C.Y.; Chuang, Y.J. The Study of Alternative Fire Commanders’ Training Program during the COVID-19 Pandemic Situation in New Taipei City, Taiwan. Int. J. Environ. Res. Public Health 2022, 19, 6633. [Google Scholar] [CrossRef]
  28. National Fire Agency. Strengthening Fire-Ground Command Capabilities: First Implementation of “Strategic On-Scene Commander” Certification. In Disaster Incident News; 2024. Available online: https://www.nfa.go.kr/nfa/news/disasterNews/?boardId=bbs_0000000000001896&mode=view&cntId=208261 (accessed on 29 September 2025).
  29. National Fire Chiefs Council. National Operational Guidance. 2023. Available online: https://nfcc.org.uk/our-services/national-operational-guidance/ (accessed on 29 September 2025).
  30. Jane Lamb, K.; Davies, J.; Bowley, R.; Williams, J.-P. Incident command training: The introspect model. Int. J. Emerg. Serv. 2014, 3, 131–143. [Google Scholar] [CrossRef]
  31. Lamb, M.B.D.K.; Verhoef, I. Why Simulation is Key for Maintaining Fire Incident Preparedness. 2015. Available online: https://www.sfpe.org/publications/fpemagazine/fpearchives/2015q2/fpe2015q24 (accessed on 29 September 2025).
  32. Thielsch, M.T.; Hadzihalilovic, D. Correction to: Evaluation of Fire Service Command Unit Trainings. Int. J. Disaster Risk Sci. 2021, 12, 443. [Google Scholar] [CrossRef]
  33. Gillespie, S. A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctorate of Education; G.C. University: Phoenix, AZ, USA, 2013; p. 158. [Google Scholar]
  34. Carolino, J.; Rouco, C. Proficiency Level of Leadership Competences on the Initial Training Course for Firefighters—A Case Study of Lisbon Fire Service. Fire 2022, 5, 22. [Google Scholar] [CrossRef]
  35. Hancko, D.; Majlingova, A.; Kačíková, D. Integrating Virtual Reality, Augmented Reality, Mixed Reality, Extended Reality, and Simulation-Based Systems into Fire and Rescue Service Training: Current Practices and Future Directions. Fire 2025, 8, 228. [Google Scholar] [CrossRef]
  36. Berthiaume, M.; Kinateder, M.; Emond, B.; Cooper, N.; Obeegadoo, I.; Lapointe, J.-F. Evaluation of a virtual reality training tool for firefighters responding to transportation incidents with dangerous goods. Educ. Inf. Technol. 2024, 29, 14929–14967. [Google Scholar] [CrossRef]
  37. Crow, I. Training’s New Dimension, with Fire Service College AI-Powered, Firefighter Training 2025. Available online: https://internationalfireandsafetyjournal.com/trainings-new-dimension-with-fire-service-college/ (accessed on 29 September 2025).
  38. Rizzo, A.; Morie, J.F.; Williams, J.; Pair, J.; Buckwalter, J.G. Human Emotional State and its Relevance for Military VR Training. In Proceedings of the 11th International Conference on Human Computer Interaction, Las Vegas, NV, USA, 22–27 July 2005. [Google Scholar]
  39. Lele, A. Virtual reality and its military utility. J. Ambient. Intell. Humaniz. Comput. 2011, 4, 17–26. [Google Scholar] [CrossRef]
  40. Pallavicini, F.; Argenton, L.; Toniazzi, N.; Aceti, L.; Mantovani, F. Virtual Reality Applications for Stress Management Training in the Military. Aerosp. Med. Hum. Perform. 2016, 87, 1021–1030. [Google Scholar] [CrossRef] [PubMed]
  41. UK Research and Innovation. Improving Command Skills for Fire and Rescue Service Incident Response. Available online: https://www.ukri.org/who-we-are/how-we-are-doing/research-outcomes-and-impact/esrc/improving-command-skills-for-fire-and-rescue-service-incident-response/ (accessed on 29 September 2025).
  42. Alhassan, A.I. Analyzing the application of mixed method methodology in medical education: A qualitative study. BMC Med. Educ. 2024, 24, 225. [Google Scholar] [CrossRef]
  43. Drake, B. “Good Enough” Isn’t Enough: Challenging the Standard in Fire Service Training. 2025. Available online: https://www.fireengineering.com/firefighting/good-enough-isnt-enough-challenging-the-standard-in-fire-service-training/ (accessed on 29 September 2025).
  44. Park, J.-C.; Suh, J.-H.; Chae, J.-M. Simulation-Based Evaluation of Incident Commander (IC) Competencies: A Multivariate Analysis of Certification Outcomes in South Korea. Fire 2025, 8, 340. [Google Scholar] [CrossRef]
  45. Sapsford, O. From the Fire Ground: Insights into Crisis Command Decision-Making. In Coventry University Research Presentation; Coventry University: Coventry, UK, 2024. [Google Scholar]
  46. Hartin, E. Ventilation Strategies: International Best Practice; CFBT-US: Washington, DC, USA, 2008; pp. 1–11. [Google Scholar]
  47. Menomonee Falls Fire Department, T.B. Driver Operator Manual; Menomonee Falls Fire Department, T.B.: Menomonee Falls, WI, USA, 2008; pp. 1–61. [Google Scholar]
  48. Jefferson Township Volunteer Fire Company. Rapid Intervention Team Operations: Standard Operating Guidelines; National Fire Academy: Jefferson Township, PA, USA, 2000; p. 3.
  49. Hansen, S. Rapid Intervention Team Operations: Standard Operating Procedures; U.S. Fire Administration, Federal Emergency Management Agency: Emmitsburg, MD, USA, 2000; p. 70.
  50. Sowby, R.B.; Porter, B.W. Water Supply and Firefighting: Early Lessons from the 2023 Maui Fires. Water 2024, 16, 600. [Google Scholar] [CrossRef]
  51. Ścieranka, G. Krytyczna ocena wymagań przeciwpożarowych dotyczących sieci wodociągowych/Firefighting Water-supply System Requirements—A Critical Assessment. Bezpieczeństwo I Tech. Pożarnicza 2017, 48, 124–136. [Google Scholar] [CrossRef]
  52. Federal Emergency Management Agency U.S Fire Administration. ICS Organizational Structure and Elements; Federal Emergency Management Agency: Emmitsburg, MD, USA, 2018; p. 11.
  53. International Association of Fire Fighters. Incident Command Module; International Association of Fire Fighters: Washington, DC, USA, 2003; p. 33. [Google Scholar]
  54. National Institute for Occupational Safety and Health. Preventing Deaths and Injuries to Fire Fighters by Establishing Collapse Zones at Structure Fires; National Institute for Occupational Safety and Health: Cincinnati, OH, USA, 2014; p. 6.
  55. HM Government Department for Communities and Local Government. Fire and Rescue Manual: Volume 2—Fire Service Operations, Incident Command, 3rd ed.; The Stationery Office (TSO): London, UK, 2008.
  56. Moynihan, D.P. From Forest Fires to Hurricane Katrina: Case Studies of Incident Command Systems. In Networks and Partnerships Series; IBM Center for The Business of Government: Washington, DC, USA, 2007. [Google Scholar]
  57. Toolshero. Recognition-Primed Decision Making (RPD); Toolshero: Rotterdam, The Netherlands, 2023. [Google Scholar]
  58. Klein, G. Naturalistic decision making. Human factors. J. Hum. Factors Ergon. Soc. 2008, 50, 456–460. [Google Scholar] [CrossRef]
  59. Korea JoongAng Daily. No sprinklers in gosiwon where fire killed 7 day laborers. Korea JoongAng Daily, 11 September 2018. [Google Scholar]
  60. LIVEJOURNAL. Korean Police Find Illegal Changes in Karaoke Fire Venue. 2012. Available online: https://omonatheydidnt.livejournal.com/9144474.html (accessed on 29 September 2025).
  61. The Guardian. South Korea fire kills nearly 40 construction workers. The Guardian, 30 April 2020. [Google Scholar]
  62. Xinhua News Agency. South Korea wildfires force thousands to evacuate, Yonhap reports. Xinhua, 13 August 2025. [Google Scholar]
  63. Korea JoongAng Daily. Fire at apartment blocks kills 4, leaves 124 injured. Korea JoongAng Daily, 11 January 2015. [Google Scholar]
Figure 1. Mixed-methods workflow for evaluating IIC competence. This workflow illustrates the integration of quantitative assessment scores and qualitative transcript reviews, combining statistically identified weaknesses (what went wrong) with behavioral evidence (how and why errors occurred) to provide a comprehensive understanding of performance gaps.
Figure 2. Sub-Competency Performance in IIC Assessments, 2022–2024. Orange bars = Unsuccessful Count (score < 80%); Blue line = Mean Score (points); Red dashed = 80% threshold. Left y-axis: 0–100 points (mean score). Right y-axis: 0–92 candidates (unsuccessful). x-axis: 20 sub-competencies (codes 1-1 … 5-2). N = 92.
Figure 3. Domain-Level Performance in IIC Assessments, 2022–2024. Orange bars = Average Unsuccessful Count per domain (score < 80%); Blue line = Domain Mean Score (points); Red dashed = 80% threshold. Left y-axis: 0–100 points. Right y-axis: 0–92 candidates. x-axis: 1 Situation Assessment; 2 Response Activities; 3 Firefighting Tactics; 4 Communication; 5 Achievement of Core Objective. N = 92.
Figure 4. Sub-Competency Performance in IIC Assessments (Top Eight Failures), 2022–2024. Orange bars = Unsuccessful Count (score < 80%); Blue line = Mean Score (points); Red dashed = 80% threshold. Left y-axis: 0–100 points. Right y-axis: 0–92 candidates. x-axis: eight highest-failure sub-competencies (codes shown). N = 92.
Table 1. Sub-Competency Failures in 2022–2024 IIC Assessment.
Competency Domain | Sub-Competency (Code & Description) | Unsuccessful Sub-Competency (Count) | Average Score
2. Response Activities | 2-4. Crisis Response & Progress Management | 39 | 74.20
3. Firefighting Tactics | 3-4. Ventilation | 39 | 75.60
2. Response Activities | 2-1. Apparatus Placement | 39 | 76.40
2. Response Activities | 2-2. Standard Response Tactics | 38 | 78.20
3. Firefighting Tactics | 3-1. Firefighting Water Supply | 32 | 78.20
1. Situation Assessment | 1-4. Victim Information Gathering & Dissemination | 32 | 78.20
2. Response Activities | 2-5. Identification & Management of Fireground Elements | 33 | 78.90
2. Response Activities | 2-6. Unit Leader Task Execution | 36 | 79.90
5. Achievement of Core Objective | 5-1. Adequacy of Achieving Rescue Objectives ★ | 26 | 80.70
3. Firefighting Tactics | 3-3. Hose Deployment, Water Application & Nozzle Placement | 22 | 82.20
1. Situation Assessment | 1-5. Requesting Additional Firefighting Resources | 30 | 82.30
4. Communication | 4-2. Information Delivery Effectiveness | 27 | 82.30
5. Achievement of Core Objective | 5-2. Crew Safety Management | 25 | 82.30
2. Response Activities | 2-3. Assigning Tasks to Later Arriving Units | 31 | 83.70
4. Communication | 4-1. Radio Communication Principles | 22 | 84.10
1. Situation Assessment | 1-3. Initial Situation Report & Command Declaration ★ | 25 | 84.20
4. Communication | 4-3. Situation Handover to Arriving Chief Officer | 19 | 86.70
3. Firefighting Tactics | 3-2. Forcible Entry & Interior Attack Initiation | 10 | 88.70
1. Situation Assessment | 1-2. Situation Dissemination en Route | 12 | 92.40
1. Situation Assessment | 1-1. Information Gathering & Mission Sharing en Route | 5 | 93.10
★: Denotes the most essential elements of incident command on the fireground.
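The 80% pass threshold used throughout the figures can be applied directly to the tabulated means. A minimal sketch, using a hand-transcribed subset of Table 1's rows for illustration, flags the sub-competencies whose average score falls below the threshold:

```python
# Flag sub-competencies whose mean score falls below the 80% pass threshold.
# Rows are (code, description, unsuccessful_count, mean_score), transcribed
# from a subset of Table 1; the full table has 20 rows.
THRESHOLD = 80.0

rows = [
    ("2-4", "Crisis Response & Progress Management", 39, 74.20),
    ("3-4", "Ventilation", 39, 75.60),
    ("5-1", "Adequacy of Achieving Rescue Objectives", 26, 80.70),
    ("1-1", "Information Gathering & Mission Sharing en Route", 5, 93.10),
]

below = [(code, score) for code, _desc, _n, score in rows if score < THRESHOLD]
print(below)  # codes 2-4 and 3-4 fall under the threshold in this subset
```

Run over all 20 rows of Table 1, the same check separates the eight highest-failure sub-competencies highlighted in Figure 4 from those that, on average, cleared the threshold.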
Table 2. Performance of candidates across sub-competencies in IIC assessments, 2022–2024.
Competency Domain | Sub-Competency (Code & Description) | Unsuccessful Sub-Competency (Count) | Average Score
2. Response Activities | 2-4. Crisis Response & Progress Management | 39 | 74.20
3. Firefighting Tactics | 3-4. Ventilation | 39 | 75.60
2. Response Activities | 2-1. Apparatus Placement | 39 | 76.40
2. Response Activities | 2-2. Standard Response Tactics | 38 | 78.20
3. Firefighting Tactics | 3-1. Firefighting Water Supply | 32 | 78.20
1. Situation Assessment | 1-4. Victim Information Gathering & Dissemination | 32 | 78.20
2. Response Activities | 2-5. Identification & Management of Fireground Elements | 33 | 78.90
2. Response Activities | 2-6. Unit Leader Task Execution | 36 | 79.90
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Park, J.-c.; Yun, J.-c. Analysis for Evaluating Initial Incident Commander (IIC) Competencies on Fireground on VR Simulation Quantitative–Qualitative Evidence from South Korea. Fire 2025, 8, 390. https://doi.org/10.3390/fire8100390
