LLM-Guided Hybrid Simulation for Airport Cyber-Resilience Assessment

Katale, Tejaswini Sanjay; Gao, Lu; Liu, Yongxin; Liu, Dahai; Chen, Hongyun

doi:10.3390/math14111923



by Tejaswini Sanjay Katale ¹, Lu Gao ²,*, Yongxin Liu ³, Dahai Liu ⁴ and Hongyun Chen ⁵

¹ Department of Computer Science, University of Houston, Houston, TX 77004, USA
² Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77004, USA
³ Mathematics Department, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
⁴ College of Aviation, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
⁵ Civil Engineering Department, College of Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
* Author to whom correspondence should be addressed.

Mathematics 2026, 14(11), 1923; https://doi.org/10.3390/math14111923

Submission received: 29 April 2026 / Revised: 25 May 2026 / Accepted: 30 May 2026 / Published: 1 June 2026

(This article belongs to the Special Issue Mathematical Methods in System Engineering Modeling and Simulation)


Abstract

Airport systems rely on tightly connected digital and physical components, so cyber disruptions can affect both service performance and passenger movement. Existing airport simulation studies often focus on either queue-based passenger processing or pedestrian movement but rarely combine both in a framework suited for cyber-resilience analysis. This paper presents a hybrid simulation framework that integrates discrete-event simulation (DES), JuPedSim-based microscopic pedestrian modeling, and structured large language model (LLM) decision support to examine how cyber disruptions propagate through passenger-facing airport operations. The DES layer models service processes such as check-in, information desks, and security screening, while the pedestrian layer models movement, congestion, route choice, and spatial occupancy. Under degraded display or guidance conditions, the LLM generates structured passenger-level post-security decisions, such as going directly to the gate, checking a display, asking staff, waiting, visiting optional activity areas, or first moving to a wrong intermediate area. The framework is evaluated through a 500-passenger terminal case study with one baseline case and four disruption cases. Results show that check-in and security degradation produce the largest throughput loss, queue growth, and completion-time increase, while guidance degradation mainly affects post-security behavior. Spatial heatmaps further show where bottlenecks emerge and how congestion shifts across the terminal. Additional Rotterdam checkpoint validation, Palma benchmark analysis, and LLM ablation results support the framework’s ability to reproduce plausible queue, timing, throughput, and behavior-sensitive disruption patterns. The study provides a practical methodology for exploratory airport cyber-resilience assessment under coupled service, movement, and degraded-guidance conditions.

Keywords: airport cyber resilience; hybrid simulation; discrete-event simulation; pedestrian modeling; large language models

MSC: 68T50; 68M25; 00A72; 90B20

1. Introduction

Airports are high-flow, time-sensitive, and tightly coupled critical infrastructures in which passenger-facing operations depend increasingly on digital systems. Global passenger traffic reached about 9.5 billion in 2024, exceeding pre-pandemic levels, which means that even short operational disruptions can affect very large passenger volumes across the airport network [1]. Modern terminals also rely heavily on digital support for check-in, security coordination, flight information display systems, and wayfinding. During the 2024 CrowdStrike outage, failures in systems supporting booking, check-in, baggage, and crew-related operations quickly produced visible disruption and long airport queues even though aircraft systems and air traffic control were unaffected [2]. In terminal settings, such incidents can therefore appear as slower service, queue growth, congestion spillback, misrouting, and throughput loss.

Existing research has already explained several important parts of this problem but mostly in separate streams. Smart-airport cybersecurity has been framed as a question of resilience controls and operational continuity rather than a purely technical protection issue, which places cyber incidents within the broader challenge of keeping airport operations functioning under stress [3]. Discrete-event simulation has also been shown to represent ordered service stages, waiting lines, and resource use in airport terminals, establishing why DES remains a natural way to model passenger processing, queueing, and resource utilization [4]. Explicit spatial representation has likewise been shown to change how terminal dynamics are understood because local interference and route competition matter, which is why passenger movement, density, and route choice cannot be reduced to queue metrics alone [5]. Collectively, these streams explain the service-logic side and the spatial-movement side of airport terminals reasonably well.

The critical gap is that existing research still lacks a terminal-level, mechanism-based, and operationally interpretable framework that explains how a cyber disruption propagates simultaneously through service degradation, passenger movement, and behavior uncertainty under impaired guidance. Malicious events in linked infrastructures have been shown to create cascading airport impacts across interdependent systems, but that cascade perspective does not explain how disruption unfolds inside the terminal at the passenger-process level [6]. Airport-complex analysis has also been argued to require integrated modeling views rather than isolated process descriptions, which points directly to the need for a framework that can explain how queues, walking paths, and local congestion interact during disruption [7]. A second difficulty is that behavior uncertainty matters in degraded guidance scenarios, yet it cannot be handled by letting a large language model improvise freely. LLM-based simulation is useful when the model role is bounded and explicitly controlled, so behavior uncertainty must be represented in a way that remains interpretable and reproducible rather than autonomous [8].

This study addresses this gap by proposing a hybrid DES–JuPedSim–LLM framework that couples service events, passenger movement, and guidance-sensitive behavior within a common terminal representation. The framework models passenger processing and queue dynamics with DES, movement and congestion with JuPedSim, and post-security behavioral responses under degraded guidance through structured LLM-generated passenger-level decisions. The LLM is used only to support post-security decision modeling and does not control the simulation or modify service capacities. The framework is evaluated through a 500-passenger departure-terminal case study with one baseline case and four cyber-disruption cases, together with Rotterdam checkpoint validation, a Palma benchmark, and an LLM ablation study. The results show that the framework can capture baseline stability, post-attack throughput loss, subsystem-specific bottleneck shifts, spatial congestion patterns, and guidance-related post-security behavior changes. The contribution of the paper is therefore a methodological framework for exploratory airport cyber-resilience analysis under coupled service, movement, and degraded-guidance conditions.

2. Literature Review

The literature relevant to this study spans airport resilience, airport process simulation, pedestrian microsimulation, hybrid terminal modeling, and LLM-based behavior simulation. Reviewing these streams together is necessary because airport cyber disruption affects service rates, path choice, dwell behavior, and congestion at the same time. The review below is therefore organized around what each stream explains well and what each stream still leaves unresolved for a behavior-aware airport cyber-resilience case study.

2.1. Airport Cyber Resilience and Operational Disruption

Airport cyber-resilience literature explains why airport disruption should be interpreted through continuity and operational performance rather than only through technical failure. Lykou et al. [3] examined cybersecurity controls for smart airports and argued that resilience measures must be integrated with airport operations. Their study is important here because it frames cyber risk as an operational continuity problem rather than a purely technical security problem. Huang et al. [9] developed an assessment model for airport resilience and translated resilience into explicit evaluation dimensions. Their framework is useful because it gives a structured way to think about how airport performance can be judged under disruption.

Studies closer to operational disruption reinforce the continuity perspective and extend it toward terminal conditions. Zhou and Chen [10] measured airport resilience under severe weather and treated resilience as retained operational performance during disruption. Their study matters here because it shows that continuity-oriented resilience metrics remain useful even when the disruption source is not cyber. Zapola et al. [11] proposed a resilience assessment framework for airport passenger terminal operations and focused their attention directly on passenger-facing terminal functions. Their contribution is relevant because it moves resilience analysis closer to terminal operations, even though it remains framework-oriented rather than mechanism-oriented. Piekert et al. [6] examined malicious events affecting linked critical infrastructures and showed how airport impacts could cascade across interdependent systems. Their analysis is valuable because it highlights cascading disruption, but it still does not explain how the disruption propagates inside the terminal itself.

What remains underdeveloped in this literature is the terminal-level mechanism of disruption propagation. The resilience perspective is now well established, but the passenger-facing pathway from cyber degradation to queue growth, walking delay, misrouting, and local congestion has not yet been modeled in operational detail.

2.2. Discrete-Event Simulation in Airport Operations

Discrete-event simulation literature shows that airport service processes can be modeled clearly when the main concern is queueing logic, waiting time, and resource utilization. Jim and Chang [4] presented one of the early airport terminal simulators for planning and design and showed how passenger processing could be decomposed into service stages and waiting lines. Their study remains foundational because it demonstrates why DES is well suited to ordered service logic in terminals. Takakuwa and Oyama [12] analyzed international-departure passenger flow with a staged simulation structure that connected arrivals, processing times, and congestion. Their work is important because it shows how DES can represent bottleneck formation in an operationally interpretable way.

Later airport studies extended the same modeling logic to more specific operational questions. Ruiz and Cheu [13] developed a simulation model for airport security screening checkpoints and used it to evaluate checkpoint operations. Their study is relevant because it shows how DES can capture service-stage performance in passenger-facing subsystems. Dorton and Liu [14] used queueing networks and discrete-event simulation to study airport security screening checkpoint efficiency under different baggage-volume and alarm-rate conditions. Their study is especially useful here because it shows that simulation can directly analyze passenger-facing screening operations, checkpoint congestion mechanisms, and service-efficiency sensitivity in airport security settings. Oprea et al. [15] modeled passenger flows in an airport terminal with discrete simulation and evaluated waiting time and level of service. Their results reinforce the point that DES remains very effective when the goal is to understand service capacity, queueing, and process efficiency.

This leaves a clear boundary on what DES can explain by itself. It remains highly effective for service flows, queues, and resource utilization, but it becomes less informative once a disruption begins to operate through spatial movement, local crowding, spillback, and competition for access to the next subsystem.

2.3. Pedestrian Simulation and Spatial Congestion Modeling

Pedestrian simulation literature shows why explicit movement modeling is necessary when local congestion, route choice, and guidance quality matter. Fonseca et al. [5] simulated passenger flow in a hub airport and demonstrated that microscopic movement models revealed spatial interactions that queue-only models could not represent. Their work is valuable because it shows how corridor use, local interference, and movement competition shape terminal performance. Lin et al. [16] studied guiding-sign optimization for an airport terminal and linked wayfinding quality to pedestrian efficiency. Their results are especially relevant here because degraded information systems are likely to alter route choice, hesitation, and travel time before those effects appear in queue metrics.

Other pedestrian studies further show that movement behavior changes terminal outcomes in ways that aggregated process models cannot capture well. Kalakou and Moura [17] analyzed passenger behavior in airport terminals through activity preferences and showed that passenger choices affected how terminal dynamics evolved. Their study matters here because it demonstrates that behavior-sensitive movement assumptions change system outcomes. Alam et al. [18] simulated airport pedestrian movement under social-distancing conditions and emphasized density and circulation effects. Their work reinforces the point that crowding patterns and movement constraints can reshape terminal performance even before service logic is considered.

The limitation here lies in the opposite direction. Pedestrian models capture movement, density, route choice, and guidance effects well, yet they do not naturally represent first-come, first-served queues, server occupancy, or service-time degradation. On their own, they therefore cannot provide a full account of passenger-processing disruption.

2.4. Hybrid Simulation Approaches for Terminal Analysis

The hybrid airport modeling literature shows that service processes and movement processes need to be coupled when terminal interactions matter. Wu et al. [19] proposed a hybrid queue-based Bayesian network framework for passenger facilitation modeling and explicitly connected process states with passenger movement logic. Their work is important because it shows that queue dynamics and movement conditions can be analyzed within one representation rather than as disconnected layers. Metzner [20] compared agent-based and discrete-event simulation for airport terminal resilience assessment. That comparison is useful because it demonstrates that different modeling paradigms capture different mechanisms and therefore need to be combined thoughtfully.

Recent studies continue to support hybrid terminal analysis, but they do so mainly for planning and operational design purposes. Ma et al. [21] developed an integrated passenger-flow model that connected stochastic passenger behavior with terminal analysis. Their study showed that combining movement and decision logic could improve how passenger flows were represented. Anagnostopoulou et al. [22] used AI-supported simulation to analyze passenger flows in an airport terminal as a decision-making tool. Their contribution reinforced the practical value of coupled representations, but it remained focused on terminal analysis rather than cyber-resilience propagation.

The key unresolved issue is not whether hybrid modeling is useful, but what it has been used for. Most existing hybrid studies are oriented toward efficiency, capacity, planning, or design, whereas airport cyber resilience requires a mechanism-based account of how degraded passenger-facing digital functions cause service degradation and guidance uncertainty to propagate together.

2.5. LLMs and Behavior-Aware Simulation

LLM-based simulation has emerged as a promising approach for representing human-like behavior in synthetic environments, but existing studies also show that such use requires careful control and validation. Aher et al. [23] used LLMs to simulate multiple humans and replicate human-subject studies in synthetic settings, showing that LLMs can approximate plausible response patterns under structured prompts and evaluation designs. Park et al. [24] developed a generative-agent environment in which agents produced coherent social behavior over time, demonstrating the potential of LLM-driven agents while also highlighting the importance of structure and memory design.

More recent studies further emphasize the need for evaluation, grounding, and robustness checks. Park et al. [25] evaluated individual-level simulation against participant-specific responses after grounding the simulation in rich interview data. This study is especially relevant because it treats LLM-based simulation as behavior that must be validated against observed responses rather than assumed to be realistic. Xie et al. [26] examined whether LLM agents could simulate trust behavior and identified important alignment and robustness limitations. These studies suggest that free-form LLM agents may be useful for exploratory simulation, but they remain difficult to interpret, reproduce, and validate.

In transportation-related research, LLMs have also begun to be applied to cybersecurity-oriented simulation. Gao et al. [27] proposed a multi-agent framework that uses LLMs to automate traffic scenario generation, cyberattack design, and defense strategy development. This work shows the potential of LLM-based simulation for transportation cybersecurity analysis. However, its focus remains on road traffic environments and does not address airport terminal operations, where service queues, pedestrian movement, and passenger decisions are tightly coupled.

For degraded airport information environments, the relevant behavioral problem is not general human simulation, but passenger decision-making when digital guidance becomes incomplete, delayed, or unreliable. Airport wayfinding studies show that signage, flight information displays, and guidance systems affect passenger route choice, hesitation, and the need for assistance [28,29]. Passenger-behavior studies further show that travelers may visit displays, ask staff, wait, enter optional activity areas, or return to the gate depending on their activity preferences and available information [30,31]. These findings suggest that degraded information should be modeled as a behavior-sensitive disruption, not only as a reduction in service rate.

Transportation cybersecurity studies support this framing. Smart-airport cybersecurity research treats cyber incidents as operational continuity problems rather than purely technical failures [3], while malicious-event studies show that disruptions in connected infrastructure can propagate into airport operations [6]. However, existing LLM-based transportation simulation studies mainly focus on road traffic environments and do not examine passenger-level airport terminal behavior under degraded guidance. This gap motivates a constrained use of LLMs, where the model maps passenger-specific degraded-information states to bounded post-security actions rather than acting as an autonomous agent or generating unconstrained behavior.

2.6. Research Gap

Across these strands, the central gap is still the lack of an operationally interpretable airport-terminal framework that jointly represents degraded service performance, spatial passenger movement, and uncertainty in behavior under impaired guidance. Resilience studies explain why continuity matters. DES studies explain service logic. Pedestrian studies explain spatial interference. Hybrid studies explain why process and movement should be coupled. LLM studies explain why behavior uncertainty can be modeled only under explicit control. What is still missing is a terminal-level, mechanism-based, and unified framework that brings those insights together for airport cyber-disruption analysis.

3. Methodology

This section presents the hybrid simulation framework used to evaluate airport performance and resilience under cyber disruption. The framework combines two main layers: (i) a discrete-event simulation (DES) layer that models passenger-processing logic, service events, and queues, and (ii) a microscopic pedestrian layer in JuPedSim that models movement, congestion, and spatial interactions. A large language model (LLM) is used only in a bounded role to generate passenger-level post-security decisions for travelers affected by degraded display or guidance conditions. Each LLM-generated decision specifies an allowed next action, target area, dwell time, reason code, and confidence score. Service capacities and processing rules remain explicitly defined within the simulation model.

The workflow begins with scenario definition, including layout, demand, service, and cyber-disruption inputs. The DES and pedestrian layers then run in a coupled manner so that service events and physical movement remain synchronized. Under degraded display or guidance conditions, bounded LLM-based decision-making is applied to affected post-security passengers. Each simulation run is then evaluated using performance and resilience metrics. Figure 1 summarizes this workflow and the coupling among the DES, pedestrian, and cyber layers. It shows the main information exchanges in the framework, including service-state updates from the DES layer, movement and density feedback from the pedestrian layer, and bounded passenger-level post-security decisions injected from the cyber layer for display- or guidance-affected passenger flows.

3.1. Discrete-Event Layer (DES): Queues and Service

3.1.1. Passenger Arrivals

Passenger arrivals define how travelers enter the simulation. A fixed passenger population $P$ is generated for each scenario. Each passenger $i$ is assigned a release time $a_i$ and an entrance $e_i \in E$, where $E$ denotes the set of available terminal entrances. Passengers enter the terminal according to their assigned release times. This setup allows the simulation to represent gradual passenger arrivals rather than assuming that all passengers enter the terminal at the same time. The same passenger population and arrival schedule are used across nominal and disrupted scenarios, so that performance differences can be attributed to the cyber-disruption mechanisms rather than to changes in demand.
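The arrival setup can be illustrated with a minimal sketch. It assumes release times drawn uniformly over a fixed window and hypothetical entrance labels `E1` and `E2`; the paper does not state the arrival distribution, so `generate_population`, the 600 s window, and the seed are illustrative choices only. The point is that a fixed seed reproduces the same population for baseline and disrupted runs.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Passenger:
    pid: int
    release_time: float  # a_i, seconds from simulation start
    entrance: str        # e_i, an element of the entrance set E


def generate_population(n, entrances, window_sec, seed):
    """Draw a fixed population; the same seed yields the same schedule."""
    rng = random.Random(seed)
    passengers = [
        Passenger(pid=i,
                  release_time=rng.uniform(0.0, window_sec),  # assumed uniform
                  entrance=rng.choice(entrances))
        for i in range(n)
    ]
    # Sort by release time so passengers enter in schedule order.
    return sorted(passengers, key=lambda p: p.release_time)


# Hypothetical settings: 500 passengers, two entrances, 600 s release window.
baseline_pop = generate_population(500, ["E1", "E2"], 600.0, seed=42)
disrupted_pop = generate_population(500, ["E1", "E2"], 600.0, seed=42)
```

Because both calls share the seed, the two lists are identical, which mirrors the requirement that demand stays fixed across scenarios.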

3.1.2. Event Scheduling and Simulation Clock

The DES layer uses an event-driven simulation clock. The model updates system states only when key events occur, such as passenger arrival, queue entry, service start, service completion, and transfer to the next stage. Between two consecutive events, the DES state remains unchanged. This event-based structure is suitable for airport passenger-processing systems because queues and services change only when passengers arrive, begin service, or complete service. It also supports coupling with the pedestrian layer: physical arrival at a service point triggers DES queue entry, while service completion releases the passenger to the next movement stage. This design keeps the service process simple, efficient, and synchronized with passenger movement.
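A minimal sketch of an event-driven clock of this kind is given here. It is not the authors' implementation: the class name `EventClock`, the event labels, and the fixed 30 s service time are assumptions used only to show that time advances from one scheduled event to the next.

```python
import heapq
import itertools


class EventClock:
    """Minimal event-driven clock: time jumps from one event to the next."""

    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def schedule(self, time, kind, passenger_id):
        heapq.heappush(self._heap, (time, next(self._counter), kind, passenger_id))

    def run(self, handler):
        while self._heap:
            time, _, kind, pid = heapq.heappop(self._heap)
            self.now = time  # no state changes happen between events
            handler(self, kind, pid)


log = []


def handler(clock, kind, pid):
    log.append((clock.now, kind, pid))
    if kind == "service_start":
        # Hypothetical fixed 30 s service time, for illustration only.
        clock.schedule(clock.now + 30.0, "service_complete", pid)


clock = EventClock()
clock.schedule(5.0, "service_start", 1)
clock.schedule(2.0, "arrival", 1)
clock.run(handler)
```

Events are processed in time order regardless of the order in which they were scheduled, and a handler may schedule follow-up events such as service completion.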

3.1.3. Service Stations and Queues

Passenger processing is modeled through service stations and queues. The main service stations are check-in (CI), the information desk (ID), and security screening (SC). Each station has a fixed number of service channels and a nominal service rate. If all channels are busy, passengers wait in a first-come, first-served queue. Queues have finite capacity because the terminal layout provides only a limited number of queue positions. When a queue is full, incoming passengers are redirected to the upstream waiting area until space becomes available. This rule allows the model to capture spillback, where delays at one service point create congestion in nearby areas. The DES layer controls queueing and service logic, while the pedestrian layer controls movement between entrances, service stations, waiting areas, and exits.
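The station logic described above can be sketched as follows. This is an illustrative reading of the rules, assuming a hypothetical `Station` class in which a passenger who finds the queue full is held in an upstream list and moved into the queue as soon as a slot opens; the paper does not specify the exact refill rule.

```python
from collections import deque


class Station:
    """Multi-channel FCFS station with a finite queue (illustrative only)."""

    def __init__(self, name, channels, queue_capacity):
        self.name = name
        self.channels = channels
        self.queue_capacity = queue_capacity
        self.busy = 0
        self.queue = deque()
        self.upstream = deque()  # passengers held in the upstream waiting area

    def arrive(self, pid):
        if self.busy < self.channels:
            self.busy += 1
            return "in_service"
        if len(self.queue) < self.queue_capacity:
            self.queue.append(pid)
            return "queued"
        self.upstream.append(pid)  # queue full: spillback
        return "upstream"

    def complete_service(self):
        """Free a channel, start the next queued passenger, refill the queue."""
        self.busy -= 1
        started = None
        if self.queue:
            started = self.queue.popleft()
            self.busy += 1
        if self.upstream and len(self.queue) < self.queue_capacity:
            self.queue.append(self.upstream.popleft())
        return started


# Hypothetical check-in station: one channel, two queue positions.
station = Station("CI", channels=1, queue_capacity=2)
states = [station.arrive(pid) for pid in range(4)]
next_served = station.complete_service()
```

In this example the fourth passenger is pushed upstream, and the first service completion both starts the head of the queue and pulls the upstream passenger into the freed queue slot.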

3.1.4. Passenger Routing Between Service Stages

The DES layer defines the logical order in which passengers move through service stages. In the nominal process, passengers complete check-in (CI) and then proceed to security screening (SC). Some passengers may also visit the information desk (ID) before security if they need clarification or assistance. This optional branch is controlled by a predefined probability $p_{ID}$: passengers not selected for this branch follow CI–SC, while selected passengers follow CI–ID–SC. The route defines the required service sequence, but the actual arrival time at each stage depends on queueing delay, walking time, congestion, and access conditions in the pedestrian layer.
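The optional information-desk branch amounts to one Bernoulli draw per passenger. The sketch below assumes a hypothetical value of 0.2 for the branch probability, since the value used in the case study is not given in this section.

```python
import random


def assign_route(rng, p_id):
    """Return the required service sequence for one passenger."""
    if rng.random() < p_id:
        return ["CI", "ID", "SC"]  # selected for the information-desk branch
    return ["CI", "SC"]


rng = random.Random(0)
# p_id = 0.2 is a hypothetical value used only for illustration.
routes = [assign_route(rng, p_id=0.2) for _ in range(1000)]
share_with_id = sum(1 for r in routes if "ID" in r) / len(routes)
```

The route fixes only the order of stages; arrival times at each stage would still come from queueing and walking in the coupled model.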

3.1.5. Cyber-Disruption Model

A cyber disruption starts at $t_{\mathrm{attack}}$ and degrades selected airport functions. For service-related disruptions, the affected service rate is reduced as

$$\mu_k'(t) = \alpha_k(t)\,\mu_k, \qquad 0 < \alpha_k(t) \le 1, \tag{1}$$

where $\mu_k$ is the nominal service rate, and $\alpha_k(t)$ is the degradation multiplier. A smaller $\alpha_k(t)$ indicates slower service. Guidance-related disruptions are modeled through changes in passenger post-security decisions.
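As a worked illustration of Equation (1), the sketch below assumes a step-shaped multiplier that equals 1 before the attack and a constant value afterwards. The step form, the function name `degraded_rate`, and the numbers are assumptions; the paper allows a general time-dependent multiplier.

```python
def degraded_rate(mu_k, t, t_attack, alpha_k):
    """Service rate mu'_k(t) = alpha_k(t) * mu_k with a step multiplier."""
    if not 0.0 < alpha_k <= 1.0:
        raise ValueError("alpha_k must lie in (0, 1]")
    # Assumed step profile: nominal before t_attack, degraded afterwards.
    alpha_t = alpha_k if t >= t_attack else 1.0
    return alpha_t * mu_k


# Hypothetical numbers: 2 passengers/min nominal, halved from t = 300 s.
before = degraded_rate(2.0, t=100.0, t_attack=300.0, alpha_k=0.5)
after = degraded_rate(2.0, t=400.0, t_attack=300.0, alpha_k=0.5)
```

With a multiplier of 0.5 the rate falls from 2.0 to 1.0, so the mean service time per passenger doubles.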

3.2. JuPedSim Pedestrian Layer: Movement, Congestion, and Route Choice

JuPedSim, based on the Optimal Steps Model (OSM), is used to model passenger movement in the terminal layout [32]. In OSM, each passenger selects the next feasible position by minimizing a local movement cost:

$$x_i(t + \Delta t) = \arg\min_{y \in R_i(t)} \Phi_i(y, t), \tag{2}$$

where $R_i(t)$ is the set of positions passenger $i$ can reach in one step, and $\Phi_i(y, t)$ represents the movement cost at candidate position $y$. This cost reflects movement toward the target, obstacle avoidance, and interaction with nearby passengers.
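The minimization in Equation (2) can be illustrated with a toy step rule. The sketch below is not JuPedSim code and does not use its API: the cost function (distance to target plus an exponential neighbor penalty), the step length, and the number of candidate positions are all hypothetical, and obstacles are omitted.

```python
import math


def movement_cost(y, target, neighbors, repulsion=1.0):
    """Hypothetical cost: distance to target plus a penalty near neighbors."""
    cost = math.dist(y, target)
    for n in neighbors:
        cost += repulsion * math.exp(-math.dist(y, n))
    return cost


def next_position(x, target, neighbors, step=0.5, n_candidates=16):
    """Pick the lowest-cost candidate among staying put and the step circle."""
    candidates = [x] + [
        (x[0] + step * math.cos(2 * math.pi * k / n_candidates),
         x[1] + step * math.sin(2 * math.pi * k / n_candidates))
        for k in range(n_candidates)
    ]
    return min(candidates, key=lambda y: movement_cost(y, target, neighbors))


# With no neighbors, the passenger steps straight toward the target.
new_pos = next_position((0.0, 0.0), target=(10.0, 0.0), neighbors=[])
```

Adding a neighbor near the direct path raises the cost of the straight step, so the same rule can produce a sidestep without any explicit avoidance logic.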

The terminal layout includes walkable areas, barriers, corridor openings, service zones, queueing regions, waiting areas, post-security activity areas, and gates. These spatial elements define JuPedSim movement stages, where each stage represents a spatial objective such as a waiting area, queue slot, service point, activity area, or gate. The DES layer controls service logic and queue release, while JuPedSim controls passenger movement and spatial occupancy.

After security screening, passengers may move to flight information displays, retail areas, food areas, restrooms, or gates. Under degraded display or guidance conditions, LLM-generated decisions may redirect affected passengers to a display, staff-help point, waiting area, optional activity area, wrong intermediate area, or directly to the gate. The possible actions and targets are predefined by the simulation model, and these decisions affect only post-security movement. They do not change DES service capacities.

The pedestrian layer records realized paths, area occupancy, display use, optional activity visits, and gate-arrival behavior. These outputs help identify whether delays are caused by queue pressure, route confusion, post-security dwell, or spatial congestion.

3.3. Coupling DES and JuPedSim

The hybrid model couples service progression in the DES layer with physical movement in the JuPedSim layer. A passenger can join a DES queue only after reaching the corresponding queue-entry region in the pedestrian map. After service is completed in the DES layer, the passenger is released back to the pedestrian layer and moves to the next spatial stage.

Queue spillback is handled through the physical capacity of each queueing area. Let $Q_k(t)$ be the DES queue length at station $k$, and let $K_k$ be the maximum number of passengers that can physically fit in that queueing area. The number of passengers placed in the physical queue is

$$Q_k^{\mathrm{phys}}(t) = \min\bigl(Q_k(t), K_k\bigr). \tag{3}$$

Passengers beyond this capacity remain in the upstream walking area and increase local pedestrian density. This coupling allows service delay, queue spillback, and walking congestion to affect one another during disrupted operation.
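A short numerical sketch of Equation (3) follows. The helper name `split_queue` and the example numbers are illustrative and not taken from the case study.

```python
def split_queue(q_k, k_k):
    """Split a DES queue length into physical-queue and upstream counts."""
    q_phys = min(q_k, k_k)   # Equation (3)
    upstream = q_k - q_phys  # passengers left in the upstream walking area
    return q_phys, upstream


# Hypothetical station with 8 physical queue positions.
during_disruption = split_queue(12, 8)
nominal = split_queue(5, 8)
```

With 12 passengers queued and 8 positions, 4 passengers remain upstream and add to walking-area density; with 5 queued, no one spills back.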

3.4. LLM-Generated Passenger-Level Post-Security Decisions

The LLM is used only after passengers complete security and enter the post-security area. Its role is to generate structured passenger-level decisions under degraded display or guidance conditions.

For each affected passenger, the input state includes the intended gate, time to departure, display status, guidance status, local density, display-area density, gate-area density, current time, scenario, and previous action. The LLM returns one decision with five fields: action, next_target, dwell_time_sec, reason_code, and confidence. The possible actions and targets are predefined by the simulation model.

The prompt used for passenger-level post-security decision generation is shown below.

Generate one passenger-level post-security decision for an airport
disruption scenario.

You are simulating one passenger inside an airport terminal.

The passenger has already passed security and is now in the
post-security area.

Current passenger state:

* Passenger ID: {passenger_id}

* Intended destination: {intended_destination}

* Time to departure: {time_to_departure}

* Previous action: {previous_action}

Current airport condition:

* Scenario: {scenario}

* Display status: {display_status}

* Guidance status: {guidance_status}

* Local/display/gate density: {density_state}

Choose one allowed action and target:

* action: gate/display/staff/wait/shop/food/restroom/wrong route

* next target: gate/display area/staff help/shop/food/restroom/wrong area

* dwell time: 0–180 s

Return ONLY JSON:

{"action": "…", "next_target": "…", "dwell_time_sec": number,
"reason_code": "…", "confidence": number}

The generated decision is applied only in the JuPedSim layer. A passenger may go directly to the gate, check a display, ask staff, wait, visit a shop, food area, or restroom, or first move to a wrong intermediate area before being redirected to the gate. These decisions change only the passenger’s post-security movement path and dwell time.
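Keeping the LLM in a bounded role implies checking each reply against the predefined action and target sets before it reaches the pedestrian layer. The sketch below shows one way such a check might look. The allowed values are copied from the prompt; the fallback decision, the assumed confidence range of 0 to 1, and the function name `parse_decision` are assumptions, since the paper does not describe how invalid replies are handled.

```python
import json

ALLOWED_ACTIONS = {"gate", "display", "staff", "wait", "shop", "food",
                   "restroom", "wrong route"}
ALLOWED_TARGETS = {"gate", "display area", "staff help", "shop", "food",
                   "restroom", "wrong area"}
# Hypothetical fallback: send the passenger straight to the gate.
FALLBACK = {"action": "gate", "next_target": "gate", "dwell_time_sec": 0.0,
            "reason_code": "fallback", "confidence": 0.0}


def parse_decision(raw):
    """Parse one LLM reply; return the fallback if it breaks the schema."""
    try:
        d = json.loads(raw)
        ok = (d["action"] in ALLOWED_ACTIONS
              and d["next_target"] in ALLOWED_TARGETS
              and 0 <= float(d["dwell_time_sec"]) <= 180   # range from the prompt
              and 0 <= float(d["confidence"]) <= 1          # assumed range
              and isinstance(d["reason_code"], str))
    except (ValueError, KeyError, TypeError):
        return dict(FALLBACK)
    return d if ok else dict(FALLBACK)


reply = ('{"action": "display", "next_target": "display area", '
         '"dwell_time_sec": 45, "reason_code": "display_down", "confidence": 0.7}')
decision = parse_decision(reply)
```

A validator of this kind means a malformed or out-of-range reply can only change a passenger's post-security path to a predefined default, never the service model.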

For reproducibility, each run records the prompt template, model setting, passenger state, random seed, and generated decision. The ablation study compares these LLM-generated decisions with rule-based, random, and no-guidance baselines.

3.5. Performance, Validation, and Benchmark Metrics

This study evaluates airport performance and resilience through operational, spatial, validation, and benchmark metrics. Here, resilience is interpreted as the ability of the terminal system to retain service performance under cyber disruption. The main operational metrics are system-exit throughput, queue length, waiting time, and total completion time.

Let

Y (t)

denote the cumulative number of passengers who have exited the modeled system by time t. System-exit throughput over a sampling interval

Δ t

is computed as

x (t) = \frac{Y (t + Δ t) - Y (t)}{Δ t} .

(4)

Queue burden is measured by the average and peak queue length at each service subsystem. Passenger waiting time is measured as the time between joining a queue and starting service. Total completion time is defined as the time when the last passenger exits the modeled system:

$$T_{clear} = \max_{i} t_{i}^{exit}, \tag{5}$$

where $t_{i}^{exit}$ is the exit time of passenger $i$.
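Both quantities can be computed directly from the recorded passenger exit times. The sketch below is illustrative only: the function names and the sample exit times are hypothetical, and the cumulative count counts passengers whose exit time is less than or equal to the query time.

```python
from bisect import bisect_right


def cumulative_exits(exit_times, t):
    """Y(t): number of passengers with exit time <= t (exit_times must be sorted)."""
    return bisect_right(exit_times, t)


def throughput(exit_times, t, dt):
    """x(t) = (Y(t + dt) - Y(t)) / dt, in passengers per time unit."""
    return (cumulative_exits(exit_times, t + dt) - cumulative_exits(exit_times, t)) / dt


def clearing_time(exit_times):
    """T_clear = max_i t_i^exit."""
    return max(exit_times)


# Hypothetical exit times in hours, sorted for the bisect lookup.
exits = sorted([0.5, 1.2, 1.8, 2.4, 2.9, 4.0])
print(throughput(exits, t=1.0, dt=2.0))  # 4 exits in (1.0, 3.0] over 2 h -> 2.0
print(clearing_time(exits))              # 4.0
```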

Spatial outputs from the pedestrian layer include passenger paths, area occupancy, display use, optional activity visits, and gate-arrival behavior. These outputs help explain whether a disruption causes delay mainly through queue buildup, route confusion, post-security dwell, or spatial congestion.

For empirical validation, simulated outputs are compared with observed airport data using MAE and RMSE:

$$\mathrm{MAE} = \frac{1}{N} \sum_{n = 1}^{N} |\hat{y}_{n} - y_{n}|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n = 1}^{N} (\hat{y}_{n} - y_{n})^{2}}, \tag{6}$$

where $y_{n}$ is the observed value and $\hat{y}_{n}$ is the simulated value. Additional validation measures include calibration error, $R^{2}$, KS distance, and Wasserstein distance when matched observed data are available.
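As a small illustration of Equation (6), the two error measures can be computed from matched simulated and observed values as follows. The function names and the four sample values are hypothetical and serve only to make the arithmetic concrete.

```python
import math


def mae(simulated, observed):
    """Mean absolute error between matched simulated and observed values."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(observed)


def rmse(simulated, observed):
    """Root mean squared error between matched simulated and observed values."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed))


observed = [10.0, 12.0, 15.0, 20.0]    # hypothetical observed values
simulated = [11.0, 12.0, 13.0, 24.0]   # hypothetical simulated values
print(mae(simulated, observed))   # (1 + 0 + 2 + 4) / 4 = 1.75
print(rmse(simulated, observed))  # sqrt((1 + 0 + 4 + 16) / 4) = sqrt(5.25)
```

Because RMSE squares each error, the single 4-unit miss dominates it, which is why RMSE exceeds MAE whenever the errors are unequal in size.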

For benchmark analysis, service utilization is computed as

$$U_{k} = \frac{B_{k}}{A_{k}}, \tag{7}$$

where $B_{k}$ is the accumulated busy server time and $A_{k}$ is the accumulated available server time at subsystem $k$.
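A minimal sketch of Equation (7) is given below. It assumes a fixed number of servers that are all open for the same period, so that the available time is the server count multiplied by the open time; with dynamically opened desks, the available time would instead have to be accumulated per server. The function name and the sample intervals are hypothetical.

```python
def utilization(busy_intervals, n_servers, open_time):
    """U_k = B_k / A_k for one subsystem.

    busy_intervals: (start, end) service intervals across all servers.
    Assumption: A_k = n_servers * open_time (constant server count).
    """
    available = n_servers * open_time
    if available <= 0:
        raise ValueError("available server time must be positive")
    busy = sum(end - start for start, end in busy_intervals)
    return busy / available


# Hypothetical example: two desks open for 100 min, 100 min of total service time.
print(utilization([(0, 30), (10, 50), (60, 90)], n_servers=2, open_time=100))  # 0.5
```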

3.6. Implementation and Scenario Execution

The framework is implemented as a coupled simulation pipeline. Each scenario is defined by the terminal layout, passenger population, service rates, queue capacities, attack start time $t_{attack}$, degradation factors $\alpha_{k}$, and post-security decision rules for guidance-affected passengers. The same layout and demand settings are used for the baseline and disrupted scenarios.

For each run, the model records operational and spatial outputs, including queue lengths, system exits, throughput, passenger trajectories, area occupancy, post-security activity use, and total processing time. These outputs are used to compare nominal and disrupted conditions and to evaluate how cyber degradation affects both service logic and passenger movement.
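One way to hold the scenario inputs listed above is a small configuration object. The sketch below is an assumption-laden illustration rather than the authors' data structure: the class and field names are hypothetical, the numeric values are placeholders, and the degradation factor is assumed to scale the nominal service rate multiplicatively once the attack has started. The degradation model defined earlier in the paper takes precedence over this simplified form.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scenario:
    name: str
    t_attack: float                              # attack start time (s)
    alpha: dict = field(default_factory=dict)    # degradation factor per subsystem k
    guidance_degraded: bool = False

    def service_rate(self, subsystem, nominal_rate, t):
        """Nominal rate before t_attack; alpha_k * nominal rate afterwards (assumed form)."""
        if t < self.t_attack:
            return nominal_rate
        return self.alpha.get(subsystem, 1.0) * nominal_rate


# Placeholder values; the same layout and demand would be shared by all scenarios.
baseline = Scenario("baseline", t_attack=float("inf"))
checkin_degraded = Scenario("checkin_degraded", t_attack=600.0, alpha={"checkin": 0.5})

print(checkin_degraded.service_rate("checkin", 2.0, t=0.0))     # 2.0 before the attack
print(checkin_degraded.service_rate("checkin", 2.0, t=900.0))   # 1.0 after the attack
print(checkin_degraded.service_rate("security", 2.0, t=900.0))  # 2.0, subsystem unaffected
```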

3.7. Experiment Protocol

For each scenario, the simulation follows the same procedure. First, the baseline DES + JuPedSim case is run under normal service and guidance conditions. Second, the disrupted scenario is run by applying the specified service degradation factors $\alpha_{k}$ and, when applicable, degraded display or guidance conditions. Third, for passengers affected by degraded guidance after security, the LLM generates structured post-security decisions, including action, target, dwell time, reason code, and confidence. These decisions are then applied only in the JuPedSim movement layer.

Each scenario is repeated under multiple random seeds. The reported results are averaged across repeated runs. The final outputs include throughput, queue length, waiting time, completion time, passenger trajectories, area occupancy, and post-security activity use.
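The repeat-and-average step can be sketched as follows. Here `run_once` is a hypothetical stand-in for one coupled DES + JuPedSim run and returns placeholder numbers that carry no empirical meaning; only the seeding and averaging pattern is the point of the example.

```python
import random
import statistics


def run_once(seed):
    """Hypothetical stand-in for one simulation run; values are placeholders."""
    rng = random.Random(seed)
    return {
        "completion_time_h": 10.0 + rng.random(),
        "mean_queue": 3.0 + rng.random(),
    }


def average_over_seeds(run, seeds):
    """Run the same scenario once per seed and average each output metric."""
    results = [run(seed) for seed in seeds]
    return {key: statistics.mean([r[key] for r in results]) for key in results[0]}


summary = average_over_seeds(run_once, seeds=range(5))
print(summary)
```

Using the same seed list for the baseline and for every disrupted scenario keeps the comparison matched, which is also what a paired test across scenarios would require.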

4. Case Studies

This section applies the proposed hybrid DES–JuPedSim–LLM framework through three case studies. Case Study 1 uses a 500-passenger departure-terminal scenario to evaluate cyber-disruption effects. Case Study 2 uses the Rotterdam The Hague Airport checkpoint dataset to compare the security-checkpoint queue and service logic with observed passenger-level data. Case Study 3 uses published Palma de Mallorca Airport parameters to examine whether the simulation logic produces plausible queue, waiting-time, throughput, and utilization patterns under real-airport operating assumptions.

4.1. Case Study 1: Cyber-Disruption Terminal Simulation

Case Study 1 evaluated how cyber-induced degradation affects passenger processing and movement in a 500-passenger departure-terminal scenario. The study included one baseline scenario and four disruption scenarios: check-in degraded, guidance degraded, security degraded, and all degraded.

The terminal layout used in Case Study 1 is shown in Figure 2 and Figure 3. Figure 2 shows the main spatial components, including entrances, waiting areas, ticket counters, information desks, security stations, post-security activity areas, restrooms, and gates. Figure 3 shows the feasible passenger routes encoded in the layout. These layout elements define the physical environment used by the coupled DES and JuPedSim model.

Table 1 summarizes the main layout and simulation inputs used in Case Study 1, including terminal size, service resources, queue capacities, destinations, and pedestrian movement assumptions.

Table 2 summarizes the main behavioral and disruption parameters used in Case Study 1. The parameter values are based on published airport simulation studies, airport passenger-behavior studies, and reported airport cyber/IT disruption evidence. The check-in settings follow prior airport check-in modeling and reported manual check-in delays during the Collins Aerospace cyberattack [33,34,35]. The degraded security setting is treated as a stress scenario informed by reported cyber-disrupted terminal operations at BER [36,37]. The information-desk, display-use, wrong-route, optional-activity, and dwell-time assumptions are informed by airport capacity, wayfinding, passenger-activity, flight-information display, and dwell-time studies [28,29,30,31,38,39,40].

Cyber-Disruption Simulation Results

Figure 4 compares average system-exit throughput across the baseline and disruption scenarios. The baseline case has the highest throughput at 21.68 passengers/h. Guidance disruption causes only a small reduction to 21.32 passengers/h. In contrast, check-in and security disruptions reduce throughput substantially, to 11.65 and 11.36 passengers/h, respectively. The combined disruption produces the lowest throughput at 11.31 passengers/h. These results show that throughput loss is driven mainly by degraded check-in and security processing. Guidance disruption alone has a limited effect on total throughput, and the combined case remains close to the check-in and security disruption cases because the main bottleneck is already created by degraded service capacity.

Figure 5 compares average check-in and security queue lengths across the baseline and disruption scenarios. Under baseline conditions, the average queue lengths are 3.589 at check-in and 3.104 at security. When check-in service is degraded, the check-in queue increases to 3.823, while the security queue decreases to 1.237 because fewer passengers reach the downstream security stage. When security service is degraded, the security queue increases to 3.722, while the check-in queue remains close to baseline at 3.413. The guidance-degraded case keeps both queues close to the upper range observed across the scenarios, with average queue lengths of 3.921 at check-in and 3.707 at security. The combined case produces the highest average values at both check-in and security, with queue lengths of 3.935 and 3.843, respectively.

Figure 6 compares total completion time across the baseline and disruption scenarios. The baseline case clears the modeled passenger population in 23.06 h, while guidance degradation causes only a small increase to 23.45 h. In contrast, check-in and security degradation substantially increase completion time to 43.29 h and 44.00 h, respectively. The combined case produces the longest completion time at 44.21 h. These results show that system clearing time is driven mainly by degradation at the main service stations, while guidance degradation alone has a limited effect on total completion time.

Figure 7 compares average and 95th-percentile waiting times at check-in and security. Guidance degradation keeps waiting times close to baseline, while service degradation shifts delay toward the affected stage. Under check-in degradation, the average check-in wait rises to 19.86 min and the P95 check-in wait rises to 23.39 min, while security waiting decreases because fewer passengers reach the downstream stage. Under security degradation, the opposite pattern appears, with the average security wait rising to 19.79 min and the P95 security wait rising to 23.50 min. In the combined case, both check-in and security waits remain high, showing that multi-stage service degradation distributes delay across both major processing stages.

Figure 8 shows system-exit throughput over time for the baseline and four disruption scenarios. The baseline case remains relatively stable after the initial transient. In contrast, the check-in-degraded, security-degraded, and combined-degraded cases show a clear throughput drop after the attack onset and remain in a lower operating range for the rest of the simulation. The guidance-degraded case stays comparatively close to the baseline trajectory, indicating that degraded guidance alone has a smaller effect on throughput than degraded processing capacity. This figure complements the aggregate results by showing not only the magnitude of throughput loss but also when the change occurs and how long it persists.

Figure 9 shows spatial congestion heatmaps for the four disruption scenarios. Check-in degradation concentrates congestion near the entrance and check-in waiting area, while security degradation shifts the main hotspot to the security subsystem. The guidance-degraded case remains lighter and more dispersed, consistent with its smaller effect on throughput and waiting-time metrics. In the combined case, both check-in and security hotspots appear in the same layout, showing that congestion is sustained across both major processing stages. These spatial patterns reveal where disruption burdens form and show that the framework identifies bottleneck locations and upstream–downstream congestion effects, rather than only reporting aggregate throughput loss.

4.2. Case Study 2: Rotterdam Checkpoint Validation

Case Study 2 validated the security-checkpoint queue and service logic using the Rotterdam The Hague Airport security-checkpoint dataset released with Janssen et al. [41,42]. The dataset provides passenger-level timing observations from a real airport checkpoint, allowing simulated outputs to be compared with observed security-stage time, occupation, and throughput.

The Rotterdam data contain 2277 passenger records organized into 13 observation blocks. A leave-one-block-out validation procedure was used. For each held-out block, the security-stage time distribution was calibrated using the remaining blocks, and the observed passenger arrivals to the checkpoint were used as the simulation input stream. The security-only DES model was then run with 30 stochastic replications, and the simulated security-stage time, occupation, and throughput were compared with the observed values. Checkpoint occupation was used as a queue-related proxy.
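The leave-one-block-out loop can be sketched as below. This is a simplified illustration under stated assumptions: the block identifiers and timing values are hypothetical, and the calibration and simulation steps are replaced by a single stand-in (the mean of the training blocks) so that the hold-out structure is visible without the full DES model.

```python
import statistics


def leave_one_block_out(blocks):
    """blocks maps a block id to observed security-stage times (s) in that block.

    For each held-out block, 'calibrate' on all other blocks and report the
    absolute error of the predicted mean against the held-out mean.
    """
    errors = {}
    for held_out, observed in blocks.items():
        training = [t for block_id, times in blocks.items()
                    if block_id != held_out for t in times]
        predicted_mean = statistics.mean(training)  # stand-in for calibrate + simulate
        errors[held_out] = abs(predicted_mean - statistics.mean(observed))
    return errors


# Hypothetical data: three blocks instead of the 13 Rotterdam observation blocks.
blocks = {"A": [60.0, 80.0], "B": [70.0, 90.0], "C": [100.0, 120.0]}
print(leave_one_block_out(blocks))  # {'A': 25.0, 'B': 10.0, 'C': 35.0}
```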

Table 3 reports the validation results against the Rotterdam checkpoint observations. The calibrated security-stage timing distribution closely matches the observed data, with a timing MAE of 16.11 s, RMSE of 23.26 s, and timing-distribution $R^{2} = 0.882$. The throughput comparison is also close in aggregate: under 15 min buckets, the throughput MAE is 7.50 passengers/h and $R^{2}$ reaches 0.933. Overall, the validation supports the security-checkpoint queue and service component of the model.

Figure 10 compares observed and simulated checkpoint throughput using 15 min validation buckets. The simulated mean closely follows the observed Rotterdam throughput series, with only short-period deviations. This visual agreement is consistent with the quantitative validation results in Table 3, where the 15 min throughput MAE is 7.50 passengers/h and the throughput $R^{2}$ is 0.933.

Figure 11 compares observed and simulated checkpoint occupation using 15 min validation buckets. The simulated series captures the general scale of and variation in the observed occupation pattern, although some bucket-level differences remain. Because the public Rotterdam data do not provide a separate pre-entry queue length, occupation is used as a queue-related proxy rather than as a direct observed queue-length measure.

Figure 12 compares the empirical cumulative distributions of observed and simulated security-stage times. The two curves closely overlap across most of the distribution, consistent with the KS distance of 0.132 and Wasserstein distance of 16.11 s reported in Table 3. This result shows that the validation captures not only the mean timing value but also the overall shape of the passenger-level timing distribution.
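For readers unfamiliar with the two distribution-fit measures, the sketch below computes them from two samples using empirical cumulative distribution functions. It is a plain-Python illustration with hypothetical function names and toy data, not the code used for Table 3. The KS distance is the largest vertical gap between the two empirical CDFs, and the one-dimensional Wasserstein distance is the area between them.

```python
from bisect import bisect_right


def _ecdf(sorted_sample, x):
    """Empirical CDF of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_distance(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    # Both CDFs are step functions, so the maximum occurs at a sample point.
    return max(abs(_ecdf(a, x) - _ecdf(b, x)) for x in a + b)


def wasserstein_distance(a, b):
    """Area between the two empirical CDFs (1-D Wasserstein-1 distance)."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    return sum(abs(_ecdf(a, lo) - _ecdf(b, lo)) * (hi - lo)
               for lo, hi in zip(grid, grid[1:]))


# Toy samples: b is a shifted by one unit, so the Wasserstein distance is 1.
a = [0.0, 1.0, 2.0]
b = [1.0, 2.0, 3.0]
print(ks_distance(a, b))           # one third
print(wasserstein_distance(a, b))  # one unit
```

The Wasserstein distance carries the units of the data (seconds for security-stage times), whereas the KS distance is a dimensionless value between 0 and 1.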

4.3. Case Study 3: Palma Benchmark and Supporting Evidence

Case Study 3 provided a supporting benchmark for the simulation logic using published Palma de Mallorca Airport parameters and related operational evidence. The Palma benchmark was used as the main normal-operation reference because it provides published arrival profiles, routing split, service rates, dynamic security resources, and benchmark queue statistics [43]. Additional evidence from TSA and CATSA provided broader throughput and wait-time context [44,45,46], while SEA, Collins, and EUROCONTROL records support the plausibility of cyber-disruption scenarios involving degraded guidance, manual check-in, and multi-system passenger-processing disruption [34,47,48,49].

The benchmark followed a five-part DES structure: departure hall, check-in, corridor, security control, and boarding gates. Passenger arrivals used the published Weibull profiles for international and Schengen/domestic flows. The routing split followed the Palma study, with 87% of passengers entering check-in and 13% proceeding directly to security. Check-in and security processing times, reference capacities, and dynamic security desk rules are summarized in Table 4.
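The arrival and routing inputs described above can be sketched with the Python standard library. Only the 87%/13% routing split comes from the text; the Weibull scale and shape values below are placeholders rather than the published Palma parameters, and the Weibull draw is treated simply as an arrival offset in minutes, which is an assumption of this example.

```python
import random

P_CHECKIN = 0.87  # share of passengers entering check-in first (from the text)


def generate_passengers(n, scale_min, shape, seed):
    """Draw arrival offsets from a Weibull profile and assign the first stage."""
    rng = random.Random(seed)
    passengers = []
    for _ in range(n):
        passengers.append({
            # random.weibullvariate(alpha, beta): alpha is scale, beta is shape.
            "arrival_min": rng.weibullvariate(scale_min, shape),
            "first_stage": "check-in" if rng.random() < P_CHECKIN else "security",
        })
    return sorted(passengers, key=lambda p: p["arrival_min"])


# Placeholder parameters, not the published Palma values.
pax = generate_passengers(n=5000, scale_min=90.0, shape=2.0, seed=1)
share = sum(p["first_stage"] == "check-in" for p in pax) / len(pax)
print(round(share, 3))
```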

Figure 13 shows the five-cell DES benchmark structure adapted from the Palma passenger-flow model.

The Palma benchmark was executed under three normal-operation arrival settings: international, Schengen/domestic, and mixed demand. Each setting used 500 passengers and 30 random seeds. The international and Schengen/domestic cases used two check-in desks, while the mixed case was tested with 2, 4, 6, and 8 open check-in desks. Table 5 summarizes the two single-profile runs. In both cases, the realized routing pattern follows the Palma assumption that most passengers enter check-in before security. With two check-in desks, check-in utilization is high, reaching 0.955 in the international case and 0.927 in the Schengen/domestic case. This produces large check-in queues and long check-in waits, while security queues remain small because dynamic security resources absorb the downstream flow.

Table 6 reports the mixed-demand sensitivity analysis. Increasing the number of open check-in desks from two to eight reduces the average check-in queue from 128.36 to 7.01 passengers and the average check-in wait from 67.43 to 0.61 min. Check-in utilization also decreases from 0.941 to 0.547. These results show that queue formation is mainly driven by near-saturated check-in capacity; once additional check-in desks are opened, the upstream bottleneck is largely relieved.

Figure 14 shows service utilization profiles for the mixed-demand Palma benchmark under different check-in desk configurations. With two check-in desks, check-in utilization remains near saturation for a long period, which is consistent with the large check-in queues reported in Table 6. As additional check-in desks are opened, check-in utilization decreases and the upstream bottleneck is relieved. Security utilization remains lower in comparison, indicating that the main capacity constraint in this benchmark is the check-in stage rather than security screening.

4.4. Ablation Study of the LLM Decision Module

This section evaluates the contribution of the LLM decision module through an ablation study. The goal is to test whether LLM-generated post-security decisions produce different passenger behavior from simpler alternatives. The comparison includes four variants: an LLM decision variant, a rule-based variant, a random-decision variant, and a no-guidance-effect variant. The analysis focuses on post-security outcomes because the LLM acts only after passengers complete security and does not change DES service capacity or queue rules.

Table 7 reports the LLM inference settings used in the ablation study. The model was accessed through the DeepSeek API with temperature set to zero and top-$p = 1$ to reduce sampling variability. Randomness in the experiments comes from the simulation seeds used for passenger generation, routing, and dwell-time realization.

The ablation used 20 post-security passengers for each guidance-sensitive scenario. Table 8 compares how the LLM, rule-based, random, and no-guidance variants affect post-security actions and time outcomes. The table is intended to show whether the LLM module changes the specific behavior layer it controls: where passengers go after security and how long they remain in the post-security area. The key comparisons are the action counts for display checking, staff assistance, wrong-route movement, and optional activities, as well as the resulting dwell time, post-security time, and gate delay.

The selected paired t-tests in Table 9 compare the LLM decision variant with the rule-based, random, and no-guidance baselines using matched seeds. The results are consistent with the descriptive ablation results in Table 8. For both display-degraded and combined-degraded scenarios, the LLM variant produces statistically distinguishable post-security travel times compared with the random and no-guidance baselines. The LLM variant also shows significant differences in wrong-route counts compared with the rule-based baseline. These results indicate that the LLM decision module does not simply reproduce rule-based or random behavior. Instead, it generates a distinct post-security behavior pattern, mainly by changing dwell burden, wrong-route behavior, and gate-arrival delay.
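The matched-seed comparison rests on the paired t statistic, which is computed from per-seed differences rather than from the two groups separately. The sketch below shows that computation with hypothetical values; it returns only the statistic and the degrees of freedom, and a p-value would additionally require the t distribution (for example via `scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for matched samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical post-security times (s) under the same three seeds.
llm_variant = [100.0, 110.0, 105.0]
random_variant = [101.0, 112.0, 108.0]
t_stat, dof = paired_t_statistic(llm_variant, random_variant)
print(t_stat, dof)  # differences -1, -2, -3 give t = -2 * sqrt(3), dof = 2
```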

Overall, the ablation results demonstrate the added value of the LLM decision module. Compared with the random baseline, the LLM variant produces lower dwell burden, shorter post-security travel time, and lower gate delay, showing that its decisions are more structured than simple random action selection. Compared with the rule-based baseline, the LLM variant produces fewer wrong-route decisions while still generating diverse post-security behaviors, including display checking, staff assistance, and optional activity visits. Compared with the no-guidance baseline, the LLM variant captures guidance-related detours and dwell behavior that would otherwise be absent. These results show that the LLM module contributes a more context-sensitive and operationally interpretable behavior layer for degraded guidance conditions, while DES and JuPedSim continue to control service processing and physical movement.

5. Conclusions

This paper presented a hybrid simulation framework for airport cyber-resilience assessment by integrating DES-based passenger-processing logic, JuPedSim-based pedestrian movement, and LLM-supported post-security decision modeling. The framework captures both operational effects, such as degraded service rates, queue growth, and throughput loss, and spatial-behavioral effects, such as detours, dwell, route confusion, and congestion in terminal space. The main contribution is a coupled methodology for examining how cyber disruption propagates through service processes, passenger movement, and degraded guidance conditions.

A key contribution is the use of structured passenger-specific LLM decisions to represent post-security behavior under degraded display and guidance conditions. The LLM does not control the full simulation or modify service capacities. Instead, it maps passenger-specific local states to predefined post-security actions, such as going to the gate, checking a display, asking staff, waiting, visiting optional activity areas, or first moving to a wrong intermediate area. This design allows the framework to include guidance-sensitive behavior while keeping the simulation interpretable and auditable.

The case-study results show that the framework produces consistent disruption mechanisms. Check-in and security degradation create major throughput loss and longer completion times, while guidance degradation alone has a smaller effect on aggregate throughput but changes post-security behavior. The spatial heatmaps further show where bottlenecks form and how congestion shifts across the terminal. These findings demonstrate that the framework does more than report aggregate performance loss; it identifies how and where disruption burdens emerge.

The additional validation and benchmark results further support the framework. The Rotterdam checkpoint validation shows that the security-processing component can reproduce observed passenger-level timing and throughput patterns with reasonable MAE, RMSE, calibration error, and distribution-fit results. The Palma benchmark shows that the simulation logic produces plausible queue, waiting-time, throughput, and utilization patterns when parameterized with published real-airport inputs. The LLM ablation study further demonstrates the added value of the LLM decision module. Compared with the random baseline, the LLM variant produces lower dwell burden, shorter post-security travel time, and lower gate delay. Compared with the rule-based baseline, it produces fewer wrong-route decisions while still generating diverse post-security behaviors, including display checking, staff assistance, and optional activity visits. These results show that the LLM module provides a more context-sensitive and operationally interpretable behavior layer than simple random or fixed-rule alternatives.

Future work should extend the framework with broader airport-wide calibration, larger passenger volumes, richer recovery strategies, and additional real-world disruption datasets. These extensions would allow the model to support more detailed resilience planning and mitigation analysis across different airport terminal settings.

Author Contributions

Conceptualization, L.G., Y.L., D.L. and H.C.; methodology, T.S.K. and L.G.; software, T.S.K.; validation, T.S.K., L.G., Y.L., D.L. and H.C.; formal analysis, T.S.K. and L.G.; investigation, T.S.K. and L.G.; resources, L.G., Y.L., D.L. and H.C.; data curation, T.S.K.; writing—original draft preparation, T.S.K. and L.G.; writing—review and editing, T.S.K., L.G., Y.L., D.L. and H.C.; visualization, T.S.K.; supervision, L.G., Y.L., D.L. and H.C.; project administration, L.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data and code presented in this study are openly available on GitHub at https://github.com/aris-research/airport-simulation-code (accessed on 29 May 2026).

Acknowledgments

The authors acknowledge the use of open-source simulation and typesetting tools in preparing this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Airports Council International (ACI) World; International Civil Aviation Organization (ICAO). Joint ACI World-ICAO Passenger Traffic Report, Trends, and Outlook. 2025. Available online: https://aci.aero/2025/01/28/joint-aci-world-icao-passenger-traffic-report-trends-and-outlook/ (accessed on 29 May 2026).
International Air Transport Association (IATA). The CrowdStrike IT Outage Carries Lessons for Regulators. 2024. Available online: https://www.iata.org/en/pressroom/opinions/the-crowdstrike-it-outage-carries-lessons-for-regulators/ (accessed on 29 May 2026).
Lykou, G.; Anagnostopoulou, A.; Gritzalis, D. Smart airport cybersecurity: Threat mitigation and cyber resilience controls. Sensors 2019, 19, 19. [Google Scholar] [CrossRef] [PubMed]
Jim, H.K.; Chang, Z.Y. An airport passenger terminal simulator: A planning and design tool. Simul. Pract. Theory 1998, 6, 387–396. [Google Scholar] [CrossRef]
Fonseca, P.; Casanovas, J.; Ferran, X. Passenger flow simulation in a hub airport: An application to the Barcelona International Airport. Simul. Model. Pract. Theory 2014, 44, 78–94. [Google Scholar] [CrossRef]
Piekert, F.; Schaper, M.; Stelkens-Kobsch, T.H.; Predescu, A.V.; Günther, Y.; Carstengerdes, N. Mitigation of operational impacts on airports by early awareness of malicious events impacting linked critical infrastructures. J. Air Transp. Res. Soc. 2024, 1, 100011. [Google Scholar] [CrossRef]
Wu, P.P.Y.; Mengersen, K. A review of models and model usage scenarios for an airport complex system. Transp. Res. Part A Policy Pract. 2013, 47, 124–140. [Google Scholar] [CrossRef]
Gao, C.; Lan, X.; Li, N.; Yuan, Y.; Ding, J.; Zhou, Z.; Xu, F.; Li, Y. Large language models empowered agent-based modeling and simulation: A survey and perspectives. Humanit. Soc. Sci. Commun. 2024, 11, 1259. [Google Scholar] [CrossRef]
Huang, C.N.; Liou, J.J.H.; Lo, H.W.; Chang, F.J. Building an assessment model for measuring airport resilience. J. Air Transp. Manag. 2021, 95, 102101. [Google Scholar] [CrossRef]
Zhou, L.; Chen, Z. Measuring the performance of airport resilience to severe weather events. Transp. Res. Part D Transp. Environ. 2020, 83, 102362. [Google Scholar] [CrossRef]
Zapola, G.S.; Silva, E.J.; Alves, C.J.P.; Müller, C. Towards a resilience assessment framework for the airport passenger terminal operations. J. Air Transp. Manag. 2024, 114, 102508. [Google Scholar] [CrossRef]
Takakuwa, S.; Oyama, T. Simulation analysis of international-departure passenger flows in an airport terminal. In Proceedings of the 2003 Winter Simulation Conference, New Orleans, LA, USA, 7–10 December 2003; pp. 1627–1634. [Google Scholar] [CrossRef]
Ruiz, S.; Cheu, R.L. Simulation model to support security screening checkpoint operations in airport terminals. Transp. Res. Rec. 2020, 2674, 381–394. [Google Scholar] [CrossRef]
Dorton, S.L.; Liu, D. Effects of Baggage Volume and Alarm Rate on Airport Security Screening Checkpoint Efficiency Using Queuing Networks and Discrete Event Simulation. Hum. Factors Ergon. Manuf. Serv. Ind. 2016, 26, 95–109. [Google Scholar] [CrossRef]
Oprea, C.; Rosca, M.; Rosca, E.; Costea, I.; Ilie, A.; Dinu, O.; Rusca, A. Analyzing passenger flows in an airport terminal: A discrete simulation model. Computation 2024, 12, 223. [Google Scholar] [CrossRef]
Lin, J.; Song, R.; Dai, J.; Jiao, P. Pedestrian guiding signs optimization for airport terminal. Discret. Dyn. Nat. Soc. 2014, 2014, 125910. [Google Scholar] [CrossRef]
Kalakou, S.; Moura, F. Analyzing passenger behavior in airport terminals based on activity preferences. J. Air Transp. Manag. 2021, 96, 102110. [Google Scholar] [CrossRef]
Alam, M.J.; Habib, M.A.; Holmes, D. Pedestrian movement simulation for an airport considering social distancing strategy. Transp. Res. Interdiscip. Perspect. 2022, 13, 100527. [Google Scholar] [CrossRef]
Wu, P.P.Y.; Pitchforth, J.; Mengersen, K. A hybrid queue-based Bayesian network framework for passenger facilitation modelling. Transp. Res. Part C Emerg. Technol. 2014, 46, 247–260. [Google Scholar] [CrossRef]
Metzner, N. A comparison of agent-based and discrete event simulation for assessing airport terminal resilience. Transp. Res. Procedia 2019, 43, 209–218. [Google Scholar] [CrossRef]
Ma, W.; Fookes, C.; Kleinschmidt, T.; Yarlagadda, P.K.D.V. Modelling passengers flow at airport terminals - individual agent decision model for stochastic passenger behaviour. In Proceedings of the 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications, Rome, Italy, 28–31 July 2012; pp. 109–113. [Google Scholar] [CrossRef]
Anagnostopoulou, A.; Tolikas, D.; Spyrou, E.; Akac, A.; Kappatos, V. The analysis and AI simulation of passenger flows in an airport terminal: A decision-making tool. Sustainability 2024, 16, 1346. [Google Scholar] [CrossRef]
Aher, G.; Arriaga, R.I.; Kalai, A.T. Using large language models to simulate multiple humans and replicate human subject studies. arXiv 2022, arXiv:2208.10264. [Google Scholar]
Park, J.S.; O’Brien, J.C.; Cai, C.J.; Morris, M.R.; Liang, P.; Bernstein, M.S. Generative agents: Interactive simulacra of human behavior. arXiv 2023, arXiv:2304.03442. [Google Scholar] [CrossRef]
Park, J.S.; Zou, C.Q.; Shaw, A.; Hill, B.M.; Cai, C.; Morris, M.R.; Willer, R.; Liang, P.; Bernstein, M.S. Generative Agent Simulations of 1,000 People. arXiv 2024, arXiv:2411.10109. [Google Scholar]
Xie, C.; Chen, C.; Jia, F.; Ye, Z.; Lai, S.; Shu, K.; Gu, J.; Bibi, A.; Hu, Z.; Jurgens, D.; et al. Can large language model agents simulate human trust behavior? arXiv 2024, arXiv:2402.04559. [Google Scholar] [CrossRef]
Gao, L.; Liu, Y.; Chen, R.; Liu, D.; Zhang, Z.; Sun, L. Exploring Traffic Simulation and Cybersecurity Strategies Using Large Language Models. arXiv 2025, arXiv:2506.16699. [Google Scholar] [CrossRef]
National Academies of Sciences, Engineering, and Medicine. Enhancing Airport Wayfinding for Aging Travelers and Persons with Disabilities. Technical Report ACRP Research Report 177, Transportation Research Board. 2017. Available online: https://nap.nationalacademies.org/read/24930 (accessed on 23 May 2026).
Mehranian, H.; Fisher, D.L.; Duffy, S.A.; Niswander, E. Designing Flight Information Displays for Quick Information Access. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, San Francisco, CA, USA, 27 September–1 October 2010; Available online: https://www.ieda.ust.hk/dfaculty/ravi/papers/display_hfes2010.pdf (accessed on 23 May 2026).
National Academies of Sciences, Engineering, and Medicine. Framework and Tools for Incorporating Technologies into Airport In-Terminal Concessions Programs. Technical Report ACRP Research Report 279, Transportation Research Board. 2025. Available online: https://nap.nationalacademies.org/read/29145 (accessed on 23 May 2026).
Zuniga Alcaraz, C.; Mujica Mota, M.; Herrera García, A. Analyzing Airport Capacity by Simulation: A Mexican Case Study. In Handbook of Research on Military, Aeronautical, and Maritime Logistics and Operations; Ochoa-Zezzatti, A., Sanchez, J., Cedillo-Campos, M.G., Eds.; IGI Global Publishing: Hershey, PA, USA, 2016; pp. 115–150. [Google Scholar] [CrossRef]
Chraibi, M.; Kratz, K.; Schrödter, T.; The JuPedSim Development Team. JuPedSim. 2025. Available online: https://zenodo.org/records/15054966 (accessed on 29 May 2026).
Ahyudanari, E.; Vandebona, U. Simplified Model for Estimation of Airport Check-In Facilities. J. East. Asia Soc. Transp. Stud. 2005, 6, 724–735. [Google Scholar] [CrossRef]
Sky News. Heathrow Among Major Airports Hit by Delays After Cyber Attack. 2025. Available online: https://news.sky.com/story/passengers-stranded-by-cyber-attack-shouted-at-by-staff-despite-insane-queues-for-manual-check-in-13434669 (accessed on 23 May 2026).
International Civil Aviation Organization; World Customs Organization; International Air Transport Association. Guidelines on Advance Passenger Information. Technical Report, ICAO, WCO and IATA. 2014. Available online: https://www.iata.org/contentassets/18a5fdb2dc144d619a8c10dc1472ae80/api-guidelines-main-text_2014.pdf (accessed on 23 May 2026).
Flughafen Berlin Brandenburg GmbH. Update on Cyber Attack on BER Service Provider. 2025. Available online: https://corporate.berlin-airport.de/en/company-media/media-portal/pressemitteilungen/2025-10-02-update-zum-cyberangriff-auf-ber-dienstleister.html (accessed on 23 May 2026).
Travel And Tour World. Berlin Brandenburg Airport Struggles with Ongoing Cyberattack Disruptions, Passengers Experience Manual Check-Ins, Baggage Delays, and Security Checkpoint Congestion in Germany. 2025. Available online: https://www.travelandtourworld.com/news/article/berlin-brandenburg-airport-struggles-with-ongoing-cyberattack-disruptions-passengers-experience-manual-check-ins-baggage-delays-and-security-checkpoint-congestion-in-germany/ (accessed on 23 May 2026).
National Academies of Sciences, Engineering, and Medicine. Guidelines for Improving Airport Services for International Customers. Technical Report ACRP Research Report 161, Transportation Research Board. 2016. Available online: https://nap.nationalacademies.org/read/23683 (accessed on 23 May 2026).
Ma, W.; Yarlagadda, P. A Micro-Simulation of Airport Passengers with Advanced Traits. In Proceedings of the 28th International Congress of the Aeronautical Sciences, Brisbane, Australia, 23–28 September 2012; Available online: https://www.icas.org/icas_archive/ICAS2012/PAPERS/630.PDF (accessed on 23 May 2026).
Gwynne, S.M.V.; Hunt, A.L.E.; Thomas, J.R.; Thompson, A.J.L.; Séguin, L. The Toilet Paper: Bathroom Dwell Time Observations at an Airport. J. Build. Eng. 2019, 24, 100751.
Janssen, S.; Sharpanskykh, A.; Curran, R.; Langendoen, K. Data-Driven Analysis of Airport Security Checkpoint Operations. Aerospace 2020, 7, 69.
Janssen, S.; Sharpanskykh, A.; Curran, R.; Langendoen, K. Airport Security Checkpoint Data. 2020. Available online: https://data.4tu.nl/articles/_/12697205/1 (accessed on 23 May 2026).
Rodríguez-Sanz, Á.; Fernández de Orueta, I.; Comendador, F.G.; Valdés, R.M.A.; Pérez-Castán, J.A. Queue behavioural patterns for passengers at airport terminals: A machine learning approach. J. Air Transp. Manag. 2021, 90, 101940.
Transportation Security Administration. Weekly Passenger Throughput Data. 2026. Available online: https://catalog.data.gov/dataset/tsa-foia-reading-room-weekly-passenger-throughput-data (accessed on 23 May 2026).
Canadian Air Transport Security Authority. Key Performance Indicators. 2026. Available online: https://www.catsa-acsta.gc.ca/en/about-us/corporate-reporting/key-performance-indicators-kpis (accessed on 23 May 2026).
Canadian Air Transport Security Authority. Screened Traffic Data. 2026. Available online: https://www.catsa-acsta.gc.ca/en/screened-traffic-data (accessed on 23 May 2026).
Port of Seattle. Port Cyberattack Archive. 2024. Available online: https://www.portseattle.org/news/port-cyberattack-archive (accessed on 23 May 2026).
Port of Seattle. Employees Pitched in During the SEA Cyberattack. 2024. Available online: https://www.portseattle.org/blog/employees-pitched-during-sea-cyberattack (accessed on 23 May 2026).
EUROCONTROL. European Aviation Overview: 15–21 September 2025. Technical Report, EUROCONTROL. 2025. Available online: https://www.eurocontrol.int/sites/default/files/2025-09/european-aviation-overview-20250925.pdf (accessed on 23 May 2026).

Figure 1. Coupled DES, pedestrian, and cyber framework architecture.

Figure 2. Representative terminal layout used in the synthetic cyber-disruption case study.

Figure 3. Feasible passenger routes overlaid on the representative terminal layout.

Figure 4. Average system-exit throughput under baseline operation and four cyber-disruption scenarios. The check-in scenario slows check-in service, the security scenario slows security screening, the guidance scenario changes post-security passenger decisions, and the combined scenario applies all three disruptions together.

Figure 5. Average check-in and security queue lengths under baseline operation and four cyber-disruption scenarios.

Figure 6. Total completion time under baseline operation and four cyber-disruption scenarios.

Figure 7. Average and 95th-percentile check-in and security waiting times under baseline operation and four cyber-disruption scenarios.

Figure 8. System-exit throughput over time for (a) baseline operation, (b) check-in degradation, (c) security degradation, (d) guidance degradation, and (e) combined degradation.

Figure 9. Spatial congestion heatmaps for (a) check-in degradation, (b) security degradation, (c) guidance degradation, and (d) combined degradation.

Figure 10. Observed and simulated checkpoint throughput in the Rotterdam validation case.

Figure 11. Observed and simulated occupation/queue proxy for the Rotterdam checkpoint validation.

Figure 12. Observed and simulated waiting-time distribution for the Rotterdam checkpoint validation.

Figure 13. DES benchmark structure adapted from the published Palma de Mallorca Airport store-and-forward passenger-flow model.

Figure 14. Thirty-minute service utilization profiles for the mixed Palma benchmark under different check-in desk configurations.

Table 1. Main layout and simulation inputs for Case Study 1.

Element	Value	Unit
Passenger population	500	Passengers
Terminal hall size	$50 \times 30$	m
Entrances	2	Doors
Check-in counters	2	Service points
Information desks	2	Service points
Security screening lanes	2	Screening points
Check-in queue slots per counter	4	Queue positions
Information-desk queue slots per desk	2	Queue positions
Security queue slots per lane	4	Queue positions
Serpentine waiting area	$4 \times 16$	Waiting positions
Flight information displays	2	Panels
Display viewing capacity	10	Passengers/display
Shops/food units/restrooms/gates	$4 / 4 / 3 / 13$	Destinations
Desired walking speed	$1.25$	m/s
Passenger radius	$0.28$	m

Table 2. Behavioral and disruption parameters used in Case Study 1.

Parameter	Value	Unit
Attack onset time	$20{,}000$	s
Normal/degraded check-in service time	$300 / 700$	s
Information-desk service time	120	s
Security/degraded security service time	$300 / 420$	s
Information-desk visit probability	$0.35$	Probability
Display visit probability	$0.56$	Probability
Post-security decision dwell-time range	0–180	s
Shop dwell range	300–450	s
Food dwell range	1650–1750	s
Restroom dwell range	120–300	s
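
The service-time rows in Table 2 can be read as a step change in nominal stage capacity at the attack onset. The following is a minimal sketch under that reading, assuming deterministic service times and the two servers per stage listed in Table 1. The function names and the `degraded_stages` argument are hypothetical and are not taken from the authors' implementation.

```python
# Illustrative only: assumes a step change in service time at attack onset
# and deterministic service times. Names are hypothetical, not from the paper.
ATTACK_ONSET_S = 20_000  # attack onset time, Table 2

# (normal, degraded) service times in seconds, Table 2
SERVICE_TIME_S = {
    "check_in": (300.0, 700.0),
    "security": (300.0, 420.0),
}
SERVERS = {"check_in": 2, "security": 2}  # counters and lanes, Table 1


def service_time(stage: str, t: float, degraded_stages: frozenset) -> float:
    """Return the service time (s) for `stage` at simulation time `t` (s)."""
    normal, degraded = SERVICE_TIME_S[stage]
    if stage in degraded_stages and t >= ATTACK_ONSET_S:
        return degraded
    return normal


def nominal_capacity_per_hour(stage: str, t: float, degraded_stages: frozenset) -> float:
    """Passengers per hour the stage could serve if every server stayed busy."""
    return SERVERS[stage] * 3600.0 / service_time(stage, t, degraded_stages)


if __name__ == "__main__":
    combined = frozenset({"check_in", "security"})
    for stage in SERVICE_TIME_S:
        before = nominal_capacity_per_hour(stage, 0.0, combined)
        after = nominal_capacity_per_hour(stage, ATTACK_ONSET_S, combined)
        print(f"{stage}: {before:.1f} -> {after:.1f} passengers/h")
```

Under these assumptions, nominal check-in capacity falls from 24 to about 10.3 passengers/h and nominal security capacity from 24 to about 17.1 passengers/h once the disruption starts.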

Table 3. Simulation validation results against Rotterdam observed checkpoint data.

Indicator	5 min	15 min
Security-stage time MAE (s)	16.11	16.11
Security-stage time RMSE (s)	23.26	23.26
Occupation/queue-proxy MAE (passengers)	1.41	1.38
Occupation/queue-proxy RMSE (passengers)	1.88	1.85
Throughput MAE (passengers/h)	22.79	7.50
Throughput RMSE (passengers/h)	29.21	9.41
Timing calibration error (%)	0.55	0.55
Throughput calibration error (%)	0.02	0.00
Timing-distribution $R^{2}$	0.882	0.882
Occupation/queue-proxy $R^{2}$	0.385	0.239
Throughput $R^{2}$	0.596	0.933
Timing KS distance	0.132	0.132
Timing Wasserstein distance (s)	16.11	16.11
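
For readers who want to reproduce the indicator types in Table 3 on their own observed and simulated series, the following standard-library sketch shows one conventional definition of each. It is an illustration, not the authors' evaluation code. The equal-sample-size restriction in `wasserstein_1d` is a simplifying assumption of this sketch.

```python
# Illustrative, stdlib-only definitions of the indicator types in Table 3.
import math
from bisect import bisect_right


def mae(obs, sim):
    """Mean absolute error between paired observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root-mean-square error between paired observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Coefficient of determination, 1 - SS_res / SS_tot, about the observed mean."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(fa - fb))
    return gap


def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance for equal-size samples (sorted pairing)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)
```

Under the equal-size pairing used here, the Wasserstein distance is the mean absolute difference between sorted samples. That is one possible reason the timing MAE and Wasserstein rows in Table 3 share the value 16.11, although the table does not state how either was computed.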

Table 4. Published Palma parameters used in the literature-calibrated benchmark.

Parameter	Value	Use in Benchmark
International arrival profile	$\lambda = 60$ min, $k = 6.2$, $\Delta t = -150$ min	International demand case
Schengen/domestic arrival profile	$\lambda = 90$ min, $k = 6.2$, $\Delta t = -150$ min	Schengen/domestic and inter-island timing
Check-in/direct-security routing	$0.87 / 0.13$	Passenger path split
Check-in service time	60 s/passenger/server	Check-in processing rate
Security service time	15 s/passenger/server	Security processing rate
Check-in/security reference capacities	$3100 / 900$ passengers	Capacity plausibility check
Security desk rule	2–10 desks; 10 desks at 40 passengers/min	Dynamic resource opening
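
The security rows of Table 4 are mutually consistent: at 15 s per passenger, one desk serves 4 passengers/min, so 10 desks serve 40 passengers/min. The sketch below turns that arithmetic into a desk-opening rule. The ceiling-and-clamp logic and the function name are assumptions made for illustration. The published benchmark may open desks by a different rule between the 2-desk and 10-desk limits.

```python
import math

SECURITY_SERVICE_S = 15.0      # s/passenger/server, Table 4
MIN_DESKS, MAX_DESKS = 2, 10   # desk range, Table 4


def security_desks_needed(arrival_rate_per_min: float) -> int:
    """Smallest desk count whose nominal capacity covers the arrival rate,
    clamped to the 2-10 range. The ceiling rule is an assumption."""
    per_desk_rate = 60.0 / SECURITY_SERVICE_S  # 4 passengers/min per desk
    needed = math.ceil(arrival_rate_per_min / per_desk_rate)
    return max(MIN_DESKS, min(MAX_DESKS, needed))


if __name__ == "__main__":
    for rate in (0.0, 8.0, 17.0, 40.0, 100.0):
        print(f"{rate:5.1f} passengers/min -> {security_desks_needed(rate)} desks")
```

With this rule, 17 passengers/min would open 5 desks, and any rate at or above 40 passengers/min would open all 10.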

Table 5. Palma benchmark results for international and Schengen/domestic arrival profiles with two check-in desks.

Arrival Mode	Check-In Queue	Security Queue	Check-In Wait (min)	Security Wait (min)	Check-In Util.	Security Util.
International	157.74	2.18	81.90	0.01	0.955	0.225
Schengen/domestic	133.65	2.11	71.32	0.01	0.927	0.203

Table 6. Mixed-demand Palma benchmark sensitivity to open check-in desks.

Check-In Desks	Check-In Queue	Security Queue	Check-In Wait (min)	Check-In Util.	Clearance (min)
2	128.36	2.16	67.43	0.941	230.65
4	66.23	3.89	18.63	0.848	128.05
6	24.58	4.99	4.67	0.726	99.73
8	7.01	5.01	0.61	0.547	99.32

Table 7. LLM inference settings for passenger-level post-security decision generation.

Setting	Value
Model identifier	DeepSeek Pro
Access mode	DeepSeek API
Temperature	0
Top-p	1
Maximum output tokens	2048
Seed	Matched simulation seeds
Streaming	Disabled
Quantization	Not applicable
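
Because the LLM is asked for structured passenger-level decisions, a simulation loop needs a guard that rejects malformed replies before they reach the pedestrian layer. The following is a minimal sketch of such a guard. The JSON field names (`action`, `dwell_s`), the action labels, the settings keys, and the fallback choice are all hypothetical. The labels paraphrase the decision options listed in the abstract, and the dwell bounds come from Table 2. No API call is made here.

```python
import json
import math

# Settings mirrored from Table 7. The key names are hypothetical and are not
# claimed to match any specific client library.
INFERENCE_SETTINGS = {
    "temperature": 0,
    "top_p": 1,
    "max_output_tokens": 2048,
    "stream": False,
}

# Hypothetical labels paraphrasing the decision options in the abstract.
ALLOWED_ACTIONS = {
    "go_to_gate", "check_display", "ask_staff",
    "wait", "optional_activity", "wrong_intermediate_area",
}
DWELL_RANGE_S = (0.0, 180.0)  # post-security decision dwell range, Table 2
FALLBACK = {"action": "go_to_gate", "dwell_s": 0.0}  # assumed safe default


def parse_decision(raw_reply: str) -> dict:
    """Parse one passenger decision; return the fallback if it is malformed."""
    try:
        data = json.loads(raw_reply)
        action = data["action"]
        dwell = float(data["dwell_s"])
    except (ValueError, KeyError, TypeError):
        return dict(FALLBACK)
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return dict(FALLBACK)
    if not math.isfinite(dwell):
        return dict(FALLBACK)
    low, high = DWELL_RANGE_S
    return {"action": action, "dwell_s": min(max(dwell, low), high)}


if __name__ == "__main__":
    print(parse_decision('{"action": "ask_staff", "dwell_s": 30}'))
    print(parse_decision("not json at all"))
```

The fallback of going directly to the gate with zero dwell was chosen to mirror the no-guidance variant in Table 8. Other defaults are equally defensible.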

Table 8. Post-security decision ablation results.

Variant	Scenario	Gate	Display	Staff	Wrong Route	Optional	Dwell (s)	Post-Security (s)	Gate Delay (s)
LLM	Display	4.0	4.0	3.0	1.0	6.8	48.6	160.7	115.7
Rule-based	Display	1.4	4.2	4.2	6.0	4.2	57.3	210.2	165.2
Random	Display	3.2	2.0	1.0	2.2	9.6	77.8	204.1	159.1
No guidance	Display	20.0	0.0	0.0	0.0	0.0	0.0	45.0	0.0
LLM	Combined	2.6	6.8	2.8	1.2	5.6	54.9	168.1	123.1
Rule-based	Combined	1.2	4.2	5.4	5.0	4.2	54.1	201.4	156.4
Random	Combined	2.2	2.6	1.2	2.6	8.0	82.1	205.5	160.5
No guidance	Combined	20.0	0.0	0.0	0.0	0.0	0.0	45.0	0.0

Table 9. Selected paired t-test results for LLM ablation.

Scenario	Metric	Comparison	p-Value
Display degraded	Post-security time	LLM vs. rule-based	0.051
Display degraded	Post-security time	LLM vs. random	0.041
Display degraded	Post-security time	LLM vs. no-guidance	<0.001
Display degraded	Total dwell	LLM vs. random	0.035
Display degraded	Wrong-route count	LLM vs. rule-based	0.012
Combined degraded	Post-security time	LLM vs. rule-based	0.006
Combined degraded	Post-security time	LLM vs. random	0.015
Combined degraded	Post-security time	LLM vs. no-guidance	<0.001
Combined degraded	Total dwell	LLM vs. random	0.012
Combined degraded	Wrong-route count	LLM vs. rule-based	0.007
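
The comparisons in Table 9 are paired tests, so each replication of one variant is matched with the same replication of the other. The sketch below computes the paired t statistic with the standard library only. The per-replication values in the usage example are invented for illustration and are not results from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


if __name__ == "__main__":
    # Hypothetical post-security times (s) for five matched replications.
    llm = [150.0, 160.0, 170.0, 155.0, 165.0]
    rule_based = [200.0, 205.0, 215.0, 210.0, 220.0]
    t, dof = paired_t_statistic(llm, rule_based)
    print(f"t = {t:.2f}, degrees of freedom = {dof}")
```

Converting the statistic to a two-sided p-value requires the Student t distribution, which the standard library does not provide. Where SciPy is available, `scipy.stats.ttest_rel` returns both the statistic and the p-value.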

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Katale, T.S.; Gao, L.; Liu, Y.; Liu, D.; Chen, H. LLM-Guided Hybrid Simulation for Airport Cyber-Resilience Assessment. Mathematics 2026, 14, 1923. https://doi.org/10.3390/math14111923

