Article

Port Resilience Assessment for Misdeclaration Induced Disasters Using a Hybrid LLM-GNN Framework

1
China Institute of FTZ Supply Chain, Shanghai Maritime University, Shanghai 201306, China
2
China E-Port Data Center Huangpu Branch, Guangzhou 510700, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(12), 2280; https://doi.org/10.3390/jmse13122280
Submission received: 20 October 2025 / Revised: 23 November 2025 / Accepted: 28 November 2025 / Published: 29 November 2025

Abstract

Ports face critical security threats from hazardous cargo misdeclaration, which poses unique challenges due to its high concealment and catastrophic potential, as exemplified by the Beirut Port explosion. Traditional resilience assessment approaches relying on hazard state transition probabilities require abundant historical data or extensive domain expertise for probability elicitation, and static indicator-based assessment frameworks fail to capture the spatiotemporal evolution characteristics of disasters. To address these challenges, this study proposes a hybrid framework that leverages the generalizable world knowledge of a Large Language Model (LLM) for data augmentation while developing a Spatiotemporal Graph Neural Network (STGNN) to predict dynamic disaster propagation. Specifically, a multimodal LLM is employed to extract structured port state descriptions from temporally aligned disaster data and infer the states at undocumented time steps. Using additional disaster scenarios that the LLM adapts from the real cases, an STGNN is trained to learn the disaster evolution dynamics and to perform efficient real-time inference for resilience assessment and intervention strategy evaluation. Validation on Tianjin and Beirut Port incidents demonstrates that the framework accurately predicts disaster propagation pathways and identifies critical intervention priorities. It also reveals that topology-based intervention strategies substantially accelerate recovery, while adverse environmental conditions significantly amplify cumulative functional loss. This study represents an advancement toward AI-driven resilience modeling, offering port operators and regulators an adaptable, scalable decision support tool for intelligent safety governance.

1. Introduction

Ports, as key hubs in global supply chains, handle over 80% of international trade freight volume [1]. As complex systems where logistics, energy, and information are highly concentrated, ports are increasingly vulnerable to multidimensional disruptions from natural disasters [2], extreme climate events [3], operational failures, and human-induced risks [4]. Among these risks, the misdeclaration of hazardous goods during customs clearance has emerged as a particularly concealed and highly damaging safety hazard. By concealing the true category, quantity, or storage conditions of cargo, such fraudulent operations circumvent normal safety regulatory procedures [5], easily triggering uncontrollable chemical reactions, fires, and even catastrophic explosions [6]. The 2020 explosion at the Port of Beirut, resulting from the long-term illegal storage of undeclared ammonium nitrate, demonstrates the catastrophic consequences of such risks: the accident caused over 200 fatalities, more than 6000 injuries, and widespread destruction of infrastructure.
Despite the growing awareness of these risks, existing port resilience research remains inadequate in addressing the challenges posed by disaster scenarios initiated by misdeclarations. Much of the research focuses on external shocks such as extreme weather, congestion, or network-physical failures in automated systems, often overlooking the systemic vulnerabilities introduced by information asymmetry and human deception [7,8]. Unlike natural disasters, misdeclaration incidents are characterized by high concealment, long latency periods, and a complex coupling of information falsification and physical failure mechanisms. This presents a unique challenge: how to dynamically model and assess the resilience of port systems under conditions where the initial disturbance originates from data tampering rather than external force.
Current methodological approaches also face several key bottlenecks. First, traditional resilience assessment frameworks often rely on static indicators or macroscopic network metrics, making it difficult to capture the spatiotemporal evolution characteristics of cascading failure processes initiated by misdeclared hazardous goods [9]. Second, although dynamic scenario simulation methods such as Dynamic Bayesian Networks [10] have been applied to railway emergencies and hydrogen leakage risk analysis, they typically depend on predefined state transition rules and struggle to effectively model the nonlinear propagation processes inherent in physical accidents like fire spread and gas dispersion. Third, existing models rarely support what-if intervention analysis: they do not treat emergency response actions (e.g., equipment repair and resource dispatch) as controllable inputs to quantify their impact on system recovery [11]. Finally, the scarcity of documented misdeclaration incidents creates a fundamental data bottleneck: traditional data-driven methods require extensive historical records for training, yet such high-consequence events are inherently rare.
Addressing these challenges requires a methodological paradigm that can simultaneously: (1) overcome data scarcity by synthesizing diverse yet physically consistent disaster scenarios from sparse real-world observations, (2) capture fine-grained spatiotemporal dependency structures across heterogeneous port facilities without relying on predefined transition rules, and (3) enable controllable simulation to evaluate intervention effects. To meet these requirements, this paper proposes a hybrid LLM-STGNN framework that synergistically combines the strengths of both components: the LLM performs multimodal data synthesis and inference to generate sufficient training scenarios from limited historical incidents, while the STGNN learns data-driven propagation dynamics and models cascading failure evolution, enabling systematic resilience assessment under misdeclaration-induced disasters. The framework is validated on misdeclaration-induced disasters at Tianjin and Beirut Ports, addressing the data scarcity and modeling flexibility challenges inherent in such low-frequency, high-consequence events. Inspired by the concept of digital twins [12], the framework constructs a Dynamic Heterogeneous Spatiotemporal Graph (DHSG) that abstracts port physical facilities, operational units, and environmental elements as nodes, with their physical connections, transportation dependencies, and functional associations as edges. This structured representation enables the STGNN to capture the multi-dimensional coupling relationships among port entities, facilitating high-fidelity simulation of disaster evolution. Additionally, the framework supports “what-if” style pre-simulation of emergency strategies, allowing decision-makers to evaluate intervention effectiveness before implementation. 
The framework aligns with risk-based decision-making principles established in international maritime safety standards, particularly the IMO “Revised Guidelines for Formal Safety Assessment” (MSC-MEPC.2/Circ.12/Rev.2), which emphasize systematic hazard identification and consequence evaluation in maritime operations.
The main contributions of this paper are summarized as follows:
(1)
A hybrid LLM-STGNN framework that addresses the data scarcity challenge in dynamic disaster resilience assessment. By synthesizing training data from sparse real incidents through multimodal LLMs, the framework enables spatiotemporal propagation modeling in scenarios where traditional probability-based methods struggle to obtain sufficient historical records or expert knowledge. Validation on misdeclaration-induced disasters demonstrates its effectiveness in low-frequency, high-consequence events.
(2)
A systematic methodology for multimodal data extraction and DHSG construction, providing a structured representation for complex system dynamic modeling that explicitly characterizes multi-dimensional coupling relationships among port entities across physical, functional, and environmental dimensions.
(3)
An STGNN architecture that integrates spatiotemporal dependency learning with intervention simulation capabilities, enabling controllable evaluation of emergency strategies and supporting the quantitative assessment of different recovery policies.
(4)
Comprehensive sensitivity analyses based on real port accident scenarios, quantifying the impact of intervention strategies, environmental conditions, and management policies on system resilience, and providing actionable decision support for port operators and regulatory authorities.
The remainder of this paper is organized as follows: Section 2 reviews related research on resilience assessment and AI-driven scenario simulation; Section 3 details the methodology, including LLM-based multimodal data extraction and augmentation, DHSG construction, the STGNN model, and recovery strategy simulation; Section 4 presents experimental validation and sensitivity analysis; Section 5 concludes with limitations and future research directions.

2. Literature Review

In the context of increasing global supply chain volatility and rising climate change risks, the resilience of ports as critical infrastructure has garnered growing attention. Resilience is widely defined as the ability of a system to absorb shocks, adapt to changes, and rapidly recover functionality. Recent research has developed multidimensional assessment frameworks from various perspectives, which can be broadly categorized into network structure-based, indicator system-based, probability-based and simulation/AI-based approaches.

2.1. Network Structure-Based Resilience Analysis

Complex network theory provides a powerful tool for understanding the topological vulnerability of port systems. Xin et al. [13] applied symmetric non-negative matrix factorization to identify community structures in global container shipping networks, revealing that targeted attacks on hub ports can trigger partial network collapse due to “strong internal cohesion, weak external connectivity” patterns. Wang et al. [14] constructed a two-layer interdependent network model for automated ports, integrating graph convolutional networks with reinforcement learning to optimize recovery strategies and demonstrating significant “cross-domain cascading” risks where information failures propagate to physical infrastructure. Song et al. [15] coupled an improved SEIR epidemic model with shipping network dynamics, developing optimized load redistribution strategies that effectively mitigate systemic collapse under pandemic disruptions.
These network-based approaches focus on macroscopic inter-port topology or predefined cascading rules, lacking fine-grained spatiotemporal modeling of disaster propagation within individual port facilities. Moreover, they primarily address cyberattacks or epidemic scenarios rather than hazardous cargo misdeclaration, which involves complex physical-chemical processes requiring continuous spatiotemporal simulation capabilities.

2.2. Indicator System-Based Resilience Analysis

Indicator-based approaches quantify port resilience through composite metrics derived from multiple dimensions of system performance. Polydoropoulou et al. [16] developed a composite Port Resilience Index for climate-related disasters in Greek ports, integrating stakeholder opinions through a “Living Labs” approach and weighting 19 indicators across five dimensions (infrastructure, operations, digital, socioeconomic, governance) using the Analytic Hierarchy Process. Chang et al. [17] constructed a resilience assessment framework for congestion disruptions, employing support vector machines for anomaly detection and game-based combination weighting methods to identify congestion recovery speed and congestion duration percentage as critical indicators. Xing et al. [18] proposed a network resilience indicator for port logistics infrastructure and formulated the resilience optimization problem as stochastic mixed-integer linear programming, integrating preparedness and recovery strategies through a double decomposition algorithm combining Lagrangian Decomposition and branch-and-price methods.
These indicator-based frameworks provide systematic assessment tools for specific disruption types but face inherent limitations for misdeclaration-induced disaster scenarios. Static or quasi-static indicators fail to capture the rapid, nonlinear evolution of physical disasters, and the reliance on predefined indicator weights struggles to adapt to novel threat patterns where importance hierarchies shift dynamically during disaster progression.

2.3. Probability-Based Resilience Analysis

Probability-based models, especially Bayesian Networks (BN), prevail in risk and resilience assessment research due to their powerful capabilities in uncertainty reasoning and causal relationship modeling. Gonzalez-Solano et al. [19] proposed a hybrid method integrating Decision Making Trial and Evaluation Laboratory (DEMATEL) and Interpretive Structural Modeling (ISM) to construct a Bayesian network for assessing port resilience strategies. The study identified 19 resilience strategies and found that adaptability and recovery capabilities have a significantly greater impact on overall resilience than absorption capacity.
Dynamic Bayesian Networks (DBN) can effectively capture the temporal evolution of state variables by introducing a time dimension. Liu et al. [20] addressed railway emergencies by constructing a scenario knowledge element model based on knowledge element theory and subsequently establishing a DBN for multi-scenario simulation. To reduce the subjectivity of expert assessments, the study proposed an evidence conflict calculation method based on the Tanimoto measure, enhancing the objectivity of the model. Zhang et al. [21] applied DBN to hydrogen leakage risk analysis in hydrogen energy systems, constructing an evolution framework comprising 21 scenario states, 17 mitigation measures, and 11 environmental determinants. By integrating fuzzy set theory, the model achieved precise quantification of state probabilities and time prediction.
To handle the fuzziness and uncertainty of information in hazardous chemical accidents, Lu et al. [22] further proposed a Fuzzy Dynamic Bayesian Network (FDBN), incorporating triangular fuzzy numbers into the conditional probability tables of DBNs. This model can effectively simulate accident evolution trajectories under different emergency response scenarios and, through sensitivity analysis, identify environmental conditions and firefighting capabilities as key influencing factors, providing strong support for emergency decision-making.
While probability-based methods excel at uncertainty quantification and causal reasoning, they face inherent limitations for misdeclaration disaster scenarios. The discretization of continuous variables into predefined state spaces inevitably causes information loss from raw observational data and constrains inference granularity, making it difficult to capture fine-grained spatiotemporal dynamics such as fire spread rates and explosion shockwave propagation. Moreover, accurate elicitation of conditional probability tables requires substantial domain expertise on causal mechanisms and transition likelihoods—knowledge that remains scarce for novel threat scenarios involving complex interactions between concealed cargo properties, environmental conditions, and infrastructure vulnerabilities.

2.4. Simulation/AI-Based Resilience Analysis

Simulation and AI-driven methods have recently advanced resilience research by enabling dynamic, fine-grained analysis of disruption propagation and recovery [23,24,25,26]. Li et al. [27] developed a hypergraph-enhanced agent-based simulation framework for 11 ports in the Guangdong-Hong Kong-Macao Greater Bay Area under typhoon disruptions, capturing heterogeneous agent behaviors and emergent cascading dynamics through bounded rationality decision-making and capacity degradation mechanisms. Srisurin et al. [28] employed discrete event simulation to evaluate operational resilience of Thailand’s inland container terminal, incorporating four interconnected modules to identify bottlenecks and demonstrate that redesigned layouts could absorb 120–140% demand surges despite reduced equipment, providing precise capacity thresholds for infrastructure investment decisions.
Gu et al. [29] pioneered a data-driven approach using port congestion indices as resilience proxies, applying hybrid statistical-ML methods (Isolation Forest, SVM, LSTM) to analyze 9 global ports during 2016–2023 and revealing significant resilience heterogeneity with accurate recovery trajectory prediction from real-time AIS data. Cuong et al. [30] developed a discrete wavelet transform-enhanced LSTM model for Busan Port throughput analysis, employing chaos theory diagnostics to confirm nonlinear dynamics and demonstrating superior multi-scale feature extraction capabilities across different crisis types. Khan et al. [31] employed automated machine learning (AutoML) to systematically evaluate 24 algorithms for predicting port disruptions from natural hazards, analyzing 1214 global incidents and achieving 96.44% accuracy while identifying disaster severity and recovery time as top predictive factors.
These simulation or AI-based approaches represent methodological advances, offering capabilities for learning complex patterns from data, capturing nonlinear dynamics, and conducting scenario-based analysis without requiring explicit rule specification. However, the effectiveness of data-driven methods remains fundamentally constrained by the availability of training samples—a critical bottleneck for rare, high-consequence events like misdeclaration-induced disasters.

2.5. Summary and Positioning of This Study

A comprehensive review of existing literature reveals four critical research gaps for misdeclaration-induced port disasters. First, current studies predominantly focus on external disruptions (extreme weather, congestion, cyberattacks) or operational inefficiencies, with minimal attention to information falsification-induced disasters characterized by deliberate concealment and catastrophic propagation potential. Second, existing approaches lack fine-grained spatiotemporal modeling capabilities—network methods operate at macroscopic levels with predefined cascading rules, indicator frameworks rely on static metrics unsuitable for dynamic disaster evolution, and probability models discretize continuous state spaces causing information loss and requiring extensive domain expertise for probability elicitation. Third, while AI-driven methods offer superior pattern learning without explicit rule specification, their effectiveness remains fundamentally constrained by training sample availability—particularly critical for rare, high-consequence events like misdeclaration-induced disasters. Fourth, existing models rarely support controllable intervention simulation, limiting their utility for evaluating alternative emergency response strategies before implementation.
Recent advances in Large Language Models show promise for addressing data scarcity in disaster scenarios. Lei et al. [32] systematically surveyed LLM applications across disaster management phases, demonstrating their capabilities in processing multimodal data (imagery, text, sensor readings) and generating scenario variations through cross-modal semantic reasoning. However, existing LLM applications primarily focus on post-disaster damage assessment and response planning rather than pre-disaster spatiotemporal propagation modeling.
To bridge these gaps, this study proposes a hybrid LLM-STGNN framework that leverages a multimodal LLM for extracting structured disaster states from sparse real-world incidents and synthesizing diverse scenario variations, while employing an STGNN to learn fine-grained spatiotemporal propagation dynamics. Port infrastructure, response units, and environmental factors are abstracted as nodes in the DHSG, with transportation, functional, and temporal dependencies as typed edges. The STGNN architecture integrates heterogeneous graph convolution for spatial dependencies with a parallel GRU-TCN temporal module that captures both long-term trends and sudden anomalies. Emergency interventions are encoded as external control signals, enabling the model to autonomously predict recovery trajectories under different strategies.
In summary, our hybrid LLM-STGNN framework systematically addresses the four critical limitations identified in current research: (1) it focuses specifically on misdeclaration-induced disasters with their unique concealment and cascading characteristics, (2) it captures fine-grained spatiotemporal dynamics at facility granularity through heterogeneous graph neural networks, (3) it overcomes data scarcity through LLM-based multimodal synthesis of physically consistent scenarios, and (4) it enables controllable intervention simulation for quantitative strategy evaluation. This methodological innovation advances resilience assessment from static indicator-based or rule-dependent approaches toward data-driven, simulation-enabled decision support for port safety governance.

3. Methods

This study proposes a hybrid LLM-STGNN framework that addresses the critical challenge of port resilience assessment under misdeclaration-induced disasters through systematic integration of data augmentation, spatiotemporal modeling, and intervention simulation. The framework tackles the fundamental data scarcity problem inherent in low-frequency, high-consequence events by leveraging multimodal large language models to synthesize realistic disaster scenarios from sparse historical records, while employing spatiotemporal graph neural networks to learn fine-grained propagation dynamics and support controllable emergency strategy evaluation.
The overall methodology consists of four interconnected components, as illustrated in Figure 1:
(1)
LLM-Driven Multimodal Data Synthesis: Multimodal data from historical accident incidents, including surveillance videos, satellite imagery and textual investigation reports, are processed through the Qwen2.5-VL model to extract structured spatiotemporal information. The LLM performs cross-modal alignment, visual semantic segmentation, and state quantification to generate JSON-formatted scene snapshots at each time step. Beyond extraction, the LLM leverages its generalizable world knowledge to augment the limited real-world cases (Ningbo, Tianjin, Beirut ports) into 97 diversified simulation scenarios, ensuring sufficient training data while maintaining physical consistency and disaster evolution fidelity.
(2)
DHSG Construction: The extracted and augmented data are transformed into a unified graph representation where port facilities, emergency response units, and environmental factors are abstracted as heterogeneous nodes, while their physical connections, functional dependencies, and temporal associations form typed edges. The DHSG employs a dynamic update mechanism that captures the complete disaster lifecycle from initial disturbance and risk propagation to emergency intervention by adjusting edge existence and weights over 24 hourly time steps, thereby providing a high-fidelity structured input for subsequent spatiotemporal learning.
(3)
STGNN for Disaster Evolution Modeling: A specialized STGNN architecture integrates heterogeneous graph convolution for spatial dependency extraction with a parallel GRU-TCN temporal module that simultaneously captures long-term trends and abrupt anomalies. The model incorporates a risk attention mechanism that dynamically weights propagation pathways based on dependency intensity, enabling accurate prediction of node-level functional degradation. Critically, the framework treats emergency interventions as controllable external signals by encoding repair actions and recovery rates as additional node features, allowing the trained model to autonomously simulate system evolution under different response strategies.
(4)
Resilience Quantification and Intervention Evaluation: Node-level predictions are aggregated into system functionality indices using topology-based weighting, from which three core resilience metrics are computed: Peak Resilience Loss, Recovery Time, and Resilience Area. The framework supports “what-if” simulation with four intervention strategies (no intervention, random allocation, centrality-priority, and path-blocking), enabling comparative evaluation and optimization of emergency resource deployment before actual implementation.
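The aggregation and metric computation in component (4) can be illustrated with a minimal sketch. This section does not define the three metrics formally, so the code assumes common resilience-curve conventions: Peak Resilience Loss as one minus the minimum system functionality, Recovery Time as the time from the trough until functionality first returns to a threshold (0.95 here, a hypothetical value), and Resilience Area as the trapezoidal area between full functionality and the curve. The function names and the weight normalization are likewise illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def system_functionality(node_fi, weights):
    """Aggregate node functional indices (T x N) into a system index (T,).

    `weights` stands in for topology-based node weights (e.g. centrality);
    they are normalized to sum to 1 (an assumption of this sketch).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.asarray(node_fi, dtype=float) @ w


def resilience_metrics(f, dt=1.0, recovered_at=0.95):
    """Return (peak_loss, recovery_time, resilience_area) for a curve f(t).

    Assumed definitions, not taken from the paper:
      peak_loss       = 1 - min f
      recovery_time   = time from the trough until f >= recovered_at
                        (None if the threshold is never reached)
      resilience_area = trapezoidal integral of (1 - f) over time
    """
    f = np.asarray(f, dtype=float)
    t_min = int(np.argmin(f))
    peak_loss = 1.0 - f[t_min]

    # Indices (relative to the trough) where functionality has recovered.
    recovered = np.nonzero(f[t_min:] >= recovered_at)[0]
    recovery_time = None if recovered.size == 0 else float(recovered[0]) * dt

    loss = 1.0 - f
    resilience_area = float(np.sum((loss[1:] + loss[:-1]) / 2.0) * dt)
    return peak_loss, recovery_time, resilience_area
```

Under these assumptions, two intervention strategies can be compared by running the same disaster scenario through each and ranking the resulting triples; a smaller area and a shorter recovery time indicate a more effective strategy.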

3.1. Data Extraction and Preprocessing

To address the critical data scarcity challenge in modeling low-frequency, high-consequence misdeclaration disasters, this study develops a hybrid data synthesis strategy that combines extraction from real incidents with LLM-guided augmentation. The Qwen2.5-VL multimodal large language model serves dual functions: parsing structured spatiotemporal information from heterogeneous accident records and generating physically consistent scenario variations by leveraging its generalizable world knowledge. This approach transforms three documented incidents into 100 diverse training scenarios spanning 2400 temporal snapshots.
The remainder of this subsection details the data sources, cleaning procedures, vectorization methods, and model training settings used to ensure data quality and consistency in feature representation.
(1)
Data Sources. The study integrates four complementary categories of disaster-related information: historical accident investigation reports, obtained through online searches, which provide causal mechanisms, event timelines, system failures, and regulatory findings; surveillance videos, supplied by port authorities and self-media accounts, which cover critical facilities including container yards, quay crane zones, and road networks and enable reconstruction of fire spread trajectories, equipment damage progression, and emergency response deployment patterns; satellite imagery of the affected port regions, which is used to observe spatial layouts, explosion impact zones, plume dispersion, and structural deformation across multiple time points; and public news releases issued by maritime safety agencies, which offer supplemental information on event triggers, emergency level escalation, casualty updates, and official situational assessments.
(2)
Data Cleaning Procedures. All multimodal inputs underwent a unified preprocessing pipeline to ensure consistency and analytical validity through four key steps: text deduplication by manual reading; temporal normalization, which converted all timestamps from reports, videos, and imagery into a standardized accident-centered timeline with t = 0 marking the initial event and used interpolation to align video frame timestamps with satellite image acquisition times; ensuring complete environmental parameter records, including wind speed and temperature; and ensuring explicit causality descriptions that link sequential events.
(3)
Vectorization. To transform heterogeneous multimodal data into unified machine-readable representations, the Qwen2.5-VL-72B [33] model processes all inputs through its integrated multimodal architecture: visual inputs including video frames and satellite imagery are analyzed to extract semantic descriptions of spatial layouts, damage patterns, and temporal dynamics; textual inputs including investigation reports, news narratives, and visual observations are processed to identify causal relationships and operational context; numerical attributes including spatial coordinates, meteorological variables, and equipment categories are extracted and normalized into continuous feature values. Through carefully designed prompts, the model outputs structured JSON snapshots where each node is described by its functional state, operational status, and spatial attributes, while edges are characterized by their relationship types and weights. These JSON-formatted outputs are then directly parsed into numerical arrays to form the feature matrix and adjacency tensor of the DHSG, with each node represented by a feature vector containing its functional index, status indicators, and static attributes.
(4)
No Additional Training on the LLM. It is emphasized that no fine-tuning or additional training is performed on the large language model itself, as all multimodal extraction, state inference, and scenario synthesis are achieved solely through structured prompts and constraint-based reasoning that leverage the pretrained capabilities of Qwen2.5-VL-72B. This zero-shot approach preserves the generalizable world knowledge of the pretrained model and enhances framework transferability across different ports and disaster types without requiring domain-specific model retraining.
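The vectorization step in item (3), where JSON snapshots are parsed into a feature matrix and an adjacency tensor, can be sketched as follows. The JSON schema shown here (keys such as `function_index`, `operational`, `x`, `y`, `source`, `target`) is hypothetical, since the paper's exact schema is not given in this section; the sketch also assumes that only within-snapshot edge types are stored per time step, with temporal edges handled separately.

```python
import json

import numpy as np

# Assumed within-snapshot edge types; one adjacency slice per type.
EDGE_TYPES = ("transportation", "dependency")


def snapshot_to_arrays(snapshot_json):
    """Parse one JSON snapshot into (feature_matrix, adjacency_tensor).

    feature_matrix: (N, 4) array of [function_index, operational, x, y]
    adjacency_tensor: (len(EDGE_TYPES), N, N) array of edge weights
    All field names are illustrative assumptions.
    """
    snap = json.loads(snapshot_json)
    nodes = snap["nodes"]
    index = {node["id"]: k for k, node in enumerate(nodes)}

    features = np.array(
        [
            [n["function_index"], float(n["operational"]), n["x"], n["y"]]
            for n in nodes
        ],
        dtype=float,
    )

    adjacency = np.zeros((len(EDGE_TYPES), len(nodes), len(nodes)))
    for edge in snap["edges"]:
        layer = EDGE_TYPES.index(edge["type"])
        adjacency[layer, index[edge["source"]], index[edge["target"]]] = edge["weight"]
    return features, adjacency
```

Stacking the per-snapshot outputs along a leading time axis would give tensors covering the 24 hourly steps of one scenario.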

3.1.1. Real Incident Data Acquisition

Three representative misdeclaration incidents with comprehensive multimodal documentation serve as foundation cases:
(1)
Ningbo Port (2023): Class 5.2 organic peroxides declared as general cargo, triggering spontaneous combustion under high temperature.
(2)
Tianjin Port (2015): Concealed quantities of nitrocellulose and other hazardous materials causing chain explosions.
(3)
Beirut Port (2020): Long-term undeclared ammonium nitrate storage culminating in catastrophic explosion.
Each incident comprises surveillance videos, satellite imagery, investigation reports, and post-disaster assessments. These materials capture the complete disaster lifecycle across 24 hourly time steps from initial disturbance through emergency response to recovery initiation.

3.1.2. LLM-Based Multimodal State Extraction

For each incident, video frames, images, and textual reports are jointly analyzed using the Qwen2.5-VL-72B model to automatically identify and extract the state evolution of key entities. The specific workflow is as follows:
(1)
Cross-modal Semantic Alignment: Video frames, satellite imagery and textual descriptions are semantically aligned to establish spatiotemporal consistency.
(2)
Visual Semantic Segmentation: The LLM performs pixel-level segmentation of critical infrastructure including containers, vessels, quay cranes, roads, and water areas. Damaged regions are identified through visual anomaly detection.
(3)
Damage State Quantification: The damage ratio of each facility node is calculated as:
$$\mathrm{DamageRatio}_i(t) = \frac{\mathrm{DamageArea}_i(t)}{\mathrm{TotalArea}_i}$$
where DamageArea_i(t) denotes the damaged area identified by Qwen2.5-VL and adjusted by map scale; TotalArea_i represents the standard geometric area of the facility obtained from the port’s GIS system. The functional index follows:
$$\mathrm{FunctionIndex}_i(t) = 1 - \mathrm{DamageRatio}_i(t)$$
which ranges from 0 for complete failure to 1 for full operation.
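As a minimal numerical sketch of the damage quantification step, the two quantities can be computed as below. The clipping of the ratio to [0, 1] is an assumption added here to guard against segmentation noise that overshoots the GIS reference area; the function names are illustrative.

```python
def damage_ratio(damage_area, total_area):
    """Damaged fraction of a facility, clipped to [0, 1] (clipping is assumed)."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    return min(max(damage_area / total_area, 0.0), 1.0)


def function_index(damage_area, total_area):
    """Functional index: 1 for full operation, 0 for complete failure."""
    return 1.0 - damage_ratio(damage_area, total_area)
```

For example, a facility of 1000 m² with 250 m² identified as damaged has a damage ratio of 0.25 and a functional index of 0.75.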
The Qwen2.5-VL-72B model is employed to execute the above extraction workflow. This selection was motivated by three critical capabilities: (1) Superior performance on vision-language understanding tasks requiring fine-grained spatial reasoning and temporal consistency maintenance—essential for tracking disaster evolution across sequential time steps; (2) Robust zero-shot and few-shot learning capabilities enabling generalization from limited real incident data to synthesized scenarios without task-specific fine-tuning; (3) Support for processing long-context inputs spanning multiple images and extensive textual reports, exceeding the context limits of competing models. The Qwen2.5-VL-72B model architecture comprises three core components: a Vision Transformer (ViT) based encoder processing visual inputs through hierarchical feature extraction, a 72-billion parameter Transformer decoder handling textual information and reasoning, and cross-modal attention mechanisms enabling bidirectional information flow between visual and linguistic representations.
In our implementation, surveillance video frames and satellite imagery are preprocessed and fed into the vision encoder, generating dense visual feature representations. These visual embeddings are concatenated with tokenized textual investigation reports, and cross-modal attention layers perform joint reasoning to correlate visual damage patterns (e.g., fire spread) with textual descriptions of casualties and infrastructure failures. The transformation of heterogeneous disaster data into standardized DHSG node attributes proceeds through systematic prompt engineering. Structured prompts explicitly define the output schema, provide exemplars from real incidents demonstrating the desired extraction format, and incorporate domain constraints ensuring physical consistency (e.g., damage ratios ∈ [0, 1], functional indices inversely correlate with damage). The prompts guide the model to: identify entity types (infrastructure, response units, environmental factors), extract quantitative measurements (damage areas from pixel-level segmentation, sensor readings from textual reports), infer categorical labels (operational status, alert levels), and establish relational dependencies (transportation edges from observed workflows, causality links from incident narratives). This structured extraction converts unstructured multimodal observations into machine-readable graph representations suitable for downstream STGNN modeling.
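As a rough illustration of the domain constraints mentioned above, the following sketch checks one extracted node record for physical consistency. The field names (`node_type`, `damage_ratio`, `function_index`) and the example record are hypothetical; the actual output schema used with Qwen2.5-VL is not given in the text.

```python
import json

# Entity categories named in the text; the string labels themselves are assumptions.
ALLOWED_TYPES = {"infrastructure", "response", "environment"}


def validate_node(record: dict) -> list:
    """Return a list of constraint violations for one extracted node record."""
    errors = []
    if record.get("node_type") not in ALLOWED_TYPES:
        errors.append("unknown node_type")
    dr = record.get("damage_ratio")
    fi = record.get("function_index")
    if not isinstance(dr, (int, float)) or not 0.0 <= dr <= 1.0:
        errors.append("damage_ratio outside [0, 1]")
    elif not isinstance(fi, (int, float)) or abs(fi - (1.0 - dr)) > 1e-6:
        errors.append("function_index != 1 - damage_ratio")
    return errors


# Hypothetical LLM output for a single node.
raw = ('{"node_id": "crane_03", "node_type": "infrastructure", '
       '"damage_ratio": 0.35, "function_index": 0.65}')
issues = validate_node(json.loads(raw))
```

Records that fail such a check could be re-prompted or discarded before graph construction; which of the two the authors do is not stated.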

3.1.3. Attribute and Relationship Extraction

To construct the DHSG in the subsequent steps, node attributes and edge relationships are further extracted from the textualized multimodal data, as shown in Table 1 and described below.
Regulatory intensity and emergency response level are extracted from relevant textual reports. The misdeclaration deviation rate is computed as the relative difference between declared and actual detected values.
Edge relationships are inferred by combining visual and textual analysis. Transportation edges are identified when the LLM detects operational workflows in visual or textual data. Dependency edges are established based on physical causality extracted from incident descriptions, with weights assigned according to impact intensity descriptions such as “severely exacerbated”, “moderately affected” and “slightly influenced”. Temporal edges connect identical nodes across consecutive time steps, i.e., vi(t) → vi(t + 1), and the edge weight wtemporal = |FIi(t + 1) − FIi(t)| + ε is automatically assigned to reflect the magnitude of state change, where ε = 0.01 is a smoothing term that prevents zero weights.
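The temporal edge weight can be computed directly from a node's functional index series. The sketch below assumes the series is stored as a plain list ordered by time step; the function name is illustrative.

```python
EPSILON = 0.01  # smoothing term from the text


def temporal_edge_weights(fi_series):
    """Weight of each edge between consecutive steps: |FI(t+1) - FI(t)| + epsilon."""
    return [abs(nxt - cur) + EPSILON for cur, nxt in zip(fi_series, fi_series[1:])]


# Hypothetical node that is intact, then degrades sharply, then recovers slightly.
weights = temporal_edge_weights([1.0, 1.0, 0.6, 0.7])
```

A node whose state does not change still receives the minimum weight ε, so its temporal edge never vanishes from the graph.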

3.1.4. LLM-Guided Scenario Augmentation

To generate sufficient training diversity from three real cases, a prompt engineering framework guides the LLM to synthesize 97 physically consistent variations while preserving disaster evolution mechanisms. Under the adopted few-shot prompting strategy, the LLM receives structured prompts containing:
(1)
Exemplar scenes from real incidents formatted as JSON snapshots with node states, edge relationships, and temporal sequences.
(2)
Physical constraints including causality rules (e.g., “fire propagates through transportation edges to adjacent flammable nodes”), environmental effects (e.g., “wind speed > 8 m/s accelerates lateral fire spread by factor 1.5”), and operational dependencies (e.g., “road blockage prevents rescue vehicle access”).
(3)
Variation parameters specifying controllable dimensions: misdeclaration type (category/weight/temperature-control), deviation rate (10–50%), environmental conditions (temperature 20–35 °C, wind speed 3–12 m/s), and regulatory intensity (0.3–0.7).
The set of 97 augmented scenarios was systematically designed to ensure comprehensive coverage across multiple critical dimensions. This encompasses all 15 combinations derived from three primary misdeclaration types and five distinct deviation levels. Furthermore, the scenarios incorporate four representative environmental profiles—normal conditions, high-wind, high-temperature, and a combined extreme environment. They are also contextualized within three different port configurations, namely the Ningbo and Tianjin terminals, as well as a generic mid-sized terminal. To assess time-critical factors, variable intervention timings, including immediate response and delays of 2, 4, and 6 h, are integrated into the scenario matrix.
The physical consistency of these 97 augmented scenarios was verified through a systematic review process. Port safety researchers with expertise in emergency response protocols examined the scenarios to ensure basic physical plausibility, including: (1) Consistency of resource quantities with typical port configurations, (2) Plausibility of intervention speeds relative to real-world operational constraints, and (3) Alignment of disaster propagation patterns with fundamental physics principles. Scenarios exhibiting parameter inconsistencies were excluded before obtaining the final dataset. This verification process ensures that the LLM-synthesized scenarios maintain basic physical fidelity suitable for training the STGNN model, while acknowledging that detailed validation against comprehensive historical incident data remains an area for future enhancement as more documented cases become available.
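A minimal sketch of how the three prompt components could be assembled, and how variation parameters could be checked against the stated ranges before prompting, is given below. The prompt wording, dictionary keys, and function names are hypothetical; only the parameter ranges and the three misdeclaration types come from the text.

```python
import json

# Ranges stated in the text; the key names are assumptions.
RANGES = {
    "deviation_rate": (0.10, 0.50),
    "temperature_c": (20.0, 35.0),
    "wind_speed_ms": (3.0, 12.0),
    "regulatory_intensity": (0.3, 0.7),
}
MISDECLARATION_TYPES = {"category", "weight", "temperature-control"}


def check_variation(variation: dict) -> list:
    """Return the names of parameters that are missing or outside the stated ranges."""
    problems = []
    if variation.get("misdeclaration_type") not in MISDECLARATION_TYPES:
        problems.append("misdeclaration_type")
    for key, (lo, hi) in RANGES.items():
        value = variation.get(key)
        if not isinstance(value, (int, float)) or not lo <= value <= hi:
            problems.append(key)
    return problems


def build_prompt(exemplar: dict, constraints: list, variation: dict) -> str:
    """Concatenate exemplar scene, physical constraints and variation parameters."""
    return "\n\n".join([
        "EXEMPLAR SCENE (JSON):\n" + json.dumps(exemplar, indent=2),
        "PHYSICAL CONSTRAINTS:\n" + "\n".join(f"- {c}" for c in constraints),
        "VARIATION PARAMETERS:\n" + json.dumps(variation, indent=2),
        "Return one new scenario in the same JSON format.",
    ])


variation = {"misdeclaration_type": "weight", "deviation_rate": 0.3,
             "temperature_c": 35.0, "wind_speed_ms": 8.0, "regulatory_intensity": 0.5}
prompt = build_prompt({"nodes": [], "edges": []},
                      ["road blockage prevents rescue vehicle access"], variation)
```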

3.2. Dynamic Heterogeneous Spatiotemporal Graph Construction

To capture the multi-dimensional coupling mechanisms underlying misdeclaration-induced disasters, this study constructs a Dynamic Heterogeneous Spatiotemporal Graph (DHSG) that unifies physical connectivity, functional dependency, and temporal evolution within a single mathematical structure. Unlike conventional static network models that treat all nodes and edges homogeneously, the DHSG explicitly distinguishes entity types and relationship semantics, enabling fine-grained modeling of cascading failures across infrastructure, response systems, and environmental conditions. As illustrated in Figure 2, this representation transforms complex disaster propagation processes into learnable graph sequences for STGNN prediction.

3.2.1. Heterogeneous Node Abstraction

The DHSG abstracts port entities into three functionally distinct node types, each contributing unique roles in disaster dynamics. Infrastructure nodes including containers, quay cranes, vessels, roads, and water areas serve as primary impact carriers whose degradation directly reduces system functionality. Response nodes comprising rescue vehicles, emergency personnel, and fire stations act as active intervention agents whose deployment alters disaster trajectories. Environmental nodes representing wind speed, wind direction, and temperature function as external modulators amplifying or suppressing propagation intensity.
This tripartite typology enables differential treatment in subsequent graph neural network layers, where node-type-specific parameters capture distinct degradation mechanisms, recovery patterns, and environmental sensitivities. Each node i maintains a temporal state vector:
$$h_i(t) = \left[\mathrm{FunctionIndex}_i(t),\ \mathrm{Status}_i(t),\ x_i\right]$$
where Statusi(t) encodes discrete operational modes, and xi represents time-invariant attributes such as location and capacity.
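One possible in-memory representation of the node state vector is sketched below. The class names, the integer status encoding, and the example attribute values are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    INFRASTRUCTURE = "infrastructure"
    RESPONSE = "response"
    ENVIRONMENT = "environment"


@dataclass
class NodeState:
    node_id: str
    node_type: NodeType
    function_index: float   # FunctionIndex_i(t) in [0, 1]
    status: int             # discrete operational mode (encoding is hypothetical)
    static_attrs: tuple     # time-invariant x_i, e.g. location and capacity

    def as_vector(self) -> list:
        """Flatten to [FunctionIndex_i(t), Status_i(t), x_i...]."""
        return [self.function_index, float(self.status), *self.static_attrs]


# Hypothetical quay crane with (x, y) position and a capacity value.
crane = NodeState("crane_03", NodeType.INFRASTRUCTURE, 0.65, 1, (120.0, 45.0, 40.0))
vector = crane.as_vector()
```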

3.2.2. Typed Edge Semantics and Dynamic Weighting

The DHSG distinguishes three edge types with fundamentally different physical meanings and update mechanisms. Transportation edges represent direct material or energy flows following actual operational workflows. These edges activate and deactivate dynamically as cargo operations proceed, with uniform weights wtrans = 1.0 emphasizing binary connectivity over flow magnitude. For instance, a crane-container edge exists only during active lifting operations, forming temporal propagation pathways that evolve with logistics activities.
Dependency edges encode indirect influence relationships derived from physical laws and functional couplings. Unlike transportation edges, dependency topology remains stable while edge weights wdep ∈ {0.3, 0.6, 1.0} adapt to environmental conditions. The discrete weight values {0.3, 0.6, 1.0} represent low, moderate, and high environmental influence levels, respectively. This discretization approach is adopted because the training dataset, despite augmentation from 3 to 100 scenarios, remains insufficient for the STGNN to learn the continuous nonlinear relationship between environmental parameters (wind speed, temperature) and disaster propagation intensity. By explicitly encoding environmental influence through three discrete levels, the model can more effectively learn propagation patterns during training. During simulation, continuous environmental parameters are mapped to these discrete weights based on predefined thresholds, enabling the model to capture context-dependent propagation intensities without requiring extensive training samples to learn complex environmental dependencies.
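The threshold mapping from continuous environmental parameters to the three discrete weights might look like the sketch below. The text does not report the thresholds; the values used here (wind speed 8 and 12 m/s, temperature 35 °C) are assumptions borrowed from the levels that appear in the case study.

```python
def dependency_weight(wind_speed_ms: float, temperature_c: float,
                      wind_thresholds=(8.0, 12.0), temp_threshold=35.0) -> float:
    """Map continuous conditions to w_dep in {0.3, 0.6, 1.0} (thresholds are assumed)."""
    if wind_speed_ms >= wind_thresholds[1] or temperature_c >= temp_threshold:
        return 1.0   # high environmental influence
    if wind_speed_ms >= wind_thresholds[0]:
        return 0.6   # moderate influence
    return 0.3       # low influence


calm = dependency_weight(3.0, 25.0)
windy = dependency_weight(8.0, 25.0)
gale = dependency_weight(12.0, 25.0)
```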
Temporal edges connect identical nodes across consecutive time steps with weights wtemporal as described in Section 3.1.3. This embeds state transition magnitude directly into graph topology, allowing graph convolution operations to naturally focus on periods of rapid degradation or recovery.

3.2.3. Dynamic Evolution and Multi-Scale Coupling

The graph G(t) = (V, E(t)) evolves through coordinated updates across spatial and temporal dimensions. While node set V remains fixed over 24 hourly time steps, edge set E(t) adapts through three concurrent mechanisms: transportation edges update according to operational workflows, dependency weights adjust proportionally to environmental variable changes, and temporal edges automatically extend the graph forward in time. This multi-scale evolution captures both discrete events such as sudden equipment failures and continuous processes such as gradual fire spread.
The resulting graph sequence explicitly represents the complete disaster lifecycle from initial misdeclaration-triggered perturbation through cascading propagation to intervention-modulated recovery. By encoding spatial topology, temporal dynamics, and heterogeneous semantics within a unified structure, the DHSG transforms the complex physical system into a learnable representation amenable to deep learning architectures. This abstraction addresses the limitation of traditional resilience models which discretize continuous states into coarse categories or rely on predefined transition rules, instead enabling data-driven discovery of propagation patterns directly from multimodal observations.

3.3. STGNN for Disaster Evolution Modeling

The STGNN serves as the core predictive engine that learns disaster evolution dynamics from DHSG sequences and enables controllable intervention simulation. Unlike conventional graph neural networks that process static snapshots or homogeneous temporal graphs, this architecture integrates heterogeneous graph convolution with parallel temporal modeling to simultaneously capture spatial propagation pathways and temporal evolution patterns. The model adopts an encoder-processor-decoder structure accepting T consecutive graph snapshots G(1), G(2),…, G(T) as input and generating predicted functional indices $\{\hat{f}_i(T+1)\}_{i=1}^{N}$ for all nodes at the next time step.

3.3.1. Heterogeneous Graph Convolution with Risk Attention

To address the semantic diversity of edges across transportation, dependency, and temporal relationships, the graph convolution layer employs type-specific aggregation with adaptive attention weighting. For layer l, the output feature hi(l,t) of node i at time t is computed as:
$$h_i^{(l,t)} = \sigma\left(\sum_{r \in R}\ \sum_{v_j \in N_r(v_i)} \frac{1}{|N_r(v_i)|}\,\frac{1}{|N_r^{-1}(v_j)|}\, w_r^{(l)}\, h_j^{(l-1,t)} + b^{(l)}\right)$$
where R = {Transport, Dependency, Temporal} is the set of edge types; Nr(vi) denotes the set of neighbors connected to vi via edges of type r; |Nr(vi)| and |Nr−1(vj)| are the out-degree of vi and in-degree of vj, respectively; wr(l) is the learnable weight matrix for edge type r at layer l; b(l) is the bias term; and σ() is the activation function, implemented as LeakyReLU to mitigate gradient vanishing.
To enhance the model’s focus on high-risk propagation pathways, a risk attention mechanism is introduced. The edge weights eij are softmax-normalized to yield attention coefficients aij, which are then multiplied with the convolutional weights wr(l). This enables the model to assign higher importance to high-weight dependency edges during feature aggregation.
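The aggregation with risk attention can be illustrated with a small dependency-free sketch. Several details are assumptions, since the text leaves them open: the softmax is taken over all edges leaving node i (across edge types), the degree normalization is 1/(|N_r(v_i)|·|N_r⁻¹(v_j)|), and the LeakyReLU slope is 0.01.

```python
import math


def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def hetero_conv(h, edges, W, b):
    """One layer. h: {node: features}; edges: {etype: {(i, j): e_ij}} with j a
    type-r neighbour of i; W: {etype: weight matrix}; b: bias vector."""
    out = {}
    for i in h:
        # Collect every typed edge leaving node i together with its weight e_ij.
        nbrs = [(r, j, e) for r, es in edges.items() for (src, j), e in es.items() if src == i]
        acc = list(b)
        if nbrs:
            m = max(e for _, _, e in nbrs)
            exps = [math.exp(e - m) for _, _, e in nbrs]   # numerically stable softmax
            total = sum(exps)
            for (r, j, _), ex in zip(nbrs, exps):
                a_ij = ex / total                                     # risk attention
                n_i = sum(1 for (src, _) in edges[r] if src == i)     # |N_r(v_i)|
                n_j = sum(1 for (_, dst) in edges[r] if dst == j)     # |N_r^-1(v_j)|
                coef = a_ij / (n_i * n_j)
                acc = [x + coef * y for x, y in zip(acc, matvec(W[r], h[j]))]
        out[i] = [leaky_relu(x) for x in acc]
    return out


# Toy graph: node "a" depends on node "b" with a high-weight dependency edge.
h0 = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = hetero_conv(h0, {"dependency": {("a", "b"): 1.0}}, {"dependency": identity}, [0.0, 0.0])
```

Because the attention coefficient grows with the edge weight, a dependency edge with weight 1.0 contributes more to the aggregated feature than one with weight 0.3 attached to the same node.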

3.3.2. Parallel GRU-TCN Temporal Architecture

Disaster evolution exhibits dual temporal characteristics: sustained processes such as gradual fire spread and abrupt events such as sudden explosions. To capture both patterns, the temporal module employs a parallel architecture combining Gated Recurrent Units (GRU) for long-range dependencies and Temporal Convolutional Networks (TCN) for short-term anomalies.
The GRU module processes the graph convolution outputs through standard gating mechanisms:
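$$u_t = \sigma\left(W_u\,[h_t;\ z_{t-1}] + b_u\right)$$
$$r_t = \sigma\left(W_r\,[h_t;\ z_{t-1}] + b_r\right)$$
$$\tilde{z}_t = \tanh\left(W_h\,[h_t;\ r_t \odot z_{t-1}] + b_h\right)$$
$$z_t = (1 - u_t) \odot z_{t-1} + u_t \odot \tilde{z}_t$$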
where ut denotes the update gate controlling information flow from previous states (Equation (5)), rt represents the reset gate determining how much past information to forget (Equation (6)), $\tilde{z}_t$ indicates the candidate hidden state computed from current inputs and selectively reset previous states (Equation (7)), and zt denotes the final hidden state obtained by interpolating between previous state and candidate state using the update gate (Equation (8)). Throughout these equations, W* denotes learnable weight matrices, [· ; ·] denotes vector concatenation, ⊙ denotes element-wise multiplication, and ht consistently represents the output from the graph convolution layer at time t, serving as input to the GRU temporal processing.
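A single GRU step following the four gate equations can be written without any deep learning library. The parameter dictionary layout and the toy values are illustrative assumptions; in practice a framework implementation would be used.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def gru_step(h_t, z_prev, p):
    """One GRU update: h_t is the graph-convolution output, z_prev the previous hidden state."""
    cat = h_t + z_prev                                      # [h_t ; z_{t-1}]
    u = [sigmoid(x) for x in affine(p["W_u"], cat, p["b_u"])]   # update gate
    r = [sigmoid(x) for x in affine(p["W_r"], cat, p["b_r"])]   # reset gate
    cat_reset = h_t + [ri * zi for ri, zi in zip(r, z_prev)]    # [h_t ; r_t * z_{t-1}]
    z_tilde = [math.tanh(x) for x in affine(p["W_h"], cat_reset, p["b_h"])]
    return [(1 - ui) * zi + ui * zt for ui, zi, zt in zip(u, z_prev, z_tilde)]


# Toy parameters: all-zero weights give u = 0.5 and a zero candidate state,
# so the new hidden state is half of the previous one.
zero = {"W_u": [[0.0, 0.0]], "b_u": [0.0], "W_r": [[0.0, 0.0]], "b_r": [0.0],
        "W_h": [[0.0, 0.0]], "b_h": [0.0]}
z_next = gru_step([1.0], [0.8], zero)
```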
The TCN module leverages dilated convolutions to capture multi-scale temporal patterns. To model the sequential dependency of disaster evolution while expanding the temporal receptive field, TCN adopts a causal convolution structure (ensuring no information leakage from future time steps to the present) with a fixed kernel size of 3. The dilation rate is defined as d = 2^(l−1), with l denoting the layer depth, so it grows exponentially with the number of network layers. This exponential growth of the dilation rate allows the TCN to cover a wide temporal range with relatively few layers, efficiently capturing both short-term fluctuations and long-range temporal correlations in disaster propagation.
Mathematically, the output of the TCN is computed via dilated causal convolution following the formula:
$$c_t^{(l)} = \mathrm{ReLU}\left(\sum_{k=0}^{K-1} w_k^{(l)}\, c_{t-k\cdot d_l}^{(l-1)} + b^{(l)}\right)$$
where
$c_t^{(l)}$: Output feature of the TCN’s l-th layer at time step t;
K = 3: Convolution kernel size, fixed for all TCN layers;
$w_k^{(l)}$: Learnable weight parameter at the k-th position in the convolution kernel of the l-th layer;
$d_l = 2^{l-1}$: Dilation rate of the l-th layer, which determines the temporal spacing of the input features sampled by the current layer;
$c_{t-k\cdot d_l}^{(l-1)}$: Input feature from the (l − 1)-th layer at time step (t − k·dl), which samples non-consecutive historical features to expand the receptive field.
After propagating through all TCN layers, the final output of the TCN module at time step t is denoted as ct, and it is concatenated with the GRU output zt as $h_t^{temp} = [z_t;\ c_t]$. This dual-pathway design enables the GRU to track cumulative degradation while the TCN identifies sudden state transitions characteristic of explosions or rapid intervention effects.
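The dilated causal convolution and the resulting receptive field can be sketched for a scalar feature sequence as follows. Zero padding for time steps before the start of the sequence is an assumption, and the number of TCN layers is not stated in the text, so the layer counts below are only examples.

```python
def dilation(layer: int) -> int:
    """d_l = 2^(l-1) for layer index l starting at 1."""
    return 2 ** (layer - 1)


def receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Number of past-and-present time steps visible to the top layer."""
    return 1 + (kernel_size - 1) * sum(dilation(l) for l in range(1, num_layers + 1))


def causal_dilated_conv(seq, weights, bias, d):
    """c_t = ReLU(sum_k w_k * seq[t - k*d] + bias); indices before 0 count as zero."""
    out = []
    for t in range(len(seq)):
        acc = bias
        for k, w in enumerate(weights):
            idx = t - k * d
            if idx >= 0:          # causal: only the present and the past are read
                acc += w * seq[idx]
        out.append(max(acc, 0.0))
    return out


layer1 = causal_dilated_conv([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0], 0.0, dilation(1))
layer2 = causal_dilated_conv([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0], 0.0, dilation(2))
rf3, rf4 = receptive_field(3), receptive_field(4)
```

With kernel size 3, three layers see 15 time steps and four layers see 31, so four layers would be enough to span a 24-step horizon.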

3.3.3. Spatiotemporal Fusion and Intervention Encoding

The fusion layer combines spatial features $h_t^{(L)}$ from the final graph convolution layer with temporal features $h_t^{temp}$ through learnable attention:
$$h_t^{fused} = \alpha\, h_t^{(L)} + (1 - \alpha)\, h_t^{temp}$$
where attention weight α ∈ [0, 1] is determined via softmax, adaptively balancing spatial and temporal information based on scenario characteristics.
At the output layer, a fully connected layer with sigmoid activation maps the fused features to the predicted functional index:
$$\hat{f}_i(t+1) = \sigma\left(W_o\, h_i^{fused} + b_o\right)$$
This value represents the predicted functional state of node i at the next time step and serves as the basis for simulating disaster propagation.
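The fusion and output steps can be sketched as below. How the two attention scores are produced is not described in the text, so they are passed in as plain numbers here; the sketch also assumes the spatial and temporal features have already been projected to the same dimension.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse(h_spatial, h_temporal, score_spatial, score_temporal):
    """alpha = softmax over the two scores; fused = alpha*h_spatial + (1-alpha)*h_temporal."""
    m = max(score_spatial, score_temporal)
    e_s, e_t = math.exp(score_spatial - m), math.exp(score_temporal - m)
    alpha = e_s / (e_s + e_t)
    fused = [alpha * s + (1 - alpha) * t for s, t in zip(h_spatial, h_temporal)]
    return alpha, fused


def predict_fi(h_fused, w_o, b_o):
    """Sigmoid output layer giving a functional index in (0, 1)."""
    return sigmoid(sum(w * x for w, x in zip(w_o, h_fused)) + b_o)


# Equal scores weight both pathways equally.
alpha, fused = fuse([1.0, 0.0], [0.0, 1.0], 0.0, 0.0)
fi_next = predict_fi(fused, [0.0, 0.0], 0.0)
```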

3.3.4. Training Optimization

The training objective is to minimize the error between predicted and true functional indices. The Mean Squared Error (MSE) loss function is used:
$$L = \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} \left(f_i(t) - \hat{f}_i(t)\right)^2$$
where $f_i(t)$ is the ground-truth functional index and $\hat{f}_i(t)$ is the model prediction.
Parameters are updated with the Adam optimizer, using an initial learning rate of 10^−3 that decays by a factor of 0.9 every 10 epochs. A dropout rate of 0.3 after the graph convolution and fully connected layers, combined with an L2 regularization coefficient of 10^−5, mitigates overfitting to the training set. Early stopping is applied when the validation loss plateaus; training typically converges within 100 epochs.
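The loss and the learning-rate schedule can be written out explicitly. The sketch below reads the decay as a step schedule (multiply by 0.9 once every 10 epochs), which is one plausible interpretation of the description; the function names are illustrative.

```python
def mse_loss(truth, pred):
    """Mean squared error over N nodes and T time steps; inputs are indexed [i][t]."""
    n, t_len = len(truth), len(truth[0])
    total = sum((f - g) ** 2
                for row_f, row_g in zip(truth, pred)
                for f, g in zip(row_f, row_g))
    return total / (n * t_len)


def learning_rate(epoch, base=1e-3, gamma=0.9, step=10):
    """Step decay: base * gamma^(epoch // step)."""
    return base * gamma ** (epoch // step)


loss = mse_loss([[1.0, 0.5]], [[0.8, 0.5]])
lr_start, lr_epoch25 = learning_rate(0), learning_rate(25)
```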
Upon convergence, the trained STGNN performs rolling prediction across 24 h disaster processes, accepting DHSG snapshots as input and generating node-level functional forecasts. This capability enables end-to-end simulation from initial misdeclaration perturbation through cascading propagation to intervention-modulated recovery, providing the foundation for subsequent resilience quantification and strategy optimization.

3.4. Resilience Quantification and Intervention Evaluation

Following dynamic disaster simulation, a systematic framework is established to quantify port system resilience and evaluate emergency response effectiveness. This framework transforms node-level functional predictions into system-level performance metrics and supports evaluation of different recovery strategies.

3.4.1. Node Functional Index

The functional state of each node vi at time step t is quantified by its Functional Index
$$FI_i(t) = 1 - \mathrm{DamageRatio}_i(t)$$
where DamageRatioi(t) ∈ [0, 1] denotes the proportion of damaged area extracted from multimodal data. The index ranges from 0 indicating complete failure to 1 representing full operational status.

3.4.2. System Functionality Aggregation

System-level operational capability is assessed by aggregating node-level indices through topology-based weighting:
$$S(t) = \frac{\sum_{i=1}^{N} w_i\, FI_i(t)}{\sum_{i=1}^{N} w_i}$$
where N is the total number of nodes, wi represents the betweenness centrality of node i, reflecting its role in mediating flows within the network. The System Function Degradation Index (SFDI) normalizes for initial state variations:
$$SFDI(t) = 1 - \frac{S(t)}{S(0)}$$
where S(0) denotes initial system functionality. Higher SFDI values indicate greater functional loss.
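A minimal sketch of the aggregation follows. The centrality weights and functional indices are made-up values for three nodes; in the framework the weights would come from a betweenness centrality computation on the port graph.

```python
def system_functionality(fi, weights):
    """S(t): centrality-weighted mean of the node functional indices."""
    return sum(w * f for w, f in zip(weights, fi)) / sum(weights)


def sfdi(s_t, s_0):
    """System Function Degradation Index: 1 - S(t)/S(0)."""
    return 1.0 - s_t / s_0


# Hypothetical crane, road and container nodes.
w = [0.5, 0.3, 0.2]
s_now = system_functionality([0.4, 0.8, 0.5], w)
loss_now = sfdi(s_now, 1.0)
```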

3.4.3. Core Resilience Metrics

Based on the SFDI(t) curve, three core resilience metrics are computed to characterize system resilience evolution:
Peak Resilience Loss (PRL) represents the maximum functional loss sustained during the disaster, reflecting the system’s absorptive capacity:
$$PRL = \max_{t \in [1, T]} SFDI(t)$$
Recovery Time (RT) measures the duration required to recover from peak functional loss to an acceptable operational level:
$$RT = t_{rec} - t_{peak}$$
where tpeak is the time at which SFDI(t) reaches its peak, and trec is the earliest time after tpeak at which S(trec) ≥ 0.9·S(0).
Resilience Area (RA) quantifies the cumulative functional loss over the entire evolution period:
$$RA = \int_{0}^{T} SFDI(t)\, dt$$
In discrete time, the trapezoidal rule is used for approximation:
$$RA \approx \sum_{t=1}^{T-1} \frac{SFDI(t) + SFDI(t+1)}{2}\, \Delta t$$
where Δt = 1 h. Smaller RA indicates superior overall resilience.
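The three metrics can be computed from a simulated functionality curve as sketched below. The input list is assumed to hold S(1), …, S(T), with S(0) passed separately; returning `None` for RT when the 90% threshold is never reached is an assumption, since the text does not say how that case is handled. The curve values are invented.

```python
def resilience_metrics(s, s0, dt=1.0, threshold=0.9):
    """Return (PRL, RT, RA) for a curve s = [S(1), ..., S(T)] with initial value s0."""
    sfdi = [1.0 - x / s0 for x in s]
    prl = max(sfdi)                                   # Peak Resilience Loss
    k_peak = sfdi.index(prl)                          # first index of the peak
    k_rec = next((k for k in range(k_peak, len(s)) if s[k] >= threshold * s0), None)
    rt = None if k_rec is None else (k_rec - k_peak) * dt   # Recovery Time
    ra = sum((a + b) / 2.0 * dt for a, b in zip(sfdi, sfdi[1:]))  # trapezoidal RA
    return prl, rt, ra


# Hypothetical five-step curve that dips to 0.4 and recovers above 0.9.
prl, rt, ra = resilience_metrics([0.9, 0.6, 0.4, 0.7, 0.95], s0=1.0)
```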
These metrics provide complementary decision support for port authorities. PRL directly identifies system vulnerabilities: facilities experiencing maximum functional loss during disasters require priority resource allocation and enhanced structural reinforcement in preparedness planning. RT quantifies temporal urgency: shorter recovery times indicate effective intervention strategies, enabling managers to compare alternative repair sequences and select approaches that minimize operational downtime. RA integrates both severity and duration: lower cumulative loss indicates cost-effective recovery pathways, supporting trade-off analysis between rapid but resource-intensive interventions and gradual but sustainable restoration approaches. Table 2 lists the symbols used in the formulas of this article and their meanings.

3.4.4. Intervention Strategy Simulation

Emergency intervention strategies are modeled as external signals to evaluate recovery effectiveness. At time step t, a repair action on degraded node vi is characterized by recovery rate ri, representing functional improvement per unit time. The value of ri depends on node type and response resource and is derived from historical emergency data and expert consultation, as shown in Table 3.
The recovery rate values in Table 3 were determined through a systematic three-step process: (1) Analysis of historical port incident recovery data documenting repair durations for different facility types across multiple disaster scenarios, (2) Consultation with port operations personnel and emergency response coordinators to validate realistic recovery timelines and resource deployment effectiveness, and (3) Calibration against documented restoration processes in similar maritime disasters. For example, the 0.20 recovery rate for ships on fire reflects typical fireboat suppression effectiveness observed in historical incidents; the 0.08 rate for damaged quay cranes accounts for the time-intensive requirements of structural assessment, component replacement, and safety certification; and the 0.40 rate for blocked roads represents rapid clearance capability achievable with standard emergency equipment. These empirically grounded values ensure that intervention simulations maintain fidelity to real-world operational constraints and enable realistic assessment of alternative recovery strategies.
The intervention is encoded as a binary indicator ai(t) ∈ {0, 1}, and it is combined with the recovery rate ri and concatenated with original node features as augmented input to the STGNN.
Based on this intervention mechanism, four typical recovery strategies are defined:
(1)
No Intervention: No repair actions are taken; the system relies on natural recovery.
(2)
Random Intervention: A node is randomly selected from the set of degraded nodes for repair.
(3)
Centrality-Priority: Nodes are prioritized for repair in descending order of betweenness centrality weight wi.
(4)
Path-Blocking: Nodes located on high-risk propagation paths are prioritized to disrupt cascading failures. The high-risk paths are determined by weighted out-degree of dependency edges.
For each strategy, the simulation proceeds as follows:
Starting from t = 1, select the intervention node according to the strategy at each time step;
Inject ai(t) and ri as input features;
Use the STGNN to predict state FIi(t + 1) of all nodes;
Repeat until t = 24, generating a complete evolution trajectory.
From the simulated system functionality curve S(t), the three resilience metrics PRL, RT and RA are computed. The optimal recovery strategy is selected by minimizing the Resilience Area (RA), which captures the total functional loss over time.
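The simulation loop can be sketched with a stand-in predictor. Here `toy_predict` is a hypothetical placeholder for the trained STGNN and simply adds the recovery rate to the repaired node; the centrality and risk scores are invented, while the recovery rates 0.08 (quay crane) and 0.40 (road) are the values quoted in the text.

```python
import random


def choose_node(strategy, fi, centrality, risk, rng):
    """Pick the node to repair at this step, or None if nothing is degraded."""
    degraded = [n for n, v in fi.items() if v < 1.0]
    if strategy == "none" or not degraded:
        return None
    if strategy == "random":
        return rng.choice(degraded)
    if strategy == "centrality":
        return max(degraded, key=centrality.get)
    if strategy == "path_blocking":
        return max(degraded, key=risk.get)
    raise ValueError(f"unknown strategy: {strategy}")


def toy_predict(fi, action, rates):
    """Placeholder for the STGNN: a repaired node gains its recovery rate, capped at 1."""
    return {n: min(1.0, v + action[n] * rates[n]) for n, v in fi.items()}


def simulate(strategy, fi0, centrality, risk, rates, predict, horizon=24, seed=0):
    rng = random.Random(seed)
    fi, trajectory = dict(fi0), [dict(fi0)]
    for _ in range(horizon):
        target = choose_node(strategy, fi, centrality, risk, rng)
        action = {n: 1 if n == target else 0 for n in fi}   # a_i(t)
        fi = predict(fi, action, rates)
        trajectory.append(dict(fi))
    return trajectory


fi0 = {"crane": 0.5, "road": 0.2}
rates = {"crane": 0.08, "road": 0.40}
centrality = {"crane": 0.6, "road": 0.4}
risk = {"crane": 0.1, "road": 0.9}
traj_c = simulate("centrality", fi0, centrality, risk, rates, toy_predict)
traj_p = simulate("path_blocking", fi0, centrality, risk, rates, toy_predict)
traj_n = simulate("none", fi0, centrality, risk, rates, toy_predict)
```

Each trajectory can then be aggregated into S(t) and scored with PRL, RT and RA to compare strategies.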

4. Case Study

This section validates the proposed framework through comprehensive sensitivity analysis on misdeclaration-induced disaster scenarios at the Ningbo, Tianjin, and Beirut ports, together with simulated scenarios. By systematically varying intervention strategies, environmental conditions, misdeclaration characteristics, and management parameters, the experiments quantify their respective impacts on system resilience and identify critical governance priorities for port safety enhancement.

4.1. Experimental Setup

This study employs a controlled variable approach to design a series of comparative scenarios, in which all parameters are held constant except for one key variable. This allows for an isolated assessment of its impact on system resilience. The experimental variables include:
(1)
Emergency resource allocation strategies;
(2)
Environmental risk levels;
(3)
Types and deviation magnitudes of false declarations;
(4)
Initial regulatory intensity and emergency response levels.
Each experiment simulates a 24-h disaster evolution process, generating the system functionality curve S(t), from which the resilience metrics PRL, RT, and RA are computed. Because only one variable changes per comparison, variations in these metrics can be attributed to the manipulated variable. By systematically altering one factor at a time, the study quantifies the marginal effects of policy levers, operational decisions, and environmental conditions on port resilience. The integration of real accident data with synthetically augmented scenarios grounds the findings empirically while extending their applicability across diverse port configurations and risk profiles.
For dataset preparation, all real and synthesized scenarios are serialized into JSON-formatted temporal snapshots. Each snapshot contains node attribute arrays, edge adjacency matrices with typed relationships and weights, and global environment vectors. The final dataset comprises 100 scenarios totaling 2400 time-step samples. The dataset was divided using stratified random sampling to ensure balanced representation across key dimensions including disaster type (category/weight/temperature-control misdeclarations), severity levels (low/medium/high deviation rates), and port facility configurations (Ningbo/Tianjin/Beirut/generic terminals). Specifically, 70 scenarios (70%) were allocated to training, 15 scenarios (15%) to validation for hyperparameter tuning, and 15 scenarios (15%) to the held-out test set for final evaluation. The stratification process ensured that each subset maintained proportional representation of rare disaster types and extreme severity cases. Final reported metrics are derived from evaluation on the test set that remained completely unseen during training and validation.

4.2. Sensitivity Analysis of Intervention Strategies

To systematically evaluate the impact of emergency resource allocation strategies on port system resilience, this section conducts a series of comparative experiments grounded in the real-world incident at Ningbo Port. In this case a container declared as ordinary cargo was found to contain Class 5.2 organic peroxides, which spontaneously ignited under high-temperature conditions, triggering a fire. The fire propagated through transportation edges to adjacent containers and quay cranes. The initial regulatory intensity is set to 0.5, the emergency response level to II, and the initial system functionality S(0) = 1.0.
All intervention actions follow the mechanism defined in Section 3.4.4: an intervention indicator ai(t) = 1 and the corresponding recovery rate ri are injected into the selected node. The STGNN model then autonomously simulates the subsequent state evolution. Each experiment is independently run for a full 24-h simulation period.
Figure 3 presents the system functionality index S(t) under the four intervention strategies. In the absence of intervention, system functionality reaches its minimum at hour 6 (S(t) = 0.38) and recovers slowly, reaching 0.62 by hour 24, indicating limited self-recovery capacity. With intervention, all strategies significantly improve system performance, but with notable differences in effectiveness.
Table 4 summarizes the resilience metrics for each strategy. The results show that:
Centrality-Priority Intervention reduces the recovery time (RT) from 18.2 to 10.5 h and decreases the resilience area (RA) by 41.2%, demonstrating superior overall performance.
Path-Blocking Intervention, while not directly repairing the highest-weight nodes, achieves the lowest peak resilience loss (PRL = 0.58), indicating a unique advantage in suppressing disaster propagation.
Random Intervention yields moderate improvement (RT = 14.7 h) but is significantly outperformed by the two prioritized strategies, highlighting the inefficiency of arbitrary resource allocation.
Further node-level analysis reveals that the Centrality-Priority strategy prioritizes the repair of quay cranes, rapidly restoring loading/unloading capacity and driving the recovery of overall logistics functionality. In contrast, the Path-Blocking strategy focuses on repairing main road segments that connect multiple burning containers, effectively containing fire spread to other operational zones, embodying a “control by strategic points” governance logic.
These findings indicate that scientifically designed intervention strategies can significantly enhance system resilience. Specifically, topology-based prioritization excels in accelerating recovery, while path-based blocking is more effective in mitigating peak losses. This provides quantitative support for emergency command decisions involving “protecting core assets” versus “containing spread.”
To quantify the stability and reliability of model predictions, we computed 95% confidence intervals for system functionality indices using bootstrap resampling. For each intervention strategy, the model was evaluated on all 15 test scenarios. Bootstrap resampling (1000 iterations) was then applied: in each iteration, 15 scenarios were randomly sampled with replacement from the test set, and the mean system functionality S(t) at each time step was computed. This process generated 1000 functionality curves, from which the 2.5th and 97.5th percentiles were extracted to form the 95% confidence intervals at each time point.
Figure 4 presents the system functionality evolution curves with confidence bands under the four intervention strategies. The results demonstrate consistent model predictions across test scenarios, with confidence intervals remaining relatively narrow during most phases of disaster evolution. Notably, the intervals widen during critical transition periods (approximately hours 4–8) when cascading failures are most dynamic, which is expected behavior reflecting inherent variability in disaster propagation patterns. During the recovery phase (hours 12–24), the confidence intervals narrow significantly, indicating stable prediction of intervention effectiveness. This bootstrap analysis validates the model’s generalization capability and prediction reliability on unseen test scenarios.
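The bootstrap procedure described above can be sketched with the standard library. The linear-interpolation percentile and the fixed random seed are implementation choices assumed here, and the input curves are placeholders rather than model outputs.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_ci(curves, n_boot=1000, seed=0):
    """95% band for the mean S(t): resample scenarios with replacement n_boot times."""
    rng = random.Random(seed)
    n, t_len = len(curves), len(curves[0])
    means = []
    for _ in range(n_boot):
        sample = [curves[rng.randrange(n)] for _ in range(n)]
        means.append([sum(c[t] for c in sample) / n for t in range(t_len)])
    lower, upper = [], []
    for t in range(t_len):
        column = sorted(m[t] for m in means)
        lower.append(percentile(column, 2.5))
        upper.append(percentile(column, 97.5))
    return lower, upper


# Placeholder curves for three scenarios over three time steps.
curves = [[1.0, 0.4, 0.7], [1.0, 0.5, 0.8], [1.0, 0.3, 0.9]]
lower, upper = bootstrap_ci(curves)
```

Time steps at which scenarios disagree produce a wider band, which matches the widening reported during the transition period.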

4.3. Environmental Factor Sensitivity Analysis

External environmental conditions, particularly meteorological factors, exert a significant influence on the evolution trajectory of disaster incidents triggered by misdeclared hazardous goods at ports. To quantitatively assess the impact mechanisms of wind speed and temperature on disaster propagation intensity and system functional loss, this section systematically adjusts wind speed and temperature levels within the context of a hazardous material storage misdeclaration incident at Tianjin Port. The initial disturbance originates in the core area of the container yard, with fire spreading through transportation edges to adjacent containers and operational facilities. All other parameters are held constant (regulatory intensity = 0.5, emergency response level = Level II), while the following two categories of environmental variables are adjusted:
Wind Speed:
Low wind speed: 3 m/s (light breeze, negligible impact on fire spread);
Medium wind speed: 8 m/s (strong wind, accelerating lateral fire spread);
High wind speed: 12 m/s (gale, significantly intensifying flame propagation and increasing the risk of firebrand ignition).
Wind direction is set to blow from the ignition point toward the main loading/unloading area, establishing a dominant propagation path.
Temperature:
Normal temperature: 25 °C (standard operating conditions);
High temperature: 35 °C (extreme summer weather).
Temperature affects the weight of dependency edges by modulating the thermal decomposition rate of hazardous chemicals. Based on the Arrhenius kinetic model, the self-accelerating decomposition rate of nitrocellulose at 35 °C is estimated to be approximately 40% higher than at 25 °C, making the initial disturbance more likely to trigger a cascading reaction.
Figure 5 illustrates the evolution of the system functionality S(t) under different wind speeds. Higher wind speeds lead to a more rapid decline in system functionality and an earlier minimum. Under high wind speed, the fire breaches the yard’s isolation barrier by the 4th hour, causing multiple containers to ignite consecutively. System functionality reaches its lowest point at the 5th hour (S(t) = 0.29, PRL = 0.71), a 20.3% increase over the PRL under low wind speed (0.59). Furthermore, under high wind conditions, RT extends from 11.9 h to 16.4 h (+37.8%), and RA increases from 7.93 to 11.02 (+39.0%). These results indicate that strong winds significantly compress the emergency response time window and exacerbate the cascading effects of the disaster.
Figure 6 compares the fire propagation paths under high temperature and normal temperature at the 4th hour. Under high temperature, the accelerated thermal decomposition rate of nitrocellulose causes five adjacent containers to spontaneously ignite within two hours of the initial ignition, forming a “multi-point outbreak” pattern. In contrast, under normal temperature, the fire spreads slowly from a single point, affecting only two neighboring nodes. The simulation results show that high temperature increases PRL from 0.60 to 0.66 (+10.0%), extends RT from 12.1 h to 15.3 h (+26.4%), and increases RA from 8.78 to 11.27 (+28.4%). This demonstrates that high temperature not only accelerates the development of the initial fire but also significantly increases the probability of secondary disasters.
Table 5 summarizes the impact of wind speed and temperature on system resilience. High wind speed has the largest effect on the cumulative functional loss of the system (RA, +39.0%), followed by high temperature (+28.4%). Both environmental factors intensify disaster propagation, which reduces the system’s disturbance resistance and prolongs the recovery cycle.
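The three metrics compared in this section can be derived from a sampled S(t) curve. The sketch below is a minimal illustration under assumed definitions, which may differ in detail from the paper's Equations (14)–(18): PRL is taken as the maximum of 1 − S(t), RA as the rectangle-rule sum of 1 − S(t) with Δt = 1 h, and RT as the time from the first drop below a threshold until S(t) returns to it after the minimum. The `recovery_threshold` parameter and the example curve are hypothetical.

```python
def resilience_metrics(s, dt=1.0, recovery_threshold=0.95):
    """Return (PRL, RT, RA) for a system-functionality series sampled every dt hours.

    Assumed definitions (illustrative only):
      PRL = max_t (1 - S(t))
      RA  = sum_t (1 - S(t)) * dt
      RT  = time between the first sample below the threshold and the first
            sample at or above it after the minimum (None if never recovered)
    """
    loss = [1.0 - value for value in s]
    prl = max(loss)
    ra = sum(loss) * dt
    t_min = loss.index(prl)

    # First time step at which functionality falls below the threshold.
    onset = next((k for k, value in enumerate(s) if value < recovery_threshold), None)
    if onset is None:
        return prl, 0.0, ra

    # First time step after the minimum at which functionality is restored.
    recovered = next(
        (k for k in range(t_min, len(s)) if s[k] >= recovery_threshold), None
    )
    rt = None if recovered is None else (recovered - onset) * dt
    return prl, rt, ra


# Hypothetical hourly curve with a minimum of 0.29 at hour 4.
curve = [1.0, 1.0, 0.8, 0.5, 0.29, 0.4, 0.6, 0.8, 0.95, 1.0]
prl, rt, ra = resilience_metrics(curve)
print(prl, rt, ra)
```

Keeping the threshold as a parameter makes the dependence of RT on the chosen recovery criterion explicit.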

4.4. Sensitivity Analysis of Misdeclaration Type and Deviation Rate

In port hazardous goods regulation, misdeclaration behavior is the core human factor triggering major accidents. Different types of misdeclaration (e.g., concealing the cargo category, falsifying weight, or misreporting temperature-control status) differ in their level of concealment and in their destructive mechanisms. To evaluate the differences in their impact on system resilience, this section designs a series of comparative experiments based on the Beirut Port ammonium nitrate misdeclaration incident scenario, systematically analyzing the sensitivity of disaster evolution pathways to misdeclaration type and deviation rate.
The initial state involves containers declared as ordinary fertilizer but actually carrying ammonium nitrate, which decomposes and ultimately explodes under prolonged damp and high-temperature conditions. Environmental conditions (temperature = 30 °C, wind speed = 5 m/s) and management parameters (regulatory intensity = 0.5, emergency response level = Level II) are fixed. The experimental variables are designed as follows:
Misdeclaration Type:
Category Misdeclaration: Declaring ammonium nitrate as ordinary organic fertilizer, completely concealing its hazardous chemical nature.
Weight Misdeclaration: The actual weight is 150% of the declared value, leading to excessive stacking density and accelerated heat accumulation.
Temperature-Control Misdeclaration: Falsely declaring that the container has temperature control capabilities when it does not, resulting in reaction runaway under high temperatures.
Misdeclaration Deviation Rate:
Defined as the relative difference between the declared value and the true value, used to quantify the severity of the misdeclaration.
Three levels are set: 10% (minor deviation), 30% (moderate deviation), 50% (severe deviation).
For category misdeclaration, the deviation rate reflects the “hazard level difference” of the concealment.
The initial disturbance probability increases linearly with the deviation rate, reaching a maximum of 2.5 times the baseline value.
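The linear scaling of the initial disturbance probability can be sketched as follows. This is an illustration under two assumptions that the text does not state explicitly: the multiplier equals 1 at zero deviation, and the maximum of 2.5 is reached at the highest tested deviation rate (50%). The function name and parameters are hypothetical.

```python
def disturbance_multiplier(deviation_rate, max_rate=0.50, max_multiplier=2.5):
    """Multiplier applied to the baseline initial disturbance probability.

    Assumes linear growth from 1.0 at zero deviation to max_multiplier at max_rate.
    """
    if not 0.0 <= deviation_rate <= max_rate:
        raise ValueError("deviation_rate must lie in [0, max_rate]")
    return 1.0 + (max_multiplier - 1.0) * deviation_rate / max_rate


# Multipliers for the three deviation levels used in the experiments.
for rate in (0.10, 0.30, 0.50):
    print(f"deviation {rate:.0%}: x{disturbance_multiplier(rate):.2f}")
```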
Figure 7 presents the system function evolution curves for the three types of misdeclaration under a moderate deviation rate (30%). It is evident that category misdeclaration causes the system function to plummet to 0.28 at the 14th hour (PRL = 0.72), significantly earlier than the other types, with the most sluggish recovery process (RT = 17.6 h). The fundamental reason is that category misdeclaration completely circumvents hazardous goods regulatory procedures, resulting in ammonium nitrate being stored long-term in non-dedicated yards without temperature monitoring or isolation measures, ultimately leading to an explosion with no prior warning.
In contrast, although weight misdeclaration does not alter the cargo’s nature, the excessive stacking density hinders heat dissipation. Multiple containers sequentially catch fire between the 10th and 12th hours, leading to a gradual functional decline (PRL = 0.68, RT = 15.2 h). Temperature-control misdeclaration, while still recognized as a temperature-sensitive cargo and thus subject to some initial monitoring, collapses functionally after the 16th hour due to the actual lack of cooling, exhibiting a “delayed outbreak” characteristic (PRL = 0.65, RT = 14.8 h).
Table 6 summarizes the average resilience metrics for the three misdeclaration types at different deviation rates. The results show that:
The Resilience Area (RA) for category misdeclaration is consistently the highest. When the deviation rate increases from 10% to 50%, the RA rises from 6.42 to 12.38, an increase of 92.8%, far exceeding that of weight misdeclaration (+68.5%) and temperature-control misdeclaration (+54.1%).
The Peak Resilience Loss (PRL) for category misdeclaration is the most sensitive to the deviation rate. Between deviation rates of 10% and 50%, its PRL rises from 0.58 to 0.82, an average of 0.060 per 10-percentage-point increase, compared with 0.0475 for weight misdeclaration and 0.0425 for temperature-control misdeclaration. The increase is steeper between 10% and 30% (+0.14) than between 30% and 50% (+0.10), so the growth in destructive power is nonlinear.
Under high deviation rates (50%), weight misdeclaration leads to a significant extension of the RT, reflecting its sustained consumption of system recovery resources.
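The relative changes discussed above can be re-derived directly from the Table 6 values. The short sketch below does this; the dictionary layout and function names are hypothetical, while the numbers are copied from Table 6.

```python
# (PRL, RT in hours, RA) per deviation rate in percent, copied from Table 6.
TABLE_6 = {
    "category": {10: (0.58, 13.4, 6.42), 30: (0.72, 17.6, 9.81), 50: (0.82, 19.3, 12.38)},
    "weight": {10: (0.56, 12.8, 5.91), 30: (0.68, 15.2, 8.47), 50: (0.75, 18.9, 9.96)},
    "temperature_control": {10: (0.54, 12.5, 5.73), 30: (0.65, 14.8, 7.82), 50: (0.71, 16.5, 8.83)},
}


def ra_growth_percent(kind):
    """Relative RA increase (in percent) when the deviation rate goes from 10% to 50%."""
    low, high = TABLE_6[kind][10][2], TABLE_6[kind][50][2]
    return 100.0 * (high - low) / low


def prl_step_per_10_points(kind):
    """Average PRL increase per 10-percentage-point rise in deviation rate."""
    low, high = TABLE_6[kind][10][0], TABLE_6[kind][50][0]
    return (high - low) / 4.0  # 10% -> 50% spans four 10-point steps


for kind in TABLE_6:
    print(f"{kind}: RA growth {ra_growth_percent(kind):.1f}%, "
          f"PRL step {prl_step_per_10_points(kind):.4f}")
```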

4.5. Sensitivity Analysis of Initial Regulatory Intensity and Emergency Response Level

Within the port safety governance system, initial regulatory intensity and emergency response level are two critical management decision variables. The former reflects the capability of daily risk prevention and control, while the latter embodies the preparedness level for emergency response. To evaluate their impact on resilience in scenarios involving misdeclaration-induced disasters, this section systematically analyzes the independent and synergistic effects of regulatory intensity and response level based on the average performance across 15 simulated accident scenarios. The experimental variables are designed as follows:
Initial Regulatory Intensity (ρ):
Defined as the regulatory system’s ability to identify and intercept misdeclaration behaviors, with a value range of [0, 1].
Low regulation (ρ = 0.3): Low inspection rate, insufficient inspection coverage.
Medium regulation (ρ = 0.5): Standard regulatory level, corresponding to current practices in most ports.
High regulation (ρ = 0.7): Enhanced inspections, combining AI-driven risk analysis and multi-source data verification.
Regulatory intensity directly affects the initial disturbance probability $p_{init}$, modeled by the relation:
$$p_{init} = p_0 \, (1 - \rho)$$
where $p_0$ is the baseline disturbance probability in the absence of regulation.
Emergency Response Level:
Level II (Routine): Activating on-site emergency teams and deploying routine rescue resources.
Level III (Enhanced): Deploying specialized teams and opening emergency access routes.
Level IV (Emergency): Full port coordination, external support forces engaged, recovery rates enhanced.
The response level influences the system recovery process by increasing the recovery rate ri for various nodes.
A full factorial design is adopted, resulting in 3 × 3 = 9 combinations. Each group runs 15 scenarios, and the results are averaged to output PRL, RT, and RA.
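The factorial design can be enumerated as in the sketch below. It assumes that the regulatory relation reads p_init = p0 · (1 − ρ), which is consistent with the negative correlation described for Figure 9. The baseline value `P0` is a hypothetical placeholder, and the effect of the response level on recovery rates is not modeled here.

```python
from itertools import product

RHO_LEVELS = (0.3, 0.5, 0.7)            # low, medium, high regulatory intensity
RESPONSE_LEVELS = ("II", "III", "IV")   # emergency response levels
P0 = 0.2                                # hypothetical baseline disturbance probability


def initial_disturbance_probability(p0, rho):
    """Assumed relation: p_init = p0 * (1 - rho), with rho in [0, 1]."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return p0 * (1.0 - rho)


# Full factorial grid: every regulatory intensity paired with every response level.
grid = [
    (rho, level, initial_disturbance_probability(P0, rho))
    for rho, level in product(RHO_LEVELS, RESPONSE_LEVELS)
]

for rho, level, p_init in grid:
    print(f"rho={rho:.1f}, Level {level}: p_init={p_init:.3f}")
```

In the experiments described above, each of these nine combinations would be run over the 15 scenarios and the resulting PRL, RT, and RA values averaged.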
Figure 8 presents the system function evolution curves under different combinations of regulatory intensity and response level. It is evident that regulatory intensity significantly delays the onset of disaster: under high regulation (ρ = 0.7), the start of the system function decline is delayed by 3.2 h, indicating that effective proactive prevention can postpone the triggering of disasters. In contrast, the response level primarily affects the recovery slope—the higher-level response makes the curve rebound more steeply, significantly shortening the recovery time.
Table 7 summarizes the average resilience metrics for the nine combinations. The results show that:
Increasing regulatory intensity from 0.3 to 0.7 reduces the PRL from 0.71 to 0.62 (−12.7%), shortens the RT from 16.8 h to 13.2 h (−21.4%), and decreases the RA by 29.6%.
Elevating the response level from Level II to Level IV only reduces the PRL by 5.6% (0.71 → 0.67), but shortens the RT by 25.0% (16.8 → 12.6 h) and reduces the RA by 22.8%.
The combination of high regulation and Level IV response performs optimally (PRL = 0.58, RT = 11.4 h, RA = 7.03), reducing the RA by 38.2% compared to the baseline combination (ρ = 0.5, Level II).
Further analysis of marginal effects reveals:
When regulatory intensity is low (ρ = 0.3), enhancing the response level can shorten the RT, but the PRL remains high, indicating that “post-event response” cannot compensate for “pre-event loss of control.”
When regulatory intensity is high (ρ = 0.7), even initiating only a Level II response results in system recovery performance superior to the low-regulation + Level IV response combination, corroborating the resilience governance principle that “prevention is better than response.”
Figure 9 visually demonstrates the negative correlation between regulatory intensity and initial disturbance probability, validating the rationality of the model setup. Figure 10 shows that a Level IV response increases the recovery rate for critical nodes such as quay cranes and roads by over 40%, significantly accelerating the reconstruction of system functionality.

5. Conclusions

This study presents a comprehensive analytical framework for port safety risks associated with the misdeclaration of hazardous goods, integrating multimodal perception, dynamic heterogeneous spatiotemporal graph modeling, and quantitative resilience assessment. By employing a Spatiotemporal Graph Neural Network (STGNN), the framework enables high-fidelity simulation of disaster evolution processes. A series of multidimensional sensitivity experiments were conducted to reveal the underlying mechanisms through which intervention strategies, environmental factors, and management decisions influence system resilience. The principal findings are as follows:
(1) The modeling approach based on Dynamic Heterogeneous Spatiotemporal Graphs (DHSG) effectively captures the complex coupling mechanisms of disasters triggered by misdeclaration. By distinguishing between transportation edges, dependency edges, and temporal edges, the model accurately characterizes the propagation pathways of fires, explosions, and other hazards across spatial, functional, and environmental dimensions. The synergistic architecture of Qwen2.5-VL and STGNN facilitates an end-to-end mapping from multimodal data such as images and textual reports to node states, thereby enhancing the automation and objectivity of state perception.
(2) Scientifically designed emergency intervention strategies can significantly enhance the system’s recovery capacity. Sensitivity analyses demonstrate that the Centrality-Priority Intervention strategy, which prioritizes the restoration of nodes with high betweenness centrality, can reduce the Recovery Time by 42%. Conversely, the Path-Blocking Intervention strategy proves effective in suppressing the Peak Resilience Loss, validating the guiding value of topological structure in emergency decision-making. By incorporating recovery rates as external control signals into the model, the framework supports what-if simulation for strategy pre-evaluation and effect comparison.
(3) Environmental factors exert a significant amplifying effect on disaster evolution. Increased wind speed and temperature markedly accelerate fire spread, leading to a 12–20% increase in Peak Resilience Loss and a 28–39% expansion in Resilience Area. Specifically, high wind speeds primarily compress the response window by enhancing spatial diffusion, while elevated temperatures trigger a multi-point outbreak pattern by accelerating chemical reaction rates, underscoring the necessity of proactive early warnings and physical isolation measures under extreme meteorological conditions.
(4) Among various types of misdeclaration, category misdeclaration poses the highest risk. Unlike weight or temperature-control misdeclarations, category misdeclaration completely circumvents the regulatory system, allowing risks to accumulate silently over time and ultimately resulting in a latent eruption. The Resilience Area associated with category misdeclaration exhibits nonlinear growth with increasing deviation rates; at a 50% deviation rate, Resilience Area increases by 92.8% compared to the 10% baseline, highlighting the critical role of intelligent image analysis and multi-source data verification technologies in source-level prevention.
(5) The initial regulatory intensity has a stronger positive effect on system resilience than the emergency response level. Enhancing regulatory intensity significantly reduces the initial disturbance probability and delays disaster onset. Raising regulatory intensity from 0.3 to 0.7 reduces the Resilience Area by 29.6%, compared with 22.8% for upgrading the response level from Level II to Level IV. The combination of high regulatory intensity and a high-level emergency response yields the best resilience performance, substantiating the governance principle of prevention over response and providing quantitative justification for resource allocation.
The effectiveness of the proposed framework stems from the synergistic interaction between its LLM and STGNN components. The LLM component addresses the fundamental data scarcity challenge by synthesizing physically consistent disaster scenarios that preserve multimodal relationships, including spatial layouts, temporal event progression, and functional dependencies. This enables the STGNN to learn from diverse training examples reflecting a wide range of disaster evolution patterns. The STGNN captures complex spatiotemporal dependencies in two ways: heterogeneous graph convolutions distinguish transportation, dependency, and temporal edge semantics for spatial coupling, and a parallel GRU-TCN architecture captures both gradual degradation and abrupt transitions in temporal evolution. This complementary design enables accurate prediction of cascading failures and quantification of intervention effects. These capabilities are essential for port recovery planning but unattainable with either component alone or with traditional resilience models that rely on static indicators or predefined transition rules.
Based on these findings, the following policy recommendations are proposed:
(1) Establish an integrated intelligent supervision platform that combines risk pre-identification, dynamic simulation, and resilience evaluation, incorporating AI-powered image analysis, multi-source data verification, and disaster simulation capabilities to enhance the efficiency of misdeclaration detection.
(2) Develop a differentiated emergency response strategy repository that prescribes intervention priorities based on specific misdeclaration types and environmental conditions, enabling precision rescue through topology-based resource allocation.
(3) Elevate the regulatory standards for hazardous material storage during high-temperature and high-wind seasons, restrict the dense stacking of high-risk cargo, and strengthen real-time monitoring of temperature and humidity to mitigate environmental amplification effects.
(4) Promote peacetime-emergency integration in resilient infrastructure development by maintaining emergency access routes and reserve resources during regular operations to minimize response delays.
Future research will explore several promising directions:
(1) Transfer learning applications, where the trained STGNN can be adapted to other ports with limited incident data through domain adaptation techniques, requiring only modest retraining on local facility configurations.
(2) Integration with digital twin systems, where the framework could provide real-time resilience forecasting by processing live operational data and generating predictive disaster scenarios for proactive risk management.
(3) Extension to multi-port network analysis, modeling cascading disruptions across connected ports in regional supply chains.
(4) Integration of real-time decision support, where the intervention simulation capability could be deployed in emergency operation centers to evaluate response strategies during actual incidents.
These extensions would significantly enhance the practical impact and generalizability of our methodology.
In summary, the resilience assessment framework developed in this study not only provides a scientific tool for managing misdeclaration risks at ports but also offers a transferable methodological paradigm for the safety governance of complex infrastructure systems.

Author Contributions

Conceptualization, B.S.; methodology, B.S. and Y.W.; software, Y.W.; validation, B.S.; formal analysis, B.S.; investigation, L.X.; resources, L.X.; data curation, L.X.; writing—original draft preparation, Y.W.; writing—review and editing, B.S.; visualization, Y.W.; supervision, B.S. and L.X.; project administration, B.S.; funding acquisition, B.S. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Key R&D Program of China (Grant No. 2024YFC3014500), funded by the Ministry of Science and Technology of the People’s Republic of China.

Data Availability Statement

Third-party data. The availability of these data is limited. The data are obtained from third parties and are available from the authors with the permission of the third parties.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, L.; Su, W.; Liao, S.; Wang, S. Enhancing energy security through multi-scale network analysis: Robustness in global crude oil shipping–trade networks. Reliab. Eng. Syst. Saf. 2026, 265 Pt A, 111525. [Google Scholar] [CrossRef]
  2. Li, C.; Yang, X.; Yang, D. Port vulnerability to natural disasters: An integrated view from hinterland to seaside. Transp. Res. Part D Transp. Environ. 2025, 139, 104563. [Google Scholar] [CrossRef]
  3. Wang, P.; Wang, M.; Zuo, L.; Xi, M.; Li, S.; Wang, Z. Risk assessment of marine disasters in fishing ports of Qinhuangdao, China. Reg. Stud. Mar. Sci. 2023, 60, 102832. [Google Scholar] [CrossRef]
  4. Al Maalouf, N.J.; Mouawad, C. Assessing the impact of the Beirut port explosion on supply chain management and seaport infrastructure in Lebanon: A pathway to resilience and reform. Transp. Res. Interdiscip. Perspect. 2025, 32, 101555. [Google Scholar] [CrossRef]
  5. Liou, J.-H.; Tseng, P.-H. Exploring the ship operation safety indicators of international ports in Taiwan. Marit. Transp. Res. 2024, 6, 100111. [Google Scholar] [CrossRef]
  6. Ashraf, S.; Garg, H.; Kousar, M. An industrial disaster emergency decision-making based on China’s Tianjin city port explosion under complex probabilistic hesitant fuzzy soft environment. Eng. Appl. Artif. Intell. 2023, 123 Pt B, 106400. [Google Scholar] [CrossRef]
  7. Wu, X.; Wang, K.; Fu, X.; Jiang, C.; Zheng, S. How would co-opetition with dry ports affect seaports’ adaptation to disasters? Transp. Res. Part D Transp. Environ. 2024, 130, 104194. [Google Scholar] [CrossRef]
  8. Zhuang, H.; Chen, Y.; Gu, S.; Vanelslander, T. Leveraging structural topic modeling to compare “Green Port” research and practice. Reg. Stud. Mar. Sci. 2025, 86, 104196. [Google Scholar] [CrossRef]
  9. Wu, Y.; Yang, J.; Tu, M.; Naseem, M.H. Comprehensive assessment of port resilience: A three-stage cycle approach. Reg. Stud. Mar. Sci. 2025, 86, 104170. [Google Scholar] [CrossRef]
  10. Tang, M.; Zhang, Y.; Li, C.; Song, Y.; Huang, H.; Niu, W.; Zhang, C. Assessment of port resilience based on Evidential Reasoning and Bayesian network: An improved framework by segmenting the metrics across time and performance dimensions. Reliab. Eng. Syst. Saf. 2025, 262, 111172. [Google Scholar] [CrossRef]
  11. Xu, B.; Tian, Y.; Li, J. Revealing spatiotemporal connections in container hub ports under adverse events through link prediction. J. Transp. Geogr. 2025, 125, 104198. [Google Scholar] [CrossRef]
  12. Li, S.; Xie, J.; Wang, X.; Mei, Z.; Cai, N. A digital Twin-based bi-directional deduction method for the full-pose of the Floating connection mechanism. Measurement 2024, 224, 113905. [Google Scholar] [CrossRef]
  13. Xin, X.; Yang, Z. Resilience assessment: Insights from port community structures across the global container shipping network. Reliab. Eng. Syst. Saf. 2026, 265 Pt A, 111489. [Google Scholar] [CrossRef]
  14. Wang, S.; Wang, H.; Ma, X.; Han, Y.; Xue, G.; Zhang, L.; Li, Y. Resilience analysis and recovery strategy for interdependent automated container port networks under cascading failures. Reliab. Eng. Syst. Saf. 2026, 265 Pt A, 111495. [Google Scholar] [CrossRef]
  15. Song, B.; Shi, L.; Ma, Z. An assessment of shipping network resilience under the epidemic transmission using a SEIR Model. J. Mar. Sci. Eng. 2025, 13, 1166. [Google Scholar] [CrossRef]
  16. Polydoropoulou, A.; Velegrakis, A.; Papaioannou, G.; Karakikes, I.; Bouhouras, E.; Thanopoulou, H.; Chatzistratis, D.; Monioudi, I.; Moschopoulos, K.; Chatzipavlis, A. A composite port resilience index focused on climate-related hazards: Results from Greek ports’ living-labs. Marit. Transp. Res. 2025, 9, 100136. [Google Scholar] [CrossRef]
  17. Chang, Z.; Suo, M.; Fan, H.; Wang, J.; Lai, W. Port resilience assessment under congestion disruptions. J. Sea Res. 2025, 207, 102611. [Google Scholar] [CrossRef]
  18. Xing, Z.; Zhou, C.; Shen, Y.; Chew, E.P.; Tan, K.C. Optimizing port system resilience through integrated preparedness. Reliab. Eng. Syst. Saf. 2026, 266, 111770. [Google Scholar] [CrossRef]
  19. Fernando, G.-S.; Gina, G.; Daniel, R.-R. A Bayesian network and DEMATEL-ISM based approach for evaluating resilience: A case study in inland waterway port. Ocean Coast. Manag. 2025, 270, 107881. [Google Scholar] [CrossRef]
  20. Liu, G.; Liu, S.; Li, X.; Li, X.; Gong, D. Multiscenario deduction analysis for railway emergencies using knowledge metatheory and dynamic Bayesian networks. Reliab. Eng. Syst. Saf. 2025, 255, 110675. [Google Scholar] [CrossRef]
  21. Zhang, J.; Wang, L.; Shi, M.; Li, Z.; Liu, B. Analysis of risk evolution mechanisms for hydrogen leakage in HECS: A dynamic Bayesian network and scenario deduction approach. Int. J. Hydrogen Energy 2025, 163, 150804. [Google Scholar] [CrossRef]
  22. Lu, F.; Meng, F.; Bi, H. Scenario deduction of explosion accident based on fuzzy dynamic Bayesian network. J. Loss Prev. Process Ind. 2025, 96, 105613. [Google Scholar] [CrossRef]
  23. Ma, M.; Hua, X.; Zhang, Y.; Zhai, Z. Spatiotemporal polynomial graph neural network for anomaly detection of complex systems. Measurement 2024, 235, 115035. [Google Scholar] [CrossRef]
  24. Bai, X.; Ma, Z.; Zhou, Y. Data-driven static and dynamic resilience assessment of the global liner shipping network. Transp. Res. Part E Logist. Transp. Rev. 2022, 161, 102661. [Google Scholar] [CrossRef]
  25. Lin, H.; Zeng, W.; Luo, J.; Nan, G. An analysis of port congestion alleviation strategy based on system dynamics. Ocean Coast. Manag. 2022, 229, 106336. [Google Scholar] [CrossRef] [PubMed]
  26. Gao, P.; Shuai, B.; Zhang, R.; Wang, B. STCAGNN-RNKDE: A traffic accident prediction model for spatiotemporal combinatorial attention graph neural networks using Ripley’s K and network kernel density estimation. Reliab. Eng. Syst. Saf. 2025, 265, 111593. [Google Scholar] [CrossRef]
  27. Li, L.; Wei, C.; Liu, J.; Chen, J.; Yuan, H. Assessing port cluster resilience: Integrating hypergraph-based modeling and agent-based simulation. Transp. Res. Part D Transp. Environ. 2024, 137, 104485. [Google Scholar] [CrossRef]
  28. Srisurin, P.; Pimpanit, P.; Jarumaneeroj, P. Evaluating the long-term operational performance of a large-scale inland terminal: A discrete event simulation-based modeling approach. PLoS ONE 2022, 17, e0278649. [Google Scholar] [CrossRef]
  29. Gu, B.; Liu, X. Data-driven approach for port resilience evaluation. Transp. Res. Part E Logist. Transp. Rev. 2024, 186, 103558. [Google Scholar] [CrossRef]
  30. Cuong, T.N.; Long, L.N.B.; Kim, H.-S.; You, S.-S. Data analytics and throughput forecasting in port management systems against disruptions: A case study of Busan Port. Marit. Econ. Logist. 2023, 25, 61–89. [Google Scholar] [CrossRef]
  31. Khan, R.U.; Yin, J.; Asad, M.; Alomayri, T.; Jameel, M. Predicting seaport disruptions from natural hazards using automated machine learning. Mar. Pollut. Bull. 2025, 213, 117610. [Google Scholar] [CrossRef] [PubMed]
  32. Lei, Z.; Dong, Y.; Li, W.; Ding, R.; Wang, Q.; Li, J. Harnessing large language models for disaster management: A survey. arXiv 2025, arXiv:2501.06932. [Google Scholar] [CrossRef]
  33. Bai, S.; Chen, K.; Liu, X.; Wang, J.; Ge, W.; Song, S.; Dang, K.; Wang, P.; Wang, S.; Tang, J.; et al. Qwen2. 5-vl technical report. arXiv 2025, arXiv:2502.13923. [Google Scholar]
Figure 1. Integrated data processing and modeling pipeline of the hybrid Large Language Model-Spatiotemporal Graph Neural Network framework. The data flows in the framework shown in Figure 1 as follows: Raw multimodal disaster data (surveillance videos, satellite images, investigation reports, news) → Qwen2.5-VL multimodal LLM performs cross-modal alignment, visual segmentation, and state extraction → JSON-formatted temporal snapshots containing node attributes, edge relationships, and environmental variables → Dynamic Heterogeneous Spatiotemporal Graph (DHSG) construction with time-evolving feature matrices and typed adjacency tensors → STGNN model combining heterogeneous graph convolution and parallel GRU-TCN temporal module → Node-level functional index predictions → Topology-weighted aggregation produces system functionality and three resilience metrics (Peak Resilience Loss, Recovery Time, Resilience Area). Intervention signals (repair actions and recovery rates) are injected as external control variables to enable “what-if” simulation of alternative emergency strategies.
Figure 2. DHSG (Dynamic Heterogeneous Spatiotemporal Graph) evolution during disaster propagation. (a) System under normal conditions; (b) Network disruption leading to container node failure; (c) Cascading failure affecting ship and water area nodes.
Figure 3. System functionality evolution under different intervention strategies.
Figure 4. Stability assessment with 95% bootstrap confidence intervals: (a) No Intervention; (b) Random Intervention; (c) Centrality-Priority; (d) Path-Blocking.
Figure 5. System functionality evolution under different wind speeds.
Figure 6. Fire spread paths under different temperatures at the 4th hour.
Figure 7. System functionality evolution under three types of misdeclaration (deviation rate = 30%).
Figure 8. Average system functionality under different regulatory intensities and response levels.
Figure 9. Relationship between regulatory intensity and initial disturbance probability.
Figure 10. Average recovery rate enhancement under different response levels.
Table 1. Node categories and extracted attributes.
| Node Category | Extracted Attributes |
| --- | --- |
| Infrastructure (Containers, Quay Cranes, Vessels, Roads, Water Areas) | ID, Type, Spatial Coordinates, Geometric Extent, Status Label, Damage Ratio, Functional Index |
| Response Units (Emergency Vehicles, Personnel, Fire Stations) | ID, Type, Real-time Location, Status (Standby/Deployed/Operating), Service Range, Available Resources |
| Environmental Factors (Wind Speed, Wind Direction, Temperature) | Real-time Measurements, Alert Thresholds, Amplification Coefficient for Disaster Propagation |
| Management Variables (Global Parameters) | Regulatory Intensity (0–1), Emergency Response Level (I–IV), Misdeclaration Type, Misdeclaration Deviation Rate |
Table 2. Notation used in the methodology section.
| Symbol | Description | Type/Range | Related Equations |
| --- | --- | --- | --- |
| $i, j$ | Indices of nodes in DHSG | Integer | (3), (4), (13) |
| $t$ | Discrete time step | $t = 0, 1, \ldots, T$ | (1)–(3), (5)–(9), (13)–(18) |
| $G(t)$ | Dynamic heterogeneous spatiotemporal graph | Graph | Section 3.2, (4) |
| $v_i$ | Node $i$ in DHSG | Node | (3), (4), (13) |
| $\mathrm{Status}_i(t)$ | Operational status of node $i$ | Categorical | (3) |
| $x_i$ | Static node attributes | Feature vector | (3) |
| $\mathrm{DamageArea}_i(t)$ | Damaged area at time $t$ | Real ≥ 0 | (1) |
| $\mathrm{TotalArea}_i$ | Geometric area of node $i$ | Real > 0 | (1) |
| $\mathrm{DamageRatio}_i(t)$ | Damage ratio of node $i$ | [0, 1] | (1), (2) |
| $FI_i(t)$ | Functional index of node $i$ | [0, 1] | (2), (13) |
| $\widehat{FI}_i(t+1)$ | Predicted functional index | [0, 1] | (11), (12) |
| $h_i^{l,t}$ | Hidden node feature in layer $l$ | Vector | (4) |
| $R$ | Set of edge types | Finite set | Section 3.2.2 |
| $N_r(v_i)$ | Neighbor set under edge type $r$ | Set | (4) |
| $w_r^{l}$ | Edge-type weight matrix | Matrix | (4) |
| $e_{ij}$, $a_{ij}$ | Edge weight and attention coefficient | Real | Section 3.3.1 |
| $z_t$ | GRU hidden state | Vector | (5)–(8) |
| $c_t$ | TCN output | Vector | (9) |
| $h_t^{fused}$ | Fused spatiotemporal feature | Vector | (10) |
| $\alpha$ | Attention weight | [0, 1] | (10) |
| $S(t)$ | System functionality | [0, 1] | (13), (16) |
| $SFDI(t)$ | System Function Degradation Index | [0, 1] | (14)–(18) |
| $PRL$ | Peak Resilience Loss | [0, 1] | (15) |
| $RT$ | Recovery Time | Hours ≥ 0 | (16) |
| $RA$ | Resilience Area | Real ≥ 0 | (17), (18) |
| $a_i(t)$ | Intervention indicator | {0, 1} | Section 3.4.4 |
| $r_i$ | Recovery rate | (0, 1] | Section 3.4.4 |
| $\Delta t$ | Time step size | $\Delta t = 1$ h | (18) |
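To make the notation for S(t), SFDI(t), PRL, RT, and RA more concrete, the following minimal sketch computes the three resilience metrics from an hourly system-functionality series. The definitions used here are assumptions chosen for illustration (SFDI(t) = 1 − S(t), PRL as the maximum of SFDI, RT as the span of degraded steps, and RA as a rectangle-rule sum with Δt = 1 h); they are not copied from Equations (14)–(18), and the function name `resilience_metrics` is hypothetical.

```python
from typing import Sequence, Tuple


def resilience_metrics(
    s: Sequence[float], dt_hours: float = 1.0, tol: float = 1e-6
) -> Tuple[float, float, float]:
    """Return (PRL, RT, RA) for a system-functionality series S(t) in [0, 1].

    Assumed definitions (illustrative, not the paper's equations):
      SFDI(t) = 1 - S(t)
      PRL     = max over t of SFDI(t)
      RT      = hours between the first and last degraded step (inclusive)
      RA      = sum over t of SFDI(t) * dt  (rectangle rule)
    """
    sfdi = [1.0 - value for value in s]
    prl = max(sfdi)
    ra = sum(sfdi) * dt_hours

    # Indices of time steps where the system is measurably degraded.
    degraded = [k for k, d in enumerate(sfdi) if d > tol]
    if not degraded:
        return prl, 0.0, ra

    # Count steps from the first degraded step through the last one.
    rt = (degraded[-1] + 1 - degraded[0]) * dt_hours
    return prl, rt, ra


# Hypothetical hourly series: a disturbance at t = 1 and full recovery at t = 5.
series = [1.0, 0.6, 0.4, 0.7, 0.9, 1.0]
prl, rt, ra = resilience_metrics(series)
print(f"PRL={prl:.2f}, RT={rt:.1f} h, RA={ra:.2f}")
```

If the series never returns to full functionality, this sketch reports RT up to the end of the observed window, so the horizon T should be long enough to cover recovery.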
Table 3. Recovery rates by node type and resource.
| Node Type | Response Resource | Recovery Rate $r_i$ |
| --- | --- | --- |
| Ship (on fire) | Fireboat | 0.20 |
| Container (collapse) | Clearance vehicle | 0.15 |
| Quay Crane (damaged) | Maintenance crew | 0.08 |
| Road (blocked) | Clearance vehicle | 0.40 |
| Water Area (oil spill) | Oil recovery vessel | 0.10 |
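The recovery rates in Table 3 can be read as per-step gains in a node's functional index while a response resource is assigned to it. The sketch below illustrates that reading under a loudly stated assumption: the update FI(t + 1) = min(1, FI(t) + a_i(t) · r_i) is a simple linear rule chosen for illustration, not the update defined in Section 3.4.4, and the names `recover_step` and `steps_to_full` are hypothetical.

```python
# Recovery rates r_i from Table 3, keyed by (node type, response resource).
RECOVERY_RATES = {
    ("Ship", "Fireboat"): 0.20,
    ("Container", "Clearance vehicle"): 0.15,
    ("Quay Crane", "Maintenance crew"): 0.08,
    ("Road", "Clearance vehicle"): 0.40,
    ("Water Area", "Oil recovery vessel"): 0.10,
}


def recover_step(fi: float, rate: float, intervened: bool) -> float:
    """Advance FI by one time step (assumed linear rule, capped at 1.0)."""
    gain = rate if intervened else 0.0  # a_i(t) = 1 only when a resource is assigned
    return min(1.0, fi + gain)


def steps_to_full(fi0: float, rate: float, tol: float = 1e-9) -> int:
    """Count steps of continuous intervention needed to restore FI to 1.0."""
    fi, steps = fi0, 0
    while fi < 1.0 - tol:
        fi = recover_step(fi, rate, intervened=True)
        steps += 1
    return steps


# Steps needed from a fully failed state (FI = 0) for each node type.
for (node, resource), rate in RECOVERY_RATES.items():
    print(f"{node} with {resource}: {steps_to_full(0.0, rate)} steps")
```

Under this assumed rule, a higher rate shortens recovery in proportion, which is why blocked roads (0.40) clear much faster than damaged quay cranes (0.08).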
Table 4. Comparison of resilience metrics across intervention strategies.
| Intervention Strategy | PRL | RT (Hours) | RA |
| --- | --- | --- | --- |
| No Intervention | 0.62 | 18.2 | 10.84 |
| Random Intervention | 0.61 | 14.7 | 8.92 |
| Centrality-Priority Intervention | 0.60 | 10.5 | 6.37 |
| Path-Blocking Intervention | 0.58 | 11.8 | 7.01 |
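The differences in Table 4 are easier to compare as percentage reductions relative to the no-intervention baseline. The short sketch below performs that arithmetic on the tabulated values only; the helper name `reduction_vs_baseline` is hypothetical, and the percentages are derived quantities rather than figures reported by the authors.

```python
# (PRL, RT in hours, RA) for each strategy, taken from Table 4.
TABLE_4 = {
    "No Intervention": (0.62, 18.2, 10.84),
    "Random Intervention": (0.61, 14.7, 8.92),
    "Centrality-Priority Intervention": (0.60, 10.5, 6.37),
    "Path-Blocking Intervention": (0.58, 11.8, 7.01),
}


def reduction_vs_baseline(table, baseline="No Intervention"):
    """Return the percentage reduction of each metric relative to the baseline row."""
    base = table[baseline]
    result = {}
    for name, values in table.items():
        if name == baseline:
            continue
        result[name] = tuple(100.0 * (b - v) / b for b, v in zip(base, values))
    return result


for name, (d_prl, d_rt, d_ra) in reduction_vs_baseline(TABLE_4).items():
    print(f"{name}: PRL -{d_prl:.1f}%, RT -{d_rt:.1f}%, RA -{d_ra:.1f}%")
```

On these values, the centrality-priority strategy cuts RT and RA by roughly 42% and 41%, while PRL changes by only a few percent for every strategy. This suggests the interventions mainly speed up recovery rather than limit the initial peak loss.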
Table 5. Comparison of the impact of wind speed and temperature on system resilience.
| Environmental Variable | Level | PRL | RT (Hours) | RA |
| --- | --- | --- | --- | --- |
| Wind Speed | 3 m/s | 0.59 | 11.9 | 7.93 |
| Wind Speed | 8 m/s | 0.65 | 14.1 | 9.87 |
| Wind Speed | 12 m/s | 0.71 | 16.4 | 11.02 |
| Temperature | 25 °C | 0.60 | 12.1 | 8.78 |
| Temperature | 35 °C | 0.66 | 15.3 | 11.27 |
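Table 6. Average resilience metrics under different misdeclaration types and deviation rates.
| Misdeclaration Type | Deviation Rate | PRL | RT (Hours) | RA |
| --- | --- | --- | --- | --- |
| Category Misdeclaration | 10% | 0.58 | 13.4 | 6.42 |
| Category Misdeclaration | 30% | 0.72 | 17.6 | 9.81 |
| Category Misdeclaration | 50% | 0.82 | 19.3 | 12.38 |
| Weight Misdeclaration | 10% | 0.56 | 12.8 | 5.91 |
| Weight Misdeclaration | 30% | 0.68 | 15.2 | 8.47 |
| Weight Misdeclaration | 50% | 0.75 | 18.9 | 9.96 |
| Temperature-Control Misdeclaration | 10% | 0.54 | 12.5 | 5.73 |
| Temperature-Control Misdeclaration | 30% | 0.65 | 14.8 | 7.82 |
| Temperature-Control Misdeclaration | 50% | 0.71 | 16.5 | 8.83 |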
Table 7. Average resilience metrics for different regulatory and response level combinations.
| Regulatory Intensity ρ | Response Level | PRL | RT (Hours) | RA |
| --- | --- | --- | --- | --- |
| 0.3 | Level II | 0.71 | 16.8 | 11.52 |
| 0.3 | Level III | 0.69 | 14.9 | 10.18 |
| 0.3 | Level IV | 0.67 | 12.6 | 8.87 |
| 0.5 | Level II | 0.66 | 15.3 | 10.34 |
| 0.5 | Level III | 0.64 | 13.7 | 9.02 |
| 0.5 | Level IV | 0.62 | 12.1 | 8.01 |
| 0.7 | Level II | 0.62 | 13.2 | 8.12 |
| 0.7 | Level III | 0.60 | 12.4 | 7.35 |
| 0.7 | Level IV | 0.58 | 11.4 | 7.03 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Song, B.; Weng, Y.; Xia, L. Port Resilience Assessment for Misdeclaration Induced Disasters Using a Hybrid LLM-GNN Framework. J. Mar. Sci. Eng. 2025, 13, 2280. https://doi.org/10.3390/jmse13122280
