Typhoon and Storm Surge Hazard Analysis Along the Coast of Zhejiang Province in China Using TCRM and Machine Learning

Fang, Yong; Li, Xiangyu; Sun, Yanhua; Li, Ailian; Guo, Yunxia

doi:10.3390/jmse13061017

by Yong Fang ¹, Xiangyu Li ¹, Yanhua Sun ¹, Ailian Li ² and Yunxia Guo ¹,*

¹ College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590, China

² National Marine Data and Information Service, Tianjin 300171, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(6), 1017; https://doi.org/10.3390/jmse13061017

Submission received: 22 April 2025 / Revised: 21 May 2025 / Accepted: 21 May 2025 / Published: 23 May 2025

(This article belongs to the Section Ocean and Global Climate)

Abstract

Zhejiang Province in China is one of the most typhoon-prone regions globally, making typhoon and storm surge hazard analysis critically important for disaster mitigation. This study integrates the Tropical Cyclone Risk Model (TCRM) with a machine learning-based storm surge forecasting model to analyze typhoon hazards and storm surge risks at four representative coastal sites in Zhejiang Province: Haimen, Ruian, Wenzhou, and Zhapu. Firstly, the input database of the TCRM was updated and then used to generate a 1000-year synthetic typhoon event catalog for the Northwest Pacific region. Secondly, four machine learning models, namely Long Short-Term Memory (LSTM), Back Propagation (BP), Support Vector Regression (SVR), and Random Forest (RF), were developed to forecast the storm surge component at the four sites, with sensitivity analysis conducted on the input parameters. Among the four models, RF consistently outperformed the others across all four sites. Thirdly, by integrating the storm surge forecasting model with the Yan Meng (YM) typhoon wind field model, extreme wind speed sequences and extreme surge component sequences were derived for the four coastal sites. Finally, four extreme value distribution models (empirical distribution, Weibull, Gumbel, and Generalized Pareto Distribution (GPD)) were applied to fit the extreme wind and surge sequences. Goodness-of-fit tests indicated that the GPD best captured extreme wind speeds at all four sites and extreme surge levels at Haimen, Ruian, and Wenzhou. Using the optimal distributions, extreme wind speeds and surge components were calculated for 10-, 50-, 100-, and 200-year return periods, providing actionable references for disaster risk management authorities.

Keywords:

TCRM; machine learning; typhoon; storm surge; hazard analysis

1. Introduction

The economic loss caused by the storm surge hazard is much higher than that caused by any other marine disaster in China, the loss from the severe storm surge hazard being the highest [1]. Zhejiang Province is one of the regions most severely affected by storm surge disasters in China’s coastal areas, primarily dominated by typhoon-induced storm surges. Historically, Zhejiang has experienced multiple catastrophic typhoon storm surge events, which have significantly impacted the socio-economic conditions of its coastal regions. According to statistics, from 1949 to 2020, a total of 46 typhoons made landfall in Zhejiang Province, averaging 0.6 per year; there were 136 notable typhoon storm surge processes, averaging 2 per year. Research by Sun et al. [2] and Weisse et al. [3] indicates that factors such as temperature changes and rising sea levels have led to an increasing trend in both the frequency and intensity of typhoons making landfall in China. Consequently, storm surge disasters along the Zhejiang coast are also on the rise, particularly in the southern coastal areas of Zhejiang. A statistical study of storm surge magnitude makes it possible to determine the severity, and therefore better forecast and prevent losses [4]. Accurately assessing the risk of typhoons and their associated storm surges is of great significance for disaster prevention and mitigation efforts in Zhejiang Province and the nation as a whole.

To address the limitations of limited sample sizes and data quality in typhoon hazard analysis, the international scientific community has developed a stochastic simulation framework integrating three key components. Firstly, stochastic typhoon modeling generates synthetic cyclones with plausible meteorological parameters. Secondly, a coupled storm surge numerical model simulates inundation dynamics for each synthetic event by resolving hydrodynamic interactions between wind fields, atmospheric pressure, and coastal topography. Thirdly, probabilistic frequency analysis using extreme value theory estimates storm surge heights at specified return periods. This integrated approach systematically addresses the challenges of empirical data scarcity while quantifying uncertainties in hazard assessment.

The stochastic modeling of tropical cyclones has evolved through key methodological innovations. Russell [5] pioneered the sector-annulus model for synthetic cyclone generation in the Gulf of Mexico, subsequently applying it to typhoon wind speed estimation within the basin. Building upon this foundation, Shapiro [6] enhanced the methodology by integrating a bespoke typhoon wind field model, enabling cross-regional typhoon hazard assessments with demonstrated predictive skill across diverse geographical regimes. Li and Duan [7] achieved comparable advancements in their research on coastal typhoon risks by employing a similar stochastic framework. A paradigm shift occurred in 2000 with the development of the full-track stochastic typhoon model by Vickery et al. [8]. They discretized the Atlantic basin into spatially homogeneous grids and conducted multivariate regression analyses on historical track parameters (e.g., genesis location, translation speed, central pressure) to derive empirical track algorithms for propagation and intensification. This approach has since undergone numerous refinements. Mudd et al. [9] conducted IPCC RCP8.5 scenario-driven hurricane simulations. Rosowsky et al. [10] applied Bayesian calibration to empirical track models to enhance the representation of intensity decay after landfall.

Storm surge forecasting has been predominantly conducted through three methodological paradigms: traditional empirical approaches, numerical modeling techniques, and contemporary artificial intelligence-driven solutions. Traditional empirical methods demonstrate inherent limitations due to their susceptibility to subjective interpretation and stochastic error propagation. Numerical modeling approaches, though physically rigorous, impose prohibitive computational demands that constrain operational implementation [11]. To address these critical challenges, neural network architectures have emerged as transformative tools for typhoon-induced storm surge prediction, leveraging their exceptional capacity to model complex nonlinear relationships between meteorological forcing and hydrodynamic responses. This paradigm shift has achieved dual objectives: enhanced prediction accuracy and significant computational efficiency gains. Lee T L et al. [12] applied the artificial neural network to predict short-term typhoon surges in order to overcome the problem of exclusivity and nonlinear relationships. Based on Random Forest (RF), Wang [13] applied an assessment model to evaluate regional flood hazard risk. Zhu et al. [14] pioneered a machine learning framework through Random Forest regression analysis of 98 historical typhoon events (1989–2018) across China’s southeastern coastal regions (Guangdong, Fujian, Zhejiang). Their data-driven model demonstrated robust predictive capability for maximum surge elevation estimation, with validation errors within 15% of field observations. In parallel innovation, Miao et al. [15] developed a Long Short-Term Memory (LSTM) neural network specifically optimized for Xiamen Harbor surge dynamics. 
Systematic comparison with conventional approaches (BP neural networks, support vector machine, and linear regression) revealed the LSTM architecture’s superior performance in short-term forecasting (RMSE reduction > 32%), establishing a new benchmark for operational early warning systems. Tian et al. [16] employed LSTM, Convolutional Neural Networks (CNNs), and Informer Deep Learning (DL) models for forecasting storm surges over the next 1 h, 3 h, 6 h, 12 h, and 18 h. Sun et al. [17] used the Res-U-Net structure neural network to predict the storm surge process of typhoons in the Pearl River Estuary and achieved good prediction results.

Substantial scholarly efforts have been devoted to storm surge hazard analysis. Jia et al. [18] investigated the development of a kriging surrogate model for storm surge prediction utilizing an existing database of high-fidelity, synthetic storms. Zhang [19] developed a refined multi-scale risk identification framework for typhoon storm surge disasters in Wenzhou by integrating ENVI (5.5)-based remote sensing interpretation, GIS spatial analytics, SPSS (IBM SPSS Statistics 25) statistical modeling, and BP neural networks. Wang et al. [20] and Li et al. [21] evaluated the risk of storm surge disasters using the ADCIRC-SWAN (Advanced Circulation Model–Simulating Waves Nearshore) coupled model. Li et al. [22] established a life loss risk assessment model for coastal city storm surge composite floods based on hydrodynamic models and Copula functions. Guo et al. [23] collected long-term tidal data from 13 representative hydrological stations along Zhejiang’s coastal zone to systematically analyze the spatial heterogeneity characteristics of storm surge hazards in Zhejiang Province. Yu et al. [24] employed a two-dimensional Copula joint probability function to construct a vine-structured probability distribution model for compound disaster encounter combinations. Scholars such as Rizzi et al. [25] and Wang et al. [26] constructed a risk assessment index system from different perspectives and evaluated the risk of storm surge disasters.

In recent years, AI forecasting models have shown significant application potential in the field of rapid storm surge forecasting due to their efficient modeling capabilities and real-time computing advantages. In terms of data-driven methods, Xie et al. [27] first applied ConvLSTM to storm surge inundation forecasting in the Pearl River Estuary, achieving autoregressive prediction of sea surface height fields through historical typhoon datasets, reducing RMSE by 23% compared with traditional methods. Qin et al. [28] developed an ANN-MIMO model integrating multivariate inputs such as wind speed and atmospheric pressure, achieving a 6 h forecast error < 15 cm along China’s southeastern coast. For multi-model coupling, Su et al. [17] proposed a regional hierarchical forecasting framework: first generating high-resolution simulation data via ADCIRC-SWAN, then training neural networks for rapid inference, improving computational efficiency by 40-fold. Gharehtoragh et al. [29] developed a climate change proxy model incorporating terrain evolution data, maintaining 85% confidence in century-scale storm surge predictions. Regarding hybrid methods with enhanced physical constraints, Zhu et al. [30] demonstrated in Bohai Sea case studies that Physics-Informed Neural Networks (PINNs) embedded with Navier–Stokes equations reduced extreme storm surge forecast bias by 34%.

The prediction of extreme wind speeds or water-level sequences predominantly relies on the fitting of extreme value distributions. In typhoon hazard assessment, the most widely adopted extreme value distributions include the Gumbel distribution (Type I), Fréchet distribution (Type II), and Weibull distribution (Type III). Guo et al. [31] conducted a systematic comparative analysis of these parametric models in typhoon-induced storm surge studies along China’s southeastern coast. Notably, the empirical distribution function constructed solely from raw observational data without prior assumptions about wind speed distribution tail behavior demonstrated superior applicability for large-sample hydrological scenarios. Early research paradigms predominantly favored the Fréchet distribution for modeling typhoon maximum wind speed sequences. However, seminal work by Simiu and Filliben [32] challenged this convention, statistically verifying that the Gumbel distribution provides a more robust fit for extreme wind speed characterization. Subsequent advancements by Simiu and Heckert [33] revealed that under the Peak Over Threshold (POT) framework, the Generalized Pareto Distribution (GPD) outperforms traditional extreme value distributions in extreme wind speed estimation, particularly for heavy-tailed datasets. Georgios et al. [34] provided a comparison between the generalized extreme value (GEV) distribution and the metastatistical extreme value distribution (MEVD) on their ability to predict “unseen” upper-tail quantiles of storm surge along the US coastline and showed that predictions from the MEVD are more robust with less variability in error. While the empirical distribution function maintains broad applicability across diverse scenarios, the selection of parametric extreme value distribution models remains critical, as it directly governs the predictive accuracy of extreme wind speed quantiles in typhoon risk modeling.

Previous studies reveal that most researchers have treated typhoon and storm surge hazard analyses as separate processes. Lin [35] employed conventional numerical models for typhoon-induced storm surge forecasting. Based on our numerical simulation experience, simulating the storm surge process for a single typhoon using the ADCIRC model takes approximately 30 min on 64 CPU cores. Therefore, numerical simulations for tens of thousands of synthetic typhoons would require prohibitive computational resources. In contrast, Yao [36] utilized the Tropical Cyclone Risk Model (TCRM) to generate extensive synthetic typhoon events, conducted numerical simulations for a limited number of typhoons to obtain storm surge heights, and used these simulation results as training samples to develop neural network-based surge forecasting systems. Nevertheless, Yao [36] did not conduct probabilistic hazard analyses for typhoons or storm surges. Moreover, as numerical model outputs inherently deviate from in situ water level observations, neural network models trained on such simulation-derived data risk amplifying systematic prediction biases. Building upon this methodology, this study selects Zhejiang Province, depicted in Figure 1, as the target region, employs TCRM to generate synthetic typhoon events, and uses historical observed water level data from the Collection of Storm Surge Disasters Historical Data in China as training samples for storm surge forecasting. Four machine learning models, namely LSTM, BP, Support Vector Regression (SVR), and Random Forest (RF), are evaluated through cross-validation to identify the optimal predictive model. Subsequently, the extreme wind speed sequence derived from the YM wind field model and the extreme surge height sequence predicted by the RF model are fitted using the empirical, Gumbel, Weibull, and GPD extreme value distributions.
The optimal distribution, selected through Kolmogorov–Smirnov goodness-of-fit tests, is applied to estimate extreme wind speeds and surge heights across 10- to 200-year return periods at four representative coastal stations in Zhejiang Province. These findings provide statistically robust references for disaster management authorities in resilience planning and infrastructure protection decision-making.

2. Data and Methods

2.1. Data Source

The data sources for this study primarily comprise three components: the CMA (China Meteorological Administration) Best Track Dataset, observed typhoon track data retrieved from the Wenzhou Typhoon Network, and measured storm surge height data from the Collection of Storm Surge Disasters Historical Data in China. Detailed data specifications are presented in Table 1.

The CMA Best Track Dataset is sourced from the Tropical Cyclone Data Center of China Meteorological Administration, which systematically records 6-hourly positions and intensities of tropical cyclones in the western North Pacific basin (including the South China Sea, north of the equator, and west of 180° E) from 1949 to 2023. The parameters of the CMA Best Track Dataset include timestamp (YYYYMMDDHH), intensity indicator (I), latitude (LAT), longitude (LONG), minimum central pressure (PRES), 2 min averaged maximum sustained wind speed near the center (WND), and 2 min averaged wind speed (OWD) as shown in Table 1. Among these parameters, the typhoon serial number, timestamp, latitude, longitude, and minimum central pressure are required as input for TCRM. The synthetic typhoons generated by TCRM are subsequently used to predict storm surge heights and typhoon wind speeds.

Since the CMA Best Track Dataset provides typhoon track data at 6 h intervals, while the storm surge neural network forecasting model requires hourly resolution typhoon track data and corresponding surge height data, this study acquired hourly observational typhoon track data through web scraping from the Wenzhou Typhoon Website. The extracted typhoon parameters include timestamp, longitude, latitude, intensity, 2 min averaged maximum sustained wind speed near the center, translation speed, and central pressure, as shown in Table 1.

To improve the accuracy of the storm surge neural network forecasting model, training sample data integrates observational records from the Collection of Storm Surge Disasters Historical Data in China, which systematically archives 221 storm surge events affecting China’s coastal regions between 1949 and 2009. These records contain detailed textual descriptions and graphical illustrations of disaster impacts, surge heights, and instances of high tide levels exceeding local warning thresholds, as shown in Table 1. Using WebPlotDigitizer, an image data extraction tool, observational surge height data were digitized from historical charts. Both the digitized surge height data and the high-resolution hourly typhoon track data were employed to train the machine learning models for storm surge forecasting.

2.2. Methods

2.2.1. TCRM Typhoon Virtual Model

TCRM (Tropical Cyclone Risk Model) is an open-source computational tool developed by Geoscience Australia (GA), primarily used for synthetic typhoon generation and typhoon-induced wind speed hazard assessment. The model simulates typhoon behavior and impacts through statistical and parametric methods. Its statistical engine can generate millennial-scale synthetic typhoon datasets based on input historical typhoon track data, preserving statistical consistency with historical records.

TCRM primarily comprises five core modules: the Data Processing Module, Statistical Analysis Module, Track Generation Module, Wind Field Model Module, and Hazard Risk Assessment Module. The Data Processing Module reads and processes input historical typhoon track databases, precisely extracting typhoon intensity and positional parameters, including translation velocity, heading, and genesis location. The Statistical Analysis Module utilizes files generated by the preceding module to evaluate statistical relationships among parameters while calculating probability density functions (PDFs) for initial parameters (initial translation velocity, heading, and intensity) and genesis probabilities. The Track Generation Module initiates new typhoon events by sampling from genesis locations and initial parameter distributions, then progressively advances these events temporally by leveraging typhoon autoregressive properties, thereby generating thousands of stochastic typhoon events sharing statistical attributes with the input dataset. The Wind Field Model Module calculates maximum wind speeds around each synthetic typhoon event, offering multiple boundary layer model options. The Hazard Risk Assessment Module estimates return period wind speeds using computed wind fields through three methodologies: empirical distribution fitting, Generalized Pareto Distribution (GPD) fitting, and Generalized Extreme Value (GEV) distribution fitting.
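The step-by-step advance performed by the Track Generation Module can be illustrated with a toy example. The sketch below is not TCRM code. It assumes a lag-one autoregressive update for translation speed and heading, a flat-earth position update, and hypothetical climatological means and spreads (`20.0` km/h, `300.0` degrees, `alpha`, and the noise scales are illustrative values, not parameters from the paper).

```python
import math
import random

KM_PER_DEG = 111.0  # rough length of one degree of latitude (assumption)

def ar1_step(prev, mean, alpha, sigma, rng):
    """Lag-one autoregressive update: relax toward the mean, add Gaussian noise."""
    noise = rng.gauss(0.0, sigma * math.sqrt(1.0 - alpha ** 2))
    return mean + alpha * (prev - mean) + noise

def generate_track(lon, lat, speed_kmh, bearing_deg, n_steps, rng,
                   dt_hours=1.0, alpha=0.9):
    """Advance one synthetic event in time; returns a list of (lon, lat)."""
    track = [(lon, lat)]
    for _ in range(n_steps):
        # Hypothetical means and spreads, for illustration only.
        speed_kmh = max(0.0, ar1_step(speed_kmh, 20.0, alpha, 5.0, rng))
        bearing_deg = ar1_step(bearing_deg, 300.0, alpha, 20.0, rng)
        theta = math.radians(bearing_deg)  # bearing measured clockwise from north
        dist_km = speed_kmh * dt_hours
        lat += dist_km * math.cos(theta) / KM_PER_DEG
        lon += dist_km * math.sin(theta) / (KM_PER_DEG * math.cos(math.radians(lat)))
        track.append((lon, lat))
    return track

rng = random.Random(42)
track = generate_track(lon=130.0, lat=18.0, speed_kmh=20.0, bearing_deg=300.0,
                       n_steps=72, rng=rng)
print(len(track), track[-1])
```

Repeating this from many sampled genesis points is what produces a catalog of stochastic events; a real implementation would also evolve central pressure and condition the statistics on location.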

2.2.2. Machine Learning Models

The BP neural network (back propagation neural network, BPNN) is a multi-layer feedforward network trained using error backpropagation, first proposed by Rumelhart et al. [37] in 1986. Its operational principle relies on the gradient descent method, employing gradient search techniques to minimize the mean squared error between the network’s actual output and expected output. The BP neural network architecture consists of an input layer, one or more hidden layers, and an output layer, with signal propagation occurring in two distinct phases: forward propagation and backward adjustment. During forward propagation, input signals pass through the hidden layers and undergo nonlinear transformations before reaching the output nodes, ultimately generating the network’s predicted outputs. If discrepancies exist between actual and expected outputs, the process transitions to error backpropagation. In backward propagation adjustment, output errors are redistributed hierarchically from the output layer through the hidden layer to the input layer, allocating error contributions to all network units. These error signals subsequently form the basis for optimizing inter-unit connection weights. The structure diagram of the BP neural network is illustrated in Figure 2.
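The two phases described above, forward propagation and backward adjustment, can be sketched in a few lines. This is a minimal illustration and not the network used in this study: the single hidden layer, the tanh activation, the learning rate, and the toy target $y = x^{2}$ are all assumptions chosen for brevity.

```python
import math
import random

def train_bp(samples, n_hidden=4, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer tanh network by per-sample gradient descent.

    samples: list of (inputs, target) pairs. Returns (predict, mse_history).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, target in samples:
            hidden, output = forward(x)        # forward propagation
            err = output - target              # gradient of 0.5 * err**2 w.r.t. output
            squared_error += err * err
            for j in range(n_hidden):          # backward adjustment
                # Error share of hidden unit j, using w2[j] before it is updated.
                grad_hidden = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_hidden * x[i]
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
        history.append(squared_error / len(samples))

    return (lambda x: forward(x)[1]), history

# Toy regression problem: learn y = x**2 on [-1, 1].
samples = [([k / 10.0], (k / 10.0) ** 2) for k in range(-10, 11)]
predict, history = train_bp(samples)
print(history[0], history[-1], predict([0.5]))
```

The mean squared error recorded per epoch is the quantity that gradient descent drives down; in practice a library implementation with mini-batches and input scaling would be preferred.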

Long Short-Term Memory (LSTM) networks, as a variant of Recurrent Neural Networks (RNNs), were developed to address the vanishing and exploding gradient challenges inherent in traditional RNNs, proposed by Hochreiter and Schmidhuber [38]. Their core architecture comprises memory blocks containing self-connected memory cells, which integrate specialized gated multiplicative units and preserve temporal network states to regulate information flow effectively. Subsequent optimizations to LSTM enabled processing of continuous input streams segmented into subsequences rather than unsegmented inputs. This enhancement allows resetting cell states at subsequence boundaries while incorporating forget gates into memory blocks. As illustrated in Figure 3, the LSTM network structure employs forget gates that selectively discard or reset cellular memory by scaling internal cell states before feeding them back through recurrent connections as cell inputs. For detailed information on LSTM networks, refer to Hochreiter and Schmidhuber [38].

The Statistical Learning Theory (SLT) proposed by Vapnik et al. [39] studies the laws of statistical learning for small sample sizes. Based on this theory, the Support Vector Machine (SVM) learning method has received extensive attention. SVM was originally developed for solving linear classification problems and extends to regression analysis as Support Vector Regression (SVR), which applies the same core principles to predict continuous output variables. The SVR model architecture is graphically represented in Figure 4. For detailed information on SVM and SVR models, refer to reference [39]. This study employs the Radial Basis Function (RBF) as the kernel function, expressed by Equation (1),

$$k(x, x') = \exp\left(-g \|x - x'\|^{2}\right) \tag{1}$$

In Equation (1), $k$ denotes the kernel function, $g$ represents the parameter to be optimized, $x$ and $x'$ denote the input vectors, and $\|x - x'\|$ denotes the Euclidean distance between the vectors $x$ and $x'$. The objective function requiring optimization is shown in Equation (2),

$$\min_{w, b} \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{N} (\xi_{i} + \xi_{i}^{*}) \tag{2}$$

In Equation (2), $\|w\|^{2}$ denotes the regularization term, $C$ represents the penalty factor, and $\xi_{i}$ and $\xi_{i}^{*}$ indicate the slack variables, which reflect the degree to which each sample violates the constraint conditions.
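As a small illustration of the RBF kernel in Equation (1), the sketch below evaluates $k(x, x')$ for plain Python lists and builds the kernel (Gram) matrix that an SVR solver would work with. The feature vectors and the value of `g` are hypothetical; the function names are introduced here for illustration only.

```python
import math

def rbf_kernel(x, x_prime, g):
    """k(x, x') = exp(-g * ||x - x'||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-g * squared_distance)

def gram_matrix(vectors, g):
    """Pairwise kernel values for a list of input vectors."""
    return [[rbf_kernel(u, v, g) for v in vectors] for u in vectors]

# Hypothetical, already-scaled input vectors.
vectors = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1]]
for row in gram_matrix(vectors, g=0.5):
    print([round(value, 4) for value in row])
```

Larger values of `g` make the kernel decay faster with distance, so each training sample influences a smaller neighbourhood; this is why `g` is tuned together with the penalty factor $C$.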

In 2001, Breiman [40] drew inspiration from the random decision forests method proposed by Ho of Bell Laboratories [41] and combined classification trees into Random Forests. Random Forest (RF) is an ensemble method composed of multiple tree-type classifiers, where each base classifier $\{h(x, \theta_{k}), k = 1, \dots\}$ is an unpruned classification and regression tree constructed using the CART (Classification and Regression Tree) algorithm. The input vector is denoted as $x$, while $\{\theta_{k}\}$ represents a set of independent and identically distributed random variables that govern the growth process of individual trees. The final RF output is obtained through simple majority voting of all tree predictions (for classification tasks) or simple averaging of tree outputs (for regression problems), with its flowchart illustrated in Figure 5. For detailed information on the RF model, refer to reference [40].
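The regression form of this ensemble can be sketched with the standard library alone. The code below is a simplified illustration, not the implementation used in this study: each tree is grown without pruning on a bootstrap sample, splits are chosen by minimum squared error over a random feature subset, and the forest output is the simple average of the tree outputs. The step-function data set and all function names are assumptions made for the example.

```python
import random

def _sse(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(rows, rng, n_features, min_leaf=1):
    """Grow an unpruned regression tree on rows = [(x_vector, y), ...]."""
    ys = [y for _, y in rows]
    leaf = {"value": sum(ys) / len(ys)}
    if len(rows) < 2 * min_leaf or max(ys) == min(ys):
        return leaf
    best = None
    for f in rng.sample(range(len(rows[0][0])), n_features):  # random feature subset
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [y for x, y in rows if x[f] <= thr]
            right = [y for x, y in rows if x[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = _sse(left) + _sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr)
    if best is None:
        return leaf
    _, f, thr = best
    left_rows = [r for r in rows if r[0][f] <= thr]
    right_rows = [r for r in rows if r[0][f] > thr]
    return {"feature": f, "threshold": thr,
            "left": build_tree(left_rows, rng, n_features, min_leaf),
            "right": build_tree(right_rows, rng, n_features, min_leaf)}

def predict_tree(tree, x):
    while "value" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]

def fit_forest(rows, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(rows) for _ in rows]  # sample with replacement
        forest.append(build_tree(bootstrap, rng, n_features))
    return forest

def predict_forest(forest, x):
    """Regression output: simple average of the individual tree outputs."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

# Toy data: a step function of a single feature.
rows = [([i / 40.0], 0.0 if i < 20 else 1.0) for i in range(40)]
forest = fit_forest(rows)
print(predict_forest(forest, [0.1]), predict_forest(forest, [0.9]))
```

For classification the last function would return the majority class instead of the mean.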

2.2.3. Wind Field Model

In typhoon and storm surge hazard assessment, the accurate characterization of typhoon wind fields plays a pivotal role in ensuring the credibility of hazard evaluation results. Typhoon wind field simulation primarily involves two approaches: traditional empirical modeling and numerical modeling. Empirical models provide simplicity and computational efficiency but exhibit relatively large errors, whereas numerical models achieve higher simulation accuracy at the expense of substantial computational resources. This study employs the Yan Meng (YM) wind field model [42], which derives complete analytical solutions through dynamical theory and parametric assumptions, forming a semi-empirical and semi-theoretical framework. The YM model balances accuracy requirements for typhoon wind field simulation with computational efficiency, making it particularly suitable for large-scale synthetic typhoon wind field simulations. This model has been widely adopted in studies by Zhao et al. [43], Xie [44,45], and Guo et al. [46]. Wind speeds at specified altitudes are calculated by combining gradient wind speeds with frictional wind components at corresponding heights, with its mathematical formulation provided in Equation (3),

$$\frac{\partial V}{\partial t} + V \cdot \nabla V = -\frac{1}{\rho} \nabla p - f \vec{k} \times V + F \tag{3}$$

In Equation (3), $V$ represents typhoon wind speed; $\rho$ denotes air density; $\nabla p$ is the pressure gradient; $\vec{k}$ indicates the vertical unit vector; $f$ corresponds to the Coriolis parameter; and $F$ signifies the boundary layer friction force. This model is grounded in the Holland pressure field model [47], with its specific mathematical formulation provided in Equation (4),

$$p = p_{0} + \Delta p \exp\left[-(R_{\max}/r)^{B}\right] \tag{4}$$

In Equation (4), $p$ represents the sea-level pressure at a radial distance $r$ from the typhoon center; $p_{0}$ is the central pressure; $\Delta p$ represents the pressure deficit between the central pressure and the normal ambient pressure; $R_{\max}$ indicates the radius of maximum winds; and $B$ corresponds to the shape parameter of the Holland pressure profile (ranging 0.5–2.5).
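A short numerical sketch of the Holland pressure profile may help. It reads Equation (4) as $p = p_{0} + \Delta p \exp[-(R_{\max}/r)^{B}]$, with $p_{0}$ the central pressure and $\Delta p$ the pressure deficit. The function name, the units (hPa and km), and the sample values are assumptions for illustration.

```python
import math

def holland_pressure(r_km, p_central_hpa, delta_p_hpa, r_max_km, b):
    """Sea-level pressure at radius r from the storm center (Holland profile)."""
    if r_km <= 0.0:
        return p_central_hpa  # limit of the profile as r -> 0
    return p_central_hpa + delta_p_hpa * math.exp(-(r_max_km / r_km) ** b)

# Hypothetical storm: 950 hPa center, 60 hPa deficit, 40 km radius of maximum winds.
for r in (0.0, 20.0, 40.0, 100.0, 400.0):
    print(r, round(holland_pressure(r, 950.0, 60.0, 40.0, 1.5), 2))
```

At $r = R_{\max}$ the pressure has recovered a fraction $e^{-1}$ of the deficit regardless of $B$; larger $B$ concentrates the pressure gradient, and hence the strongest winds, closer to $R_{\max}$.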

The YM wind field model decomposes the wind speed $V$ into two components: the gradient wind speed $V_{g}$ and the surface frictional wind speed $V'$, i.e., $V = V_{g} + V'$. Above the typhoon boundary layer, friction can be neglected. Within the boundary layer, the radial pressure gradient can be approximately considered not to change with height. Therefore, Equation (3) can be decomposed into two parts, one for the gradient layer and one for the boundary layer, shown in Equations (5) and (6), respectively:

$$\frac{\partial V_{g}}{\partial t} + V_{g} \cdot \nabla V_{g} = -\frac{1}{\rho} \nabla p - f \vec{k} \times V_{g} \tag{5}$$

$$\frac{\partial V'}{\partial t} + V' \cdot \nabla V' + V' \cdot \nabla V_{g} + V_{g} \cdot \nabla V' = -f \vec{k} \times V' + F \tag{6}$$

To solve Equations (5) and (6), the unsteady terms must be treated. In the free layer, the gradient wind field $V_{g}$ is assumed to translate with the moving velocity $c$ of the typhoon, i.e., $\frac{\partial V_{g}}{\partial t} = -c \cdot \nabla V_{g}$. In the boundary layer, the unsteady term of the frictional component is neglected, i.e., $\frac{\partial V'}{\partial t} = 0$.

Therefore, Equations (5) and (6) can be finally simplified as

$$(V_{g} - c) \cdot \nabla V_{g} = -\frac{1}{\rho} \nabla p - f \vec{k} \times V_{g} \tag{7}$$

$$V' \cdot \nabla V' + V' \cdot \nabla V_{g} + V_{g} \cdot \nabla V' = -f \vec{k} \times V' + F \tag{8}$$

The boundary conditions adopted by the YM model for the upper atmosphere and the near-surface layer of the typhoon are Equations (9) and (10), respectively:

$$V' \big|_{z' \to \infty} = 0 \tag{9}$$

$$\rho k_{m} \frac{\partial V'}{\partial z} \bigg|_{z' = 0} = \rho C_{d} |V_{s}| V_{s} \tag{10}$$

In Equation (10), $z$ represents the vertical coordinate, with $z = 0$ set at the Earth's surface, and $V_{s}$ is the surface wind speed. Using the logarithmic wind profile near the ground, the drag coefficient $C_{d}$ has the following relationship with the surface roughness length $z_{0}$:

$$C_{d} = \frac{\kappa^{2}}{\{\ln[(z_{10} + h - d)/z_{0}]\}^{2}} \tag{11}$$

In Equation (11), $\kappa$ is the Karman constant, taken as 0.4; $h$ is the average height of the roughness elements, $h = A z_{0}^{0.86}$, where $A = 11.4$; $d$ is the zero-plane displacement, $d = 0.75h$; and $z_{10}$ is the height 10 m above the average roughness-element height $h$.
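The dependence of the drag coefficient on surface roughness in Equation (11) is easy to evaluate directly. The sketch below assumes $z_{10} = 10$ m and uses the constants stated in the text ($\kappa = 0.4$, $A = 11.4$, $d = 0.75h$); the function name and the sample roughness lengths are illustrative choices.

```python
import math

KAPPA = 0.4   # Karman constant
A = 11.4      # coefficient in h = A * z0**0.86

def drag_coefficient(z0_m, z10_m=10.0):
    """C_d from the logarithmic wind profile for roughness length z0 (metres)."""
    h = A * z0_m ** 0.86     # average height of the roughness elements
    d = 0.75 * h             # zero-plane displacement
    return KAPPA ** 2 / math.log((z10_m + h - d) / z0_m) ** 2

# Illustrative roughness lengths, from smooth to rough surfaces.
for z0 in (0.001, 0.01, 0.1, 1.0):
    print(z0, round(drag_coefficient(z0), 5))
```

The coefficient grows with roughness length, which is how rougher terrain slows the near-surface wind in the boundary-layer solution.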

The YM wind field model is grounded in perturbation balance equations and boundary-layer-friction-modified pressure gradient balance equations, incorporating an equivalent roughness length to account for complex topographic factors. This framework effectively characterizes force interactions within typhoon wind fields and enables analytical solutions of gradient wind equations with computational tractability [44,48]. For technical implementation specifics, refer to Meng et al. [42].

2.2.4. Extreme Value Distributions

Commonly utilized extreme value distributions for sequence fitting include empirical distribution, Weibull distribution, Gumbel distribution, and Generalized Pareto Distribution (GPD). The hazard risk module of the TCRM model similarly incorporates these extreme value distributions to evaluate typhoon wind speed hazards. Through the application of extreme value theory, this approach facilitates the estimation of extreme wind speeds and surge heights corresponding to distinct return periods, thereby generating foundational data support for typhoon-induced storm surge hazard and risk analysis.

Assume that the obtained extreme wind speed samples are denoted as $v_{i}$ $(i = 1, 2, \dots, m)$. The nonexceedance probability, i.e., the probability that the maximum typhoon-induced wind speed $V$ stays below a threshold $v_{i}$ within a time period $t$, is defined as follows:

$$P(V < v_{i}, t) = \sum_{n=0}^{\infty} P(V < v_{i} \mid n)\, p(n, t) = e^{-\lambda t (1 - F_{v_{i}})} \tag{12}$$

In Equation (12), $P(V < v_{i} \mid n)$ represents the probability that the maximum wind speed $V$ among $n$ typhoons does not exceed the threshold $v_{i}$; $F_{v_{i}}$ denotes the probability that the maximum wind speed of a typhoon is less than $v_{i}$, which corresponds to the extreme value distribution model; $p(n, t)$ signifies the probability that a specific location is impacted by $n$ typhoons within $t$ years; and $\lambda$ represents the parameter of the Poisson distribution. The closed form follows when typhoon occurrences obey a Poisson process, $p(n, t) = (\lambda t)^{n} e^{-\lambda t} / n!$, and the $n$ events are independent, so that $P(V < v_{i} \mid n) = F_{v_{i}}^{n}$. When $t = 1$, the annual nonexceedance probability can be derived as Equation (13).

$$P(V < v_{i}, 1) = e^{-\lambda (1 - F_{v_{i}})} \tag{13}$$

Further derivation yields Equation (14),

$$1 - \frac{1}{T} = P(V < v_{i}, 1) = e^{-\lambda (1 - F_{v_{i}})} \tag{14}$$

In Equation (14), T denotes the return period of the extreme wind speed v_{i}. By applying the empirical distribution function to the simulated extreme wind speed sequence, sorted in ascending order, the probability that any typhoon wind speed V is less than v_{i} can be obtained as

F_{v_{i}} = i / (m + 1)

(15)

From Equation (15), the wind speed expression for return period T derived from the empirical distribution is given in Equation (16),

v_{i}, \quad i = (m + 1) [1 + \frac{\ln (1 - 1 / T)}{λ}]

(16)

In Equation (16), m represents the sample size of the extreme wind speed sequence, i denotes the rank of the extreme wind speed corresponding to return period T, and λ indicates the annual typhoon occurrence rate.
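As an illustration of Equations (14)–(16), the following minimal Python sketch inverts Equation (14) for F_{v_{i}}, converts it to a rank with Equation (15), and reads the wind speed from the sorted sample. The function name and the linear interpolation between neighbouring ranks are assumptions made here for illustration; the text does not say how a non-integer rank is handled.

```python
import math


def empirical_return_level(sorted_speeds, return_period, annual_rate):
    """Wind speed for return period T from the empirical distribution.

    sorted_speeds: extreme wind speeds in ascending order (v_1 <= ... <= v_m)
    return_period: T, in years (must be greater than 1)
    annual_rate:   lambda, mean number of typhoons per year
    """
    m = len(sorted_speeds)
    # Invert Equation (14): F = 1 + ln(1 - 1/T) / lambda
    f_target = 1.0 + math.log(1.0 - 1.0 / return_period) / annual_rate
    if f_target <= 0.0:
        raise ValueError("T is too short for this occurrence rate")
    # Equation (16): (generally fractional) rank i = (m + 1) * F
    rank = (m + 1) * f_target
    # Assumption: clamp to [1, m] and interpolate between neighbouring ranks
    rank = min(max(rank, 1.0), float(m))
    lower = int(math.floor(rank))
    upper = min(lower + 1, m)
    weight = rank - lower
    return (1.0 - weight) * sorted_speeds[lower - 1] + weight * sorted_speeds[upper - 1]
```

For example, with a hypothetical sample of 99 sorted values and λ = 1, a 10-year return period maps to a rank of roughly 89.5, i.e., a value between the 89th and 90th order statistics.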

The probability density function expression of the Weibull distribution is given in Equation (17),

f (x, β, η, γ) = (\frac{β}{η}) (\frac{x - γ}{η})^{β - 1} \exp [- (\frac{x - γ}{η})^{β}], x, β, η, γ > 0

(17)

The calculation expression of wind speed for return period T is then provided in Equation (18),

v_{i} = γ + η {\{- \ln [- \ln (1 - 1 / T) / λ]\}}^{1 / β}

(18)

In Equations (17) and (18), γ represents the location parameter, β denotes the shape parameter, and η indicates the scale parameter.

The probability density function of the Gumbel distribution is provided in Equation (19),

f (x) = α \cdot \exp \{- \exp [- α (x - μ)]\} \cdot \exp [- α (x - μ)]

(19)

The calculation expression of wind speed for return period T is then provided in Equation (20),

v_{i} = μ - \ln \{- \ln [1 + \ln (1 - 1 / T) / λ]\} / α

(20)

In Equations (19) and (20), μ represents the location parameter, α denotes the scale parameter, and λ indicates the annual typhoon occurrence rate.
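The Weibull and Gumbel return-level expressions in Equations (18) and (20) can be checked numerically with a short sketch such as the one below. It evaluates both closed forms and provides the corresponding distribution functions so that each result can be substituted back into Equation (14). All function names and any parameter values used with them are illustrative assumptions, not fitted values from this study.

```python
import math


def weibull_return_level(T, lam, gamma, eta, beta):
    """Equation (18): gamma + eta * {-ln[-ln(1 - 1/T) / lam]}^(1/beta)."""
    inner = -math.log(1.0 - 1.0 / T) / lam  # equals 1 - F from Equation (14)
    if not 0.0 < inner < 1.0:
        raise ValueError("T is too short for this occurrence rate")
    return gamma + eta * (-math.log(inner)) ** (1.0 / beta)


def gumbel_return_level(T, lam, mu, alpha):
    """Equation (20): mu - ln{-ln[1 + ln(1 - 1/T) / lam]} / alpha."""
    f = 1.0 + math.log(1.0 - 1.0 / T) / lam  # F from Equation (14)
    if not 0.0 < f < 1.0:
        raise ValueError("T is too short for this occurrence rate")
    return mu - math.log(-math.log(f)) / alpha


def weibull_cdf(x, gamma, eta, beta):
    """Distribution function belonging to the density in Equation (17)."""
    return 1.0 - math.exp(-(((x - gamma) / eta) ** beta))


def gumbel_cdf(x, mu, alpha):
    """Distribution function belonging to the density in Equation (19)."""
    return math.exp(-math.exp(-alpha * (x - mu)))


def annual_nonexceedance(f, lam):
    """Equation (13): exp(-lam * (1 - F))."""
    return math.exp(-lam * (1.0 - f))
```

Passing a computed return level through the matching distribution function and then through `annual_nonexceedance` should reproduce 1 - 1/T, which is a convenient consistency check on fitted parameters.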

The cumulative distribution function of the GPD is provided in Equation (21),

F (x) = 1 - {(1 + c \frac{x - u}{b})}^{- \frac{1}{c}}

(21)

The wind speed calculation equation for return period T is provided in Equation (22).

v_{i} = u - b \{1 - {[η (u) T]}^{c}\} / c

(22)

In Equations (21) and (22), u represents the threshold, b denotes the scale parameter, c indicates the shape parameter, and η (u) represents the probability that x exceeds the threshold u.
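A minimal sketch of Equations (21) and (22) is given below, with Equation (21) treated as the distribution function of exceedances above the threshold u. The function names and the numbers in the usage note are hypothetical, and the sketch covers only the case c ≠ 0.

```python
def gpd_cdf(x, u, b, c):
    """Equation (21): 1 - (1 + c * (x - u) / b)^(-1/c), for x >= u and c != 0."""
    return 1.0 - (1.0 + c * (x - u) / b) ** (-1.0 / c)


def gpd_return_level(T, u, b, c, exceed_prob):
    """Equation (22): u - b * {1 - [eta(u) * T]^c} / c.

    exceed_prob is eta(u), the probability that x exceeds the threshold u.
    """
    return u - b * (1.0 - (exceed_prob * T) ** c) / c
```

With illustrative values u = 80 cm, b = 30 cm, c = 0.1, and η (u) = 0.2, `gpd_return_level(100, 80.0, 30.0, 0.1, 0.2)` is about 184.8 cm, and η (u) times the exceedance probability at that level returns 1/T.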

3. Result Analysis

3.1. Construction of Virtual Typhoons

The TCRM model’s original input dataset originates from historical typhoon track records in the International Best Track Archive for Climate Stewardship (IBTrACS). TCRM generates statistically representative synthetic typhoon events across intensity categories by systematically expanding the sample library through stochastic resampling. While IBTrACS serves as TCRM’s foundational dataset, we identified two critical limitations: temporal obsolescence, as the IBTrACS version used by TCRM is limited to global typhoon track data spanning 1848–2009, and insufficient geographic resolution for localized applications. Given IBTrACS’s global coverage, its granularity proves suboptimal for regional-scale analyses. Our study focuses on typhoon storm surge-prone coastal zones in Zhejiang Province, necessitating higher spatial precision. To mitigate these limitations, we upgraded the typhoon database by replacing the original 1848–2009 IBTrACS dataset with the CMA Best Track Dataset (1949–2022) from the Tropical Cyclone Data Center, thereby enhancing typhoon simulation accuracy through regionally calibrated historical records.

The configuration process for TCRM, illustrated in Figure 6, involves five sequential steps. Firstly, input parameters including typhoon identifier, time sequence, longitude, latitude, and central pressure are selected. Secondly, the output path for synthetic simulation results is defined. Thirdly, taking the Wenzhou station as a case study, the domain is set as a rectangular region spanning 115° E–126° E and 23° N–34° N. Fourthly, to conserve computational resources, a 100-year simulation period is adopted for comparative experiments between the CMA Best Track Dataset and the IBTrACS dataset. Finally, return periods of 10, 30, 50, 100, and 200 years are configured, with wind speed units set to m/s.

To investigate the impact of different input datasets (IBTrACS vs. CMA Best Track Dataset) on typhoon hazard analysis, we conducted parallel simulations using the IBTrACS dataset as input under identical configuration parameters. This generated synthetic typhoon events based on IBTrACS, enabling direct comparison with results derived from the CMA Best Track Dataset. For the 100-year return period, Figure 7a,b display the extreme wind speed distributions near Wenzhou simulated using the IBTrACS and CMA datasets, respectively. The black and red bounding boxes highlight the nearshore region of Zhejiang Province to emphasize spatial differences in wind speed magnitudes. Within this delineated zone, the CMA dataset produces systematically higher extreme wind speeds than IBTrACS across both coastal and inland areas of Zhejiang. Further quantitative comparison of return period wind speeds at Wenzhou Station, in Figure 8, reveals that the CMA dataset yields marginally higher maximum wind speeds across all tested return periods (10–100 years). These results suggest that the CMA Best Track Dataset provides more conservative estimates of typhoon wind speed hazards, reflecting its enhanced practical relevance for regional disaster prevention and mitigation planning.

Building upon the aforementioned 100-year return period results and comparative analysis, this study further presents 1000-year simulation outcomes, accompanied by systematic validation of typhoon initiation points and key parameters. The TCRM model generates an extensive ensemble of intensity-stratified synthetic typhoon events with statistically homogeneous spatial distributions, providing comprehensive typhoon track datasets essential for coastal typhoon hazard characterization. This methodology establishes a robust data foundation for typhoon risk quantification and storm surge hazard modeling. Accordingly, we implemented TCRM to develop a 1000-year synthetic typhoon catalog for the Northwest Pacific basin, containing 27,899 computationally generated events. The spatiotemporal distributions of track trajectories and corresponding intensity metrics are presented in Figure 9.

To rigorously assess the statistical consistency between synthetic and historical typhoon characteristics, we validated the genesis locations of the 1000-year synthetic typhoons generated by TCRM. This validation framework involves examination of typhoon genesis locations and typhoon key parameters, including central pressure and maximum sustained wind speed. The methodology employs comparative analysis of genesis location distributions between historical and synthetic typhoons, complemented by statistical comparison of frequency distributions for central pressure and 2 min averaged maximum wind speeds near the typhoon center. As shown in Figure 10, we compared the spatial distributions of historical (blue) and synthetic (red) typhoon genesis locations. The visualization reveals strong spatial consistency in clustering patterns, with both datasets exhibiting high-density genesis zones clustered within the 10° N–25° N latitudinal band. This spatial alignment statistically validates TCRM’s capability to reproduce observed genesis location characteristics in synthetic typhoon generation.

To conduct a storm surge hazard assessment along the Zhejiang coast, it is necessary to extract the synthetic typhoon events that affect this region from the generated 1000-year virtual typhoon event set. A rectangular domain encompassing the four representative coastal stations (Haimen, Ruian, Wenzhou, and Zhapu) is shown in Figure 1, geographically bounded by 119° E–125° E and 26° N–32° N. From the comprehensive 1000-year synthetic typhoon catalog, 1294 events passing through this domain were identified. Each event has a temporal resolution of 1 h, and together they form the typhoon subset impacting Zhejiang’s coastal zone. The spatial trajectories of these selected typhoons are visualized in Figure 11. This refined dataset will serve as the basis for subsequent typhoon wind field calculations and storm surge modeling, enabling probabilistic hazard characterization across the study area.
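The selection of events passing through the rectangular domain can be sketched as follows. This is only an illustration: it assumes each track is a list of hourly (lon, lat) pairs keyed by a hypothetical event identifier, and it interprets "passing through" as having at least one track point inside the box, which the text does not define precisely.

```python
def passes_through(track, lon_min=119.0, lon_max=125.0, lat_min=26.0, lat_max=32.0):
    """Return True if any (lon, lat) fix of the track lies inside the rectangle."""
    return any(
        lon_min <= lon <= lon_max and lat_min <= lat <= lat_max
        for lon, lat in track
    )


def select_events(catalog, **bounds):
    """Keep only the events whose track enters the rectangular domain."""
    return {
        event_id: track
        for event_id, track in catalog.items()
        if passes_through(track, **bounds)
    }
```

A track that crosses the box between two consecutive hourly fixes without a fix falling inside would be missed by this point-in-box test; with 1 h resolution this is unlikely but possible for fast-moving storms near a corner of the domain.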

To systematically validate the statistical consistency of typhoon parameters between synthetic and historical events, we divided the rectangular domain in Figure 1 into four 1.5°-resolution subregions, i.e., [119° E, 120.5° E], [120.5° E, 122° E], [122° E, 123.5° E], and [123.5° E, 125° E]. Subsequent analysis compared the frequency distributions of central pressure and 2 min averaged maximum wind speeds near the typhoon center between historical and synthetic typhoon events across these subregions, as shown in Figure 12 and Figure 13. Figure 12a–d presents the frequency distributions of central pressure for the four subregions, where blue histograms represent the results of historical typhoon events and red histograms denote synthetic counterparts. The results reveal strong statistical agreement in central pressure between synthetic and historical datasets across all subregions.

Figure 13a–d shows the frequency distributions of typhoon 2 min averaged maximum wind speeds near the typhoon center across the four subregions, with blue histograms denoting the statistical results of historical typhoon events and red histograms representing synthetic counterparts. Mirroring the central pressure analysis, the statistical frequency distributions of 2 min averaged maximum wind speeds near the typhoon center between synthetic and historical typhoon datasets demonstrate strong statistical alignment across all subregions.

These systematic comparisons collectively validate that the 1000-year synthetic typhoon catalog generated through TCRM achieves statistically robust simulations of tropical cyclone climatology—including genesis patterns, trajectory characteristics, and intensity metrics—specifically calibrated for Zhejiang Province’s coastal regions.

3.2. Machine Learning-Based Forecasting of Storm Surge Height

Storm surge forecasting methodologies primarily encompass empirical, numerical, and machine learning approaches. Given the substantial predictive uncertainties of empirical forecasting and the high computational costs of numerical modeling, this study employs machine learning techniques for storm surge prediction. The core objective of our machine learning framework is to leverage typhoon-related parameters for predicting site-specific surge elevations, enabling efficient simulation of synthetic typhoon-induced storm surges at targeted coastal stations. This approach systematically expands storm surge datasets while substantially lowering computational demands compared with conventional numerical modeling, particularly for large synthetic typhoon catalogs, thereby supporting probabilistic surge hazard analysis. Focusing on four representative coastal stations in Zhejiang Province, Haimen, Ruian, Wenzhou, and Zhapu, we developed station-specific machine learning models using historical typhoon parameters as inputs and observed surge heights as outputs. To compare model performance across architectures, four machine learning methods, i.e., RF, BP, LSTM, and SVR, were implemented at each station, aiming to identify the best-performing predictive framework.

3.2.1. Data Preparation

To establish hourly resolution storm surge forecasting models for the four target stations, we constructed historical hourly typhoon track data paired with corresponding storm surge height. The hourly storm surge heights were sourced from the Collection of Storm Surge Disasters Historical Data in China, ensuring data authenticity and observational validity. Corresponding typhoon track data were obtained from the Wenzhou Typhoon Website (https://m.wztf121.com, accessed on 20 February 2024), which provides 1 h temporal resolution, a notable enhancement compared with the 6 h resolution of the CMA track dataset. Based on storm surge statistics from the Collection of Storm Surge Disasters Historical Data in China, 36 typhoon tracks impacting the four study stations were selected, as shown in Figure 14. We summarized the typhoon track data for each station, and the statistical results are presented in Table 2.

3.2.2. Input Parameter Experiment

Sensitivity experiments were conducted on the model’s input parameters. The selection of input parameters represents a critical step in model construction. Building on prior research, Yao [36] utilized typhoon eye coordinates (longitude (lon) and latitude (lat)), maximum wind speed (v_max), translational speed (v_T), central pressure (p_central), and radius of maximum winds (R_max) as inputs to develop an LSTM-based storm surge model for the northern South China Sea. Following this methodological framework, we initially adopted identical parameters for our typhoon-induced storm surge forecasting model. However, as historical typhoon track datasets lack two essential parameters, translational speed and radius of maximum winds, these values were derived computationally using Equations (23) and (24) [49].

v_{T} = \frac{β \sqrt{Δ\mathrm{lat}^{2} + Δ\mathrm{lon}^{2}}}{6}

(23)

\ln R_{\max} = 3.859 - 7.7 \times 10^{- 5} Δ p^{2}

(24)

In Equation (23), Δlat denotes the latitude difference between two consecutive timesteps; Δlon represents the longitude difference between two consecutive timesteps; and β is a latitude-dependent parameter, following Wang [50]. In Equation (24), Δp corresponds to the difference between the central pressure and the standard atmospheric pressure.
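Equations (23) and (24) translate directly into code, as in the hedged sketch below. The function names are hypothetical. The value of β must be supplied by the caller because its latitude dependence is defined in Wang [50] and is not reproduced here. Units are not stated in the text; the commonly used form of this regression takes Δp in hPa and returns R_max in km, which is assumed in the comments.

```python
import math


def translation_speed(dlat, dlon, beta):
    """Equation (23): beta * sqrt(dlat^2 + dlon^2) / 6.

    dlat, dlon: latitude and longitude differences between consecutive fixes
    beta:       latitude-dependent parameter (see Wang [50]); caller-supplied
    """
    return beta * math.sqrt(dlat ** 2 + dlon ** 2) / 6.0


def radius_max_wind(delta_p):
    """Equation (24): ln R_max = 3.859 - 7.7e-5 * delta_p^2.

    delta_p is the central pressure deficit (assumed hPa);
    the result is R_max (assumed km).
    """
    return math.exp(3.859 - 7.7e-5 * delta_p ** 2)
```

Under these assumed units, a pressure deficit of 50 hPa gives an R_max of roughly 39 km, and the radius shrinks as the deficit grows, consistent with more intense typhoons having tighter cores.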

Following Yao [36], we developed an LSTM model with an identical architecture: a two-layer neural network containing 10 neurons per hidden layer. The dataset was partitioned into training (80%) and testing (20%) subsets, corresponding to seven typhoon events per station for model training and two events for testing. To optimize training efficiency, gradient descent algorithms were implemented to iteratively update network weights and biases, progressively minimizing the loss function value. Concurrent test set monitoring was employed to mitigate overfitting risks. Prior to model training, raw data underwent Z-score normalization to standardize input distributions, with the normalization formula for each feature defined in Equation (25),

x_{i}^{*} = \frac{x_{i} - μ}{σ}

(25)

In Equation (25), μ represents the mean value of the feature, while σ denotes its standard deviation.
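A minimal sketch of the Z-score step in Equation (25) is shown below, together with the inverse transform needed to return predictions to physical units. The function names are illustrative. Computing μ and σ from the training subset only, and using the population standard deviation, are assumptions made here; the text does not specify either choice.

```python
import math


def fit_zscore(values):
    """Return (mean, std) of one feature, using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std


def transform(values, mean, std):
    """Equation (25): x* = (x - mu) / sigma."""
    return [(v - mean) / std for v in values]


def inverse_transform(z_values, mean, std):
    """Undo Equation (25) so that outputs are back in the original units."""
    return [z * std + mean for z in z_values]
```

In practice the same (mean, std) pair fitted on the training data would be reused for the test data and for any synthetic typhoon inputs, so that all samples share one scale.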

To evaluate the performance of the predictive model, we selected the correlation coefficient (CC) between model predictions and observed values as the evaluation metric. Its computational formula is given in Equation (26),

C C = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} \sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}

(26)

In Equation (26), n represents the sample size, x_{i} and y_{i} denote the i-th predicted and observed values, and \bar{x} and \bar{y} indicate the mean values of the predicted and observed values, respectively.
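Equation (26) is the Pearson correlation coefficient, with n the number of paired samples. It can be computed as in the following sketch, where the function name is an illustrative choice:

```python
import math


def correlation_coefficient(predicted, observed):
    """Equation (26): Pearson correlation between predictions and observations."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 samples")
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in predicted))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in observed))
    return covariance / (spread_x * spread_y)
```

Because CC measures linear association only, a model whose predictions are systematically biased or scaled can still score close to 1, so CC is best read alongside an error measure in the original units.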

Based on the aforementioned LSTM model and input parameters, the correlation coefficients (CC) between predicted and observed storm surge heights for the test sets at Haimen, Ruian, Wenzhou, and Zhapu stations were 0.71, 0.70, 0.77, and 0.60, respectively. These evaluation results indicate that the neural network model constructed with these input parameters exhibits limited predictive capability. This limitation likely arises because the input parameters describe only the typhoon’s intrinsic characteristics, which act as teleconnection factors and are insufficient to fully resolve localized surge dynamics. To improve forecasting accuracy, we augmented the model with localized environmental drivers, specifically wind fields and pressure fields. Consequently, the YM wind field model was employed to calculate site-specific pressure and wind speed values. Zonal (u) and meridional (v) wind speed components were further derived through Equation (27) and Equation (28), respectively.

u = v_{r_{s}} \cdot \sin θ + v_{θ_{s}} \cdot \cos θ

(27)

v = v_{r_{s}} \cdot \cos θ - v_{θ_{s}} \cdot \sin θ

(28)

In Equations (27) and (28), θ represents the azimuth angle between the observation point and the typhoon center, v_{r_{s}} denotes the radial wind speed component, directed outward from the typhoon center, and v_{θ_{s}} indicates the tangential wind speed component, directed along the tangent to the typhoon’s circular motion.
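Equations (27) and (28) amount to a rotation of the (radial, tangential) wind vector into zonal and meridional components. A short sketch is given below; the function name is hypothetical, and passing θ in degrees is an assumption, since the text does not state the angle unit or reference direction.

```python
import math


def wind_components(v_radial, v_tangential, theta_deg):
    """Return (u, v) from Equations (27) and (28).

    v_radial:     radial wind speed component, positive outward
    v_tangential: tangential wind speed component
    theta_deg:    azimuth angle theta, assumed here to be in degrees
    """
    theta = math.radians(theta_deg)
    u = v_radial * math.sin(theta) + v_tangential * math.cos(theta)
    v = v_radial * math.cos(theta) - v_tangential * math.sin(theta)
    return u, v
```

Because the transformation is a pure rotation, the total wind speed is preserved: u² + v² equals the sum of the squared radial and tangential components, which gives a quick sanity check on any implementation.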

To differentiate the effects of distinct input parameters on the results, we designated the teleconnection input parameters as Test1 (referring to Yao [36]), while the experiment incorporating local factors, i.e., site-specific zonal (u) and meridional (v) wind speed and pressure p, is termed Test2. Additionally, we implemented Test3, which combines both local factors and teleconnection parameters as inputs. Using the CC as the evaluation metric, the performance for all three experimental configurations is summarized in Table 3.

Comparative analysis of Table 3 demonstrates that integrating localized environmental drivers with teleconnection parameters as input features achieves optimal predictive performance. Consequently, the selected optimal input parameter combination comprises typhoon eye coordinates (lon, lat), maximum wind speed (v_max), translational speed (v_T), central pressure (p_central), radius of maximum winds (R_max), the site-specific zonal (u) and meridional (v) wind speed components, and the site-specific pressure p. These parameters form the input dataset for the machine learning models used to forecast storm surge heights.

3.2.3. Model Comparison Experiments

As shown in Table 4, while the LSTM model demonstrates relatively satisfactory predictive performance, its results remain suboptimal. To explore potential improvements, we further implemented three additional machine learning models, BP model, RF model, and SVR model, aiming to identify a superior forecasting model through comparative analysis.

For the BP, RF, and SVR models, we maintained consistency with the LSTM model’s data partitioning methodology, applying the identical 80%:20% training–test split ratio to ensure comparability of model performance. The BP model employs Z-score standardization. Its architecture comprises an input layer, a first hidden layer (64 neurons, ReLU activation), a second hidden layer (32 neurons, ReLU activation), and an output layer (1 neuron). During compilation, the model uses mean squared error (MSE) as the loss function with the Adam optimizer, and it is trained over 100 epochs with a batch size of 32.

The RF model similarly applies Z-score standardization and maintains an 8:2 dataset split. The RF model is configured with 100 decision trees, with the random seed fixed at 42. The predicted values of the training set, the predicted values of the test set, and the actual observed values are subjected to inverse standardization. The results after inverse standardization are used for calculating error metrics and conducting visual analysis to ensure that the model output has physical significance.

The SVR model is implemented with a Radial Basis Function (RBF) kernel, and we set the hyperparameters to C = 100, γ = 0.1, and ϵ = 0.1. Input features and targets are standardized to eliminate unit discrepancies during training, followed by inverse standardization of predictions. Taking Wenzhou Station as an example, we compared the predicted and observed values of all four models, namely LSTM, BP, RF, and SVR, on the training and test sets, as shown in Figure 15 and Figure 16.

Based on the predictive results obtained from the four machine learning models, the correlation coefficients between observed and predicted values in the test sets are summarized in Table 4. Analysis of Table 4 reveals significant variations in training outcomes across different machine learning models for the four stations. Overall, the RF model demonstrates superior predictive performance compared with the BP, LSTM, and SVR models. The SVR model exhibits the poorest performance, with test set correlation coefficients ranging from 0.72 to 0.82. Conversely, the RF model achieves the best performance, yielding correlation coefficients exceeding 0.85 and peaking at 0.93. To assess the generalization capability of the machine learning model, we performed cross-validation by alternating typhoon samples in the training and test sets. Taking Zhapu Station as an example, the last row of Table 4 provides the correlation coefficients of prediction results after replacing the test set typhoon samples. The minimal variation in correlation coefficients before and after data substitution demonstrates the model’s robust generalization ability. Similar patterns were observed at the other stations, though they are not shown here due to space limitations. Consequently, the RF model is selected as the optimal model for storm surge height prediction at all four stations.

3.3. Extraction of Extreme Wind Speeds and Extreme Storm Surge Heights

To assess typhoon and storm surge hazards in Zhejiang Province, 375 typhoon events impacting the region were extracted from the 1000-year synthetic typhoon catalog generated by TCRM in Section 3.1. Using the optimal RF forecasting model of storm surge identified in Section 3.2, we simulated storm surge elevations at Haimen, Ruian, Wenzhou, and Zhapu coastal stations. Partial surge elevation results for Wenzhou station are shown in Figure 17, with maximum surge elevations reaching 250 cm. We systematically extracted the maximum surge height and maximum wind speed induced by each typhoon event at Haimen, Ruian, Wenzhou, and Zhapu stations, establishing extreme surge height and wind speed sequences. Since line charts inadequately capture statistical distributions, we present the frequency distribution histograms of the extreme surge height sequences for all 375 storm surge events at each station, as shown in Figure 18. The histograms reveal that maximum surge elevations occur at extremely low frequencies across all stations, indicating that while extreme surge events have low probabilities of occurrence, their potential impacts could be severe. Notably, Wenzhou station exhibits higher surge magnitudes compared with other stations, underscoring the need for prioritized attention to its maximum surge scenarios. The mode of storm surge elevation differs markedly between stations. Haimen Station exhibits a modal surge in the 60–80 cm range, Ruian Station in the 20–40 cm range, and both Wenzhou and Zhapu stations in the 50–100 cm range. These variations reflect distinct typhoon-induced surge characteristics across locations.
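The construction of the extreme sequences and the identification of the modal bin in the histograms can be sketched as follows. The data layout (a mapping from a hypothetical event identifier to that event's hourly values at one station) and the fixed bin width are assumptions for illustration only.

```python
from collections import Counter


def extreme_sequence(event_series):
    """Return the maximum value of each event, one entry per typhoon."""
    return [max(series) for series in event_series.values()]


def modal_bin(values, width):
    """Return the (lower, upper) edges of the most populated fixed-width bin."""
    counts = Counter(int(v // width) for v in values)
    index, _ = max(counts.items(), key=lambda item: item[1])
    return index * width, (index + 1) * width
```

Applied to the surge series of all events at one station, `extreme_sequence` yields the extreme surge height sequence, and `modal_bin` with a 20 cm width would report a range such as 60–80 cm when most event maxima fall in that interval.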

In addition to storm surge hazard assessment, we also quantitatively evaluated typhoon wind hazards by computing wind speeds at four coastal stations for all 375 synthetic typhoons using the YM wind field model. Extreme wind speed sequences were derived by extracting the maximum wind speed generated at each station by each typhoon, and their frequency distribution histograms are presented in Figure 19. The results indicate that the extreme wind speeds (50–60 m/s) exhibit consistently low occurrence frequencies at all stations, confirming the meteorological rarity of extreme wind events. Statistical analysis reveals that modal wind speeds cluster uniformly in the 5–10 m/s range at all stations, confirming low-intensity winds as the predominant condition during typhoons. This distinct pattern underscores a critical feature of typhoon-induced wind speed distributions along Zhejiang’s coast: although sustained winds of 5–10 m/s exhibit relatively low destructive potential, their persistent prevalence demands proactive planning to address compounding infrastructural and environmental impacts over time.

3.4. Typhoon and Storm Surge Hazard Analysis

Extreme surge height and wind speed sequences serve as the foundational dataset enabling the prediction of extreme wind speeds and extreme surge heights for different return periods at each station through the extreme value distribution model. In this analytical framework, this study applied four extremal distributions, empirical distribution, Weibull distribution, Gumbel distribution, and Generalized Pareto Distribution (GPD), to fit the extracted extreme surge and wind speed sequences.

The determination of the threshold is critical for the GPD. We evaluated the GPD fitting performance using Quantile–Quantile (Q-Q) plots and accepted a candidate threshold if it resulted in a marked improvement in goodness-of-fit. Q-Q plots provide a visual assessment of the goodness-of-fit between theoretical and empirical distributions. Taking the extreme water level fitting at Zhapu Station as an example, candidate thresholds of 80 cm, 100 cm, and 120 cm were tested. As shown in Figure 20, the Q-Q plot for the 80 cm threshold demonstrates the best fit. Consequently, the optimal threshold for GPD fitting of extreme water levels at Zhapu Station was determined to be 80 cm. The same methodology was applied to determine the GPD fitting thresholds for other stations and extreme wind speeds. Detailed descriptions are omitted here for brevity.

The statistical goodness-of-fit for the Weibull, Gumbel, and GPD distributions was evaluated using the Kolmogorov–Smirnov (KS) test under a significance level of 0.05, with detailed test results presented in Table 5 and Table 6. In Table 5 and Table 6, the H value is either 0 or 1. When H = 0, it indicates that the difference between the sample distribution and the theoretical distribution is not statistically significant, i.e., the data conform to the specified distribution. When H = 1, it indicates that the difference between the sample distribution and the theoretical distribution is statistically significant, i.e., the data do not conform to the specified distribution. If the p-value is less than the set significance level of 0.05, the data do not conform to the distribution; if it is greater than 0.05, the data conform to the specified distribution.
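The H and p-value logic described above can be illustrated with a self-contained one-sample KS test. This is a sketch, not the implementation used in the study: the p-value comes from the asymptotic Kolmogorov series, which is an approximation for finite samples, and all function names are hypothetical. A library routine such as `scipy.stats.kstest` would normally be preferred.

```python
import math


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just after and just before the step at x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic p-value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 n d^2)."""
    t = math.sqrt(n) * d
    if t < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t)
                for k in range(1, terms + 1))
    return min(max(2.0 * total, 0.0), 1.0)


def ks_decision(sample, cdf, alpha=0.05):
    """Return (H, p, D): H = 1 rejects the distribution, H = 0 does not."""
    d = ks_statistic(sample, cdf)
    p = ks_pvalue(d, len(sample))
    h = 1 if p < alpha else 0
    return h, p, d
```

Here `cdf` would be the fitted Weibull, Gumbel, or GPD distribution function evaluated with the estimated parameters. Note that estimating parameters from the same sample makes the standard KS p-value optimistic, a caveat that applies to any implementation.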

The KS test results documented in Table 5 and Table 6 reveal consistent limitations of conventional distributions in modeling extremes. For both storm surge heights and typhoon wind speeds, the Weibull and Gumbel distributions demonstrate statistically inadequate performance, with hypothesis test outcomes of H = 1 and p ≈ 0 confirming significant deviations from the empirical distributions. This contrasts sharply with the capability of the threshold-exceedance approach. The GPD, specifically designed for modeling the heavy-tailed portion, achieves significantly better fit quality for both typhoon wind speeds and storm surge heights.

Visual validation through Figure 21 reinforces this finding: the close alignment between GPD-derived theoretical curves and observed frequency distributions at the Haimen, Ruian, Wenzhou, and Zhapu coastal stations confirms the model’s robust performance in characterizing extreme wind speeds.

To evaluate the GPD fitting performance for extreme storm surge elevations at the Haimen, Ruian, Wenzhou, and Zhapu stations, we constructed corresponding Q-Q plots as shown in Figure 22. As demonstrated in Table 6 and Figure 22, the GPD exhibits robust fitting performance for extreme storm surge heights at Haimen, Ruian, and Wenzhou stations, but demonstrates reduced congruence at Zhapu Station. This discrepancy originates from the empirical frequency distribution at Zhapu departing from heavy-tailed characteristics, manifested through insufficient exceedance data above the threshold and insufficient concentration of peak surge observations.

Based on these findings, we adopted the GPD as the optimal distribution for modeling extreme typhoon wind speeds at all four stations (Haimen, Ruian, Wenzhou, and Zhapu) and extreme storm surge elevations at the Haimen, Ruian, and Wenzhou stations, while adopting the empirical distribution for extreme surge heights at Zhapu Station.

Based on the calibrated extreme value distributions, we quantitatively estimated extreme wind speeds and surge heights corresponding to 10-, 30-, 50-, 100-, and 200-year return periods at all stations. These estimates were derived using return period formulas associated with the empirical distribution, Weibull distribution, Gumbel distribution, and GPD, as defined in Equations (16), (18), (20), and (22), respectively. The predicted maximum wind speeds and maximum surge heights for different return periods at the four stations are presented in Table 7 and Table 8, respectively.

To more intuitively demonstrate variations in extreme wind speeds and storm surge heights across different return periods and stations, we constructed corresponding bar charts with 95% confidence intervals, as shown in Figure 23 and Figure 24. The narrower confidence intervals indicate that the return period estimates derived from the extreme value distribution adopted in this study exhibit enhanced stability.

As evidenced by Table 7 and Figure 23, wind speeds at all four stations demonstrate a systematic progression with extended return periods. For example, Haimen station documents wind speeds escalating from 34.17 m/s (10-year) to 51.81 m/s (200-year return period). This pattern quantitatively confirms the probability escalation of extreme meteorological events and their associated wind intensity amplification over multi-decadal timescales. Spatial analysis reveals marked inter-station divergence. Ruian station maintains relatively higher return period wind speeds, persistently surpassing Haimen and Zhapu stations throughout all investigated return intervals, while Wenzhou station achieves near-parity with Ruian’s values at defined recurrence epochs. These variations mechanistically originate from site-specific geographical determinants, encompassing the locational influence within typhoon trajectory corridors, the modulatory influence of mesoscale topography on wind field dynamics, and coupling effects between coastal geomorphology and marine processes. Illustratively, Ruian’s elevated wind susceptibility may reflect dual controls from typhoon pathway convergence zones and terrain-induced acceleration phenomena.

As revealed in Table 8 and Figure 24, storm surge heights at all four stations demonstrate hydrodynamic progression with elevated return periods. At Wenzhou station, the surge height climbs from 230.09 cm (10-year) to 261.30 cm (200-year return period), exemplifying the intensification of storm surge impacts under extreme meteorological forcing. This hydrodynamic progression quantitatively demonstrates that extreme-event water level anomalies become geophysically amplified, thereby escalating coastal inundation risks. Inter-station comparative analysis reveals distinct surge response regimes. Wenzhou station maintains the highest surge elevations across the four stations, while Zhapu station exhibits lower baseline surges but disproportionate escalation rates at multi-century return intervals. This spatial dichotomy primarily derives from site-specific oceanographic–topographic coupling. Wenzhou’s enhanced surge susceptibility stems from typhoon-optimized bathymetric configurations and nearshore resonance effects, whereas Zhapu’s nonlinear response pattern arises from estuarine funneling effects and tidal current modulation, which suppress low-intensity surges but trigger amplification thresholds under extreme forcing conditions.

Synthesis of multi-return period extreme surge heights and wind speeds reveals spatiotemporal heterogeneity in typhoon-induced storm surge impacts along Zhejiang’s coastline. This spatial differentiation necessitates systematic integration of site-specific parameters in storm surge hazard mitigation planning, particularly for infrastructure hardening and community-scale emergency preparedness. Fundamentally, the geospatial risk stratification between stations—with Ruian/Wenzhou exhibiting peak intensities versus Zhapu’s threshold-dependent escalation—demands differentiated engineering solutions ranging from typhoon-resilient building codes to estuarine floodgate retrofitting. These findings collectively establish that probabilistic hazard mapping based on station-specific extreme value distributions operationalizes coastal resilience planning across Zhejiang’s diverse littoral environments. The methodology framework, integrating synthetic typhoon modeling with multi-distribution frequency analysis, provides actionable intelligence for optimizing seawall design standards, calibrating early warning systems, and prioritizing coastal zone investments.

4. Summary and Prospects

4.1. Summary

This study integrates the TCRM model for typhoon generation with machine learning-based storm surge forecasting to analyze typhoon and storm surge hazards at Haimen, Ruian, Wenzhou, and Zhapu coastal stations in Zhejiang Province. Key findings are summarized as follows.

Firstly, for the traditional TCRM model, replacing the global IBTrACS tropical cyclone database (1848–2009) with the CMA best-track dataset (1949–2022) resulted in higher predicted typhoon hazard levels for Zhejiang’s coastal areas. The updated TCRM model was used to generate a synthetic typhoon event catalog for the western North Pacific over a 1000-year period. Secondly, based on historical typhoon track data from the Wenzhou Typhoon Network, storm surge height data from the Collection of Storm Surge Disasters Historical Data in China, and machine learning models, this study established storm surge forecasting models for the Haimen, Ruian, Wenzhou, and Zhapu stations. Sensitivity experiments on LSTM input parameters showed that models incorporating both teleconnection and local factors achieved optimal training and testing performance. Evaluations of the LSTM, BP, SVR, and RF models demonstrated RF as the optimal model for all four stations. Thirdly, using synthetic typhoon events impacting Zhejiang Province selected from the 1000-year synthetic typhoon catalog, combined with the storm surge forecasting model and the YM wind field model, we obtained extreme wind speed sequences and extreme surge height sequences for the four stations. Finally, four extreme value distributions, i.e., the empirical, Weibull, Gumbel, and GPD distributions, were applied to fit the extreme wind speed and surge height sequences. Goodness-of-fit tests confirmed that the GPD best models extreme wind speeds at all four stations and extreme surge heights at Haimen, Ruian, and Wenzhou, while the empirical distribution best fits extreme surge heights at Zhapu. Using these optimal distributions, we calculated 10-, 50-, 100-, and 200-year return period extreme wind speeds and surge heights for all stations, providing critical references for disaster management authorities.
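To illustrate the final step, the sketch below evaluates the standard peaks-over-threshold return-level formula for a GPD, x_T = u + (σ/ξ)[(λT)^ξ − 1], with the limit x_T = u + σ ln(λT) as ξ → 0. It is a hedged illustration under stated assumptions rather than the authors' implementation: the function name `gpd_return_level` and all parameter values are hypothetical and are not fitted values from this study.

```python
import math


def gpd_return_level(T_years, threshold, scale, shape, rate_per_year):
    """T-year return level for a peaks-over-threshold GPD model.

    threshold     : POT threshold u
    scale, shape  : GPD parameters sigma > 0 and xi
    rate_per_year : mean number of threshold exceedances per year (lambda)
    """
    m = rate_per_year * T_years  # expected number of exceedances in T years
    if m <= 1.0:
        raise ValueError("return period too short for this threshold")
    if abs(shape) < 1e-9:
        # Limiting (exponential-tail) case as xi -> 0.
        return threshold + scale * math.log(m)
    return threshold + scale / shape * (m ** shape - 1.0)


# Illustrative parameters only (assumed, not taken from the paper).
levels = {
    T: gpd_return_level(T, threshold=100.0, scale=20.0, shape=-0.1, rate_per_year=0.5)
    for T in (10, 50, 100, 200)
}
```

With a negative shape parameter the return levels grow ever more slowly and stay below the finite upper bound u − σ/ξ, while a positive shape parameter gives a heavy tail whose return levels keep growing at long return periods.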

4.2. Prospects

Machine learning is still an emerging approach in storm surge prediction. The field lacks standardized methodological frameworks and agreed evaluation metrics, so current methods depend largely on the empirical choices of individual investigators. Despite these developmental challenges, theoretical breakthroughs in this field could drive substantial progress in data-intensive surge forecasting systems.

Although the return-period predictions of typhoon wind speed and storm surge presented here are specific to several stations in Zhejiang Province, the proposed approach of integrating the TCRM model with machine learning models for typhoon storm surge hazard analysis can be applied to other regions of the world, providing a reference for analyzing typhoon and storm surge disasters there.

While this study establishes a foundational framework for typhoon-induced storm surge hazard analysis in Zhejiang Province, several knowledge gaps merit rigorous investigation. Specifically, future research should quantify how parametric variations in typhoon tracks and intensities modulate station-specific surge responses, while advancing predictive frameworks for storm surge genesis and spatiotemporal evolution. Furthermore, data-model fusion strategies integrating multi-source observations with ensemble modeling techniques could substantially enhance hazard assessment precision. Critical infrastructure limitations persist: Zhejiang’s coastal zone currently lacks a domain-specific parametric storm database—a systemic constraint compounded by regional disparities in research frameworks—which fundamentally constrains machine learning’s operational capacity in local surge prediction. Consequently, developing geospatially optimized storm surge repositories for high-risk littoral systems represents an urgent scientific priority.

Therefore, our ongoing research efforts will focus on advancing typhoon and storm surge disaster analysis through three strategic initiatives:

(1) Enhanced Stochastic Typhoon Modeling: Implementing state-of-the-art empirical track simulation methods to upgrade the TCRM model, particularly refining its capacity to capture nonlinear path deviations and intensity fluctuations;
(2) AI-Driven Storm Surge Forecasting: Developing advanced machine learning architectures (e.g., physics-informed neural operators) to achieve high-precision, extended-lead-time predictions of regional storm surge dynamics, with particular emphasis on coastal bathymetry–topography interactions;
(3) Multivariate Hazard Risk Assessment: Optimizing extreme value distribution models while establishing high-dimensional joint probability distributions to enable coupled forecasting of wind–surge–wave compound hazards, incorporating copula theory to address dependence structures among multivariate extremes.

Author Contributions

Conceptualization, A.L.; Methodology, Y.F., X.L. and Y.G.; Software, X.L.; Validation, X.L. and Y.G.; Formal analysis, Y.F. and Y.G.; Investigation, X.L.; Resources, Y.G.; Data curation, Y.G.; Writing—original draft, X.L.; Writing—review & editing, X.L., Y.S. and Y.G.; Funding acquisition, Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 42306233 and 42176014), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA0310602), the Tianjin Natural Science Foundation (Grant No. 23JCYBJC01120) and the Shandong Provincial Natural Science Foundation (Grant No. ZR2021QD108).

Data Availability Statement

The data were supplied by the China Meteorological Administration Best Track Dataset (available online: https://tcdata.typhoon.org.cn/zjljsjj.html, accessed on 17 March 2023) and the Wenzhou Typhoon Website (available online: https://m.wztf121.com, accessed on 20 February 2024). Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Delineated coastal zones in Zhejiang Province (The solid red box denotes the study area; the dashed red lines represent the four subregions divided for parameter validation in Section 3.1; and the red dots indicate the research stations).

Figure 2. BP neural network structure diagram.

Figure 3. LSTM neural network structure diagram.

Figure 4. SVR model structure diagram.

Figure 5. Random Forest algorithm flowchart.

Figure 6. TCRM model configuration process.

Figure 7. Plots of maximum wind speed distributions for a 100-year return period derived from (a) IBTrACS Dataset and (b) CMA Best Track Dataset as inputs to TCRM (The black and red bounding boxes highlight areas with marked discrepancies in predicted wind speeds).

Figure 8. Plot of maximum wind speed variations with return period under the condition of using (a) IBTrACS Dataset and (b) CMA Best Track Dataset as inputs to TCRM.

Figure 9. Synthetic tropical cyclone tracks and intensity data generated by TCRM over a 1000-year horizon.

Figure 10. Spatial distribution of historical typhoon events (blue) and synthetic typhoon events (red).

Figure 11. Synthetic typhoon events impacting the Zhejiang coast screened from TCRM-generated 1000-year synthetic tropical cyclones.

Figure 12. Frequency distributions of central pressure for historical (blue) and synthetic (red) typhoon events in four coastal subregions of Zhejiang Province: (a) [119° E, 120.5° E], (b) [120.5° E, 122° E], (c) [122° E, 123.5° E], (d) [123.5° E, 125° E].

Figure 13. Frequency distributions of 2 min averaged maximum wind speeds near the typhoon center for historical (blue) and synthetic (red) typhoon events in four coastal subregions of Zhejiang Province: (a) [119° E, 120.5° E], (b) [120.5° E, 122° E], (c) [122° E, 123.5° E], (d) [123.5° E, 125° E].

Figure 14. Extracted 36 typhoon tracks from the Collection of Storm Surge Disasters Historical Data in China.

Figure 15. Comparison of predicted and observed storm surge heights for four forecasting models on the training set at Wenzhou station.

Figure 16. Comparison of predicted and observed storm surge heights for four forecasting models on the test set at Wenzhou station.

Figure 17. Partial predicted storm surge heights from synthetic typhoons at Wenzhou station.

Figure 18. Frequency distribution histograms of maximum storm surge heights at four stations: (a) Haimen, (b) Ruian, (c) Wenzhou, (d) Zhapu.

Figure 19. Frequency distribution histograms of maximum wind speeds at four stations: (a) Haimen, (b) Ruian, (c) Wenzhou, (d) Zhapu.

Figure 20. Q-Q plot of the fitted GPD with different thresholds for extreme storm surge heights at Zhapu Station.

Figure 21. GPD fitting for extreme wind speeds at four stations: (a) Haimen, (b) Ruian, (c) Wenzhou, (d) Zhapu.

Figure 22. Q-Q plot of the fitted GPD for extreme storm surge heights at four stations: (a) Haimen, (b) Ruian, (c) Wenzhou, (d) Zhapu.

Figure 23. Bar charts with 95% confidence intervals of extreme wind speed for return periods of 10, 30, 50, 100, and 200 years at Haimen, Ruian, Wenzhou, Zhapu stations.

Figure 24. Bar charts with 95% confidence intervals of extreme surge height for return periods of 10, 30, 50, 100, and 200 years at Haimen, Ruian, Wenzhou, Zhapu stations.

Table 1. Data Sources Used in This Study.

Type of Data	Time Range and Temporal Resolution	Key Parameters	Sources of Dataset
Typhoon Path Data	1949 to 2023; 6 h	timestamp, intensity indicator, latitude, longitude, minimum central pressure, 2 min averaged maximum sustained wind speed near the center (WND), 2 min averaged wind speed (OWD)	CMA Best Track Dataset (https://tcdata.typhoon.org.cn/zjljsjj.html, accessed on 17 March 2023)
Typhoon Path Data	1945 to 2024; 1 h	timestamp, longitude, latitude, intensity, 2 min averaged maximum sustained wind speed near the center, translation speed and central pressure	Wenzhou Typhoon Website (https://m.wztf121.com, accessed on 20 February 2024)
Storm Surge Height Data	1949 to 2009; 1 h	disaster impacts, surge heights, and instances of high tide levels exceeding local warning thresholds	The Collection of Storm Surge Disasters Historical Data in China

Table 2. Typhoon events of Haimen, Ruian, Wenzhou, and Zhapu in the Collection of Storm Surge Disasters Historical Data in China.

Station	Typhoon Events (CMA Number and Name)
Haimen	7209 (Betty), 7504 (Ora), 8712 (Gerald), 8923 (Vera), 9219 (Ted), 9417 (Fred), 9620 (Zane), 9711 (Winnie), 0414 (Rananim)
Ruian	9005 (Ofelia), 9216 (Polly), 9219 (Ted), 9608 (Herb), 9620 (Zane), 9711 (Winnie), 0216 (Sinlaku), 0417 (Chaba), 0608 (Saomai)
Wenzhou	8923 (Vera), 9005 (Ofelia), 9012 (Yancy), 9216 (Polly), 9417 (Fred), 9608 (Herb), 9711 (Winnie), 0417 (Chaba), 0608 (Saomai)
Zhapu	5612 (Wanda), 6207 (Nora), 7413 (Mary), 7910 (Judy), 7919 (Tip), 9417 (Fred), 9711 (Winnie), 0417 (Chaba), 0509 (Matsa)

Table 3. Correlation coefficients between predicted and observed storm surge heights at the four stations of Haimen, Ruian, Wenzhou, and Zhapu under three experimental configurations with different input parameters.

Experimental Group	Input Parameters	Haimen	Ruian	Wenzhou	Zhapu
Test1	lon, lat, v_max, v_T, p_central, R_max	0.71	0.62	0.77	0.60
Test2	u, v, p	0.74	0.70	0.79	0.76
Test3	Test1 + Test2	0.75	0.72	0.82	0.80

Table 4. Comparative analysis of Pearson correlation coefficients between predicted and observed storm surge heights across four tidal stations (Haimen, Ruian, Wenzhou, Zhapu) using four machine learning models.

Station	RF	BP	LSTM	SVR
Haimen	0.85	0.76	0.75	0.73
Ruian	0.90	0.68	0.72	0.72
Wenzhou	0.92	0.85	0.82	0.80
Zhapu	0.93	0.81	0.80	0.82
Zhapu (Cross-validation)	0.91	0.83	0.75	0.78

Table 5. KS test of typhoon extreme wind speeds by different extreme value distribution models.

Station	Weibull H	Weibull P	Gumbel H	Gumbel P	GPD H	GPD P
Haimen	1	~0	1	~0	0	0.33
Ruian	1	~0	1	~0	0	0.08
Wenzhou	1	~0	1	~0	0	0.07
Zhapu	1	~0	1	~0	0	0.35

Table 6. KS test of extreme storm surge heights by different extreme value distribution models.

Station	Weibull H	Weibull P	Gumbel H	Gumbel P	GPD H	GPD P
Haimen	1	~0	1	~0	0	0.81
Ruian	1	~0	1	~0	0	0.63
Wenzhou	1	~0	1	~0	0	0.45
Zhapu	1	~0	1	~0	0	0.21
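
Tables 5 and 6 report Kolmogorov–Smirnov (KS) test outcomes. As a rough illustration of what such a test measures, the sketch below computes the one-sample KS statistic, i.e., the largest gap between the empirical distribution of a sample and a candidate distribution, and applies the large-sample 5% critical value 1.36/√n. The H column is assumed here to follow the common convention that H = 0 means the candidate distribution is not rejected at the 5% level. The data, the uniform candidate distribution, and the function names are hypothetical and are not taken from the study.

```python
import math


def ks_statistic(sample, cdf):
    """One-sample KS statistic: largest gap between the empirical CDF
    of `sample` and a candidate CDF `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def rejects_at_5_percent(d, n):
    """Large-sample approximation: reject when D > 1.36 / sqrt(n)."""
    return d > 1.36 / math.sqrt(n)


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical data: evenly spaced points tested against a uniform CDF.
n = 100
sample = [(i - 0.5) / n for i in range(1, n + 1)]

d = ks_statistic(sample, uniform_cdf)
h = int(rejects_at_5_percent(d, n))  # 0 = not rejected, 1 = rejected
```

A small statistic, and hence H = 0, only indicates that the data do not contradict the candidate distribution; it does not by itself rank competing distributions that all pass the test.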

Table 7. Forecast values of wind speed (m/s) in different return periods for the four stations of Haimen, Ruian, Wenzhou, and Zhapu.

Station	10 Years	30 Years	50 Years	100 Years	200 Years
Haimen	34.17	42.31	45.35	48.88	51.81
Ruian	34.88	43.25	46.76	51.16	55.19
Wenzhou	34.79	42.92	45.94	49.42	52.29
Zhapu	33.94	41.63	44.46	47.70	50.35

Table 8. Forecast values of storm surge height (cm) in different return periods for the four stations of Haimen, Ruian, Wenzhou, and Zhapu.

Station	10 Years	30 Years	50 Years	100 Years	200 Years
Haimen	133.62	143.67	147.91	153.27	158.21
Ruian	129.30	140.62	145.32	151.18	156.49
Wenzhou	230.09	251.81	257.01	259.09	261.30
Zhapu	142.00	181.89	199.96	224.72	250.70

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

