1. Introduction
Global urbanization, a key feature of contemporary socio-economic transformation, has led to increased concentration of production factors and population in urban areas. While this stimulates regional economic growth, it also heightens the vulnerability of urban systems to environmental and infrastructural stresses. China’s urbanization rate grew from 17.92% in 1978 to 65.22% in 2022, accompanied by an annual increase of 12.6% in urban natural disaster losses over the past decade, with floods accounting for 73.6% of the losses [
1]. The Beijing–Tianjin–Hebei (BTH) region, though covering only 1.9% of China’s land, houses 7.8% of the population and contributed 9.3% to the national Gross Domestic Product (GDP) in 2022. However, a 2023 flood in the Haihe River Basin caused losses exceeding 100 billion yuan (CNY), revealing the gap between traditional flood control infrastructure and urban development needs [
2]. Conventional flood control methods are increasingly inadequate, while the emerging paradigm of resilient cities—focusing on self-organization and adaptability—offers a promising solution for overcoming existing flood management challenges and enhancing urban resilience amid climate and hydrological uncertainties [
3,
4,
5,
6,
7]. Resilience, originally an engineering concept for structural recovery, has evolved into a broader framework. It now focuses on dynamic self-organization and system transitions in complex adaptive systems, significantly influencing urban resilience studies [
8,
9,
10,
11].
Scholars have made notable progress in the study of urban flood resilience (UFR), contributing a growing body of theoretical frameworks, analyses of influencing indicators, and empirical applications. In terms of evaluation frameworks, Oladokun et al. [
12] employed fuzzy logic methods to conduct an abstract analysis of flood resilience systems. By identifying system variables and parameters, they established a flood resilience framework from three perspectives: natural, socio-technical, and socio-economic, and validated the model using raw data from flood-prone communities in southwestern Nigeria. Karamouz et al. [
13] developed a dynamic urban infrastructure resilience assessment equation based on four dimensions: agility, robustness, redundancy, and resource accessibility, aimed at improving infrastructure resilience in New York City. Simonovic et al. [
14] developed the Flood Resilience Simulation Tool (FRST) to help quantify, compare, and visualize dynamic flood resilience. Chen et al. [
15] established a resilience assessment model for rainstorm and flood scenarios, focusing on three attributes: resistance, recovery, and adaptability, which comprehensively consider the various capacities of cities when facing rainstorm and flood disasters. Xiao et al. [
16] constructed a UFR assessment system from the PSR (Pressure-State-Response) perspective, analyzing the pressure, state, and response processes of urban flood systems from three angles: stimulus, sensitivity, and adaptability. Li et al. [
17] quantified UFR based on the “4R” characteristics—robustness, resource, responsiveness, and redundancy. He et al. [
18] developed a flood resilience evaluation index system based on the dimensions of economy, society, environment, and infrastructure construction. Bao et al. [
19] developed a comprehensive “Risk-Resilience-Effect” framework, based on the MaxEnt model and the interpretable XGBoost-SHAP model. The study primarily revealed the trade-offs and synergies between flood risk and resilience, and quantified the nonlinear threshold effects of key influencing factors.
In the exploration of influencing indicators, existing studies mainly apply methods such as gray relational analysis, obstacle degree models, spatial Durbin models, and geographical spatial detectors, among others. Ji et al. [
20] used gray relational analysis to diagnose the importance of indicators influencing urban resilience from a rainstorm and flood perspective. Song et al. [
21], based on the obstacle degree model, analyzed the key limiting indicators hindering the development of flood resilience in cities at different types and stages of development. Zhang et al. [
22] employed the spatial Durbin model, considering both spatial spillover effects and spatial heterogeneity, to deeply analyze the influencing indicators in the Yellow River basin and its upper, middle, and lower reaches. Chen et al. [
23] applied the geographical detector model, which can explain the spatial differentiation of UFR and analyze its internal driving mechanisms.
Through the above review, two areas for improvement in current UFR research can be identified: Firstly, most scholars’ interpretations and understandings of UFR are limited to process mechanisms or isolated urban indicators. However, UFR is a complex, multi-dimensional concept. By approaching it from the dimensions of resistance, recovery, and adaptability, and coupling multi-source indicators from nature, economy, society, and infrastructure, a more systematic and comprehensive evaluation framework can be developed. Secondly, in exploring the influencing indicators, commonly used methods such as the obstacle degree model are effective in identifying key indicators but struggle to quantitatively explain the specific contributions of each indicator to changes in UFR. These methods also have limited capabilities in handling nonlinear relationships and complex interaction effects. The XGBoost-SHAP method combines machine learning’s efficient processing power with strong interpretability. It improves prediction accuracy for complex UFR driving indicators and reveals the quantitative contributions and directional impacts of each indicator. This method offers better interpretability and higher practical value [
19,
24,
25].
Given the complexity of urban systems and flood processes, urban flood resilience evaluation involves multiple levels, objectives, and attributes, often accompanied by conflicts among indicators. Therefore, multi-criteria decision-making (MCDM) methods are particularly suitable for this type of assessment. Among various MCDM approaches, the VIKOR method was adopted because it is specifically designed to address conflicting criteria and to identify a compromise solution by measuring proximity to the ideal solution, which is consistent with the trade-offs among the resistance, recovery, and adaptability dimensions in UFR assessment. In terms of indicator weighting, the CRITIC method was selected instead of subjective approaches such as AHP, as it objectively determines weights by considering both the contrast intensity of and the conflict among indicators. This avoids expert bias and better reflects the inherent information structure of the data. By incorporating information entropy to characterize indicator dispersion, the CRITIC-based weighting yields a more reliable and robust weight assignment.
Therefore, this study constructs an indicator system of UFR from the process perspective of resistance, recovery, and adaptability, which considers the perspectives of nature, economy, society, and infrastructure. Index weights are determined using the CRITIC method, which integrates entropy weighting. UFR levels of 13 cities in the BTH region from 2011 to 2022 are evaluated using the VIKOR method [26]. Additionally, the widely applicable XGBoost-SHAP model is employed to identify driving indicators, supporting the development of strategies to improve UFR in the BTH region.
This study focuses on the BTH region, providing an in-depth analysis of the concept of flood resilience. It incorporates spatiotemporal evolution analysis to specifically explore how UFR changes over time and responds to social and environmental dynamics. Furthermore, by integrating the XGBoost-SHAP model with MCDM methods, this study offers a more comprehensive framework for assessing UFR [
19].
2. Materials and Methods
2.1. Study Area
The BTH region consists of Beijing, Tianjin, and 11 cities in Hebei Province, namely Chengde, Zhangjiakou, Qinhuangdao, Tangshan, Baoding, Langfang, Cangzhou, Shijiazhuang, Handan, Xingtai, and Hengshui, as shown in
Figure 1.
The BTH region covers an area of 215,300 km² and has a typical warm-temperate, semi-humid to semi-arid continental monsoon climate. The average annual temperature ranges from 8 to 12 °C, and the average annual precipitation is approximately 500 mm. However, there is significant spatial and temporal heterogeneity: 60–80% of the precipitation occurs during the flood season from June to August. The windward slope of the Yanshan Mountains forms a rainy zone, with annual precipitation ranging from 700 to 800 mm, while the northern plateau receives less than 400 mm [
27]. The region is drained by two major river systems: the Hai River and the Luan River. The Hai River system consists of the main stream and five major tributaries—North Canal, Yongding River, Daqing River, Ziya River, and South Canal. The Luan River system includes the Luan River and its neighboring rivers, which flow through cities such as Chengde and Tangshan. Additionally, lakes and wetlands are widely distributed in the region, including Baiyangdian and Hengshui Lake.
2.2. Evaluation Index System Construction
UFR refers to the comprehensive ability of urban systems to maintain essential functions and absorb impacts in the face of flooding disturbances. It also involves the capacity for dynamic adjustments and transformative development through the integration of subsystems such as natural, economic, social, and infrastructure systems.
The evolution of resilience can be viewed as a cyclical process. In
Figure 2, the city system represents the urban system’s structure and function under external disturbances. Disturbed refers to the impact of flood disasters, causing deviation from the original state. The original level is the normal operating state of the city without external disturbances, reflecting stability. The update level indicates the development achieved after resistance, recovery, and adaptability [
28].
Urban systems undergo three stages after being disturbed: resistance, recovery, and adaptability. These stages are demarcated based on the system’s temporal and functional responses to flood disturbances. Resistance is the foundational dimension of UFR, representing the ability of urban systems to withstand floods. It is characterized by the system’s ability to maintain structural stability and functional continuity, relying on infrastructure and flood control measures. This phase occurs immediately after the disturbance and is marked by a sharp deviation from the original level. Recovery refers to the urban system’s ability to return to a stable state after a disturbance. It emphasizes the restoration of socio-economic activities and the normalization of residents’ lives. This phase begins after the immediate impacts subside and focuses on regaining normal operational capacity, marked by a gradual return to the original level. Adaptability is the advanced dimension of UFR, referring to the urban system’s ability to reach a higher level through learning and adjustment to future risks. It involves urban planning, policy changes, and social and economic transformations to enhance long-term resilience. This phase is demarcated by the system’s ability to evolve beyond its original capacity, achieving an “update level.” These stages reflect the system’s capacity to absorb immediate impacts, restore functions, and adapt for future resilience. The decline represents resistance, the rise corresponds to recovery, and the upward trajectory signifies adaptability.
Based on existing research on urban resilience assessment and the analysis of resilience mechanisms [
15], and considering the four components of urban systems—natural, economic, social, and infrastructure—eighteen flood resilience indicators were ultimately selected [
29], as shown in
Table 1. All indicators are classified as positive (+), where a larger value is better, or negative (−), where a smaller value is better.
Resistance represents the ability of urban systems to withstand the impacts of flooding. Urban population density reflects the city’s basic carrying capacity and indicates the efficiency of spatial resource allocation. Higher density can lead to greater strain on infrastructure and resources during a flood, affecting resistance. Annual precipitation volume directly determines the intensity of flooding and provides a stable measure of long-term climate and rainfall patterns, reflecting the changing flood risk over time. Larger volumes of rainfall lead to more severe flooding, thereby reducing resistance to flood impacts. A higher green coverage rate improves the infiltration of rainwater and reduces surface runoff, enhancing the city’s resistance to flooding by allowing better water absorption. Steeper slopes allow for faster runoff, preventing the accumulation of large amounts of water in low-lying areas, thereby enhancing flood resistance. A higher density of drainage pipes improves the flood discharge effectiveness per unit area, directly increasing the city’s resistance to flooding by enhancing water flow management.
Recovery represents the ability of urban systems to return to a stable state. The urban registered unemployment rate reflects the level of human resources in the city. A lower unemployment rate indicates a stronger labor force, which directly accelerates post-disaster economic recovery by supporting the reconstruction of industries and services. The number of healthcare professionals indicates the capacity for medical support. A higher number of healthcare professionals ensures a timely and effective response to health emergencies, which is crucial for recovery after a disaster. The coverage of medical insurance reflects the inclusivity of the city’s social security system. A higher coverage rate reduces the financial burden on individuals after a disaster, contributing to a smoother recovery process. The proportion of vulnerable groups reflects the vulnerability of the urban community population structure. A higher proportion of vulnerable groups (such as the elderly and children) may slow recovery, as these groups require additional resources for rehabilitation. A balanced dependency ratio can help mitigate this impact by providing adequate labor support for recovery. The per capita disposable income of urban residents indicates individual economic levels. Higher disposable income allows individuals and families to recover faster by rebuilding their homes and businesses, thus contributing to the overall recovery process. The proportion of value added by the tertiary sector is an indicator of industrial upgrading. An economy dominated by the service sector is more flexible and can recover more quickly from disasters, as it is less dependent on physical infrastructure than industries in the primary and secondary sectors. Per capita urban road area serves as a basic indicator of road network capacity. 
Adequate road infrastructure is essential for ensuring accessibility during recovery, as it facilitates the movement of goods, services, and people for disaster relief. The sewage treatment rate indicates the municipal department’s ability to manage daily sewage and control the spread of pollution. An effective sewage treatment system is critical for maintaining public health and preventing disease outbreaks during recovery.
Adaptability represents the ability of urban systems to achieve sustainable development through learning and adjustment. Flood control and disaster reduction policies serve as institutional safeguards for the city’s response to flood risks, determining the systematization and foresight of urban flood management efforts. These policies not only shape flood risk management strategies but also influence the overall adaptive capacity of urban systems. Per capita park green space can effectively enhance surface water retention capacity. Larger areas of park green space improve the absorption of rainwater, mitigating the impact of heavy rainfall. The proportion of R&D expenditure to GDP is a key indicator of the city’s investment intensity in technological innovation. A higher proportion of R&D expenditure signifies a greater capacity for developing disaster response technologies. Per capita general public budget expenditure provides the material foundation for implementing emergency preparedness. Adequate public funding ensures that cities can implement effective flood risk management strategies, and improve the overall adaptive capacity of urban systems. The density of hydrological stations provides data support for early warning of flood disasters, enabling cities to adjust flood control strategies proactively. A higher density of stations ensures timely and accurate flood predictions, which help prevent passive responses due to delayed information and enhance the city’s adaptive capacity.
2.3. Data Sources
The data used in this study were sourced from the following channels: the China Urban Statistical Yearbook: annual data on urban population, infrastructure, and socio-economic factors were provided (A1, A2, A5, B4, B7, C2, C5, accessed on 5 March 2025). The China Meteorological Disasters Yearbook: data on meteorological events were provided (A3, accessed on 5 March 2025). The National Economic and Social Development Statistical Bulletin of each city: economic and social data for each city were provided, compiled annually (B1, B2, B3, B8, C1, C3, accessed on 11 March 2025). The National Bureau of Statistics, the Hebei Economic Yearbook (2012–2019) and the Hebei Statistical Yearbook (2020–2023): economic performance data for all cities were contained, published annually (B5, B6, C4, accessed on 11 March 2025). The Geospatial Data Platform: the Geospatial Data Cloud was used (A4, accessed on 11 March 2025). Missing data points were interpolated using linear interpolation based on adjacent years, as the missing values were isolated (i.e., data from non-contiguous years were missing).
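The gap-filling rule described above can be sketched as follows. This is a minimal illustration, assuming each indicator is stored as one yearly series per city with missing years marked as NaN; the function name and the sample values are hypothetical and are not taken from the study data.

```python
import numpy as np

def fill_isolated_gaps(years, values):
    """Fill NaN entries by linear interpolation between the nearest observed years."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    missing = np.isnan(values)
    filled = values.copy()
    # np.interp evaluates the piecewise-linear curve through the observed points
    filled[missing] = np.interp(years[missing], years[~missing], values[~missing])
    return filled

# Hypothetical series with one isolated gap in 2012 (illustrative values only)
years = [2011, 2012, 2013, 2014, 2015]
series = [10.0, np.nan, 14.0, 15.0, 17.0]
filled_series = fill_isolated_gaps(years, series)
```

Note that `np.interp` holds the nearest observed value constant beyond the first or last observation, so this sketch is only appropriate for interior gaps of the kind described in the text.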
2.4. CRITIC-Entropy Weight Combination Assignment Method
The CRITIC method is an objective weighting technique that allocates weights by analyzing the intensity and conflict between indicators. It uses the coefficient of variation to quantify the degree of variation in indicator data, while the Pearson correlation matrix captures the interdependencies between them. However, the traditional CRITIC model has two main limitations: first, it does not distinguish between the effects of positive and negative correlations on weight allocation; and second, it fails to consider the nonlinear dispersion of indicators, making it challenging to fully capture interactions between indicators in complex systems using only linear correlation.
To address these limitations, we integrate the Entropy Weight Method, which calculates weights based on the degree of data dispersion, using information entropy theory. Higher dispersion leads to lower entropy values, resulting in higher weights for more variable indicators.
Given the multi-source nature of data in UFR research, a single weighting method cannot fully account for both indicator interrelations and data dispersion. Therefore, we combine the CRITIC method with the Entropy Weight Method, creating a more comprehensive and effective weighting model.
2.4.1. CRITIC Weighting
CRITIC determines the information content of indicators based on their comparative strength and conflict.
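Step 1: Data standardization [36]. Let $x_{ij}$ denote the raw value of the jth indicator for the ith sample and $x'_{ij}$ its standardized value.
For a positive indicator:
$$x'_{ij} = \frac{x_{ij} - \min_i x_{ij}}{\max_i x_{ij} - \min_i x_{ij}} \tag{1}$$
For a negative indicator:
$$x'_{ij} = \frac{\max_i x_{ij} - x_{ij}}{\max_i x_{ij} - \min_i x_{ij}} \tag{2}$$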
Step 2: Calculation of the standard deviation for each indicator. For each standardized evaluation indicator, the standard deviation is calculated to represent the degree of fluctuation of the indicator. A larger standard deviation indicates greater variability in the indicator across the samples, implying richer information. The formula for calculating the standard deviation is:
$$\sigma_j = \sqrt{\frac{1}{m-1}\sum_{i=1}^{m}\left(x'_{ij} - \bar{x}'_j\right)^2}$$
where $x'_{ij}$ represents the standardized data, $\bar{x}'_j$ denotes the mean of the jth indicator, and $m$ is the number of samples.
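Step 3: Calculation of the correlation between indicators. For each pair of indicators j and k, the Pearson correlation coefficient is calculated as follows:
$$r_{jk} = \frac{\sum_{i=1}^{m}\left(x'_{ij} - \bar{x}'_j\right)\left(x'_{ik} - \bar{x}'_k\right)}{\sqrt{\sum_{i=1}^{m}\left(x'_{ij} - \bar{x}'_j\right)^2}\,\sqrt{\sum_{i=1}^{m}\left(x'_{ik} - \bar{x}'_k\right)^2}}$$
where $r_{jk} \in [-1, 1]$. The closer the value is to 1, the stronger the positive correlation between the two indicators. The closer the value is to −1, the stronger the negative correlation between the two indicators. The closer the value is to 0, the weaker the correlation.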
Step 4: Calculation of information content. For each indicator j, the information content $C_j$ is calculated as follows:
$$C_j = \sigma_j \sum_{k=1}^{n}\left(1 - r_{jk}\right)$$
where $\sigma_j$ is the standard deviation and $r_{jk}$ is the correlation between indicator j and the other indicators k.
Step 5: Calculation of weights.
$$w_j^{c} = \frac{C_j}{\sum_{j=1}^{n} C_j}$$
where $C_j$ is the information content of the jth indicator and $\sum_{j=1}^{n} C_j$ is the sum of the information content of all indicators.
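The five CRITIC steps can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: it assumes min-max standardization, the sample standard deviation (`ddof=1`), and the usual CRITIC information content $\sigma_j \sum_k (1 - r_{jk})$. The function names and the small data matrix are hypothetical.

```python
import numpy as np

def min_max_standardize(X, positive):
    """Scale each column to [0, 1]; reverse the direction of negative indicators."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = hi - lo  # assumes no indicator is constant across samples
    return np.where(positive, (X - lo) / span, (hi - X) / span)

def critic_weights(Z):
    """CRITIC weights from a standardized samples-by-indicators matrix."""
    sigma = Z.std(axis=0, ddof=1)        # contrast intensity (assumption: sample std)
    R = np.corrcoef(Z, rowvar=False)     # Pearson correlation between indicators
    conflict = (1.0 - R).sum(axis=0)     # conflict of each indicator with the others
    C = sigma * conflict                 # information content
    return C / C.sum()

# Hypothetical data: 4 samples, 3 indicators (the second is a negative indicator)
X = np.array([[120.0, 3.2, 0.41],
              [ 95.0, 4.1, 0.38],
              [150.0, 2.7, 0.52],
              [110.0, 3.9, 0.47]])
positive = np.array([True, False, True])

Z = min_max_standardize(X, positive)
w_critic = critic_weights(Z)
```

Because the diagonal of the correlation matrix is 1, the self-term contributes zero to the conflict sum, so no special handling of $k = j$ is needed.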
2.4.2. Entropy Weighting Method
The specific determination steps are as follows:
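Step 1: Standardize the indicators and calculate the proportion $p_{ij}$ of the ith city's value under the jth indicator:
$$p_{ij} = \frac{x'_{ij}}{\sum_{i=1}^{m} x'_{ij}}$$
where $i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$; n is the number of indicators and m is the number of cities.
Step 2: Calculate the entropy value of the jth indicator:
$$e_j = -k\sum_{i=1}^{m} p_{ij}\ln p_{ij}$$
where $k = 1/\ln m$, so that $e_j$ satisfies $0 \le e_j \le 1$ (with $p_{ij}\ln p_{ij}$ taken as 0 when $p_{ij} = 0$).
Step 3: Calculate the information entropy redundancy:
$$d_j = 1 - e_j$$
Step 4: Calculate the objective weight of each indicator:
$$w_j^{e} = \frac{d_j}{\sum_{j=1}^{n} d_j}$$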
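The entropy steps above can be sketched as follows, assuming the input is already min-max standardized and that $0 \cdot \ln 0$ is treated as 0 (a common convention that the text does not state explicitly). The function name and the toy matrix are hypothetical.

```python
import numpy as np

def entropy_weights(Z):
    """Entropy weights from a standardized samples-by-indicators matrix."""
    Z = np.asarray(Z, dtype=float)
    m = Z.shape[0]
    P = Z / Z.sum(axis=0)                 # share of each sample within an indicator
    logP = np.zeros_like(P)
    np.log(P, out=logP, where=P > 0)      # leave 0 where P == 0, so 0 * ln 0 -> 0
    e = -(P * logP).sum(axis=0) / np.log(m)   # entropy, scaled to [0, 1]
    d = 1.0 - e                           # redundancy (degree of divergence)
    return d / d.sum()

# Toy matrix: the first indicator is identical across samples (no information),
# the second is concentrated in one sample (maximum information).
Z_toy = np.array([[0.5, 1.0],
                  [0.5, 0.0],
                  [0.5, 0.0]])
w_entropy = entropy_weights(Z_toy)
```

In this toy case the uniform indicator has entropy 1 and receives essentially zero weight, which is the behaviour the method relies on: indicators that vary more across cities are weighted more heavily.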
2.4.3. Combination Weighting
Through the above calculation, the combination weights of UFR indicators can be obtained, as shown in
Table 2.
2.5. Multi-Attribute Decision-Making Methods
The VIKOR method is a trade-off ranking method, which ranks a limited number of decision alternatives by maximizing the group utility value and minimizing the individual regret value [
26]. It is particularly suitable for complex problems involving conflicting criteria. The method ranks alternatives based on their closeness to the ideal solution, and helps decision-makers find a balanced outcome.
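Let the set of evaluation objects be $A = \{A_1, A_2, \dots, A_m\}$ and the set of indicators be $B = \{B_1, B_2, \dots, B_n\}$. $x_{ij}$ represents the value of the jth indicator in the ith year, and $w_j$ is the weight of the jth indicator.
Use Equations (1)–(2) for data standardization, and then calculate the positive ideal solution $f_j^{+} = \max_i x'_{ij}$ and the negative ideal solution $f_j^{-} = \min_i x'_{ij}$.
Calculate the group utility value $S_i$ and the individual regret value $R_i$:
$$S_i = \sum_{j=1}^{n} w_j\,\frac{f_j^{+} - x'_{ij}}{f_j^{+} - f_j^{-}}, \qquad R_i = \max_j\left[w_j\,\frac{f_j^{+} - x'_{ij}}{f_j^{+} - f_j^{-}}\right]$$
Determine the compromise value $Q_i$ of the evaluation object:
$$Q_i = v\,\frac{S_i - S^{-}}{S^{+} - S^{-}} + (1 - v)\,\frac{R_i - R^{-}}{R^{+} - R^{-}}$$
where $S^{+} = \max_i S_i$ and $S^{-} = \min_i S_i$ are the maximum and minimum group utility values, and $R^{+} = \max_i R_i$, $R^{-} = \min_i R_i$ are the maximum and minimum individual regret values, respectively. $v$ is the coefficient of the decision-making mechanism, which is generally taken as 0.5. $Q_i$ takes values in [0, 1], and the smaller its value, the better.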
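A compact sketch of the VIKOR calculation is given below. It assumes the standardized matrix and the indicator weights are already available, and it uses the common form in which $S_i$ is the weighted sum and $R_i$ the weighted maximum of the normalized distances to the positive ideal solution. The function name and the three-alternative example are hypothetical.

```python
import numpy as np

def vikor(Z, w, v=0.5):
    """Return group utility S, individual regret R, and compromise value Q."""
    Z = np.asarray(Z, dtype=float)
    w = np.asarray(w, dtype=float)
    f_plus, f_minus = Z.max(axis=0), Z.min(axis=0)   # ideal and anti-ideal values
    gap = w * (f_plus - Z) / (f_plus - f_minus)      # weighted distance to the ideal
    S = gap.sum(axis=1)                              # group utility
    R = gap.max(axis=1)                              # individual regret
    Q = (v * (S - S.min()) / (S.max() - S.min())
         + (1 - v) * (R - R.min()) / (R.max() - R.min()))
    return S, R, Q

# Three hypothetical alternatives scored on two equally weighted indicators
Z_alt = np.array([[1.0, 1.0],    # best on both indicators
                  [0.0, 0.0],    # worst on both indicators
                  [0.5, 0.5]])   # halfway
S, R, Q = vikor(Z_alt, w=[0.5, 0.5])
```

Here the alternative that attains the ideal value on every indicator gets $Q = 0$ and the one that is worst on every indicator gets $Q = 1$, matching the statement that smaller values of $Q$ are better.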
2.6. Spatial Autocorrelation
2.6.1. Global Spatial Autocorrelation Analysis
Using ArcGIS 10.8, the global and local spatial autocorrelation indices were applied to systematically analyze the aggregation and heterogeneity characteristics of the spatial pattern of UFR [37,38].
The spatial dependence of UFR across the spatial units of the BTH region is quantified with the global Moran’s I, computed from a spatial weight matrix. The value of Moran’s I ranges from −1 to 1. When the p-value is significant and Moran’s I > 0, there is a positive correlation between regions and spatial agglomeration exists; the larger the value, the more obvious the spatial correlation. Conversely, when the p-value is significant and Moran’s I < 0, there is a negative correlation between regions, and a dispersed spatial pattern is presented. When Moran’s I = 0, there is no spatial correlation in the distribution of UFR. The formula is as follows:
$$I = \frac{n}{S_0}\cdot\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\left(x_i - \bar{x}\right)\left(x_j - \bar{x}\right)}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$
where n is the number of cities, $x_i$ and $x_j$ are the UFR values of the ith and jth cities, $\bar{x}$ is the average value of UFR, $w_{ij}$ is the element of the spatial weight matrix, and $S_0 = \sum_{i}\sum_{j} w_{ij}$ is the sum of all elements in the spatial weight matrix. Considering the geographical characteristics of the region, the K-nearest neighbor method is chosen to construct the adjacency matrix.
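The global index can be illustrated with a small NumPy sketch. It assumes binary K-nearest-neighbor weights built from point coordinates and no row standardization; the study itself used ArcGIS, so this is only an illustration of the formula, and the coordinates, values, and function names are hypothetical.

```python
import numpy as np

def knn_weights(coords, k):
    """Binary spatial weights: w[i, j] = 1 if j is among the k nearest neighbours of i."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)                # a unit is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros_like(dist)
    rows = np.repeat(np.arange(len(coords)), k)
    W[rows, nearest.ravel()] = 1.0
    return W

def global_morans_i(x, W):
    """Global Moran's I for values x under spatial weight matrix W."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

# Four hypothetical units forming two tight pairs along a line
coords = [[0, 0], [1, 0], [3, 0], [4, 0]]
W_knn = knn_weights(coords, k=1)

I_cluster = global_morans_i([1, 1, 5, 5], W_knn)   # neighbours share similar values
I_alt = global_morans_i([1, 5, 1, 5], W_knn)       # neighbours have opposite values
```

The two toy value patterns give the extreme cases of the index, +1 for perfectly clustered values and −1 for perfectly alternating values; significance testing (the p-value mentioned above) is not covered by this sketch.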
2.6.2. Local Spatial Autocorrelation Analysis
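Local spatial autocorrelation focuses on the similarity of attributes between a geographic unit and its neighboring units, and Local Moran’s I quantifies the contribution of each local spatial structure to the global pattern. It is calculated as follows:
$$I_i = \frac{x_i - \bar{x}}{S^2}\sum_{j \neq i} w_{ij}\left(x_j - \bar{x}\right)$$
where $x_i$ and $x_j$ are the UFR values of the ith and jth cities, $\bar{x}$ is the average value of UFR, $w_{ij}$ is the spatial weight between cities i and j, and $S^2$ is the variance of the UFR values.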
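A minimal sketch of the local index follows, assuming $S^2$ is the population variance of the UFR values (software packages differ on this denominator) and using a small hand-written binary weight matrix. All values and names are hypothetical.

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I for each unit, using the population variance (an assumption)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    s2 = (z @ z) / len(x)
    return z * (W @ z) / s2

# Hypothetical weights: units 0 and 1 are neighbours, units 2 and 3 are neighbours
W_pairs = np.array([[0.0, 1.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0]])
x_ufr = np.array([1.0, 1.0, 5.0, 5.0])   # low-low pair and high-high pair

I_local = local_morans_i(x_ufr, W_pairs)
```

Under this variance convention the local values sum to the global Moran’s I multiplied by the total weight $S_0$, which is one way to read the statement that the local index measures each unit's contribution to the global pattern. Positive local values mark low-low or high-high clusters, as in this toy case.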
2.7. XGBoost-SHAP Model
XGBoost is an advanced ensemble learning method based on gradient boosting decision trees, widely used in machine learning tasks such as regression and classification. The core idea of XGBoost is to iteratively fit the residuals of the current model at each round to continuously optimize the objective function. At the same time, it introduces a regularization term to control model complexity, preventing overfitting, and ensuring that the model maintains good stability and robustness.
The objective function of the XGBoost model consists of two components: the training error and the regularization term, typically expressed as follows:
$$\mathrm{Obj} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right)$$
Here, Obj represents the optimization objective function, $l$ refers to the loss function, $y_i$ and $\hat{y}_i$ correspond to the true values and predicted values, respectively, and $\Omega(f_k)$ denotes the regularization term of the k-th tree, which is used to control the model’s complexity. XGBoost optimizes the objective function using both first-order and second-order gradient information. Specifically, in each iteration, the model approximates the objective function with a second-order Taylor expansion, which accelerates convergence and thereby improves prediction accuracy.
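As a worked illustration of the second-order approximation mentioned above, the standard textbook form for boosting round $t$ is shown below. Here $l$ is the loss, $y_i$ the true value, $\hat{y}_i^{(t-1)}$ the prediction before the new tree $f_t$ is added, and $\Omega$ the regularization term. The symbols $g_i$, $h_i$, $T$, $w_j$, $\lambda$, and $\gamma$ are conventional notation assumed here rather than taken from this study, and only an L2 penalty is shown for brevity.

$$\mathrm{Obj}^{(t)} \approx \sum_{i}\left[g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2\right] + \Omega(f_t) + \text{const}, \qquad g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}}, \quad h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}$$

Assuming $\Omega(f_t) = \gamma T + \tfrac{1}{2}\lambda\sum_{j=1}^{T} w_j^2$ for a tree with $T$ leaves and leaf weights $w_j$, and writing $G_j$ and $H_j$ for the sums of $g_i$ and $h_i$ over the samples falling in leaf $j$, minimizing the approximation leaf by leaf gives

$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^{*} = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^2}{H_j + \lambda} + \gamma T$$

For the squared-error loss $l = \tfrac{1}{2}(y_i - \hat{y}_i)^2$, $g_i = \hat{y}_i^{(t-1)} - y_i$ and $h_i = 1$, so each leaf weight is the mean residual of its samples shrunk by $\lambda$. This is how the regularization term limits the size of each correction.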
Although XGBoost can predict the overall impact of indicators on UFR, it is difficult to interpret how these changes affect the outcomes. To address this limitation, SHAP values are introduced. By analyzing marginal contributions, SHAP enhances the interpretability of the model, enabling a deeper analysis of the contribution of influencing indicators.
SHAP, based on the Shapley value theory from cooperative game theory, is an explanation method that measures the importance of each feature to the predictions of a machine learning model. For a feature $i$ in the feature set $F$, the SHAP value is calculated as follows:
$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!}\left[f\left(S \cup \{i\}\right) - f(S)\right]$$
where $F$ represents the set of all features; $S$ is any subset of features that does not include feature $i$; $|S|$ is the number of features in set $S$; $f(S)$ denotes the contribution of feature set $S$ to the model’s prediction output; and $f(S \cup \{i\})$ represents the contribution of the feature set $S \cup \{i\}$, which includes feature $i$, to the model’s prediction output.
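The Shapley formula can be evaluated exactly for a small feature set by enumerating every subset. The sketch below does this for a hypothetical three-feature value function. The feature names and payoffs are invented for illustration, and practical SHAP implementations for tree models use much faster algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values by enumerating all subsets that exclude each feature."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            # weight |S|! (|F| - |S| - 1)! / |F|! for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                S = frozenset(subset)
                total += weight * (value(S | {i}) - value(S))
        phi[i] = total
    return phi

def toy_value(S):
    """Hypothetical model output for a feature subset S (not from the study)."""
    base = 1.0 * ("slope" in S) + 2.0 * ("stations" in S)
    interaction = 1.0 if {"slope", "stations"} <= S else 0.0
    return base + interaction   # "rainfall" never changes the output

features = ["slope", "stations", "rainfall"]
phi = shapley_values(features, toy_value)
```

In this toy game the interaction payoff of 1.0 is split equally between the two interacting features, the irrelevant feature receives zero, and the values sum to the difference between the full-set and empty-set outputs. This additive property is what allows SHAP values to be read as per-feature contributions to a prediction.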
4. Discussion
4.1. Analysis of Driving Indicators of UFR
To further investigate the main driving indicators and mechanisms of UFR in the BTH region, the XGBoost algorithm combined with the SHAP explanation mechanism is employed to assess the impact of each driving indicator.
4.1.1. XGBoost Model Results
To justify the selection of the XGBoost-SHAP framework, this study not only discusses its theoretical advantages in handling nonlinear and interactive relationships but also compares its predictive performance with other commonly used machine learning models.
The UFR levels of cities in the BTH region were used as output labels, with 18 secondary indicators as input features. The dataset includes data from 13 cities in the BTH region from 2011 to 2022. After data preprocessing, the dataset was divided into training and testing sets at an 80:20 ratio. This ratio is a commonly used practice in machine learning, ensuring that the training set is sufficiently large to train the model, while the test set remains large enough to provide a reliable evaluation. To assess the appropriateness of this ratio, we tested other splits, such as 70:30 and 60:40, and found that the model’s performance remained consistent across these splits. The model was tuned using Bayesian optimization and five-fold cross-validation to ensure that it did not overfit the training data. Additionally, the performance of the model with optimized hyperparameters was tested on the validation set, which validated the acceptability of the optimized hyperparameter values.
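The data partitioning described above can be sketched with NumPy alone. This sketch assumes one sample per city-year (13 cities × 12 years), a random shuffle, and five folds taken from the training portion; the text does not state how samples were shuffled or grouped, so the seed and the fold construction are hypothetical choices.

```python
import numpy as np

n_cities, n_years = 13, 12              # 13 cities, 2011 to 2022
n_samples = n_cities * n_years          # assumed: one record per city-year

rng = np.random.default_rng(seed=0)     # hypothetical seed, for reproducibility only
order = rng.permutation(n_samples)

n_train = int(round(0.8 * n_samples))   # 80:20 split
train_idx, test_idx = order[:n_train], order[n_train:]

# Five roughly equal validation folds carved out of the training indices
folds = np.array_split(train_idx, 5)
cv_pairs = []
for k in range(5):
    val_idx = folds[k]
    fit_idx = np.concatenate([folds[j] for j in range(5) if j != k])
    cv_pairs.append((fit_idx, val_idx))
```

With panel data of this kind, a purely random split places different years of the same city in both sets; whether that is acceptable depends on the intended use of the model, and a split grouped by city or by year would be a stricter alternative.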
Figure 6 shows a comparison between the model’s predicted UFR values and the actual values. The blue points represent the predicted and actual values for each city, while the red dashed line represents the ideal perfect prediction line. Most of the points are close to the diagonal, indicating that the model’s predictions are generally accurate, though a few points deviate from the line, suggesting prediction errors in certain cities or years. Overall, the model can accurately predict UFR values in most cases.
To ensure the model’s generalizability, overfitting is controlled by applying regularization (L1 = 0.27, L2 = 0.06) and early stopping. This study compares the performance of the Random Forest, GBDT, and XGBoost models. The test results indicate that the XGBoost model outperforms both the GBDT and Random Forest models on all evaluation metrics, including $R^2$ (coefficient of determination), MAE (Mean Absolute Error), MSE (Mean Squared Error), and RMSE (Root Mean Squared Error), as shown in Table 4. Notably, the $R^2$ value on the test set is approximately 0.9618, demonstrating the excellent predictive performance of the model. Therefore, XGBoost was selected as the final predictive model, and the SHAP framework was further employed to interpret feature contributions and driving mechanisms of UFR. The optimal parameter combination was ultimately determined to be: n_estimators = 1014, max_depth = 5, subsample = 0.8, colsample_bytree = 0.8, and learning_rate = 0.02.
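For reference, the four evaluation metrics named above (the coefficient of determination, MAE, MSE, and RMSE) can be computed as in the following sketch. The predicted and actual values are made-up numbers chosen for easy checking, not results from the study.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MSE, and RMSE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(err)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
    }

# Illustrative values only
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.6, 0.8]
metrics = regression_metrics(y_true, y_pred)
```

A higher coefficient of determination is better, whereas lower values are better for the three error measures, so a model comparison of the kind reported in Table 4 should be read with that difference in mind.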
4.1.2. Analysis of Driving Indicators Based on the SHAP Model
Figure 7 shows the SHAP feature importance plot, which indicates the contribution of each feature to the model’s predictions. Feature importance is determined by calculating the mean absolute SHAP value of each feature across all samples. Features with higher importance contribute more to the model’s predictions, and this ranking helps identify the key indicators driving UFR.
As shown in
Figure 7, six indicators—average urban slope, hydrological station density, per capita park green space area, urban population density, R&D expenditure as a percentage of GDP, and green coverage ratio in built-up areas—have contributions exceeding 0.02. This suggests that these indicators have a high level of impact on UFR, making them critical drivers of UFR in the region.
The indicators related to flood prevention and disaster mitigation policies, tertiary sector value added as a share of GDP, health technicians, per capita public budget expenditure, per capita disposable income, and drainage pipeline density contribute between 0.005 and 0.02 to the average SHAP value, indicating a moderate effect on UFR. Indicators such as wastewater treatment rate, health insurance coverage, urban unemployment rate, urban road area, vulnerable community groups, and precipitation contribute less than 0.005 to the average SHAP value, indicating a relatively weak impact on UFR.
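The grouping of indicators into high, moderate, and weak drivers can be sketched as below, assuming importance is the mean absolute SHAP value per feature (as in the conventional SHAP summary bar plot) and using the 0.02 and 0.005 cut-offs from the text. The SHAP matrix and the function name are hypothetical.

```python
import numpy as np

def importance_tiers(shap_values, names, high=0.02, low=0.005):
    """Rank features by mean |SHAP| and label them with the thresholds from the text."""
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    tiers = {}
    for name, score in zip(names, mean_abs):
        if score > high:
            tiers[name] = "high"
        elif score >= low:
            tiers[name] = "moderate"
        else:
            tiers[name] = "weak"
    return mean_abs, tiers

# Hypothetical SHAP matrix: 2 samples x 3 features (values are illustrative only)
shap_matrix = [[0.03, -0.01, 0.001],
               [-0.05, 0.01, -0.003]]
names = ["urban_slope", "drainage_density", "precipitation"]
mean_abs, tiers = importance_tiers(shap_matrix, names)
```

Taking absolute values before averaging matters: a feature whose positive and negative contributions cancel out would otherwise appear unimportant even when it strongly affects individual predictions.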
Figure 8 shows the positive and negative impacts of each feature on the prediction results at different value levels. The horizontal axis represents the SHAP values, with each point representing a sample; the color gradient from blue to red indicates the feature values ranging from low to high.
The following conclusions can be drawn from Figure 8. Urban slope exhibits a clear positive driving effect: higher values are mainly concentrated in the positive SHAP value range, while lower values are predominantly found in the negative range. This suggests that an increase in urban slope significantly enhances resilience, while a decrease suppresses it. Hydrological station density, per capita park green space area, urban population density, R&D expenditure as a percentage of GDP, and green coverage ratio show a similar distribution pattern to urban slope, making them key drivers of resilience improvement. Flood prevention and disaster mitigation policies, tertiary-sector value added as a share of GDP, health technicians, per capita public budget expenditure, per capita disposable income, drainage pipeline density, and the proportion of vulnerable groups also primarily exhibit higher values in the positive range and lower values in the negative range. However, their points lie closer to zero, indicating a relatively smaller impact and making them secondary features that contribute positively to resilience. For the urban registered unemployment rate, high-value samples are mostly distributed in the negative SHAP value range, while low-value samples are biased toward the positive range. This suggests that an increase in the unemployment rate reduces resilience, while a decrease helps enhance it. Wastewater treatment rate, health insurance coverage, per capita urban road area, and precipitation are distributed on both sides of zero, with high-value and low-value samples scattered across the positive and negative ranges, showing no clear positive or negative feedback effects.
In summary, Figure 8 not only clarifies the direction and nonlinear impact of each feature but also aligns with the importance ranking from Figure 7. From a policy perspective, resources should be allocated toward the key positive indicators, particularly those with higher SHAP values, such as urban slope and green coverage. Furthermore, targeted management strategies should address constraint indicators, such as the unemployment rate, to improve overall resilience.
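As an illustration of how the visual reading of a SHAP summary plot could be approximated numerically, the following minimal sketch classifies a feature as positive, negative, or mixed from the correlation between its values and its SHAP values, and reports the mean absolute SHAP value as a rough measure of importance. It is not the procedure used in this study: the function names, the 0.5 threshold, and the toy numbers are hypothetical and serve only to show the logic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def summarize_feature(values, shap_values, threshold=0.5):
    """Summarize one feature's SHAP pattern.

    direction: 'positive' if high feature values go with high SHAP values,
               'negative' if they go with low SHAP values, else 'mixed'.
    The threshold is an illustrative assumption, not a standard cutoff.
    """
    r = pearson(values, shap_values)
    mean_abs = sum(abs(s) for s in shap_values) / len(shap_values)
    if r >= threshold:
        direction = "positive"
    elif r <= -threshold:
        direction = "negative"
    else:
        direction = "mixed"
    return {"direction": direction, "correlation": r, "mean_abs_shap": mean_abs}


# Hypothetical toy data: five samples per feature (not taken from the study).
feature_levels = [1, 2, 3, 4, 5]
toy_shap = {
    "urban_slope": [-0.20, -0.10, 0.00, 0.10, 0.20],
    "unemployment_rate": [0.05, 0.02, 0.00, -0.02, -0.05],
    "precipitation": [0.03, -0.03, 0.00, 0.03, -0.03],
}

for name, shap_vals in toy_shap.items():
    print(name, summarize_feature(feature_levels, shap_vals))
```

A correlation-based summary of this kind only captures monotonic tendencies; strongly nonlinear patterns would still need to be inspected in the plot itself.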
4.1.3. Strategies for Improving UFR
Building upon the findings presented in Section 4.1.2, future urban planning should focus on developing indicators related to resistance and adaptability to maximize UFR. Key recommendations include:
Managing population growth: Areas with higher population density tend to have lower UFR, highlighting the importance of controlling urban sprawl and maintaining sustainable population levels.
Integrating ecological considerations: Urban areas with more green infrastructure, such as parks and vegetation, show better flood resilience. Increasing green infrastructure, enhancing soil permeability, and reducing surface runoff through permeable surfaces and green spaces are essential steps.
Strengthening monitoring and management: Cities with robust hydrological systems have better flood response capabilities. Expanding hydrological stations, improving flood forecasting, and enhancing data collection are critical for strengthening adaptive capacity.
Increasing R&D funding: Technological advancements in flood risk management are crucial for developing better flood-resistant infrastructure. Investing in R&D will support this progress.
Implementing these strategies will help cities adapt to urban and environmental challenges, as demonstrated by the flood adaptability planning in Seoul’s Gangnam District, where green infrastructure plays a key role in sustainable urban design and planning [40].
4.2. Advantages of the Proposed Method
Firstly, a UFR index system covering four dimensions was established through a review of relevant literature. This system integrates indicators from natural, economic, social, and infrastructure aspects, providing a comprehensive research framework. Additionally, the evolution of resistance, recovery, and adaptability aligns with the cyclical nature of urban systems.
Furthermore, compared to traditional methods such as the obstacle degree model, the XGBoost-SHAP method combines the efficiency of machine learning with strong interpretability. This approach not only improves prediction accuracy when handling complex urban flood resilience drivers but also effectively reveals the quantitative contributions and directional impacts of each factor.
4.3. Limitations and Future Research Directions
This study provides valuable insights into UFR in the BTH region but acknowledges the limitations of the multi-method coupling approach. The combination of CRITIC-Entropy, VIKOR, and XGBoost-SHAP offers a comprehensive framework for UFR assessment, but each method has its own assumptions and limitations. For instance, CRITIC, based on linear correlations, may not capture the complex nonlinear relationships between indicators. Although VIKOR and entropy weighting improve upon this, they remain influenced by data distribution, limiting their ability to account for dynamic interactions across resilience dimensions. Therefore, while the multi-method approach is effective, its limitations in capturing UFR dynamics should be considered.
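The limitation of CRITIC noted above can be illustrated with a small sketch. It assumes the common textbook formulation of CRITIC, in which the information content of indicator j is its standard deviation multiplied by the sum of (1 − r_jk) over all indicators k, with r_jk the Pearson correlation. This may differ in detail from the normalization used in this study, and the toy matrix is hypothetical. The second indicator is an exact quadratic function of the first, yet its Pearson correlation with the first is zero, so CRITIC treats it as fully independent information.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def critic_weights(matrix):
    """CRITIC weights for a matrix of normalized indicators.

    matrix: list of rows (samples); each row holds indicator values in [0, 1].
    Assumed formulation: C_j = std_j * sum_k (1 - r_jk), w_j = C_j / sum(C).
    """
    cols = list(zip(*matrix))
    n = len(matrix)
    info = []
    for cj in cols:
        mean = sum(cj) / n
        std = sqrt(sum((x - mean) ** 2 for x in cj) / (n - 1))  # sample std
        conflict = sum(1 - pearson(cj, ck) for ck in cols)
        info.append(std * conflict)
    total = sum(info)
    return [c / total for c in info]


# Hypothetical indicators: x, y = (2x - 1)^2 (nonlinear in x), and z = x.
x = [0.0, 0.25, 0.5, 0.75, 1.0]
y = [(2 * v - 1) ** 2 for v in x]
z = list(x)
toy_matrix = [list(row) for row in zip(x, y, z)]

weights = critic_weights(toy_matrix)
print("r(x, y) =", pearson(x, y))
print("weights =", weights)
```

In this toy case the nonlinear indicator receives the largest weight even though it carries no information beyond the first indicator, which is the kind of effect a purely linear correlation measure cannot detect.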
Additionally, the absence of explicit flood modeling is a key limitation. While non-hydrological data provide valuable insights, integrating hydrological simulations or watershed models would offer a more accurate representation of flood risks, especially during extreme weather events. Future research could explore integrating flood modeling to enhance the reliability of UFR assessments.
The data used in this study also has limitations, particularly in terms of adaptability. Adaptability, representing the ability of urban systems to achieve sustainable development through learning and adjustment, was not fully captured due to data constraints, especially socio-institutional factors such as community preparedness and governance quality. Future studies should incorporate these factors. Furthermore, rainfall variability presents challenges for flood resilience assessments. More refined rainfall metrics, such as IDF curves or maximum hourly rainfall, should be considered to reflect flood risks more accurately.
Meanwhile, future research could enhance UFR assessments by conducting quantitative validation using actual flood loss data, helping to verify the effectiveness of the current model. Sensitivity and uncertainty analyses would also reveal the impact of different variables and methods on the results, leading to more reliable outcomes.
Finally, flood risks often extend across administrative boundaries. Using administrative boundaries for flood resilience assessment may lead to inaccurate results. Future research should explore ways to overcome these constraints and promote data sharing and coordinated planning across regions.
4.4. Validation of MCDM Models
To check the robustness and reliability of the MCDM model used in this study, we conducted a validation that compared two widely used methods, VIKOR and TOPSIS, in terms of their ability to rank cities based on UFR.
We calculated Kendall’s Tau between the rankings derived from VIKOR and TOPSIS across the evaluated cities. The coefficient of 0.4862 indicates a moderate positive correlation between the two rankings, and the p-value of 2.11 × 10⁻¹⁹ confirms that this correlation is statistically significant, supporting the consistency of the two methods.
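For reference, Kendall’s Tau between two rankings can be obtained by counting concordant and discordant pairs. The following is a minimal sketch of the tau-a variant, which assumes no tied ranks; the two example rankings are invented for illustration and are not the study's data. In practice, a library routine such as `scipy.stats.kendalltau` would also handle ties and report a p-value.

```python
from itertools import combinations


def kendall_tau_a(rank_a, rank_b):
    """Kendall's tau-a for two equal-length rankings without ties.

    A pair (i, j) is concordant if both rankings order i and j the same way,
    and discordant if they order them in opposite ways.
    """
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have the same length")
    n = len(rank_a)
    if n < 2:
        raise ValueError("at least two items are required")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings of five cities under two methods (1 = most resilient).
vikor_rank = [1, 2, 3, 4, 5]
topsis_rank = [2, 1, 3, 5, 4]

tau = kendall_tau_a(vikor_rank, topsis_rank)
print("Kendall tau-a:", tau)
```

Here two of the ten city pairs are ordered differently by the two methods, so tau equals (8 − 2) / 10 = 0.6; a value of 1 would indicate identical rankings and −1 fully reversed rankings.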
The line charts for all cities from 2011 to 2022 (Figure 9) show that the VIKOR curves span a wider interval, exhibit stronger inter-annual monotonicity, and do not intersect between cities, whereas the TOPSIS curves are difficult to distinguish from one another. VIKOR therefore demonstrates higher sensitivity and is better at capturing fluctuations in UFR.
5. Conclusions
This study proposed an evaluation framework for UFR and assessed the spatiotemporal evolution and attribution of UFR in the BTH region from 2011 to 2022 across the three dimensions of resistance, recovery, and adaptability.
The results reveal that the overall UFR in the BTH region exhibited a fluctuating but upward trajectory, indicating gradual improvements in regional resilience capacity. Beijing consistently maintained the highest resilience level, with all of its resilience values above the critical threshold of 0.7349, which can be largely attributed to its geographic and socio-economic advantages. Spatially, a distinct gradient pattern was observed, characterized by higher resilience in the northwest and lower resilience in the southeast. Notably, the number of cities classified as low or very low resilience declined over time, while the disparity in resilience levels across cities narrowed under the influence of regional coordinated development policies. Spatial autocorrelation analysis further revealed clustering effects, with high-value clusters centered around Beijing and low-value clusters concentrated in Tianjin.
The decomposition analysis highlights that indicators related to resistance and adaptability, such as urban slope, hydrological station density, per capita urban park area, urban population density, the proportion of R&D expenditure to GDP, and green coverage in built-up areas, were the primary drivers of resilience enhancement during the study period. In contrast, the urban registered unemployment rate was found to have a marginal inhibitory effect, hindering the overall improvement in UFR.
Given the hydrological complexity of the BTH region, the findings of this study are primarily applicable to this region but could also be extended to other cities facing similar challenges, such as those with high population density and rapid urbanization. These results offer valuable guidance for urban planning and policy-making, helping cities better prepare for future flood risks.
Collectively, these findings provide a scientific basis for guiding policy formulation and urban planning aimed at strengthening urban flood resilience and ensuring sustainable urban development in the face of increasing hydrometeorological risks.