A Multi-Source Geospatial Framework for the Evaluation of Urban Flood Resilience Under Extreme Rainfall: Evidence from Chongqing, China

Yang, Tao; Yun, Yingxia; Tang, Fengliang; Zheng, Xiaolei

doi:10.3390/w18091067

¹ School of Architecture, Tianjin University, Tianjin 300072, China

² Department of Intelligent Construction, Chongqing Jianzhu College, Chongqing 400072, China

* Author to whom correspondence should be addressed.

Water 2026, 18(9), 1067; https://doi.org/10.3390/w18091067

Submission received: 19 March 2026 / Revised: 24 April 2026 / Accepted: 28 April 2026 / Published: 29 April 2026

(This article belongs to the Section Urban Water Management)

Abstract

Mountainous megacities face a distinctive form of pluvial waterlogging in which terrain-controlled flow convergence, accelerating imperviousness, and aging drainage interact to produce chronic, spatially clustered failures rather than stochastic events. Existing approaches, namely hydrodynamic modeling, data-driven machine learning, and multi-criteria composite indexing, each carry distinctive failure modes at the municipal scale. This study develops and externally validates a city-wide, grid-based assessment framework for Chongqing, China, through three integrated choices. First, resilience is reformulated as a stabilized adaptation-to-risk ratio and subjected to an explicit falsification test against independent waterlogging observations. Second, multi-source hydroclimatic, topographic–hydrologic, land-cover, and service-accessibility indicators are integrated on a 500 m fishnet (22,500 cells) through within-component CRITIC–Entropy weighting and TOPSIS, with robustness diagnosed by a 500-iteration Monte Carlo weight-perturbation analysis. Third, a spatially grouped LightGBM classifier with SHAP interpretation serves both as an independent validation layer and as a mechanistic lens on non-linear driver thresholds. The composite risk surface achieves ROC-AUC values of 0.834 and 0.873 against two independent waterlogging registries, is strongly spatially clustered (Moran’s I = 0.81, p < 0.001), and preserves its ranking under aggressive weight perturbation (Spearman ρ ≥ 0.95 in 95% of scenarios). A counterintuitive finding emerges from the falsification test: resilience yields ROC-AUC below 0.5 on both point sets, indicating that accessibility-based capacity proxies systematically capture urban centrality rather than drainage robustness, a diagnosable measurement problem affecting the wider resilience-index literature. LightGBM concentrates 88.0% of waterlogging cells within the top 10% of scored grids, and SHAP-derived thresholds align with the saturation-ponding, well-drained, and convergence–hotspot regimes of classical hydrology. Together, these results reframe waterlogging assessment in complex terrain from a cartographic exercise into a falsifiable, resource-aware prioritization framework, and clarify why capacity maps and risk maps should be published as complementary instruments of flood governance.

Keywords: urban pluvial waterlogging; flood risk index; urban resilience; CRITIC–Entropy–TOPSIS; LightGBM model; SHAP interpretation; Chongqing

1. Introduction

Urban flooding, often reported as waterlogging, is increasingly a routine hazard rather than an exceptional event. Three long-running processes are colliding: intensifying short-duration rainfall, rapid urban expansion that hardens land surfaces, and drainage systems that are difficult to upgrade at the pace of development. From a climate perspective, IPCC assessments [1] indicate that heavy precipitation is expected to intensify with warming, and the signal is particularly relevant at shorter durations that place immediate stress on stormwater systems. In parallel, urbanization increases imperviousness and accelerates runoff generation, while the spatial concentration of people and assets amplifies consequences even when inundation depths are modest. This combination makes pluvial flooding a central concern for cities pursuing both safety and functional continuity [2].

The problem is especially sharp in mountainous megacities. Complex terrain can accelerate runoff, funnel flow into valley bottoms and transport corridors, and create localized convergence zones where conveyance becomes fragile [3]. In cities like Chongqing, steep slopes, dissected landforms, and dense built-up districts form a coupled natural–built drainage system in which small bottlenecks can repeatedly produce waterlogging [4,5]. Empirically, recent work in Chongqing has highlighted how urbanization processes and land-surface change can alter flood exposure and flood-prone patterns, reinforcing the need for approaches that are both spatially explicit and scalable to the metropolitan level [4]. Yet, city-scale decision-making often must proceed without full access to detailed pipe-network data, maintenance logs, pump operations, or event-resolved hydraulic boundary conditions.

Methodologically, three traditions dominate contemporary urban flood assessment, and each carries a distinctive failure mode when transferred to terrain-constrained megacities. Process-based 1D/2D hydrodynamic modeling represents flow routing and inundation physics with high fidelity when detailed pipe-network data and calibration inputs are available [6,7]. Its principal constraint at the municipal scale is informational: up-to-date pipe geometries, inlet densities, and pump-station records are rarely public, and model recalibration typically lags infrastructure retrofitting. Data-driven machine learning using reported waterlogging records as labels [8,9] is scalable and flexible but susceptible to label sparsity, reporting-density bias, severe class imbalance, spatial autocorrelation that inflates naïvely cross-validated performance [10], and opacity unless paired with post hoc interpreters. Multi-criteria composite indexing (MCDA) is data-light and directly communicable to planners, but its characteristic failure mode is a drift into visually compelling maps that are never confronted with independent ground truth and whose implicit weighting is driven by statistical dispersion rather than process importance [11]. None of the three traditions alone resolves the joint constraints that mountainous megacities impose.

Recent work has therefore moved toward hybrid frameworks that couple elements of two or three traditions. Physics-guided machine learning uses hydrodynamic simulation outputs as training labels for surrogate classifiers [7]; hydrodynamic–AHP–ML coupling integrates process-based flood surfaces with expert-weighted exposure layers and tree-based models [12]; Gaussian-process surrogate models learn over low-fidelity hydrodynamic solvers to enable rapid climate-scenario exploration [13]; and deep-learning super-resolution recovers sub-block-scale inundation from coarse-grid simulation [14]. While conceptually compelling, these strategies still rest on calibrated hydrodynamic priors and detailed pipe-network information that are not publicly available city-wide for Chongqing. The remaining practical gap is therefore a municipality-scale, independently validated, explainable framework that (i) does not require pipe-network data; (ii) quantifies rather than assumes the relationship between capacity proxies and operational waterlogging performance; and (iii) converts ranking outputs into targetable engineering priorities.

Between these poles, composite index approaches remain widely used because they offer a pragmatic mechanism for screening and zoning in data-limited contexts. The core logic is straightforward: flood risk emerges from the joint configuration of hazard predisposition, exposure, and vulnerability or sensitivity. Such frameworks align with mainstream disaster-risk thinking and have been operationalized in many urban studies by integrating precipitation indicators, terrain controls, land-cover conditions, and exposure proxies. The question becomes not whether to integrate indicators, but how to weight them transparently and aggregate them consistently. Multi-criteria decision analysis (MCDA) provides a common toolkit for this purpose, including objective weighting schemes such as CRITIC (which uses contrast and inter-criterion conflict) and aggregation methods such as TOPSIS (which ranks alternatives by their distance to an ideal solution) [11]. These methods are attractive because they can be implemented with heterogeneous datasets, are computationally light, and generate interpretable spatial products.

However, mapping risk is only half of the governance problem. Urban systems also differ in their capacity to cope with, respond to, and recover from flood disruptions [15,16,17]. This is where the literature on resilience becomes essential. Classic resilience framing emphasizes the ability to withstand disturbance, maintain function, and recover, often decomposed into robustness, redundancy, resourcefulness, and rapidity [16,18,19]. In urban studies, resilience has been defined more broadly as the capacity of an urban system and its actors to maintain or rapidly restore desired functions in the face of shocks and stresses [16,19,20,21], a definition that highlights multi-dimensionality and the possibility that capacity can vary independently of hazard [22,23]. In other words, risk and resilience are not mirror images. A place can be high-risk yet resource-rich; conversely, a low-risk place can be capacity-poor [19]. This conceptual separation matters because index-based practice often drifts into treating capacity proxies as if they were direct evidence of flood avoidance.

Recent policy agendas reinforce the need to keep this distinction crisp [24,25]. In China, urban stormwater management has been strongly shaped by the Sponge City concept, emphasizing distributed retention, infiltration, storage, and nature-based measures as complements to conventional conveyance [26]. While the strategy is well established at the policy level, implementation priorities remain uneven and heavily constrained by budgets and institutional capacity [27,28]. Therefore, a practical planning question arises: When decision-makers cannot retrofit everywhere at once, how can they identify the most consequential hotspots, and how can they distinguish (i) locations that are hazard-prone because of terrain–hydrology predisposition from (ii) locations that become chronic waterlogging points because of built-network bottlenecks, operational failures, or micro-topographic traps?

Answering that question requires two further ingredients: validation and interpretability. First, hotspot maps, whether produced by MCDA or machine learning, are only useful if they align with observed waterlogging patterns [29]. Without independent validation using flood or waterlogging point data, risk surfaces can become persuasive graphics rather than actionable evidence [4]. Second, governance and engineering teams need interpretable drivers rather than black-box scores [30]. Explainable machine learning is increasingly used to bridge this gap. Gradient-boosted decision tree models, such as LightGBM, are well-suited to susceptibility modeling because they handle non-linearity and interaction effects with limited parametric assumptions [8,31]. SHAP (Shapley Additive Explanations) enables consistent, feature-level attribution of predictions, helping translate model behavior into plausible mechanisms and thresholds that are communicable to planners and engineers [32]. This matters for waterlogging because the process is rarely linear: rainfall extremes may only trigger high impacts where runoff potential is high; slope may create both fast-routing hazards and downstream convergence; and centrality variables may act as proxies for exposure intensity and reporting bias simultaneously [23,33].

Despite rapid progress, there remains a specific gap for mountainous megacities like Chongqing [4,5,21]. Many city-scale studies focus on hazard or exposure components in isolation, present composite indices without rigorous validation against independent waterlogging observations, or use machine learning without sufficiently unpacking why capacity-related variables sometimes correlate positively with waterlogging occurrence. This last pattern is not merely a technical curiosity: it reflects a deeper measurement pitfall in which capacity proxies capture urban centrality and service concentration rather than drainage-system robustness. Empirically separating “structural advantage in services” from “operational vulnerability in drainage performance” is critical if resilience metrics are to support flood governance rather than inadvertently mislead it [16,34].

Against this backdrop, the present study advances three specific contributions. Methodologically, we decouple capacity (service accessibility) from risk by formulating resilience as a stabilized adaptation-to-risk ratio, and we test empirically, against two independent waterlogging point sets, whether this transformation describes waterlogging geography in a mountainous megacity better than the conventional inverse relationship. To our knowledge, this is the first deliberate test of whether accessibility-based capacity proxies behave as risk-reducing covariates or as urban-centrality markers. In terms of application, we deliver the first city-wide (22,500 grid cells at 500 m resolution), externally validated risk and resilience surface for Chongqing, built exclusively from open data and reproducible from publicly available sources. Regarding validation, we combine two independent point sets (117 historical points for 2015–2021 and 70 points for 2022, which together yield 167 unique positive grid cells) with ROC/PR metrics, top-k capture curves, a 500-iteration Monte Carlo weight-perturbation sensitivity analysis, and residual spatial-autocorrelation diagnostics, thereby converting academic discrimination scores into operational inspection budgets.

2. Study Area and Data

2.1. Study Area

Chongqing is a mountainous municipality located in Southwest China (approximately 106–110° E, 28–32° N), characterized by pronounced terrain gradients and complex river–valley systems. The region spans from low-elevation river corridors to high-elevation mountain areas, producing strong spatial contrasts in runoff generation, drainage convergence, and flood susceptibility. Major waterways, including the Yangtze River and the Jialing River, structure the municipal drainage network and concentrate population and assets along flood-prone corridors [4,35,36]. Figure 1 illustrates the municipal boundary, topographic background, and the delineation of the central urban area, which serves as the focal area for waterlogging validation and driver identification.

The central city area covers the municipality’s core built-up districts, including Yuzhong, Jiangbei, Yubei, Shapingba, Jiulongpo, Dadukou, Nan’an, Banan, and Beibei, where high development intensity intersects with constrained terrain and dense drainage infrastructure. In such settings, pluvial flooding and waterlogging are not solely controlled by rainfall intensity; they also depend on slope breaks, flow accumulation pathways, impervious surfaces, and the spatial mismatch between drainage capacity and rapidly changing exposure.

Climatically, Chongqing is governed by a humid subtropical monsoon regime, with rainfall concentrated in the warm season. The precipitation heatmap (Figure 2) shows clear seasonality, with sustained high rainfall typically occurring from late spring to early autumn. This seasonal concentration increases the likelihood of short-duration extremes and successive rainfall events that can overwhelm drainage systems and trigger waterlogging, particularly in low-lying built-up pockets and convergent catchments. Given these coupled hydro-climatic and geomorphic conditions, Chongqing provides a representative setting for evaluating multi-component flood risk and resilience under strong terrain constraints and rapidly evolving urban exposure.

2.2. Data Source

This study integrates multi-source geospatial datasets to support two linked tasks: (i) a municipality-wide flood risk and resilience assessment on a uniform grid, and (ii) a central city area analysis for validating model outputs against observed waterlogging locations and diagnosing key drivers of waterlogging occurrence. All datasets were harmonized to a 500 m × 500 m fishnet grid, enabling consistent spatial aggregation across hydrometeorological, topographic–hydrological, land-cover, and socio-economic layers. For each grid cell, indicators were summarized using mean-based statistics, producing a structured indicator database for subsequent risk/resilience evaluation and machine-learning analysis.

Daily precipitation was sourced from the ChinaMet 0.01° gridded daily precipitation product (2020–2024, warm season May–October), which integrates rain-gauge observations with satellite products and is validated at the national scale [3]. Daily fields were clipped to the Chongqing municipal boundary, buffered outward by 8 km to avoid rolling-window boundary artifacts, and provider-flagged values were removed. Following the ETCCDI convention, eight extreme-precipitation indices were derived for the June–September target window, using May–October reads to preserve rolling-sum boundary behavior: Rx1day, Rx3day, and Rx5day as annual maxima of 1-, 3-, and 5-day running totals; R20 mm, R50 mm, and R100 mm as annual counts of days exceeding 20, 50, and 100 mm, respectively; and P95 and P99 as the 95th and 99th percentiles of June–September daily precipitation. Per-year indices were composited across 2020–2024 by multi-year mean (used in Table 1); sensitivity checks against median and maximum composites did not alter the top-decile ranking of Hazard indicators. Because the input is a pre-validated gridded product rather than station records, station-level kriging cross-validation does not apply. We cross-inspected the Rx1day spatial pattern against Chongqing Climate Center bulletins for 2015, 2018, and 2020 and found consistent major-event footprints.
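The index definitions above can be illustrated for a single grid cell and a single year. The following is a minimal sketch, not the study's processing code: the function names are hypothetical, the June–September windowing is assumed to have been applied upstream, day counts use the ETCCDI "greater than or equal to" threshold convention, and the percentiles are taken over all supplied days with linear interpolation, which is an assumption.

```python
def rolling_max(daily, window):
    """Largest running total over `window` consecutive days."""
    sums = [sum(daily[i:i + window]) for i in range(len(daily) - window + 1)]
    return max(sums)


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100] (assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def extreme_indices(daily_mm):
    """Eight extreme-precipitation indices for one cell and one year (mm/day)."""
    return {
        "Rx1day": rolling_max(daily_mm, 1),
        "Rx3day": rolling_max(daily_mm, 3),
        "Rx5day": rolling_max(daily_mm, 5),
        # ETCCDI counts days at or above the threshold.
        "R20mm": sum(1 for p in daily_mm if p >= 20),
        "R50mm": sum(1 for p in daily_mm if p >= 50),
        "R100mm": sum(1 for p in daily_mm if p >= 100),
        "P95": percentile(daily_mm, 95),
        "P99": percentile(daily_mm, 99),
    }


def multi_year_mean(per_year):
    """Composite per-year index dictionaries by their multi-year mean."""
    keys = per_year[0].keys()
    return {k: sum(year[k] for year in per_year) / len(per_year) for k in keys}
```

In practice the same calculation would be vectorized over the 0.01° raster before zonal aggregation to the fishnet; the per-cell form is shown only to make the definitions concrete.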

Two derived datasets warrant explicit disclosure. The 250 m gridded mean-annual-runoff layer was provider-validated against national runoff stations and is used here as a screening-level hazard proxy at 500 m zonal-mean aggregation; event-scale validation against local gauges is flagged as future work. Service-accessibility indicators (fire stations, Class-A tertiary hospitals, public shelters) were computed from AMap POI records after duplicate removal and address cross-matching, using the Gaussian-enhanced two-step floating catchment area method (Gauss-2SFCA), which has been validated against gravity-model benchmarks in public-service and disaster-response applications [37]. We emphasize that accessibility is a structural proxy for response capacity, not a direct measurement of drainage-system robustness. This proxy–construct gap is examined empirically in Section 5.2.

The indicator system follows a commonly used framework that separates risk formation into hazard, exposure, sensitivity, and adaptation components, while additionally deriving composite indices such as vulnerability, risk, and resilience. Hazard indicators describe rainfall extremes and runoff-generating conditions such as multi-day precipitation maxima, percentile precipitation, runoff potential, and terrain-driven convergence. Exposure captures the spatial concentration of people and human activities, proxied by population and nighttime light intensity. Sensitivity is represented by land-cover composition and surface characteristics that modulate inundation likelihood, such as construction land proportion and other land-cover shares. Adaptation reflects the capacity to cope with flood impacts through accessibility to critical services and infrastructure support. Table 1 summarizes the full set of indicators used in this study.

A key adaptation component is accessibility, quantified using the Gaussian two-step floating catchment area (Gaussian 2SFCA) approach. In brief, the method first estimates service supply-to-demand ratios within travel-time catchments and then aggregates these ratios to each grid cell with a distance-decay kernel, producing accessibility scores that reflect both proximity and competition for services. In this study, accessibility indicators were computed for multiple service types relevant to flood response (e.g., Class-A tertiary hospital facilities, fire stations, and emergency shelters), then summarized at the grid level to represent the spatial distribution of adaptive capacity. This design allows adaptation to be interpreted as an operational, place-based capacity rather than an abstract attribute.

Two independent waterlogging point sets are used for external validation and driver modeling. Set A (“historical”) comprises 117 points compiled from the Central City Drainage and Waterlogging Control Special Plan (Revision 2022), issued by the Chongqing Municipal Commission of Housing and Urban–Rural Development. This plan integrates the following provenance streams over 2015–2021: municipal engineering-inspection records submitted by district drainage authorities and consolidated by the municipal emergency-management office, and post-event field surveys following the major rain events of 2015, 2018, and 2020. Set B (“2022”) comprises 70 points released with the plan’s 2022 update, reflecting events recorded during the 2021–2022 warm seasons. Both sets are multi-year aggregates of recurrent waterlogging locations, not single-event snapshots. Each record specifies the textual location of a chronic waterlogging occurrence but does not report inundation depth, duration, or annual recurrence frequency, which were not publicly disclosed for operational reasons, a limitation acknowledged in Section 6.

The phenomena represented are predominantly surface ponding combined with drainage-system overload. Pure fluvial inundation along the Yangtze and Jialing corridors is excluded by the plan’s scope and is governed by a separate flood-defense plan. The predictive target of both the composite index and the LightGBM classifier is therefore best characterized as chronic pluvial waterlogging locations, that is, recurrent street-level ponding reflecting short-duration rainfall extremes, terrain-driven convergence, runoff potential, imperviousness, and local drainage bottlenecks, rather than event-specific inundation depth or fluvial stage.

For grid-level validation, the 187 field-reported points were spatially joined to the 500 m fishnet and deduplicated at the cell level, producing 167 unique positive grid cells (106 from Set A, 65 from Set B, with four cells shared). This grid-level count defines the binary label used throughout Section 4. Given the resulting prevalence of ≈0.74% across the 22,500-cell central-city grid, validation emphasizes ranking-based diagnostics (ROC-AUC, PR-AUC, top-k capture) rather than accuracy, which is uninformative at this rarity. Reporting-density considerations are addressed in Section 4.4.
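The cell-level deduplication and the resulting prevalence can be sketched as follows. This is an illustrative simplification that assumes projected coordinates in metres and a fishnet origin at `(x0, y0)`; the helper names are hypothetical and a GIS spatial join would normally do this work.

```python
def cell_id(x, y, x0, y0, size=500.0):
    """(row, col) of the fishnet cell containing a projected point."""
    col = int((x - x0) // size)
    row = int((y - y0) // size)
    return (row, col)


def positive_cells(points, x0, y0, size=500.0):
    """Unique cells containing at least one waterlogging point."""
    return {cell_id(x, y, x0, y0, size) for x, y in points}


def prevalence(n_positive, n_cells):
    """Share of grid cells labelled positive."""
    return n_positive / n_cells


# Counts reported in the text: 106 cells from Set A, 65 from Set B, 4 shared.
unique_cells = 106 + 65 - 4                          # 167
rate_percent = 100 * prevalence(unique_cells, 22500)  # about 0.74
```

Several points falling in one cell collapse to a single positive label, which is why 187 points yield fewer positive cells.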

3. Methods

3.1. Indicator Construction and Framework

This study adopts a uniform 500 m × 500 m fishnet as the analytical unit to support a city-wide, comparable assessment of flood risk and resilience. All input layers were spatially aligned to the grid, and each indicator was aggregated to grid cells using a consistent rule: zonal means for raster-derived variables, and density- or distance-based statistics for vector-derived variables. To reduce scale-induced bias, indicators were processed under a unified coordinate reference system and clipped to the administrative boundary of Chongqing Municipality. The same framework was then applied to the central city area as a focused sub-region for validation and mechanism interpretation. Figure 3 shows the overall research framework.

The 500 m fishnet was selected as a compromise among three constraints. First, the native resolutions of the required inputs (250 m for the gridded runoff layer, approximately 1 km for gridded population, and approximately 500 m for VIIRS-like nighttime light) prevent physically meaningful downscaling below 250 m without spurious artifacts. Second, Chongqing’s central-city super-block scale averages 200–400 m in the dense core and 600–1200 m in peripheral zones, so a 500 m cell captures at least one super-block of urban fabric. Third, at sub-500 m resolutions, the waterlogging positivity rate falls below 0.3%, destabilizing spatially grouped cross-validation. Because any aggregation is subject to the modifiable areal unit problem [38], we acknowledge this as a residual limitation; a full multi-resolution replication at 250 m and 1 km is flagged as the highest-priority future-work extension. Within the 500 m framework, we report a closely related robustness diagnostic, namely a 500-iteration Monte Carlo weight perturbation (Section 4.4) that directly tests the stability of the top-k ranking to weighting uncertainty.

Indicators were organized into four conceptual components that jointly describe the flood risk–resilience system: Hazard, Exposure, Sensitivity, and Adaptation. Hazard captures hydro-climatic and terrain-related forcing (e.g., precipitation extremes, runoff potential, topographic wetness, slope, elevation, proximity to rivers). Exposure represents the concentration of people and assets (e.g., population density proxies, nighttime light intensity). Sensitivity reflects the susceptibility of the built and natural environment to inundation under a given forcing, while Adaptation measures the capacity of emergency response and services to reduce impacts and support recovery. In particular, accessibility variables in the Adaptation component were computed using a Gaussian-enhanced two-step floating catchment area (Gauss-2SFCA) approach, which models distance-decayed service availability and better reflects real-world spatial attenuation than binary catchments.

For Gauss-2SFCA, the first step calculates a supply-to-demand ratio for each facility $j$ (e.g., hospitals, fire stations, shelters):

$$R_{j} = \frac{S_{j}}{\sum_{i \in \Omega(j)} P_{i}\, G(d_{ij})} \tag{1}$$

where $S_{j}$ is the service supply of facility $j$, $P_{i}$ is the demand at grid $i$, $d_{ij}$ is the travel distance or time between $i$ and $j$, $\Omega(j)$ denotes grids within the catchment of facility $j$, and $G(d_{ij})$ is a Gaussian decay function. The second step sums the decayed ratios from all reachable facilities for each grid $i$:

$$A_{i} = \sum_{j \in \Omega(i)} R_{j}\, G(d_{ij}) \tag{2}$$

where $\Omega(i)$ indicates facilities within the catchment of grid $i$. The Gaussian decay is specified as

$$G(d) = \exp\left(-\left(\frac{d}{\sigma}\right)^{2}\right) \tag{3}$$

where $\sigma$ controls the decay rate. The resulting accessibility scores are then used as adaptation indicators and integrated with the other components to compute composite indices.
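The two steps of Equations (1)–(3) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the catchment is defined by a single cost threshold `d0` (the text does not report its value), `dist[i][j]` is a precomputed travel cost matrix, and the function names are hypothetical.

```python
import math


def gauss(d, sigma):
    """Gaussian decay G(d) = exp(-(d / sigma)^2), as in Eq. (3)."""
    return math.exp(-((d / sigma) ** 2))


def gauss_2sfca(supply, demand, dist, sigma, d0):
    """Accessibility per grid cell.

    supply[j]  : service supply S_j of facility j
    demand[i]  : demand P_i at grid i
    dist[i][j] : travel distance or time between grid i and facility j
    d0         : catchment threshold (assumed; same units as dist)
    """
    n, m = len(demand), len(supply)
    # Step 1 (Eq. 1): supply-to-demand ratio of each facility.
    ratio = []
    for j in range(m):
        denom = sum(demand[i] * gauss(dist[i][j], sigma)
                    for i in range(n) if dist[i][j] <= d0)
        ratio.append(supply[j] / denom if denom > 0 else 0.0)
    # Step 2 (Eq. 2): sum decayed ratios over reachable facilities.
    return [sum(ratio[j] * gauss(dist[i][j], sigma)
                for j in range(m) if dist[i][j] <= d0)
            for i in range(n)]
```

A useful sanity check on any implementation is that the demand-weighted sum of accessibility equals total supply whenever every facility has at least one grid in its catchment.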

3.2. CRITIC–Entropy Fusion and TOPSIS Weighting

To ensure comparability across indicators with different units and directions, all indicators were normalized to [0, 1]. For a “positive” indicator (higher is worse for risk components, or higher is better for adaptation depending on the component definition), min–max scaling was applied. For “negative” indicators, a direction correction was performed so that larger normalized values consistently represent a stronger contribution to the intended latent construct within that component.
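A minimal sketch of the direction-aware min–max scaling described above follows. The function name and the handling of constant indicators (mapped to zero) are assumptions made for illustration.

```python
def minmax(values, positive=True):
    """Scale to [0, 1]; flip the direction for 'negative' indicators."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant indicator carries no contrast (assumed convention).
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if positive else [1.0 - s for s in scaled]
```

For example, `minmax([2, 4, 6])` gives `[0.0, 0.5, 1.0]`, while `minmax([2, 4, 6], positive=False)` reverses the ordering so that the smallest raw value maps to 1.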

Objective weights were then computed by combining CRITIC and Entropy weighting to reduce reliance on a single statistical criterion. CRITIC assigns a higher weight to indicators with larger variability and lower redundancy [11]. Let $x_{ij}$ be the normalized value of indicator $j$ in grid $i$, $\sigma_{j}$ be the standard deviation of indicator $j$, and $r_{jk}$ be the Pearson correlation between indicators $j$ and $k$. CRITIC defines the information content of indicator $j$ as

$$C_{j} = \sigma_{j} \sum_{k \neq j} (1 - r_{jk}) \;\Rightarrow\; w_{j}^{\mathrm{CRITIC}} = \frac{C_{j}}{\sum_{j} C_{j}} \tag{4}$$

Entropy weighting quantifies the dispersion of each indicator based on information entropy [39]. With

$$p_{ij} = \frac{x_{ij}}{\sum_{i} x_{ij}}, \quad e_{j} = -\frac{1}{\ln n} \sum_{i} p_{ij} \ln(p_{ij}), \quad d_{j} = 1 - e_{j}, \tag{5}$$

where $n$ is the number of grid cells, the entropy weight is

$$w_{j}^{\mathrm{ENT}} = \frac{d_{j}}{\sum_{j} d_{j}} \tag{6}$$

The two sets of weights were fused by averaging (consistent with the code implementation):

$$w_{j} = \frac{w_{j}^{\mathrm{CRITIC}} + w_{j}^{\mathrm{ENT}}}{2} \tag{7}$$
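Equations (4)–(7) can be sketched as below for one component's indicator set. This is an illustration rather than the study's implementation: it assumes the sample standard deviation, treats $0 \cdot \ln 0$ as zero, requires non-constant indicator columns, and uses hypothetical function names.

```python
import math


def _std(col):
    """Sample standard deviation (ddof = 1; an assumption)."""
    mean = sum(col) / len(col)
    return math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))


def _pearson(a, b):
    """Pearson correlation between two equal-length columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def critic_weights(cols):
    """Eq. (4): contrast (std) times conflict (sum of 1 - r)."""
    info = []
    for j, cj in enumerate(cols):
        conflict = sum(1 - _pearson(cj, ck)
                       for k, ck in enumerate(cols) if k != j)
        info.append(_std(cj) * conflict)
    total = sum(info)
    return [c / total for c in info]


def entropy_weights(cols):
    """Eqs. (5)-(6): weight proportional to 1 - normalized entropy."""
    n = len(cols[0])
    div = []
    for col in cols:
        s = sum(col)
        e = -sum((x / s) * math.log(x / s) for x in col if x > 0) / math.log(n)
        div.append(1 - e)
    total = sum(div)
    return [d / total for d in div]


def fused_weights(cols):
    """Eq. (7): arithmetic mean of CRITIC and entropy weights."""
    return [(a + b) / 2
            for a, b in zip(critic_weights(cols), entropy_weights(cols))]
```

Here `cols` is a list of normalized indicator columns, one list per indicator, each holding one value per grid cell; the fused weights sum to one by construction.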

Objective weighting via CRITIC–Entropy fusion was adopted because it is fully reproducible, avoids expert-panel bias where consensus priors were unavailable, and preserves within-component comparability. We acknowledge the critique that statistical dispersion is not a substitute for process importance [16,29]: a highly variable indicator is not necessarily a causally dominant one. Three design choices mitigate this. First, CRITIC–Entropy is applied within each of the four components separately, rather than across the entire indicator set, preventing structurally different constructs from competing for weight in a single statistical pool. Second, a 500-iteration Monte Carlo weight-perturbation analysis (Section 4.4) diagnoses both margin-level sensitivity and order-level stability of the resulting surface. Third, the two waterlogging point sets used for external validation were not used in any weighting or normalization step, so circularity is prevented by design. We also surface in the main text (Section 4.1) that three highly skewed hazard indicators receive disproportionate entropy weight because entropy amplifies distributions with many near-zero values, a caveat previously hidden in Appendix A.

For each component (hazard, exposure, sensitivity, adaptation), a weighted decision matrix was constructed and synthesized using TOPSIS [40]. TOPSIS ranks alternatives by their relative closeness to the ideal solution. For grid $i$, the distances to the positive ideal $D_{i}^{+}$ and negative ideal $D_{i}^{-}$ yield the closeness score:

$$S_{i} = \frac{D_{i}^{-}}{D_{i}^{+} + D_{i}^{-}} \tag{8}$$

where $S_{i} \in [0, 1]$ is the component index. This procedure produces the component indices $H_{i}$ (hazard), $E_{i}$ (exposure), $S_{i}$ (sensitivity), and $A_{i}$ (adaptation).
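The closeness score of Equation (8) can be sketched as follows. The sketch assumes Euclidean distances (the usual TOPSIS choice, not stated explicitly in the text) and indicators that are already normalized and direction-corrected, so that the column maximum is the positive ideal; the function name is hypothetical.

```python
import math


def topsis(rows, weights):
    """Closeness score per grid cell.

    rows[i][j] : normalized, direction-corrected indicator j in grid i
    weights[j] : fused weight of indicator j
    """
    weighted = [[w * x for w, x in zip(weights, row)] for row in rows]
    columns = list(zip(*weighted))
    best = [max(c) for c in columns]    # positive ideal solution
    worst = [min(c) for c in columns]   # negative ideal solution
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_neg = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 0.0)
    return scores
```

A cell that attains the maximum on every indicator scores 1, a cell that attains the minimum on every indicator scores 0, and a cell equidistant from both ideals scores 0.5.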

Following the system logic implemented in the code, Vulnerability emphasizes exposure and sensitivity while treating higher adaptation as risk-reducing:

$$V_{i} = \mathrm{Norm}\left(\alpha E_{i} + \beta S_{i} + \gamma (1 - A_{i})\right) \tag{9}$$

where $\alpha, \beta, \gamma$ are component weights (set in the configuration and applied consistently across the study). Flood risk is then modeled as a multiplicative interaction between hazard and vulnerability to capture compounding effects:

$$R_{i} = \mathrm{Norm}\left(H_{i} \cdot V_{i}\right) \tag{10}$$

Finally, resilience is derived as a monotonic transformation of adaptation relative to risk. The main implementation uses a stabilized ratio (log form) to improve numerical behavior and interpretability:

$$\mathrm{Res}_{i} = \mathrm{Norm}\left(\ln \frac{A_{i} + \varepsilon}{R_{i} + \varepsilon}\right) \tag{11}$$

where $\varepsilon$ is a small constant preventing division by zero. This formulation makes resilience high when adaptation capacity is strong and risk pressure is comparatively low.
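Equations (9)–(11) chain together as sketched below. The sketch assumes that Norm denotes min–max rescaling to [0, 1]; the equal component weights and the value of the stabilizing constant are placeholders, since the configured values are not reported in this section.

```python
import math


def norm(values):
    """Min-max rescaling to [0, 1] (assumed meaning of Norm)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def composites(H, E, S, A, alpha=1/3, beta=1/3, gamma=1/3, eps=1e-6):
    """Vulnerability, risk, and resilience per grid cell.

    alpha, beta, gamma and eps are placeholder values, not the study's.
    """
    # Eq. (9): higher adaptation lowers vulnerability.
    V = norm([alpha * e + beta * s + gamma * (1 - a)
              for e, s, a in zip(E, S, A)])
    # Eq. (10): multiplicative hazard-vulnerability interaction.
    R = norm([h * v for h, v in zip(H, V)])
    # Eq. (11): stabilized log ratio of adaptation to risk.
    Res = norm([math.log((a + eps) / (r + eps)) for a, r in zip(A, R)])
    return V, R, Res
```

Because of the multiplicative form, a cell with high hazard but near-zero vulnerability (or the reverse) receives a low risk score, and the log ratio keeps resilience finite where risk approaches zero.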

3.3. Validation and Driver Identification with LightGBM–SHAP

To test whether the grid-based indices capture real-world flood-prone patterns, waterlogging points within the central city area were extracted from the “Central City Drainage and Waterlogging Prevention Special Plan (Revised, 2022)”. The original records are text-based; place names and descriptions were converted to coordinates via geocoding, after which points were spatially matched to grid cells. A grid cell was labeled as an “event cell” if it intersected at least one waterlogging point, producing a binary label $y_{i} \in \{0, 1\}$ for validation and modeling.
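
The point-to-cell labeling step can be sketched as below, assuming the geocoded points are already in a projected coordinate system in metres and the fishnet is axis-aligned. The function name, grid origin, and toy coordinates are hypothetical; the actual workflow may rely on a GIS spatial join instead.

```python
def label_event_cells(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Return an n_rows x n_cols grid of 0/1 labels.

    A cell is set to 1 if at least one point falls inside it. Points are
    (x, y) tuples in the same projected units as x_min, y_min, cell_size.
    """
    labels = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Points outside the fishnet extent are ignored.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            labels[row][col] = 1
    return labels

# Toy 3 x 3 fishnet of 500 m cells with a hypothetical origin at (0, 0).
points = [(120.0, 80.0), (130.0, 90.0), (1250.0, 600.0), (5000.0, 5000.0)]
labels = label_event_cells(points, x_min=0.0, y_min=0.0,
                           cell_size=500.0, n_cols=3, n_rows=3)
n_event_cells = sum(map(sum, labels))
```

Two points that share a cell produce a single event cell, which is why the number of positive cells can be smaller than the number of recorded points.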

Validation focused on whether event cells exhibit systematically higher hazard, vulnerability, and risk scores than non-event cells. The workflow computes distributional contrasts and classification-oriented diagnostics, including ROC/PR performance, top-$k$ capture curves, and calibration checks. These outputs quantify how well the risk surface can “retrieve” known waterlogging locations under different screening intensities, offering a practical interpretation aligned with risk management.
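
For readers who want to reproduce the ranking diagnostic, the sketch below computes ROC-AUC through its Mann–Whitney interpretation: the probability that a randomly chosen event cell scores higher than a randomly chosen non-event cell, with ties counted as one half. The function name and toy data are hypothetical, and the study itself may have used a library routine.

```python
def roc_auc(scores, labels):
    """ROC-AUC from average ranks (Mann-Whitney U divided by n_pos * n_neg)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: four cells, two of them event cells.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)  # 3 of 4 event/non-event pairs are ordered correctly
```

Because the statistic depends only on ranks, it is unaffected by any monotonic rescaling of the index, which suits the ranking-based use of the risk surface.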

To explain why waterlogging concentrates in certain cells, a LightGBM classifier [31] was trained within the central-city area to predict the binary label y from the selected indicators. The label is extremely imbalanced, with 167 positive cells out of 22,500, a prevalence ≈ 0.74%, which decisively shapes both model design and evaluation. Because the downstream use is ranking-based screening rather than hard classification, preserving the predicted-probability rank structure is more important than any particular accuracy threshold. We therefore adopted LightGBM’s native scale_pos_weight parameter set to n⁻/n⁺, which re-weights minority-class gradient contributions without injecting synthetic samples. We did not adopt SMOTE or related resampling corrections, because recent evidence shows they systematically distort probability calibration in low-prevalence risk-prediction settings—bending calibration curves away from the identity line and corrupting rank-order fidelity precisely in the high-probability region that governs top-k screening [41]. A systematic benchmark against SMOTE variants and focal loss is flagged as future work. To reduce over-optimistic generalization caused by spatial autocorrelation in features [10], spatial groups were constructed via MiniBatch K-means on grid centroids, and group-aware K-fold cross-validation ensured that training and validation folds were spatially separated. Performance is reported via cross-validated ROC-AUC, PR-AUC, and log loss, with the final model interpreted through SHAP:

$$f(x_{i}) = \phi_{0} + \sum_{j} \phi_{ij} \tag{12}$$

where $\phi_{0}$ is the base value and $\phi_{ij}$ is the SHAP contribution of feature $j$ to the prediction at grid $i$. Dependence plots and interaction analyses are used to diagnose non-linear thresholds and coupled mechanisms. These interpretations are reported as mechanistic evidence linking hydro-topographic forcing, exposure concentration, and adaptation accessibility to observed waterlogging patterns.
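
The group-aware cross-validation described above can be sketched in plain Python. The sketch assumes each grid cell already carries a spatial group id (in the study these come from MiniBatch K-means on grid centroids, which is not reproduced here) and assigns whole groups to folds with a simple greedy balancing rule. The function name, the balancing rule, and the toy group ids are hypothetical.

```python
def group_kfold(groups, n_splits):
    """Yield (train_idx, valid_idx) pairs such that no group is split.

    groups : one integer group id per grid cell, e.g. a cluster id
             obtained from clustering cell centroids (hypothetical here).
    """
    sizes = {}
    for g in groups:
        sizes[g] = sizes.get(g, 0) + 1
    # Greedy balancing: place the largest groups first in the lightest fold.
    fold_of = {}
    fold_load = [0] * n_splits
    for g, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        k = fold_load.index(min(fold_load))
        fold_of[g] = k
        fold_load[k] += size
    for k in range(n_splits):
        valid = [i for i, g in enumerate(groups) if fold_of[g] == k]
        train = [i for i, g in enumerate(groups) if fold_of[g] != k]
        yield train, valid

# Toy example: eight cells in four spatial groups, two folds.
groups = [0, 0, 1, 1, 1, 2, 2, 3]
folds = list(group_kfold(groups, n_splits=2))
```

Keeping each spatial group entirely in either the training or the validation fold is what prevents neighboring, autocorrelated cells from leaking information across the split.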

Final LightGBM hyperparameters (num_leaves = 63, learning_rate = 0.03, n_estimators up to 3000 with 150-round early stopping, min_child_samples = 50, subsample = 0.8, colsample_bytree = 0.8, reg_lambda = 1.0) were selected by targeted grid search under the same spatial-group K-fold cross-validation, optimizing mean PR-AUC across folds. Choices were deliberately conservative: tree depth and leaf count were capped to reduce overfitting risk on a dataset with only 167 positive cells, and a low learning rate with many shallow trees preserves feature-level SHAP attribution stability. We did not conduct a systematic benchmark against alternative gradient-boosting libraries (XGBoost, CatBoost) or deep-learning classifiers for two reasons. First, the central argument of the paper, namely the dissociation between accessibility-based capacity proxies and operational waterlogging performance, is derived from SHAP-level feature–target structure and is stable under library substitution when the feature representation is held constant. Second, the principal source of residual error in this application is data limitation (reporting-density bias in the label, absence of pipe-network attributes in features) rather than model capacity. Systematic cross-library benchmarking and Bayesian hyperparameter optimization are flagged as future methodological extensions.
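
For reference, the reported settings can be collected into a parameter dictionary using LightGBM's scikit-learn-style parameter names. This is a configuration sketch only: the `objective` entry is an assumption (a binary task is implied but the objective is not stated), the variable names are hypothetical, and no model is trained here.

```python
# Class counts reported in the text.
n_total, n_pos = 22_500, 167
n_neg = n_total - n_pos

# Minority-class re-weighting set to n-/n+ as described in Section 3.3.
scale_pos_weight = n_neg / n_pos  # about 133.7

lgbm_params = {
    "objective": "binary",        # assumption: not stated explicitly
    "num_leaves": 63,
    "learning_rate": 0.03,
    "n_estimators": 3000,         # upper bound; early stopping may use fewer
    "min_child_samples": 50,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "reg_lambda": 1.0,
    "scale_pos_weight": scale_pos_weight,
}
early_stopping_rounds = 150       # applied per fold during cross-validation
```

With a prevalence near 0.74%, each positive cell carries roughly 134 times the gradient weight of a negative cell, which is why predicted scores are better read as rankings than as calibrated probabilities.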

4. Results

4.1. Spatial Patterns of Flood Risk Evaluation in Chongqing

The composite maps reveal pronounced spatial heterogeneity in flood-related conditions across the study area, and the different dimensions do not simply overlap. Sensitivity, exposure, and adaptive capacity show a coupled yet spatially misaligned structure: high exposure tends to coincide with built-up corridors and urban cores, whereas sensitivity and hazard are more strongly constrained by terrain variability, flow-convergence structure, and the hydroclimatic background (see Figure 4). The resulting composite risk surface (RISK_T) is strongly spatially clustered, with global Moran’s I = 0.814 (z = 241.6, p < 0.001, KNN k = 8, 999 permutations), confirming that risk in Chongqing has a systemic, not random, geography and justifying targeted rather than uniform screening. Table A1 reports the full indicator-level weights for each dimension; three highly skewed hazard indicators receive disproportionate entropy weights because entropy weighting amplifies distributions with many near-zero values (Section 3.2).
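
As an illustration of the clustering statistic, the sketch below computes global Moran's I with row-standardized k-nearest-neighbor weights (k = 8, as reported). It is a brute-force version for small examples, omits the permutation inference, and uses a hypothetical 10 × 10 lattice split into a low half and a high half rather than the study data.

```python
import math

def morans_i_knn(coords, values, k=8):
    """Global Moran's I with row-standardized k-nearest-neighbor weights.

    With row-standardized weights the sum of all weights equals n, so the
    statistic reduces to sum_i z_i * lag_i / sum_i z_i^2.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    num = 0.0
    for i in range(n):
        # Brute-force search for the k nearest neighbors of cell i.
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(coords[i], coords[j]))[:k]
        lag = sum(z[j] for j in nearest) / k  # row-standardized spatial lag
        num += z[i] * lag
    den = sum(zi * zi for zi in z)
    return num / den

# Hypothetical lattice: left half scores 0, right half scores 1.
coords = [(c, r) for r in range(10) for c in range(10)]
values = [1.0 if c >= 5 else 0.0 for (c, r) in coords]
i_clustered = morans_i_knn(coords, values, k=8)
```

A strongly positive value indicates that neighboring cells carry similar scores, which is the pattern the reported I = 0.814 describes for RISK_T.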

After combining the three components, high-vulnerability zones mainly emerge where exposure and sensitivity reinforce each other while adaptive capacity is insufficient to offset their joint effects. The resulting risk surface further condenses into continuous belts and nodal hotspots rather than scattered patches. This pattern suggests that observed waterlogging is not a random local anomaly; instead, it reflects a systemic outcome shaped by hydroclimatic forcing, terrain-controlled convergence, and the spatial concentration of urban elements.

Importantly, the spatial pattern of resilience is not a simple inverse of risk. Some high-risk core areas also exhibit relatively high resilience scores. In this study, resilience indicators primarily represent the “capacity side” of the system. Urban cores often have stronger infrastructure and service provision, yet they also carry higher exposure and heavier drainage loads. The co-existence of high risk and high resilience is therefore not contradictory; it is a typical structural feature of complex urban systems.

4.2. Validation Against Observed Waterlogging Points

Figure 5 focuses on the central city area and overlays observed waterlogging points. At this scale, risk hotspots become more localized and corridor-like, and the waterlogging points cluster within and around these hotspots. Notably, resilience in the central city is not uniformly low; rather, many waterlogging points fall within zones that show medium-to-high resilience values, suggesting that “capacity” indicators alone do not guarantee the avoidance of localized drainage failures. To quantify this overlap explicitly, two complementary validation layers are reported. The composite risk index captures 75/167 positive cells (44.9%) in its top 10% and 127/167 (76.0%) in its top 20%; the Spearman correlation between the risk index and a 500 m-bandwidth kernel density of positive-cell centroids is ρ = 0.495 (p < 0.001), and the observed-to-expected enrichment in the top decile is 4.49. The non-linear LightGBM model reported in Section 4.3 further concentrates risk, capturing 147/167 (88.0%) in its top 10% and 159/167 (95.2%) in its top 20%, with top-decile enrichment of 8.8. The composite index provides an interpretable screening surface with moderate-to-strong ranking alignment; the LightGBM model quantifies how much additional concentration is achievable through non-linear interaction and regularized feature selection. The remaining gap between either surface and a hypothetical perfect map is consistent with the reporting-density bias in the label set analyzed in Section 4.4.

To assess how well the constructed indices explain real-world waterlogging locations, the composite indices were validated against two sets of observed waterlogging points. Overall, the risk index shows more stable discriminatory power between “waterlogging” and “non-waterlogging” samples than single-factor hazard or vulnerability indicators. This finding implies that hydrologic or topographic hazard alone is often insufficient to explain where urban waterlogging occurs; the coupled structure captured by the risk surface linking hazard, exposure, and capacity better matches the underlying formation logic (see Figure 5).

For the historical point set, the risk score achieves an ROC-AUC of 0.834, outperforming Hazard (0.802) and Vulnerability (0.789). The mean risk score for waterlogging locations is higher than that for non-waterlogging locations (mean difference is 0.048), and the Mann–Whitney U test indicates this separation is statistically significant (p < 0.001).

For the 2022 point set, discrimination improves further: risk reaches an ROC-AUC of 0.873, ahead of vulnerability (0.863) and hazard (0.824). Again, the risk score at observed waterlogging points is substantially higher than at the negative samples (mean difference is 0.067, p < 0.001). These results (Table 2) confirm that the composite risk surface is not merely visually plausible but is quantitatively aligned with observed waterlogging patterns, in line with the overlay shown in Figure 5.

Notably, the resilience-related score (1 − resilience) yields ROC-AUC values below 0.5 (0.245 for historical points and 0.296 for 2022 points), indicating that waterlogging points tend to occur in places labeled as “more resilient” by the capacity-oriented measure. Rather than being an error, this result suggests an important empirical feature of the study area: observed waterlogging points concentrate in dense urban cores where service and infrastructure capacity may be higher, but where exposure and drainage loading are also much higher. This finding reinforces the need to interpret resilience as a capacity dimension, not as the absence of risk.
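
A short identity clarifies what the inverted score implies for resilience itself. Assuming ties are scored as one half, reversing a score turns every correctly ordered event/non-event pair into an incorrectly ordered one, so the two AUC values sum to one:

```latex
\mathrm{AUC}(1 - \mathrm{Res}) = 1 - \mathrm{AUC}(\mathrm{Res})
\quad\Longrightarrow\quad
\mathrm{AUC}(\mathrm{Res}) = 1 - 0.245 = 0.755 \ \text{(historical)}, \qquad
\mathrm{AUC}(\mathrm{Res}) = 1 - 0.296 = 0.704 \ \text{(2022)}
```

Read this way, the raw resilience score is higher at the event cell than at the non-event cell in roughly 70 to 76% of event/non-event pairs.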

4.3. LightGBM-SHAP Model and Key Drivers’ Interpretation

Beyond index-based validation, this study further models the statistical relationship between multi-source features and waterlogging occurrence at the grid scale. A predictive model was trained and evaluated under a spatial cross-validation framework to test generalizability. The dataset is extremely imbalanced: among 22,500 grids, only 167 are positive cases, corresponding to a positive rate of approximately 0.74%. Under such rarity, evaluation focuses on ranking performance and the ability to concentrate risk into a small spatial fraction, rather than headline accuracy.

The model shows strong ranking ability (Figure 6). The aggregated out-of-fold (OOF) results indicate an ROC-AUC of 0.920, suggesting robust separation between higher- and lower-risk grids. The precision–recall curve yields an average precision (AP) of 0.074, roughly ten times the random baseline of about 0.0074 implied by the 0.74% event rate.

The model output is operationally meaningful because risk is highly concentrated (see Figure 7). Top-k capture analysis indicates that a small proportion of high-scoring grids accounts for a large share of observed waterlogging points. Specifically, the top 1% of grids capture 16 points, the top 2% capture 35 points, and the top 5% capture 83 points. When focusing on the top 10% of grids, the model captures 147 points (88.02%), and the top 20% capture 159 points (95.21%). This concentration implies that, under limited resources, prioritizing inspection, monitoring, or drainage interventions in the highest-scoring 10% of areas can cover most of the historically waterlogged locations.

From the perspective of prevalence enrichment, the positive rate within the top 10% of grids rises to about 6.53%, compared with the baseline of 0.74% across all grids. This indicates that the model is best understood as a spatial prioritization tool, that is, a risk-ranking mechanism, rather than a hard binary classifier.
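
The capture and enrichment figures follow from simple counting. The sketch below defines a hypothetical helper for top-k capture on arbitrary scores and then reproduces the reported top-decile arithmetic directly from the counts given in the text (22,500 cells, 167 positives, 147 captured).

```python
def top_k_capture(scores, labels, fraction):
    """Return (hits, capture share, enrichment) for the top `fraction` of cells.

    capture    = positives in the top set / all positives
    enrichment = positive rate in the top set / overall positive rate
    """
    n = len(scores)
    k = max(1, round(n * fraction))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    hits = sum(labels[i] for i in top)
    n_pos = sum(labels)
    capture = hits / n_pos
    enrichment = (hits / k) / (n_pos / n)
    return hits, capture, enrichment

# Toy check: ten cells, both positives ranked at the top.
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.05, 0.7]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_result = top_k_capture(toy_scores, toy_labels, fraction=0.2)

# Reported counts for the LightGBM surface.
n_cells, n_pos, hits_top10 = 22_500, 167, 147
k_top10 = n_cells // 10                   # 2250 cells in the top decile
capture_top10 = hits_top10 / n_pos        # about 0.880
rate_top10 = hits_top10 / k_top10         # about 0.0653
enrichment_top10 = rate_top10 / (n_pos / n_cells)  # about 8.8
```

Enrichment at a given screening fraction is the capture share divided by that fraction, so 88.02% captured in the top 10% corresponds to the roughly 8.8-fold enrichment reported.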

Calibration analysis suggests that model scores should be interpreted as relative risk rather than literal probabilities (see Figure 8). The calibration curve shows clear deviations from the ideal diagonal: even in bins with higher predicted probabilities, the observed event rate remains substantially lower. This behavior is common for rare events and spatially heterogeneous samples. In practical terms, the model output is more reliable for ordering and zoning than for direct probability claims. If the model is later used for threshold-based warning or probabilistic communication, additional calibration methods (e.g., Platt scaling or isotonic regression) and spatial uncertainty assessment should be considered.

To interpret drivers, this study uses SHAP (Shapley additive explanations) to quantify how each feature contributes to the model’s predictions. The SHAP summary plots indicate a three-way coupled structure involving urban accessibility/centrality proxies, hydrologic response and rainfall extremes, and terrain and flow-convergence constraints. Based on mean absolute SHAP values, key variables include accf60, RUNOFFmean, SLOPEmean, R50mean, P95mean, FLOWACCmean, ELEmean, and BTSMmean. Figure 9 presents the combination of SHAP beeswarm and bar charts. This ranking is consistent with process expectations: rainfall extremes and runoff provide the triggering and background loads, terrain and convergence shape where water accumulates, and urban centrality proxies reflect imperviousness, drainage burden, and exposure intensity.

Binned SHAP dependence plots further reveal clear non-linear thresholds and compound effects. First, RUNOFFmean shows an overall strengthening effect with stepwise increases across certain ranges, indicating that once the long-term runoff background exceeds a critical level, the model shifts from negative to positive contributions, consistent with a threshold transition from “background load” to “heightened susceptibility” (Figure 10a). Second, accf60 exhibits a sharp threshold jump, such that contributions remain negative in low ranges but become strongly positive after a critical interval and continue to rise. This pattern is more indicative of an “urban core attribute” than a simple linear protective effect; once an area reaches high centrality or high service accessibility, it is more likely to be predicted as high risk (Figure 10b). Third, FLOWACCmean increases overall, but shows large dispersion at low values, implying unstable outcomes where convergence is weak; in areas with strong flow accumulation, risk contributions become more consistently positive, reflecting spatial locking by convergent terrain structure (Figure 10c).

Fourth, both P95mean and R50mean gradually shift from negative to positive contributions and increase more strongly at higher ranges, suggesting that extreme rainfall is not only a trigger but substantially raises the likelihood of system overload once intensity crosses key thresholds (Figure 10d,e). Fifth, SLOPEmean displays a more complex non-monotonic pattern: low slopes correspond to higher contributions (consistent with poor drainage and ponding), mid-range slopes show lower or negative contributions, and high slopes rise again. This likely reflects two different waterlogging processes: persistent ponding in flat areas versus rapid runoff and downstream concentration producing localized hotspots near convergence nodes (Figure 10f).

The SHAP dependence structure in Figure 10 is hydrologically coherent and can be anchored to known process regimes, transforming what might otherwise read as a descriptive curve-fitting exercise into a mechanistic narrative grounded in classical urban hydrology. For SLOPEmean (Figure 10f), the non-monotonic U-shape separates three physical regimes consistent with long-established findings on rainfall-runoff generation in mountainous catchments [42]: (i) a low-slope saturation-ponding regime, where gravitational drainage is insufficient to clear rainfall excess within the time of concentration; (ii) an intermediate well-drained regime (negative SHAP), where runoff moves efficiently to formal drainage; and (iii) a high-slope convergence–hotspot regime, where runoff velocity drives rapid routing that concentrates at slope breaks, underpasses, and road–valley confluences. For RUNOFFmean (Figure 10a), the step-wise positive contribution at elevated mean annual runoff corresponds to the regime in which the long-term runoff coefficient exceeds infiltration and channel-storage capacity [43], beyond which additional rainfall translates proportionally into overland flow. For P95mean and R50mean (Figure 10d,e), the visible inflection regions are consistent with typical Chinese storm-sewer design capacities in the 2–5-year return-period range, beyond which system-overflow probability rises sharply. For FLOWACCmean (Figure 10c), dispersion at low values collapses into consistently positive SHAP at high log₁₀ (FLOWACC), consistent with catchment-scale flow convergence at which cells become nodes rather than sources. Finally, the sharp threshold in accf60 (Figure 10b) does not reflect a protective mechanism: once the 2SFCA accessibility score exceeds its middle range, the grid is almost always located in a dense central-city fabric where impervious fraction, drainage loading, and reporting density all co-vary. 
This confounded urban-centrality signal is decomposed in Section 5.2, where we argue that the positive SHAP contribution of accessibility is a compound of genuine exposure co-location and reporting-density enhancement, rather than a causal failure of capacity provision.

4.4. Sensitivity, Error Anatomy, and Reporting-Bias Considerations

To quantify how sensitive RISK_T is to the weighting scheme, we ran a 500-iteration Monte Carlo analysis under two scenarios (Appendix B; Table A2). In the aggressive scenario (each indicator’s CRITIC–Entropy weight scaled by a factor drawn from U(0.5, 1.5) and renormalized within its component), the top 10% capture rate varies by up to ±22.9 percentage points around the baseline (Figure A1), indicating that top-decile membership is weight-sensitive. However, the ranking of grids is highly stable: Spearman ρ between baseline and perturbed surfaces has a median of 0.978, with 95% of perturbations maintaining ρ ≥ 0.953 (Figure A2). Under a moderate scenario (U(0.8, 1.2)), the top 10% spread narrows to ±8.6 pp and Spearman stability strengthens to a median 0.996 (95% ≥ 0.993). The composite index is therefore margin-sensitive at the cutoff but order-stable across perturbations—precisely the property that justifies rank-based screening over threshold-based classification. A one-at-a-time (OAT) tornado decomposition (Figure A3) further identifies the indicators whose individual perturbation contributes most to top-decile instability—predominantly the three heavy-tailed hazard indicators (R100mean, FLOWACCmean, SINKDEPmean) whose entropy weights already dominate the Hazard component (Table A1).
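
The logic of the weight-perturbation experiment can be sketched on synthetic data. The sketch deliberately simplifies the study design: it uses a weighted linear sum in place of the within-component TOPSIS aggregation, a single component, and random indicator values. The baseline weights, sample sizes, and seed are hypothetical; only the U(0.5, 1.5) scaling, the renormalization, and the 500 iterations follow the text.

```python
import random

def spearman_rho(a, b):
    """Spearman rank correlation for lists without tied values."""
    def ranks(x):
        order = sorted(range(len(x)), key=lambda i: x[i])
        r = [0] * len(x)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

def perturb_weights(weights, low, high, rng):
    """Scale each weight by a U(low, high) factor and renormalize to sum 1."""
    scaled = [w * rng.uniform(low, high) for w in weights]
    total = sum(scaled)
    return [s / total for s in scaled]

def weighted_score(matrix, weights):
    # Simplification: linear aggregation stands in for TOPSIS here.
    return [sum(w * x for w, x in zip(weights, row)) for row in matrix]

rng = random.Random(42)
n_cells, n_ind = 200, 5
matrix = [[rng.random() for _ in range(n_ind)] for _ in range(n_cells)]
base_w = [0.40, 0.25, 0.15, 0.12, 0.08]   # hypothetical baseline weights
baseline = weighted_score(matrix, base_w)

rhos = sorted(
    spearman_rho(baseline,
                 weighted_score(matrix, perturb_weights(base_w, 0.5, 1.5, rng)))
    for _ in range(500)
)
median_rho = rhos[len(rhos) // 2]
p05_rho = rhos[int(0.05 * len(rhos))]     # 95% of runs are at or above this
```

The two summary values mirror the reported statistics (median ρ and the level that 95% of perturbations exceed); the numbers produced by this synthetic example are not comparable to those in the paper.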

OOF residuals retain Moran’s I = 0.530 (z = 160.2, p < 0.001), against a baseline RISK_T field of I = 0.814. In severely imbalanced tasks (prevalence ≈ 0.74%), residual fields are dominated by the clustered negative majority; complete removal would require eigenvector spatial filters [10] (Section 6). Residual clustering is itself diagnostic: unobserved pipe-network attributes retain explanatory power that open-data features cannot recover, which reinforces the capacity–performance gap.

The 167 positive cells are subject to reporting-density bias across inspection, hotline, and survey streams [16,44]. Ranking-based validation mitigates but does not eliminate this: under uniform inspection intensity across the central nine districts, spatial bias distorts absolute rates but preserves relative rankings. The positive SHAP contribution of accf60 is reinterpreted in Section 5.2 as reporting-density enhancement compounded with exposure co-location. Bayesian under-reporting correction [44] is flagged as a priority extension (Section 6).

5. Discussion

5.1. The Spatial Logic of Flood Risk Formation in Chongqing

This study provides convergent evidence that a composite, grid-based index can capture the first-order geography of pluvial waterlogging risk in Chongqing, particularly within the central city where exposure is highly concentrated. The risk surface shows strong discrimination against two independent point sets (historical and 2022), with ROC-AUC values of 0.834 and 0.873, respectively. Importantly, the same validation table also shows consistently low PR-AUC values, which is expected under extremely low event prevalence and reinforces that the map should be interpreted primarily as a ranking and screening tool rather than a literal probability surface. This pattern is consistent with well-known evaluation theory: when positives are rare, ROC-AUC can remain high even when precision is modest, whereas PR-AUC more directly reflects the challenge of correctly identifying scarce events [45].

Mechanistically, the component maps of hazard, exposure, sensitivity, and adaptive capacity (Figure 4) suggest a structured “risk production chain” consistent with mainstream risk framing: hazard provides the physical forcing background, exposure loads assets and people onto that background, sensitivity translates urban surface characteristics into runoff propensity, and adaptive capacity modulates the severity and recoverability of impacts [1]. In Chongqing, terrain and hydro-climatic constraints create a persistent predisposition template: strong relief, pronounced valley systems, and spatially heterogeneous precipitation extremes jointly condition a region where rapid runoff generation and flow convergence are likely. Urbanization then intensifies this template by expanding impervious cover, reshaping micro-topography, and increasing the density of receptors, which elevates both impact potential and the likelihood that waterlogging is reported.

At the central city scale, corridor-like hotspots and clustered waterlogging points imply that risk is shaped not only by natural topographic convergence but also by constructed drainage pathways and bottlenecks. The built environment can re-route overland flow along roads, rail corridors, and underpasses, and can generate localized ponding at slope breaks or where stormwater inlets are sparse or blocked. The key implication is that urban waterlogging in Chongqing should be understood as a systemic outcome produced by the intersection of (i) intense short-duration rainfall, (ii) terrain-constrained convergence, and (iii) high-density urban development that both amplifies runoff and increases exposure, rather than as a set of isolated “bad spots”.

5.2. Why Risk and Resilience Co-Exist

A potentially counterintuitive but substantively important result is that resilience, operationalized here as “capacity per unit risk”, does not predict the absence of waterlogging. In fact, observed waterlogging points align with higher resilience values, as shown by the low ROC-AUC for “1 − resilience” (0.245 and 0.296 for the two sets). This reveals a common conceptual and empirical pitfall: capacity indicators and performance outcomes are not interchangeable [46].

Two structural mechanisms explain the observed co-existence. First, in dense urban cores, the proxies used for adaptive capacity, such as service accessibility, facility density, and infrastructure presence, are intrinsically high because central areas are well served, well connected, and historically prioritized for investment. Meanwhile, the same cores also concentrate exposure and runoff generation. The result is a “risk–resource co-location” pattern: cities accumulate both hazards and capacities [46,47]. In other words, the resilience index captures a structural advantage in resources and services, but waterlogging points represent operational and micro-scale failures that are only weakly reflected by city-scale capacity proxies.

Second, resilience in this framework is primarily a “coping and recovery capacity” construct, whereas pluvial waterlogging is often governed by “avoidance and conveyance performance” at fine scales: storm sewer capacity, inlet spacing, debris blockage, local subsidence, construction disturbance, micro-topographic depressions, and the timing of peak rainfall relative to network loading. This mismatch naturally produces situations where a place is “resourced” yet still “frequently waterlogged”. The model evidence is consistent with this interpretation: as reported in Section 4.3, predictors such as fire-station accessibility can increase predicted waterlogging likelihood because they operate as signals of urban centrality, not direct protective factors.

Conceptually, this dissociation is consistent with the broader literature. Resilience is about the ability to absorb disturbance, maintain function, and recover, not necessarily about eliminating disturbance occurrence [47,48]. For resilience to work as a more direct “waterlogging avoidance” indicator, it likely requires drainage-system robustness variables (such as pipe capacity, inlet density, pump station coverage, storage facilities, maintenance and blockage records, and observed inundation duration) in addition to general service accessibility.

5.3. Driving Factor Insights for Compound Triggers and Urban Centrality

The explainable machine-learning results provide a process-consistent interpretation of why waterlogging concentrates where it does, and they add nuance that is difficult to derive from component maps alone. Under spatial cross-validation, the model shows strong ranking ability (OOF ROC-AUC around 0.920; AP about 0.074), despite the extremely imbalanced dataset (167 positive grids out of 22,500; prevalence is about 0.74%). This matters because it indicates that the model captures stable spatial regularities rather than overfitting to idiosyncratic point locations.

Mechanistically, the SHAP-based interpretation [32] supports a “compound trigger” perspective: short-duration extreme precipitation acts as the initiating forcing, but its effect is amplified where runoff potential and convergence are high. This is exactly what one would expect in a city like Chongqing, where relief is strong, and drainage pathways are tightly constrained: intense rainfall becomes most consequential in landscapes that rapidly generate surface flow and route it into valley outlets or artificial low points. The non-linear contributions of slope and topographic wetness indicators further suggest multiple susceptible regimes, including low-gradient built-up pockets prone to ponding and steep settings where fast runoff overloads downstream conveyance or concentrates at topographic constrictions.

A second major insight concerns urban centrality variables. Accessibility-related predictors are statistically informative, but their directionality should be interpreted carefully. As shown in Section 4.3, higher accessibility (e.g., fire-station access) can correlate with higher waterlogging likelihood because it marks where infrastructure and services concentrate, which is precisely where imperviousness, runoff generation, traffic disruption, and reporting probability are also high. This illustrates a key methodological lesson for resilience analytics: without careful causal framing, many proxies for urbanization intensity can appear as risk enhancers in occurrence models. In practice, this does not imply that emergency services cause flooding; it implies that centrality, density, and infrastructure concentration co-locate with both resources and waterlogging stressors.

Finally, the model’s utility is operational rather than purely explanatory. The results demonstrate strong spatial concentration of predicted risk: prioritizing a small fraction of grids captures a large share of observed waterlogging points. This supports a pragmatic policy stance: even when probability calibration is imperfect, the ranking can guide efficient inspection, diagnosis, and retrofit targeting.

5.4. Implications for Zoned Governance and Engineering Intervention

The strongest governance message from this study is that precision targeting is feasible. The top-k capture analysis shows that the top 10% highest-scoring grids capture 147 points (88.02%) and the top 20% capture 95.21%. This is not a trivial performance statistic; it is effectively a rule for targeting resources under a constrained budget. A city can generate a ranked “high-risk list,” then progressively narrow from district-level zoning to corridor-level diagnosis and site-level remediation, combining map-based screening with on-the-ground verification such as micro-topography checks, inlet condition audits, CCTV inspection, and localized hydraulic capacity testing.

Given the inferred driver structure, interventions should be differentiated by dominant mechanism rather than applied uniformly. Where extreme rainfall and high runoff potential coincide, the priority is to increase peak buffering and redundancy so that forcing does not exceed system capacity—through distributed storage, detention, and strategically placed overflow routes. In high-convergence corridors and confluence nodes, the emphasis should be on relieving bottlenecks and preventing backwater effects by improving key conveyance links and ensuring continuity of flow paths. In high-centrality, high-exposure urban cores, engineering measures should be coupled with operational management: real-time traffic organization, rapid inlet clearing, temporary drainage deployment during forecasted extremes, and targeted sponge retrofits that reduce recurrent ponding and protect critical urban functions.

Crucially, this zoned approach also resolves an apparent tension: some areas are “high resilience” in the sense of service provision and recovery capacity, yet they remain priority zones for drainage-performance improvement because repeated waterlogging imposes cumulative economic and social costs. Therefore, a resilience-informed governance strategy needs two layers: (i) system-level coping and recovery capacity building (emergency access, shelters, warnings), and (ii) micro-scale drainage robustness improvements that directly reduce waterlogging occurrence and duration.

5.5. Assumptions, Proxy Limitations, and Transferability

The framework rests on four falsifiable assumptions. Aggregation invariance, the assumption that top-k rankings at 500 m are qualitatively preserved at finer or coarser grids, is untested here and constitutes the principal MAUP caveat [38]. The weight-perturbation Monte Carlo (Section 4.4) offers an indirect check (ranking stability is high), but only multi-resolution replication can close the question. Rainfall stationarity, under which the 2020–2024 composite treats warm-season rainfall as statistically stationary, becomes tenuous under climate change [3]; integrating non-stationary CMIP6 scenarios is a priority extension. Capacity–performance substitution, implicit in any resilience index constructed as 1 − risk, is the assumption this paper empirically falsifies; closure requires direct drainage-system attributes (Section 5.2). Label completeness is the most operationally consequential: reporting-density bias (Section 4.4) threatens calibrated absolute-risk estimation, and closure requires Bayesian under-reporting correction [44] with multi-stream event fusion.

Transferability to other mountainous megacities follows a tiered logic. Terrain-controlled indicators (ELEmean, SLOPEmean, FLOWACCmean, TWImean, SINKDEPmean, dist2riv, rivden) transfer directly. Rainfall extreme indicators (Rx1day through P99) transfer in methodological form but require locally appropriate ETCCDI thresholds. Exposure and adaptation structures (population proxy, nighttime light, POI accessibility) transfer in concept but require locally re-curated POI sources and re-calibrated Gauss-2SFCA parameters. The weighting scheme itself must be re-estimated on the destination dataset rather than transplanted numerically. The framework is therefore offered as a transferable protocol, not a transferable set of numerical weights.

6. Conclusions

This study proposes and tests a city-wide framework for assessing urban flood risk and resilience in a terrain-constrained megacity, using Chongqing as a representative case. By integrating multi-source datasets into a uniform 500 m × 500 m fishnet, the framework supports consistent comparison of flood-related conditions across the municipality and provides a practical bridge between climate-related hazard signals, built-environment characteristics, and capacity proxies relevant to emergency response and recovery. The indicator system is organized around hazard, exposure, sensitivity, and adaptation, which allows risk formation and capacity distribution to be mapped jointly rather than treated as a single composite outcome.

A central result is that the composite risk surface aligns with observed waterlogging locations better than single-factor representations. Validated against two independent waterlogging point sets, the risk index achieves ROC-AUC values of 0.834 for historical points and 0.873 for the 2022 set, and it shows statistically significant separation between waterlogging and non-waterlogging samples. This matters because it suggests that observed waterlogging in Chongqing is not well explained by hydrologic or topographic hazard alone. Instead, hotspots reflect coupled effects in which hydroclimatic forcing and terrain convergence interact with exposure concentration and land-surface susceptibility, producing corridor-like and nodal patterns that are consistent with systemic urban structure rather than isolated anomalies.

Equally important, resilience should not be interpreted as the inverse of risk [20,49]. Many observed waterlogging points occur in areas classified as having medium-to-high capacity, and the resilience-related score performs below random at discriminating waterlogging from non-waterlogging samples. This is not a failure of the method; it is an empirical signal that dense urban cores can be simultaneously high-risk and high-capacity [50]. In practice, core districts often have better service accessibility and infrastructure provision, yet they also experience higher runoff loads, higher imperviousness, and far higher exposure. Treating “capacity” as though it guarantees safety can therefore mislead flood governance. The more defensible interpretation is that resilience indicators reflect response and recovery capacity, while risk indicators reflect the likelihood and potential impact of overload and failure. Planning decisions need both maps because they answer different questions: where failures are likely to occur, and where the system can absorb and recover from them.

To move from mapping to explanation, we further tested the information content of the assembled features using a LightGBM model evaluated with spatial cross-validation in a highly imbalanced setting. The model shows strong ranking performance, which supports the view that multi-source indicators capture consistent signals related to waterlogging occurrence. More importantly for governance, the model output concentrates risk into a small spatial fraction: the top 10% of grids capture 88.02% of observed points, and the top 20% capture 95.21%. This kind of concentration is operationally valuable because municipal resources are limited. It implies that targeted inspection, monitoring, and drainage retrofits can be prioritized in a small set of high-scoring areas while still covering most historically affected locations. At the same time, calibration analysis indicates that scores should be treated as relative risk rankings rather than literal probabilities, a common issue for rare events and heterogeneous urban systems.
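
The capture rates quoted above can be reproduced from any score surface with a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `topk_capture` and the toy arrays are hypothetical, and it assumes one score and one event count per grid cell.

```python
import numpy as np

def topk_capture(scores, event_counts, fraction):
    """Share of observed events that fall in the highest-scoring `fraction` of cells."""
    scores = np.asarray(scores, dtype=float)
    event_counts = np.asarray(event_counts, dtype=float)
    # Number of cells in the top slice (at least one).
    k = max(1, int(np.ceil(fraction * scores.size)))
    # Indices of the k highest scores; a stable sort keeps ties reproducible.
    top = np.argsort(-scores, kind="stable")[:k]
    return event_counts[top].sum() / event_counts.sum()

# Toy data (hypothetical): 10 cells, 5 recorded waterlogging points.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0])
events = np.array([2, 1, 1, 0, 0, 1, 0, 0, 0, 0])

capture_top20 = topk_capture(scores, events, 0.20)
capture_top50 = topk_capture(scores, events, 0.50)
print(capture_top20, capture_top50)
```

Because the metric depends only on the ordering of cells, it remains meaningful when scores are poorly calibrated, which is consistent with treating the model output as a relative ranking.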

Future work should integrate five methodological extensions that follow directly from the reported diagnostics: (1) multi-resolution MAUP replication at 250 m and 1 km to test aggregation-invariance; (2) non-stationary rainfall integration via CMIP6-downscaled precipitation scenarios; (3) direct drainage-system attributes (pipe capacity, inlet spacing, pump-station coverage, blockage records) replacing accessibility proxies in the adaptation component, converting the resilience indicator from a structural-availability measure into a hydraulic-performance measure, and closing the capacity–performance gap diagnosed in Section 5.2; (4) Bayesian spatial under-reporting correction [44] fused with geocoded social-media reports, supporting calibrated absolute-risk estimation; and (5) residual spatial-autocorrelation absorption via eigenvector spatial filters [10] or Gaussian-process residual layers, targeting the 0.53 residual Moran’s I diagnosed in Section 4.4. Complementary lower-priority extensions include systematic cross-library benchmarking (XGBoost, CatBoost) with Bayesian hyperparameter optimization, and calibration-preserving imbalance corrections.
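
For readers who want to monitor the residual autocorrelation targeted by extension (5), global Moran's I can be computed directly on a regular grid. The sketch below assumes binary rook-contiguity weights (cells sharing an edge), which may differ from the spatial weights used in this study; the function name `morans_i_rook` and the test patterns are illustrative only.

```python
import numpy as np

def morans_i_rook(grid):
    """Global Moran's I on a regular 2-D grid with binary rook-contiguity weights."""
    z = np.asarray(grid, dtype=float)
    z = z - z.mean()
    # Cross-products over horizontally and vertically adjacent cell pairs.
    horiz = (z[:, :-1] * z[:, 1:]).sum()
    vert = (z[:-1, :] * z[1:, :]).sum()
    n_pairs = z[:, :-1].size + z[:-1, :].size
    # Each undirected pair appears twice in the symmetric weight matrix.
    s0 = 2.0 * n_pairs
    cross = 2.0 * (horiz + vert)
    return (z.size / s0) * cross / (z ** 2).sum()

checker = np.indices((4, 4)).sum(axis=0) % 2   # alternating 0/1 pattern
gradient = np.tile(np.arange(6.0), (6, 1))     # smooth west-east trend

i_checker = morans_i_rook(checker)
i_gradient = morans_i_rook(gradient)
print(i_checker, i_gradient)
```

A perfectly alternating pattern gives strong negative autocorrelation and a smooth trend gives strong positive autocorrelation; applying the same function to gridded model residuals before and after adding spatial filters would show how much structure has been absorbed.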

In conclusion, the proposed risk–resilience assessment provides a validated, interpretable, and operational basis for urban flood governance in complex terrain-constrained settings. By explicitly separating capacity from risk, validating against observed events, and translating results into spatial prioritization logic, the study supports more targeted and defensible decisions on drainage configuration, flood-risk monitoring, and resilience planning.

Author Contributions

Conceptualization, T.Y.; methodology, T.Y.; software, F.T.; validation, T.Y., X.Z. and F.T.; formal analysis, T.Y. and X.Z.; investigation, F.T.; resources, T.Y. and X.Z.; data curation, T.Y.; writing—original draft preparation, T.Y.; writing—review and editing, F.T. and X.Z.; visualization, F.T.; supervision, Y.Y. and X.Z.; project administration, Y.Y. and X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Fundamental Research Program of the Chongqing Municipal Commission of Housing and Urban–Rural Development (Chengke No. 2024, 7-6) and the Science and Technology Research Program of the Chongqing Municipal Education Commission (Grant No. KJQN202404315).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1 lists the weighting values for every individual indicator; the last column, W_Final, is the final weight used in the overall calculation.

Table A1. The details of weighting metrics.

System	Field	Std_Norm	Critic_Conflict	Critic_Info	W_Critic	W_Entropy	W_Final
HAZARD	RX1mean	0.222466	11.08928	2.466987	0.064206	0.008104	0.036155
	RX3mean	0.204177	10.38629	2.120643	0.055192	0.01507	0.035131
	RX5mean	0.242709	9.952086	2.41546	0.062865	0.019661	0.041263
	R20mean	0.245672	11.63674	2.858823	0.074404	0.008902	0.041653
	R50mean	0.266375	10.14134	2.701401	0.070307	0.016215	0.043261
	R100mean	0.143823	13.73332	1.975168	0.051406	0.278126	0.164766
	P95mean	0.257731	10.48454	2.702188	0.070327	0.013033	0.04168
	P99mean	0.301684	9.704231	2.927614	0.076194	0.018619	0.047407
	ELEmean	0.223057	10.10144	2.2532	0.058642	0.005265	0.031953
	SLOPEmean	0.237815	10.51654	2.50099	0.065091	0.00727	0.03618
	FLOWACCmea	0.114128	12.32087	1.406157	0.036597	0.241598	0.139098
	TWImean	0.196006	10.19315	1.997921	0.051998	0.026792	0.039395
	SINKDEPmea	0.166429	11.28743	1.878553	0.048891	0.143067	0.095979
	dist2riv	0.248086	10.98661	2.72563	0.070938	0.005907	0.038422
	RUNOFFmean	0.221075	13.19442	2.916962	0.075917	0.019861	0.047889
	rivden	0.217165	11.85851	2.575251	0.067024	0.172509	0.119766
SENSITIVITY	BTSMmean	0.319848	3.836293	1.22703	0.144798	0.13174	0.138269
	NDVImean	0.245131	3.556383	0.87178	0.102876	0.044608	0.073742
	FARMLmean	0.465519	4.09305	1.905393	0.224849	0.071924	0.148386
	FORESTmean	0.370732	4.387123	1.626447	0.191932	0.026566	0.109249
	GLASSmean	0.102442	5.613504	0.575061	0.067861	0.001596	0.034728
	WATERmean	0.154027	5.324078	0.820054	0.096772	0.496087	0.296429
	CONSTmean	0.381289	3.79851	1.44833	0.170913	0.227481	0.199197
EXPOSURE	POPmean	0.212337	0.358392	0.0761	0.47763	0.347836	0.412733
EXPOSURE	LIGHT24mea	0.232226	0.358392	0.083228	0.52237	0.652164	0.587267
ADAPTATION	acch60	0.283186	2.17845	0.616906	0.291156	0.03693	0.164043
	accf60	0.258815	2.172369	0.562241	0.265356	0.017283	0.14132
	accsr2000	0.129884	2.929134	0.380448	0.179557	0.214296	0.196927
	rd_den	0.211929	2.638707	0.559219	0.26393	0.053854	0.158892
	UGS20mean	0	5	0	0	0.677637	0.338819
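
The columns of Table A1 are arithmetically linked, and the links can be checked on the two EXPOSURE rows. The sketch below is an inference from the tabulated values rather than a restatement of the authors' code: it assumes Critic_Info is the product of Std_Norm and Critic_Conflict, that W_Critic is Critic_Info normalised within the component, and that W_Final is the simple average of W_Critic and W_Entropy. The conflict and entropy values are taken from the table as given.

```python
import numpy as np

# EXPOSURE rows of Table A1: POPmean, LIGHT24mea.
std_norm = np.array([0.212337, 0.232226])
conflict = np.array([0.358392, 0.358392])
w_entropy = np.array([0.347836, 0.652164])

# CRITIC information content: contrast (standard deviation) times conflict.
critic_info = std_norm * conflict
# Normalise within the component so the CRITIC weights sum to one.
w_critic = critic_info / critic_info.sum()
# Equal-share fusion of the two weight vectors (inferred from the table).
w_final = 0.5 * (w_critic + w_entropy)

print(np.round(critic_info, 6))
print(np.round(w_critic, 6))
print(np.round(w_final, 6))
```

Under these assumptions the computed values agree with the tabulated W_Critic and W_Final entries for POPmean and LIGHT24mea to the printed precision.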

Appendix B

This appendix presents the detailed Monte Carlo weight-perturbation sensitivity results summarized in Section 4.4. To assess how robust the composite RISK_T ranking is to the CRITIC–Entropy weighting scheme, each indicator’s baseline weight was independently scaled by a random factor and renormalized within its component, and the full aggregation pipeline was re-run end-to-end for 500 Monte Carlo iterations under two scenarios: an aggressive scenario with perturbation factor k ~ U(0.5, 1.5), and a moderate scenario with k ~ U(0.8, 1.2). Table A2 summarizes margin-level and order-level stability across both scenarios. Figure A1 shows the distribution of the top 10% capture rate; Figure A2 shows the distribution of Spearman rank correlation between baseline and perturbed RISK_T surfaces; and Figure A3 presents the one-at-a-time (OAT) tornado decomposition identifying indicator-level contributions to margin instability.
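
The perturbation loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: the indicator matrix is synthetic, the baseline weights are hypothetical, and a weighted linear sum stands in for the full CRITIC–Entropy and TOPSIS pipeline that the study re-runs end-to-end. Only the perturbation ranges and the iteration count are taken from the text.

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman rank correlation for tie-free vectors (Pearson correlation of ranks)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def perturb_weights(weights, half_width, rng):
    """Scale each weight by k ~ U(1 - half_width, 1 + half_width), then renormalise."""
    k = rng.uniform(1.0 - half_width, 1.0 + half_width, size=weights.size)
    w = weights * k
    return w / w.sum()

rng = np.random.default_rng(0)
X = rng.random((2000, 6))                                  # synthetic normalised indicators
base_w = np.array([0.30, 0.25, 0.15, 0.12, 0.10, 0.08])   # hypothetical baseline weights
base_score = X @ base_w                                    # stand-in for the composite score

rhos = {}
for label, half_width in {"aggressive": 0.5, "moderate": 0.2}.items():
    rhos[label] = np.array([
        spearman_rho(base_score, X @ perturb_weights(base_w, half_width, rng))
        for _ in range(500)
    ])
    # Median, and the 5th percentile (the value met or exceeded by 95% of runs).
    print(label, np.median(rhos[label]), np.percentile(rhos[label], 5))
```

Reporting the 5th percentile alongside the median mirrors the order-level stability summary in Table A2, where the second figure is the correlation reached in 95% of scenarios.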

Table A2. Weight-perturbation sensitivity summary.

Scenario	Metric	Value
Aggressive ± 50%	Baseline top 10% (historical/2022)	0.385/0.571
Aggressive ± 50%	Top 10% 95% range	±22.9 pp
Aggressive ± 50%	Spearman ρ median/≥95%	0.978/0.953
Moderate ± 20%	Top 10% 95% range	±8.6 pp
Moderate ± 20%	Spearman ρ median/≥95%	0.996/0.993
—	Monte Carlo iterations per scenario	500

Figure A1. Top 10% capture histogram under ±50% + baseline.

Figure A2. Top 10% capture histogram under ±20% + baseline.

Figure A3. OAT indicator-level tornado, aggressive scenario.

References

1. Intergovernmental Panel on Climate Change (IPCC). Climate Change 2022—Impacts, Adaptation and Vulnerability: Working Group II Contribution to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, 1st ed.; Cambridge University Press: Cambridge, UK, 2023.
2. Wen, G.; Ji, F. Flood Resilience Assessment of Region Based on TOPSIS-BOA-RF Integrated Model. Ecol. Indic. 2024, 169, 112901.
3. Zhang, W.; Liu, M.; Zhuang, Z.; Wang, Y.; Liu, J.; Qiao, X.; Li, B.; Sun, H. Spatiotemporal Characteristics of Extreme Rainfall in a Mountain City Under Urbanization: Case Study of Chongqing. Desalin. Water Treat. 2025, 324, 101443.
4. Li, Y.; Gao, J.; Yin, J.; Liu, L.; Zhang, C.; Wu, S. Flood Risk Assessment of Areas Under Urbanization in Chongqing, China, by Integrating Multi-Models. Remote Sens. 2024, 16, 219.
5. Chen, X.; Guo, Z.; Zhou, H.; Qian, X.; Zhang, X. Urban Flood Resilience Assessment Based on VIKOR-GRA: A Case Study in Chongqing, China. KSCE J. Civ. Eng. 2022, 26, 4178–4194.
6. Wu, Z.; Chen, Y.; Zheng, X.; Huang, S.; Duan, C.; Wang, P. A Novel Framework for Evidence-Based Assessment of Flood Resilience Integrating Multi-Source Evidence: A Case Study of the Yangtze River Economic Belt, China. Ecol. Indic. 2024, 167, 112705.
7. Khoshkonesh, A.; Nazari, R.; Nikoo, M.R.; Karimi, M. Enhancing Flood Risk Assessment in Urban Areas by Integrating Hydrodynamic Models and Machine Learning Techniques. Sci. Total Environ. 2024, 952, 175859.
8. Wang, M.; Li, Y.; Yuan, H.; Zhou, S.; Wang, Y.; Adnan Ikram, R.M.; Li, J. An XGBoost-SHAP Approach to Quantifying Morphological Impact on Urban Flooding Susceptibility. Ecol. Indic. 2023, 156, 111137.
9. Li, S.; Ge, X.; Jin, G.; Lou, Z.; Yang, H. Flood Dynamic Monitoring and XGBoost-SHAP Based Risk Assessment: A Case Study of the 23·7 Extreme Rainstorm in BTH Region, China. Environ. Sustain. Indic. 2025, 28, 101020.
10. Liu, X.; Kounadi, O.; Zurita-Milla, R. Incorporating Spatial Autocorrelation in Machine Learning Models Using Spatial Lag and Eigenvector Spatial Filtering Features. ISPRS Int. J. Geo-Inf. 2022, 11, 242.
11. Diakoulaki, D.; Mavrotas, G.; Papayannakis, L. Determining Objective Weights in Multiple Criteria Problems: The Critic Method. Comput. Oper. Res. 1995, 22, 763–770.
12. Lee, Y.; Jeong, H.; Lee, Y.; Lee, B.; Lee, S. Assessing Urban Flood Susceptibility in Seoul, South Korea Using Machine Learning Models: Effects of Urban Infrastructure and Sampling Variability. J. Hydrol. 2026, 674, 135531.
13. Fraehr, N.; Wang, Q.J.; Wu, W.; Nathan, R. Assessment of Surrogate Models for Flood Inundation: The Physics-Guided LSG Model vs. State-of-the-Art Machine Learning Models. Water Res. 2024, 252, 121202.
14. Song, W.; Guan, M.; Guo, K.; Yu, D. Rapid Flood Inundation Mapping by Integrating Deep Learning-Based Image Super-Resolution with Coarse-Grid Hydrodynamic Modeling. Eng. Appl. Comput. Fluid Mech. 2025, 19, 2481115.
15. Borsekova, K.; Nijkamp, P.; Guevara, P. Urban Resilience Patterns After an External Shock: An Exploratory Study. Int. J. Disaster Risk Reduct. 2018, 31, 381–392.
16. Bertilsson, L.; Wiklund, K.; De Moura Tebaldi, I.; Rezende, O.M.; Veról, A.P.; Miguez, M.G. Urban Flood Resilience—A Multi-Criteria Index to Integrate Flood Resilience into Urban Planning. J. Hydrol. 2019, 573, 970–982.
17. Liao, R.; Xu, Z.; Huang, Y. Dynamic Response of Urban Pluvial Flood Resilience Under a Multi-Dimensional Assessment Framework. Sustainability 2025, 17, 10044.
18. Cheng, L.; Wang, Z.; Pei, R.; Wu, J. Sustainable Urban Management and Flood Resilience in China’s Yangtze River Economic Belt: Drivers, Patterns, and Policy Synergies. Sustain. Cities Soc. 2025, 131, 106737.
19. Qian, J.; Du, Y.; Liang, F.; Yi, J.; Zhang, X.; Jiang, J.; Wang, N.; Tu, W.; Huang, S.; Pei, T.; et al. Measuring Community Resilience Inequality to Inland Flooding Using Location Aware Big Data. Cities 2024, 149, 104915.
20. Shamsudduha, M. Redefining Flood Hazard and Addressing Emerging Risks in an Era of Extremes. npj Nat. Hazards 2025, 2, 29.
21. Han, S.; Wang, B.; Ao, Y.; Bahmani, H.; Chai, B. The Coupling and Coordination Degree of Urban Resilience System: A Case Study of the Chengdu–Chongqing Urban Agglomeration. Environ. Impact Assess. Rev. 2023, 101, 107145.
22. Wang, Q.; Gu, H.; Zang, X.; Zuo, M.; Li, H. Flood Resilience in Cities and Urban Agglomerations: A Systematic Review of Hazard Causes, Assessment Frameworks, and Recovery Strategies Based on LLM Tools. Nat. Hazards 2025, 121, 12391–12426.
23. Khodadad, M.; Aguilar-Barajas, I.; Khan, A. Green Infrastructure for Urban Flood Resilience: A Review of Recent Literature on Bibliometrics, Methodologies, and Typologies. Water 2023, 15, 523.
24. Chen, L.; Yao, Y.; Xiang, K.; Dai, X.; Li, W.; Dai, H.; Lu, K.; Li, W.; Lu, H.; Zhang, Y.; et al. Spatial-Temporal Pattern of Ecosystem Services and Sustainable Development in Representative Mountainous Cities: A Case Study of Chengdu-Chongqing Urban Agglomeration. J. Environ. Manag. 2024, 368, 122261.
25. Guo, Z.; Li, Z.; Lu, C.; She, J.; Zhou, Y. Spatio-Temporal Evolution of Resilience: The Case of the Chengdu-Chongqing Urban Agglomeration in China. Cities 2024, 153, 105226.
26. Xue, F.; Wang, J.; Huang, Y.; Jing, R.; Lu, Q. From Sponge City to Sponge Watershed: Addressing Comprehensive Water Issues through an Innovative Framework. IOP Conf. Ser. Earth Environ. Sci. 2020, 569, 012083.
27. Cao, Y.; Yang, T.; Wu, H.; Yan, S.; Yang, H.; Zhu, C.; Liu, Y. Resilience Assessment and Improvement Strategies for Urban Haze Disasters Based on Resident Activity Characteristics: A Case Study of Gaoyou, China. Atmosphere 2024, 15, 289.
28. Chen, L.; Chang, M.; Yang, H.; Xiao, Y.; Huang, H.; Wang, X. Comprehensive Evaluation and Optimal Management of Extreme Disaster Risk in Chinese Urban Agglomerations by Integrating Resilience Risk Elements and Set Pair Analysis. Int. J. Disaster Risk Reduct. 2024, 111, 104671.
29. Lu, Y.; Zhai, G.; Zhai, W. Quantifying Urban Spatial Resilience Using Multi-Criteria Decision Analysis (MCDA) and Back Propagation Neural Network (BPNN). Int. J. Disaster Risk Reduct. 2024, 111, 104694.
30. Tang, F.; Zeng, P.; Guo, Y.; Shen, Y.; Wang, L.; Liu, K.; Zhang, L. Decoding the Spatiotemporal Dynamics and Driving Mechanisms of Ecological Resilience in the Beijing-Tianjin-Hebei Urban Agglomeration: A Deep Learning Approach. Urban Clim. 2025, 61, 102436.
31. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
32. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
33. Deng, J.; Zhang, R.; Chen, S.; Li, Z.; Gao, L.; Li, Y.; Wei, C. Spatiotemporal Evolution and Influencing Factors of Flood Resilience in Beibu Gulf Urban Agglomeration. Int. J. Disaster Risk Reduct. 2024, 114, 104905.
34. Gralepois, M. What Can We Learn from Planning Instruments in Flood Prevention? Comparative Illustration to Highlight the Challenges of Governance in Europe. Water 2020, 12, 1841.
35. Wang, B.; Han, S.; Ao, Y.; Liao, F. Evaluation and Factor Analysis for Urban Resilience: A Case Study of Chengdu–Chongqing Urban Agglomeration. Buildings 2022, 12, 962.
36. Tan, L.; Schultz, D.M. Damage Classification and Recovery Analysis of the Chongqing, China, Floods of August 2020 Based on Social-Media Data. J. Clean. Prod. 2021, 313, 127882.
37. Luo, W.; Qi, Y. An Enhanced Two-Step Floating Catchment Area (E2SFCA) Method for Measuring Spatial Accessibility to Primary Care Physicians. Health Place 2009, 15, 1100–1107.
38. Fotheringham, A.S.; Wong, D.W.S. The Modifiable Areal Unit Problem in Multivariate Statistical Analysis. Environ. Plan. A Econ. Space 1991, 23, 1025–1044.
39. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
40. Hwang, C.-L.; Yoon, K. Methods for Multiple Attribute Decision Making. In Multiple Attribute Decision Making: Methods and Applications A State-of-the-Art Survey; Hwang, C.-L., Yoon, K., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; pp. 58–191.
41. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
42. Berti, M.; Bernard, M.; Gregoretti, C.; Simoni, A. Physical Interpretation of Rainfall Thresholds for Runoff-Generated Debris Flows. J. Geophys. Res. Earth Surf. 2020, 125, e2019JF005513.
43. Shuster, W.D.; Bonta, J.; Thurston, H.; Warnemuende, E.; Smith, D.R. Impacts of Impervious Surface on Watershed Hydrology: A Review. Urban Water J. 2005, 2, 263–275.
44. Agostini, G.; Pierson, E.; Garg, N. A Bayesian Spatial Model to Correct Under-Reporting in Urban Crowdsourcing. Proc. AAAI Conf. Artif. Intell. 2024, 38, 21888–21896.
45. Davis, J.; Goadrich, M. The Relationship Between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning—ICML’06; ACM Press: Pittsburgh, PA, USA, 2006; pp. 233–240.
46. Meerow, S.; Newell, J.P.; Stults, M. Defining Urban Resilience: A Review. Landsc. Urban Plan. 2016, 147, 38–49.
47. Bruneau, M.; Chang, S.; Eguchi, R.; Lee, G.; O’Rourke, T.; Reinhorn, A.; Shinozuka, M.; Tierney, K.; Wallace, W.; Winterfeldt, D. A Framework to Quantitatively Assess and Enhance the Seismic Resilience of Communities. Earthq. Spectra 2003, 19, 733–752.
48. Cutter, S.L.; Barnes, L.; Berry, M.; Burton, C.; Evans, E.; Tate, E.; Webb, J. A Place-Based Model for Understanding Community Resilience to Natural Disasters. Glob. Environ. Change 2008, 18, 598–606.
49. Heinzlef, C.; Robert, B.; Hémond, Y.; Serre, D. Operating Urban Resilience Strategies to Face Climate Change and Associated Risks: Some Advances from Theory to Application in Canada and France. Cities 2020, 104, 102762.
50. Wang, Q.; Zhang, R.; Li, H.; Zang, X. Analysis of Mechanism and Optimal Value of Urban Built Environment Resilience in Response to Stormwater Flooding. Ecol. Indic. 2024, 158, 111625.

Figure 1. The site location and elevation of Chongqing and its satellite image of the central city area.

Figure 2. The monthly precipitation heatmap and the annual changes in Chongqing from 2010 to 2024.

Figure 3. The research framework.

Figure 4. The spatial distribution of sensitivity–exposure–adaptive capacity–hazard–vulnerability–risk–resilience in Chongqing.

Figure 5. The spatial distribution of risk resilience of the central city area in Chongqing.

Figure 6. The ROC and precision–recall curves of the model.

Figure 7. The top-k cumulative gain curve.

Figure 8. The calibration curve.

Figure 9. SHAP value for key drivers.

Figure 10. The binned SHAP dependence plots for important driving forces. ((a) Runoff; (b) Accessibility of fire stations; (c) Flow accumulation; (d) 95th percentile daily precipitation; (e) Annual number of very heavy precipitation days (50mm); (f) Slope).

Table 1. Indicator system for the 500 m grid database.

System	Indicator	Unit	Data Source	Impact	Abbreviation
Hazard	Annual maximum 1-day precipitation (Rx1day)	mm	Calculated using the ChinaMet Dataset of daily precipitation in Chongqing from May to October, 2020–2024 (https://www.doi.org/10.12072/ncdc.nieer.db6722.2025 accessed on 12 December 2025.)	Positive	RX1mean
	Annual maximum 3-day precipitation (Rx3day)	mm		Positive	RX3mean
	Annual maximum 5-day precipitation (Rx5day)	mm		Positive	RX5mean
	Annual number of heavy precipitation days (R20mm)	days		Positive	R20mean
	Annual number of very heavy precipitation days (R50mm)	days		Positive	R50mean
	Annual number of extreme precipitation days (R100mm)	days		Positive	R100mean
	95th percentile daily precipitation (P95)	mm		Positive	P95mean
	99th percentile daily precipitation (P99)	mm		Positive	P99mean
	Elevation	m	Calculated using the ASTER Global Digital Elevation Model (https://www.earthdata.nasa.gov/data/catalog/lpcloud-astgtm-003 accessed on 15 December 2025.)	Negative	ELEmean
	Slope	degree		Negative	SLOPEmean
	Topographic wetness index (TWI)	–		Positive	TWImean
	Flow accumulation	–		Positive	FLOWACCmean
	Sink depth	m		Positive	SINKDEPmean
	Land surface runoff	–	Data available at https://figshare.com/articles/dataset/Gridded_mean_annual_runoff_for_2021_at_250m_resolution/19596157 accessed on 2 December 2025.	Positive	RUNOFFmean
	River density	km/km²	Calculated using Open Street Map shapefiles (https://download.geofabrik.de/asia/china.html accessed on 15 December 2025.)	Positive	rivden
	Distance to nearest river	m	Calculated using Open Street Map shapefiles (https://download.geofabrik.de/asia/china.html accessed on 20 December 2025.)	Negative	dist2riv
Exposure	Population density	persons/km²	Data available at https://www.resdc.cn/DOI/DOI.aspx?DOIID=32 accessed on 15 December 2025.	Positive	POPmean
	GDP per capita	Yuan	Data available at https://www.resdc.cn/DOI/DOI.aspx?DOIID=33 accessed on 15 December 2025.	Positive	GDPmean
	Nighttime light intensity	–	Extended VIIRS-like dataset, available at https://doi.org/10.11888/HumanNat.tpdc.302930 accessed on 15 December 2025.	Positive	LIGHT24mean
Sensitivity	Farmland proportion	%	Calculated using the CNLUCC dataset, available at https://www.resdc.cn/DOI/DOI.aspx?DOIID=54 accessed on 12 December 2025.	Positive	FARMLmean
	Forest proportion	%		Negative	FORESTmean
	Grassland proportion	%		Negative	GLASSmean
	Waterbody proportion	%		Positive	WATERmean
	Construction land proportion	%		Positive	CONSTmean
	Urban green space proportion	%	Data available at https://doi.org/10.3974/geodb.2025.01.04.V1 accessed on 12 December 2025.	Negative	UGS20mean
	Vegetation index (NDVI)	–	MODIS NDVI, available at https://doi.org/10.5067/MODIS/MOD13A3.006 accessed on 12 December 2025.	Negative	NDVImean
	Impervious surface ratio	–	Data available at https://data-starcloud.pcl.ac.cn/iearthdata/13 accessed on 12 December 2025.	Positive	BTSMmean
Adaptation	Accessibility of fire/rescue services (Gaussian 2SFCA)	–	Calculated using the Amap POI dataset (https://www.amap.com/ accessed on 20 December 2025.)	Positive	accf60
	Accessibility of healthcare services (Gaussian 2SFCA)	–		Positive	acch60
	Shelter accessibility (Gaussian 2SFCA)	–		Positive	accsr1000
	Road density	km/km²	Calculated using Open Street Map shapefiles (https://download.geofabrik.de/asia/china.html accessed on 20 December 2025.)	Positive	rd_den

Table 2. Validation metrics with waterlogging points.

Point set	Score	ROC-AUC	PR-AUC (AP)
historical	RISK	0.834139	0.016677
historical	HAZARD	0.801836	0.011654
historical	VULNERABILITY	0.789066	0.019468
historical	1 − RESILIENCE	0.245187	0.002877
current	RISK	0.873448	0.016006
current	HAZARD	0.824419	0.007715
current	VULNERABILITY	0.862586	0.019972
current	1 − RESILIENCE	0.295575	0.001892
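
When reading the complemented resilience rows, it helps to recall that ROC-AUC is the probability that a randomly chosen positive outranks a randomly chosen negative, so complementing a score complements its AUC. The sketch below illustrates this with a pairwise implementation; the function name `roc_auc` and the toy arrays are hypothetical and are not drawn from the study's data.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC-AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Pairwise differences between every positive and every negative sample.
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

# Toy data (hypothetical): a capacity-like score that is high where events occur.
labels = np.array([1, 1, 1, 0, 0, 0, 0, 0])
capacity = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1])

auc_capacity = roc_auc(capacity, labels)
auc_complement = roc_auc(1.0 - capacity, labels)
print(auc_capacity, auc_complement)
```

The pairwise form is quadratic in sample size and is meant only for illustration; for a 22,500-cell grid a rank-based routine from a standard library would be the practical choice.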
