Explainable Digital Twins for Urban Drainage Resilience: A Multi-Source TCN-LSTM Framework for Real-Time Water Flow Prediction

Wang, Yinglin; Wen, Xiaofang; Kong, Lingyu; Chan, Anson Tsz Kwan; Zhu, Liang

doi:10.3390/buildings16101856

by Yinglin Wang ¹, Xiaofang Wen ², Lingyu Kong ¹, Anson Tsz Kwan Chan ² and Liang Zhu ¹,*

¹ Department of Environmental Engineering, Zhejiang University, Hangzhou 310058, China
² Department of Engineering, University of Cambridge, Cambridge CB3 0FA, UK
* Author to whom correspondence should be addressed.

Buildings 2026, 16(10), 1856; https://doi.org/10.3390/buildings16101856

Submission received: 27 February 2026 / Revised: 27 April 2026 / Accepted: 29 April 2026 / Published: 7 May 2026

(This article belongs to the Special Issue Emerging Technologies for Digital Transformation of Resilient Built Assets)

Abstract

Urban drainage systems (UDSs) are critical built assets increasingly challenged by short-duration extreme rainfall, aging infrastructure, and rising surcharge risk. Physics-based hydrodynamic models are widely used for system assessment, but their high computational cost limits real-time operational prediction. Existing data-driven prediction approaches improve computational efficiency, but often rely mainly on sensor inputs and provide limited asset-level interpretation. This study develops an explainable digital twin for real-time prediction of storm-driven water level response in a separate sewer network in the Yangtze River Delta, China. The framework integrates 5 min monitoring and SCADA data, including water level, flow, pump status, and rainfall, with GIS and as-built asset information, including pipe geometry, hydraulic capacity, catchment characteristics, and network connectivity. A hybrid TCN-LSTM model was developed to predict water level and surcharge risk probability at 15–60 min lead times. A surrogate-based SHAP module was used to explain model predictions at the node and subcatchment scales. Multi-source fusion reduced the RMSE by approximately 18% compared with sensor-only baselines. The SHAP results showed that the pipe capacity-related variables and upstream contributing area were the main drivers of surcharge onset. The framework provides interpretable, operationally relevant predictions to support the resilience-oriented management of urban drainage systems.

Keywords: digital twin; urban drainage systems; explainable AI; SHAP; TCN-LSTM; real-time prediction; built asset management; Yangtze River Delta

1. Introduction

Urban drainage infrastructure constitutes one of the most safety-critical and climate-vulnerable classes of built assets in the contemporary city. Defined here as the network of pipes, channels, retention basins, pump stations, and associated hydraulic control structures designed to convey stormwater and wastewater, urban drainage systems must simultaneously absorb the hydrological variability imposed by climate change and the increased impervious surface fractions associated with ongoing urbanisation. The consequences of drainage system failure range from localised basement flooding and traffic disruption to large-scale urban inundation, with associated impacts on property, public health, and economic productivity. In China, flood disasters have imposed substantial economic burdens over recent decades. It is estimated that direct economic losses from flood events in China from 1990 to 2018 exceeded CNY 4 trillion, reflecting the increasing scale of flood risk in rapidly urbanising regions [1].

Conventional approaches to drainage system management rely on periodic inspection regimes, physics-based hydrodynamic simulation (e.g., SWMM, InfoWorks ICM), and reactive maintenance triggered by observed failures. While these approaches provide engineering rigour, they face fundamental limitations in the context of real-time operational decision support. Full hydrodynamic models require detailed calibration, are computationally prohibitive for operational time horizons, and cannot readily assimilate real-time sensor data for state estimation. Inspection-based management is inherently retrospective and fails to anticipate failure modes associated with compound events, such as simultaneous pump failure and extreme rainfall. The gap between the technical capability of current tools and the operational demands of resilience-oriented asset management is, therefore, substantial [2,3]. This study addresses this gap by developing an explainable digital twin framework for real-time prediction of storm-driven water level responses in urban drainage systems. Additional background on conventional hydrodynamic modelling and operational constraints is provided in Supplementary Material S3 [4,5,6,7,8].

The digital transformation of built asset management, reflected in the use of digital twins, Internet of Things (IoT) sensing, and data-driven machine learning models, provides a practical pathway for addressing this gap. A digital twin, in the context of built infrastructure, denotes a continuously updated, data-enriched virtual representation of a physical asset that supports simulation, monitoring, and operational decision-making across the asset lifecycle [9]. Recent studies have demonstrated the value of digital twins and real-time control frameworks for urban drainage, stormwater, and wastewater systems, particularly through sensor integration, online state estimation, and operational prediction [10,11,12,13]. The emergence of deep learning architectures capable of learning complex temporal patterns from high-frequency sensor data has made data-driven surrogate modelling of drainage system hydraulics computationally tractable for real-time applications [14,15]. However, many existing drainage prediction models rely mainly on rainfall and sensor observations, while the static characteristics of drainage assets, such as pipe geometry, hydraulic capacity, upstream contributing area, and network connectivity, are not fully incorporated. This limits their ability to explain why specific locations are more vulnerable than others under the same rainfall forcing.

Model interpretability remains a key challenge. Deep learning models can improve hydraulic prediction, but their outputs are difficult to interpret in operational settings. For drainage management, a forecast is more useful when operators can also identify the factors driving the predicted surcharge risk [16]. This is particularly important when model outputs are used to support pump operation, emergency response, inspection planning, or rehabilitation prioritisation. SHAP can attribute model outputs to input features [17]. However, its use in drainage digital twins remains limited, especially for node-level and asset-level interpretation.

This paper addresses these challenges by presenting an explainable digital twin framework for real-time prediction of urban drainage system responses to extreme rainfall. The principal contributions of this study are as follows: (i) a multi-source data fusion architecture that integrates dynamic sensor streams with static built-asset descriptors to enable hydraulic-capacity-aware forecasting in urban drainage networks; (ii) a hybrid TCN–LSTM deep learning architecture designed to capture both local event-driven dynamics and long-range hydraulic memory effects relevant to pipe network surcharge propagation; (iii) an integrated SHAP Explainability Module that provides asset-level attribution of forecast outputs, supporting drainage asset prioritisation and rehabilitation planning; and (iv) a case study deployment in the Yangtze River Delta demonstrating the operational viability and performance improvement of the proposed framework in real-world drainage management contexts, including SCADA-informed decision support.

The remainder of this paper is organised as follows. Section 2 reviews the state of research in data-driven drainage modelling, digital twin frameworks for built assets, and explainable artificial intelligence, identifying the specific knowledge gaps this work addresses. Section 3 presents the proposed framework architecture and its underlying technical design rationale. Section 4 describes the research methodology, including the case study site, data sources, and experimental evaluation protocol. Section 5 discusses the results, their implications, and synthesis with the existing literature. Section 6 concludes with practical recommendations and directions for future research.

2. Background

2.1. Data-Driven Modelling of Urban Drainage Systems

The application of machine learning to urban drainage hydraulics has progressed substantially over the past decade. Early work employed artificial neural networks (ANNs) for rainfall–runoff modelling and flood inundation mapping, establishing that data-driven models could approximate physics-based simulation outputs at a fraction of the computational cost [18]. Recurrent neural network architectures, particularly LSTMs, subsequently demonstrated superior performance in capturing the sequential dependencies inherent in hydrological time series, with demonstrated advantages for multi-step-ahead forecasting of catchment discharge [19,20].

More recent work has explored convolutional architectures for hydraulic forecasting. TCNs, which employ dilated causal convolutions to capture multi-scale temporal patterns, have shown competitive or superior performance compared to LSTM models on several hydrological benchmarking datasets [21,22]. Hybrid TCN-LSTM models that combine the local feature extraction capability of TCNs with the sequential memory of LSTMs have been proposed for traffic flow and energy forecasting tasks, but their systematic application to drainage hydraulics remains largely unexplored [23]. The majority of published drainage forecasting models rely exclusively on meteorological and sensor inputs, without incorporating the static asset characteristics that fundamentally condition hydraulic behaviour [24].

2.2. Digital Twin Frameworks for Built Asset Management

The concept of the digital twin, originating in aerospace manufacturing, has been progressively adapted to built environment applications including buildings, bridges, and urban infrastructure [25,26]. In the drainage context, digital twin implementations have typically combined physically based simulation engines (e.g., SWMM or MIKE FLOOD) with real-time data assimilation pipelines to update model states from sensor observations [27]. This hybrid approach improves accuracy but still suffers from a high computational cost. In real-time drainage applications, the computational burden of the simulation engine can restrict model updating and scenario testing, particularly for full-network models with fine spatial and temporal resolution [10,28,29].

Fully data-driven digital twin components for drainage systems have been proposed as computationally efficient surrogates, particularly for operational flood forecasting applications [18]. However, existing implementations rarely integrate built-asset descriptors from GIS or asset management systems into the predictive model architecture. This omission is significant: pipe condition, connectivity, hydraulic capacity, and catchment morphology are primary determinants of where and when surcharging occurs under any given rainfall forcing. Failing to condition forecasts on these attributes limits the spatial transferability of trained models and reduces their physical interpretability [16,30].

2.3. Explainable AI in Infrastructure Management

The deployment of black-box machine learning models in critical infrastructure decision contexts raises legitimate concerns regarding accountability, regulatory compliance, and operator trust [31,32]. Explainable artificial intelligence (XAI) methods provide post hoc or inherently interpretable tools for attributing model outputs to input features. SHAP, grounded in cooperative game theory, offers theoretically consistent attribution of individual predictions to contributing features and has been applied to flood risk assessment, structural health monitoring, and energy system fault detection [17,33]. However, SHAP-based explainability has not been systematically integrated with multi-source drainage forecasting frameworks to attribute forecast uncertainty to physical asset properties, which would increase operational utility for infrastructure managers [34]. Integrating SHAP into this framework improves the interpretability of model predictions.

2.4. Knowledge Gaps and Research Problem Statement

The review above highlights several interrelated gaps in the current literature. Although data-driven approaches for urban drainage forecasting have progressed substantially, the performance and stability of hybrid convolution–recurrent architectures have not been systematically evaluated for drainage hydraulics. In particular, the incremental predictive value of integrating static built-asset descriptors with dynamic sensor observations has rarely been quantified within a controlled ablation framework. Moreover, existing drainage digital twin implementations typically adopt one of two paradigms: computationally intensive physics-based simulation engines or sensor-driven surrogate models. The former remains constrained by runtime limitations that limit operational deployment, whereas the latter often lack specific conditioning on infrastructure characteristics that govern hydraulic response under rainfall forcing.

While XAI methods have gained traction in environmental modelling, their application in urban drainage has largely focused on catchment-scale interpretation. Feature attribution at the node level, explicitly linking forecast outputs to physical infrastructure attributes, remains underdeveloped. This limits the operational interpretability and decision-support value of data-driven models in safety-critical drainage management. Against this background, the present study develops a data-driven digital twin framework designed to integrate multi-source observations with infrastructure descriptors, generate multi-horizon hydraulic forecasts, and provide physically interpretable attribution of prediction outcomes. The framework is designed to support real-time decision-making in urban drainage systems subject to rainfall-driven surcharge risk. A structured comparison is provided in Supplementary Material S4 [17,18,19,22,24,25,35,36,37,38,39,40,41,42,43,44,45,46,47,48] and Table S1.

The research problem addressed in this paper is, therefore, as follows: how can a data-driven digital twin framework be constructed that (a) uses the complementary information content of dynamic sensor streams and static asset descriptors, (b) generates accurate multi-horizon hydraulic forecasts in near-real-time, and (c) provides interpretable attribution of forecast outputs to physical drivers, sufficient to support proactive, evidence-based operational intervention in urban drainage systems?

The specific objectives are to (i) develop and implement a multi-source hybrid TCN–LSTM architecture for drainage hydraulic prediction; (ii) quantify the incremental predictive contribution of built-asset descriptors relative to sensor-only baselines through controlled ablation experiments; (iii) integrate SHAP-based attribution to identify dominant physical drivers of surcharge risk at the subcatchment and node scales; and (iv) validate the complete framework in a representative urban drainage system in the Yangtze River Delta.

3. Proposed Framework

3.1. Scope and Design Rationale

The framework focuses on real-time prediction of hydraulic state variables in UDS under rainfall conditions, with prediction horizons of 15, 30, and 60 min. The target prediction variables are (1) surcharge depth at designated monitoring nodes and (2) a binary flooding risk indicator (surcharge depth exceeding manhole rim elevation). The system is designed to operate on commodity server hardware with an inference latency below 500 milliseconds per forecast cycle, compatible with standard 5 min SCADA update cycles. Surface inundation mapping is explicitly outside this study’s scope and would require additional topographic modelling.

3.2. Framework Architecture Overview

The framework comprises four integrated modules (Figure 1): (M1) multi-source data ingestion and preprocessing; (M2) feature engineering producing dynamic and static input tensors; (M3) the TCN–LSTM predictive model with multi-horizon output heads; and (M4) SHAP-based explainability and attribution. During inference, modules M1–M4 are executed sequentially within a single forecast cycle triggered by incoming sensor data, forming a consistent data–model–interpretation pipeline that delivers hydraulic predictions together with their SHAP attributions in the same cycle.

3.3. Multi-Source Data Fusion Architecture (M1–M2)

The data ingestion module (M1) interfaces with two primary data streams. The dynamic stream comprises high-frequency time-series data sampled at 5 min intervals: water level and flow rate at monitoring nodes, pump operational status, SCADA control signals, and rainfall depth from a dense rain gauge network supplemented by radar quantitative precipitation estimates. The static stream comprises built-asset descriptors extracted from GIS and as-built databases: pipe diameter, length, material age, full-bore capacity, invert levels, manhole rim elevation, contributing catchment area, impervious surface fraction, terrain slope, road-surface gradient, and a set of network connectivity indicators (upstream pipe count, flow path length to outlet).

The feature engineering layer (M2) constructs the input tensor passed to the predictive model. Dynamic features are organised into a three-dimensional tensor representing nodes, time steps, and feature variables. The tensor has the shape (N, T, D_d), where N is the number of monitoring nodes, T is the lookback window, and D_d is the number of dynamic features per node. In this study, T was fixed at 12 time steps, corresponding to a 60 min lookback window at the 5 min sampling resolution. Static features are concatenated to the final hidden state of the temporal encoder, rather than being repeated along the time axis, to avoid temporal distortion of time-invariant attributes. This concatenation strategy treats static asset descriptors as context vectors that modulate the temporal prediction, consistent with conditional modelling approaches used in analogous infrastructure domains [49].
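
The windowing implied by this description can be sketched as follows. This is a minimal illustration for a single node and a single dynamic feature, not the authors' implementation: the function name `make_windows` and the synthetic `levels` series are assumptions, while the 5 min step, the 12-step lookback, and the 15, 30, and 60 min horizons are taken from the text.

```python
from typing import List, Tuple

STEP_MIN = 5                 # sampling interval stated in the text
LOOKBACK_STEPS = 12          # T = 12 steps, i.e. a 60 min lookback window
HORIZON_STEPS = [h // STEP_MIN for h in (15, 30, 60)]  # [3, 6, 12] steps ahead


def make_windows(series: List[float]) -> List[Tuple[List[float], List[float]]]:
    """Pair each lookback window with its 15/30/60 min ahead targets."""
    samples = []
    max_h = max(HORIZON_STEPS)
    # t is the index of the last observation inside the lookback window.
    for t in range(LOOKBACK_STEPS - 1, len(series) - max_h):
        window = series[t - LOOKBACK_STEPS + 1 : t + 1]
        targets = [series[t + h] for h in HORIZON_STEPS]
        samples.append((window, targets))
    return samples


# Hypothetical water level series (30 observations = 150 min).
levels = [float(i) for i in range(30)]
samples = make_windows(levels)
first_window, first_targets = samples[0]
```

In the full framework each window would carry all N nodes and D_d dynamic features, while the static descriptors are kept out of the window and joined later to the encoder output.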

3.4. TCN-LSTM Predictive Architecture (M3)

The predictive model employs a two-stage temporal architecture. In the first stage, a TCN encoder processes the lookback window tensor to extract multi-scale temporal features. As illustrated in Figure 2c, the TCN uses dilated causal convolutions with dilation factors of 1, 2, 4, and 8, a kernel size of 3, and 64 filters per layer, yielding a receptive field of 31 time steps at the deepest layer. Residual connections are applied across each dilation block to mitigate vanishing gradient issues during training. The TCN output is a feature tensor preserving the temporal dimension, allowing the subsequent LSTM to operate on learned temporal abstractions rather than raw sensor values.
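
The stated receptive field can be checked with a short calculation. The sketch below assumes one causal convolution per dilation level, which is the configuration under which a kernel size of 3 and dilations of 1, 2, 4, and 8 give 31 time steps; implementations that place two convolutions in each residual block would give a larger value. The function names are illustrative, and the all-ones kernel is used only to trace which inputs can influence an output.

```python
from typing import List


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field of stacked causal convolutions, one per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def causal_dilated_conv(x: List[float], kernel_size: int, dilation: int) -> List[float]:
    """All-ones causal convolution: y[t] = sum_j x[t - j*dilation], zero-padded on the left."""
    out = []
    for t in range(len(x)):
        total = 0.0
        for j in range(kernel_size):
            idx = t - j * dilation
            if idx >= 0:  # indices before the start act as zero padding
                total += x[idx]
        out.append(total)
    return out


def empirical_receptive_field(kernel_size: int, dilations: List[int], length: int = 64) -> int:
    """Count how many output steps a single input impulse can reach."""
    signal = [1.0] + [0.0] * (length - 1)  # unit impulse at time 0
    for d in dilations:
        signal = causal_dilated_conv(signal, kernel_size, d)
    return sum(1 for v in signal if v != 0.0)


rf_formula = receptive_field(3, [1, 2, 4, 8])
rf_traced = empirical_receptive_field(3, [1, 2, 4, 8])
```

Both routes give the same count because an impulse that reaches 31 consecutive outputs is equivalent to each output seeing 31 consecutive inputs.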

In the second stage, a two-layer LSTM with 128 hidden units processes the TCN output sequence to model long-range hydraulic dependencies, such as the propagation delay of surcharge waves through the pipe network. The LSTM final hidden state is concatenated with the static asset descriptor vector to form a combined representation of dimension 128 + D_s, where D_s is the number of static features (nominally 12 after principal component analysis preprocessing). Three parallel fully connected output heads produce forecasts for the 15, 30, and 60 min horizons, respectively, enabling the model to capture the progressive uncertainty increase with forecast lead time. Together, these components form an integrated prediction framework (Figure 2), capturing both short-term dynamics and longer-term hydraulic dependencies in UDSs.

Overall, the hybrid TCN–LSTM architecture integrates multi-scale temporal feature extraction with long-range hydraulic dependency modelling within a unified framework. Dilated causal convolutions capture short-term rainfall-driven dynamics, while the LSTM layers encode delayed routing and storage effects across the network. Conditioning the temporal representation on static asset descriptors ensures that forecasts remain infrastructure-aware rather than sensor-driven alone. The multi-horizon design further supports operational early warning at 15–60 min lead times. This architecture therefore balances predictive accuracy, computational efficiency, and physical interpretability, consistent with the resilience-oriented objectives of the proposed digital twin framework.

3.5. SHAP Explainability Module (M4)

The explainability module applies TreeSHAP to a gradient boosting surrogate model to approximate the prediction behaviour of the trained TCN-LSTM model. The surrogate is used only to generate computationally efficient SHAP attributions and is not intended to replace the main prediction model. The surrogate was trained using the TCN-LSTM outputs as target values and evaluated on the validation subset, achieving strong agreement with the original model. SHAP values were then computed for the independent test subset. The resulting attributions were analysed at both the node and catchment scales to identify dominant input features associated with the predicted surcharge risk. Because the stability of SHAP values across different storm events, node groups, and random initialisation seeds was not systematically tested in this study, the attribution results are interpreted as aggregated indicative patterns rather than causal explanations.
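
The attribution principle behind this module can be illustrated with exact Shapley values on a toy function. This is not TreeSHAP and not the authors' surrogate: it is a brute-force enumeration over feature subsets, in which absent features are replaced by a baseline value. The function `toy_surrogate`, its coefficients, and the feature values are hypothetical and stand in for the gradient boosting surrogate only for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(model: Callable[[List[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values by enumerating all feature subsets (small n only)."""
    n = len(x)

    def value(subset: Set[int]) -> float:
        # Features in the subset take their actual value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_surrogate(z: List[float]) -> float:
    """Hypothetical surcharge score: capacity ratio, area (km^2), rainfall (mm)."""
    capacity_ratio, area_km2, rain_mm = z
    return 0.5 * capacity_ratio + 0.1 * area_km2 + 0.02 * rain_mm * capacity_ratio


x = [0.9, 2.0, 20.0]          # hypothetical node under storm conditions
baseline = [0.5, 1.0, 0.0]    # hypothetical reference state
phi = shapley_values(toy_surrogate, x, baseline)
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that makes SHAP values comparable across features. The enumeration cost grows exponentially with the number of features, which is why a tree-based surrogate with TreeSHAP is attractive in practice.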

3.6. Research Hypotheses

The framework design is guided by three testable hypotheses.

H1. Multi-source fusion of dynamic sensor streams with static asset descriptors will improve forecast accuracy relative to sensor-only models, as measured by root-mean-square error (RMSE) of surcharge depth prediction.

H2. The hybrid TCN-LSTM architecture will outperform single-component LSTM-only and TCN-only baselines by leveraging complementary temporal feature extraction capabilities.

H3. SHAP attributions will consistently identify a small subset of physical asset descriptors (specifically pipe hydraulic capacity ratio and contributing catchment area) as dominant drivers of surcharge risk, consistent with established hydraulic principles.

4. Research Methodology

4.1. Assumptions and Scope Constraints

The following modelling assumptions are adopted. Sensor data quality is assumed sufficient following automated outlier detection and gap-filling; no explicit provision is made for catastrophic sensor failure affecting more than 20% of active monitoring points simultaneously. The static asset descriptor database is assumed to reflect the as-built state of the network; ongoing asset deterioration or rehabilitation during the study period is not modelled. The forecast framework is scoped to pipe network hydraulics in a separate sewer system, where localised misconnections may occur, and does not model surface flow pathways or groundwater interactions explicitly. These effects are implicitly captured through the data-driven modelling approach. These assumptions are considered reasonable for the operational timeframes addressed (15–60 min).

4.2. Study Area

The case study is located in an urban district of the Yangtze River Delta, Anhui Province, China, comprising a drainage catchment area of approximately 48 km² served by a predominantly separate sewer network. The area is characterised by high impervious surface fractions (mean 0.72), flat terrain (average slope 0.3%), and frequent short-duration convective rainfall events during the May–September monsoon period. The network comprises approximately 2400 pipe segments, 1850 manholes, and 8 pump stations, with pipe diameters ranging from 300 mm to 1000 mm. The key characteristics of the case study drainage system are summarised in Table 1.

4.3. Data Sources

Sensor data were provided by the district urban management bureau and cover a 12-month observation period (November 2021 to October 2022). The monitoring network includes 86 water level sensors, 34 flow meters, and 68 rain gauges distributed across the catchment at an average density of one gauge per 0.7 km². Data are transmitted via 4G telemetry at 5 min intervals to a centralised SCADA platform. Pump operational logs and SCADA control signals are available for all 8 pump stations at the same temporal resolution. The complete dataset comprises approximately 9.1 million individual sensor records prior to quality control processing.

Asset descriptor data were extracted from the district geographic information system and as-built record drawings, supplemented by field survey records from the most recent CCTV inspection programme (completed 2022). Catchment morphological attributes (contributing area, imperviousness, slope) were derived from a 1-m-resolution LiDAR digital terrain model acquired in 2021. Network connectivity indicators were computed programmatically from the pipe network topology using graph-theoretic algorithms. All asset descriptors were validated against hydraulic model parameters used in the district’s existing hydrodynamic model developed in Autodesk InfoWorks ICM 2024.

4.4. Data Preprocessing and Quality Control

Sensor data preprocessing followed a four-stage protocol. Stage 1 applied range-based outlier flagging using physically plausible bounds derived from the hydraulic model (e.g., water level constrained between invert and rim elevation plus 0.5 m allowance for pressurised conditions). Stage 2 identified and imputed short gaps of up to three consecutive missing observations, equivalent to 15 min at the 5 min sampling resolution, using linear interpolation. This threshold was selected because short telemetry interruptions of this duration are common in SCADA records and are unlikely to remove a complete rainfall–response cycle in the study network. Longer gaps were not interpolated because they may obscure rapid wet-weather hydraulic changes and introduce artificial smoothing during surcharge events. These longer gaps were therefore excluded from model training and evaluation. Stage 3 applied cross-sensor consistency checking for level and flow measurements at the same location. Stage 4 aligned all time series to a common 5 min UTC timestamp grid. After quality control, 94.7% of sensor-days were retained for analysis (Supplementary Material S1) [24]. The overall framework architecture and data processing procedures are presented in Figure 1 and Figure 2.
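
Stages 1 and 2 of this protocol can be sketched as below. The sketch is illustrative rather than the authors' code: the function names and sample values are assumptions, missing observations are represented by `None`, and flagged outliers are treated as missing so that they pass through the same gap rule, which the text does not state explicitly. The 0.5 m allowance and the three-observation limit are taken from the text.

```python
from typing import List, Optional

MAX_GAP = 3  # up to three consecutive missing 5 min observations (15 min)


def flag_out_of_range(values: List[Optional[float]], invert: float, rim: float,
                      allowance: float = 0.5) -> List[Optional[float]]:
    """Stage 1: replace physically implausible water levels with None."""
    upper = rim + allowance
    return [v if v is not None and invert <= v <= upper else None for v in values]


def fill_short_gaps(values: List[Optional[float]],
                    max_gap: int = MAX_GAP) -> List[Optional[float]]:
    """Stage 2: linearly interpolate interior gaps of at most max_gap points."""
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # first valid index after the gap, or n if the gap runs to the end
        gap = end - start
        if start > 0 and end < n and gap <= max_gap:
            left, right = out[start - 1], out[end]
            for k in range(gap):
                out[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return out


# Hypothetical level record (m above invert) with an outlier and two gaps.
raw = [1.0, None, None, 1.6, 9.9, 2.0, None, None, None, None, 3.0]
flagged = flag_out_of_range(raw, invert=0.0, rim=3.0)
filled = fill_short_gaps(flagged)
```

The four-point gap remains missing after this step, so the corresponding windows would be dropped from training and evaluation as described above.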

4.5. Experimental Design and Model Configurations

The dataset comprises a 12-month observation period (November 2021 to October 2022) and was partitioned into training, validation, and test subsets using an 8:1:1 ratio in chronological order. The first 80% of the time series was used for model training, the subsequent 10% for hyperparameter validation, and the final 10% for test evaluation. This temporal partitioning preserves the temporal structure of monitored hydraulic responses and avoids information leakage. Although the network is predominantly separate, potential rainfall-derived infiltration and inflow (RDII), local misconnections, and cross-connections may still affect wet-weather system behaviour. The test subset includes rainfall events with peak 1 h intensity exceeding 30 mm/h, representing conditions of greatest operational relevance.
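
A chronological 8:1:1 partition of this kind can be sketched in a few lines. The function name and the integer placeholder records are assumptions; the key point is that the records are sliced in time order and never shuffled.

```python
from typing import List, Sequence, Tuple


def chronological_split(records: Sequence, ratios: Tuple[int, int, int] = (8, 1, 1)
                        ) -> Tuple[List, List, List]:
    """Split time-ordered records into train/validation/test without shuffling."""
    total = sum(ratios)
    n = len(records)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return (list(records[:train_end]),
            list(records[train_end:val_end]),
            list(records[val_end:]))


# Hypothetical time-ordered record indices.
records = list(range(100))
train, val, test = chronological_split(records)
```

Because every validation record is later than every training record, and every test record is later than both, no future observation can leak into model fitting through the split itself.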

All models were trained using the Adam optimiser with an initial learning rate of 1 × 10⁻³ and a batch size of 64. Hyperparameters were selected based on validation-set performance rather than test-set results. The tested ranges included learning rates of 1 × 10⁻⁴ to 1 × 10⁻³, batch sizes of 32 and 64, LSTM hidden units of 64 and 128, TCN kernel sizes of 2 and 3, and dropout rates of 0.1 to 0.3. The final configuration was selected as the model with the lowest validation MAE while maintaining stable validation loss. Early stopping was applied with a patience of 15 epochs to reduce overfitting. Model performance was then reported only on the independent chronological test subset.
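
The early stopping rule can be sketched as follows, assuming that the monitored quantity is the validation loss and that training stops once 15 consecutive epochs pass without a new best value. The function name and the loss sequence are hypothetical; the actual training loop is not described in the text.

```python
from typing import List, Tuple


def early_stopping_epoch(val_losses: List[float], patience: int = 15) -> Tuple[int, int]:
    """Return (stop_epoch, best_epoch), both zero-based."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # 'patience' epochs have passed since the last improvement.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement for three epochs, then a plateau.
losses = [1.0, 0.8, 0.6] + [0.7] * 20
stop_epoch, best_epoch = early_stopping_epoch(losses)
```

In practice the model weights from `best_epoch` would be restored before evaluation on the test subset.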

The models were implemented in Python 3.10 using PyTorch 2.0.1 for deep learning model construction and training, scikit-learn 1.3.0 for baseline models and preprocessing, pandas 2.0.3 and NumPy 1.24.4 for data handling, and the SHAP 0.42.1 Python package for post hoc feature attribution. All input variables were normalised using parameters estimated from the training set and then applied unchanged to the validation and test sets. The same chronological train–validation–test split was used for all model configurations to ensure a consistent comparison. The six model configurations used for the ablation study are summarised in Table 2. Training was repeated using fixed random seeds to improve reproducibility, and the reported results correspond to the final selected configuration evaluated on the independent test subset.
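
The normalisation step can be illustrated with a standard-library sketch. The text does not state which scaling was used, so z-score standardisation is an assumption here, as are the function names and sample values. What the sketch is meant to show is that the parameters are fitted on the training subset only and then reused unchanged.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def fit_scaler(train_values: List[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation from the training subset only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        sigma = 1.0  # guard against constant features
    return mu, sigma


def apply_scaler(values: List[float], params: Tuple[float, float]) -> List[float]:
    """Apply previously fitted parameters without re-estimating them."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


# Hypothetical water levels (m) for one sensor.
train_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
test_levels = [6.0]

params = fit_scaler(train_levels)
scaled_train = apply_scaler(train_levels, params)
scaled_test = apply_scaler(test_levels, params)
```

A test value outside the training range simply maps outside the scaled training range, which is the intended behaviour when storm peaks in the test period exceed anything seen during training.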

4.6. SHAP Explainability Setup

SHAP values were computed for the full framework (C6) on all test events, using the gradient boosting surrogate described in Section 3.5. The surrogate was retrained on C6 predictions for the training period prior to each test evaluation, ensuring temporal consistency. Feature importance rankings were computed as the mean absolute SHAP value across all test predictions for each input feature, aggregated separately for dynamic sensor features and static asset features to enable direct comparison of their relative attribution magnitudes.
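
The ranking procedure can be sketched as follows. The feature names, group assignment, and SHAP values are hypothetical; only the aggregation rule, the mean absolute SHAP value per feature ranked separately within the dynamic and static groups, is taken from the text.

```python
from typing import Dict, List


def mean_abs_shap(shap_rows: List[List[float]], feature_names: List[str]) -> Dict[str, float]:
    """Mean absolute SHAP value per feature across all predictions."""
    n = len(shap_rows)
    return {name: sum(abs(row[j]) for row in shap_rows) / n
            for j, name in enumerate(feature_names)}


def rank_by_group(importance: Dict[str, float],
                  groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Rank features by importance separately within each feature group."""
    return {group: sorted(names, key=lambda f: importance[f], reverse=True)
            for group, names in groups.items()}


# Hypothetical feature names and per-prediction SHAP values.
features = ["rain_5min", "level_lag1", "capacity_ratio", "upstream_area"]
shap_rows = [
    [0.10, -0.30, 0.20, 0.05],
    [-0.20, 0.10, -0.40, 0.15],
]
groups = {
    "dynamic": ["rain_5min", "level_lag1"],
    "static": ["capacity_ratio", "upstream_area"],
}

importance = mean_abs_shap(shap_rows, features)
ranked = rank_by_group(importance, groups)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions is still ranked as important.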

5. Results and Discussion

5.1. Forecast Performance: Architecture Effect (H2)

Table 3 summarises the predictive performance of conventional machine learning (ML) baselines and deep learning (DL) architectures evaluated on the test dataset. Water level forecasting was selected as the primary evaluation target due to its strong hydraulic interpretability and lower measurement noise relative to flow observations. Among conventional ML models, the Decision Tree achieved the lowest MSE (80.03), outperforming Linear Regression (92.43) and Random Forest (88.08). However, all ML baselines showed limited generalisation under transient wet-weather fluctuations, as reflected by the relatively higher RMSE and lower R² values.

The DL architectures demonstrated consistently improved performance. The LSTM model achieved an RMSE of 0.081 m and R² of 0.8275, indicating improved representation of temporal dependencies compared to ML baselines. The TCN model further improved performance, reducing the RMSE to 0.067 m and increasing R² to 0.9143, highlighting the advantage of multi-scale temporal feature extraction through dilated convolutions. The proposed hybrid TCN-LSTM architecture achieved the best overall performance, with an RMSE of 0.061 m, MAE of 0.044 m, MSE of 37.21, and R² of 0.9318. The hybrid TCN–LSTM model reduced the RMSE from 0.081 m for the LSTM baseline to 0.061 m, corresponding to a 24.7% reduction. The performance gain over the standalone TCN model was more moderate but consistent, particularly during peak and recession periods, suggesting enhanced robustness in capturing both rapid fluctuations and longer-term hydraulic dynamics.
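
For reference, the metrics used in this comparison and the reported relative reduction can be reproduced with the sketch below. The function names and the three-point toy series are assumptions; the RMSE values of 0.081 m, 0.067 m, and 0.061 m are those reported above.

```python
from math import sqrt
from typing import List


def rmse(obs: List[float], pred: List[float]) -> float:
    """Root-mean-square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs: List[float], pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs: List[float], pred: List[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Reported RMSE values (m): LSTM 0.081, TCN 0.067, hybrid TCN-LSTM 0.061.
vs_lstm = percent_reduction(0.081, 0.061)
vs_tcn = percent_reduction(0.067, 0.061)

# Hypothetical toy series to exercise the metric functions.
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 4.0]
```

Applied to the reported values, the same formula gives a reduction of roughly 9% relative to the standalone TCN, which matches the description of that gain as more moderate.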

Figure 3 presents a representative segment of the test results at 5 min resolution for the LSTM, TCN, and hybrid TCN-LSTM architectures. Clear differences are observed during rainfall-driven peak events and subsequent recession periods. The LSTM model captures the overall trend in water level variation, but it shows a clear phase lag and reduced peak magnitude during rapid surcharge transitions. In comparison, the TCN model responds more quickly to short-term fluctuations, suggesting that dilated convolution more effectively captures multi-scale temporal features. The proposed hybrid TCN-LSTM architecture further enhances prediction stability during peak water level events while maintaining smooth recession tracking, indicating complementary strengths of convolutional feature extraction and recurrent memory mechanisms.

Performance differences were most evident during rapid water-level changes and delayed recession. Although Table 3 reports the overall test-set performance rather than horizon-specific metrics, Figure 3 shows that the hybrid TCN–LSTM model produced more stable predictions during peak and post-peak periods. These findings support H2 and indicate improved representation of multi-scale temporal dynamics in urban water level prediction.

5.2. Forecast Performance: Asset Feature Fusion Effect (H1)

The inclusion of built-asset descriptors produced systematic performance improvements across all forecast horizons. For the 15 min horizon, the RMSE decreased from 0.063 m (sensor-only TCN-LSTM, C3) to 0.052 m (asset-informed C6), a reduction of approximately 18%. At the 60 min horizon, the reduction increased to 22%. These improvements are interpreted as enhanced predictive robustness, particularly under near-capacity conditions where measurement uncertainty may increase due to hydraulic sensitivity and potential sensor noise (Supplementary Material S2).

The improvement was spatially heterogeneous. Nodes with a pipe full-bore capacity ratio exceeding 0.85 exhibited the largest performance gains, with local RMSE decreases reaching 26%. These nodes are typically associated with near-capacity operation and an increased likelihood of manhole surcharge, where small variations in hydraulic state can lead to rapid changes in water level. In such conditions, asset descriptors (e.g., capacity ratio and upstream contributing area) provide additional structural context that helps stabilise predictions and constrain variability in model outputs. In contrast, low-stress nodes showed only marginal gains, indicating that asset descriptors are most beneficial where system behaviour is most sensitive to structural constraints.
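The node-level comparison can be sketched as a simple stratification of prediction errors by capacity ratio. The 0.85 threshold is taken from the text. The record layout, group labels, and error values below are hypothetical and serve only to show the computation.

```python
import math

CAPACITY_THRESHOLD = 0.85  # near-capacity cut-off quoted in the text

def rmse_from_errors(errors):
    """RMSE computed directly from a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def stratified_rmse(records, threshold=CAPACITY_THRESHOLD):
    """Split errors into near-capacity and low-stress groups and score each."""
    groups = {"near_capacity": [], "low_stress": []}
    for capacity_ratio, error in records:
        key = "near_capacity" if capacity_ratio > threshold else "low_stress"
        groups[key].append(error)
    return {key: rmse_from_errors(errs) for key, errs in groups.items() if errs}

# Hypothetical (capacity_ratio, prediction_error_m) pairs; not study data.
records = [(0.92, 0.06), (0.88, -0.08), (0.40, 0.03), (0.55, -0.04)]
result = stratified_rmse(records)
print(result)
```

Running the same stratification for the sensor-only and asset-informed configurations would give the per-group RMSE pairs from which local reductions, such as the 26% reported for near-capacity nodes, can be derived.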

Figure 3. Comparison of measured and predicted water levels using LSTM, TCN, and TCN–LSTM architectures over a representative three-day test period. The y-axis represents water level in millimetres, and the x-axis represents the time-step index, with each step corresponding to 5 min. Vertical dashed lines indicate daily boundaries at 288-step intervals.

5.3. SHAP Attribution Analysis (H3)

The aggregated SHAP results identified three static asset features as major contributors to surcharge prediction, namely, pipe full-bore capacity ratio, upstream contributing catchment area, and impervious surface fraction (Figure 4). These patterns are physically consistent with drainage system behaviour. Pipes operating close to full-bore capacity have limited hydraulic buffer, while larger impervious contributing areas can generate faster and higher peak inflows during intense rainfall. These findings support the physical plausibility of the model interpretation, although further stability testing across additional storms, node groups, and random seeds is required.

Among the dynamic features, antecedent water level 30 min before event onset and peak 15 min rainfall intensity were the two highest-ranked inputs at the 15 min horizon (Figure 4). At the 60 min horizon, the attribution of rainfall intensity decreased, while the attribution of network connectivity indicators, such as upstream pipe segment count, increased. This shift suggests that short-horizon predictions are mainly influenced by local wet-weather conditions, whereas longer-horizon predictions are more affected by routing and redistribution through the drainage network. These horizon-dependent patterns indicate that additional upstream flow metering could help reduce uncertainty in 60 min predictions.

Overall, the SHAP results provide a physically reasonable interpretation of the prediction outputs. Static asset features explain where the surcharge risk is structurally higher, while dynamic rainfall and water-level features explain short-term variations during storm events. The results should therefore be interpreted as aggregated attribution patterns within the present case study, rather than as causal explanations or universally stable feature rankings.

5.4. Computational Performance

The mean inference time across the 86-node monitoring network was 210 ms per forecast cycle on a standard server (Intel i7 CPU, 32 GB RAM). The latency was measured over 100 repeated runs and is reported as the average runtime, including data preprocessing and model inference, with low variability across runs. The reported inference time is compatible with the 5 min SCADA update cycle and satisfies real-time operational requirements (<500 ms). This demonstrates a substantial computational improvement compared to conventional physics-based simulation approaches, enabling near-real-time prediction and rapid evaluation of multiple rainfall scenarios on standard hardware. Inference time is expected to scale approximately linearly with network size under similar settings.
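A latency measurement of this kind can be sketched as follows. The `dummy_forecast_cycle` function is a hypothetical stand-in for preprocessing plus model inference; the 100-run protocol and the 500 ms budget are taken from the text.

```python
import statistics
import time

LATENCY_BUDGET_MS = 500.0  # operational requirement quoted in the text

def benchmark(forecast_cycle, n_runs=100):
    """Time repeated forecast cycles; return mean and std in milliseconds."""
    durations_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        forecast_cycle()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(durations_ms), statistics.stdev(durations_ms)

def dummy_forecast_cycle():
    # Hypothetical placeholder for data preprocessing and model inference.
    return sum(i * i for i in range(10_000))

mean_ms, std_ms = benchmark(dummy_forecast_cycle)
within_budget = mean_ms < LATENCY_BUDGET_MS
print(f"mean={mean_ms:.2f} ms, std={std_ms:.2f} ms, within budget: {within_budget}")
```

Reporting the standard deviation alongside the mean makes the "low variability across runs" statement checkable, and `time.perf_counter` is used because it is a monotonic, high-resolution clock suited to short intervals.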

5.5. Synthesis with Prior Literature

The 18–22% RMSE reduction from asset feature fusion corroborates the argument advanced in prior work [19,24] that static infrastructure attributes carry independent predictive information not recoverable from sensor observations alone. The TCN–LSTM architecture advantage aligns with findings in traffic and energy forecasting [18], while the drainage-specific validation reported here addresses a gap highlighted in systematic reviews of deep learning for hydrological forecasting [15]. The integrated SHAP attribution methodology extends existing XAI applications in hydraulic engineering by delivering node-level, multi-horizon attribution that is directly actionable for asset managers at a level of spatial granularity not previously reported in the drainage XAI literature [27,28].

5.6. Limitations and Future Research Directions

Although the proposed explainable DT demonstrates clear performance gains and operational interpretability within this case study, several limitations should be recognised when interpreting the results and considering broader deployment. In particular, under near-capacity conditions, increased hydraulic sensitivity may introduce higher measurement uncertainty. Performance improvements should therefore be interpreted as enhanced predictive robustness rather than solely reduced error. A primary constraint concerns generalisability. The framework is calibrated and validated on a single urban drainage network characterised by high imperviousness, relatively flat terrain, and dense monitoring infrastructure. Under such conditions, the model can effectively learn rainfall–runoff response patterns and routing dynamics. However, drainage systems in steeper catchments, tidal environments, or systems with different structural conditions may show different hydraulic behaviours. Because the TCN-LSTM architecture implicitly encodes system-specific temporal memory and connectivity effects, direct transfer to morphologically distinct networks may not yield comparable accuracy. In this sense, the current study demonstrates feasibility rather than universal scalability.

In addition, static built-asset descriptors are treated as time-invariant contextual variables. In practice, sediment deposition, pipe roughness evolution, structural deterioration, and rehabilitation interventions gradually alter hydraulic capacity. The present framework assumes that GIS and inspection-derived attributes remain representative over the modelling period. Although sensor data undergo systematic quality control, persistent measurement bias and correlated telemetry failures are not explicitly modelled. These assumptions are reasonable for short-term operational prediction but constrain longer-term asset degradation analysis.

The validation of explainability results also remains limited. The SHAP analysis was based on a gradient boosting surrogate trained to approximate the TCN-LSTM predictions, and the stability of SHAP attributions across storm types, node groups, and random seeds was not systematically tested. Therefore, the SHAP results should be interpreted as aggregated patterns within this case study, rather than as causal explanations or universally stable feature rankings.
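One way such a stability test could be carried out is to compare feature rankings obtained from surrogates trained with different random seeds using the Spearman rank correlation. The sketch below is an assumption about how this might be done, not a procedure used in the study. The importance vectors are hypothetical and ties are not handled.

```python
def ranks(values):
    """Rank positions (1 = largest); ties are not handled in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman(a, b):
    """Spearman rank correlation for two importance vectors without ties."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical mean |SHAP| vectors from two surrogate seeds (same feature order).
seed_a = [0.040, 0.030, 0.020, 0.010]
seed_b = [0.038, 0.019, 0.027, 0.012]
rho = spearman(seed_a, seed_b)
print(round(rho, 3))  # 0.8
```

A correlation close to 1 across seeds, storm types, or node groups would indicate that the ranking is stable; markedly lower values would indicate that individual rankings should be read with caution.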

Future research should therefore focus on improving transferability and uncertainty representation. Transfer learning approaches could be explored to adapt a trained model to new drainage systems using limited local data. Asset-related variables, such as hydraulic capacity ratio and connectivity metrics, may support cross-site generalisation. At the same time, extending the framework to probabilistic forecasting would provide prediction intervals rather than single point estimates, allowing risk-based warning thresholds. Further integration with pump control or storage optimisation algorithms would also enhance practical value. Linking short-term forecasts with operational decision modules could support a shift from prediction toward proactive flood risk reduction, supporting more resilient urban drainage management under appropriate system conditions.
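As an illustration of what a probabilistic extension might involve, the sketch below shows the quantile (pinball) loss commonly used to train quantile forecasts, and an empirical coverage check for a prediction interval. This is one possible approach and is not part of the present framework. All values are hypothetical.

```python
def pinball_loss(observed, predicted_quantile, tau):
    """Quantile (pinball) loss for one observation at quantile level tau."""
    diff = observed - predicted_quantile
    return max(tau * diff, (tau - 1.0) * diff)

def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside the predicted interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)

# Hypothetical water levels (m) with a nominal 10-90% prediction interval.
observed = [0.50, 0.80, 1.20, 0.90]
lower = [0.40, 0.70, 0.90, 0.95]
upper = [0.60, 0.95, 1.10, 1.05]

coverage = interval_coverage(observed, lower, upper)
print(coverage)  # 0.5: two of the four observations lie inside the interval
```

At a high quantile level such as 0.9, the pinball loss penalises under-prediction more heavily than over-prediction. That asymmetry is what would allow an upper quantile to act as a conservative, risk-based warning threshold.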

6. Conclusions

This paper has presented an explainable DT framework for near-real-time prediction of urban drainage hydraulic responses to extreme rainfall, validated through a case study in the Yangtze River Delta. The proposed framework demonstrates improved predictive accuracy and interpretability, offering practical value for real-time drainage management and infrastructure planning. Rather than relying on sensor observations alone, the framework integrates dynamic monitoring and SCADA data with static built-asset descriptors, allowing predictions to reflect both short-term hydraulic variation and infrastructure conditions. The hybrid TCN–LSTM architecture captured rainfall-driven water-level changes and delayed hydraulic responses, while the SHAP module provided interpretable attribution of prediction outputs to key physical and operational factors. These findings indicate that explainable prediction models can support not only real-time warning but also inspection planning and rehabilitation prioritisation.

For drainage authorities, the framework can support earlier recognition of surcharge-prone locations within standard SCADA update cycles. The attribution results help link predicted surcharge risk to pipe capacity, upstream contributing area, imperviousness, and recent rainfall conditions. This provides a more transparent basis for operational response and asset planning than prediction accuracy alone.

At the societal level, the framework has practical relevance for cities exposed to short-duration extreme rainfall and increasing drainage pressure. In dense and rapidly urbanising regions such as the Yangtze River Delta, improved warning and response capacity can help reduce disruption to transport, property, and public services. The framework demonstrates the feasibility of explainable drainage prediction in the present case study, but its transferability to other drainage systems requires recalibration and independent validation using local sensor, GIS, and as-built data.

Future research should extend the framework by linking pipe network prediction with surface inundation modelling, testing transfer learning across drainage systems, developing uncertainty-aware interpretation methods, and integrating prediction outputs with real-time pump control or storage optimisation. These developments would further support the use of explainable digital twins as practical tools for resilient urban drainage management.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/buildings16101856/s1.

Author Contributions

Conceptualization, L.Z. and Y.W.; methodology, Y.W. and X.W.; software, Y.W. and L.K.; validation, X.W. and A.T.K.C.; formal analysis, Y.W.; data curation, L.K.; writing—original draft preparation, Y.W.; writing—review and editing, L.Z. and A.T.K.C.; supervision, L.Z.; project administration, L.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key Research and Development Program of Zhejiang Province, China (Grant No. 2023C03151), and the National Natural Science Foundation of China (Grant No. 52370050).

Data Availability Statement

Sensor and asset data are subject to confidentiality agreements with the district urban management bureau and are not publicly available. Processed model outputs and evaluation metrics are available from the corresponding author on reasonable request.

Acknowledgments

The authors gratefully acknowledge the Innovation Centre of Yangtze River Delta, Zhejiang University, for providing system data and institutional support. We also thank the Huadong Engineering Corporation for supplying engineering data and practical insights that supported model validation and system interpretation. The authors further acknowledge the Department of Engineering, University of Cambridge, for technical collaboration and academic support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Yuan, D.; Wang, H.; Wang, C.; Yan, C.; Xu, L.; Zhang, C.; Wang, J.; Kou, Y. Characteristics of urban flood resilience evolution and analysis of influencing factors: A case study of Yingtan city, China. Water 2024, 16, 834.
Xu, H.; Wang, Y.; Fu, X.; Wang, D.; Luan, Q. Urban flood modeling and risk assessment with limited observation data: The Beijing future science city of China. Int. J. Environ. Res. Public Health 2023, 20, 4640.
Piadeh, F.; Behzadian, K.; Alani, A.M. A critical review of real-time modelling of flood forecasting in urban drainage systems. J. Hydrol. 2022, 607, 127476.
Mahmoud, A.; Mohammed, A. Leveraging hybrid deep learning models for enhanced multivariate time series forecasting. Neural Process. Lett. 2024, 56, 223.
Bam, P.G.; Rezaei, N.; Roubanis, A.; Austin, D.; Austin, E.; Tarroja, B.; Takacs, I.; Villez, K.; Rosso, D. Digital Twin Applications in the Water Sector: A Review. Water 2025, 17, 2957.
Xu, Q.; Shi, Y.; Bamber, J.; Tuo, Y.; Ludwig, R.; Zhu, X.X. Physics-aware machine learning revolutionizes scientific paradigm for machine learning and process-based hydrology. arXiv 2023, arXiv:2310.05227.
Li, M.; Sun, H.; Huang, Y.; Chen, H. Shapley value: From cooperative game to explainable artificial intelligence. Auton. Intell. Syst. 2024, 4, 2.
Efremov, C.; Le, T.T.; Paramasivam, P.; Rudzki, K.; Osman, S.M.; Chau, T.H. Improving syngas yield and quality from biomass/coal co-gasification using cooperative game theory and local interpretable model-agnostic explanations. Int. J. Hydrogen Energy 2024, 96, 892–907.
Mousavi, Y.; Gharineiat, Z.; Karimi, A.A.; McDougall, K.; Rossi, A.; Gonizzi Barsanti, S. Digital twin technology in built environment: A review of applications, capabilities and challenges. Smart Cities 2024, 7, 2594–2615.
Oh, J.; Bartos, M. Model predictive control of stormwater basins coupled with real-time data assimilation enhances flood and pollution control under uncertainty. Water Res. 2023, 235, 119825.
Kim, M.-G.; Bartos, M. A digital twin model for contaminant fate and transport in urban and natural drainage networks with online state estimation. Environ. Model. Softw. 2024, 171, 105868.
Bartos, M.; Kerkez, B. Pipedream: An interactive digital twin model for natural and urban drainage systems. Environ. Model. Softw. 2021, 144, 105120.
Lumley, D.; Jursic Wanninger, D.; Magnusson, Å.; I’Ons, D.; Gustafsson, L.-G. Implementing a digital twin for optimized real-time control of Gothenburg’s regional sewage system. Water Pract. Technol. 2024, 19, 657–670.
Mosavi, A.; Ozturk, P.; Chau, K.-W. Flood prediction using machine learning models: Literature review. Water 2018, 10, 1536.
Lumley, D.J.; Polesel, F.; Refstrup Sørensen, H.; Gustafsson, L.-G. Connecting digital twins to control collections systems and water resource recovery facilities: From siloed to integrated urban (waste) water management. Water Pract. Technol. 2024, 19, 2267–2278.
Park, S.; Kim, J.; Kim, Y.; Kang, J. Participatory framework for urban pluvial flood modeling in the digital twin era. Sustain. Cities Soc. 2024, 108, 105496.
Aydin, H.E.; Iban, M.C. Predicting and analyzing flood susceptibility using boosting-based ensemble machine learning algorithms with SHapley Additive exPlanations. Nat. Hazards 2023, 116, 2957–2991.
Aderyani, F.R.; Mousavi, S.J. Machine learning-based rainfall forecasting in real-time optimal operation of urban drainage systems. J. Hydrol. 2024, 645, 132118.
Kratzert, F.; Klotz, D.; Brenner, C.; Schulz, K.; Herrnegger, M. Rainfall–runoff modelling using long short-term memory (LSTM) networks. Hydrol. Earth Syst. Sci. 2018, 22, 6005–6022.
Kim, D.; Lee, J.; Kim, J.; Lee, M.; Wang, W.; Kim, H.S. Comparative analysis of long short-term memory and storage function model for flood water level forecasting of Bokha stream in NamHan River, Korea. J. Hydrol. 2022, 606, 127415.
Kao, I.-F.; Zhou, Y.; Chang, L.-C.; Chang, F.-J. Exploring a Long Short-Term Memory based Encoder-Decoder framework for multi-step-ahead flood forecasting. J. Hydrol. 2020, 583, 124631.
Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
Liu, Y.; Gong, C.; Yang, L.; Chen, Y. DSTP-RNN: A dual-stage two-phase attention-based recurrent neural network for long-term and multivariate time series prediction. Expert Syst. Appl. 2020, 143, 113082.
Palmitessa, R.; Mikkelsen, P.S.; Borup, M.; Law, A.W. Soft sensing of water depth in combined sewers using LSTM neural networks with missing observations. J. Hydro-Environ. Res. 2021, 38, 106–116.
Lu, Q.; Xie, X.; Heaton, J.; Parlikad, A.K.; Schooling, J. From BIM towards digital twin: Strategy and future development for smart asset management. In Proceedings of the International Workshop on Service Orientation in Holonic and Multi-Agent Manufacturing, Valencia, Spain, 3–4 October 2019; pp. 392–404.
Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems: New Findings and Approaches; Springer: Berlin/Heidelberg, Germany, 2016; pp. 85–113.
Garrido-Baserba, M.; Corominas, L.; Cortés, U.; Rosso, D.; Poch, M. The fourth-revolution in the water sector encounters the digital revolution. Environ. Sci. Technol. 2020, 54, 4698–4705.
Willard, J.; Jia, X.; Xu, S.; Steinbach, M.; Kumar, V. Integrating physics-based modeling with machine learning: A survey. arXiv 2020, arXiv:2003.04919.
Seyyedi, A.; Bohlouli, M.; Oskoee, S.N. Machine learning and physics: A survey of integrated models. ACM Comput. Surv. 2023, 56, 1–33.
Orozco López, E.; Kaplan, D.; Linhoss, A. Interpretable transformer neural network prediction of diverse environmental time series using weather forecasts. Water Resour. Res. 2024, 60, e2023WR036337.
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
Adadi, A.; Berrada, M. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems; NeurIPS: San Diego, CA, USA, 2017; Volume 30.
Reddy, C.N. Explainable artificial intelligence (xai) for climate hazard assessment: Enhancing predictive accuracy and transparency in drought, flood, and landslide modeling. Int. J. Sci. Technol. 2025, 16, 1–16.
Cho, M.; Kim, C.; Jung, K.; Jung, H. Water level prediction model applying a long short-term memory (LSTM)–gated recurrent unit (GRU) method for flood prediction. Water 2022, 14, 2221.
Zhang, J.; Zhu, Y.; Zhang, X.; Ye, M.; Yang, J. Developing a Long Short-Term Memory (LSTM)-based model for predicting water table depth in agricultural areas. J. Hydrol. 2018, 561, 918–929.
Ozdemir, S.; Yildirim, S.O. Prediction of water level in lakes by RNN-based deep learning algorithms to preserve sustainability in changing climate and relationship to microcystin. Sustainability 2023, 15, 16008.
Li, H.; Zhang, L.; Zhang, Y.; Yao, Y.; Wang, R.; Dai, Y. Water-level prediction analysis for the Three Gorges Reservoir area based on a hybrid model of LSTM and its variants. Water 2024, 16, 1227.
Zhang, D.; Martinez, N.; Lindholm, G.; Ratnaweera, H. Manage sewer in-line storage control using hydraulic model and recurrent neural network. Water Resour. Manag. 2018, 32, 2079–2098.
Haurum, J.B.; Bahnsen, C.H.; Pedersen, M.; Moeslund, T.B. Water level estimation in sewer pipes using deep convolutional neural networks. Water 2020, 12, 3412.
Yan, J.; Mu, L.; Wang, L.; Ranjan, R.; Zomaya, A.Y. Temporal convolutional networks for the advance prediction of ENSO. Sci. Rep. 2020, 10, 8055.
Zuo, H.; Gou, X.; Wang, X.; Zhang, M. A combined model for water quality prediction based on VMD-TCN-ARIMA optimized by WSWOA. Water 2023, 15, 4227.
Liu, Y.; Wang, Y.; Liu, X.; Wang, X.; Ren, Z.; Wu, S. Research on runoff prediction based on Time2Vec-TCN-Transformer driven by multi-source data. Electronics 2024, 13, 2681.
Kim, D. Temporal Convolutional Networks-Architectures and Applications: Investigating temporal convolutional networks (TCNs) and their applications in modeling sequential data with long-range dependencies. J. Artif. Intell. Res. Appl. 2023, 3, 1–7.
Zheng, Y.; Chen, X.; Zhang, Q.; Zhang, Y.; Wang, Y.; Zou, X.; Zhou, Y. Spatial heterogeneity identification for rainfall-derived inflow and infiltration in urban sewer systems based on water level sensor networks: Insights from an interpretable deep learning method. Environ. Res. 2025, 286, 122999.
Longyang, Q.; Choi, S.; Tennant, H.; Hill, D.; Ashmead, N.; Neilson, B.T.; Newell, D.L.; McNamara, J.P.; Xu, T. An attention-based explainable deep learning approach to spatially distributed hydrologic modeling of a snow dominated mountainous karst watershed. Water Resour. Res. 2024, 60, e2024WR037878.
Yan, J.; Lu, Q.; Li, N.; Chen, L.; Pitt, M. Common data environment for digital twins from building to city levels. Autom. Constr. 2025, 174, 106131.
Mumuni, F.; Mumuni, A. Explainable artificial intelligence (XAI): From inherent explainability to large language models. arXiv 2025, arXiv:2501.09967.
Perez-Pozuelo, I.; Zhai, B.; Palotti, J.; Mall, R.; Aupetit, M.; Garcia-Gomez, J.M.; Taheri, S.; Guan, Y.; Fernandez-Luque, L. The future of sleep health: A data-driven revolution in sleep science and medicine. npj Digit. Med. 2020, 3, 42.

Figure 1. Schematic overview of the four-module explainable digital twin framework. The framework includes the following: M1—Data Collection and Preprocessing (integration of dynamic sensor data and GIS asset information); M2—Feature Construction (dynamic feature tensor generation with wavelet denoising and FFT features, and static feature embedding via PCA); M3—Hybrid TCN–LSTM Prediction Model (dilated TCN encoder, LSTM sequence learning, and multi-horizon forecasting); and M4—SHAP Explainability Module (gradient boosting surrogate and TreeSHAP feature attribution). Note: Arrows indicate data flow during inference.

Figure 2. Multi-panel architecture of the proposed explainable digital twin framework. (a) Overall system workflow integrating multi-source inputs, hybrid prediction, and SHAP-based attribution. (b) Rolling window scheme for multi-horizon forecasting. (c) Structure of a dilated TCN residual block with weight normalisation, activation, dropout, and residual connection.

Figure 4. Aggregated SHAP feature importance at the 15 min (a) and 60 min (b) prediction horizons. Values represent mean absolute SHAP values across test events and monitoring nodes. Static built-asset descriptors are shown with diagonal pattern fills, and dynamic sensor features are shown with dotted pattern fills.

Table 1. Key characteristics of the case study drainage system, Yangtze River Delta, Anhui Province, China.

Category	Parameter	Value
Catchment characteristics	Catchment area	~48 km²
	Mean impervious surface fraction	0.72
	Average terrain slope	0.3%
Network infrastructure	Pipe segments	~2400
	Manholes	~1850
	Pump stations	8
	Pipe diameter range	300–1000 mm
Monitoring system	Water level sensors	86
	Flow meters	34
	Rain gauges	68 (one per 0.7 km²)
Dataset characteristics	Observation period	November 2021–November 2022 (12 months)
	Temporal resolution	5 min
	Raw sensor records (pre-QC)	~9.1 × 10⁶
	Data retention after QC	94.7%

Note. QC = quality control.

Table 2. Six model configurations evaluated in the ablation study. C1–C3 isolate the effect of architecture (H2); C3 vs. C6 isolates the effect of built-asset descriptor fusion (H1); C6 is the full proposed framework.

Config.	Architecture	Features	Tests Hypothesis
C1	LSTM only	Sensor only	H2 (baseline)
C2	TCN only	Sensor only	H2
C3	TCN–LSTM	Sensor only	H2
C4	LSTM only	Sensor + asset	H1 (baseline)
C5	TCN only	Sensor + asset	H1
C6 (proposed)	TCN–LSTM	Sensor + asset	H1, H2, H3

Table 3. Water level prediction performance on the test dataset.

Configuration	Category	RMSE (m)	MAE (m)	MSE (×10⁻⁴ m²)	R²
Linear Regression	ML baseline	0.098	0.075	92.43	0.742
Decision Tree	ML baseline	0.089	0.064	80.03	0.801
Random Forest	ML baseline	0.094	0.070	88.08	0.815
LSTM	DL	0.081	0.058	65.61	0.8275
TCN	DL	0.067	0.048	44.89	0.9143
TCN-LSTM	DL	0.061	0.044	37.21	0.9318

Note. MSE values are reported after multiplication by 10⁴ for readability. RMSE and MAE are reported in metres.
