Dominant Factor Analysis and Threshold Inflection Point Determination in Deep Learning-Based SWAT-LSTM Training Models with SHAP Interpretability Analysis

Tian, Jiake; Zhang, Jun; Tong, Jianjie; He, Huaxiang; Gu, Ruidan; Shang, Fenjie

doi:10.3390/w18080960


by Jiake Tian ¹, Jun Zhang ², Jianjie Tong ², Huaxiang He ³,*, Ruidan Gu ⁴ and Fenjie Shang

¹ School of Civil and Hydraulic Engineering, Ningxia University, Yinchuan 750021, China

² JiLin Province Water Resource and Hydropower Consultative Company of P.R.CHINA, Changchun 130021, China

³ Department of Water Resources, China Institute of Water Resources and Hydropower Research, Beijing 100038, China

⁴ School of Water Resources and Civil Engineering, China Agricultural University, Beijing 100083, China

* Author to whom correspondence should be addressed.

Water 2026, 18(8), 960; https://doi.org/10.3390/w18080960

Submission received: 4 March 2026 / Revised: 5 April 2026 / Accepted: 15 April 2026 / Published: 17 April 2026

(This article belongs to the Section Ecohydrology)


Abstract

Climate change has intensified extreme hydrological risks, particularly in basins characterized by frequent seasonal streamflow interruptions and discontinuous hydrological records, where traditional process-based models exhibit limited capability for adaptive water resource management. This study develops a hybrid SWAT-LSTM framework that integrates SWAT-derived hydrological variables with meteorological factors and applies SHAP interpretability analysis to quantify dominant drivers and identify threshold inflection points of runoff variability. Using the upper and middle reaches of the Huolin River Basin as a case study, the coupled model outperformed the standalone SWAT model during the test period (NSE: 0.876 vs. 0.710; R²: 0.884 vs. 0.736) and more accurately reproduced extreme flood and drought events. Future projections (2026–2100), driven by the optimized FGOALS-g3 climate model under SSP2-4.5 and SSP5-8.5 scenarios, indicate increasing precipitation, accelerated minimum temperature rise, and a non-stationary runoff pattern characterized by a mid-century decline followed by a late-century increase. The SHAP results reveal strengthened meteorological dominance, particularly for precipitation and minimum temperature, while soil moisture, evapotranspiration, and percolation remain key hydrological controls. The upward shift in the minimum temperature threshold reflects strengthened temperature control on runoff dynamics under warming. The proposed framework improves extreme runoff prediction and provides a quantitative basis for climate-adaptive basin management.

Keywords:

SWAT-LSTM; runoff simulation; model interpretability; climate change projection

1. Introduction

Runoff variability is influenced by multiple interacting factors, including climate change and human activities, and exhibits pronounced spatiotemporal heterogeneity and nonlinear behavior. These characteristics pose significant challenges for the development of robust watershed hydrological models [1]. This challenge is particularly evident in regions characterized by low runoff volumes and frequent seasonal streamflow interruptions, where hydrological observations are often discontinuous and unevenly distributed. Under such conditions, the calibration and validation of traditional process-based models become highly uncertain, highlighting the need for models with enhanced robustness, generalization capability, and the ability to capture complex nonlinear relationships [2].

The Soil and Water Assessment Tool (SWAT) [3] has been widely applied in watershed hydrological simulations due to its strong physical basis, computational efficiency, and broad applicability. It represents key hydrological processes by integrating multiple factors, including climate, vegetation, land use, and soil properties. Through Hydrologic Response Unit (HRU) discretization, SWAT effectively captures spatial heterogeneity and simulates processes such as infiltration, evapotranspiration, and runoff generation. Owing to its modular structure and scalability, SWAT has been extensively used in flood forecasting, drought assessment, and climate change impact studies worldwide [4,5,6]. For instance, in climate change research, the SWAT model can resolve the impact mechanisms of key climate factors—such as precipitation intensity and temperature rise—on watershed runoff. Through long-term simulations, it reveals the evolution patterns of hydrological processes, providing quantitative support for developing adaptive water resource allocation strategies [7]. The model can also be coupled with multi-scenario climate models to characterize the spatiotemporal differentiation of runoff generation and transport processes. Particularly in the context of frequent extreme rainfall events, this enables the optimization of flood control thresholds and the development of drought risk management plans for watersheds [8].

However, as application scenarios expand, the limitations of SWAT have gradually become apparent. The model involves a large number of parameters, and interactions among them give rise to equifinality (“different parameters with the same effect”), which complicates calibration. This is particularly problematic in basins with discontinuous hydrological records caused by seasonal river drying, where it may lead to significant simulation uncertainty [9]. Furthermore, SWAT cannot fully capture abrupt hydrological responses under strong climatic variability or the suppression of evapotranspiration at low temperatures [10]. Its limited capacity for instantaneous responses to extreme hydrological events, inadequate characterization of surface–groundwater interaction mechanisms, and deficiencies in coupling hydrological processes may trigger cumulative errors in long-term simulations, compromising the reliability of results [11].

Deep learning has increasingly complemented traditional hydrological modeling by capturing complex nonlinear relationships from historical data with reduced reliance on explicit process equations. While physically based models like SWAT provide a robust mechanistic foundation, Long Short-Term Memory (LSTM) networks excel in capturing the long-range temporal dependencies inherent in rainfall–runoff processes [12,13]. This adaptability is particularly valuable for basins characterized by seasonal streamflow interruptions and discontinuous records, where LSTM can leverage multi-source data integration to maintain simulation accuracy. However, as hydrological modeling moves toward hybrid frameworks, the focus is shifting from static prediction to understanding the evolution of these relationships. This study integrates SWAT-derived physical constraints with LSTM’s learning capacity not only to optimize simulation in data-scarce conditions, but also to diagnose the non-stationary shifts in hydrological drivers and threshold behaviors under future climate stresses.

In recent years, coupling SWAT with LSTM has emerged as a promising “physical–data fusion” approach, integrating process-based understanding with data-driven learning to improve runoff simulation accuracy and robustness under complex hydrological conditions [14,15,16]. Further research indicates that this fusion approach not only enhances prediction accuracy, but also effectively addresses data-scarce regions. Particularly in basins lacking detailed observational data, coupled models demonstrate significant advantages by fully leveraging physical constraints and data-driven capabilities [17]. Existing studies have explored various model coupling methods; for example, Lyu et al. [18] integrated SWAT with an LSTM-based SWAT-MODFLOW framework to improve hydrological prediction in basins with seasonal flow interruption. Jin et al. [19] enhanced runoff and peak flow simulation by incorporating remote sensing data into SWAT-LSTM models. Huang et al. [20] further improved model accuracy by using deep learning to correct residual errors in SWAT simulations. However, most existing SWAT-LSTM coupling studies focus primarily on improving simulation accuracy, and lack in-depth analysis of the internal driving mechanism of runoff variation within the coupled framework.

With the rapid development of data-driven approaches, concerns regarding their “black box” nature have become increasingly prominent [21]. Explainable artificial intelligence (XAI) has therefore emerged as a key research direction to improve model transparency by quantifying the relationships between input features and model outputs. Among various XAI methods, LIME and SHAP are widely used to interpret complex machine learning models. LIME provides local explanations but has limited capability in capturing global model behavior [22], whereas SHAP offers both local and global interpretability based on game-theoretic principles, making it particularly suitable for analyzing nonlinear hydrological processes. Previous studies have applied SHAP to runoff modeling, such as Bian et al. [23], who interpreted LSTM-based runoff simulations, and Wang and Peng [24], who analyzed driving factors using XGBoost models.

Although coupling physical models with deep learning has become a mainstream approach in hydrological modeling to improve predictive accuracy, and has achieved notable success in studies of land use change and climate response, the interpretability of such hybrid frameworks remains limited. As a result, simulation outcomes often lack support from basin-specific hydrological mechanisms [25]. Recent studies, such as Chen et al. [15], have utilized the SWAT-LSTM framework to markedly improve simulation performance in data-scarce basins; however, by neglecting the contribution mechanisms of internal variables, they fail to provide a sufficient theoretical foundation for water resource management decisions. Furthermore, although some researchers have introduced explainable AI (XAI) methods like SHAP, these applications are mostly confined to simpler machine learning models such as XGBoost. When applied to deep learning, these interpretations typically remain limited to historical attribution [26,27].

Against this backdrop, this study proposes a “Physics–Data–Interpretability” trinity framework that integrates the process-based SWAT model, the memory-enhanced LSTM network, and SHAP interpretability analysis. Unlike conventional standalone approaches, the framework incorporates SWAT-derived hydrological state variables as physical constraints to guide LSTM training, thereby ensuring physically consistent learning while preserving the model’s capability to capture long-term temporal dependencies and nonlinear residual patterns. Driven by multiple CMIP6 climate models and emission scenarios (e.g., SSP2-4.5 and SSP5-8.5), the SWAT-LSTM coupled model is employed to simulate and project runoff evolution in the Huolin River Basin over the period 2015–2100 under different future climate conditions. The optimal model configuration is determined based on validation performance and subsequently used for scenario-based future runoff simulations. In addition, SHAP interpretability analysis is applied to the optimal model to quantify the contributions of key climatic and hydrological drivers and to examine their temporal evolution. The results indicate a non-stationary shift in the threshold of minimum temperature (MinT) inflection points, providing a mechanistic explanation for the intensified hydrological response under warming conditions. Overall, the proposed framework extends beyond conventional black-box projections and offers a transparent and robust approach for understanding and managing runoff dynamics under climate change.

2. Materials and Methods

2.1. Study Area

The Huolin River is a primary tributary of the Nenjiang River system within the Songhua River basin. Located at the junction of western Jilin Province and Xing’an League, Inner Mongolia, it covers a total area of 11,500 km² with a total river length of 590 km (Figure 1). The upper reaches of the Huolin River basin traverse the hilly terrain of Huolingol City, where the Huolin River Reservoir has been constructed. This reservoir primarily supplies water to enterprises such as coal mines and thermal power plants, with an annual water supply volume reaching 25 million m³. Compared to pre-reservoir levels, runoff at the Mengji provincial boundary section has decreased by approximately 11%. The basin predominantly consists of alluvial plains, featuring floodplain wetlands approximately 300 m wide. Vegetation such as sedges and star grasses thrive here, forming a crucial component of the Xianghai Wetland Ecological Corridor. The basin experiences an arid climate with low rainfall, averaging about 380 mm annually, primarily concentrated between June and September. Annual evaporation reaches 1091 mm. This river exhibits seasonal flow patterns, with the wet season (June–September) accounting for 83% of annual runoff volume. The dry season spans November to March of the following year. Monitoring data from the Tongfaba hydrological station (1970–2020) indicates that continuous riverbed dry-outs exceeding 50 days predominantly occur in the downstream region between October and December.

2.2. Data Preparation

The meteorological driving data in this study were obtained from the National Meteorological Science Data Center of China (https://data.cma.cn), encompassing key variables including precipitation (PCP), maximum/minimum temperature (MaxT/MinT), wind speed (WS), solar radiation (SR), and relative humidity (RH). The data were extracted from seven representative meteorological stations located within the basin and its adjacent border areas (spatial distribution shown in Figure 1) to ensure accurate characterization of the regional climatic gradient. For the study area of approximately 11,500 km², the daily observed station data were assigned to each sub-basin based on centroid distance using the weather generator (WXGEN) embedded in the SWAT model, thereby effectively capturing the spatial heterogeneity of meteorological factors across the basin. The spatial datasets used for model construction consist of a digital elevation model (DEM), land use, and soil data. The DEM was derived from the Geospatial Data Cloud. Land use was based on the Land Use/Cover Change (LUCC) dataset provided by the Chinese Academy of Sciences, which was reclassified into six categories: cropland, forest, grassland, water bodies, urban/rural residential and industrial/mining land, and unused land. Soil data were sourced from the 1:1,000,000 soil database generated by the Institute of Soil Science, Chinese Academy of Sciences, during the Second National Soil Survey, with all soil parameters calculated using the Soil–Plant–Atmosphere–Water (SPAW) model. Runoff data were obtained from monthly observations at the Tongfaba Hydrological Station during 1970–2010, as documented in the China Hydrological Yearbook (Table 1).

For the monthly runoff observation data from 1970 to 2010, outlier detection was performed to identify and exclude non-physical continuous constant values resulting from instrument malfunctions or manual recording errors, thereby ensuring the statistical consistency of the input series. Given the specific nature of the Huolin River Basin as a typical semi-arid seasonal river, zero-flow records were explicitly retained. In the context of frequent streamflow interruptions (with dry periods often exceeding 50 days), zero flow represents the actual state of the hydrological cycle rather than missing data. Retaining these observations enables the model—particularly the Long Short-Term Memory (LSTM) network—to accurately capture hydrological response thresholds under extreme drought. This approach maintains hydrological integrity during the simulation and prevents the overestimation of total water resources that would arise from the artificial exclusion of low-flow values.
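The screening rule described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `flag_constant_runs`, the run-length cutoff `min_run`, and the sample series are all hypothetical. It flags runs of identical non-zero readings as suspect while leaving zero-flow runs untouched.

```python
from itertools import groupby

def flag_constant_runs(flows, min_run=3):
    """Mark values that belong to a suspicious run of identical readings.

    A value is flagged when it sits in a run of at least `min_run`
    identical NON-ZERO readings. Zero-flow runs are never flagged because
    they describe a genuinely dry channel. `min_run` is a hypothetical
    setting, not a value reported in the study.
    """
    flags = []
    for value, group in groupby(flows):
        run_length = len(list(group))
        suspect = value != 0 and run_length >= min_run
        flags.extend([suspect] * run_length)
    return flags

# Hypothetical monthly flows: three dry months, then a stuck-gauge plateau.
monthly_flow = [0.0, 0.0, 0.0, 1.2, 4.7, 4.7, 4.7, 4.7, 2.1, 0.0]
flags = flag_constant_runs(monthly_flow)
cleaned = [q for q, bad in zip(monthly_flow, flags) if not bad]
```

In this toy series the four repeated readings of 4.7 are dropped, whereas every zero-flow month is kept.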

2.3. Preparation of the Coupled Model

This study proposes the development of a SWAT-LSTM training model based on deep learning to address the challenges of poor simulation performance caused by low runoff and frequent streamflow interruptions in the watershed.

2.3.1. SWAT Model

This study employs SWAT (Version 2012) as the core model for hydrological process simulation, providing long-term hydrological data for coupling with the LSTM model. Using ArcGIS Pro (Version 3.4.3) and the DEM, the watershed was divided into 43 sub-basins, which were further subdivided into 293 hydrological response units (HRUs) according to land use, soil data, and slope, as shown in Figure 2. The model was driven by daily meteorological data, with input datasets summarized in Table 1, and simulated six hydrological variables: SW (soil water), GWQ (groundwater discharge), SURQ (surface runoff), PERC (percolation), ET (evapotranspiration), and PET (potential evapotranspiration). A one-year warm-up period (1970) was adopted to initialize soil moisture and groundwater conditions. The calibration period was set from 1971 to 2000, and the validation period from 2001 to 2010. Model calibration and validation were performed using the SUFI-2 algorithm within SWAT-CUP, and sensitivity analysis was conducted to identify parameters with significant influences on runoff. For sub-basins within the hydrological station’s controlled area, sub-basin outputs were aggregated into representative input features using the area-weighted average method (see Formula (1)). All series were checked for consistency and match in both temporal resolution and time span.

$$\overline{\mathrm{VAR}}_{i} = \frac{\sum_{j=1}^{n} \left( \mathrm{VAR}_{i,j} \times \mathrm{Area}_{j} \right)}{\sum_{j=1}^{n} \mathrm{Area}_{j}} \tag{1}$$

In the formula, $\overline{\mathrm{VAR}}_{i}$ represents the area-weighted mean of variable $i$, $\mathrm{VAR}_{i,j}$ denotes the simulated value of variable $i$ in sub-basin $j$, $\mathrm{Area}_{j}$ is the area of sub-basin $j$, and $n$ is the number of sub-basins being aggregated.
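As a small worked illustration of Formula (1), the sketch below aggregates one variable over three sub-basins. The function name `area_weighted_mean` and the soil-water and area values are hypothetical and are not taken from the study.

```python
def area_weighted_mean(values, areas):
    """Formula (1): sum(VAR_ij * Area_j) / sum(Area_j) over sub-basins j."""
    if len(values) != len(areas):
        raise ValueError("values and areas must have the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(v * a for v, a in zip(values, areas)) / total_area

# Hypothetical soil-water values (mm) and areas (km^2) for three sub-basins.
sw_mm = [100.0, 50.0, 20.0]
area_km2 = [10.0, 30.0, 60.0]
basin_sw = area_weighted_mean(sw_mm, area_km2)  # (1000 + 1500 + 1200) / 100
```

Here the largest sub-basin pulls the aggregate toward its own value, giving 37 mm rather than the unweighted mean of about 56.7 mm.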

2.3.2. Coupled SWAT-LSTM Approach

This study employed a SWAT-LSTM model to simulate basin hydrological processes from 1970 to 2010. Building upon the SWAT model, the LSTM model was constructed using TensorFlow Keras (Version 2.10). A neural network with 2–3 LSTM layers was established, incorporating Dropout and Dense layers between layers. The Adam optimizer and ReLU activation function were adopted, with MSE as the loss function and MAE as the evaluation metric. An early stopping mechanism was implemented to monitor validation set loss and prevent overfitting [28]. The detailed workflow is illustrated in Figure 3.

To further enhance model generalization capability and prevent overfitting, several regularization strategies were incorporated. First, Dropout layers were applied between LSTM layers to randomly deactivate a fraction of neurons during training, reducing co-adaptation among hidden units. Second, an early stopping mechanism was implemented to monitor validation loss; training was automatically terminated when validation performance ceased to improve, thereby avoiding excessive fitting to training data. Third, a sliding window strategy was adopted to capture temporal dependencies while controlling model complexity through optimized window size selection.
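The early stopping rule can be illustrated independently of any deep learning library. The sketch below is a plain-Python rendering of the usual patience-based logic. The `patience` and `min_delta` values and the loss sequence are hypothetical, since the study does not report its stopping settings.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops at the first epoch where the validation loss has failed
    to improve on the best value by more than `min_delta` for `patience`
    consecutive epochs. If that never happens, the last epoch is returned.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses recorded after each epoch.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49]
stop, best = early_stopping_epoch(losses, patience=3)
```

With these numbers, training halts at epoch 6 and the weights from epoch 3 would be the ones to keep. The later improvement at epoch 7 is never reached, which is the usual trade-off of a short patience.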

The input data for the SWAT-LSTM training model consists of six meteorological variables (PCP, MinT, MaxT, WS, SR, RH) and six SWAT-output hydrological variables (SW, GWQ, SURQ, PERC, ET, PET). Prior to input, the data undergoes variance-mean normalization preprocessing, with observed monthly runoff serving as the ground truth label. The historical records from 1970 to 2000 were utilized for model training and internal calibration. To prevent overfitting and ensure robust hyperparameter tuning, an internal validation mechanism was implemented during the training phase, where 25% of the samples were used in each epoch to monitor convergence and optimize the model configuration (e.g., learning rate, iteration count, and network depth) [29]. Once the optimal hyperparameters were determined based on this internal validation, the period from 2001 to 2010 was employed as a strictly independent testing set. This chronological separation ensures that the model’s generalization ability is rigorously evaluated on ‘unseen’ data, maintaining the integrity of the temporal forecasting task.
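A minimal sketch of the preprocessing described above, assuming that "variance-mean normalization" means z-score standardization and that the mean and standard deviation are estimated from the training period only (a common safeguard against leakage that the text does not state explicitly). The records and helper names are hypothetical.

```python
from statistics import mean, pstdev

def fit_standardizer(train_values):
    """Return (mean, std) estimated from the training period only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return mu, sigma

def standardize(values, mu, sigma):
    """Apply z = (x - mu) / sigma with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]

# Hypothetical (year, monthly precipitation in mm) records.
records = [(1999, 10.0), (2000, 30.0), (2000, 50.0), (2001, 70.0), (2002, 20.0)]

# Chronological split: up to 2000 for training, 2001 onward for testing.
train = [v for year, v in records if year <= 2000]
test = [v for year, v in records if year >= 2001]

mu, sigma = fit_standardizer(train)
train_z = standardize(train, mu, sigma)
test_z = standardize(test, mu, sigma)  # reuses the training statistics
```

Because the test series is scaled with training statistics, its values need not have zero mean or unit variance, which is expected.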

For the prediction of future runoff evolution (2026–2100), this study adopts a one-way coupling strategy. First, the downscaled CMIP6 meteorological forcing data are input into the calibrated SWAT model to simulate basin-scale physical hydrological variables for the future period, including SW, GWQ, SURQ, PERC, ET, and PET. Subsequently, these SWAT-generated hydrological variables are integrated with the original CMIP6 meteorological factors to construct a multi-source feature set, which is then fed into the trained LSTM model to obtain runoff responses under future climate forcing conditions. This workflow ensures that the deep learning model incorporates physically representative constraint variables when predicting future runoff dynamics.

To determine the optimal model architecture and training configurations, a systematic Grid Search strategy, coupled with cross-validation, was employed to explore the hyperparameter space outlined in Table 2. Among these parameters, the time window size is particularly critical, as it dictates the historical temporal dependencies the LSTM can capture. Consequently, a specific sensitivity analysis was conducted on the window size (ranging from 12 to 64 months). The results demonstrated that a 32-month window yielded the optimal predictive performance on the validation set.
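To make the role of the window size concrete, the sketch below builds sliding-window samples from a monthly series. It assumes that each target is the runoff of the last month in its window, which the text does not specify. The function name `make_windows` and the dummy data are hypothetical.

```python
def make_windows(features, targets, window):
    """Pair each target with the `window` feature rows ending at that month.

    features: list of per-month feature rows (each row a list of floats)
    targets:  list of per-month runoff values, same length as features
    Returns (X, y), where X[k] holds `window` consecutive rows and y[k] is
    the runoff of the final month in that window (an assumed alignment).
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    X, y = [], []
    for end in range(window, len(features) + 1):
        X.append(features[end - window:end])
        y.append(targets[end - 1])
    return X, y

# Dummy data: 492 months (the length of a 1970-2010 monthly record),
# 12 input features per month.
n_months, n_features = 492, 12
features = [[float(m)] * n_features for m in range(n_months)]
targets = [float(m) for m in range(n_months)]

X, y = make_windows(features, targets, window=32)
```

A 32-month window over 492 months yields 461 samples, each of shape 32 × 12. Longer windows give the network more history per sample but leave fewer samples to train on.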

2.4. Evaluation Metric

The model simulation performance was evaluated using the NSE (Formula (2)), R² (Formula (3)), and the Mean Absolute Error (MAE, Formula (4)) as assessment metrics [30,31]. The NSE takes values in (−∞, 1], with values closer to 1 indicating higher prediction accuracy; an NSE of 1 signifies perfect agreement between simulated and observed values. R² takes values in [0, 1] and reflects the strength of the linear correlation between predicted and observed values; values closer to 1 indicate stronger explanatory power of the model. The MAE takes values in [0, +∞) and measures model deviation as the average absolute difference between predicted and observed values; a value closer to 0 indicates higher prediction accuracy.

$$NSE = 1 - \frac{\sum_{i=1}^{n} (O_{i} - P_{i})^{2}}{\sum_{i=1}^{n} (O_{i} - \bar{O})^{2}} \tag{2}$$

$$R^{2} = \left[ \frac{\sum_{i=1}^{n} (O_{i} - \bar{O}) (P_{i} - \bar{P})}{\sqrt{\sum_{i=1}^{n} (P_{i} - \bar{P})^{2} \sum_{i=1}^{n} (O_{i} - \bar{O})^{2}}} \right]^{2} \tag{3}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |O_{i} - P_{i}| \tag{4}$$

In the equations, $P_{i}$ represents the simulated value at time step $i$; $O_{i}$ is the observed value at time step $i$; $\bar{P}$ is the mean of the simulated values; $\bar{O}$ is the mean of the observed values; and $n$ denotes the total number of data samples.
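The three metrics can be checked on a small example. The sketch below is illustrative only: the observed and simulated values are invented, and R² is computed as the squared Pearson correlation, an assumption chosen to match the stated [0, 1] range.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    o_bar = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, sim))
    sst = sum((o - o_bar) ** 2 for o in obs)
    return 1.0 - sse / sst

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    o_bar = sum(obs) / len(obs)
    p_bar = sum(sim) / len(sim)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, sim))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in sim)
    return cov ** 2 / (var_o * var_p)

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, sim)) / len(obs)

# Invented monthly runoff values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 2.5, 4.0]

scores = {"NSE": nse(obs, sim), "R2": r_squared(obs, sim), "MAE": mae(obs, sim)}
```

For this toy pair the NSE is 0.9, R² is about 0.914, and the MAE is 0.25. The NSE penalizes bias as well as scatter, so it can fall below R² even when the two series are strongly correlated.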

2.5. SHAP Interpretability Analysis

This study incorporates the SHapley Additive exPlanations (SHAP) method to interpret complex temporal feature interaction mechanisms. Based on game theory principles, this method quantifies the contribution weights of model input features to simulation outcomes. SHAP values represent each input variable’s influence on runoff simulation results, measuring how input variables independently or collectively impact final simulation outcomes [32]. This method combines global feature importance assessment with local interpretability. Its mathematical foundation stems from additive feature attribution theory, which constructs a linear explanatory model [33], and its core expression is given by Formula (5):

$$g(z') = \phi_{0} + \sum_{i=1}^{N} \phi_{i} z'_{i} \tag{5}$$

In the equation, $g(z')$ denotes the explanation model's predicted value for a specific instance, $z'$ is the feature vector of the evaluated instance, $\phi_{0}$ represents the baseline (reference) output, $\phi_{i}$ is the contribution of the $i$-th feature to the prediction, $z'_{i}$ is the value of the $i$-th feature for the evaluated instance, and $N$ is the total number of input features.
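The additive structure of Formula (5) can be verified on a toy model by computing exact Shapley values. The sketch below enumerates all feature orderings, which is feasible only for a handful of features and is not the DeepLIFT-based approximation used for the LSTM. The toy runoff function, the instance, and the baseline are hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all
    feature orderings. Features not yet 'added' keep their baseline value."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            updated = model(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orderings) for p in phi]

def toy_runoff(v):
    """Hypothetical response with a PCP x MinT interaction term."""
    pcp, min_t, sw = v
    return 0.5 * pcp + 0.2 * sw + 0.01 * pcp * min_t

x = [80.0, 5.0, 30.0]          # instance to explain: PCP, MinT, SW
baseline = [40.0, 0.0, 20.0]   # reference input

phi = shapley_values(toy_runoff, x, baseline)
phi_0 = toy_runoff(baseline)
reconstructed = phi_0 + sum(phi)   # Formula (5) with every feature present
```

The reconstructed value equals the model output for `x`, which is the additivity property that Formula (5) expresses. The interaction term is split between PCP and MinT rather than assigned to either one alone.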

For the deep neural network structure of the LSTM model, the DeepExplainer variant within the SHAP framework was employed for feature attribution analysis. DeepExplainer integrates the DeepLIFT algorithm with the Shapley value concept from game theory, enabling efficient handling of nonlinear activation functions within LSTM while providing high-fidelity approximations of feature contributions.

2.6. Selection of Future Climate Models

Climate factors are the primary drivers of runoff variability in this watershed. To investigate the impact of future climate change on runoff, this study employs the Common Metadata Test Data from the Sixth Phase of the Coupled Model Intercomparison Project (CMIP6). It incorporates Shared Socioeconomic Pathways (SSPs) and Representative Concentration Pathways (RCPs) to form the SSP-RCP combination. The SSP2-4.5 (medium emissions) and SSP5-8.5 (high emissions) scenarios were selected to evaluate the effects of different climate pathways on runoff processes. To mitigate uncertainties inherent in a single model, a multi-model ensemble mean (MME) was employed, comprising four leading CMIP6 models: CanESM5, FGOALS-g3, GFDL-CM4, and IPSL-CM6A-LR. These models were developed by authoritative institutions from different countries, with data sourced from a 0.25° daily meteorological dataset for the Chinese region [34], as detailed in Table 3.

3. Results

3.1. Parameter Sensitivity Analysis

This study calibrated the SWAT model using the SUFI-2 algorithm with the SWAT-CUP software. After 1000 iterations, 15 parameters significantly influencing runoff simulation results were selected for sensitivity ranking and optimal value selection, as detailed in Table 4. Identifying sensitive parameters helps clarify key factors affecting runoff simulation, thereby improving model parameter tuning efficiency. The selection criteria for sensitive parameters are defined by t-values and p-values. The t-value indicates the magnitude of parameter sensitivity, with larger absolute values signifying greater sensitivity. The p-value indicates the statistical significance of that sensitivity, with values closer to 0 indicating a more significant parameter.

Table 4 reveals that CN2 is the most sensitive parameter; it is primarily related to the initial conditions of surface runoff generation and determines the “trigger mechanism” for runoff formation [35]. The second is SOL_AWC, which governs soil moisture redistribution and influences evapotranspiration rates and groundwater recharge dynamics, and the third is GW_DELAY, which determines groundwater response time. These three parameters form a surface–soil–groundwater sensitivity chain, constituting the primary sensitive factors affecting runoff.

3.2. Model Performance Evaluation

This study systematically evaluates the runoff simulation performance of the SWAT model and the SWAT-LSTM training model. As shown in Figure 4, key evaluation metrics for both models during the training and testing periods are compared and summarized in Table 5.

The results indicate that the SWAT model performed well overall during the calibration period (1971–2000), with an R² of 0.753 and NSE of 0.738, maintaining consistent trends between simulated and observed values. However, during the simulation of the 1998 extreme flood event, simulated values were significantly lower than observed values (deviation > 30%), indicating the model failed to adequately capture the actual flow during extreme flood events. During the validation period (2001–2010), model accuracy slightly decreased, with R² dropping to 0.736 and NSE to 0.710. Although simulated values closely matched observations at low flow rates, systematic underestimation persisted at high flow rates. Overall, the SWAT model demonstrates reasonable reliability in runoff simulation, yet its ability to capture and accurately simulate extreme events remains limited. To address this shortcoming, this study proposes a deep learning-based SWAT-LSTM training model. By integrating LSTM’s robust capabilities in time series modeling, this approach aims to further optimize runoff simulation performance.

The runoff simulation results of the trained SWAT-LSTM model are shown in Figure 5. The SWAT-LSTM coupled model underwent 200 independent training–validation iterations. During the training period (1971–2000), the model achieved an average R² of 0.953, an average NSE of 0.930, and an MAE of 0.522, representing improvements over the standalone SWAT model. Particularly during extreme events, such as the typical flood year of 1998 and the severe drought year of 2004, the values simulated by the SWAT-LSTM model closely matched the observations. This demonstrates that the LSTM neural network can effectively learn and capture the nonlinear characteristics of hydrological processes. During the testing period (2001–2010), the SWAT-LSTM model maintained high simulation accuracy, with an R² of 0.884, an NSE of 0.876, and an MAE of 0.765, a marked improvement over the standalone SWAT model (R² of 0.736 and NSE of 0.710), as shown in Table 5. Overall, by integrating physical mechanisms with data-driven advantages, the SWAT-LSTM coupled model significantly outperformed the standalone SWAT model in simulating extreme events, in reproducing both high- and low-flow segments, and in overall accuracy. This provides a more reliable solution for simulating complex hydrological processes.

3.3. Interpreted LSTM Behaviors

3.3.1. Global Feature Impact

This study assessed the importance of 12 key variables sensitive to meteorological and hydrological factors using Shapley values to clarify the contribution mechanisms of each factor to runoff formation, as shown in Figure 6.

The results indicate that PCP is the most influential factor in the model, with a SHAP value of 0.521 accounting for approximately 26.1% of total importance, highlighting its dominant role in runoff generation within the study basin. Variations in precipitation are strongly associated with positive or negative changes in runoff. SW and PET rank second and third with contribution rates of 12.17% and 10.62%, respectively, highlighting the critical role of soil moisture dynamics and evaporation processes in the hydrological cycle. MaxT ranked fourth, where higher temperatures increase evaporation, directly affecting runoff formation; SR indirectly influenced runoff by regulating energy balance and evapotranspiration demand, ranking fifth with an 8.52% contribution. PERC and ET jointly regulated runoff dynamics by altering soil infiltration rates and surface water loss intensity, collectively contributing 14.3%.
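Percentage contributions of this kind are commonly obtained by averaging the absolute SHAP values of each feature and normalizing by the total. The sketch below assumes that convention, which the text does not state explicitly, and uses a small invented SHAP matrix rather than the study's values.

```python
def global_importance(shap_matrix, feature_names):
    """Rank features by mean |SHAP| and report each share of the total (%).

    shap_matrix: list of rows, one per sample, one SHAP value per feature.
    Returns a list of (name, mean_abs_shap, percent) sorted by importance.
    """
    n_samples = len(shap_matrix)
    mean_abs = [
        sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    ranked = sorted(zip(feature_names, mean_abs), key=lambda t: t[1], reverse=True)
    return [(name, m, 100.0 * m / total) for name, m in ranked]

# Invented SHAP values for three samples and three features.
names = ["PCP", "SW", "RH"]
shap_matrix = [
    [0.6, -0.2, 0.1],
    [-0.4, 0.3, -0.1],
    [0.5, -0.1, 0.1],
]
ranking = global_importance(shap_matrix, names)
```

In this toy case PCP has a mean absolute SHAP value of 0.5 and a 62.5% share. Taking absolute values means the ranking reflects magnitude only; the direction of each effect has to be read from dependence plots.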

Notably, meteorological factors (PCP, MaxT, SR) collectively contributed over 44%, highlighting climate’s decisive role. Hydrological factors (SW, PERC) regulated runoff components through vertical processes of storage–infiltration, accounting for 22.79%. In contrast, factors such as RH, SURQ, and GWQ contributed less than 5% each, primarily due to the study area’s runoff generation mechanism dominated by rapid surface responses. Although this global ranking pattern structurally confirms the basin’s characteristics, relying solely on aggregated importances obscures the dynamic nature of runoff generation. Therefore, the subsequent analysis delves into SHAP dependency plots to rigorously evaluate the nonlinear threshold effects and complex compounding interactions between these primary drivers.

3.3.2. Total Effects of Factors

Building upon the global feature importance analysis, this study further constructs SHAP dependency plots to elucidate how key variables dynamically regulate runoff. As shown in Figure 7, each dependency plot places the observed values of a variable on the x-axis and the corresponding SHAP values on the y-axis. Scatter plots are combined with a smoothing polynomial regression curve to illustrate the nonlinear mapping between driving factors and runoff generation. Compared with traditional piecewise models, polynomial fitting reflects the gradual transitions of hydrological responses more continuously and realistically. In this study, the ‘sign-switching point’, defined as the point where the fitted curve crosses the SHAP zero baseline, is identified as the functional threshold of each driving factor. This critical point marks a fundamental shift in the variable’s effect on runoff from inhibitory (negative contribution) to promotional (positive contribution), thereby providing a statistical basis for quantitatively identifying threshold effects in hydrological processes.
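The sign-switching point described above can be located numerically by fitting a polynomial to the (variable value, SHAP value) pairs and finding where the fitted curve crosses zero. The Python sketch below is a minimal illustration under stated assumptions: the polynomial degree, the function name `sign_switch_points`, and the noise-free synthetic curve are hypothetical, and the article does not report the degree used for its fits.

```python
import numpy as np

def sign_switch_points(x, shap_values, degree=3):
    """Fit a polynomial to (x, SHAP) pairs and locate its zero crossings.

    Returns a sorted list of (x_crossing, direction) tuples, where direction
    is "negative-to-positive" or "positive-to-negative".
    """
    x = np.asarray(x, dtype=float)
    shap_values = np.asarray(shap_values, dtype=float)
    coeffs = np.polyfit(x, shap_values, degree)   # highest power first
    slope = np.polyder(coeffs)                    # derivative of the fitted curve
    crossings = []
    for root in np.roots(coeffs):
        # Keep only real roots inside the observed range of the variable.
        if abs(root.imag) > 1e-9 or not (x.min() <= root.real <= x.max()):
            continue
        rising = np.polyval(slope, root.real) > 0
        direction = "negative-to-positive" if rising else "positive-to-negative"
        crossings.append((float(root.real), direction))
    return sorted(crossings)

# Synthetic, noise-free dependence curve (illustrative values only).
x_demo = np.linspace(0.0, 30.0, 61)
shap_demo = 0.001 * (x_demo - 15.0) * (x_demo + 10.0)
print(sign_switch_points(x_demo, shap_demo, degree=2))
```

Restricting the roots to the observed range avoids reporting crossings where the polynomial is extrapolating beyond the data.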

Analysis of the dynamic response relationship between variables and runoff through SHAP dependency plots reveals that some variables exhibit positive correlations, while others demonstrate distinct threshold effects, which fall into two main types. The first is the abrupt threshold effect, where the runoff response surges or declines abruptly once the variable reaches a critical threshold, a typical “inflection point effect.” The second is the effect reversal type, where the SHAP value shifts from positive to negative after the variable exceeds a critical threshold, indicating a transition from a “promoting effect” to an “inhibiting effect” on runoff. This study focuses on revealing the correlations between variables and runoff, along with their threshold regulation mechanisms. The specific conclusions are as follows:

(a): The results indicate a significant positive correlation between PCP and runoff (Figure 7a). SHAP values consistently increase with rising precipitation levels, indicating that increased precipitation significantly promotes runoff generation.
(b): SR and RH exhibit threshold effects (see Figure 7b,c). When SR is below 23 MJ/m², its impact on runoff is negligible; however, exceeding this threshold inhibits runoff generation, with runoff gradually decreasing as SR increases overall. For RH values within the 30%–60% range, the effect on runoff is negligible. However, when RH exceeds 60%, runoff generation is significantly enhanced.
(c): Temperature exhibits a threshold effect: When MaxT is below approximately 20 °C, its influence on runoff is negligible; however, once it exceeds 23 °C, the SHAP value shifts sharply in a negative direction (Figure 7e), indicating that high temperatures suppress runoff generation by enhancing evapotranspiration. MinT below 5 °C inhibits runoff, with negligible effects between 0 °C and 15 °C. Above 15 °C, it shifts to a promoting effect, as shown in Figure 7f, reflecting the temperature sensitivity of runoff generation processes.
(d): SW also exhibits a significant positive correlation with runoff, as shown in Figure 8a. Particularly when SW exceeds 30%, higher soil moisture levels facilitate the formation of surface runoff, playing a positive promoting role.
(e): The SHAP values for both PET and ET show a slight downward trend, as illustrated in Figure 8b–d. The distributions are relatively concentrated and exhibit an overall negative correlation, indicating that they have a suppressing effect on runoff.
(f): As infiltration increases, the SHAP value rises significantly, as shown in Figure 8f, indicating that PERC exhibits a continuously strengthening positive correlation with runoff contribution. However, when infiltration is less than 5 mm, its contribution to runoff remains unclear.
(g): As shown in Figure 8e,f, the SHAP values of GWQ and SURQ both exhibit a pronounced positive response to increasing groundwater flow and surface runoff, respectively. The SHAP value of GWQ rises sharply above 0.15 mm when groundwater flow exceeds approximately 6 mm, and that of SURQ increases particularly markedly when surface runoff exceeds 20 mm, indicating that high-intensity flows contribute directly and significantly to total watershed runoff.

3.4. The Impact of Climate Change on Runoff

3.4.1. Evaluation of Climate Model Simulation Capabilities

This study employed the Thiessen polygon method to spatially weight the observed PCP, MaxT, and MinT data from 1995 to 2015 in the Huolin River Basin, along with the corresponding meteorological elements from four climate models. Three metrics (standard deviation, correlation coefficient, and centered root mean square error, CRMSE) were employed to evaluate each model’s simulation capability, ensuring the reliability of future climate input data. The assessment results were visualized using a Taylor diagram, as shown in Figure 9. In the Taylor diagram, each scatter point represents a different model. The radial distance from the origin denotes the standard deviation, the azimuthal angle denotes the correlation coefficient, and the concentric circles centered on the observation point represent different CRMSE values. A CRMSE closer to 0, a correlation coefficient closer to 1, and a standard deviation closer to the observed value indicate superior performance.
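The three Taylor-diagram statistics are linked by the identity CRMSE² = σ_m² + σ_o² − 2·σ_m·σ_o·R, where σ_m and σ_o are the model and observed standard deviations and R is their correlation coefficient. The Python sketch below computes the statistics for one model and checks the identity; the function name `taylor_statistics` and the short sample series are hypothetical and are not the basin data.

```python
import numpy as np

def taylor_statistics(obs, sim):
    """Return (sigma_obs, sigma_sim, correlation, centered RMSE)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    sigma_obs = obs.std()                       # population standard deviation
    sigma_sim = sim.std()
    corr = np.corrcoef(obs, sim)[0, 1]
    # Centered RMSE: remove each series' mean before differencing.
    crmse = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean())) ** 2))
    return sigma_obs, sigma_sim, corr, crmse

# Illustrative series only.
obs_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim_demo = np.array([1.5, 1.8, 3.5, 3.9, 5.6])
s_o, s_m, r, e = taylor_statistics(obs_demo, sim_demo)

# The identity that lets all three statistics share one diagram.
assert np.isclose(e**2, s_m**2 + s_o**2 - 2.0 * s_m * s_o * r)
```

Because the CRMSE removes the mean of each series, a model with a constant bias can still score well on all three statistics, so mean bias has to be assessed separately.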

Regarding precipitation, all four models had correlation coefficients above 0.94 with the observed data. IPSL-CM6A-LR performed best, with the highest correlation coefficient (0.9699), a standard deviation of 30.76 (closest to the observed 32.0), and the smallest CRMSE, while FGOALS-g3 followed with only a minor gap. For MaxT, all models showed correlation coefficients above 0.95; GFDL-CM4 had the highest (0.9731) and the smallest CRMSE, while FGOALS-g3 had a standard deviation of 13.49, nearly identical to the observed 13.5. For MinT, all models again had correlation coefficients above 0.95; GFDL-CM4 had the highest (0.9848), with a standard deviation of 13.10 close to the observed 13.0, and CanESM5 had the smallest CRMSE. Overall, FGOALS-g3 achieved the most balanced and stable performance in reproducing historical climate characteristics: although not the best model for precipitation, it closely matched the observed variability in both MaxT and MinT and trailed the best precipitation model only slightly.

3.4.2. Climate Change Trends Under Different Scenarios

This section examines the future changes in precipitation, maximum temperature, and minimum temperature within the basin under two scenarios across four models, using the period from 1995 to 2014 as the baseline.

Figure 10 presents the annual precipitation (PCP) projections from four CMIP6 models (CanESM5, FGOALS-g3, GFDL-CM4, and IPSL-CM6A-LR) under different scenarios. During the baseline period (1995–2014), simulated precipitation remained generally stable without significant long-term trends. However, in the future projection period, all models consistently exhibit a marked increasing trend under both SSP2-4.5 and SSP5-8.5, with the magnitude and slope of increase generally higher under the SSP5-8.5 scenario. Despite this consensus, discrepancies exist in extreme responses: CanESM5 projects the most drastic increases, with late-century peaks frequently exceeding 900 mm, while FGOALS-g3 and IPSL-CM6A-LR show more robust interannual variability. Overall, these multi-model results converge on a wetter future for the basin, with enhanced precipitation variability implying more frequent and intense hydrological extremes. This cross-model consistency underscores the high reliability of the projected increases.

Figure 11 focuses on temperature change projections from the four models. Temperatures rise continuously under all scenarios, with high emissions (SSP5-8.5) showing far greater warming than medium emissions (SSP2-4.5). Taking the FGOALS-g3 model as an example, under the SSP5-8.5 scenario, the maximum temperature in 2100 reaches 20 °C (7 °C above baseline), while the minimum temperature rises to 16 °C (+6 °C). Notably, while seasonal fluctuation ranges are maintained, the overall temperature baseline shifts significantly upward (e.g., the 2085 minimum temperature approaches today’s maximum). This signals frequent future occurrences of extreme heatwaves and unusually warm winters.

3.4.3. Runoff Variations Under Different Scenarios

Using the period 1995–2010 as the historical baseline, the calibrated and validated SWAT model and the SWAT-LSTM coupled model were each driven by multiple climate models under the SSP2-4.5 and SSP5-8.5 scenarios to simulate runoff changes from 2015 to 2100 (Figure 9 and Figure 10). Using 1990–2010 as the baseline period, the 2010–2100 period was divided into four stages, and the rate of change in annual runoff for each stage relative to the observed runoff during the baseline period was calculated (Figure 12 and Figure 13).
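The stage-wise rate of change is the percentage difference between the mean annual runoff of a stage and the observed baseline mean. The following Python sketch shows one way to compute it; the function name `stage_change_rates`, the stage boundaries, and the runoff values are hypothetical, since the article does not list the exact years that delimit its four stages.

```python
import numpy as np

def stage_change_rates(years, annual_runoff, baseline_mean, stages):
    """Percentage change of mean annual runoff per stage relative to a baseline.

    stages is a list of (start_year, end_year) tuples, both inclusive.
    """
    years = np.asarray(years)
    annual_runoff = np.asarray(annual_runoff, dtype=float)
    rates = {}
    for start, end in stages:
        in_stage = (years >= start) & (years <= end)
        stage_mean = annual_runoff[in_stage].mean()
        rates[(start, end)] = 100.0 * (stage_mean - baseline_mean) / baseline_mean
    return rates

# Illustrative numbers only: a baseline mean of 100 and two short stages.
demo_years = np.array([2011, 2012, 2013, 2014])
demo_runoff = np.array([50.0, 150.0, 200.0, 200.0])
demo_rates = stage_change_rates(demo_years, demo_runoff, 100.0,
                                [(2011, 2012), (2013, 2014)])
print(demo_rates)
```

A rate of −38.2% therefore means the stage mean is 61.8% of the baseline mean, and a rate of 235.6% means it is about 3.4 times the baseline mean.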

The SHAP-revealed threshold control mechanisms help explain the nonlinear characteristics of future runoff changes. Near-term (2000–2025) runoff decline: all model projections are dominated by precipitation reduction (Figure 9) coupled with rising temperatures (Figure 10), which triggers the dual suppression mechanism identified by SHAP. Taking the CanESM5 model as an example, a 50% precipitation reduction directly weakens the core driver (which contributes 22.1% of SHAP importance); simultaneously, when MaxT exceeds the 25 °C threshold (where the SHAP value turns negative), enhanced evapotranspiration further suppresses runoff generation, resulting in a simulated change of −38.2% in the SWAT-LSTM model.

Long-term (2076–2100) high-emission runoff surge: The FGOALS-g3 model exhibits a 58% precipitation surge under SSP5-8.5, where its positive SHAP effect (0.351) dominates runoff generation; simultaneously, its MinT (16 °C) approaches the identified temperature threshold enhancing runoff response (Figure 7e), while MaxT (20 °C) remains below the 25 °C suppression threshold. This creates a synergistic gain from “precipitation-driven + temperature window” effects, driving a 235.6% increase in SWAT-LSTM projections (Figure 14).

Regarding emission pathways, runoff increases under SSP5-8.5 were substantially higher than under SSP2-4.5, with greater uncertainty. At the climate model level, CanESM5 and FGOALS-g3 exhibited high-value simulations, while IPSL-CM6A-LR showed negative growth trends, indicating that model structural differences significantly influence runoff projections.

Regarding model performance, SWAT-LSTM better captures nonlinear hydrological response processes compared to traditional SWAT. It demonstrates greater stability and adaptability under scenarios of high temperature and high precipitation. The distribution of high-value zones in the heatmap is more concentrated, and the identification of inflection points in medium-to-long-term trends is more accurate.

3.4.4. Analysis of the Evolution of Dominant Factors and Threshold Inflection Points Under Future Climate Models

In the runoff simulations, the SWAT-LSTM coupled model markedly outperformed the standalone SWAT model. Among the four climate models, FGOALS-g3 yielded the most consistent and stable runoff simulation results under both the SSP2-4.5 and SSP5-8.5 scenarios, with the simulated runoff featuring a reasonable variability range, well-calibrated peak magnitudes, and a clear temporal evolution trend. Unlike the other three models, FGOALS-g3 avoided the spurious peaks present in the CanESM5 simulations and the excessive dampening of variability observed in IPSL-CM6A-LR. In summary, the SWAT-LSTM model driven by FGOALS-g3 provides the most balanced characterization of the long-term evolution trend, interannual variability, and extreme runoff processes, and FGOALS-g3 is thus selected as the optimal climate model for subsequent analyses.

The trained SWAT-LSTM coupled framework, driven by the optimal climate model FGOALS-g3 under two emission scenarios (SSP2-4.5 and SSP5-8.5), was combined with SHAP explainability analysis to quantitatively assess the dominant drivers of future (2010–2100) runoff evolution, the temporal dynamics of factor importance, and shifts in critical thresholds (inflection points).

Previous analysis revealed that future runoff exhibits a pronounced non-stationary evolution pattern under both emission scenarios, characterized by a “near-term decline—mid-to-late-century fluctuation intensification—late-century surge” structure. The high-emission scenario (SSP5-8.5) shows more frequent and higher-magnitude extreme peaks, indicating enhanced hydroclimatic instability under intensified warming. This pattern is mechanistically explained by the coupled model and SHAP analysis as follows:

(a): The SHAP value for PCP increased from 0.531 in the historical period to 0.708, indicating a significant strengthening of precipitation’s relative dominance over runoff in the future.
(b): The SHAP value for MinT rose from 0.092 to 0.16, reflecting an increasingly important role of minimum temperature in regulating hydrological responses. Elevated nighttime temperatures may influence soil moisture redistribution and atmospheric demand, thereby enhancing runoff sensitivity. In contrast, the SHAP value for MaxT declined from 0.188 to below 0.10, indicating that under higher thermal conditions, evapotranspiration intensification becomes dominant, suppressing runoff formation and reducing its net positive impact.
(c): In contrast, other meteorological factors (such as relative humidity, solar radiation, wind speed) and key hydrological factors (SW, PERC, GWQ, etc.) exhibit limited changes in magnitude, maintaining relatively stable overall influence patterns.
(d): Historical SHAP analysis identified key thresholds including MinT = 15 °C, MaxT = 23 °C for suppression effects, and SR ≈ 23 MJ/m². Under future scenarios, the most significant change is the elevation of the MinT threshold from 15 °C to 17 °C, while PCP thresholds remain largely stable. Three primary factors drive this threshold elevation: First, the overall rise in MinT modifies land–atmosphere energy exchange and soil moisture regulation processes, increasing the temperature sensitivity of runoff generation and making hydrological responses more temperature-driven. Second, significantly enhanced and increasingly extreme precipitation in the future leads to faster soil saturation, heightening runoff’s sensitivity to temperature. Finally, differences in seasonal precipitation and temperature distributions across models also contribute to threshold shifts.

4. Discussion

4.1. Physical Consistency and Uncertainty of the SWAT-LSTM Coupling Strategy

The SWAT-LSTM framework integrates the advantages of mechanism-based and data-driven approaches, offering stronger generalization capabilities and physical plausibility compared to standalone models [36,37]. While the SWAT model effectively provides physical process constraints, its sensitivity to extreme events is often limited [38]. In contrast, the LSTM model captures nonlinear precipitation–runoff relationships more effectively through sequence feature learning. In this study, the SWAT-LSTM simulation demonstrated superior adaptability during extreme events, such as the 1998 flood and the severe 2004 drought, significantly reducing deviations compared to the standalone SWAT model. This suggests that the time-series data generated by physical models embody internal hydrological mechanisms that can effectively guide the training of deep learning models [29,39].

However, the implementation of this coupling, particularly for future projections (2026–2100), requires careful consideration of error propagation. In this study, a one-way coupling strategy was employed; in particular, meteorological forcing from CMIP6 climate models was first input into the calibrated SWAT model to generate future intermediate physical state variables (e.g., SW, GWQ, SURQ), which were then used alongside raw meteorological data as inputs for the LSTM. While this ensures the model adheres to physical constraints, it introduces “cascading uncertainty.” Any simulation errors within SWAT under future climate scenarios—stemming from structural uncertainties or parameter non-stationarity—propagate directly into the LSTM input layer. This potential accumulation and amplification of error may affect the precision of long-term streamflow projections. Future research should therefore explore bidirectional coupling or error-correction algorithms (such as residual analysis) to quantify and mitigate this propagation of uncertainty across the physical–statistical interface.
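The one-way coupling described above amounts to concatenating the meteorological forcing with the SWAT-derived state variables at each time step and slicing the result into fixed-length sequences for the LSTM. The Python sketch below shows that data flow only; the function name `build_lstm_inputs`, the lookback length, and the choice to align each target with the last step of its window are assumptions for illustration, not details reported in the article.

```python
import numpy as np

def build_lstm_inputs(met, swat_states, runoff, lookback):
    """Stack forcing and SWAT outputs into (samples, lookback, features) windows.

    met:         (T, n_met) meteorological forcing, e.g. PCP, MaxT, MinT
    swat_states: (T, n_swat) SWAT outputs, e.g. SW, GWQ, SURQ
    runoff:      (T,) target series
    """
    met = np.asarray(met, dtype=float)
    swat_states = np.asarray(swat_states, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    # Any error in the SWAT columns enters the LSTM input here unchanged.
    features = np.concatenate([met, swat_states], axis=1)
    n_samples = features.shape[0] - lookback + 1
    X = np.stack([features[i:i + lookback] for i in range(n_samples)])
    y = runoff[lookback - 1:]        # target at the last step of each window
    return X, y

# Illustrative shapes only: 10 time steps, 2 forcing and 3 SWAT variables.
demo_met = np.arange(20, dtype=float).reshape(10, 2)
demo_swat = np.arange(30, dtype=float).reshape(10, 3)
demo_q = np.arange(10, dtype=float)
X_demo, y_demo = build_lstm_inputs(demo_met, demo_swat, demo_q, lookback=3)
print(X_demo.shape, y_demo.shape)
```

Because the SWAT columns are passed through without correction, perturbing them in such a pipeline is one simple way to probe how strongly upstream errors affect the LSTM output.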

4.2. Nonlinear Response and Interpretation of Runoff Drivers

SHAP interpretability analysis highlighted the critical roles of precipitation (PCP) and minimum temperature (MinT) in runoff generation. Regarding the identified MinT threshold (approximately 15–17 °C), this study suggests a cautious interpretation. In the semi-arid Huolin River Basin, MinT likely functions as a “seasonal indicator” rather than a direct physical driver of runoff. As MinT crosses the freezing point and continues to rise, it serves as a proxy for changes in the basin’s energy state, shifts in vegetation phenology, and transitions in precipitation phases (e.g., snow-to-rain). Since SHAP values reflect the statistical behavior of the model rather than proving direct physical causality, the observed threshold shift primarily indicates that under a warming climate, runoff generation is becoming increasingly sensitive to seasonal heat balance transitions. Future work should integrate higher-resolution physical process data, such as frozen soil thawing depths and fine-scale evapotranspiration monitoring, to further validate the mechanistic basis of these statistical findings.

4.3. Limitations and Future Directions

Despite the superior performance of the proposed framework in streamflow simulation, several limitations remain in its interpretability.

Simplification in SHAP Implementation: This study utilized the DeepExplainer variant of the SHAP framework to account for the nonlinear activation functions in the LSTM. However, when dealing with Recurrent Neural Networks (RNNs), SHAP typically treats the input features at each time step as independent variables for attribution. This is a simplification that overlooks the inherent temporal dependencies and lag effects of the LSTM architecture. This treatment may lead to an incomplete explanation of cumulative hydrological processes.
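The effect of treating time steps independently can be seen in how sequence attributions are collapsed into one value per feature. The Python sketch below assumes attributions are available as an array of shape (samples, time steps, features); the function name `aggregate_over_time` and the toy values are hypothetical, and the article does not state how attributions were aggregated across time steps.

```python
import numpy as np

def aggregate_over_time(attributions):
    """Collapse (samples, lookback, features) attributions in two ways.

    per_feature: signed sum over time steps, shape (samples, features)
    lag_profile: mean |attribution| per time step, shape (lookback, features)
    """
    attributions = np.asarray(attributions, dtype=float)
    per_feature = attributions.sum(axis=1)
    lag_profile = np.abs(attributions).mean(axis=0)
    return per_feature, lag_profile

# Toy case: one feature pushes runoff up at the oldest step and down at the
# newest step of the window.
demo_attr = np.zeros((2, 3, 2))
demo_attr[:, 0, 0] = 1.0
demo_attr[:, 2, 0] = -1.0
demo_total, demo_lags = aggregate_over_time(demo_attr)
print(demo_total[:, 0])   # opposing lagged effects cancel in the signed sum
print(demo_lags[:, 0])    # the lag profile still shows where they occur
```

In this toy case the signed sum hides lagged effects of opposite sign, which is why a per-lag view can complement a single per-feature ranking.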

Exclusion of Human Activities: The current model does not fully incorporate anthropogenic impacts on the lateral water cycles, such as reservoir regulation and irrigation diversions, which significantly impact actual runoff in arid and semi-arid regions [40].

Uncertainty in Future Climate Extrapolation: Applying the trained model to future climate conditions (2026–2100) introduces additional uncertainties, as future temperature and precipitation patterns may fall outside the range of historical observations. Deep learning models, such as LSTM, are inherently constrained by their training data and may face challenges in “out-of-distribution” extrapolation. Therefore, the projected hydrological responses in this study should be interpreted as indicators of potential trends and system sensitivities rather than precise quantitative forecasts [41].
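A simple first diagnostic for this extrapolation risk is the share of future input values that fall outside the range seen during training. The Python sketch below computes that share per feature; the function name `out_of_range_fraction` and the small arrays are hypothetical, and a range check of this kind detects only univariate extrapolation, not unfamiliar combinations of variables.

```python
import numpy as np

def out_of_range_fraction(train, future):
    """Fraction of future samples outside the training min/max, per feature."""
    train = np.asarray(train, dtype=float)
    future = np.asarray(future, dtype=float)
    lower = train.min(axis=0)
    upper = train.max(axis=0)
    outside = (future < lower) | (future > upper)
    return outside.mean(axis=0)

# Illustrative values only: two features, two training and four future samples.
demo_train = np.array([[0.0, 10.0], [5.0, 20.0]])
demo_future = np.array([[1.0, 25.0], [6.0, 15.0], [-1.0, 30.0], [3.0, 12.0]])
print(out_of_range_fraction(demo_train, demo_future))
```

Features with a high out-of-range share are the ones for which projected responses are least supported by the training data.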

Future research may expand and optimize in the following directions: (a) Develop an extended SHAP framework integrating local interpretability with spatiotemporal explicit modeling to enhance the resolution of heterogeneous hydrological behaviors within basins; (b) Construct simulation models coupling natural and artificial dual water cycles, incorporating actual water use and scheduling data to systematically identify the combined effects of human activities and climatic factors on runoff.

5. Conclusions

This study employed a deep learning-based SWAT-LSTM training model to simulate runoff processes in small watersheds. During the testing period, the NSE and R² values reached 0.876 and 0.884, respectively, representing a significant improvement over the standalone SWAT model (NSE = 0.710, R² = 0.736), demonstrating the advantage of integrating physical constraints with sequence learning in reproducing extreme events (e.g., the 1998 extreme flood and 2004 extreme drought).

Among the evaluated climate models, FGOALS-g3 was selected as the optimal model due to its balanced performance in reproducing historical variability and extreme characteristics. Future projections reveal a non-stationary runoff evolution pattern characterized by an initial decline, followed by intensified mid-term fluctuations and a pronounced late-century increase.

Shift in Climatic Control Mechanisms: SHAP analysis revealed a fundamental shift in runoff drivers from the historical to the future period. The relative importance of precipitation (PCP) increased from 0.53 to 0.71, while minimum temperature (MinT) rose from 0.09 to 0.16. Conversely, the contribution of maximum temperature (MaxT) declined due to intensified evapotranspiration suppression. These results indicate that future runoff will become increasingly precipitation-dominated and more sensitive to MinT.

Threshold (inflection point) shifts: the historical MinT threshold (≈15 °C) for snowmelt and runoff rises to approximately 17 °C in the future. This upward shift is mainly driven by changes in the rain–snow phase transition and snowmelt timing due to rising MinT, and by modified soil–surface water hydrodynamic processes induced by more extreme precipitation. A higher MinT threshold makes the watershed more prone to substantial runoff at an elevated baseline temperature, increasing the risk of extreme hydrological events.

Simulation results exhibit a non-stationary pattern of “recent attenuation followed by late-stage surge” (more pronounced under high-emission scenarios), and the frequency and intensity of extreme runoff peaks increase significantly. Strengthening threshold-sensitive watershed management can therefore support responses to extreme climate change.

Author Contributions

Methodology, J.T. (Jiake Tian); writing—original draft preparation, J.T. (Jiake Tian); writing—review and editing, H.H.; data curation, J.T. (Jiake Tian), J.T. (Jianjie Tong), R.G., J.Z. and F.S.; investigation, J.T. (Jianjie Tong), R.G., J.Z. and F.S.; supervision, H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The National Keypoint Research and Invention Program of the Fourteenth (No. 2021YFC3000204) and National Key Science and Technology Major Project for Environmental Governance in the Beijing–Tianjin–Hebei Region (Project No. 2025ZD1208303).

Data Availability Statement

The data presented in this study are available on request from the corresponding author, due to restrictions related to the management and confidentiality requirements of a national major water conservancy project.

Acknowledgments

The authors acknowledge the contributions of all authors of the ten papers in this Special Issue.

Conflicts of Interest

The authors Jun Zhang and Jianjie Tong were employed by the JiLin Province Water Resource and Hydropower Consultative Company of P.R.CHINA. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

References

1. Xin, Z.; Li, Y.; Zhang, L.; Ding, W.; Ye, L.; Wu, J.; Zhang, C. Quantifying the Relative Contribution of Climate and Human Impacts on Seasonal Streamflow. J. Hydrol. 2019, 574, 936–945.
2. Hülsmann, L.; Geyer, T.; Schweitzer, C.; Priess, J.; Karthe, D. The Effect of Subarctic Conditions on Water Resources: Initial Results and Limitations of the SWAT Model Applied to the Kharaa River Basin in Northern Mongolia. Environ. Earth Sci. 2015, 73, 581–592.
3. Zhao, J.; Zhang, N.; Liu, Z.; Zhang, Q.; Shang, C. SWAT Model Applications: From Hydrological Processes to Ecosystem Services. Sci. Total Environ. 2024, 931, 172605.
4. Zhang, X.; Chen, P.; Dai, S.; Han, Y. Analysis of Non-Point Source Nitrogen Pollution in Watersheds Based on SWAT Model. Ecol. Indic. 2022, 138, 108881.
5. Jodar-Abellan, A.; Valdes-Abellan, J.; Pla, C.; Gomariz-Castillo, F. Impact of Land Use Changes on Flash Flood Prediction Using a Sub-Daily SWAT Model in Five Mediterranean Ungauged Watersheds (SE Spain). Sci. Total Environ. 2019, 657, 1578–1591.
6. Pereira, D.d.R.; Martinez, M.A.; Pruski, F.F.; da Silva, D.D. Hydrological Simulation in a Basin of Typical Tropical Climate and Soil Using the SWAT Model Part I: Calibration and Validation Tests. J. Hydrol. Reg. Stud. 2016, 7, 14–37.
7. Kannan, N.; White, S.M.; Worrall, F.; Whelan, M.J. Sensitivity Analysis and Identification of the Best Evapotranspiration and Runoff Options for Hydrological Modelling in SWAT-2000. J. Hydrol. 2007, 332, 456–466.
8. Jimeno-Sáez, P.; Senent-Aparicio, J.; Pérez-Sánchez, J.; Pulido-Velazquez, D. A Comparison of SWAT and ANN Models for Daily Runoff Simulation in Different Climatic Zones of Peninsular Spain. Water 2018, 10, 192.
9. Nyeko, M. Hydrologic Modelling of Data Scarce Basin with SWAT Model: Capabilities and Limitations. Water Resour. Manag. 2015, 29, 81–94.
10. Zare, M.; Azam, S.; Sauchyn, D. A Modified SWAT Model to Simulate Soil Water Content and Soil Temperature in Cold Regions: A Case Study of the South Saskatchewan River Basin in Canada. Sustainability 2022, 14, 10804.
11. Tan, L.; Qi, J.; Marek, G.W.; Zhang, X.; Ge, J.; Sun, D.; Li, B.; Feng, P.; Liu, D.L.; Li, B.; et al. Assessing the Impacts of Extreme Precipitation Projections on Haihe Basin Hydrology Using an Enhanced SWAT Model. J. Hydrol. Reg. Stud. 2025, 58, 102235.
12. Cho, K.; Kim, Y. Improving Streamflow Prediction in the WRF-Hydro Model with LSTM Networks. J. Hydrol. 2022, 605, 127297.
13. Kratzert, F.; Klotz, D.; Brenner, C.; Schulz, K.; Herrnegger, M. Rainfall–Runoff Modelling Using Long Short-Term Memory (LSTM) Networks. Hydrol. Earth Syst. Sci. 2018, 22, 6005–6022.
14. Mei, Z.; Peng, T.; Chen, L.; Singh, V.P.; Yi, B.; Leng, Z.; Gan, X.; Xie, T. Coupling SWAT and LSTM for Improving Daily Streamflow Simulation in a Humid and Semi-Humid River Basin. Water Resour. Manag. 2025, 39, 397–418.
15. Chen, Z.; Xu, H.; Jiang, P.; Yu, S.; Lin, G.; Bychkov, I.; Hmelnov, A.; Ruzhnikov, G.; Zhu, N.; Liu, Z. A Transfer Learning-Based LSTM Strategy for Imputing Large-Scale Consecutive Missing Data and Its Application in a Water Quality Prediction System. J. Hydrol. 2021, 602, 126573.
16. Zhu, N.; Ji, X.; Tan, J.; Jiang, Y.; Guo, Y. Prediction of Dissolved Oxygen Concentration in Aquatic Systems Based on Transfer Learning. Comput. Electron. Agric. 2021, 180, 105888.
17. Phetanan, K.; Hong, S.M.; Yun, D.; Lee, J.; Chotpantarat, S.; Jeong, H.; Cho, K.H. Enhancing Flow Rate Prediction of the Chao Phraya River Basin Using SWAT–LSTM Model Coupling. J. Hydrol. Reg. Stud. 2024, 53, 101820.
18. Lyu, K.; Dong, Y.; Lyu, W.; Zhou, Y.; Wang, S.; Wang, Z.; Cui, W.; Zhang, Y.; Zhang, Q.; Cui, Y. Data-Driven and Numerical Simulation Coupling to Quantify the Impact of Ecological Water Replenishment on Surface Water-Groundwater Interactions. J. Hydrol. 2025, 649, 132508.
19. Jin, L.; Xue, H.; Dong, G.; Han, Y.; Li, Z.; Lian, Y. Coupling the Remote Sensing Data-Enhanced SWAT Model with the Bidirectional Long Short-Term Memory Model to Improve Daily Streamflow Simulations. J. Hydrol. 2024, 634, 131117.
20. Huang, C.; Zhang, Y.; Hou, J. Soil and Water Assessment Tool (SWAT)-Informed Deep Learning for Streamflow Forecasting with Remote Sensing and In Situ Precipitation and Discharge Observations. Remote Sens. 2024, 16, 3999.
21. Cambria, E.; Malandri, L.; Mercorio, F.; Mezzanzanica, M.; Nobani, N. A Survey on XAI and Natural Language Explanations. Inf. Process. Manag. 2023, 60, 103111.
22. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67.
23. Bian, L.; Qin, X.; Zhang, C.; Guo, P.; Wu, H. Application, Interpretability and Prediction of Machine Learning Method Combined with LSTM and LightGBM-a Case Study for Runoff Simulation in an Arid Area. J. Hydrol. 2023, 625, 130091.
24. Wang, S.; Peng, H. Multiple Spatio-Temporal Scale Runoff Forecasting and Driving Mechanism Exploration by K-Means Optimized XGBoost and SHAP. J. Hydrol. 2024, 630, 130650.
25. Khorn, N.; Ismail, M.H.; Nurhidayu, S.; Kamarudin, N.; Sulaiman, M.S. Land Use/Land Cover Changes and Its Impact on Runoff Using SWAT Model in the Upper Prek Thnot Watershed in Cambodia. Environ. Earth Sci. 2022, 81, 466.
26. Huan, J.; Fan, Y.; Xu, X.; Zhou, L.; Zhang, H.; Zhang, C.; Hu, Q.; Cai, W.; Ju, H.; Gu, S. Deep Learning Model Based on Coupled SWAT and Interpretable Methods for Water Quality Prediction under the Influence of Non-Point Source Pollution. Comput. Electron. Agric. 2025, 231, 109985.
27. Woo, S.; Kim, W.; Jung, C.; Lee, J.; Kim, Y.; Kim, S. Spatial Analysis of Aquatic Ecological Health under Future Climate Change Using Extreme Gradient Boosting Tree (XGBoost) and SWAT. Water 2024, 16, 2085.
28. Duong, T.D.; Tran, V.N.; Nguyen, T.V. Evaluating Rainfall-Runoff Generation Mechanisms of Deep Learning Models Using a Process-Based Rainfall-Runoff Model. Water Resour. Manag. 2025, 39, 5845–5859.
29. Chen, S.; Huang, J.; Huang, J.-C. Improving Daily Streamflow Simulations for Data-Scarce Watersheds Using the Coupled SWAT-LSTM Approach. J. Hydrol. 2023, 622, 129734.
30. Chai, T.; Draxler, R.R. Root Mean Square Error (RMSE) or Mean Absolute Error (MAE)?—Arguments against Avoiding RMSE in the Literature. Geosci. Model Dev. 2014, 7, 1247–1250.
31. Nash, J.E.; Sutcliffe, J.V. River Flow Forecasting through Conceptual Models Part I—A Discussion of Principles. J. Hydrol. 1970, 10, 282–290.
32. Štrumbelj, E.; Kononenko, I. Explaining Prediction Models and Individual Predictions with Feature Contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
33. Antwarg, L.; Miller, R.M.; Shapira, B.; Rokach, L. Explaining Anomalies Detected by Autoencoders Using Shapley Additive Explanations. Expert Syst. Appl. 2021, 186, 115736.
34. Flato, G.; Marotzke, J.; Abiodun, B.; Braconnot, P.; Chou, S.C.; Collins, W.; Cox, P.; Driouech, F.; Emori, S.; Eyring, V.; et al. Evaluation of Climate Models. In Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2014; pp. 741–866.
35. Pang, S.; Wang, X.; Melching, C.S.; Feger, K.-H. Development and Testing of a Modified SWAT Model Based on Slope Condition and Precipitation Intensity. J. Hydrol. 2020, 588, 125098.
36. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep Learning and Process Understanding for Data-Driven Earth System Science. Nature 2019, 566, 195–204.
37. Yang, S.; Yang, D.; Chen, J.; Santisirisomboon, J.; Lu, W.; Zhao, B. A Physical Process and Machine Learning Combined Hydrological Model for Daily Streamflow Simulations of Large Watersheds with Limited Observation Data. J. Hydrol. 2020, 590, 125206.
38. Arnold, J.G.; Fohrer, N. SWAT2000: Current Capabilities and Research Opportunities in Applied Watershed Modelling. Hydrol. Process. 2005, 19, 563–572.
39. Liang, Z.; Zou, R.; Chen, X.; Ren, T.; Su, H.; Liu, Y. Simulate the Forecast Capacity of a Complicated Water Quality Model Using the Long Short-Term Memory Approach. J. Hydrol. 2020, 581, 124432.
40. Mengistu, A.G.; van Rensburg, L.D.; Woyessa, Y.E. Techniques for Calibration and Validation of SWAT Model in Data Scarce Arid and Semi-Arid Catchments in South Africa. J. Hydrol. Reg. Stud. 2019, 25, 100621.
41. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.

Figure 1. Overview Map of the Study Area.

Figure 2. SWAT Modeling Spatial Data Map: (a) Watershed DEM elevation; (b) Land use data; (c) Watershed soil data classification; (d) Watershed sub-basin delineation.

Figure 3. Schematic Diagram of the Coupling Process between the SWAT and LSTM Models.

Figure 4. Comparison of simulated and observed runoff: (a) time series of runoff volume during the calibration period; (b) scatter plot of simulated versus observed values.

Figure 5. Comparison of SWAT-LSTM simulated and observed runoff: (a) time series of average runoff volume from 200 simulations; (b) scatter plot of simulated versus observed values.

Figure 6. SHAP Mean Calculation and Importance Ranking.

Figure 7. SHAP Dependency Plot for SWAT-LSTM Meteorological Factors.

Figure 8. SHAP Dependency Plot of Hydrological Factors in SWAT-LSTM.

Figure 9. Comparison of simulated and observed PCP, MaxT, and MinT data under four climate models (the black pentagram represents the observed data).

Figure 10. Future Rainfall Trend Maps.

Figure 11. Future Precipitation and Temperature Trends.

Figure 12. SWAT Runoff Simulation.

Figure 13. SWAT-LSTM Future Runoff Simulation.

Figure 14. Heatmap of Future Runoff Changes.

Table 1. Data types and sources for the model.

Data Type	Main Source	Description	Resolution
Meteorological Data	National Meteorological Science Data Center	Daily observations from seven stations during 1970–2010, including precipitation, temperature, wind speed, humidity, and solar radiation	Daily
DEM	Geospatial Data Cloud	Digital elevation model with a spatial resolution of 30 m	30 m
Land Use Data	Resource and Environmental Science Data Center, Chinese Academy of Sciences	Land use remote sensing data at 30 m resolution for 1970–2010	30 m
Soil Data	National Cryosphere Desert Data Center	The 1:1,000,000 scale soil data provided by the Nanjing Institute of Soil Science during the Second National Soil Survey.	1 km
Runoff Data	Tongfaba Hydrological Station	Monthly runoff observations from Tongfaba Hydrological Station during 1970–2010	Monthly

Table 2. SWAT-LSTM Model Hyperparameter Ranges and Optimal Values.

Hyperparameter Category	Parameter Name	Range	Optimal Values
Model Architecture	Number of LSTM Layers	(2~3)	3
Optimizer	Learning rate	(0.0001~0.01)	0.003
Training Settings	Batch size	(12~128)	64
	Epochs	(200~1000)	400
	Window size	(12~64)	32
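
The window size in Table 2 sets how many consecutive time steps the LSTM sees per sample. The following is a minimal sketch of how such input windows might be built, not the authors' implementation. It assumes the window counts monthly time steps (the runoff record in Table 1 is monthly) and that each window predicts the runoff of the step that follows it; the function name `make_windows`, the feature count of 8, and the random data are hypothetical placeholders.

```python
import numpy as np

def make_windows(features, target, window):
    """Pair each run of `window` consecutive feature rows with the target of the next step."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(features) - window
    if n_samples <= 0:
        raise ValueError("series is shorter than the window")
    # Sample i holds time steps i .. i+window-1; its label is the target at step i+window.
    X = np.stack([features[i:i + window] for i in range(n_samples)])
    y = target[window:]
    return X, y

# Hypothetical inputs: 492 monthly steps (41 years, 1970-2010) and 8 placeholder features.
rng = np.random.default_rng(0)
features = rng.normal(size=(492, 8))
runoff = rng.normal(size=492)

X, y = make_windows(features, runoff, window=32)  # 32 is the optimal window size in Table 2
print(X.shape, y.shape)  # (460, 32, 8) (460,)
```

The resulting array has the (samples, time steps, features) layout that LSTM layers in common deep learning libraries expect. A longer window leaves fewer training samples, which is one reason the window size is tuned alongside batch size and epochs.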

Table 3. CMIP6 Global Climate Models and Scenarios.

Model Name	Country	Institution	Scenarios	Rationale
CanESM5	Canada	CCCma	SSP2-4.5, SSP5-8.5	Representative Canadian model—Core member of CMIP
FGOALS-g3	China	IAP/CAS	SSP2-4.5, SSP5-8.5	Chinese model—Adapted to East Asian climate characteristics
GFDL-CM4	USA	NOAA-GFDL	SSP2-4.5, SSP5-8.5	Main NOAA model (USA)—Well-developed physical processes
IPSL-CM6A-LR	France	IPSL	SSP2-4.5, SSP5-8.5	Representative European model—High climate sensitivity

Table 4. Sensitivity analysis of SWAT Model Parameters.

Rank	Parameter Name	Physical Meaning	Parameter Range	Optimal Value
1	CN2	SCS runoff curve number	−0.5~0.5	−0.26
2	SOL_AWC	Soil available water capacity	−1~1	−0.28
3	GW_DELAY	Groundwater delay factor	0~300	17.5
4	GWQMN	Shallow groundwater flow coefficient	0~5000	2243
5	CH_K2	Main channel hydraulic conductivity coefficient	0~500	385.8
6	CANMX	Canopy interception	0~100	67.3
7	GW_REVAP	Groundwater re-evaporation coefficient	0~0.2	0.06
8	OV_N	Overland flow Manning’s roughness coefficient	0.01~1	0.49
9	SURLAG	Surface runoff lag coefficient	0~20	6.6
10	SOL_BD	Soil bulk density	0~5	2.74
11	REVAPMN	Shallow groundwater evapotranspiration threshold	0~500	71.33
12	ALPHA_BF	Baseflow recession coefficient	0~2	0.15
13	ESCO	Soil evaporation compensation factor	0~0.5	0.44
14	SOL_K	Soil saturated hydraulic conductivity	0~0.5	0.15
15	CH_N2	Channel Manning’s roughness coefficient	0~0.2	0.08

Table 5. Runoff simulation results of the Tongfaba Hydrological Station.

Model	Training/Calibration Period (1970–2000)			Testing/Validation Period (2001–2010)		
	R²	NSE	MAE	R²	NSE	MAE
SWAT	0.753	0.738	3.634	0.736	0.710	1.086
SWAT-LSTM	0.953	0.930	0.522	0.884	0.876	0.765
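
As an illustration of the three scores reported in Table 5, the sketch below computes NSE, R², and MAE for a pair of observed and simulated series. It is a hedged example rather than the authors' code: NSE follows the standard Nash and Sutcliffe definition, R² is assumed here to be the squared Pearson correlation (a common convention in hydrology; the paper's own definition in Section 2.4 should take precedence), and the four-value series are made up for demonstration.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the error variance relative to the observed variance."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values (assumed definition)."""
    r = np.corrcoef(np.asarray(obs, dtype=float), np.asarray(sim, dtype=float))[0, 1]
    return r ** 2

def mae(obs, sim):
    """Mean absolute error, in the same units as the runoff series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return np.mean(np.abs(obs - sim))

# Made-up series: the simulation tracks the observations with a constant bias of +0.5.
obs = np.array([1.0, 2.0, 3.0, 4.0])
sim = np.array([1.5, 2.5, 3.5, 4.5])

print(round(nse(obs, sim), 3))        # 0.8
print(round(r_squared(obs, sim), 3))  # 1.0
print(round(mae(obs, sim), 3))        # 0.5
```

The toy series shows why the metrics are reported together: a constant bias leaves the squared correlation at 1 but lowers NSE and raises MAE. Under this definition R² can never be smaller than NSE, which is consistent with every row of Table 5.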
