Dominant Factor Analysis and Threshold Inflection Point Determination in Deep Learning-Based SWAT-LSTM Training Models with SHAP Interpretability Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Data Preparation
2.3. Preparation of the Coupled Model
2.3.1. SWAT Model
2.3.2. Coupled SWAT-LSTM Approach
2.4. Evaluation Metric
2.5. SHAP Interpretability Analysis
2.6. Selection of Future Climate Models
3. Results
3.1. Parameter Sensitivity Analysis
3.2. Model Performance Evaluation
3.3. Interpreted LSTM Behaviors
3.3.1. Global Feature Impact
3.3.2. Total Effects of Factors
- (a) PCP shows a strong positive relationship with runoff (Figure 7a). SHAP values increase consistently with precipitation, indicating that higher precipitation promotes runoff generation.
- (b) SR and RH exhibit threshold effects (Figure 7b,c). When SR is below 23 MJ/m², its impact on runoff is negligible; above this threshold SR inhibits runoff generation, and runoff decreases gradually as SR increases. For RH between 30% and 60%, the effect on runoff is negligible; when RH exceeds 60%, runoff generation is markedly enhanced.
- (c) Temperature also exhibits threshold effects. When MaxT is below approximately 20 °C, its influence on runoff is negligible; once it exceeds 23 °C, the SHAP value shifts sharply in a negative direction (Figure 7e), indicating that high temperatures suppress runoff generation by enhancing evapotranspiration. MinT below 5 °C inhibits runoff, with negligible effects between 0 °C and 15 °C; above 15 °C it shifts to a promoting effect (Figure 7f), reflecting the temperature sensitivity of runoff generation processes.
- (d) SW also shows a clear positive relationship with runoff (Figure 8a). In particular, when SW exceeds 30%, higher soil moisture facilitates the formation of surface runoff.
- (e) The SHAP values for both PET and ET show a slight downward trend (Figure 8b–d). The distributions are relatively concentrated and negatively correlated with runoff overall, indicating a suppressing effect.
- (f) As infiltration increases, the SHAP value rises markedly (Figure 8f), indicating that PERC has a progressively stronger positive contribution to runoff. When infiltration is less than 5 mm, however, its contribution to runoff remains unclear.
- (g) The SHAP values of GWQ and SURQ both respond strongly and positively to increasing groundwater flow and surface runoff, respectively (Figure 8e,f). The GWQ contribution rises sharply above 0.15 mm when groundwater flow exceeds approximately 6 mm, and the SURQ contribution increases most markedly when surface runoff exceeds 20 mm, indicating that high-intensity flows contribute directly and substantially to total watershed runoff.
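The threshold inflection points listed above (for example, SR near 23 MJ/m²) are read from SHAP dependence relationships between a feature and its SHAP value. The text here does not state how the thresholds were located, so the following is only an illustrative sketch of one possible approach: a two-segment piecewise-linear fit that picks the split with the smallest total squared error. The function names and the synthetic SR data are hypothetical and are not taken from the study.

```python
def fit_line_sse(xs, ys):
    """Fit a least-squares line to (xs, ys) and return its sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_threshold(feature, shap_values, min_points=5):
    """Return the feature value that best splits the data into two linear segments."""
    pairs = sorted(zip(feature, shap_values))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best_sse, best_x = float("inf"), None
    # Try every split that leaves at least min_points samples on each side.
    for i in range(min_points, len(xs) - min_points + 1):
        sse = fit_line_sse(xs[:i], ys[:i]) + fit_line_sse(xs[i:], ys[i:])
        if sse < best_sse:
            # Report the midpoint between the two samples adjacent to the split.
            best_sse, best_x = sse, 0.5 * (xs[i - 1] + xs[i])
    return best_x


# Synthetic example (assumption): no effect below 23 MJ/m², negative effect above.
sr = [10 + 0.5 * k for k in range(41)]
shap_sr = [0.0 if x <= 23 else -0.05 * (x - 23) for x in sr]
estimated = find_threshold(sr, shap_sr)
print(f"Estimated SR threshold: {estimated:.2f} MJ/m²")
```

On real SHAP output the scatter is noisy, so the estimate is better treated as a range than as a single value, and it should be checked against the dependence plot itself.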
3.4. The Impact of Climate Change on Runoff
3.4.1. Evaluation of Climate Model Simulation Capabilities
3.4.2. Climate Change Trends Under Different Scenarios
3.4.3. Runoff Variations Under Different Scenarios
3.4.4. Analysis of the Evolution of Dominant Factors and Threshold Inflection Points Under Future Climate Models
- (a) The SHAP value for PCP increased from 0.531 in the historical period to 0.708, indicating that the relative dominance of precipitation over runoff strengthens markedly in the future.
- (b) The SHAP value for MinT rose from 0.092 to 0.16, reflecting an increasingly important role of minimum temperature in regulating hydrological responses. Elevated nighttime temperatures may influence soil moisture redistribution and atmospheric demand, thereby enhancing runoff sensitivity. In contrast, the SHAP value for MaxT declined from 0.188 to below 0.10, indicating that under higher thermal conditions intensified evapotranspiration becomes dominant, suppressing runoff formation and reducing the net positive impact of MaxT.
- (c) Other meteorological factors (such as relative humidity, solar radiation, and wind speed) and key hydrological factors (SW, PERC, GWQ, etc.) change little in magnitude and maintain relatively stable overall influence patterns.
- (d) Historical SHAP analysis identified key thresholds including MinT = 15 °C, MaxT = 23 °C for suppression effects, and SR ≈ 23 MJ/m². Under future scenarios, the most significant change is the rise of the MinT threshold from 15 °C to 17 °C, while PCP thresholds remain largely stable. Three primary factors drive this threshold rise. First, the overall increase in MinT modifies land–atmosphere energy exchange and soil moisture regulation processes, increasing the temperature sensitivity of runoff generation and making hydrological responses more temperature-driven. Second, stronger and increasingly extreme precipitation in the future leads to faster soil saturation, heightening the sensitivity of runoff to temperature. Finally, differences in seasonal precipitation and temperature distributions across models also contribute to threshold shifts.
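To put the shifts in items (a) and (b) on a common scale, the reported SHAP values can be expressed as relative changes between the historical and future periods. The sketch below uses only the numbers quoted above; the helper name `relative_change` is hypothetical. Because the future MaxT value is given only as "below 0.10", 0.10 is used as an upper bound, so the MaxT figure is a minimum decline rather than an exact one.

```python
def relative_change(historical, future):
    """Fractional change from the historical value to the future value."""
    return (future - historical) / historical


# SHAP values quoted in the text: (historical, future).
reported = {
    "PCP": (0.531, 0.708),
    "MinT": (0.092, 0.16),
    "MaxT": (0.188, 0.10),  # future value is an upper bound ("below 0.10")
}

changes = {name: relative_change(h, f) for name, (h, f) in reported.items()}
for name, change in changes.items():
    print(f"{name}: {change * 100:+.1f}%")
# PCP: +33.3%
# MinT: +73.9%
# MaxT: -46.8%
```

Read this way, MinT shows the largest proportional gain even though PCP remains the largest contributor in absolute terms.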
4. Discussion
4.1. Physical Consistency and Uncertainty of the SWAT-LSTM Coupling Strategy
4.2. Nonlinear Response and Interpretation of Runoff Drivers
4.3. Limitations and Future Directions
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Data Type | Main Source | Description | Resolution |
|---|---|---|---|
| Meteorological Data | National Meteorological Science Data Center | Daily observations from seven stations during 1970–2010, including precipitation, temperature, wind speed, humidity, and solar radiation | Daily |
| DEM | Geospatial Data Cloud | Digital elevation model with a spatial resolution of 30 m | 30 m |
| Land Use Data | Resource and Environmental Science Data Center, Chinese Academy of Sciences | Land use remote sensing data at 30 m resolution from 1970–2010 | 30 m |
| Soil Data | National Cryosphere Desert Data Center | The 1:1,000,000 scale soil data provided by the Nanjing Institute of Soil Science during the Second National Soil Survey. | 1 km |
| Runoff Data | Tongfaba Hydrological Station | Monthly runoff observations from Tongfaba Hydrological Station during 1970–2010 | Monthly |

| Hyperparameter Category | Parameter Name | Range | Optimal Value |
|---|---|---|---|
| Model Architecture | Number of LSTM Layers | (2~3) | 3 |
| Optimizer | Learning rate | (0.0001~0.01) | 0.003 |
| Training Settings | Batch size | (12~128) | 64 |
| Training Settings | Epochs | (200~1000) | 400 |
| Training Settings | Window size | (12~64) | 32 |
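
The window size in the table above sets how many consecutive time steps the LSTM sees for each prediction. As a minimal sketch of what that implies for data preparation, the function below slices a feature series into overlapping windows of 32 steps. The function name `make_windows`, the one-step-ahead target, and the toy data are assumptions for illustration; the text here does not describe the authors' exact input layout.

```python
def make_windows(features, target, window=32):
    """Build (input window, next-step target) pairs from aligned sequences.

    features: list of per-time-step feature vectors
    target:   list of per-time-step target values, same length as features
    """
    if len(features) != len(target):
        raise ValueError("features and target must have the same length")
    inputs, outputs = [], []
    for end in range(window, len(features)):
        inputs.append(features[end - window:end])  # steps end-window .. end-1
        outputs.append(target[end])                # value at the following step
    return inputs, outputs


# Toy data (assumption): 40 time steps with 2 features each.
toy_features = [[float(t), float(t) * 0.1] for t in range(40)]
toy_target = [float(t) for t in range(40)]
x, y = make_windows(toy_features, toy_target, window=32)
print(len(x), len(x[0]), len(x[0][0]))  # samples, window length, features
```

A series of length N yields N minus the window size samples, so larger windows leave fewer training samples for a record of fixed length.
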

| Model Name | Country | Institution | Scenarios | Rationale |
|---|---|---|---|---|
| CanESM5 | Canada | CCCma | SSP2-4.5, SSP5-8.5 | Representative Canadian model; core member of CMIP |
| FGOALS-g3 | China | IAP/CAS | SSP2-4.5, SSP5-8.5 | Chinese model; adapted to East Asian climate characteristics |
| GFDL-CM4 | USA | NOAA-GFDL | SSP2-4.5, SSP5-8.5 | Main NOAA model (USA); well-developed physical processes |
| IPSL-CM6A-LR | France | IPSL | SSP2-4.5, SSP5-8.5 | Representative European model; high climate sensitivity |

| Rank | Parameter Name | Physical Meaning | Parameter Range | Optimal Value |
|---|---|---|---|---|
| 1 | CN2 | SCS runoff curve number | −0.5~0.5 | −0.26 |
| 2 | SOL_AWC | Soil available water capacity | −1~1 | −0.28 |
| 3 | GW_DELAY | Groundwater delay factor | 0~300 | 17.5 |
| 4 | GWQMN | Shallow groundwater flow coefficient | 0~5000 | 2243 |
| 5 | CH_K2 | Main channel hydraulic conductivity coefficient | 0~500 | 385.8 |
| 6 | CANMX | Canopy interception | 0~100 | 67.3 |
| 7 | GW_REVAP | Groundwater re-evaporation coefficient | 0~0.2 | 0.06 |
| 8 | OV_N | Overland flow Manning’s roughness coefficient | 0.01~1 | 0.49 |
| 9 | SURLAG | Surface runoff lag coefficient | 0~20 | 6.6 |
| 10 | SOL_BD | Soil bulk density | 0~5 | 2.74 |
| 11 | REVAPMN | Shallow groundwater evapotranspiration threshold | 0~500 | 71.33 |
| 12 | ALPHA_BF | Baseflow recession coefficient | 0~2 | 0.15 |
| 13 | ESCO | Soil evaporation compensation factor | 0~0.5 | 0.44 |
| 14 | SOL_K | Soil saturated hydraulic conductivity | 0~0.5 | 0.15 |
| 15 | CH_N2 | Channel Manning’s roughness coefficient | 0~0.2 | 0.08 |

| Model | R² (Training/Calibration, 1970–2000) | NSE (Training/Calibration, 1970–2000) | MAE (Training/Calibration, 1970–2000) | R² (Testing/Validation, 2001–2010) | NSE (Testing/Validation, 2001–2010) | MAE (Testing/Validation, 2001–2010) |
|---|---|---|---|---|---|---|
| SWAT | 0.753 | 0.738 | 3.634 | 0.736 | 0.710 | 1.086 |
| SWAT-LSTM | 0.953 | 0.930 | 0.522 | 0.884 | 0.876 | 0.765 |
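
The three scores in the performance table can be computed from paired observed and simulated runoff series. The sketch below uses the standard Nash–Sutcliffe efficiency and mean absolute error; R² is taken here as the squared Pearson correlation, which is a common convention in hydrology but an assumption about this study's definition. The function names and the four-value example series are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mae(observed, simulated):
    """Mean absolute error, in the units of the runoff series."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


# Made-up example series (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 2.5, 4.0]
print(f"NSE={nse(obs, sim):.3f}  MAE={mae(obs, sim):.3f}  R2={r_squared(obs, sim):.3f}")
```

NSE equals 1 for a perfect simulation and 0 when the simulation is no better than the observed mean, which is why it is reported alongside R²: a high correlation can hide a systematic bias.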
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Tian, J.; Zhang, J.; Tong, J.; He, H.; Gu, R.; Shang, F. Dominant Factor Analysis and Threshold Inflection Point Determination in Deep Learning-Based SWAT-LSTM Training Models with SHAP Interpretability Analysis. Water 2026, 18, 960. https://doi.org/10.3390/w18080960
