Dam Seepage Analysis Based on Causal Testing and Regression Analysis
Abstract
1. Introduction
2. Method
2.1. Stepwise Regression Analysis
- (1) Initialization and setting of thresholds: Determine the significance levels for including variables in the model and for excluding them from it. In this paper, the inclusion threshold was set at 0.05 and the exclusion threshold at 0.1.
- (2) Calculate the contribution of variables not yet included in the model: For each candidate feature variable that is not yet in the model, calculate the partial F-statistic that would result if that variable alone were added. Then identify the candidate with the largest F-statistic, that is, the one with the smallest significance level (p) and the greatest contribution to reducing the model’s sum of squared errors. If the p-value of this variable is below the inclusion threshold, it is included in the model.
- (3) Calculate the degree of variable redundancy within the model: Because a new variable was added in the previous step, the partial contributions of the variables already in the model may have changed. It is therefore necessary to recalculate the partial F-statistics for all variables in the current model and identify the variable with the smallest F-statistic. If the p-value of this variable exceeds the exclusion threshold, it is removed from the model.
- (4) Iteration: Alternate between steps (2) and (3) to add new variables and eliminate redundant ones.
- (5) Termination: The algorithm terminates when none of the variables outside the model meets the inclusion criterion and none of the variables inside the model meets the exclusion criterion.
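
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function names (`rss`, `partial_f_pvalue`, `stepwise_select`) and the synthetic data are hypothetical, the regression is ordinary least squares with an intercept, and only the thresholds 0.05 and 0.1 are taken from the text.

```python
import numpy as np
from scipy import stats


def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X (X already contains the intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def partial_f_pvalue(X_small, X_big, y):
    """p-value of the partial F-test for two nested models that differ by one column."""
    n, k_big = X_big.shape
    rss_small, rss_big = rss(X_small, y), rss(X_big, y)
    f_stat = (rss_small - rss_big) / (rss_big / (n - k_big))
    return float(stats.f.sf(f_stat, 1, n - k_big))


def stepwise_select(X, y, p_enter=0.05, p_remove=0.10, max_iter=100):
    """Two-way stepwise selection; returns the sorted column indices kept in the model."""
    n, p = X.shape
    ones = np.ones((n, 1))

    def design(cols):
        # Intercept plus the chosen columns of X.
        return np.hstack([ones, X[:, cols]])

    selected = []
    for _ in range(max_iter):  # step (4): alternate until nothing changes
        changed = False
        # Step (2): try to add the most significant variable outside the model.
        outside = [j for j in range(p) if j not in selected]
        if outside:
            pvals = {j: partial_f_pvalue(design(selected), design(selected + [j]), y)
                     for j in outside}
            best = min(pvals, key=pvals.get)
            if pvals[best] < p_enter:
                selected.append(best)
                changed = True
        # Step (3): try to drop the least significant variable inside the model.
        if selected:
            pvals = {j: partial_f_pvalue(design([c for c in selected if c != j]),
                                         design(selected), y)
                     for j in selected}
            worst = max(pvals, key=pvals.get)
            if pvals[worst] > p_remove:
                selected.remove(worst)
                changed = True
        if not changed:  # step (5): termination
            break
    return sorted(selected)


# Synthetic demonstration (assumed data): only columns 0 and 3 drive y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 5))
y_demo = 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 3] + rng.normal(scale=0.5, size=300)
selected = stepwise_select(X_demo, y_demo)
print("selected columns:", selected)
```

Setting the inclusion threshold below the exclusion threshold, as in step (1), keeps a variable that has just been added from being removed again in the same pass, which is why the loop cannot cycle on a single variable.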
2.2. Granger Causality Test (GCT)
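- (1) Stationarity test. Before conducting the GCT, it is necessary to determine whether each time series is stationary, because non-stationary series can undermine the reliability of the test results. Common stationarity tests include the augmented Dickey-Fuller (ADF) test, the Phillips-Perron (PP) test, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test. If a series is non-stationary, it is differenced until it becomes stationary.
- (2) Information criterion method. After the stationarity tests, the optimal lag order between variables is determined using information criteria. This lag order also serves directly as the initial number of input time steps for the influencing factors in subsequent models. Commonly used information criteria include the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan-Quinn criterion (HQ). This paper used the AIC to determine the model order. In its standard form, $\mathrm{AIC} = 2k - 2\ln\hat{L}$, where $k$ is the number of estimated parameters and $\hat{L}$ is the maximized likelihood of the model; the lag order with the smallest AIC is selected.
- (3) Granger causality test. Once stationarity has been established and the lag order determined, the Granger causality test itself is carried out. In essence, it is an F-test on the residual sums of squares of a restricted regression (lags of the dependent variable only) and an unrestricted regression (lags of both the dependent variable and the candidate cause) [25]. In its standard form, $F = \frac{(RSS_r - RSS_u)/q}{RSS_u/(n - k)}$, where $RSS_r$ and $RSS_u$ are the residual sums of squares of the restricted and unrestricted regressions, $q$ is the number of lagged terms of the candidate cause, $n$ is the number of observations, and $k$ is the number of parameters in the unrestricted regression.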
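
Steps (2) and (3) can be illustrated with a short sketch. It is an assumption-laden example rather than the procedure used in this paper: the helper names (`lagged_design`, `select_lag_by_aic`, `granger_f_test`) and the synthetic series are hypothetical, both series are assumed to be stationary already, and the AIC is computed in its Gaussian regression form, m ln(RSS/m) + 2k, on a single-equation regression rather than on a full vector autoregression.

```python
import numpy as np
from scipy import stats


def lagged_design(series_list, lags):
    """Intercept plus lags 1..lags of each series; rows correspond to t = lags..n-1."""
    n = len(series_list[0])
    cols = [np.ones(n - lags)]
    for s in series_list:
        for k in range(1, lags + 1):
            cols.append(s[lags - k : n - k])  # value of the series at time t - k
    return np.column_stack(cols)


def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def select_lag_by_aic(y, x, max_lag):
    """Lag in 1..max_lag with the smallest Gaussian AIC, compared on a common sample."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    target = y[max_lag:]
    best_lag, best_aic = 1, np.inf
    for lags in range(1, max_lag + 1):
        # Drop the leading rows so every lag order is fitted on t = max_lag..n-1.
        X = lagged_design([y, x], lags)[max_lag - lags :]
        m, k = X.shape
        aic = m * np.log(rss(X, target) / m) + 2 * k
        if aic < best_aic:
            best_lag, best_aic = lags, aic
    return best_lag


def granger_f_test(y, x, lags):
    """F-test of H0: lagged x adds no predictive information for y beyond lagged y."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    target = y[lags:]
    X_r = lagged_design([y], lags)      # restricted: lags of y only
    X_u = lagged_design([y, x], lags)   # unrestricted: lags of y and x
    rss_r, rss_u = rss(X_r, target), rss(X_u, target)
    df_num, df_den = lags, len(target) - X_u.shape[1]
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return f_stat, float(stats.f.sf(f_stat, df_num, df_den))


# Synthetic demonstration (assumed data): x drives y with a one-step delay.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.3 * rng.normal()

lag = select_lag_by_aic(y, x, max_lag=4)
f_stat, p_value = granger_f_test(y, x, lag)
print(f"lag={lag}, F={f_stat:.2f}, p={p_value:.3g}")
```

A small p-value rejects the null hypothesis that the lags of `x` carry no additional information about `y`. Swapping the arguments, `granger_f_test(x, y, lag)`, tests the opposite direction.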
3. Project Overview
4. Analysis of Current Seepage Conditions
4.1. Regression Analysis of Seepage Pressure in Dams
- (1) Left bank pressure tube L1: When the reservoir water level is above 155.0 m, the difference between the L1 water level and the reservoir water level increases as the reservoir water level rises. When the reservoir water level drops below 155.0 m, the L1 water level rises relative to the reservoir water level as the reservoir water level falls. This indicates that the groundwater table within the mountain slopes on both sides of the project is relatively high, so seepage is driven not only by the reservoir water level but also by natural environmental influences. When the reservoir water level is below 155.0 m, L1 is therefore controlled primarily by the groundwater table within the mountain slopes.
- (2) When the reservoir water level is high, the water level in the L1 pressure tube shows a certain correlation with it, with minimal head reduction. L1 has been confirmed to be unrelated to the water level in the power tunnel, which indicates seepage around the left dam toe. The L1 water level shows an upward trend, and its correlation with the reservoir water level has strengthened in recent years, suggesting that the seepage control effectiveness of the impermeable barrier is weakening.
- (3) The difference between the L2 water level and the reservoir water level generally stabilizes at around 6 m. During periods of heavy rainfall, however, the groundwater level significantly affects this difference, with L2 influenced primarily by rainfall.
- (4) L3–L4 show a certain correlation with the reservoir water level, but both experience significant head reduction. L5–L6 show little correlation with the reservoir water level and exhibit substantial head reduction.
4.2. Determination of GCT Key Influence Factors
5. Conclusions
- This study employs a hybrid approach that combines multivariate Johansen cointegration analysis, Granger causality testing (GCT), and a two-way stepwise regression algorithm. The framework addresses the spurious correlations that often arise during variable selection with traditional black-box machine learning models or simple statistical correlation analysis. The results reveal the spatial heterogeneity and local dynamic evolution of the seepage field around the dam. A distinct seepage path exists in the left abutment area of the dam, and the groundwater levels in the mountain slopes on both sides have long remained high. The time series at the key monitoring points LB-1 and L-1 exhibit non-stationary, step-like trend changes, quantitatively confirming the dynamic adjustment of the hydraulic gradient and permeability characteristics in this local area and providing a basis for seepage early warning at the dam.
- Combining Granger causality testing with stepwise regression analysis effectively narrows the set of candidate characteristic parameters. The analysis indicates that seepage at LB-1 is controlled primarily by local hydraulic conduction within the dam, with P2-7 and P1-4 as the key driving factors; GCT further confirms that P2-7 has a time-lagged causal effect on LB-1. In contrast, seepage at L-1 is influenced by both internal hydraulic conduction and external environmental factors: it depends on internal hydraulic conduction via P1-4, and it is also affected by reservoir water level and rainfall.
- The results of the Granger causality test and the stepwise regression analysis corroborate one another. Johansen’s multivariate cointegration analysis reveals the long-term dynamic equilibrium of the dam seepage field, stepwise regression screens all candidate factors associated with synchronous fluctuations, and GCT uses time-lag characteristics to narrow the set of influencing factors further. Stepwise regression and GCT thus complement each other: together they identify the physical drivers of seepage around the dam and streamline the input variables of the predictive model at the source, reducing the spurious correlations to which pure machine learning models are prone. The key influencing factors identified in this way support the identification and reliable prediction of abnormal dam seepage in the future.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Cen, W.; Zheng, X.; Deng, C.; Cao, Y. Seepage safety of a rockfill-based earth dam heightening project. J. Chang. River Sci. Inst. 2025, 42, 147–153. Available online: https://link.cnki.net/urlid/42.1171.TV.20250507.1136.004 (accessed on 7 May 2025).
- Li, J.; Chen, X.; Gu, C.; Huo, Z. Seepage Comprehensive Evaluation of Concrete Dam Based on Grey Cluster Analysis. Water 2019, 11, 1499.
- Al-Janabi, A.M.S.; Ghazali, A.H.; Ghazaw, Y.M.; Afan, H.A.; Al-Ansari, N.; Yaseen, Z.M. Experimental and Numerical Analysis for Earth-Fill Dam Seepage. Sustainability 2020, 12, 2490.
- Liang, M.-C.; Chen, H.-E.; Tfwala, S.S.; Lin, Y.-F.; Chen, S.-C. The Application of Wireless Underground Sensor Networks to Monitor Seepage inside an Earth Dam. Sensors 2023, 23, 3795.
- Cheng, X.; Li, Q.; Zhou, Z.; Luo, Z.; Liu, M.; Liu, L. Research on a Seepage Monitoring Model of a High Core Rockfill Dam Based on Machine Learning. Sensors 2018, 18, 2749.
- Li, D.-Q.; Kang, Q.; Yan, K.; He, J.-P.; Liu, Y. A dynamic multi-objective inversion framework for seepage parameters based on monitoring data: Case study of an earth-rockfill dam. J. Hydrol. 2026, 669, 135064.
- Tong, G.; Zhang, H.; He, J.; Huang, S.; Zhang, Y.; Dong, Z. Anti-sliding stability analysis of the three gorges dam through seepage stress coupling inversion analysis. Water Resour. Power 2025, 43, 132–136.
- Li, Y.; Sun, X.; Li, G. Sensitivity analysis of 3D seepage in rockfill dam considering concrete panel crack. Yellow River 2024, 46, 155–160. Available online: https://kns.cnki.net/kcms2/article/abstract?v=iMwhGHIyCLY2rtvIkC88kIJhE44VTlsDC4S6wKjheXDa8JkqN_Fx608x7HIyE2mp1Tk-zjXLvs9WM0V0U-tTzqrp1H55ym5wfoSg1x5ikgNVd-CigVquF_NbQjwCyZZLYUXo-0PvPPhXaTJ5GgEnhwCz372e8xgeaWEl9Fe9efuC4LdgSSaa4w==&uniplatform=NZKPT&language=CHS (accessed on 10 August 2024).
- Zhou, R.; Hu, C.; Li, D. Analysis of Seepage Stability in Reinforced Clay Core Wall Dams Using a Hybrid Model Approach. Yellow River 2024, 46, 102–103. Available online: https://kns.cnki.net/kcms2/article/abstract?v=iMwhGHIyCLYCFFJoXtepEKo_Vd8Xet6PzXpsaNtyHlzYTUF_zymBCKuuZhi1Yq-6z1G8_HBJA430UOeF1ZqWWrSXqecg1dZ3cSKmfkkJAIt6HU4IBlbYBTG6mKMGATXuix467f350c0ecpV-bSwPMNGJ0elKOJQ0prU9ZdDtdKqjK-0SRZRHCQ==&uniplatform=NZKPT&language=CHS (accessed on 28 June 2024).
- Liu, C.; Lyu, L.; Zhang, Y. Inverse Analysis of Seepage Characteristics of Dalongdong Reservoir Dam. Pearl River 2025, 46, 19–21. Available online: https://kns.cnki.net/kcms2/article/abstract?v=iMwhGHIyCLbJgM7KJMs7IVKAff6Qf0nvbfHAyPtSixTyiCOb5y28XoodqyApyE9nq0hNyvK45PtE45iW7x4dOOOlb11JrUhRZCFnh83Tris9hZJ9SjZRFTIjY1V9P0it7XgEr-yoqKeAbThoLCu1tOfHGXUc_T9wLywSeDXF9BCN79j9d8z2Pw==&uniplatform=NZKPT&language=CHS (accessed on 30 June 2024).
- Li, F.; Wang, Z.Z.; Liu, G. Towards an Error Correction Model for dam monitoring data analysis based on Cointegration Theory. Struct. Saf. 2013, 43, 12–20.
- Wang, S.W.; Bao, T.F. Monitoring Model for Dam Seepage Based on Lag Effect. Appl. Mech. Mater. 2013, 353–356, 2456–2462.
- Su, H.; Hu, J.; Yang, M. Dam Seepage Monitoring Based on Distributed Optical Fiber Temperature System. IEEE Sens. J. 2014, 15, 9–13.
- Chen, X.; Xu, Y.; Guo, H.; Hu, S.; Gu, C.; Hu, J.; Qin, X.; Guo, J. Comprehensive evaluation of dam seepage safety combining deep learning with Dempster-Shafer evidence theory. Measurement 2024, 226, 114172.
- Arslan, C.A.; Al-Jalabi, F.A. Artificial intelligence models for seepage analysis through embankment dam-case study: Khasa Chi Dam. Earth Sci. Inform. 2025, 18, 550.
- Danish, A. Understanding the Effect of Hydro-Climatological Parameters on Dam Seepage Using Shapley Additive Explanation (SHAP): A Case Study of Earth-Fill Tarbela Dam, Pakistan. Water 2022, 14, 2598.
- Li, D.; Kang, Q.; Wang, R.; He, J.; Liu, Y. Application of artificial intelligence models in the seepage flow prediction of dam: A case study of Shenzhen reservoir. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2025, 19, 944–965.
- Zheng, C.; Cen, W.; Liu, B.; Qian, J.; Ding, Y.; Mo, C. Hybrid optimization and AI-driven surrogate model for seepage parameters inversion in complex dam foundations. J. Hydrol. 2026, 664, 134484.
- Yin, Q.; Li, Y.; Li, W.; Wen, L.; Zhang, Y.; Wang, T.; Yang, T.; Zhou, T. Intelligent inversion analysis of seepage parameters for deep overburden dam foundations based on an improved grey wolf optimization algorithm. Comput. Geotech. 2010, 188, 19.
- Zhou, Y.; Bao, T.; Shu, X.; Li, Y.; Li, Y. BIM and ontology-based knowledge management for dam safety monitoring. Autom. Constr. 2023, 145, 104649.
- Tian, D.; Liu, H.; Chen, S.; Li, M.; Liu, C. Human Error Analysis for Hydraulic Engineering: Comprehensive System to Reveal Accident Evolution Process with Text Knowledge. J. Constr. Eng. Manag. 2022, 148, 13.
- Xu, B.; Rong, Z.; Pang, R.; Tan, W.; Wei, B. A novel method for settlement imputation and monitoring of earth-rockfill dams subjected to large-scale missing data. Adv. Eng. Inform. 2024, 62, 102642.
- Li, D.; Chen, G.; He, N.; Xu, X. Advances in data processing and evaluation techniques for safety monitoring of earth-rock dams. Hydro-Sci. Eng. 2025, 5, 88–100.
- Shojaie, A.; Fox, E.B. Granger Causality: A Review and Recent Advances. Annu. Rev. Stat. Its Appl. 2021, 9, 289–319.
- Zhang, M.; Wang, W.; Yang, W.; Zhang, T.; Li, Z.; Jin, L.; Jiang, Z. Weir Flow Prediction of Panel Rockfill Dam Based on Causality Test and Elman Neural Network. Water Power 2026, 52, 108–115. Available online: https://link.cnki.net/urlid/11.1845.TV.20251212.1117.002 (accessed on 9 February 2026).



| Feature Data | Time Range | Data Volume |
|---|---|---|
| Reservoir water level | 27 July 2020–31 December 2022 | 890 |
| Rainfall | 28 July 2020–31 December 2022 | 560 |
| Seepage pressure P2-7 | 27 July 2020–31 December 2022 | 791 |
| Seepage pressure P2-8 | 27 July 2020–31 December 2022 | 158 |
| Seepage pressure P1-4 | 27 July 2020–31 December 2022 | 812 |
| Seepage pressure around the dam LB-1 | 27 July 2020–31 December 2022 | 1066 |
| Seepage pressure around the dam L-1 | 27 July 2020–31 December 2022 | 817 |
| Introduced Factors | Dependent Variable | R² | F | p | Regression Model |
|---|---|---|---|---|---|
| Seepage pressure P1-4 | Seepage pressure around the dam LB-1 | 0.686 | 169.541 | 0.000 *** | |
| Seepage pressure P2-7 | Seepage pressure around the dam LB-1 | 0.686 | 169.541 | 0.000 *** | |
| Seepage pressure P1-4 | Seepage pressure around the dam L-1 | 0.954 | 795.757 | 0.000 *** | |
| Seepage pressure P2-7 | Seepage pressure around the dam L-1 | 0.954 | 795.757 | 0.000 *** | |
| Reservoir water level | Seepage pressure around the dam L-1 | 0.954 | 795.757 | 0.000 *** | |
| Rainfall | Seepage pressure around the dam L-1 | 0.954 | 795.757 | 0.000 *** | |
| Order of Lag | AIC | SC | HQ | FPE | logL |
|---|---|---|---|---|---|
| 0 | 4.281 | 4.329 | 4.300 | 72.341 | −5779.300 |
| 1 | −16.208 | −15.875 * | −16.078 | 0.000 | −180.103 |
| 2 | −16.434 | −15.815 | −16.192 * | 0.000 | −82.378 |
| 3 | −16.425 | −15.519 | −16.070 | 0.000 | −48.699 |
| 4 | −16.363 | −15.169 | −15.896 | 0.000 | −29.041 |
| 5 | −16.566 | −15.083 | −15.986 | 0.000 | 61.846 |
| 6 | −16.549 | −14.777 | −15.856 | 0.000 | 93.650 |
| 7 | −16.480 | −14.418 | −15.674 | 0.000 | 111.407 |
| 8 | −16.386 | −14.033 | −15.465 | 0.000 | 122.475 |
| 9 | −16.442 | −13.797 | −15.407 | 0.000 | 173.784 |
| 10 | −16.641 * | −13.703 | −15.492 | 0.0 * | 263.155 |
| 11 | −16.606 | −13.374 | −15.341 | 0.000 | 289.872 |
| Null Hypothesis | Eigenvalue | Trace (Max Root) | 10% Threshold | 5% Threshold | 1% Threshold |
|---|---|---|---|---|---|
| No cointegration | 0.226 | 250.719 | 91.109 | 95.754 | 104.964 |
| At most 1 cointegrating relation | 0.085 | 112.366 | 65.820 | 69.819 | 77.820 |
| At most 2 cointegrating relations | 0.068 | 64.375 | 44.493 | 47.855 | 54.681 |
| At most 3 cointegrating relations | 0.030 | 26.414 | 27.067 | 29.796 | 35.463 |
| At most 4 cointegrating relations | 0.015 | 9.716 | 13.429 | 15.494 | 19.935 |
| At most 5 cointegrating relations | 0.002 | 1.282 | 2.705 | 3.841 | 6.635 |
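
As a worked reading of the trace-test table above, the sketch below applies the usual sequential decision rule: starting from the hypothesis of no cointegration, each null hypothesis is rejected while its trace statistic exceeds the critical value, and the first hypothesis that is not rejected gives the cointegration rank. The trace statistics and the 5% thresholds (95.754, 69.819, ...) are copied from the table; the function name `cointegration_rank` is hypothetical, and the resulting rank illustrates the rule rather than restating a result given in the text.

```python
# Trace statistics and 5% thresholds, copied row by row from the table above.
trace_stats = [250.719, 112.366, 64.375, 26.414, 9.716, 1.282]
crit_5pct = [95.754, 69.819, 47.855, 29.796, 15.494, 3.841]


def cointegration_rank(trace_stats, critical_values):
    """Return the first r for which H0 'at most r relations' is not rejected."""
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat <= crit:
            return r
    # Every null hypothesis was rejected: the system has full rank.
    return len(trace_stats)


rank = cointegration_rank(trace_stats, crit_5pct)
print(rank)  # → 3
```

Under this reading, the first three hypotheses are rejected and "at most 3" is not (26.414 < 29.796), which would correspond to three cointegrating relations among the six series.
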
| Paired Sample 1 | Paired Sample 2 | F | P |
|---|---|---|---|
| Rainfall | Reservoir water level | 4.059 | 0.047 ** |
| Reservoir water level | Rainfall | 0.13 | 0.719 |
| Reservoir water level | Seepage around the dam LB-1 | 2.968 | 0.088 * |
| Reservoir water level | Seepage around the dam L-1 | 1.586 | 0.211 |
| Rainfall | Seepage around the dam LB-1 | 0.924 | 0.339 |
| Rainfall | Seepage around the dam L-1 | 3.239 | 0.075 * |
| Seepage around the dam LB-1 | Reservoir water level | 20.465 | 0.000 *** |
| Seepage around the dam LB-1 | Rainfall | 0.105 | 0.746 |
| Seepage around the dam LB-1 | Seepage pressure P2-7 | 5.08 | 0.026 ** |
| Seepage around the dam LB-1 | Seepage pressure P2-8 | 0.45 | 0.504 |
| Seepage around the dam LB-1 | Seepage around the dam L-1 | 1.458 | 0.230 |
| Seepage around the dam LB-1 | Seepage pressure P1-4 | 3.281 | 0.073 * |
| Seepage around the dam L-1 | Reservoir water level | 0.67 | 0.415 |
| Seepage around the dam L-1 | Rainfall | 0.092 | 0.762 |
| Seepage around the dam L-1 | Seepage pressure P2-7 | 0.944 | 0.334 |
| Seepage around the dam L-1 | Seepage pressure P2-8 | 0.013 | 0.909 |
| Seepage around the dam L-1 | Seepage pressure P1-4 | 13.23 | 0.000 *** |
| Seepage around the dam L-1 | Seepage around the dam LB-1 | 0.637 | 0.427 |
| Dependent Variable | Stepwise Regression Analysis | GCT |
|---|---|---|
| Seepage around the dam LB-1 | Seepage pressure P1-4; Seepage pressure P2-7 | Reservoir water level; Seepage pressure P2-7 |
| Seepage around the dam L-1 | Seepage pressure P1-4; Seepage pressure P2-7; Reservoir water level; Rainfall | Seepage pressure P1-4 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, L.; Jin, Y.; Zhang, S.; Cheng, F. Dam Seepage Analysis Based on Causal Testing and Regression Analysis. Water 2026, 18, 1359. https://doi.org/10.3390/w18111359
