# Understanding the Effect of Hydro-Climatological Parameters on Dam Seepage Using Shapley Additive Explanation (SHAP): A Case Study of Earth-Fill Tarbela Dam, Pakistan


## Abstract


R^{2} scores between actual and predicted values of average seepage, suggesting their reliability in predicting seepage at the Tarbela Dam. Moreover, the CatBoost algorithm outperformed the others, achieving an R^{2} score of 0.978 in training, 0.805 in validation, and 0.773 in the testing phase. Similarly, its RMSE was 0.025 in training, 0.076 in validation, and 0.111 in the testing phase. Furthermore, Shapley Additive Explanations (SHAP), a model explanation algorithm, was used to understand the effect of each input parameter on the output (average seepage). A comparison of the SHAP results for all the machine learning models is also presented. According to the SHAP summary plots, reservoir level was the most significant parameter affecting the average seepage in the Tarbela Dam, and a direct relationship was observed between reservoir level and average seepage. It was concluded that the machine learning models are reliable in predicting and understanding seepage in the Tarbela Dam. These machine learning models address the limitations of manual data collection and analysis, which are highly prone to errors and can produce misleading information that may lead to dam failure.

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Study Area

m^{3}, but this decreased to 6.8 billion m^{3} due to siltation over the reservoir’s 35 years of operation. The Tarbela Dam is 2743 m long and 143 m high above the riverbed. It has two spillways, one of which cuts through the left bank and discharges into the Ghazi Barotha pond at a downstream site, while the other cuts through the right bank. One is a service spillway with seven gates, and the other is an auxiliary spillway with nine gates. Important characteristics of the reservoir include the catchment area (169,600 sq. km), the measured annual water inflow (64 million acre-feet (MAF)), the area of the lake (259 sq. km), the designed live storage capacity (9.680 MAF), the present live storage (6.849 MAF), the maximum depth (137 m), the maximum elevation (472.44 m), the minimum operational elevation (420.01 m), the crest elevation (477 m), and the length of the crest (2743 m). The installed capacity of the 4888-megawatt (MW) Tarbela Dam hydroelectric station will expand to 6298 MW following the completion of the 5th extension project, which is being financed by the Asian Infrastructure Investment Bank and the World Bank.

#### 2.2. Data Collection

#### 2.3. Algorithm Selection for Experiments

#### 2.3.1. Artificial Neural Network (ANN)

#### 2.3.2. Random Forest

_{tree}= 500.

#### 2.3.3. Support Vector Machine

#### 2.3.4. CatBoost

_{0} permutations are employed. Throughout training, CatBoost (CB) maintains the supporting trees $T_{r,j}$, where $T_{r,j}(i)$ is the current prediction for the $i$th instance based on the first $j$ examples in the permutation $\sigma_{r}$. This information is then used to construct a tree. The operation of CatBoost (CB) is described in Algorithm 1 below.

Algorithm 1: CatBoost |
---|
Input: $\{(X_k, y_k)\}\ \forall k = 1$ to $n$, $I$ |
1. $\sigma \leftarrow$ random permutation of $[1, n]$; |
2. $T_i \leftarrow 0$ for $i = 1, \ldots, n$; |
3. for $t \leftarrow 1$ to $I$ do |
4. $\quad$ for $i \leftarrow 1$ to $n$ do |
5. $\quad\quad r_i \leftarrow y_i - T_{\sigma(i)-1}(X_i)$; |
6. $\quad$ for $i \leftarrow 1$ to $n$ do |
7. $\quad\quad \Delta T \leftarrow \mathrm{LearnTree}\left((X_j, r_j) : \sigma(j) \le i\right)$; |
8. $\quad\quad T_i \leftarrow T_i + \Delta T$; |
9. return $T_n$ |

#### 2.4. Model Sensitivity
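
SHAP attributes a prediction to its input features using Shapley values from cooperative game theory. As a minimal sketch of that idea (it does not use the `shap` library or the fitted models of this study), the block below computes exact Shapley values for a two-feature toy function by averaging each feature's marginal contribution over all feature orderings. The function `toy_seepage`, its coefficients, the normalized inputs, and the zero baseline are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the reference input
        prev = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            new = model(current)
            phi[i] += new - prev      # marginal contribution of feature i
            prev = new
    return [p / len(orders) for p in phi]


def toy_seepage(features):
    """Hypothetical model: NOT the study's model, coefficients are made up."""
    level, temperature = features
    return 3.0 * level + temperature + level * temperature


x = [1.0, 2.0]          # hypothetical normalized reservoir level and temperature
baseline = [0.0, 0.0]   # hypothetical reference input

phi = shapley_values(toy_seepage, x, baseline)
print(phi)  # [4.0, 3.0]
```

The two attributions sum to the difference between the prediction for `x` and the prediction for the baseline (7.0 here). This additivity is what allows per-feature contributions to be compared across features in a summary plot.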

#### 2.5. Data Preparation

#### 2.6. Descriptive Statistic

#### 2.7. Model Evaluation Metrics

R^{2}) and Nash-Sutcliffe Efficiency (NSE). Many studies have used these metrics to measure the accuracy of machine learning models [96,97]. R^{2} ranges from 0 to 1 and is commonly also written as 0–100%. R^{2} values closer to 1 indicate a better fit, whereas values below 0.5 and closer to zero indicate a poor fit. Similarly, RMSE ranges from 0 to ∞ and is negatively oriented, meaning that lower values are preferable and indicate better model performance. NSE ranges from −∞ to 1 and is comparable with R^{2} over the range 0–1. NSE = 1.0 indicates a perfect fit, NSE > 0.75 a very good fit, NSE = 0.64–0.74 a good fit, NSE = 0.5–0.64 a satisfactory fit, and NSE < 0.5 an unsatisfactory fit. RMSE, R^{2}, and NSE were calculated using Equations (5)–(7) to evaluate model accuracy.

R^{2} compares the probability of the predicted and actual values.
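
The three metrics can be sketched in a few lines. The block below uses the standard textbook definitions of RMSE and NSE; treating R^{2} as the squared Pearson correlation is an assumption, since the exact form of the equations is not reproduced here. The seepage values are hypothetical. The example shows one way a model can score a high R^{2} while its NSE is strongly negative, the pattern reported for the ANN in Table 5: predictions that track the observations but carry a constant bias.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def nse(actual, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations (times n)."""
    mean_a = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - sse / sst


def r2_pearson(actual, predicted):
    """R^2 taken as the squared Pearson correlation (an assumption)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


# Hypothetical seepage observations and a perfectly correlated but biased prediction
actual = [2.0, 4.0, 6.0, 8.0]
biased = [a + 10.0 for a in actual]

print(rmse(actual, biased))        # 10.0
print(r2_pearson(actual, biased))  # 1.0
print(nse(actual, biased))         # -19.0
```

Because the squared correlation ignores bias and scale while NSE does not, reporting both gives a more complete picture of model accuracy than either alone.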

## 3. Results and Discussion

#### 3.1. Correlation Matrix (Heat Map)

#### 3.2. Model Accuracy

R^{2}) scores for the different models applied to predict the seepage of the Tarbela Dam. The CatBoost regression model reports the best coefficient of determination (R^{2}) scores of 0.978, 0.805, and 0.773, with RMSE of 0.025, 0.076, and 0.111 on the training, validation, and testing datasets, respectively.

#### 3.3. Feature Importance

#### 3.3.1. Reservoir Level

#### 3.3.2. Temperature

#### 3.3.3. Precipitation

#### 3.3.4. Water Inflow

#### 3.3.5. Sediment Inflow

## 4. Conclusions

R^{2} score during the training, testing, and validation stages. CatBoost is the recommended algorithm for water resources management decision-making, policymaking, and improved monitoring of seepage losses at Tarbela Dam using artificial intelligence-based modeling approaches. Furthermore, for all the models, the SHAP algorithm reports reservoir level as the most important parameter affecting the average dam seepage. Increasing the reservoir level increases the average dam seepage, and vice versa.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Adebiyi, J.A.; Olabisi, L.S.; Liu, L.; Jordan, D. Water–food–energy–climate nexus and technology productivity: A Nigerian case study of organic leafy vegetable production. Environ. Dev. Sustain. **2021**, 23, 6128–6147.
2. Shen, D. The Strictest Water Resources Management Strategy and Its Three Red Lines. In Water Resources Management of the People’s Republic of China; Springer: Berlin/Heidelberg, Germany, 2021; pp. 253–274.
3. Demir, İ.; Kiliçkan, A. Renewable Energy Storage Methods. Int. Sci. J. **2018**, 64, 103–107.
4. Rezaee, A.; Bozorg-Haddad, O.; Singh, V.P. Water and society. In Economical, Political, and Social Issues in Water Resources; Elsevier: Amsterdam, The Netherlands, 2021; pp. 257–271.
5. Kahlown, M.A.; Majeed, A. Water-resources situation in Pakistan: Challenges and future strategies. In Water Resources in the South: Present Scenario and Future Prospects; Commission on Science and Technology for Sustainable Development in the South: Islamabad, Pakistan, 2003; Volume 20, pp. 33–45.
6. Ishfaque, M.; Dai, Q.; Haq, N.u.; Jadoon, K.; Shahzad, S.M.; Janjuhah, H.T. Use of Recurrent Neural Network with Long Short-Term Memory for Seepage Prediction at Tarbela Dam, KP, Pakistan. Energies **2022**, 15, 3123.
7. Manivannan, S.; Thilagam, V.K.; Yaligar, R. Climate change impact on water resources in Indian river basins: A review. J. Soil Water Conserv. **2022**, 21, 76–85.
8. Lessard, J.; Hicks, D.M.; Snelder, T.H.; Arscott, D.B.; Larned, S.T.; Booker, D.; Suren, A.M. Dam design can impede adaptive management of environmental flows: A case study from the Opuha Dam, New Zealand. Environ. Manag. **2013**, 51, 459–473.
9. Rice, J.D.; Duncan, J.M. Findings of case histories on the long-term performance of seepage barriers in dams. J. Geotech. Geoenviron. Eng. **2010**, 136, 2–15.
10. Omofunmi, O.E.; Kolo, J.G.; Oladipo, A.S.; Diabana, P.D.; Ojo, A.S. A review on effects and control of seepage through earth-fill dam. Curr. J. Appl. Sci. Technol. **2017**, 22, 1–11.
11. Chen, G.; Jin, D.; Mao, J.; Gao, H.; Wang, Z.; Jing, L.; Li, Y.; Li, X. Seismic damage and behavior analysis of earth dams during the 2008 Wenchuan earthquake, China. Eng. Geol. **2014**, 180, 99–129.
12. Kayode, O.; Odukoya, A.M.; Adagunodo, T.; Adeniji, A. Monitoring of seepages around dams using geophysical methods: A brief review. IOP Conf. Ser. Earth Environ. Sci. **2018**, 173, 012026.
13. Zhao, E.; Jiang, Y. Seepage Evolution Model of the Fractured Rock Mass under High Seepage Pressure in Dam Foundation. Adv. Civ. Eng. **2021**, 2021, 8832774.
14. Himi, M.; Casado, I.; Sendros, A.; Lovera, R.; Rivero, L.; Casas, A. Assessing preferential seepage and monitoring mortar injection through an earthen dam settled over a gypsiferous substrate using combined geophysical methods. Eng. Geol. **2018**, 246, 212–221.
15. Coulibaly, Y.; Belem, T.; Cheng, L. Numerical analysis and geophysical monitoring for stability assessment of the Northwest tailings dam at Westwood Mine. Int. J. Min. Sci. Technol. **2017**, 27, 701–710.
16. Dahlin, T. Geoelectrical monitoring of embankment dams for detection of anomalous seepage and internal erosion—Experiences and work in progress in Sweden. In Proceedings of the Fifth International Conference on Engineering Geophysics (ICEG), Al Ain, United Arab Emirates, 21–24 October 2019; pp. 207–210.
17. Komasi, M.; Beiranvand, B. Seepage and Stability Analysis of the Eyvashan Earth Dam under Drawdown Conditions. Civ. Eng. Infrastruct. J. **2021**, 54, 205–223.
18. Fang, C.; Duan, Y. Statistical analysis of dam-break incidents and its cautions. Yangtze River **2010**, 41, 97–100.
19. Jiang, X.; Wei, Y.; Wu, L.; Hu, K.; Zhu, Z.; Zou, Z.; Xiao, W. Laboratory experiments on failure characteristics of non-cohesive sediment natural dam in progressive failure mode. Environ. Earth Sci. **2019**, 78, 538.
20. Liu, L.-L.; Cheng, Y.-M.; Jiang, S.-H.; Zhang, S.-H.; Wang, X.-M.; Wu, Z.-H. Effects of spatial autocorrelation structure of permeability on seepage through an embankment on a soil foundation. Comput. Geotech. **2017**, 87, 62–75.
21. Adamo, N.; Al-Ansari, N.; Sissakian, V.; Laue, J.; Knutsson, S. Geophysical Methods and their Applications in Dam Safety Monitoring. J. Earth Sci. Geotech. Eng. **2021**, 11, 291–345.
22. Cui, H.D.; Chen, L.; Wang, J.L.; Zhang, W. Study on anti-seepage treatment and seepage control effect of core dam foundation curtain of the fault fracture zone in Xinjiang province. IOP Conf. Ser. Earth Environ. Sci. **2020**, 643, 012108.
23. Zhang, C.; Chai, J.; Cao, J.; Xu, Z.; Qin, Y.; Lv, Z. Numerical Simulation of Seepage and Stability of Tailings Dams: A Case Study in Lixi, China. Water **2020**, 12, 742.
24. Coppens, J.; Trolle, D.; Jeppesen, E.; Beklioğlu, M. The impact of climate change on a Mediterranean shallow lake: Insights based on catchment and lake modelling. Reg. Environ. Chang. **2020**, 20, 62.
25. Xu, S.; Chen, C.; Xu, F.; Li, J.; Zhang, Z.; Xu, T.; Zhu, L. Modeling Analysis of the Upper Limit Water Level Mechanism in the Upstream Reservoir of a Dam Embankment. Adv. Civ. Eng. **2020**, 2020, 8850681.
26. Beiranvand, B.; Komasi, M. An Investigation on performance of the cut off wall and numerical analysis of seepage and pore water pressure of Eyvashan earth dam. Iran. J. Sci. Technol. Trans. Civ. Eng. **2021**, 45, 1723–1736.
27. Wang, T.; Chen, J.; Li, P.; Yin, Y.; Shen, C. Natural tracing for concentrated leakage detection in a rockfill dam. Eng. Geol. **2019**, 249, 1–12.
28. Al-Fares, W. Application of electrical resistivity tomography technique for characterizing leakage problem in Abu Baara earth dam, Syria. Int. J. Geophys. **2014**, 2014, 368128.
29. Neyamadpour, A.; Abbasinia, M. Application of electrical resistivity tomography technique to delineate a structural failure in an embankment dam: Southwest of Iran. Arab. J. Geosci. **2019**, 12, 420.
30. Okpoli, C.; Tijani, R. Electromagnetic profiling of Owena Dam, Southwestern Nigeria, using very-low-frequency radio fields. Mater. Geoenviron. **2016**, 63, 237–250.
31. Ahmed, A.S.; Revil, A.; Bolève, A.; Steck, B.; Vergniault, C.; Courivaud, J.; Jougnot, D.; Abbas, M. Determination of the permeability of seepage flow paths in dams from self-potential measurements. Eng. Geol. **2020**, 268, 105514.
32. Li, X.; Fan, L.; Huang, H.; Hao, J.; Li, M. Application of Ground Penetrating Radar in Leakage Detection of Concrete Face Rockfill Dam. IOP Conf. Ser. Earth Environ. Sci. **2018**, 189, 022044.
33. Raji, W.O.; Aluko, K.O. Investigating the cause of excessive seepage in a dam foundation using seismic and electrical surveys—A case study of Asa Dam, West Africa. Bull. Eng. Geol. Environ. **2021**, 80, 6445–6455.
34. Al-Janabi, A.M.S.; Ghazali, A.H.; Ghazaw, Y.M.; Afan, H.A.; Al-Ansari, N.; Yaseen, Z.M. Experimental and numerical analysis for earth-fill dam seepage. Sustainability **2020**, 12, 2490.
35. Li, G.C.; Desai, C.S. Stress and seepage analysis of earth dams. J. Geotech. Eng. **1983**, 109, 946–960.
36. Finn, W.D.L. Finite-element analysis of seepage through dams. J. Soil Mech. Found. Div. **1967**, 93, 41–48.
37. Neuman, S.P.; Witherspoon, P.A. Finite element method of analyzing steady seepage with a free surface. Water Resour. Res. **1970**, 6, 889–897.
38. Bathe, K.J.; Khoshgoftaar, M.R. Finite element free surface seepage analysis without mesh iteration. Int. J. Numer. Anal. Methods Geomech. **1979**, 3, 13–22.
39. Ng, A.K.; Small, J.C. A case study of hydraulic fracturing using finite element methods. Can. Geotech. J. **1999**, 36, 861–875.
40. Callari, C.; Abati, A. Finite element methods for unsaturated porous solids and their application to dam engineering problems. Comput. Struct. **2009**, 87, 485–501.
41. Kazemzadeh-Parsi, M.J.; Daneshmand, F. Unconfined seepage analysis in earth dams using smoothed fixed grid finite element method. Int. J. Numer. Anal. Methods Geomech. **2012**, 36, 780–797.
42. Olonade, K.A.; Agbede, O.A. A study of seepage through oba dam using finite element method. Civ. Environ. Res. **2013**, 3, 53–60.
43. Athani, S.S.; Shivamanth; Solanki, C.; Dodagoudar, G. Seepage and stability analyses of earth dam using finite element method. Aquat. Procedia **2015**, 4, 876–883.
44. Jamel, A.A.J. Analysis and estimation of seepage through homogenous earth dam without filter. Diyala J. Eng. Sci. **2016**, 9, 38–49.
45. Khassaf, S.I.; Madhloom, A.M. Effect of impervious core on seepage through zoned earth dam (case study: Khassa Chai dam). Int. J. Sci. Eng. Res. **2017**, 8, 1053–1064.
46. Liu, C.; Shen, Z.; Gan, L.; Xu, L.; Zhang, K.; Jin, T. The seepage and stability performance assessment of a new drainage system to increase the height of a tailings dam. Appl. Sci. **2018**, 8, 1840.
47. Athani, S.S.; Solanki, C.; Dodagoudar, G.R.; Shukla, S.K. Finite-element analysis of strains in seepage barriers of the earth dam. Dams Reserv. **2019**, 29, 87–96.
48. Al-Nedawi, N.M. Finite element analysis of seepage for Hemrin earth dam using Geo-Studio software. Diyala J. Eng. Sci. **2020**, 13, 66–76.
49. Bai, C.; Chai, J.; Xu, Z.; Qin, Y. Numerical Simulation of Drainage Holes and Performance Evaluation of the Seepage Control of Gravity Dam: A Case Study of Heihe Reservoir in China. Arab. J. Sci. Eng. **2021**, 47, 4801–4819.
50. Tarinejad, R.; Alizadeh-Arasi, O.; Isari, M.; Foumani, R.S. Investigation of Sabalan Earth Dam Settlement at First Filling by Finite Difference Method. Transp. Infrastruct. Geotechnol. **2021**, 8, 473–490.
51. Aghdam, A.T.; Salmasi, F.; Abraham, J.; Arvanaghi, H. Effect of Drain Pipes on Uplift Force and Exit Hydraulic Gradient and the Design of Gravity Dams Using the Finite Element Method. Geotech. Geol. Eng. **2021**, 39, 3383–3399.
52. Yuan, S.; Zhong, H. Three dimensional analysis of unconfined seepage in earth dams by the weak form quadrature element method. J. Hydrol. **2016**, 533, 403–411.
53. Jing, T.; Yongbiao, L. Penalty function element free method to solve complex seepage field of earth fill dam. IERI Procedia **2012**, 1, 117–123.
54. Fallah, A.; Jabbari, E.; Babaee, R. Development of the Kansa method for solving seepage problems using a new algorithm for the shape parameter optimization. Comput. Math. Appl. **2019**, 77, 815–829.
55. Sharghi, E.; Nourani, V.; Behfar, N. Implementation of Data Jittering Technique for Seepage Analysis of Earth fill Dam Using Ensemble of AI Models. Water Soil Sci. **2020**, 30, 29–41.
56. Sharghi, E.; Nourani, V.; Behfar, N. Earthfill dam seepage analysis using ensemble artificial intelligence based modeling. J. Hydroinform. **2018**, 20, 1071–1084.
57. Rehamnia, I.; Benlaoukli, B.; Heddam, S. Modeling of Seepage Flow Through Concrete Face Rockfill and Embankment Dams Using Three Heuristic Artificial Intelligence Approaches: A Comparative Study. Environ. Process. **2020**, 7, 367–381.
58. Alocén, P.; Fernández-Centeno, M.Á.; Toledo, M.Á. Prediction of Concrete Dam Deformation through the Combination of Machine Learning Models. Water **2022**, 14, 1133.
59. Ibañez, S.C.; Dajac, C.V.G.; Liponhay, M.P.; Legara, E.F.T.; Esteban, J.M.H.; Monterola, C.P. Forecasting reservoir water levels using deep neural networks: A case study of Angat Dam in the Philippines. Water **2021**, 14, 34.
60. Jiang, D.; Xu, Y.; Lu, Y.; Gao, J.; Wang, K. Forecasting Water Temperature in Cascade Reservoir Operation-Influenced River with Machine Learning Models. Water **2022**, 14, 2146.
61. Choi, H.S.; Kim, J.H.; Lee, E.H.; Yoon, S.-K. Development of a Revised Multi-Layer Perceptron Model for Dam Inflow Prediction. Water **2022**, 14, 1878.
62. Zhang, X.; Chen, X.; Li, J. Improving dam seepage prediction using back-propagation neural network and genetic algorithm. Math. Probl. Eng. **2020**, 2020, 1404295.
63. Nouri, M.; Salmasi, F. Predicting Seepage of Earth Dams using Artificial Intelligence Techniques. J. Irrig. Sci. Eng. **2019**, 42, 83–97.
64. Nourani, V.; Sharghi, E.; Aminfar, M.H. Integrated ANN model for earthfill dams seepage analysis: Sattarkhan Dam in Iran. Artif. Intell. Res. **2012**, 1, 22–37.
65. Yaseen, Z.M.; Naghshara, S.; Salih, S.Q.; Kim, S.; Malik, A.; Ghorbani, M.A. Lake water level modeling using newly developed hybrid data intelligence model. Theor. Appl. Climatol. **2020**, 141, 1285–1300.
66. Parsaie, A.; Haghiabi, A.H.; Latif, S.D.; Tripathi, R.P. Predictive modelling of piezometric head and seepage discharge in earth dam using soft computational models. Environ. Sci. Pollut. Res. **2021**, 28, 60842–60856.
67. Sani, H.; Roushangar, K.; Ghasempour, R. Comparative study of the performance of finite element method and evolutionary model in seepage discharge predicting from the body of an earth dam. Civ. Infrastruct. Res. **2019**, 4, 1–15.
68. Roushangar, K.; Garekhani, S.; Alizadeh, F. Forecasting daily seepage discharge of an earth dam using wavelet–mutual information–Gaussian process regression approaches. Geotech. Geol. Eng. **2016**, 34, 1313–1326.
69. Chen, S.; Gu, C.; Lin, C.; Wang, Y.; Hariri-Ardebili, M.A. Prediction, monitoring, and interpretation of dam leakage flow via adaptative kernel extreme learning machine. Measurement **2020**, 166, 108161.
70. Zhao, M.; Jiang, H.; Chen, S.; Bie, Y. Prediction of Seepage Pressure Based on Memory Cells and Significance Analysis of Influencing Factors. Complexity **2021**, 2021, 5576148.
71. El Bilali, A.; Moukhliss, M.; Taleb, A.; Nafii, A.; Alabjah, B.; Brouziyne, Y.; Mazigh, N.; Teznine, K.; Mhamed, M. Predicting daily pore water pressure in embankment dam: Empowering Machine Learning-based modeling. Environ. Sci. Pollut. Res. **2022**, 29, 47382–47398.
72. Rehamnia, I.; Benlaoukli, B.; Jamei, M.; Karbasi, M.; Malik, A. Simulation of seepage flow through embankment dam by using a novel extended Kalman filter based neural network paradigm: Case study of Fontaine Gazelles Dam, Algeria. Measurement **2021**, 176, 109219.
73. Khan, N.M.; Tingsanchali, T. Optimization and simulation of reservoir operation with sediment evacuation: A case study of the Tarbela Dam, Pakistan. Hydrol. Process. **2009**, 23, 730–747.
74. Rafique, A.; Burian, S.; Hassan, D.; Bano, R. Analysis of Operational Changes of Tarbela Reservoir to Improve the Water Supply, Hydropower Generation, and Flood Control Objectives. Sustainability **2020**, 12, 7822.
75. Roca, M. Tarbela Dam in Pakistan. Case study of reservoir sedimentation. In River Flow 2012: Proceedings of the International Conference on Fluvial Hydraulics, San José, Costa Rica, 5–7 September 2012; HR Wallingford: Wallingford, UK, 2012.
76. Murphy, K.P. Probabilistic Machine Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2022.
77. Salem, H.; Kabeel, A.; El-Said, E.M.; Elzeki, O.M. Predictive modelling for solar power-driven hybrid desalination system using artificial neural network regression with Adam optimization. Desalination **2022**, 522, 115411.
78. Karami, H.; DadrasAjirlou, Y.; Jun, C.; Bateni, S.M.; Band, S.S.; Mosavi, A.; Moslehpour, M.; Chau, K.-W. A novel approach for estimation of sediment load in Dam reservoir with hybrid intelligent algorithms. Front. Environ. Sci. **2022**, 165.
79. Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
80. Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape. Ecol. Indic. **2015**, 52, 394–403.
81. Oliveira, S.; Oehler, F.; San-Miguel-Ayanz, J.; Camia, A.; Pereira, J.M. Modeling spatial patterns of fire occurrence in Mediterranean Europe using Multiple Regression and Random Forest. For. Ecol. Manag. **2012**, 275, 117–129.
82. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of support vector machine, random forest, and genetic algorithm optimized random forest models in groundwater potential mapping. Water Resour. Manag. **2017**, 31, 2761–2775.
83. Al-Abadi, A.M.; Shahid, S. Spatial mapping of artesian zone at Iraqi southern desert using a GIS-based random forest machine learning model. Modeling Earth Syst. Environ. **2016**, 2, 96.
84. Huang, N.; Lu, G.; Xu, D. A permutation importance-based feature selection method for short-term electricity load forecasting using random forest. Energies **2016**, 9, 767.
85. Vapnik, V.N. The Nature of Statistical Learning; Springer: Berlin/Heidelberg, Germany, 1995.
86. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. **2017**, 30, 3149–3157.
87. Dhieb, N.; Ghazzai, H.; Besbes, H.; Massoud, Y. Extreme gradient boosting machine learning algorithm for safe auto insurance operations. In Proceedings of the 2019 IEEE International Conference on Vehicular Electronics and Safety (ICVES), Cairo, Egypt, 4–6 September 2019; pp. 1–5.
88. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv **2018**, arXiv:1810.11363.
89. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. **2018**, 31.
90. Dorogush, A.V.; Gulin, A.; Gusev, G.; Kazeev, N.; Prokhorenkova, L.O.; Vorobev, A. Fighting biases with dynamic boosting. arXiv **2017**, arXiv:1706.09516.
91. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. **2017**, 30.
92. Bataineh, M.; Steenhard, D.; Singh, H. Feature Impact for Prediction Explanation. In Proceedings of the ICDM (Posters), New York, NY, USA, 17–21 July 2019; pp. 160–167.
93. Tallón-Ballesteros, A.; Chen, C. Explainable AI: Using Shapley value to explain complex anomaly detection ML-based systems. In Machine Learning and Artificial Intelligence; IOS Press: Amsterdam, The Netherlands, 2020; Volume 332, p. 152.
94. Wieland, R.; Lakes, T.; Nendel, C. Using SHAP to interpret XGBoost predictions of grassland degradation in Xilingol, China. Geosci. Model Dev. Discuss. **2020**, 2020, 1–28.
95. Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. **2014**, 41, 647–665.
96. Legates, D.R.; McCabe Jr, G.J. Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation. Water Resour. Res. **1999**, 35, 233–241.
97. Moghimi, M.M.; Zarei, A.R. Evaluating performance and applicability of several drought indices in arid regions. Asia-Pac. J. Atmos. Sci. **2021**, 57, 645–661.
98. Akhter, M. Dams as a climate change adaptation strategy: Geopolitical implications for Pakistan. Strateg. Anal. **2015**, 39, 744–748.
99. Hewitt, K.; Wake, C.P.; Young, G.; David, C. Hydrological investigations at Biafo Glacier, Karakoram Range, Himalaya; An important source of water for the Indus River. Ann. Glaciol. **1989**, 13, 103–108.
100. Yaseen, M.; Latif, Y.; Waseem, M.; Leta, M.K.; Abbas, S.; Akram Bhatti, H. Contemporary Trends in High and Low River Flows in Upper Indus Basin, Pakistan. Water **2022**, 14, 337.

**Figure 1.** Satellite image of Tarbela Dam site used for this study. (**A**) Tarbela dam satellite overview, (**B**) Tarbela power house, (**C**) Tarbela dam main abutment.

**Figure 3.** Artificial Neural Network (ANN) Architecture for hydro-climatological variable prediction schema.

**Figure 8.** Comparison of Average seepage (Actual vs. Predicted) for (**a**) RF, (**b**) CatBoost, (**c**) ANN and (**d**) SVM Models.

S. No | Input Parameters | Unit | Duration | Output |
---|---|---|---|---|
1 | Water Inflow | ft^{3}/s | 2003–2015 | Average Seepage (m^{3}/s) |
2 | Temperature | °C | 2003–2015 | |
3 | Precipitation | Inches | 2003–2015 | |
4 | Reservoir Level | Feet | 2003–2015 | |
5 | Sediment Inflow | Tons | 2003–2015 | |

Algorithms | Python Module | Function | Symbols |
---|---|---|---|
Random Forest | sklearn.ensemble | RandomForestRegressor | RF |
CatBoost | catboost | CatBoostRegressor | CB |
Artificial Neural Network | keras.models.Sequential | Dense | ANN |
Support Vector Machine | sklearn.svm | SVR | SVR |

Statistic | Water Inflow (ft^{3}/s) | Reservoir Level (ft) | Temperature (°C) | Sediment Inflow (Tn) | Precipitation (In) | Average Seepage (m^{3}/s) |
---|---|---|---|---|---|---|
**Training Data** | | | | | | |
Count | 109 | 109 | 109 | 109 | 109 | 109 |
Mean | 72.17 | 1449.13 | 18.34 | 9,363,574.31 | 82.64 | 8.17 |
Std | 74.68 | 51.83 | 7.18 | 19,964,062.22 | 76.41 | 3.63 |
Min | 13.21 | 1368.22 | 7.2 | 23,250.51 | 2.6 | 2.49 |
Max | 277.08 | 1549.79 | 29.6 | 137,865,137.40 | 391 | 23.11 |
**Validation Data** | | | | | | |
Count | 24 | 24 | 24 | 24 | 24 | 24 |
Mean | 115.57 | 1494.62 | 21.96 | 23,653,407.21 | 119.14 | 10.18 |
Std | 93.30 | 50.78 | 6.33 | 44,614,740.11 | 120.67 | 3.66 |
Min | 16.57 | 1386.45 | 8.1 | 63,171.09 | 20.4 | 3.35 |
Max | 357.51 | 1549.92 | 28.2 | 167,516,137.8 | 491.6 | 17.33 |
**Testing Data** | | | | | | |
Count | 23 | 23 | 23 | 23 | 23 | 23 |
Mean | 94.03 | 1466.60 | 22.29 | 9,694,283.85 | 86.06 | 9.28 |
Std | 82.97 | 63.05 | 5.49 | 14,439,229.52 | 63.65 | 4.94 |
Min | 16.72 | 1365.32 | 9.7 | 43,243.74 | 1.3 | 3.04 |
Max | 296.30 | 1547.92 | 29 | 48,530,269.8 | 268.9 | 21.95 |
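
The record counts in the table above (109, 24, and 23) sum to 156, which would match one record per month over 2003–2015. A split of that shape could be produced as sketched below. The monthly resolution, the chronological ordering, and the 70%/15% fractions are assumptions that merely reproduce the reported counts; they are not a description of the procedure actually used.

```python
def chronological_split(records, train_frac=0.70, test_frac=0.15):
    """Split ordered records into train/validation/test without shuffling.

    The fractions are hypothetical; validation receives whatever remains
    after the training and testing portions are rounded.
    """
    n = len(records)
    n_train = round(train_frac * n)
    n_test = round(test_frac * n)
    n_val = n - n_train - n_test
    train = records[:n_train]
    val = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]
    return train, val, test


# Assumed monthly time steps for 2003-2015 (13 years x 12 months = 156 records)
months = [(year, month) for year in range(2003, 2016) for month in range(1, 13)]

train, val, test = chronological_split(months)
print(len(train), len(val), len(test))  # 109 24 23
```

Keeping the records in time order, if that is how the data were split, prevents later observations from leaking into the training set.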

Model | R^{2} Training | R^{2} Validation | R^{2} Testing | RMSE Training (m^{3}/h) | RMSE Validation (m^{3}/h) | RMSE Testing (m^{3}/h) |
---|---|---|---|---|---|---|
Random Forest Regression | 0.884 | 0.801 | 0.782 | 0.059 | 0.077 | 0.109 |
Catboost Regression | 0.978 | 0.805 | 0.773 | 0.025 | 0.076 | 0.111 |
Artificial Neural Network | 0.981 | 0.835 | 0.715 | 0.245 | 0.249 | 0.297 |
Support Vector Regression | 0.591 | 0.794 | 0.783 | 0.111 | 0.078 | 0.108 |

**Table 5.** Prediction accuracy result of Nash-Sutcliffe Efficiency for different Machine learning Models.

Nash-Sutcliffe Efficiency | Random Forest | CatBoost | SVR | ANN |
---|---|---|---|---|
Training | 0.884 | 0.884 | 0.592 | −212.980 |
Validation | 0.803 | 0.806 | 0.799 | −48.289 |
Testing | 0.783 | 0.775 | 0.784 | −34.416 |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ishfaque, M.; Salman, S.; Jadoon, K.Z.; Danish, A.A.K.; Bangash, K.U.; Qianwei, D. Understanding the Effect of Hydro-Climatological Parameters on Dam Seepage Using Shapley Additive Explanation (SHAP): A Case Study of Earth-Fill Tarbela Dam, Pakistan. *Water* **2022**, *14*, 2598.
https://doi.org/10.3390/w14172598
