A KPCA-ISSA-SVM Hybrid Model for Identifying Sources of Mine Water Inrush Using Hydrochemical Indicators
Abstract
1. Introduction
1.1. Traditional Approach
1.2. Machine Learning
1.3. Study Content and Aim
2. Materials and Methods
2.1. K–Means Cluster Analysis (KCA)
- (1) Initialization: Select the number of clusters K in advance and randomly choose K data points as the initial cluster centers.
- (2) Assignment step: Assign each data point to its nearest cluster center, typically measured by the Euclidean distance between a point x = (x1, x2, …, xd) and a center c = (c1, c2, …, cd). The mathematical expression is as follows:
  $$d(x, c) = \sqrt{\sum_{i=1}^{d} (x_i - c_i)^2}$$
- (3) Update step: Calculate the mean (i.e., the centroid) of all data points within each cluster and set this mean as the new cluster center.
- (4) Repetition step: Repeat the assignment and update steps until the cluster centers no longer change (or change only slightly), or until the maximum number of iterations is reached.
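The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `kmeans`, the tolerance `tol`, the fixed random seed, and the synthetic two-blob data are all hypothetical choices made for the example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means on an (n_samples, n_features) array X."""
    rng = np.random.default_rng(seed)
    # (1) Initialization: pick k distinct samples as the starting centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # (2) Assignment: Euclidean distance from every point to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (3) Update: each center becomes the mean of its assigned points
        # (an empty cluster simply keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # (4) Repetition: stop once the centers barely move.
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    # Final assignment so that the labels match the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Synthetic stand-in data: two well-separated groups of 20 samples each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels, centers = kmeans(X, k=2)
```

Because the initial centers are chosen at random, different seeds can give different cluster numberings (and, on harder data, different local optima), so in practice the procedure is usually run from several initializations.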
2.2. Kernel Principal Component Analysis (KPCA)
- (1) KPCA maps the raw hydrochemical data into a high-dimensional feature space through the mapping φ, giving the mapped samples φ(e1), φ(e2), …, φ(en). Assuming that the samples are already centered in the high-dimensional space, the covariance matrix is as follows:
- (2) By introducing the kernel function K* = φTφ, the data in S are solved by principal component analysis:
- (3) The cumulative contribution rate threshold is set to 95%. The eigenvalues are sorted in descending order, and the first m eigenvalues and their corresponding eigenvectors (j = 1, 2, …, m) are retained:
- (4) Once the cumulative contribution rate meets the set requirement, the dimension-reduced nonlinear samples H are obtained from the mapping:
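The KPCA steps above can be illustrated with the following NumPy sketch. It is an illustration under assumptions rather than the study's code: the RBF kernel, the value of `gamma`, the function names, and the random 30 × 9 input matrix are hypothetical; only the 95% cumulative contribution threshold comes from the text. Centering is applied to the kernel matrix explicitly, which is the usual way to enforce the centered-samples assumption in step (1).

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Pairwise RBF kernel exp(-gamma * ||x_i - x_j||^2) (kernel choice is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpca(X, gamma=0.1, contribution=0.95):
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the mapped samples implicitly by double-centering the kernel matrix.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition of the symmetric centered kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1]            # descending eigenvalues
    eigvals = np.clip(eigvals[order], 0.0, None) # drop tiny negative round-off
    eigvecs = eigvecs[:, order]
    # Keep the first m components whose cumulative contribution reaches the threshold.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.searchsorted(ratio, contribution)) + 1
    # Projection of the training samples onto component j is sqrt(lambda_j) * v_j.
    H = eigvecs[:, :m] * np.sqrt(eigvals[:m])
    return H, ratio[:m]

# Synthetic stand-in for a table of 30 samples with 9 indicators.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 9))
H, cumulative = kpca(X)
```

The number of retained components m depends on the kernel and its parameter as well as on the data, so the six features reported in the conclusions should not be expected from this synthetic input.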
2.3. Support Vector Machine Optimized by Improved Sparrow Search Algorithm (ISSA-SVM)
3. Application Case
4. Results and Discussion
4.1. K–Means Clustering Results
4.2. Correlation Analysis of Hydro-Chemical Indicators
4.3. MWISI Results Based on the KPCA-ISSA-SVM Model
4.4. Comparative Analysis of Different Predicted Models
5. Conclusions
- (1) The nine hydrochemical indicators are Ca²⁺, Mg²⁺, K⁺+Na⁺, HCO₃⁻, Cl⁻, SO₄²⁻, total hardness (TH), alkalinity (Alk.), and pH. Statistical analysis reveals significant correlations among several indicators, such as K⁺+Na⁺, Ca²⁺, and TH. Dimensionality reduction with KPCA is therefore necessary to overcome information redundancy. Six new features that retain 95% of the information are extracted from the raw data.
- (2) The optimized KPCA-ISSA-SVM model is trained with 49 water samples, and the results show that the model has a good fitting capability. In addition, the trained model correctly identifies 19 of the 21 testing water samples, a prediction accuracy of 90.476%.
- (3) A comparative study is conducted to evaluate the KPCA-ISSA-SVM model against three benchmark models (SVM, SSA-SVM, and ISSA-SVM) using seven evaluation metrics: accuracy, P, R, F1-score, K, MCC, and G-mean. The KPCA-ISSA-SVM model achieves significantly higher values on the seven metrics, suggesting that it outperforms the benchmark models.
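The seven evaluation metrics named in conclusion (3) can all be derived from a confusion matrix. The sketch below shows one common way to do so and rests on several assumptions that the text does not confirm: P, R, and F1-score are macro-averaged over classes, K is read as Cohen's kappa, MCC uses the standard multi-class generalization, and G-mean is the geometric mean of the per-class recalls. The function name and the toy labels are hypothetical.

```python
import numpy as np

def evaluation_metrics(y_true, y_pred, n_classes):
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1.0
    total = cm.sum()
    correct = np.diag(cm)
    true_counts = cm.sum(axis=1)
    pred_counts = cm.sum(axis=0)

    # Per-class precision, recall, and F1 (0 where the denominator is 0).
    zeros = np.zeros(n_classes)
    precision = np.divide(correct, pred_counts, out=zeros.copy(), where=pred_counts > 0)
    recall = np.divide(correct, true_counts, out=zeros.copy(), where=true_counts > 0)
    pr_sum = precision + recall
    f1 = np.divide(2 * precision * recall, pr_sum, out=zeros.copy(), where=pr_sum > 0)

    accuracy = correct.sum() / total
    # Cohen's kappa: agreement beyond what the class marginals give by chance.
    expected = (true_counts * pred_counts).sum() / total ** 2
    kappa = (accuracy - expected) / (1.0 - expected) if expected < 1.0 else 0.0
    # Multi-class Matthews correlation coefficient.
    mcc_num = correct.sum() * total - (true_counts * pred_counts).sum()
    mcc_den = np.sqrt((total ** 2 - (pred_counts ** 2).sum())
                      * (total ** 2 - (true_counts ** 2).sum()))
    mcc = mcc_num / mcc_den if mcc_den > 0 else 0.0
    # G-mean: geometric mean of the per-class recalls.
    g_mean = float(np.prod(recall) ** (1.0 / n_classes))

    return {
        "accuracy": float(accuracy),
        "P": float(precision.mean()),
        "R": float(recall.mean()),
        "F1": float(f1.mean()),
        "K": float(kappa),
        "MCC": float(mcc),
        "G-mean": g_mean,
    }

# Toy labels for three hypothetical source classes (not data from the study).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 0]
metrics = evaluation_metrics(y_true, y_pred, n_classes=3)
```

Accuracy alone can hide poor performance on a small class; kappa, MCC, and G-mean are included in comparisons like this one because they penalize a model that does well only on the majority classes.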
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Study (Author, Year) | Machine Learning Techniques Applied | Data Types | Application Case | Performance |
---|---|---|---|---|
Wei et al. (2022) [14] | PCSOM-GWOSVM | hydrochemical data | Zhaogezhuang mine | discrimination time = 1.1255 s |
Wang et al. (2023) [15] | KPCA-ISSA-KELM | hydrochemical data | | accuracy increased by 4.17% |
Chen et al. (2022) [25] | BP–Fisher | hydrochemical data | Luxi mining area | verification by hydrological observation holes |
Ji et al. (2022) [26] | PSO-LightGBM | hydrochemical data | Donghuatuo mine | highest accuracy of 97.22% |
Jiang et al. (2024) [27] | FA-PSO-BP | hydrochemical data | Gubei mine | accuracy = 100% |
Cui et al. (2025) [31] | SIOA-GWO-DFNN | hydrochemical data | Bingchang mining area | accuracy = 92.5% |
Fang (2022) [32] | ELM-CNN | spectral data/electrical conductivity | No.2 mine | accuracy = 86.0% |
Bi et al. (2024) [33] | MFO-LSSVM | hydrochemical data | Yuanyi mine | accuracy = 94.1% |
Lin et al. (2021) [34] | IGA-ELM | hydrochemical data | Zhaogezhuang mining area | accuracy = 95.0% |
Dong et al. (2024) [35] | CSSOA-RF | spectral data | Donghuatuo mine | accuracy = 100% |
Li et al. (2022) [36] | GA-XGBoost | spectral data | Huangyuchuan mine | average accuracy = 94.0% |
References
- Zhao, Y.; Wu, Q.; Chen, T.; Zhang, X.; Du, Y.; Yao, Y. Location and flux discrimination of water inrush using its spreading process in underground coal mine. Safety Sci. 2020, 124, 104566.
- Wu, M.; Ye, Y.; Hu, N.; Wang, Q.; Tan, W. Visualization analysis and progress of mine water inrush disaster-related research. Mine Water Environ. 2022, 41, 599–613.
- Dong, S.; Zheng, L.; Tang, S.; Shi, P. A scientometric analysis of trends in coal mine water inrush prevention and control for the period 2000–2019. Mine Water Environ. 2020, 39, 3–12.
- Meng, Z.; Li, G.; Xie, X. A geological assessment method of floor water inrush risk and its application. Eng. Geol. 2012, 143, 51–60.
- Ji, Y.; Yu, L.; Wei, Z.; Ding, J.; Dong, D. Research progress on identification of mine water inrush sources: A visual analysis perspective. Mine Water Environ. 2025, 44, 3–15.
- Yin, H.; Zhao, H.; Xie, D.; Sang, S.; Shi, Y.; Tian, M. Mechanism of mine water inrush from overlying porous aquifer in Quaternary: A case study in Xinhe coal mine of Shandong Province, China. Arab. J. Geosci. 2019, 12, 163.
- Hou, Z.; Huang, L.; Zhang, S.; Han, X.; Xu, J.; Li, Y. Identification of groundwater hydrogeochemistry and the hydraulic connections of aquifers in a complex coal mine. J. Hydrol. 2024, 628, 130496.
- Lu, C.; Cheng, W.; Yin, H.; Li, S.; Zhang, Y.; Dong, F.; Cheng, Y.; Zhang, X. Study on inverse geochemical modeling of hydrochemical characteristics and genesis of groundwater system in coal mine area—A case study of Longwanggou coal mine in Ordos Basin. Environ. Sci. Pollut. Res. 2024, 31, 16583–16600.
- Li, P.; Wei, J.; Xu, J.; Li, F.; Liu, B.; Zheng, Y.; Chai, J. Simulation of abnormal evolution and source identification of groundwater chemistry in coal-bearing aquifers at Gaohe coal mine, China. Water 2024, 16, 2506.
- Huang, P.; Yang, Z.; Wang, X.; Ding, F. Research on Piper-PCA-Bayes-LOOCV discrimination model of water inrush source in mines. Arab. J. Geosci. 2019, 12, 334.
- Hou, E.; Wen, Q.; Che, X.; Wei, J.; Ye, Z. Study on recognition of mine water sources based on statistical analysis. Arab. J. Geosci. 2020, 13, 5.
- Yan, P.; Li, G.; Wang, W.; Zhao, Y.; Wang, J.; Wen, Z. A mine water source prediction model based on LIF technology and BWO-ELM. J. Fluoresc. 2024, 35, 1063–1078.
- Ma, X.; Yan, P.; Wang, K. Identification of mine water source by random forest combined with laser-induced fluorescence spectra. Front. Environ. Sci. 2024, 12, 1392496.
- Wei, Z.; Dong, D.; Ji, Y.; Ding, J.; Yu, L. Source discrimination of mine water inrush using multiple combinations of an improved support vector machine model. Mine Water Environ. 2022, 41, 1106–1117.
- Wang, W.; Cui, X.; Qi, Y.; Xue, K.; Liang, R.; Sun, Z.; Tao, H. Mine water inrush source discrimination model based on KPCA-ISSA-KELM. PLoS ONE 2024, 19, e0299476.
- Zeng, Y.; Mei, A.; Wu, Q.; Meng, S.; Zhao, D.; Hua, Z. Double verification and quantitative traceability: A solution for mixed mine water sources. J. Hydrol. 2024, 630, 130725.
- Liu, Q.; Sun, Y.; Xu, Z.; Xu, G. Application of the comprehensive identification model in analyzing the source of water inrush. Arab. J. Geosci. 2018, 11, 189.
- Guo, C.; Gao, J.; Wang, S.; Zhang, C.; Li, X.; Guo, J.; Lu, L. Groundwater geochemical variation and controls in coal seams and overlying strata in the Shennan mining area, Shaanxi, China. Mine Water Environ. 2022, 41, 614–628.
- Huang, P.; Gao, H.; Su, Q.; Zhang, Y.; Cui, M.; Chai, S.; Li, Y.; Jin, Y. Identification of mixing water source and response mechanism of radium and radon under mining in limestone of coal seam floor. Sci. Total Environ. 2023, 857, 159666.
- Shi, L.; Ma, X.; Han, J.; Su, B. Identification of limestone aquifer inrush water sources in different geological ages based on trace components. Sustainability 2023, 15, 11646.
- Wu, D.; Wu, J.; Wei, C.; Gao, X.; Li, B.; Lu, J. Identification and prediction of mixed water sources in adjacent limestone aquifers based on conventional hydrochemistry and strontium isotopes. J. Earth Syst. Sci. 2024, 133, 44.
- Zhong, X.; Wu, Q.; Tang, B.; Wang, Y.; Chen, J.; Zeng, Y. Hydrogeochemical mechanisms and hydraulic connection of groundwaters in the Dongming opencast coal mine, Hailar, Inner Mongolia. Mine Water Environ. 2024, 43, 28–40.
- Silaban, H.; Zarlis, M.; Sawaluddin. Analysis of accuracy and epoch on back-propagation BFGS Quasi-Newton. J. Phys. Conf. Ser. 2017, 930, 012006.
- Asadisaghandi, J.; Tahmasebi, P. Comparative evaluation of back-propagation neural network learning algorithms and empirical correlations for prediction of oil PVT properties in Iran oilfields. J. Pet. Sci. Eng. 2011, 78, 464–475.
- Chen, Y.; Tang, L.; Zhu, S. Comprehensive study on identification of water inrush sources from deep mining roadway. Environ. Sci. Pollut. Res. 2022, 29, 19608–19623.
- Ji, Y.; Dong, D.; Mei, A.; Wei, Z. Study on key technology of identification of mine water inrush source by PSO-LightGBM. Water Supply 2022, 22, 7416–7429.
- Jiang, Q.; Liu, Q.; Liu, Y.; Chai, H.; Zhu, J. Groundwater chemical characteristic analysis and water source identification model study in Gubei coal mine, Northern Anhui Province, China. Heliyon 2024, 10, e26925.
- Hancock, J.; Khoshgoftaar, T.M. Leveraging LightGBM for categorical big data. In Proceedings of the 2021 IEEE Seventh International Conference on Big Data Computing Service and Applications (BigDataService), Oxford, UK, 23–26 August 2021; pp. 149–154.
- Gaurav, A.; Gupta, B.B.; Chui, K.T. Optimized cyber attack detection in iot networks using feature selection and LightGBM. In Proceedings of the 2024 27th International Symposium on Wireless Personal Multimedia Communications (WPMC), Greater Noida, India, 17–20 November 2024; pp. 1–5.
- Janizadeh, S.; Thi Kieu Tran, T.; Bateni, S.M.; Jun, C.; Kim, D.; Trauernicht, C.; Heggy, E. Advancing the LightGBM approach with three novel nature-inspired optimizers for predicting wildfire susceptibility in Kauaʻi and Molokaʻi Islands, Hawaii. Expert Syst. Appl. 2024, 258, 124963.
- Cui, M.; Hou, E.; Feng, D.; Che, X.; Xie, X.; Hou, P. Identification of the water inrush source based on the deep learning model for mines in Shaanxi, China. Mine Water Environ. 2025, 44, 133–148.
- Fang, B. Method for quickly identifying mine water inrush using convolutional neural network in coal mine safety mining. Wirel. Pers. Commun. 2022, 127, 945–962.
- Bi, Y.; Shen, S.; Wu, J. An improved LSSVM discrimination model based on factor analysis and moth flame optimization algorithm for identifying water inrush sources across multiple aquifers in mines. Environ. Earth Sci. 2024, 83, 424.
- Lin, G.; Jiang, D.; Dong, D.; Fu, J.; Li, X. A multilevel recognition model of water inrush sources: A case study of the Zhaogezhuang mining area. Mine Water Environ. 2021, 40, 773–782.
- Dong, D.; Meng, F.; Zhang, J.; Zhang, J.; Lin, X. Comprehensive study on the electrical characteristics and full-spectrum tracing of water sources in water-rich coal mines. Water 2024, 16, 2673.
- Li, X.; Dong, D.; Liu, K.; Zhao, Y.; Li, M. Identification of mine mixed water inrush source based on genetic algorithm and XGBoost algorithm: A case study of Huangyuchuan mine. Water 2022, 14, 2150.
- Zhu, Y.; Yu, J.; Jia, C. Initializing K-means clustering using affinity propagation. In International Conference on Hybrid Intelligent Systems (HIS), Proceedings of the 2009 Ninth International Conference on Hybrid Intelligent Systems, Shenyang, China, 12–14 August 2009; Pan, J.S., Li, J., Abraham, A., Eds.; IEEE Computer Society: Los Alamitos, CA, USA; Volume 1, 2009; pp. 338–343.
- Kitagawa, Y.; Ishigoka, T.; Azumi, T. Anomaly prediction based on k-means clustering for memory-constrained embedded devices. In International Conference on Machine Learning and Applications (ICMLA), Proceedings of the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico, 18–21 December 2017; Chen, X., Luo, B., Luo, F., Palade, V., Wani, M.A., Eds.; IEEE: New York, NY, USA, 2017; pp. 26–33.
- Liu, Z.; Chen, D.; Bensmail, H.; Xu, Y. Clustering gene expression data with kernel principal components. J. Bioinform. Comput. Biol. 2005, 3, 303–316.
- Vo, H.X.; Durlofsky, L.J. Regularized kernel PCA for the efficient parameterization of complex geological models. J. Comput. Phys. 2016, 322, 859–881.
- Vapnik, V.; Izmailov, R. Synergy of monotonic rules. J. Mach. Learn. Res. 2016, 17, 136.
- Vapnik, V.; Izmailov, R. Reinforced SVM method and memorization mechanisms. Pattern Recogn. 2021, 119, 108018.
- Zhao, D.; Zeng, Y.; Wu, Q.; Du, X.; Gao, S.; Mei, A.; Zhao, H.; Zhang, Z.; Zhang, X. Source discrimination of mine gushing water using self-organizing feature maps: A case study in Ningtiaota coal mine, Shaanxi, China. Sustainability 2022, 14, 6551.
- Chicco, D.; Jurman, G. The advantages of the Matthews Correlation Coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
- Grandini, M.; Bagli, E.; Visani, G. Metrics for multi-class classification: An overview. arXiv 2020.
- Tewari, S.; Dwivedi, U.D. A Comparative Study of heterogeneous ensemble methods for the identification of geological lithofacies. J. Petrol. Explor. Prod. Technol. 2020, 10, 1849–1868.
- Prasad, A.; Chandra, S. PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Comput. Secur. 2024, 136, 103545.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lu, X.; Wang, Q.; Xie, B.; Zhu, J. A KPCA-ISSA-SVM Hybrid Model for Identifying Sources of Mine Water Inrush Using Hydrochemical Indicators. Water 2025, 17, 2859. https://doi.org/10.3390/w17192859