Ground-Type Classification from Earth-Pressure-Balance Shield Operational Data with Uncertainty Quantification
Abstract
1. Introduction
2. Modelling Background and Methodology
2.1. Categorical Boosting (CatBoost)
2.2. Optuna
2.3. Information Driven Metaheuristic Optimization (OriginalINFO/INFO)
2.4. Nevergrad Optimizer (NGOpt)
2.5. Hybrid Modelling Procedure
- Step 1: Data partitioning. The dataset is split 80:20 into a training set, used to learn the classification rule, and a test set, used to evaluate generalization on unseen samples. The split is stratified so that class proportions remain consistent across the two subsets, which mitigates bias in performance estimates for imbalanced data. Standardization and linear orthogonalization are fitted on the training set only and then applied to the test set with the same parameters, which prevents information leakage (see Section 3 for preprocessing details).
- Step 2: Objective function design. To obtain a more reliable estimate of generalization under limited sample sizes, the optimization objective is defined by cross-validation, as presented in Figure 2. Specifically, the mean accuracy over fivefold cross-validation on the training set serves as the objective value (fitness value) of each trial.
- Step 3: Model initialization. To explore the performance envelope of CatBoost, three hyperparameters are tuned: the number of boosting iterations, i.e., trees (iterations), the tree depth (max_depth), and the learning rate (learning_rate). Together they govern model capacity and learning dynamics: iterations sets the number of base learners and directly affects both fit and training time; max_depth determines the complexity of each tree and therefore both the ability to capture higher-order nonlinearities and the risk of overfitting; learning_rate sets the step size of each boosting round and trades convergence speed against generalization. The ranges and default settings are summarized in Table 1.
- Step 4: Iterative optimization. After initialization, an iterative search proposes candidate hyperparameter combinations and scores each one with the objective from Step 2. The loop terminates once 200 trials have been completed, and the best-scoring configuration is used for final training and evaluation, as displayed in Figure 3.
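Steps 1 and 2 can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the nearest-centroid classifier is a deliberately simple stand-in for CatBoost, the toy data are hypothetical, the folds are assumed to be stratified (the text only says "fivefold"), and all function names are introduced here for illustration. For brevity the scaler is fitted once on the full training set before cross-validation.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def stratified_split(labels, test_ratio=0.2, seed=0):
    """Return (train_idx, test_idx) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_ratio)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(rows):
    """Column means and standard deviations, estimated from the given rows only."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def apply_scaler(rows, scaler):
    """Standardize rows with previously fitted parameters."""
    mu, sd = scaler
    return [[(v - m) / s for v, m, s in zip(r, mu, sd)] for r in rows]


def centroid_fit(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {c: [mean(col) for col in zip(*members)] for c, members in groups.items()}


def centroid_predict(model, X):
    """Assign each row to the class with the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(model, key=lambda c: sq_dist(row, model[c])) for row in X]


def cv_mean_accuracy(X, y, k=5, seed=0):
    """Objective value: mean validation accuracy over k stratified folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    scores = []
    for fold in folds:
        held_out = set(fold)
        tr = [i for i in range(len(y)) if i not in held_out]
        model = centroid_fit([X[i] for i in tr], [y[i] for i in tr])
        pred = centroid_predict(model, [X[i] for i in fold])
        scores.append(sum(p == y[i] for p, i in zip(pred, fold)) / len(fold))
    return mean(scores)


# Hypothetical toy data: 80 samples of class 0 and 20 of class 1, two features each.
rng = random.Random(1)
labels = [0] * 80 + [1] * 20
rows = [[10.0 * y + rng.uniform(-1, 1), 10.0 * y + rng.uniform(-1, 1)] for y in labels]

train_idx, test_idx = stratified_split(labels)
scaler = fit_scaler([rows[i] for i in train_idx])            # training rows only
X_train = apply_scaler([rows[i] for i in train_idx], scaler)
X_test = apply_scaler([rows[i] for i in test_idx], scaler)   # same parameters reused
y_train = [labels[i] for i in train_idx]
fitness = cv_mean_accuracy(X_train, y_train)
```

The test rows never enter `fit_scaler`, which is the leakage guard described in Step 1. In the actual workflow, `cv_mean_accuracy` would train a CatBoost model with the trial's hyperparameters instead of the centroid stand-in.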
3. Dataset Description
3.1. Data Source and Feature Analysis
3.2. PCA
4. Results and Discussion
4.1. Selection of the Optimal Model
4.2. Model Uncertainty Analysis
5. Limitations and Prospects
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Hyperparameter | Boosting Iterations | Max_Depth | Learning_Rate |
|---|---|---|---|
| Search range | [50, 500] | [1, 10] | [0.001, 0.3] |
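The search ranges in the table above, together with the 200-trial budget of Step 4, can be sketched as a simple search loop. This is only an illustration under stated assumptions: plain random search stands in for the Optuna, INFO, and NGOpt optimizers, log-uniform sampling of the learning rate is an assumption (the source gives only the interval), and `toy_objective` is a hypothetical placeholder for the fivefold cross-validated accuracy of a CatBoost model.

```python
import math
import random

# Search ranges taken from the table above.
SEARCH_SPACE = {
    "iterations": (50, 500),         # integer
    "max_depth": (1, 10),            # integer
    "learning_rate": (0.001, 0.3),   # float
}
MAX_TRIALS = 200  # trial budget stated in Step 4


def sample_config(rng):
    """Draw one candidate configuration from the search space."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    # Assumption: sample the learning rate log-uniformly, then clamp to the range
    # to guard against floating-point overshoot at the boundaries.
    lr = math.exp(rng.uniform(math.log(lo), math.log(hi)))
    return {
        "iterations": rng.randint(*SEARCH_SPACE["iterations"]),
        "max_depth": rng.randint(*SEARCH_SPACE["max_depth"]),
        "learning_rate": min(hi, max(lo, lr)),
    }


def random_search(objective, max_trials=MAX_TRIALS, seed=0):
    """Evaluate max_trials candidates and keep the one with the highest score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(max_trials):
        cfg = sample_config(rng)
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_objective(cfg):
    """Hypothetical placeholder for the cross-validated accuracy of one trial."""
    return -abs(cfg["max_depth"] - 6) - abs(math.log10(cfg["learning_rate"]) + 1.0)


best_cfg, best_score = random_search(toy_objective)
```

Any optimizer that proposes configurations and receives a scalar score can be swapped in for `random_search` without changing the objective or the budget.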

| Variable | Mean | Std | Min | 25% | Median | 75% | Max |
|---|---|---|---|---|---|---|---|
| CRS (rpm) | 1.43 | 0.18 | 0.90 | 1.30 | 1.40 | 1.50 | 2.00 |
| AR (mm/min) | 22.36 | 14.22 | 3.00 | 10.00 | 15.00 | 35.00 | 65.00 |
| MF (kN/m²) | 453.68 | 170.94 | 66.91 | 323.91 | 425.79 | 577.86 | 912.41 |
| MT (kN/m²) | 6.67 | 3.40 | 1.57 | 3.92 | 5.87 | 9.14 | 16.97 |
| UEP (MPa) | 0.16 | 0.08 | 0.00 | 0.12 | 0.17 | 0.21 | 0.33 |
| LEP (MPa) | 0.20 | 0.09 | 0.00 | 0.15 | 0.20 | 0.26 | 0.40 |
| PR (mm/r) | 16.35 | 11.27 | 2.31 | 7.14 | 11.54 | 25.00 | 50.00 |
| FPI (-) | 3032.61 | 2367.19 | 327.28 | 878.38 | 2550.00 | 4500.00 | 16,000.00 |
| TPI (-) | 542.01 | 475.61 | 48.75 | 127.50 | 443.48 | 780.00 | 3380.00 |
| SE (kW·h/m³) | 14.50 | 12.64 | 1.37 | 3.49 | 11.90 | 20.82 | 89.87 |

| Principal Component (PC) | Top-1 Feature (absolute loading) | Top-2 Feature (absolute loading) | Top-3 Feature (absolute loading) |
|---|---|---|---|
| PC1 | FPI (0.3846) | SE (0.3829) | TPI (0.3825) |
| PC2 | UEP (0.5012) | LEP (0.4471) | CRS (0.4034) |
| PC3 | CRS (0.6843) | LEP (0.3166) | UEP (0.3160) |
| PC4 | CRS (0.5869) | AR (0.5100) | PR (0.4235) |
| PC5 | MT (0.6808) | FPI (0.5996) | PR (0.2289) |

| Classifier | Acc | Pre | Rec |
|---|---|---|---|
| NGOpt-CatBoost | 0.9979 | 0.9963 | 0.9984 |
| INFO-CatBoost | 0.9979 | 0.9942 | 0.9984 |
| Optuna-CatBoost | 0.9948 | 0.9898 | 0.9939 |
| CatBoost | 0.9854 | 0.9854 | 0.9854 |
| RF | 0.9812 | 0.9812 | 0.9812 |
| XGBoost | 0.9822 | 0.9822 | 0.9822 |
| DT | 0.9833 | 0.9833 | 0.9833 |
| ET | 0.9540 | 0.9540 | 0.9540 |
| RuleFit | 0.9833 | 0.9833 | 0.9833 |

| Classifier | Acc | Pre | Rec |
|---|---|---|---|
| NGOpt-CatBoost | 0.9625 | 0.9715 | 0.9716 |
| INFO-CatBoost | 0.9542 | 0.9654 | 0.9570 |
| Optuna-CatBoost | 0.9500 | 0.9621 | 0.9622 |
| CatBoost | 0.9292 | 0.9292 | 0.9292 |
| RF | 0.9167 | 0.9167 | 0.9167 |
| XGBoost | 0.9125 | 0.9125 | 0.9125 |
| DT | 0.8917 | 0.8917 | 0.8917 |
| ET | 0.8958 | 0.8958 | 0.8958 |
| RuleFit | 0.8708 | 0.8708 | 0.8708 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, S.; Chen, Y.; Khandelwal, M.; Zhou, J. Ground-Type Classification from Earth-Pressure-Balance Shield Operational Data with Uncertainty Quantification. Appl. Sci. 2025, 15, 13234. https://doi.org/10.3390/app152413234

