AI-Driven Tool Wear Prediction Under Severe Data Scarcity with SHAP-Guided Feature Selection and Fold-Safe Augmentation: A Case Study of Titanium Microdrilling
Abstract
1. Introduction
2. Models and Methodology
2.1. Machine Learning Models for Tool Wear Prediction
2.1.1. Stage 1: Data Acquisition & Screening
2.1.2. Stage 2: Model Development & Feature Engineering
2.1.3. Stage 3: Validation, Tuning & Robustness Analysis
3. Experimentation, Data Acquisition and Screening
3.1. Experimental Setup
3.2. Test Strategies and Data Acquisition
3.3. Flank-Wear Measurement and Screening (VB and BUE)
4. Model Development and Validation, Results and Discussion
4.1. Signal Preprocessing and Dataset Preparation
4.2. Explainable Feature Engineering
4.3. Cross-Validation Method Selection
4.4. Initial Model Selection and Screening
4.5. Weighted Composite Score for Model Selection
4.6. Hyperparameter Tuning and Fold-Safe Augmentation Within LOOCV (Leakage-Controlled)
4.6.1. Hyperparameter Tuning
4.6.2. Fold-Safe (Leakage-Controlled) Data Augmentation
4.7. Tuned, Leakage-Controlled, Augmented Selected XGBoost Model Performance
4.8. External Validation on Unseen Tool Geometries
5. Conclusions
- A data-efficient and interpretable AI framework enables reliable VBmax prediction under extreme data scarcity.
- SHAP-based feature reduction improves model stability while preserving physical interpretability.
- XGBoost outperforms SVR and RF when combined with fold-safe augmentation and controlled tuning.
- External validation confirms bounded generalization, with higher sensitivity to tool geometry than coating.
- The proposed workflow is transferable to other machining processes with appropriate validation.
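
The fold-safe (leakage-controlled) augmentation named in Section 4.6.2 and in the conclusions above can be illustrated with a minimal sketch. The key point is that synthetic samples are generated inside each leave-one-out fold, from the training rows only, so nothing derived from the held-out sample reaches the model. The sketch makes several assumptions that are not taken from the study: Gaussian jitter stands in for the augmentation scheme, an ordinary least-squares line stands in for the tuned XGBoost model, the wear values are made up, and the names `fit_line`, `augment`, and `loocv_fold_safe` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for the real regressor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def augment(xs, ys, copies, noise, rng):
    """Hypothetical augmentation: append Gaussian-jittered copies of the training rows."""
    aug_x, aug_y = list(xs), list(ys)
    for _ in range(copies):
        for x, y in zip(xs, ys):
            aug_x.append(x + rng.gauss(0.0, noise))
            aug_y.append(y + rng.gauss(0.0, noise))
    return aug_x, aug_y


def loocv_fold_safe(xs, ys, copies=3, noise=0.05, seed=0):
    """Leave-one-out CV in which augmentation sees only the training rows of each fold."""
    rng = random.Random(seed)
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        # Augment AFTER the split: no synthetic row is derived from the held-out sample.
        aug_x, aug_y = augment(train_x, train_y, copies, noise, rng)
        a, b = fit_line(aug_x, aug_y)
        preds.append(a * xs[i] + b)
    return preds


# Toy data: 18 cycles (the sample size of the study) with a made-up linear wear trend.
cycles = [float(c) for c in range(1, 19)]
wear = [2.0 + 0.5 * c for c in cycles]

preds = loocv_fold_safe(cycles, wear)
mae = sum(abs(p - y) for p, y in zip(preds, wear)) / len(wear)
print(f"LOOCV MAE on toy data: {mae:.3f}")
```

Augmenting the full dataset before splitting would let jittered copies of the held-out sample enter the training set and would make the cross-validated error look better than it is; keeping `augment` inside the loop avoids this.
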
6. Future Research Direction
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Hyacinth Suganthi, X.; Natarajan, U.; Ramasubbu, N. A Review of Accuracy Enhancement in Microdrilling Operations. Int. J. Adv. Manuf. Technol. 2015, 81, 199–217.
2. Chaudhary, K.; Haribhakta, V.K. Micro-Drilling on Shape Memory Alloys—A Review. MethodsX 2024, 13, 102968.
3. Niinomi, M. Mechanical Properties of Biomedical Titanium Alloys. Mater. Sci. Eng. A 1998, 243, 231–236.
4. Hourmand, M.; Sarhan, A.A.D.; Sayuti, M.; Hamdi, M. A Comprehensive Review on Machining of Titanium Alloys. Arab. J. Sci. Eng. 2021, 46, 7087–7123.
5. Löffler, F. Wear and Cutting Performance of Coated Microdrills. Surf. Coat. Technol. 1998, 107, 191–196.
6. Christiand, C.; Kiswanto, G.; Baskoro, A.S.; Hasymi, Z.; Ko, T.J. Tool Wear Monitoring in Micro-Milling Based on Digital Twin Technology with an Extended Kalman Filter. J. Manuf. Mater. Process. 2024, 8, 108.
7. Beruvides, G.; Quiza, R.; Del Toro, R.; Haber, R.E. Sensoring Systems and Signal Analysis to Monitor Tool Wear in Microdrilling Operations on a Sintered Tungsten–Copper Composite Material. Sens. Actuators A Phys. 2013, 199, 165–175.
8. Gomes, M.C.; Brito, L.C.; Bacci Da Silva, M.; Viana Duarte, M.A. Tool Wear Monitoring in Micromilling Using Support Vector Machine with Vibration and Sound Sensors. Precis. Eng. 2021, 67, 137–151.
9. Wang, S.-M.; Tsou, W.-S.; Huang, J.-W.; Chen, S.-E.; Wu, C.-C. Development of a Method and a Smart System for Tool Critical Life Real-Time Monitoring. J. Manuf. Mater. Process. 2024, 8, 194.
10. Fattahi, S.; Azarhoushang, B.; Kitzig-Frank, H. Knowledge-Based Adaptive Design of Experiments (KADoE) for Grinding Process Optimization Using an Expert System in the Context of Industry 4.0. J. Manuf. Mater. Process. 2025, 9, 62.
11. Wu, D.; Jennings, C.; Terpenny, J.; Gao, R.X.; Kumara, S. A Comparative Study on Machine Learning Algorithms for Smart Manufacturing: Tool Wear Prediction Using Random Forests. J. Manuf. Sci. Eng. 2017, 139, 071018.
12. Axinte, D.; Gindy, N. Assessment of the Effectiveness of a Spindle Power Signal for Tool Condition Monitoring in Machining Processes. Int. J. Prod. Res. 2004, 42, 2679–2691.
13. Omole, S.; Dogan, H.; Lunt, A.J.G.; Kirk, S.; Shokrani, A. Using Machine Learning for Cutting Tool Condition Monitoring and Prediction during Machining of Tungsten. Int. J. Comput. Integr. Manuf. 2024, 37, 747–771.
14. Alajmi, M.S.; Almeshal, A.M. Predicting the Tool Wear of a Drilling Process Using Novel Machine Learning XGBoost-SDA. Materials 2020, 13, 4952.
15. Zhou, Y.; Liu, C.; Yu, X.; Liu, B.; Quan, Y. Tool Wear Mechanism, Monitoring and Remaining Useful Life (RUL) Technology Based on Big Data: A Review. SN Appl. Sci. 2022, 4, 232.
16. Liu, X.; Chen, G.; Li, Y.; Chen, L.; Meng, Q.; Mehdi-Souzani, C. Sampling via the Aggregation Value for Data-Driven Manufacturing. Natl. Sci. Rev. 2022, 9, nwac201.
17. Lv, H.; Chen, J.; Zhang, T.; Hou, R.; Pan, T.; Zhou, Z. SDA: Regularization with Cut-Flip and Mix-Normal for Machinery Fault Diagnosis under Small Dataset. ISA Trans. 2021, 111, 337–349.
18. Siahsarani, A.; Fattahi, S.; Alinaghizadeh, A.; Azarhoushang, B.; Bösinger, R. Data-Driven Optimization of Processing Parameters and Cooling Strategies in UHMWPE High Speed Milling Through Multi-Criteria Decision Making Using PCA and Pareto-Based Evolutionary Algorithms. Int. J. Precis. Eng. Manuf.-Green Technol. 2026.
19. Ahmed, S.F.; Alam, M.S.B.; Hassan, M.; Rozbu, M.R.; Ishtiak, T.; Rafa, N.; Mofijur, M.; Shawkat Ali, A.B.M.; Gandomi, A.H. Deep Learning Modelling Techniques: Current Progress, Applications, Advantages, and Challenges. Artif. Intell. Rev. 2023, 56, 13521–13617.
20. Alwosheel, A.; Van Cranenburgh, S.; Chorus, C.G. Is Your Dataset Big Enough? Sample Size Requirements When Using Artificial Neural Networks for Discrete Choice Analysis. J. Choice Model. 2018, 28, 167–182.
21. Nenchev, B.; Tao, Q.; Dong, Z.; Panwisawas, C.; Li, H.; Tao, B.; Dong, H. Evaluating Data-Driven Algorithms for Predicting Mechanical Properties with Small Datasets: A Case Study on Gear Steel Hardenability. Int. J. Miner. Metall. Mater. 2022, 29, 836–847.
22. Domínguez-Monferrer, C.; Fernández-Pérez, J.; De Santos, R.; Miguélez, M.H.; Cantero, J.L. Machine Learning Approach in Non-Intrusive Monitoring of Tool Wear Evolution in Massive CFRP Automatic Drilling Processes in the Aircraft Industry. J. Manuf. Syst. 2022, 65, 622–639.
23. Truong, T.T.; Airao, J.; Hojati, F.; Ilvig, C.F.; Azarhoushang, B.; Karras, P.; Aghababaei, R. Data-Driven Prediction of Tool Wear Using Bayesian Regularized Artificial Neural Networks. Measurement 2024, 238, 115303.
24. Truong, T.T.; Airao, J.; Fattahi, S.; Azarhoushang, B.; Karras, P.; Aghababaei, R. Image-Based Machine Learning Model for Tool Wear Estimation in Milling Inconel 718. Wear 2025, 571, 205865.
25. Gilpin, L.H.; Bau, D.; Yuan, B.Z.; Bajwa, A.; Specter, M.; Kagal, L. Explaining Explanations: An Overview of Interpretability of Machine Learning. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 1–3 October 2018; IEEE: New York, NY, USA, 2018; pp. 80–89.
26. Adadi, A.; Berrada, M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
27. Minh, D.; Wang, H.X.; Li, Y.F.; Nguyen, T.N. Explainable Artificial Intelligence: A Comprehensive Review. Artif. Intell. Rev. 2022, 55, 3503–3568.
28. Alomari, Y.; Andó, M. SHAP-Based Insights for Aerospace PHM: Temporal Feature Importance, Dependencies, Robustness, and Interaction Analysis. Results Eng. 2024, 21, 101834.
29. Cheng, W.-N.; Cheng, C.-C.; Lei, Y.-H.; Tsai, P.-C. Feature Selection for Predicting Tool Wear of Machine Tools. Int. J. Adv. Manuf. Technol. 2020, 111, 1483–1501.
30. Shen, Y.; Yang, F.; Habibullah, M.S.; Ahmed, J.; Das, A.K.; Zhou, Y.; Ho, C.L. Predicting Tool Wear Size across Multi-Cutting Conditions Using Advanced Machine Learning Techniques. J. Intell. Manuf. 2021, 32, 1753–1766.
31. Varghese, A.; Kulkarni, V.; Joshi, S.S. Tool Life Stage Prediction in Micro-Milling From Force Signal Analysis Using Machine Learning Methods. J. Manuf. Sci. Eng. 2021, 143, 054501.
32. Yang, Z.; Li, L.; Zhang, Y.; Jiang, Z.; Liu, X. Tool Wear State Monitoring in Titanium Alloy Milling Based on Wavelet Packet and TTAO-CNN-BiLSTM-AM. Processes 2024, 13, 13.
33. Yan, S.; Sui, L.; Wang, S.; Sun, Y. On-Line Tool Wear Monitoring under Variable Milling Conditions Based on a Condition-Adaptive Hidden Semi-Markov Model (CAHSMM). Mech. Syst. Signal Process. 2023, 200, 110644.
34. Sharma, P.; Thulasi, H.M.; Mishra, S.K.; Ramkumar, J. Identification of Parameter-Dependent Machine Learning Models for Tool Flank Wear Prediction in Dry Titanium Machining. Proc. Inst. Mech. Eng. Part E J. Process Mech. Eng. 2024, 09544089241304236.
35. Shurrab, S.; Almshnanah, A.; Duwairi, R. Tool Wear Prediction in Computer Numerical Control Milling Operations via Machine Learning. In Proceedings of the 2021 12th International Conference on Information and Communication Systems (ICICS), Valencia, Spain, 24 May 2021; IEEE: New York, NY, USA, 2021; pp. 220–227.
36. Misal, A.; Karandikar, H.; Sayyad, S.; Bongale, A.; Kumar, S.; Warke, V. Milling Tool Wear Estimation Using Machine Learning with Feature Extraction Approach. In Proceedings of the 2024 MIT Art, Design and Technology School of Computing International Conference (MITADTSoCiCon), Pune, India, 25 April 2024; IEEE: New York, NY, USA, 2024; pp. 1–7.
37. Rather, I.H.; Kumar, S.; Gandomi, A.H. Breaking the Data Barrier: A Review of Deep Learning Techniques for Democratizing AI with Small Datasets. Artif. Intell. Rev. 2024, 57, 226.
38. Danish, M.; Gupta, M.K.; Irfan, S.A.; Ghazali, S.M.; Rathore, M.F.; Krolczyk, G.M.; Alsaady, A. Machine Learning Models for Prediction and Classification of Tool Wear in Sustainable Milling of Additively Manufactured 316 Stainless Steel. Results Eng. 2024, 22, 102015.
39. Dilli Ganesh, V.; Thangaraj, S.J.J. Prediction of Flank Wear in Turning of Monel K500 by Using Machine Learning Model in Comparison With Experimental Analysis. In Proceedings of the 2023 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI), Chennai, India, 21 December 2023; IEEE: New York, NY, USA, 2023; pp. 1–5.
40. Kosarac, A.; Mladjenovic, C.; Zeljkovic, M.; Tabakovic, S.; Knezev, M. Neural-Network-Based Approaches for Optimization of Machining Parameters Using Small Dataset. Materials 2022, 15, 700.
41. Zhu, Y.; Yuan, Z.; Khonsari, M.M.; Zhao, S.; Yang, H. Small-Dataset Machine Learning for Wear Prediction of Laser Powder Bed Fusion Fabricated Steel. J. Tribol. 2023, 145, 091101.
42. Shah, R.; Pai, N.; Thomas, G.; Jha, S.; Mittal, V.; Shirvni, K.; Liang, H. Machine Learning in Wear Prediction. J. Tribol. 2025, 147, 040801.
43. Hirsch, E.; Friedrich, C. Data-Driven Tool Wear Prediction in Milling, Based on a Process-Integrated Single-Sensor Approach. arXiv 2024, arXiv:2412.19950.
44. Dubey, V.; Sharma, A.K.; Pimenov, D.Y. Prediction of Surface Roughness Using Machine Learning Approach in MQL Turning of AISI 304 Steel by Varying Nanoparticle Size in the Cutting Fluid. Lubricants 2022, 10, 81.
45. Liu, Z.; Xu, Y.; Qiu, C.; Tan, J. A Novel Support Vector Regression Algorithm Incorporated with Prior Knowledge and Error Compensation for Small Datasets. Neural Comput. Appl. 2019, 31, 4849–4864.
46. Schulz, E.; Speekenbrink, M.; Krause, A. A Tutorial on Gaussian Process Regression: Modelling, Exploring, and Exploiting Functions. J. Math. Psychol. 2018, 85, 1–16.
47. Norazman, S.H.; Aspar, M.A.S.M.; Ghafar, A.N.A.; Karumdin, N.; Abidin, A.N.S.Z. Artificial Neural Network Analysis in Road Crash Data: A Review on Its Potential Application in Autonomous Vehicles. In Intelligent Manufacturing and Mechatronics; Isa, W.H.M., Khairuddin, I.M., Razman, M.A.M., Saruchi, S.A., Teh, S.-H., Liu, P., Eds.; Lecture Notes in Networks and Systems; Springer Nature: Singapore, 2024; Volume 850, pp. 95–104.
48. Tazi, K.; Lin, J.A.; Viljoen, R.; Gardner, A.; John, S.; Ge, H.; Turner, R.E. Beyond Intuition, a Framework for Applying GPs to Real-World Data. arXiv 2023, arXiv:2307.03093.
49. Ougiaroglou, S.; Evangelidis, G. Dealing with Noisy Data in the Context of K-NN Classification. In Proceedings of the 7th Balkan Conference on Informatics Conference, Craiova, Romania, 2 September 2015; ACM: New York, NY, USA, 2015; pp. 1–4.
50. Chaudhuri, A. Hierarchical Modified Regularized Least Squares Fuzzy Support Vector Regression through Multiscale Approach. In Advances in Computational Intelligence; Rojas, I., Joya, G., Gabestany, J., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2013; Volume 7902, pp. 393–407. ISBN 978-3-642-38678-7.
51. Acito, F. K Nearest Neighbors. In Predictive Analytics with KNIME; Springer Nature: Cham, Switzerland, 2023; pp. 209–227. ISBN 978-3-031-45629-9.
52. Shrivastava, A.; Kotiyal, A.; Habelalmateen, M.I.; Rana, A.; Devi, V.S.A.; Rao, B.D.; Bansal, S. Leveraging XGBoost for Predictive Analytics in Healthcare: Enhancing Disease Diagnosis. In Proceedings of the 2024 7th International Conference on Contemporary Computing and Informatics (IC3I), Greater Noida, India, 18 September 2024; IEEE: New York, NY, USA, 2024; pp. 1666–1672.
53. Yan, R.; Wang, S. Linear Regression Models. In Applications of Machine Learning and Data Analytics Models in Maritime Transportation; Institution of Engineering and Technology: London, UK, 2022; pp. 51–62. ISBN 978-1-83953-559-8.
54. Wan, A.; Gong, Z.; Chen, T.; AL-Bukhaiti, K. Mass Flow Characteristics Prediction of Refrigerants through Electronic Expansion Valve Based on XGBoost. Int. J. Refrig. 2024, 158, 345–352.
55. Kretowski, M. Oblique and Mixed Decision Trees. In Evolutionary Decision Trees in Large-Scale Data Mining; Studies in Big Data; Springer International Publishing: Cham, Switzerland, 2019; Volume 59, pp. 101–113. ISBN 978-3-030-21850-8.
56. Chen, Y.; Dong, Y.; Liu, W. Prediction of Credit Default Based on the XGBoost Model. Appl. Comput. Eng. 2024, 96, 85–92.
57. Lin, Z.; Fan, Y.; Tan, J.; Li, Z.; Yang, P.; Wang, H.; Duan, W. Tool Wear Prediction Based on XGBoost Feature Selection Combined with PSO-BP Network. Sci. Rep. 2025, 15, 3096.
58. Qi, Y. Random Forest for Bioinformatics. In Ensemble Machine Learning; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012; pp. 307–323. ISBN 978-1-4419-9325-0.
59. Wang, L.; Li, Q.; Yu, Y.; Liu, J. Region Compatibility Based Stability Assessment for Decision Trees. Expert Syst. Appl. 2018, 105, 112–128.
60. Utkin, L.V.; Kovalev, M.S.; Frank Coolen, P.A. Robust Regression Random Forests by Small and Noisy Training Data. In Proceedings of the 2019 XXII International Conference on Soft Computing and Measurements (SCM), St. Petersburg, Russia, 23–25 May 2019; IEEE: New York, NY, USA, 2019; pp. 134–137.
61. Pukelis, L.; Stančiauskas, V. The Opportunities and Limitations of Using Artificial Neural Networks in Social Science Research. Politologija 2019, 94, 56–80.
62. Kim, D.; Lee, C.; Hwang, S.; Jeong, M.K. A Robust Support Vector Regression with a Linear-Log Concave Loss Function. J. Oper. Res. Soc. 2016, 67, 735–742.
63. McKearnan, S.B.; Vock, D.M.; Marai, G.E.; Canahuate, G.; Fuller, C.D.; Wolfson, J. Feature Selection for Support Vector Regression Using a Genetic Algorithm. Biostatistics 2023, 24, 295–308.
64. Liu, B.; Gao, L.; Li, B.; Marcos-Martinez, R.; Bryan, B.A. Nonparametric Machine Learning for Mapping Forest Cover and Exploring Influential Factors. Landsc. Ecol. 2020, 35, 1683–1699.
65. Scornet, E. Random Forests and Kernel Methods. IEEE Trans. Inf. Theory 2016, 62, 1485–1500.
66. Boldini, D.; Grisoni, F.; Kuhn, D.; Friedrich, L.; Sieber, S.A. Practical Guidelines for the Use of Gradient Boosting for Molecular Property Prediction. J. Cheminform. 2023, 15, 73.
67. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A Comparative Analysis of Gradient Boosting Algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
68. Che, Z.; Peng, C.; Wang, C.; Wang, J. A Novel Integrated TDLAVOA-XGBoost Model for Tool Wear Prediction in Lathe and Milling Operations. Results Eng. 2025, 27, 105984.
69. Patra, K.; Jha, A.K.; Szalay, T.; Ranjan, J.; Monostori, L. Artificial Neural Network Based Tool Condition Monitoring in Micro Mechanical Peck Drilling Using Thrust Force Signals. Precis. Eng. 2017, 48, 279–291.
70. Shahinur, S.; Ullah, A.M.M.S.; Noor-E-Alam, M.; Haniu, H.; Kubo, A. A Decision Model for Making Decisions under Epistemic Uncertainty and Its Application to Select Materials. Artif. Intell. Eng. Des. Anal. Manuf. 2017, 31, 298–312.
71. ISO 3685; Tool-Life Testing with Single-Point Turning Tools. ISO: Geneva, Switzerland, 1993.
72. Li, G.; Li, N.; Wen, C.; Ding, S. Investigation and Modeling of Flank Wear Process of Different PCD Tools in Cutting Titanium Alloy Ti6Al4V. Int. J. Adv. Manuf. Technol. 2018, 95, 719–733.
73. Fang, N.; Pai, P.S.; Mosquea, S. The Effect of Built-up Edge on the Cutting Vibrations in Machining 2024-T351 Aluminum Alloy. Int. J. Adv. Manuf. Technol. 2010, 49, 63–71.
74. Kovvuri, V.; Wang, Z.; Araujo, A.; Da Silva, M.B.; Bukkapatnam, S.; Hung, W.N.P. Built-Up-Edge Formation in Micromilling. In Proceedings of the Volume 2A: Advanced Manufacturing, Houston, TX, USA, 13 November 2015; American Society of Mechanical Engineers: Houston, TX, USA, 2015; p. V02AT02A057.
75. Oliaei, S.N.B.; Karpat, Y. Investigating the Influence of Friction Conditions on Finite Element Simulation of Microscale Machining with the Presence of Built-up Edge. Int. J. Adv. Manuf. Technol. 2017, 90, 819–829.
76. Winnuwat, N.; Muttamara, A.; Kloypayan, J. A Study of the Phenomenon BUE Creation in Trochoidal Milling. MM Sci. J. 2023, 2023, 6435–6440.
77. Sadeghi, M.; Behnia, F.; Amiri, R. Window Selection of the Savitzky–Golay Filters for Signal Recovery From Noisy Measurements. IEEE Trans. Instrum. Meas. 2020, 69, 5418–5427.
78. Krishnan, S.R.; Seelamantula, C.S. On the Selection of Optimum Savitzky-Golay Filters. IEEE Trans. Signal Process. 2013, 61, 380–391.
79. Kondo, E.; Kamo, R.; Murakami, H. Monitoring of Burr and Prefailure Phase Caused by Tool Wear in Micro-Drilling Operations Using Thrust Force Signals. J. Adv. Mech. Des. Syst. Manuf. 2012, 6, 885–897.
80. Li, G.S.; Lau, W.S.; Zhang, Y.Z. In-Process Drill Wear and Breakage Monitoring for a Machining Centre Based on Cutting Force Parameters. Int. J. Mach. Tools Manuf. 1992, 32, 855–867.
81. Wang, H.; Liang, Q.; Hancock, J.T.; Khoshgoftaar, T.M. Feature Selection Strategies: A Comparative Analysis of SHAP-Value and Importance-Based Methods. J. Big Data 2024, 11, 44.
82. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
83. Schmude, P. Feature Selection in Multiple Linear Regression Problems with Fewer Samples Than Features. In Bioinformatics and Biomedical Engineering; Rojas, I., Ortuño, F., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2017; Volume 10208, pp. 85–95. ISBN 978-3-319-56147-9.
84. Baumann, K. Cross-Validation as the Objective Function for Variable-Selection Techniques. TrAC Trends Anal. Chem. 2003, 22, 395–406.
85. Qiu, J. An Analysis of Model Evaluation with Cross-Validation: Techniques, Applications, and Recent Advances. Adv. Econ. Manag. Polit. Sci. 2024, 99, 69–72.
86. Tantithamthavorn, C.; McIntosh, S.; Hassan, A.E.; Matsumoto, K. An Empirical Comparison of Model Validation Techniques for Defect Prediction Models. IEEE Trans. Softw. Eng. 2017, 43, 1–18.
87. Lumumba, V.; Kiprotich, D.; Mpaine, M.; Makena, N.; Kavita, M. Comparative Analysis of Cross-Validation Techniques: LOOCV, K-Folds Cross-Validation, and Repeated K-Folds Cross-Validation in Machine Learning Models. Am. J. Theor. Appl. Stat. 2024, 13, 127–137.
88. Roberts, D.R.; Bahn, V.; Ciuti, S.; Boyce, M.S.; Elith, J.; Guillera-Arroita, G.; Hauenstein, S.; Lahoz-Monfort, J.J.; Schröder, B.; Thuiller, W.; et al. Cross-validation Strategies for Data with Temporal, Spatial, Hierarchical, or Phylogenetic Structure. Ecography 2017, 40, 913–929.
89. Mohr, F.; Van Rijn, J.N. Fast and Informative Model Selection Using Learning Curve Cross-Validation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9669–9680.
90. Zhang, P. Model Selection Via Multifold Cross Validation. Ann. Stat. 1993, 21, 299–313.
91. Feng, C.-X.J.; Yu, Z.-G.S.; Emanuel, J.T.; Li, P.-G.; Shao, X.-Y.; Wang, Z.-H. Threefold versus Fivefold Cross-Validation and Individual versus Average Data in Predictive Regression Modelling of Machining Experimental Data. Int. J. Comput. Integr. Manuf. 2008, 21, 702–714.
92. Wainer, J.; Cawley, G. Nested Cross-Validation When Selecting Classifiers Is Overzealous for Most Practical Applications. Expert Syst. Appl. 2021, 182, 115222.
93. Szeghalmy, S.; Fazekas, A. A Comparative Study of the Use of Stratified Cross-Validation and Distribution-Balanced Stratified Cross-Validation in Imbalanced Learning. Sensors 2023, 23, 2333.
94. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272.
95. Bischl, B.; Binder, M.; Lang, M.; Pielok, T.; Richter, J.; Coors, S.; Thomas, J.; Ullmann, T.; Becker, M.; Boulesteix, A.; et al. Hyperparameter Optimization: Foundations, Algorithms, Best Practices, and Open Challenges. WIREs Data Min. Knowl. Discov. 2023, 13, e1484.
96. Lu, Z.; Dai, Y.; Li, W.; Su, Z. Joint Data and Feature Augmentation for Self-Supervised Representation Learning on Point Clouds. Graph. Models 2023, 129, 101188.
97. Liu, D.; Kababji, S.E.; Mitsakakis, N.; Pilgram, L.; Walters, T.; Clemons, M.; Pond, G.; El-Hussuna, A.; Emam, K.E. Synthetic Data Generation for Augmenting Small Samples. arXiv 2025, arXiv:2501.18741.

| Model | Typical Data Regime (Reported) | Strengths | Limitations | Interpretability | Rationale for Inclusion | Key References |
|---|---|---|---|---|---|---|
| Linear Regression | Medium–large datasets | Simple baseline | Inadequate for nonlinear wear behavior | High | Used as baseline only | [37,53] |
| Decision Tree | Medium datasets | Simple structure | High variance, unstable with small data | High | Excluded due to instability | [55,59] |
| Random Forest (RF) | Small–medium datasets | Robust, reduced overfitting | Limited extrapolation | Moderate | Selected as stable ensemble model | [58,60] |
| Support Vector Regression (SVR) | Small datasets | Effective under limited data | Kernel-dependent, tuning sensitive | Moderate | Selected for small-data robustness | [45,50] |
| XGBoost | Small–medium datasets | High predictive accuracy | Risk of overfitting without validation | Low–Moderate | Selected due to strong performance | [52,54] |
| K-Nearest Neighbors (k-NN) | Medium–large datasets | Simple implementation | Highly noise-sensitive | Low | Excluded due to noise sensitivity | [49,51] |
| Artificial Neural Network (ANN) | Large datasets | Flexible nonlinear modeling | Data-hungry, poor interpretability | Low | Excluded due to limited data | [47,61] |
| Gaussian Process Regression (GPR) | Small datasets | Uncertainty-aware | Computationally expensive | High | Excluded due to limited data | [46,48] |

| Parameter | Condition/Value |
|---|---|
| Workpiece material | Titanium Grade 5 (Ti-6Al-4V) |
| Microdrill geometries | TD.MI.080.3D (HB microtec GmbH) |
| Tool diameter | 0.8 mm |
| Main cutting parameters | Cutting speed vc: 75 m/min; feed per tooth fz: 0.02 mm; drilling depth: 2 mm (blind hole) |
| Cooling method | External, Oil-based |
| Drilling strategy | Peck drilling, 10 pecks/hole, peck depth: 0.2 mm, retraction: 0.18 mm |
| Measurement intervals | Tool wear measured after every 20 drilled holes (one cycle); 18 cycles in total |
| Data acquisition signals | Force (Fx, Fy, Fz), Acoustic Emission (AE) |
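
The cutting parameters in the table above imply a spindle speed, a feed rate, and a total hole count that are easy to check. The sketch below is illustrative only: it reads the 75 m/min entry as the cutting speed, uses the standard relation n = 1000·vc/(π·D), and assumes a two-flute drill (`flutes = 2`), which the table does not state. All variable names are hypothetical.

```python
import math

# Values taken from the parameter table.
cutting_speed_m_min = 75.0      # read here as cutting speed vc (assumption)
tool_diameter_mm = 0.8
feed_per_tooth_mm = 0.02
pecks_per_hole = 10
peck_depth_mm = 0.2
holes_per_cycle = 20
cycles = 18

# Assumption: a two-flute twist drill; the flute count is not given in the table.
flutes = 2

# Spindle speed from cutting speed: n = 1000 * vc / (pi * D), in rev/min.
spindle_rpm = 1000.0 * cutting_speed_m_min / (math.pi * tool_diameter_mm)

# Feed rate vf = fz * z * n, in mm/min (depends on the assumed flute count).
feed_rate_mm_min = feed_per_tooth_mm * flutes * spindle_rpm

# Consistency checks: pecks add up to the hole depth, cycles to the hole count.
depth_from_pecks_mm = pecks_per_hole * peck_depth_mm
total_holes = holes_per_cycle * cycles

print(f"spindle speed ~ {spindle_rpm:.0f} rev/min")
print(f"feed rate     ~ {feed_rate_mm_min:.0f} mm/min (for {flutes} flutes)")
print(f"depth per hole = {depth_from_pecks_mm:.1f} mm, total holes = {total_holes}")
```

Under these assumptions the spindle runs at roughly 29,800 rev/min, the ten pecks reproduce the 2 mm drilling depth, and 18 cycles correspond to 360 drilled holes per tool.
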

| Metric | Initial Model | Tuned Model | Tuned with Data Augmentation | Relative Improvement |
|---|---|---|---|---|
| R² | 0.81 | 0.85 | 0.89 | 9.92% |
| MSE | 1.22 | 0.91 | 0.70 | 42.63% |
| MAE | 0.88 | 0.76 | 0.57 | 34.68% |
| MAPE | 11.40 | 9.52 | 7.62 | 33.21% |
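
The relative-improvement column can be sanity-checked from the first and last model columns. The sketch below assumes the column compares the initial model with the tuned, augmented model, and it treats R² as higher-is-better and the error metrics as lower-is-better; the function name `relative_improvement` is hypothetical. Recomputing from the rounded table entries gives roughly 9.9%, 42.6%, 35.2%, and 33.2%, close to but not identical with the reported figures, which were presumably derived from unrounded values.

```python
def relative_improvement(initial, final, higher_is_better):
    """Percentage change from `initial` to `final`, signed so positive means better."""
    change = (final - initial) if higher_is_better else (initial - final)
    return 100.0 * change / abs(initial)


# Rounded table values: (initial model, tuned model with augmentation, higher is better?)
table = {
    "R2": (0.81, 0.89, True),
    "MSE": (1.22, 0.70, False),
    "MAE": (0.88, 0.57, False),
    "MAPE": (11.40, 7.62, False),
}

improvement = {
    name: relative_improvement(initial, final, up)
    for name, (initial, final, up) in table.items()
}

for name, pct in improvement.items():
    print(f"{name}: {pct:.2f}%")
```
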

| Tool ID | Geometry/Coating | Cycles (Sample Size) | R² | MSE [µm²] | MAPE [%] | MAE [µm] |
|---|---|---|---|---|---|---|
| TD.MI.080.3D | 3D, uncoated | 18 | 0.89 | 0.70 | 7.62 | 0.57 |
| TD.MI.080.8D | 8D, uncoated | 16 | 0.83 | 1.44 | 9.12 | 0.91 |
| TD.MI.080.3D.1 | 3D, α-INOX coated | 29 | 0.87 | 1.06 | 8.26 | 0.75 |
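
For reference, the four metrics reported in the table above follow their standard definitions, sketched below in plain Python. The function name `regression_metrics` and the four-point example data are made up for illustration and are not measurements from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return R2, MSE, MAE and MAPE (in percent) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n                   # mean squared error
    mae = sum(abs(e) for e in errors) / n                  # mean absolute error
    # MAPE divides by the true value, so it requires non-zero targets.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    r2 = 1.0 - (mse * n) / ss_tot                          # 1 - SS_res / SS_tot
    return {"R2": r2, "MSE": mse, "MAE": mae, "MAPE": mape}


# Made-up wear values in micrometres, for illustration only.
measured = [4.0, 6.0, 8.0, 10.0]
predicted = [4.5, 5.5, 8.5, 9.5]

metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With wear in µm, MSE carries units of µm² and MAE of µm, while R² and MAPE are dimensionless, which matches the units in the table header.
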

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Fattahi, S.; Azarhoushang, B.; Paknejad, M.; Kitzig-Frank, H. AI-Driven Tool Wear Prediction Under Severe Data Scarcity with SHAP-Guided Feature Selection and Fold-Safe Augmentation: A Case Study of Titanium Microdrilling. Machines 2026, 14, 196. https://doi.org/10.3390/machines14020196

