Article

Machine Learning-Based Prediction of Surface Integrity in High-Pressure Coolant-Assisted Machining of Near-β Ti-5553 Titanium Alloy

Department of Mechanical Engineering, Engineering Faculty, Burdur Mehmet Akif Ersoy University, 15030 Burdur, Turkey
Machines 2026, 14(4), 367; https://doi.org/10.3390/machines14040367
Submission received: 17 February 2026 / Revised: 20 March 2026 / Accepted: 22 March 2026 / Published: 27 March 2026

Abstract

This study investigates the factors affecting surface integrity during the machining of near-β Ti-5553, a critical material in the aerospace and defense industries. Considering this alloy as a difficult-to-machine material, the turning process was examined by analyzing the effects of cutting speed, feed rate, and cooling strategy (dry, conventional, and 30 MPa high-pressure cooling) on cutting force, temperature, surface roughness, and residual stress. The primary novelty of this research lies in its integrated approach: rather than evaluating surface integrity metrics in isolation, it simultaneously models the interrelated responses of residual stress, cutting temperature, cutting force, and surface roughness under high-pressure coolant (HPC) conditions. Furthermore, it introduces a robust machine learning framework that applies data augmentation (Gaussian jittering and interpolation) to overcome the conventional constraints of limited experimental machining data, providing a highly accurate predictive tool. The experimental data were expanded using these data augmentation methods and modeled using five different machine learning algorithms (Extra Trees, Random Forest, Gradient Boosting, KNN, and AdaBoost). The results revealed that cooling pressure plays a dominant role, particularly in residual stress (importance score: 0.926) and cutting temperature (0.657). It was observed that high-pressure cooling (HPC) reduces thermal gradients, thereby lowering tensile stresses and improving surface integrity. When algorithm performances were compared, the Extra Trees and Random Forest models achieved the most accurate predictions after hyperparameter optimization. Specifically, the optimized Extra Trees regressor demonstrated exceptional predictive capability for residual stress, achieving an accuracy of 98.47%, a remarkably high coefficient of determination (R2 = 0.9997), and a minimal Mean Squared Error (MSE = 6.8289). These quantitative results confirm that the proposed machine learning framework provides a highly reliable and precise tool for controlling surface quality in HPC-assisted machining.

1. Introduction

Titanium alloys, which are widely used in the manufacture of critical components for the aerospace and defense industries, have become indispensable, especially in structural applications, thanks to their high strength-to-weight ratios, excellent corrosion resistance, and high-temperature performance. However, characteristic properties such as low thermal conductivity and high chemical reactivity concentrate heat in the cutting zone, placing these alloys in the “difficult to machine” category. As a result, excessive heat accumulation at the tool–chip interface leads to problems including accelerated tool wear, increased cutting temperature, unstable cutting forces, and the activation of surface/subsurface damage mechanisms [1]. This situation makes it necessary, in machining, to manage not only the efficiency of material removal but also the quality indicators that ultimately determine the component’s in-service performance, simultaneously.
In this context, the key concept is “surface integrity.” Surface integrity refers to the combined effects of parameters such as surface roughness, microstructural alterations, hardness gradients, and, especially, residual stress, and it plays a critical role in fatigue life and damage tolerance [2]. Therefore, research on the machining of titanium alloys aims not merely to improve surface quality in isolation, but rather to identify process windows in which interrelated outputs such as cutting force, cutting temperature, surface roughness, and residual stress can be controlled simultaneously [3,4].
Current aerospace requirements have driven increased use of near-β titanium alloys beyond the widely adopted Ti-6Al-4V grade. One prominent representative of this class, Ti-5Al-5V-5Mo-3Cr (Ti-5553), is preferred for structural components, such as landing gear, that require high toughness and fatigue resistance. However, the high strength and microstructural characteristics of Ti-5553 can lead to higher cutting loads and a greater tendency to heat accumulation during machining, thereby accelerating tool wear and increasing the risk to surface integrity [4,5]. Indeed, comparative studies between Ti-5553 and Ti-6Al-4V have reported pronounced differences in tool wear mechanisms, tribological behavior, and chip formation, underscoring the need for alloy-specific process planning. This requirement makes it essential to systematically examine the effects of machining parameters, particularly feed rate and cooling strategy, on surface integrity in Ti-5553 [5].
Among machining parameters, feed rate is a geometric factor that directly influences surface roughness; however, by altering cutting forces and heat generation, it also indirectly governs residual stress development and subsurface deformation [2,3]. Nevertheless, in titanium alloys, parameter effects cannot be accurately interpreted without considering the cooling strategy, since heat removal and lubrication/tribological conditions exert direct control over the process. Heat management and the friction regime at the tool–chip interface determine the tendency for adhesion and the stability of chip evacuation, thereby setting the force temperature levels and jointly influencing all surface-integrity outcomes [6]. For this reason, even under identical cutting conditions, meaningful performance differences can be observed under different cooling/lubrication regimes.
Within this framework, high-pressure coolant (HPC), which goes beyond conventional flood cooling, has emerged as a powerful approach for improving the machinability of titanium alloys [7]. By enhancing the penetration of the coolant jet into the tool–chip interface, HPC strengthens convective heat transfer, facilitates chip breaking and evacuation, and reduces thermal loads by altering contact conditions [8,9]. Mechanism-oriented studies indicate that the effectiveness of HPC depends on the interaction between the jet and the contact zone and on its ability to access the interface; therefore, HPC should be treated not merely as an auxiliary condition but as a process-shaping variable [10]. It has been reported that this effect is more pronounced in difficult-to-machine alloys such as Ti-5553, where cooling strategy-dependent, significant changes occur in temperature force levels and surface-integrity outcomes [11,12].
The discussion of cooling strategies also extends to alternatives such as cryogenic machining and minimum quantity lubrication (MQL), which have the potential to suppress heat more effectively. Studies comparing cryogenic machining, MQL, and flood cooling for Ti-5553 have revealed pronounced differences in cutting mechanics, chip morphology, and the extent of the surface/subsurface affected zone [13]. In addition, investigations that examine chip formation under cryogenic–MQL–HPC conditions through combined experimental and numerical approaches have enabled a more physics-based interpretation of how process variables contribute to the output responses. Consequently, the cooling strategy evolves from a secondary preference to a primary design variable, reshaping the relative influence of machining parameters on surface integrity [14].
Among surface-integrity indicators, residual stress plays a central role in fatigue performance and crack initiation. In titanium machining, residual stress formation is highly sensitive to cutting conditions and the cooling/lubrication regime, as it is governed by the complex interactions among thermal gradients, plastic deformation, and possible microstructural transformations [2,3,6]. Studies on Ti-5553 have shown that cooling strategy and parameter settings can significantly alter the magnitude and depth distribution of residual stress, as well as the extent of subsurface modification [15]. Nevertheless, residual stress should be considered alongside cutting temperature, cutting force, and surface roughness: temperature and force define the plastic-deformation and thermal-loading components, while surface roughness reflects the combined effects of kinematic effects and tribological–microstructural interactions [16]. This multi-output nature makes the search for a single “best parameter” challenging and inherently shifts the problem toward multivariate prediction and optimization approaches.
In the machining of titanium alloys, residual stress formation is governed by the combined effect of thermal and mechanical loads acting within the cutting zone. The final near-surface stress state results from the interaction between thermally induced tensile stresses, associated with intense heat generation and constrained cooling, and mechanically induced compressive stresses arising from severe plastic deformation beneath the tool edge. Accordingly, the residual stress state should be interpreted as a direct consequence of machining physics rather than as a purely empirical outcome. In this framework, cutting speed influences heat generation, strain rate, and frictional conditions at the tool–chip interface, whereas feed rate affects the material removal load and the severity of deformation. Coolant pressure, in turn, directly modifies heat dissipation, lubrication efficiency, and chip evacuation. Under high-pressure coolant (HPC) conditions, the coolant jet can penetrate the tool–chip interface more effectively, thereby reducing thermal accumulation and frictional effects and altering the thermo-mechanical balance responsible for residual stress development. For this reason, data-driven residual stress predictions should be supported by the underlying physics of the machining process.
At this point, two main modeling directions stand out in the literature: physics-based analytical/numerical models and data-driven machine learning (ML) approaches. While classical predictive methods, such as mechanistic models, multiple linear regression, and Response Surface Methodology (RSM), have historically provided foundational insights, they often exhibit limitations in capturing the highly nonlinear, multi-physics nature of surface integrity generation under extreme conditions such as HPC. For outputs such as surface integrity, which are inherently multiscale and governed by multiple physical phenomena, the use of ML-based methods has been increasing due to the computational cost and limited generalizability of purely physics-based models [17].
Reviews on AI/ML applications in machining report that regression, artificial neural networks, tree-based methods, and deep learning approaches have become widespread for predicting cutting force, temperature, tool wear, and surface quality [18,19]. Examples such as real-time prediction of surface roughness via sensor-integrated monitoring also demonstrate the industrial applicability of ML-assisted prognostics [20,21]. More specifically, ML frameworks for surface integrity prediction aim to jointly model not only roughness but also subsurface indicators, such as hardness variation, affected layer depth, and residual stress, thereby enhancing decision-support capability for process quality assurance [22,23,24]. Indeed, in the context of residual stress prediction, early studies reported that neural networks could estimate subsurface stress distributions, whereas more recent work has sought to improve accuracy and robustness through hybrid and advanced regression approaches [25,26]. Similarly, modeling residual stress under different material removal conditions using radial basis function networks supports the view that ML provides a practical tool for predicting this output [27].
Recent studies have increasingly emphasized that data-driven manufacturing models become more reliable and interpretable when supported by process-specific physical knowledge. For example, Shen et al. proposed a multi-physics-constrained bi-layer machine learning framework for electrochemical machining, demonstrating that integrating alloy electrochemistry and process physics can improve predictive capability and process control in a complex manufacturing environment [28]. Likewise, Parida and Maity [29] developed an augmented machine learning approach for additive manufacturing, showing that the incorporation of physically meaningful process descriptors can accelerate prediction of process–structure relationships and enhance model relevance for manufacturing applications. These recent studies indicate that the current trend in machine learning-assisted manufacturing is moving beyond purely empirical prediction toward hybrid frameworks that incorporate physics-informed or physics-constrained strategies, thereby improving model robustness and interpretability. In this context, the present study adopts the same general perspective by evaluating machine learning results alongside the thermomechanical principles governing machining-induced surface integrity.
Despite the growing application of machine learning in manufacturing, a critical gap remains in the literature regarding the machining of near-β titanium alloys such as Ti-5553. Most existing studies have either focused on optimizing a single surface integrity metric in isolation (e.g., solely surface roughness) or have evaluated conventional cooling methods. Furthermore, while robust ML models typically require substantial datasets, conducting extensive experimental runs on expensive, difficult-to-machine alloys is often constrained by practical and economic considerations. Consequently, there is a lack of comprehensive studies that simultaneously model interrelated thermomechanical outputs (residual stress, cutting temperature, cutting force, and surface roughness) under HPC conditions while systematically overcoming limited data constraints through advanced data augmentation techniques.
Motivated by this need, the present study focuses on high-pressure coolant-assisted machining of the Ti-5553 titanium alloy. The aims are to (i) analyze surface integrity indicators such as residual stress, cutting temperature, cutting force, and surface roughness as interrelated outputs; (ii) identify the process parameters governing these outputs and quantitatively reveal their relative effect levels; and (iii) develop machine learning-based models capable of predicting surface integrity responses under different machining conditions.
While the previous literature has extensively investigated the independent effects of machining parameters, this study makes a novel contribution with an integrated, multi-output approach under high-pressure coolant (HPC) conditions. Specifically, this research distinguishes itself from existing studies through the following original contributions:
Integrated Multi-Output Evaluation: This study analyzes residual stress (the internal force retained in a material after machining), cutting temperature (the heat produced at the cutting zone), cutting force (the force required for the cutting process), and surface roughness (the texture of the machined surface) together as interrelated surface integrity indicators, rather than treating them as isolated metrics.
Quantitative Parameter Assessment: This study offers a data-driven, quantitative determination of feature importance. It clearly reveals the dominant role of the cooling strategy over kinematic parameters (cutting speed and feed rate) in thermo-mechanical outcomes.
Advanced Predictive Modeling Framework: Implementing a robust machine learning strategy that uniquely utilizes data augmentation techniques, including Gaussian jittering (adding random noise based on a normal distribution to data points) and interpolation (generating new data points between existing samples), to expand limited experimental data. This approach significantly improves the predictive accuracy of ensemble models (such as Extra Trees and Random Forests) without overfitting.
This study bridges the gap between physical–mechanistic machining principles and data-driven manufacturing. It provides a highly reliable, practical framework for predicting and optimizing surface integrity during HPC-assisted machining of near-β titanium alloys.

2. Materials and Methods

Figure 1 illustrates the integrated experimental and machine learning workflow of this study. First, the near-β Ti-5553 alloy was machined under various cutting conditions to measure key surface integrity indicators: residual stress, cutting force, temperature, and surface roughness. The resulting dataset was then expanded via data augmentation and analyzed for feature importance. Finally, five machine learning algorithms (Extra Trees, Random Forest, Gradient Boosting, KNN, and AdaBoost) were trained on the augmented data and evaluated using standard statistical metrics to identify the most accurate predictive model.

2.1. Material

In this study, Ti-5553 (Ti-5Al-5Mo-5V-3Cr), a metastable β titanium alloy, was used as the workpiece material. Owing to its high strength, good toughness, and widespread use in aerospace applications, this alloy is considered a “difficult-to-machine” titanium alloy. Due to the presence of β-stabilizing elements (Mo, V, Cr), the alloy can retain a stable β phase at room temperature; depending on the applied heat treatment, α/α″ precipitates may be observed within the β matrix [30]. This phase balance plays a decisive role in tool wear, cutting forces, and surface-integrity outcomes under the high temperatures and strains generated during machining. The nominal chemical composition of the Ti-5553 alloy used in this work is presented in Table 1 below. Owing to its relatively high content of β-stabilizing elements (Mo, V, Cr), this alloy exhibits higher toughness and strength compared with the conventional Ti-6Al-4V alloy.
In the experimental study, Ti-5Al-5Mo-5V-3Cr was used as the workpiece material. The workpiece was a cylindrical specimen measuring Ø80 × 445 mm. The Ti-5553 specimens, procured in the heat-treated condition, had a density of 4.650 g/cm3. As part of the preliminary preparation for the experiments, center holes were drilled into the material. Before testing, the workpiece weighed approximately 10.239 kg. The Ti-5553 alloy was supplied within the framework of collaborative project activities with a defense industry company operating in Türkiye, as presented in Figure 2.
Hardness measurements were performed at the SDU-YETEM Laboratory using a TTS Matsuzawa HWMMT-X3 Micro Vickers hardness testing device (Matsuzawa Co., Ltd., Akita, Japan). Prior to the tests, the specimens were cut and mounted in resin for metallographic preparation. The measurement surfaces were sequentially ground using progressively finer SiC abrasive papers and subsequently polished using diamond suspensions to obtain a flat, scratch-free, mirror-like surface suitable for indentation analysis. In the Vickers measurements, a load of 25 g was applied for a dwell time of 15 s. For the hardness calculation, measurements were taken from eight regions of each specimen, and the average of these values was reported as the representative hardness. The graph of the average hardness of the unmachined specimen under initial conditions is shown in Figure 3.
Machining of the Ti-5553 alloy was performed on a CNC lathe under three conditions: dry cutting, conventional cooling (0.6 MPa), and high-pressure cooling (30 MPa). In these experiments, PVD-coated cutting tools with a [TiN + Al2O3 + Ti(C, N)] coating composition were used in conjunction with jet-nozzle toolholders. Table 2 below presents the mechanical and physical properties of the Ti-5553 alloy.

2.2. Methods

2.2.1. Machining Parameters and High-Pressure Cooling System

For machining the difficult-to-machine Ti-5553 near-β alloy, the cutting parameters and tooling/cooling conditions, such as cutting speed, feed rate, and depth of cut, were selected based on the commonly reported parameter ranges in the literature. For the turning operation, a three-level experimental design was planned for three factors. The depth of cut (DoC) was kept constant at 1 mm. To determine the parameter and variable combinations, a General Full Factorial Design was adopted, yielding 27 tests. The equipment, methods, and parameters used in the tests are presented in Table 3.
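For illustration, the 27-run full factorial design can be generated programmatically; a minimal Python sketch is given below. Only the cooling-pressure levels (dry, 0.6 MPa conventional, 30 MPa HPC) and the 50/80/120 m/min cutting speeds are taken from the text, while the feed-rate levels are placeholders standing in for the values listed in Table 3.

from itertools import product
import pandas as pd

cutting_speed = [50, 80, 120]        # m/min (levels reported in the text)
feed_rate = [0.10, 0.15, 0.20]       # mm/rev (placeholder levels; see Table 3)
cooling_pressure = [0.0, 0.6, 30.0]  # MPa: dry, conventional, HPC

design = pd.DataFrame(list(product(cutting_speed, feed_rate, cooling_pressure)),
                      columns=["cutting_speed", "feed_rate", "cooling_pressure"])
print(len(design))  # 27 runs; depth of cut fixed at 1 mm in all of them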
High-pressure coolant (HPC) is a cooling strategy used in the machining of difficult-to-cut materials to enhance heat removal from the cutting zone and improve lubrication. The high-velocity jet directed into the cutting region generates turbulent flow, increasing convective heat transfer and enabling faster heat dissipation. While a boiling film layer that may form on the heated tool surface can limit heat removal, a high-pressure jet can penetrate this layer, thereby improving cooling effectiveness.
Applying the jet to the tool–chip and tool–workpiece interfaces can lift the chip through a hydraulic wedge effect. This allows the coolant to reach regions closer to the cutting edge. Directing the jet toward the rake face (free surface) supports lubrication by creating pressure in the clearance region. However, beyond a critical HPC pressure, gains in cooling and tool life saturate. Excessive pressure mainly increases fluid loss.
Before each cutting test, the worn tool was replaced with a new one to preserve the integrity of the machined surface. The cutting fluid used in the experiments was a water-soluble, chemical-based oil (Swisslube BCool 650) at 5% concentration, injected at a low angle (close to the tool rake angle of about 5–6°) toward the chip–tool interface.
A schematic representation of the machining process is provided in Figure 4 below; the figure also brings together the machining and measurement procedures associated with the characteristics listed in Table 3. The image depicts the near-β titanium alloy mounted in the chuck of an ALEX ANL-75 CNC lathe, together with the cutting tool and the coolant nozzle; the toolholder is secured to a dynamometer. In this schematic, a measuring device positioned on the workpiece surface represents the region where surface roughness measurements are taken. In addition, for temperature measurement in the cutting zone, the figure highlights that a blind hole was drilled into the tool and a thermocouple was placed inside. Moreover, a representation of residual stress measurement by X-ray diffraction (XRD), based on Bragg’s law, is included, illustrating the determination of residual stresses at the machined surface. Here, a single surface measurement was taken, and the magnitude and type of residual stress were calculated. Brief information on these processes is provided under the subsection headings below.

2.2.2. Residual Stress Measurement by X-Ray Diffraction (XRD)

For residual stress analysis, measurements were taken from the surfaces of the machined specimens for each set of cutting parameters. For this purpose, a GE–Seifert 3003 PTS X-ray diffraction (XRD) system was used. The analyses were carried out at the MSMM Laboratory of Atılım University (Ankara, Türkiye).
Residual stress measurements were performed point-by-point using an X-ray diffractometer equipped with Cr Kα radiation, a V filter, and a 0.2 mm collimator. The X-ray tube was operated at 40 kV and 40 mA, delivering 1.6 kW. For the evaluation of the Ti-5553 material, the {2 2 0} crystallographic plane family of the face-centered cubic NiCo phase, with a stress-free diffraction angle of 2θ = 133.53°, was used. Diffracted X-rays were detected using a position-sensitive Meteor1D detector, scanning the 2θ range from 120° to 135° with a precision of 0.001°. To determine all elements of the stress tensor, measurements were conducted in 3 azimuthal directions (ϕ = 0°, 45°, 90°) across 7 tilt angles (ψ = −45.0000°, −35.2640°, −24.095°, 0.0000°, +24.095°, +35.2640°, +45.0000°). For each measurement, data were averaged over a 120 s collection time. The results were analyzed for elliptical sin2ψ behavior using Bragg’s law, allowing the calculation of the stresses in the axial and tangential directions, the complete strain and stress tensors, the principal strain and stress tensors, the angles of the principal stress vector with the sample axes, and the full width at half maximum (FWHM) values. For the stress calculations, the applied elastic constants were Young’s modulus E = 1.15 × 105 MPa and Poisson’s ratio ν = 0.33. The approximate penetration depth of the Cr Kα radiation was ~3–5 µm, and the estimated measurement uncertainty was ±15 MPa.
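As a minimal illustration of the sin2ψ evaluation, the sketch below fits lattice spacing against sin2ψ for a single azimuth and converts the slope of the fit into a surface stress using the elastic constants stated above. The lattice-spacing values are hypothetical, and the actual analysis fits the full elliptical sin2ψ behavior over three azimuths and seven tilt angles.

import numpy as np

E = 1.15e5   # MPa, Young's modulus used in the study
nu = 0.33    # Poisson's ratio used in the study

psi_deg = np.array([-45.0, -35.264, -24.095, 0.0, 24.095, 35.264, 45.0])
d_psi = np.array([1.1700, 1.1702, 1.1704, 1.1707, 1.1704, 1.1702, 1.1700])  # Å, hypothetical spacings

x = np.sin(np.radians(psi_deg)) ** 2
slope, d0 = np.polyfit(x, d_psi, 1)        # linear d vs. sin2(psi) fit; intercept ~ unstressed spacing
sigma_phi = (slope / d0) * E / (1.0 + nu)  # surface stress in the phi direction, MPa
print(f"sigma_phi = {sigma_phi:.1f} MPa")  # negative value -> compressive stress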

2.2.3. Cutting Force Measurement

In the experiments, a three-axis dynamometer (Kistler Type 9257B, a three-component piezoelectric dynamometer manufactured by the Kistler Group, Winterthur, Switzerland) and a data-acquisition card (DAQ 6062E, National Instruments, Austin, TX, USA) were used to measure cutting force signals. The forces generated during cutting were processed using CutPro software (developed at the Manufacturing Automation Laboratory (MAL), The University of British Columbia). Among the forces recorded in three axes, the main cutting force (Fc) data were included in the machine learning dataset. Since this force (Fc) is dominant over the other components (Fp: radial force and Ff: feed force), the latter were neglected, and only a single force dataset was considered.

2.2.4. Cutting Temperature Measurement

Due to the low thermal conductivity (~6.7 W/m·K) and high chemical stability of the Ti-5553 alloy, high-performance cutting tools were selected for the experiments, enabling a detailed analysis of the pronounced heat accumulation generated in the cutting zone. PVD-coated inserts with a [TiN + Al2O3 + Ti(C, N)] coating composition were used, paired with a PCLNL2525M12JET “Jet Stream” (nozzle-type) toolholder manufactured by Seco Tools (Fagersta, Sweden). One of the focal points of the study is to observe the effect of HPC in the tests and to determine the temperature generated during machining of the Ti-5553 alloy. Measuring the machining temperature with a thermal camera or a pyrometer is not feasible here, because water vapor and coolant spray present in the cutting zone under high-pressure conditions make such measurements unreliable. In this case, the thermocouple method emerges as an appropriate measurement technique. To obtain the closest temperature reading quickly and with minimal loss, blind holes were drilled as close as possible to the tool’s cutting edge. K-type thermocouples were placed in these holes, and the remaining voids were filled with thermally conductive paste. A schematic illustration of this procedure is provided in Figure 4.

2.2.5. Surface Roughness Measurement

Surface roughness was measured under various cutting parameters using a HommelWerke T 500 surface roughness tester (Hommelwerke GmbH, Villingen-Schwenningen, Germany) with a diamond stylus and 0.01 μm resolution. The measurement settings were as follows:
Sampling (cut-off) length (Lc) = 0.8 mm
Evaluation/measurement length (Lm) = 5 × Lc = 5 × 0.8 = 4 mm
Total traverse length (Lt) = 4.8 mm
After machining, surface roughness was measured at three locations on the cylindrical workpiece. The final value was the average of these three readings. Figure 4 illustrates the procedure.

2.3. Machine Learning Methodology

2.3.1. Dataset

The four machining outputs related to surface integrity mentioned above were entered into their corresponding columns in the predefined experimental design table, so that they could be used for training the machine learning prediction models (Table 4). In this study, 27 experimental combinations were tested, generated using three cutting parameters at three levels each. Based on the test results, four surface integrity responses were experimentally observed during dry, conventional, and HPC-assisted machining of Ti-5553. Using these results, the relative influence of the machining parameters and their levels was analyzed with various machine learning algorithms, which were then used to predict the surface-integrity responses. The most suitable models for this prediction and their success rates were determined by validating the results with several performance metrics.
The original dataset consisted of 27 experimental runs from a full-factorial design. To ensure reliable evaluation of the machine learning models and prevent potential data leakage, the dataset was split into training (80%) and test (20%) subsets prior to model training. The test subset was kept completely independent and was used only for model evaluation. This procedure ensured that the predictive performance metrics were obtained from unseen data.
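A minimal sketch of this leakage-free protocol is shown below, assuming the 27 experiments are held in a pandas DataFrame named df whose column names (placeholders here) contain the three machining inputs and one response, such as residual stress.

from sklearn.model_selection import train_test_split

features = ["cutting_speed", "cooling_pressure", "feed_rate"]
target = "residual_stress"

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.20, random_state=42)

# Augmentation (Section 2.3.2) is applied to (X_train, y_train) only;
# (X_test, y_test) is left untouched so that the metrics reflect unseen experiments.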

2.3.2. Dataset Augmentation

In this study, a data augmentation process was applied to a dataset containing continuous variables. Since SMOTE (Synthetic Minority Over-sampling Technique) can only be used for classification problems, Gaussian jittering and sample-based interpolation techniques were used for this dataset.
The Gaussian jittering method aims to improve the model’s generalization capacity. It does this by adding small-scale, normally distributed random noise to existing data points to expand the dataset. In this process, the minimum and maximum values of each variable are calculated, and random deviations, determined by the noise level, are applied so that the newly generated data points maintain the original distribution. However, sufficient diversity cannot be achieved with Gaussian jittering alone. Therefore, random sampling and linear interpolation techniques were also used to expand the dataset in a more balanced way. The random sampling method introduces small-scale changes to samples selected from existing data points, and filling the gaps between data points with interpolation yields a more homogeneous distribution. Integrating these techniques reduces the risk of overfitting by increasing dataset diversity while preserving its statistical properties, and it improves the model’s generalization performance. The results show that the augmented dataset remains largely faithful to the original distribution and provides a more robust training structure. The execution of the data augmentation method is shown as pseudo code in Algorithm 1. To avoid data leakage during model development, data augmentation was applied only during training. The test dataset remained unchanged and consisted solely of the original experimental observations. This approach ensured that evaluation metrics represent the models’ predictive capability on unseen experimental data.
Algorithm 1: Synthetic data generation pseudo code
if additional_samples_needed > 0:
    additional_data = []
    for i in range(additional_samples_needed):
        sample = dataset.random_sample(1)
        jittered_sample = sample.copy()
        for column in dataset.numerical_columns():
            jittered_sample[column] = add_jitter(jittered_sample[column])
        additional_data.append(jittered_sample)
    dataset = dataset.append(additional_data)
save_excel(dataset, "augmented_dataset.xlsx")
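The following sketch complements Algorithm 1 with runnable versions of the two augmentation steps (Gaussian jittering and sample-based linear interpolation). It assumes the training rows, inputs and response together, are stored in a pandas DataFrame named train_df; the noise level and the number of synthetic rows are illustrative choices, not the exact settings used in the study.

import numpy as np
import pandas as pd

def add_jitter(col, noise_level=0.02, rng=None):
    # Gaussian jittering: add zero-mean noise scaled to the column's value range.
    rng = np.random.default_rng() if rng is None else rng
    scale = noise_level * (col.max() - col.min())
    return col + rng.normal(0.0, scale, size=len(col))

def interpolate_samples(train_df, n_new, seed=42):
    # Generate n_new rows lying on straight lines between random pairs of existing rows.
    rng = np.random.default_rng(seed)
    values = train_df.to_numpy(dtype=float)
    new_rows = []
    for _ in range(n_new):
        i, j = rng.choice(len(values), size=2, replace=False)
        t = rng.uniform(0.0, 1.0)  # position between the two parent samples
        new_rows.append(values[i] + t * (values[j] - values[i]))
    return pd.DataFrame(new_rows, columns=train_df.columns)

# Example: jitter the numerical columns and append 60 interpolated rows.
jittered = train_df.apply(add_jitter)
augmented = pd.concat([train_df, jittered, interpolate_samples(train_df, n_new=60)],
                      ignore_index=True)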

2.3.3. Performance Evaluation of Machine Learning Algorithms and Prediction Results

In this section, the accuracy, error, and prediction values obtained from the machine learning algorithms prepared for prediction are presented both numerically and graphically. Each algorithm tested for training is analyzed under a separate heading, with basic, brief information about the algorithm and the results obtained. The dataset for training was divided into 80% training data and 20% test data. The model performance was evaluated using the independent test dataset. This evaluation strategy allowed the predictive ability of the models to be assessed on unseen observations and provided a reliable estimate of their generalization performance.
Extra Trees Regressor Algorithm
The Extra Trees (Extremely Randomized Trees) algorithm is a machine learning method used to solve classification and regression problems. Extra Trees is based on the decisions of an ensemble of trees and, like the Random Forest algorithm, uses many trees in its working logic. However, unlike Random Forest, Extra Trees introduces additional randomness when constructing the trees [31,32]. In Equation (1), G(x, θr) denotes the r-th prediction tree, where θr is a vector of independent uniform random parameters assigned before the tree grows. All R trees are combined and averaged into the tree ensemble G(x), following the formulation of Ko and Yin [31] (Equation (1)) [33].
G(x, \theta_1, \ldots, \theta_R) = \frac{1}{R} \sum_{r=1}^{R} G(x, \theta_r)
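A minimal sketch of an optimized Extra Trees regressor is given below. The grid-search procedure and the hyperparameter ranges are illustrative assumptions, since the study reports optimized hyperparameters without prescribing a specific search strategy; X_train_aug and y_train_aug denote the augmented training data from Section 2.3.2.

from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [100, 300, 500],
              "max_depth": [None, 10, 20],
              "min_samples_split": [2, 5]}

search = GridSearchCV(ExtraTreesRegressor(random_state=42), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X_train_aug, y_train_aug)             # augmented training data only
y_pred = search.best_estimator_.predict(X_test)  # untouched experimental test set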
K-Nearest Neighbors Regressor Algorithm
The K-Nearest Neighbors (KNN) algorithm is a heuristic, sampling-based machine learning method widely used for classification and regression. This algorithm predicts a new data point based on its spatial proximity and infers its label or value from the labels or values of neighboring points.
KNN is considered a lazy learning method; that is, the model learns only by storing the data, without requiring parameter optimization during training. In the model prediction phase, k nearest neighbors are determined using distance measures such as Euclidean, Manhattan, or Minkowski distances, and these points are averaged in the regression scenario or predicted using a weighted summation method.
The KNN algorithm is very successful, especially for small, well-distributed data sets, but it has disadvantages, such as increased computational cost in high-dimensional data sets (the Curse of Dimensionality) and the need to select the appropriate k value. The model’s performance may vary depending on factors such as data normalization, the choice of distance metric, and the determination of the number of neighbors.
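Because the three machining inputs have very different magnitudes (m/min, MPa, mm/rev), a distance-based learner is usually wrapped in a scaling step. The sketch below illustrates one such configuration; the neighbour count and distance settings are assumed values rather than the ones tuned in the study.

from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

knn = make_pipeline(
    StandardScaler(),                                             # normalize the machining inputs
    KNeighborsRegressor(n_neighbors=5, weights="distance", p=2))  # p=2 -> Euclidean distance
knn.fit(X_train_aug, y_train_aug)
y_pred_knn = knn.predict(X_test)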
Gradient Boosting Regressor Algorithm
Gradient Boosting (GB) is a machine learning algorithm that builds a strong predictive model by successively combining weak predictors and has shown high success, especially in regression and classification problems.
This method is based on minimizing residual errors. The algorithm starts the prediction process with a base model, such as a decision tree, analyzes the model’s errors, and trains new trees to correct them. At each iteration, the model follows the negative gradient of the error function, so that the optimization advances in the direction that reduces the remaining errors. The learning rate controls the size of each update in this incremental optimization process and, together with the tree depth and the number of weak learners, constitutes the critical hyperparameters governing performance. Gradient Boosting is especially effective for large, complex datasets because it can improve generalization by learning from the residual errors of previous stages. However, its high computational cost and tendency to overfit require careful parameterization.
Random Forest Regressor Algorithm
RF is a supervised machine learning algorithm that combines multiple decision trees to build a more powerful, more generalizable model. The problem of balancing variance and bias, which is the essential weakness of individual decision trees, is largely overcome with this method. RF works by training each tree on a random subset of data and features, and its final prediction is the average of all trees’ predictions in regression or the majority vote in classification. This makes the model strong and resistant to overfitting. Advantages of RF include high accuracy, strong generalization on large datasets, and robustness to outliers. However, on large datasets, the computational time may be higher than for iterative models such as Gradient Boosting, and the model’s computational load increases with the number of decision trees. Nevertheless, model performance can be further improved by parameter optimization (e.g., number of trees, maximum depth, and attribute selection strategies).
Adaboost Regressor Algorithm
AdaBoost (Adaptive Boosting) is an ensemble learning algorithm that builds a strong model by successively combining weak predictors. The basic principle of AdaBoost is that, at each iteration, new models are built by assigning greater weight to the errors of the previous model.
The model is first trained with a base learner (usually a weak decision tree), and the prediction errors are analyzed, with the weights of incorrectly predicted instances increased. Then, the new model is trained to incorporate these weights, and the process is repeated, resulting in a stronger model. Its main advantages include high accuracy and robustness to noise, since it is a mistake-learning-oriented approach.
It remains sensitive to the base learner’s performance and to hyperparameters. When AdaBoost uses shallow decision trees, specifically decision stumps, as weak learners, the model is relatively resistant to overfitting; however, possible disadvantages include lower accuracy and higher computational cost for complex datasets. In summary, AdaBoost is effective for small- to medium-sized datasets and has been a successful strategy for improving model generalization performance.
Evaluation of All Models
Error metrics are used to evaluate how well the machine learning algorithms perform: they assess how closely a model’s predictions match the true values and indicate the generalization ability of the model. Low MAE, MSE, and RMSE values are therefore important numerical evidence supporting the success of a model. Table 1 shows the comparative results of the R2, MAE, MSE, RMSE, Accuracy, corr_coef, and std_dev metrics of the 5 different machine learning models.
Root mean square error (RMSE) was chosen to compare the prediction errors of different trained models. The closer the RMSE value is to 0, the better the predictive ability of the model in terms of its absolute deviation. The RMSE value is calculated by Equation (2) [33,34].
RMSE = \sqrt{\frac{1}{n} \sum_{r=1}^{n} \left( Pd_{r,m} - Pd_{r,c} \right)^{2}}
The coefficient of determination (R2) is used to estimate model efficiency and is calculated by Equation (3) [33].
R^{2} = 1 - \frac{\sum_{r=1}^{n} \left( Pd_{r,m} - Pd_{r,c} \right)^{2}}{\sum_{r=1}^{n} \left( Pd_{r,m} - \overline{Pd}_{m} \right)^{2}}
MSE also assesses the quality of an estimator. The MSE metric is calculated by Equation (4).
MSE = \frac{1}{n} \sum_{r=1}^{n} \left( Pd_{r,m} - Pd_{r,c} \right)^{2}
MAE assesses the quality of an estimator. The MAE metric is calculated by Equation (5) [33,35,36].
MAE = \frac{1}{n} \sum_{r=1}^{n} \left| Pd_{r,m} - Pd_{r,c} \right|
In this study, the “Accuracy” metric was used as a complementary performance indicator to express the closeness between predicted and experimental values. The accuracy value was derived from the prediction error and reflects the percentage similarity between predicted and actual responses. Although R2, MSE, MAE, and RMSE are the primary evaluation metrics for regression models, an accuracy indicator was included to provide an additional, intuitive interpretation of prediction performance.
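The sketch below illustrates how the five regressors can be scored on the held-out test set with these metrics. The Accuracy column is implemented here as 100 × (1 − mean absolute percentage error), which is one plausible reading of the percentage-similarity definition above; it assumes the measured target values are non-zero.

import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.neighbors import KNeighborsRegressor

models = {"Extra Trees": ExtraTreesRegressor(random_state=42),
          "Random Forest": RandomForestRegressor(random_state=42),
          "Gradient Boosting": GradientBoostingRegressor(random_state=42),
          "KNN": KNeighborsRegressor(n_neighbors=5),
          "AdaBoost": AdaBoostRegressor(random_state=42)}

for name, model in models.items():
    model.fit(X_train_aug, y_train_aug)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    acc = 100.0 * (1.0 - np.mean(np.abs((y_test - y_pred) / y_test)))  # requires y_test != 0
    print(f"{name:>17s}  R2={r2:.4f}  MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  Acc={acc:.2f}%")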

3. Analysis and Results

3.1. Experimental Test Results and Analysis

3.1.1. Residual Stress Results

The graphs in Figure 5 present representative examples used to determine the residual stresses on the surface of the near-β titanium alloy by X-ray diffraction (XRD) within the test plan, which consisted of 27 experimental conditions. Since coolant pressure was considered one of the main influencing parameters in this study, representative peak deconvolution and sin2ψ plots used to calculate surface residual stress values under dry machining, conventional cooling, and high-pressure coolant-assisted machining conditions are provided. The graphs on the left illustrate the deconvolution of the measured diffraction peaks and the determination of the peak center, whereas the graphs on the right show the sin2ψ relationship obtained from these peak positions and the linear trend used as the basis for residual stress calculation.
Since presenting all XRD graphs for the complete set of 27 experimental conditions would require a very large amount of space, only representative graphs corresponding to specific parameter combinations were included here. The purpose of this approach is to clearly demonstrate the measurement and calculation methodology used and to show how the residual stress values were determined. Similarly, for temperature, cutting force, and surface roughness results, representative graphs corresponding to selected parameter combinations were presented instead of the full experimental set, thereby providing a reference to the related measurement, analysis, and calculation procedures.
The residual stress values in the dataset used in this study were obtained solely from surface measurements. The primary reason for this approach is that the single-point surface residual stress was used as a scalar metric, a methodological necessity for developing machine learning prediction models. In data-driven manufacturing research, supervised regression algorithms (such as Extra Trees, Random Forests, and Gradient Boosting) require a discrete target variable for each experimental condition to be trained effectively. Therefore, similar to the other test results, a single residual stress value was included in the dataset for each experimental condition. Figure 6 presents a three-dimensional surface plot of residual stress variation under dry cutting conditions as a function of cutting speed and feed rate.

3.1.2. Cutting Force Results

During the cutting force measurements, the signals from the three-axis dynamometer were converted into force–time graphs using a DAQ card and the CutPro interface software. The measured cutting force components were the tangential or main cutting force (Fc), the passive or radial cutting force (Fp), and the feed force (Ff).
For the determination of the main cutting force value, a region considered to be appropriate (stable) was selected for each test parameter set, and the average force value within this interval was calculated. These values were then recorded in the dataset table constructed according to the experimental design presented in Table 4. The same procedure was repeated for all 27 tests. An example of a cutting force signal corresponding to one of the experimental parameters is shown in Figure 7.

3.1.3. Cutting Temperature Results

In the tests, the maximum cutting temperature value obtained for each experiment was recorded in the dataset table prepared according to Table 4. In addition, Figure 8 presents a representative temperature variation graph obtained during the cutting experiment. The graph, corresponding to the same test parameters, illustrates the temperature variation over a one-minute machining period. To record a single temperature value in the relevant cell of the table, a stable temperature interval was identified, and the average temperature within this range was calculated.

3.1.4. Surface Roughness Measurement Results

Figure 9 presents the graph of surface roughness values of the machined material surface obtained under the cutting conditions of 50 m/min cutting speed, dry cutting, and 0.15 mm/rev feed rate, which are among the 27 test parameters. As with the hardness measurements, eight measurement points (Mp) were selected, and the average value was calculated to obtain a representative surface roughness (Ra) value.

3.2. Data Augmentation and Dataset Validation Analyses

The 27-row dataset in the current study was expanded to 120 rows using data augmentation methods. Figure 10 presents the distribution consistency and correlation plots between the original and augmented datasets for the four surface-integrity outputs. For each output, the first plot shows the original data, and the second shows the augmented data.
In Figure 10a, the data augmentation process has been largely successful in increasing data diversity while preserving the overall statistical characteristics of the original dataset. In particular, Gaussian jittering and interpolation techniques were used to expand the data points and increase their density for the variables cutting speed, cooling pressure, and feed rate. The distributional structure of cutting speed, cooling pressure, feed rate, and residual stress closely resembles that of the original dataset. The positions of the data points and the inter-variable relationship patterns were largely preserved, indicating that the data augmentation method was applied successfully.
In Figure 10b, the distributional structure of cutting speed, cooling pressure, feed rate, and cutting force is likewise highly similar to that of the original dataset. The locations of the data points and the relationships among variables were maintained to a significant extent, demonstrating the successful implementation of the data augmentation. In Figure 10c, the distribution of cutting speed, cooling pressure, feed rate, and cutting temperature in the augmented dataset is preserved very similarly to that in the original data. The positions of the data points and the correlation patterns were retained to a meaningful degree, confirming that the augmentation method was applied effectively. Compared to the original data, the augmented dataset does not exhibit artificial distortion or excessive concentration, indicating that no artificial clustering was introduced that could cause imbalance during model training. Overall, data augmentation expanded the dataset without disrupting the original data’s statistical structure, resulting in a balanced dataset that enables the model to generalize better.
Similarly, in Figure 10d, the distributional structure of cutting speed, cooling pressure, feed rate, and surface roughness in the augmented dataset was preserved in a manner comparable to that of the original data, and the overall distribution of data points remained largely unchanged. The augmented dataset does not exhibit excessive deviations relative to the original dataset; instead, it was expanded to increase generalization capacity by introducing diversity in regions near the existing data points. Moreover, the preservation of the relationship structure between certain variable pairs and the absence of artificial clustering or abnormal distributional shifts indicate that the method was applied in a balanced manner. Consequently, the data augmentation was performed without altering the statistical structure and allowed the model to be trained on a more balanced and generalizable dataset.
PCA analysis is of critical importance for visualizing the distributional similarities and differences between the original and augmented data by capturing the main variations in high-dimensional datasets. In addition, it reduces the dataset to a low-dimensional representation using principal components, enabling an assessment of whether the data augmentation process distorts the original data structure and how it influences the model’s learning. The PCA plots obtained for the prepared dataset are presented in Figure 11.
The PCA projection plot was used to evaluate the distributional similarities and differences between the original and augmented datasets. As shown in Figure 11a, the plot indicates that the augmented data points largely overlap with the original data points, implying that the data augmentation process preserves the original data structure. In particular, the new data points generated via Gaussian jittering and interpolation expand the dataset without disrupting the original clustering structure and do not include excessive deviations. The tight distribution of augmented data points around the original points is expected to reduce the risk of overfitting while improving the model’s generalization ability.
Similarly, when the distributions in Figure 11b are examined, the near one-to-one correspondence of the points shows that the augmentation method did not cause extreme deviations and that the statistical structure remained largely unchanged. Consequently, the data augmentation process expanded the dataset without compromising the original structure, resulting in a consistent and balanced dataset for model training.
In Figure 11c, the fact that the augmented data points largely overlap with the original points demonstrates that the augmentation process preserved the original structure and did not lead to a pronounced shift. The near-exact matching of data points indicates that the applied jittering and interpolation methods did not meaningfully alter the statistical distribution and did not generate artificial variations that could negatively affect the learning process.
Finally, Figure 11d also shows that the data augmentation methods (Gaussian jittering and interpolation) increased diversity without causing excessive deviations and expanded the dataset in a manner that does not impair generalization capacity. Overall, the data augmentation process enabled the model to be trained on a more balanced and comprehensive distribution without distorting the structure of the original dataset.
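A minimal sketch of this overlap check is given below, assuming the original and augmented observations are stored in two numeric DataFrames with identical columns (original_df and augmented_df are placeholder names). The principal components are fitted on the original data only, and both sets are projected into the same two-dimensional plane.

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(original_df)            # scaling fitted on the original data only
pca = PCA(n_components=2).fit(scaler.transform(original_df))

orig_2d = pca.transform(scaler.transform(original_df))
aug_2d = pca.transform(scaler.transform(augmented_df))

plt.scatter(orig_2d[:, 0], orig_2d[:, 1], marker="o", label="original")
plt.scatter(aug_2d[:, 0], aug_2d[:, 1], marker="x", alpha=0.5, label="augmented")
plt.xlabel("PC1"); plt.ylabel("PC2"); plt.legend(); plt.show()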

3.3. Feature Importance for All Datasets

Feature importance values were obtained using the intrinsic importance calculation of tree-based ensemble models. In algorithms such as Extra Trees and Random Forests, feature importance is computed as the average reduction in prediction error (variance reduction) contributed by each variable across all decision trees in the ensemble. This approach provides an estimate of the relative contribution of each machining parameter to the model predictions.
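As a brief illustration, the sketch below reads these impurity-based importances from a fitted Extra Trees model; the same attribute is available for RandomForestRegressor. The feature names and the augmented training arrays follow the placeholder names used in the earlier sketches.

import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

model = ExtraTreesRegressor(n_estimators=300, random_state=42).fit(X_train_aug, y_train_aug)
importances = pd.Series(model.feature_importances_,
                        index=["cutting_speed", "cooling_pressure", "feed_rate"])
print(importances.sort_values(ascending=False))  # e.g., cooling pressure dominates for residual stress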

3.3.1. Feature Importance of Machining Responses

The feature-importance analysis shown in Figure 12a reveals the influence of three machining parameters (cutting speed, cooling pressure, and feed rate) on residual stress, which is a critical factor in the mechanical integrity of machined parts. The results indicate that cooling pressure plays a dominant role with an importance score of 0.926, demonstrating a pronounced effect on stress distribution and material deformation. In contrast, cutting speed (0.061) and feed rate (0.012) exhibit very low contributions, suggesting that the thermal and mechanical loads induced by these parameters have a more limited effect on residual stress formation. From a mechanical engineering perspective, optimizing cooling pressure is crucial for reducing residual stresses and thereby improving fatigue life, dimensional stability, and overall component reliability.
When Figure 12b is examined, the results obtained using an extra trees regressor quantitatively demonstrate the effects of cutting speed, cooling pressure, and feed rate on cutting force. The findings show that cooling pressure has the strongest influence (importance score: 0.427), followed very closely by cutting speed (0.405). This suggests that both parameters play a dominant role in determining cutting force through their direct effects on the tool–chip interaction, heat generation, and material deformation during cutting. In contrast, the importance of feed rate is noticeably lower (0.168); although it affects cutting force, its contribution is more limited than those of the other two parameters. From a mechanical engineering standpoint, optimizing cooling pressure and cutting speed is important to reduce cutting forces, extend tool life, lower energy consumption, and improve surface quality.
Similarly, the feature-importance analysis in Figure 12c quantifies the effects of cutting speed, cooling pressure, and feed rate on cutting temperature using a random forest regressor. The results indicate that cooling pressure has the strongest influence (0.657), followed by cutting speed (0.313), and feed rate contributes only marginally to temperature variations (0.030). From a mechanical engineering perspective, cooling pressure plays a critical role in regulating cutting temperature by directly affecting heat removal at the tool–workpiece interface. Cutting speed also increases temperature due to friction and plastic deformation, while the low importance score of feed rate indicates that heat generation is more strongly governed by cutting speed and cooling efficiency.
Finally, the feature-importance analysis in Figure 12d quantifies the effects of cutting speed, cooling pressure, and feed rate on surface roughness and identifies the dominant parameters governing surface quality. According to the results, cutting speed has the highest importance score (0.606), followed by cooling pressure (0.274), and feed rate (0.119) has the lowest influence. From a mechanical engineering perspective, cutting speed significantly affects roughness through chip formation, heat generation, and tool–workpiece interaction; in general, higher speeds may produce smoother surfaces, although excessively high speeds can degrade quality due to thermal damage or tool wear. Cooling pressure improves surface finish by enhancing chip evacuation, lubrication, and heat dissipation. The low feed rate contribution indicates that surface roughness is more sensitive to cutting speed and cooling conditions.

3.3.2. Cutting Speed and Machining Responses Relationship

The scatter plot in Figure 13a illustrates the relationship between cutting speed and residual stress, a key parameter in machining-induced stress analysis. The distribution of data points suggests that residual stress shows high variability across cutting speeds and that there is no clear linear correlation. This indicates that other factors, such as cooling conditions, tool geometry, and material properties, may significantly influence the stress distribution. In addition, the scatter observed at specific speed ranges (e.g., around 50, 80, and 120 m/min) suggests that phase transformations, thermal expansion, or strain hardening may contribute to residual stress formation. From a mechanical engineering perspective, optimizing cutting speed alone may not be sufficient to control residual stress; instead, a multi-parameter optimization approach should be adopted by considering cooling strategies and material behavior.
In Figure 13b, the scatter plot shows the relationship between cutting speed and cutting force, which is critical for machining performance and tool wear. Although some variability is present, the data points suggest that cutting force may increase at higher cutting speeds; this variability may arise from additional factors such as material properties, tool wear, and cooling conditions. From a mechanical engineering standpoint, increasing cutting speed generally intensifies thermal effects at the tool–workpiece interface, which can reduce material strength and thereby decrease cutting forces. However, the persistence of relatively high forces over certain speed ranges suggests that strain hardening and dynamic tool–chip interactions may counterbalance the thermal softening effect. The scatter observed at each speed level also indicates that other process variables, such as tool geometry, lubrication, and chip formation, contribute to force fluctuations. For optimal machining conditions, the cutting speed should be balanced with appropriate cooling strategies and feed rate to minimize cutting forces, improve tool life, and enhance surface quality.
The scatter plot in Figure 13c presents the relationship between cutting speed and cutting temperature, which is critical for machining performance. The data distribution shows that cutting temperature generally increases at higher cutting speeds, consistent with increased frictional heating and plastic deformation. From a mechanical engineering perspective, as cutting speed increases, energy dissipation at the tool–workpiece interface rises, and because there is insufficient time for the generated heat to disperse, the temperature increases locally. The scatter observed within certain speed ranges indicates that tool material, cooling conditions, and chip formation dynamics also affect temperature. For effective thermal management, the cutting speed should be optimized in conjunction with the cooling pressure.
Finally, the scatter plot in Figure 13d demonstrates the relationship between cutting speed and surface roughness, highlighting the effect of machining speed on final surface quality. Although some variability exists, the data indicate an overall tendency for surface roughness to decrease as cutting speed increases. From a mechanical engineering perspective, higher cutting speeds can promote smoother cutting, reduce built-up edge (BUE) formation, and produce a better surface finish. At lower speeds, longer tool–workpiece contact time can increase surface defects and tearing. The scatter observed at intermediate speeds suggests that additional factors, such as tool wear, vibration, and cooling conditions, also contribute to roughness. The reduction in roughness at higher speeds supports the view that, under appropriate conditions, high-speed machining can achieve improved surface quality.

3.3.3. Cooling Pressure and Machining Responses Relationship

The scatter plot in Figure 14a presents the relationship between cooling pressure and residual stress and shows that the stress distribution is strongly dependent on cooling conditions. At low cooling pressures (0.6 MPa/conventional), residual stress varies widely and is predominantly high tensile. As cooling pressure increases toward 30 MPa (HPC), residual stress is observed over a narrower range, predominantly compressive. This behavior suggests that higher cooling pressure increases the heat removal rate, thereby reducing thermally induced stress gradients and leading to a more uniform stress distribution. Moreover, the reduction in tensile stress at high pressures is important from a mechanical engineering perspective because compressive residual stresses generally improve fatigue life and resistance to crack propagation. These findings provide important evidence for optimizing cooling pressure to reduce thermal distortions and enhance the mechanical integrity of the workpiece.
The scatter plot in Figure 14b illustrates the relationship between cooling pressure and cutting force, a fundamental parameter affecting tool wear, chip formation, and workpiece integrity. The data points indicate that the cutting force remains relatively high at low cooling pressures (close to 0.6 MPa), whereas at 30 MPa it is slightly lower and remains within a more stable range. From a mechanical engineering standpoint, high cooling pressure can improve heat removal and lubrication, reducing friction and adhesion and, in turn, lowering cutting forces. Nevertheless, the scatter observed at each pressure level suggests that other factors, such as cutting speed, feed rate, and tool material, also contribute to force variation.
The scatter plot in Figure 14c emphasizes the relationship between cooling pressure and cutting temperature and highlights the effect of cooling efficiency on thermal management. The data points show that the cutting temperature decreases as cooling pressure increases, indicating the coolant’s enhanced ability to remove heat. At low pressures (near 0.6 MPa), insufficient heat removal leads to elevated temperatures, whereas as pressure increases toward 30 MPa, convective heat transfer and lubrication become more effective and the temperature decreases markedly. However, the scatter indicates that additional factors such as cutting speed, tool wear, and chip-formation behavior also influence temperature.
Similarly, the scatter plot in Figure 14d shows the relationship between cooling pressure and surface roughness and demonstrates the effect of coolant application on surface finish. The data indicate that surface roughness generally decreases with increasing cooling pressure, thereby improving machining performance. At low pressures, insufficient lubrication and poor chip evacuation can increase material adhesion and friction, which raises roughness. When pressure is increased to 30 MPa, more effective cooling/lubrication can reduce thermal effects and cutting resistance, enabling smoother surfaces to be obtained. Nevertheless, the scatter suggests that other factors, such as tool wear, cutting speed, and material properties, also affect surface roughness.
Scatter plots were used to directly visualize the agreement between experimental and predicted values for each individual experiment.

3.3.4. Feed Rate and Machining Responses Relationship

The scatter plot in Figure 15a shows the relationship between feed rate and residual stress and reveals a highly dispersed pattern without a distinct linear trend. This suggests that feed rate has a weaker, less consistent influence on the residual stress distribution than other parameters, such as cooling pressure. From a mechanical engineering perspective, feed rate affects chip formation mechanics and cutting forces, which in turn are reflected in the thermo-mechanical stress distribution within the workpiece. However, the wide variation in residual stress across different feed rates indicates that additional factors, such as tool wear, cutting temperature, and material properties, play a more decisive role. Moreover, the observation of both high tensile and compressive stresses at various feed rates suggests that factors such as tool–workpiece interaction, dynamic cutting conditions, and strain hardening may contribute to the observed stress inconsistencies. Therefore, while feed rate is important for surface quality and tool life, its direct effect on residual stress does not appear to be dominant, supporting a multi-parameter optimization approach for stress control.
The scatter plot in Figure 15b illustrates the relationship between feed rate and cutting force and highlights the effect of material removal rate on machining dynamics. The data points show that cutting force varies with feed rate, but no clear linear trend is observed, suggesting that other parameters may influence force fluctuations. From a mechanical engineering standpoint, as the feed rate increases, the cutting force generally increases because more material is engaged in the cutting process. However, conditions such as tool geometry, chip thickness formation, workpiece material properties, cooling efficiency, tool wear, and vibration can affect the consistency of the force. For optimal performance, the feed rate should be balanced with the cutting speed and cooling pressure.
The scatter plot in Figure 15c presents the relationship between feed rate and cutting temperature and demonstrates how material removal rate affects thermal behavior. The data points indicate that temperature changes at different feed rates lack a distinct linear correlation, suggesting that other parameters, such as cutting speed and cooling pressure, play a major role in temperature generation. From a mechanical engineering perspective, increasing the feed rate can increase friction and cutting forces because more material enters the cut per unit time, potentially raising temperatures; however, tool–chip interaction, heat removal efficiency, and material properties can cause fluctuations in this relationship. For optimal thermal control, the feed rate should be balanced with the cutting speed and cooling pressure.
Finally, the scatter plot in Figure 15d shows the relationship between feed rate and surface roughness and highlights the effect of material removal rate on surface quality. The data generally indicate that surface roughness increases with increasing feed rate, although some variability is observed across different feed rates. From a mechanical engineering perspective, increasing the feed rate raises cutting forces and chip thickness, which can amplify tool deflection, vibration, and surface irregularities. At lower feed rates, roughness is typically lower due to smoother cutting conditions. The scatter suggests that cutting speed, tool wear, and cooling conditions also affect roughness. For optimal surface quality, the feed rate should be optimized together with the cutting speed and cooling pressure.
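As an illustration of how the parameter–response scatter plots of Figure 15 (and, with a different x-variable, Figures 13 and 14) can be generated, a minimal plotting sketch is given below; the DataFrame and column names are assumptions rather than the study's actual data files.

```python
# Minimal sketch: feed rate versus the four machining responses, in the style
# of Figure 15. The input file and column names are hypothetical.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("ti5553_experiments.csv")  # hypothetical experimental table

responses = {
    "residual_stress": "Residual stress (MPa)",
    "cutting_force": "Cutting force (N)",
    "cutting_temperature": "Cutting temperature (°C)",
    "surface_roughness": "Surface roughness (µm)",
}

fig, axes = plt.subplots(2, 2, figsize=(8, 6), constrained_layout=True)
for ax, (col, label) in zip(axes.ravel(), responses.items()):
    ax.scatter(df["feed_rate"], df[col], alpha=0.7)
    ax.set_xlabel("Feed rate (mm/rev)")
    ax.set_ylabel(label)
plt.show()
```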
The bar chart in Figure 16 illustrates the relative contribution of three key machining parameters, feed rate, cooling pressure, and cutting speed, to four different machining outputs: cutting force, cutting temperature, residual stress, and surface roughness. This analysis provides a detailed view of how these independent variables influence the machining process and overall workpiece integrity.
One of the most striking observations derived from the feature importance values is that cooling pressure plays a dominant role in controlling both residual stress (0.926) and cutting temperature (0.657). Effective cooling is critical for heat removal, chip evacuation, and lubrication, reducing the risk of excessive heat accumulation and the associated thermal expansion, tool wear, and undesired residual stress formation. The high importance of cooling pressure for these outputs indicates that thermal stability is a determining factor in machining efficiency and final part integrity.
Cutting speed, on the other hand, emerges as a highly influential factor, particularly for surface roughness (0.606) and cutting force (0.405). Higher cutting speeds often reduce built-up edge (BUE) formation, enabling smoother cutting and improved surface quality; however, excessively high speeds can increase thermal damage and tool wear, adversely affecting the surface. Feed Rate exhibits the lowest importance values across all outputs, suggesting that the influence of mechanical cutting parameters is overshadowed by the dominant thermal effects of cooling and speed.
From a mechanical engineering perspective, this analysis emphasizes the necessity of optimized cooling strategies and controlled cutting speeds to improve machining efficiency and workpiece quality. Reducing residual stress, improving surface finish, and lowering cutting forces depend not only on adjusting mechanical parameters but also on balancing thermal effects and lubrication efficiency. Future studies may further enhance manufacturing precision and efficiency by focusing on approaches such as adaptive cooling technologies, real-time cutting temperature monitoring, and high-speed machining optimization.

3.4. Findings and Discussions

In this section, the performance of five machine learning algorithms was evaluated across four datasets. In addition, hyperparameter optimization was applied to improve the performance of the machine learning algorithms. The findings and discussions were analyzed under separate headings for each dataset.

3.4.1. Residual Stress Findings

Table 5 presents a comparison of five machine learning algorithms before and after optimization: Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost. The predictive power of all models was assessed using R2 (coefficient of determination), MSE (Mean Squared Error), MAE (Mean Absolute Error), RMSE (Root Mean Squared Error), and Accuracy. After optimization, Extra Trees outperformed the other models, achieving the best overall performance with R2 = 0.9997, MSE = 6.8289, and Accuracy = 98.4792%. Hence, this model achieves lower error rates and provides more stable predictions than the other methods. The accuracy values were also quite high for the Random Forest and Gradient Boosting algorithms, whereas AdaBoost performed very poorly. After optimization, all algorithms exhibit improved performance; however, for the AdaBoost model the R2 value remains comparatively low and the error values remain high.
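As a point of reference for how such tables can be produced, the sketch below computes the five reported metrics for one model and one response. Since the exact definition of the regression "Accuracy" is not reproduced in this excerpt, the sketch assumes the common convention Accuracy = 100·(1 − MAPE); this is an assumption, not the paper's stated formula.

```python
# Minimal sketch of the evaluation metrics used in Tables 5, 7, 9 and 11.
# The Accuracy formula (100*(1 - MAPE)) is an assumption, not taken from the paper.
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_absolute_percentage_error,
                             mean_squared_error, r2_score)

def regression_report(y_true, y_pred):
    mse = mean_squared_error(y_true, y_pred)
    return {
        "R2": r2_score(y_true, y_pred),
        "MSE": mse,
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": float(np.sqrt(mse)),
        "Accuracy_%": 100.0 * (1.0 - mean_absolute_percentage_error(y_true, y_pred)),
    }

# Dummy illustration only (not experimental values):
print(regression_report(np.array([100.0, -50.0, 30.0]), np.array([98.0, -47.0, 31.5])))
```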
On the other hand, model performance depends directly on the hyperparameter settings used during optimization. For the Extra Trees and Random Forest algorithms, increasing the number of trees and selecting the optimal depth contributed significantly to improvements in prediction accuracy. For Gradient Boosting, higher learning rate and min_samples_leaf values improved generalization, while for the KNN algorithm the best results were obtained with n_neighbors = 4 and p = 1. Although the highest learning rate (2) was used in the AdaBoost model, its high error rates suggest that the model suffers from overfitting or underfitting. The results show that the optimization process provides significant improvement, especially for complex tree-based models, and that the correct choice of hyperparameters directly affects model performance.
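A minimal sketch of such a hyperparameter search is shown below using scikit-learn's GridSearchCV; the grid values are illustrative and only loosely based on the parameters named above, not the exact search space of this study.

```python
# Minimal sketch: cross-validated grid search over the five regressors.
# Grids are illustrative assumptions, not the study's exact search space.
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

search_spaces = {
    "ExtraTrees": (ExtraTreesRegressor(random_state=42),
                   {"n_estimators": [100, 300, 500], "max_depth": [None, 10, 20]}),
    "RandomForest": (RandomForestRegressor(random_state=42),
                     {"n_estimators": [100, 300, 500], "max_depth": [None, 10, 20]}),
    "GradientBoosting": (GradientBoostingRegressor(random_state=42),
                         {"learning_rate": [0.05, 0.1, 0.5], "min_samples_leaf": [1, 3, 5]}),
    "KNN": (KNeighborsRegressor(), {"n_neighbors": [3, 4, 5], "p": [1, 2]}),
    "AdaBoost": (AdaBoostRegressor(random_state=42),
                 {"learning_rate": [0.5, 1.0, 2.0], "n_estimators": [50, 100]}),
}

def tune_all(X_train, y_train):
    """Return the best cross-validated estimator for each algorithm."""
    best = {}
    for name, (estimator, grid) in search_spaces.items():
        gs = GridSearchCV(estimator, grid, scoring="r2", cv=5)
        gs.fit(X_train, y_train)
        best[name] = gs.best_estimator_
    return best
```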
In Table 6, three graphs are presented to evaluate the prediction performance of the Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost algorithms: a prediction graph on the training data, an error distribution graph, and a prediction graph on the test and training data. The prediction graph (Train) in the first column shows the relationship between the predicted and actual values. The Extra Trees and Random Forest algorithms fit the data quite well, while the Gradient Boosting and KNN models show more scatter. The AdaBoost model tends toward roughly linear predictions that lie farther from the actual values, which suggests that boosting weak learners does not generalize well enough to this nonlinear problem.
The distribution curves in the second column show the distribution of the prediction errors. Extra Trees and Random Forest exhibit narrow, centrally concentrated error distributions, indicating very low bias and high accuracy. In Gradient Boosting and KNN, the error distribution is wider and more irregular, so these models are more likely to produce large errors for certain data points. In the AdaBoost model, the error distribution is wide and includes outliers, indicating larger deviations for some observations. The third column shows the prediction graph (Test & Train), in which the actual and predicted values are plotted together in a time-series-like manner. The Extra Trees and Random Forest predictions match the test data relatively well, whereas the oscillations become sharper for Gradient Boosting and KNN; in AdaBoost, the variation is even greater and its predictive error lags far behind the four competing models. Overall, Extra Trees and Random Forest achieve the best performance, with high accuracy and low error rates, Gradient Boosting and KNN result in higher error rates, and AdaBoost provides the lowest accuracy.

3.4.2. Cutting Force Findings

Table 7 shows the results for Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors, and AdaBoost before and after optimization. R2, MSE, MAE, RMSE, and Accuracy are the statistical metrics used to assess algorithm performance. The best-performing models were Extra Trees and Gradient Boosting: their R2 values are 0.999, and their error rates are very low (MSE = 0.0010 and 0.0867, respectively). In both models, therefore, the actual and predicted values are highly correlated. Although the Random Forest and KNN algorithms also achieve accuracy rates above 99%, their errors are larger than those of Extra Trees and Gradient Boosting. In contrast, AdaBoost has a considerably lower R2 of 0.7756 and high error rates (MSE = 355.8604), indicating markedly lower prediction performance than the other methods.
It is observed that the choice of hyperparameters in the optimization process directly affects model performance. For Extra Trees and Random Forest, deeper tree structures (max_depth) and an increased number of trees (n_estimators) resulted in significant improvements in the error metrics. For Gradient Boosting, choosing a low learning rate (0.05) reduced the risk of overfitting and improved the model's overall performance. For KNN, the best results were obtained with n_neighbors = 4 and p = 1, indicating that the optimal number of neighbors was identified. In the AdaBoost model, although a comparatively high learning rate of 2 was chosen, performance remained low, suggesting overfitting or underfitting. It can be concluded that the optimization process provides a significant improvement, especially for complex tree-based models, but the effect is limited for models based on weak learners such as AdaBoost.
Table 8 shows three sets of graphs evaluating the predictive performance of Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost: a training-data prediction graph, an error distribution graph, and a test-and-training-data prediction graph. When the prediction graph (Train) in the first column is analyzed, the Extra Trees, Random Forest, and Gradient Boosting models show a very good fit to the training data; the predicted and actual values almost coincide.
In the KNN model, the predictions show a relatively tight distribution, whereas in AdaBoost there is a significant spread, indicating that the model does not fit the training data well and makes more errors. The error distribution graph in the second column analyzes the error distribution of the models. For Extra Trees and Random Forest, the errors are very narrowly centered around zero, indicating very low error rates for these models.
In the Gradient Boosting and KNN models, the error distribution is slightly wider and shifted away from the center, indicating larger errors at certain data points. In AdaBoost, the error distribution is quite wide and irregular, meaning the model makes large prediction errors at different points.
The prediction graphs (Test & Train) in the last column show the variation in predicted and actual values over time for the test and training data. The Extra Trees, Random Forest, and Gradient Boosting models give relatively stable predictions on the test data, staying close to the actual values, whereas KNN and especially AdaBoost show greater fluctuations and higher prediction errors.
This shows that AdaBoost's generalization ability on the test data is weaker than that of the other algorithms. Extra Trees and Random Forest stand out as the most successful models, with the lowest error rates, while AdaBoost makes significantly more errors and fails to generalize.

3.4.3. Cutting Temperature Findings

In Table 9, the performance of the Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost algorithms is examined using the R2, MSE, MAE, RMSE, and Accuracy metrics. The post-optimization results show that Extra Trees, Gradient Boosting, and Random Forest offer the best predictive performance, with high accuracy and low error rates. The Extra Trees model shows almost perfect results, with an R2 of 99.999% and MSE = 6.5182 × 10−25, which may indicate overfitting. Gradient Boosting is also very successful, with an R2 of 0.9997 and low MSE and RMSE values. The KNN model has relatively higher error rates than the other algorithms, but still provides over 99% accuracy. AdaBoost has the lowest R2 value (0.9738) and the highest error rate among the models, and it performs worst in terms of prediction performance.
The optimization process significantly reduced the error rates of the random forest and gradient boosting models. For example, the MSE value for the random forest decreased from 16.512 before optimization to 5.619 after optimization. Similarly, the MSE of gradient boosting decreased from 30.902 to 2.1709, improving the model’s prediction accuracy. While the KNN model’s error rates remained relatively high, AdaBoost benefited the least from the optimization process, and its error values remained high. In particular, the learning rate in AdaBoost was kept high (2), which may have led to overfitting or underfitting. Overall, it can be concluded that while the optimization process provides significant improvement for more complex tree-based models, AdaBoost’s ability to model complex nonlinear relationships is limited.
Table 10 presents three basic graph types to compare the predictive performance of the Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost algorithms: a training-data prediction graph, an error distribution graph, and a test-and-training-data prediction graph. The prediction graphs (Train) in the first column show the models' performance on the training data. The predicted values from the Extra Trees, Random Forest, and KNN models closely match the actual values, indicating that these models provide a very good fit to the training data. In the AdaBoost model, on the other hand, the predicted values deviate significantly from the actual values, indicating that the model does not fully adapt to the training data and performs poorly in modeling complex nonlinear relationships. The error distribution graphs in the second column show the error distribution of the models. The error distributions of the Random Forest and KNN models are concentrated around the center, while the error distribution of the Extra Trees model has a relatively wider variance. In the Gradient Boosting and AdaBoost models, the error distribution spans a wider range, indicating that larger errors occur in some predictions. The prediction graph (Test & Train) in the third column compares the change in predicted and actual values over time. While the predictions of the Extra Trees, Random Forest, and Gradient Boosting algorithms run largely parallel to the actual values, more fluctuations and errors are observed in the KNN and AdaBoost models. In particular, the predicted values of the AdaBoost model exhibit higher variance than the actual values and incur serious prediction errors at certain points. In general, Extra Trees and Random Forest achieve the best predictive performance, with the lowest error rates, while the AdaBoost model offers the lowest generalization performance.

3.4.4. Surface Roughness Findings

In Table 11, the performance of the Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost algorithms is compared before and after optimization. The models' predictive performance was evaluated using the R2, MSE, MAE, RMSE, and Accuracy metrics. The Extra Trees model stands out as the one with the lowest error rate, achieving almost perfect results with an R2 of 99.999% and MSE = 1.4230 × 10−29; while this shows that the model fits the data extremely closely, it also suggests a risk of overfitting. The Random Forest and Gradient Boosting models perform quite well, with R2 values of 0.9481 and 0.8034, respectively, but their error rates are higher than those of Extra Trees. KNN and AdaBoost appear to be the weakest models in terms of prediction accuracy, with lower R2 values (0.7627 and 0.7237, respectively). When the MSE and RMSE values are analyzed, the AdaBoost model has the highest error rates, indicating that it does not generalize well on the test data.
The optimization process provided significant improvements in the performance of all models. In particular, the error rates of the Random Forest and Gradient Boosting algorithms decreased significantly, and the accuracy values increased: the MSE for Random Forest decreased from 0.0816 to 0.0152, and for Gradient Boosting from 0.0778 to 0.0577, improving prediction performance. Although the optimization process also improved the KNN and AdaBoost algorithms, the overall performance of these models remains lower. In particular, AdaBoost produces a low R2 and high MSE even after optimization, indicating that the model cannot learn the nonlinear relationships well enough. Overall, Extra Trees stands out as the most successful model, while Random Forest and Gradient Boosting are also successful alternatives; KNN and AdaBoost performed poorly compared to the other models due to their higher error rates.
Table 12 presents three main types of graphs to analyze the prediction performance of the Extra Trees, Random Forest, Gradient Boosting, k-nearest neighbors (KNN), and AdaBoost algorithms: a prediction graph on the training data, an error distribution graph, and a prediction graph on the test and training data. Looking at the prediction graphs (Train) in the first column, the Extra Trees model provides almost perfect agreement between predicted and actual values. In the Random Forest, Gradient Boosting, and KNN models, the predictions exhibit greater variability and errors at certain points. The AdaBoost model shows the highest scatter, with significant differences between predicted and actual values, indicating that it does not achieve sufficient accuracy on the training data. When the error distribution graphs in the second column are analyzed, the error distribution of the Extra Trees model is very close to zero, reflecting very low error rates. In the Random Forest and Gradient Boosting models, the error distribution is relatively wider, indicating larger errors at certain prediction points. For the KNN and AdaBoost models, the error distribution is wider still and exhibits higher variance, indicating lower generalization ability. In the prediction graphs (Test & Train) in the third column, the Extra Trees and Random Forest models follow a course parallel to the actual values, showing good generalization capacity on the test data. In the Gradient Boosting, KNN, and AdaBoost models, on the other hand, the predictions deviate more from the actual values and fluctuate more strongly; in the AdaBoost model in particular, there are clearly significant prediction errors in certain regions. As a result, the Extra Trees model stands out as the most successful, offering the highest prediction accuracy and the lowest error rates. While the Random Forest and Gradient Boosting models also have acceptable error levels, the KNN and, especially, AdaBoost models perform worse due to their high error rates and limited generalization.
Figure 17 presents the R2, Mean Squared Error (MSE), Mean Absolute Error (MAE), root mean squared error (RMSE), and accuracy values of different machine learning algorithms across four datasets: residual stress, cutting force, cutting temperature, and surface roughness.
In Figure 17a, the Extra Trees model consistently achieves the highest R2 values across all datasets, indicating superior predictive accuracy. While random forests and gradient boosting maintain relatively high performance, a notable decline in R2 is observed when KNN and AdaBoost are used, particularly for the surface roughness and cutting force datasets, suggesting that these models struggle with complex data patterns. The sharp decrease in AdaBoost’s R2 values indicates a limited ability to capture non-linear relationships within these datasets.
Overall, tree-based models outperform distance-based (KNN) and boosting-based (AdaBoost) methods in maintaining high prediction accuracy. Turning to Figure 17b, Extra Trees consistently attains the lowest MSE values, further demonstrating its superior predictive performance. Similarly, random forests and gradient boosting maintain relatively low error rates. In contrast, KNN and AdaBoost exhibit sharp increases in MSE, especially for the residual stress and cutting force datasets.
This indicates that distance-based (KNN) and boosting-based (AdaBoost) approaches struggle to learn complex patterns and can lead to substantial prediction errors. In general, tree-based ensemble models are more successful than other methods in minimizing prediction error and therefore provide more reliable options for these datasets.
As shown in Figure 17c, Extra Trees consistently delivers the lowest MAE values, confirming its ability to make accurate predictions. Random forest and gradient boosting also yield comparatively low errors, while KNN and AdaBoost show a marked increase in MAE particularly for the residual stress and cutting force datasets.
This suggests that distance-based (KNN) and boosting-based (AdaBoost) models struggle to learn complex, nonlinear relationships and therefore deviate more substantially from the true values. Overall, tree-based models are more effective at reducing absolute errors, reinforcing their suitability for these datasets. Likewise, in Figure 17d, extra trees maintains the lowest RMSE values, reflecting superior predictive accuracy. Random forests and gradient boosting also perform well with moderate RMSE values, whereas KNN and AdaBoost exhibit significant increases in RMSE, especially on the residual stress and cutting force datasets. This indicates that distance-based (KNN) and boosting-based (AdaBoost) models struggle to capture complex relationships, leading to higher errors. Overall, tree-based ensemble approaches provide more reliable predictions by minimizing deviation from the true values.
Finally, in Figure 17e, Extra Trees consistently achieves the highest accuracy, followed closely by Gradient Boosting and Random Forest, all demonstrating strong predictive capability. For KNN and AdaBoost, a decrease in accuracy is observed, particularly on the residual stress dataset, suggesting that these models may fail to generalize well under certain conditions. AdaBoost’s sharp drop in accuracy across multiple datasets suggests the boosting approach may not be effective for these data distributions.
To support the robustness of the experimental dataset used for machine learning modelling, basic statistical descriptors of the measured responses were evaluated. The experimental dataset consisted of 27 machining experiments. The mean and standard deviation values of the measured responses were calculated as follows: cutting force (356.17 ± 48.95 N), cutting temperature (525.57 ± 106.20 °C), residual stress (52.13 ± 194.86 MPa), and surface roughness (2.64 ± 0.46 µm). These values indicate the variability of the experimental measurements and provide additional statistical support for the reliability of the dataset used in the machine learning analysis. The results are shown in Table 13.
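For completeness, a minimal sketch of how the mean ± standard deviation summary in Table 13 can be reproduced is given below; the file and column names are assumptions.

```python
# Minimal sketch: descriptive statistics (mean, standard deviation) of the four
# measured responses over the 27 experiments. Column/file names are assumptions.
import pandas as pd

df = pd.read_csv("ti5553_experiments.csv")  # hypothetical table with 27 rows
responses = ["cutting_force", "cutting_temperature", "residual_stress", "surface_roughness"]
summary = df[responses].agg(["mean", "std"]).T.round(2)
print(summary)  # e.g., cutting_force  356.17  48.95
```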

3.4.5. Discussions

In this study, the responses of surface integrity metrics (residual stress, cutting temperature, cutting force, and surface roughness) to process parameters during high-pressure coolant (HPC) machining of the Ti-5553 alloy were evaluated from both physical-mechanistic and machine learning (ML) perspectives. In titanium alloys, the long-recognized concentration of heat in the cutting zone, friction at the tool–chip interface, and microstructural changes have been emphasized in the literature as the main determinants of surface integrity. In their work on surface integrity in titanium alloys, Safari et al. explicitly stated that residual stress, work hardening, and microstructural transformations are primarily of thermo-mechanical origin and develop in strong dependence on machining conditions [37]. Within the scope of the present work, this provides the key basis for physically interpreting why HPC becomes particularly prominent with respect to temperature and residual stress. Nandy et al. [38] experimentally investigated the effects of HPC on the turning of Ti-6Al-4V and reported clear improvements in tool life, frictional behavior, forces, and surface integrity; they attributed these improvements to the high-pressure jet's ability to more effectively access the tool–chip interface. Similarly, Palanisamy et al. demonstrated that chip morphology and chip breakability are pressure-sensitive, and noted that increasing pressure can yield more favorable chip formation and better cutting performance [39]. Therefore, these findings in the literature explain why coolant pressure achieves high importance scores in the present study, particularly for thermally dominated outputs such as Tc and Rs.
Likewise, Mia showed that, by directing high-pressure coolant differently toward the rake and flank faces of the tool, tool wear and surface roughness in the turning of Ti-6Al-4V can change significantly under HPC [40]. This supports the view that HPC is not merely a “cooling” approach, but rather a “process design variable” that influences tribological parameters such as lubrication, contact length, and adhesion. In another study, Özel and Ulutan, using an experimental approach together with finite element analysis, demonstrated that machining-induced residual stresses are governed by the thermo-mechanical fields generated during cutting, and that process conditions alter these fields and thereby manifest in the residual stress distribution [41]. From this perspective, it is expected that HPC reduces the cutting-zone temperature and thermal gradients, suppresses the thermal component of residual stress, and thus has a strong effect on Rs. The review by Ulutan and Özel also supports this interpretation by emphasizing that thermal effects can shift the residual-stress regime in titanium alloys.
Regarding the influence of the cooling strategy on residual stress, the dominant role of high-pressure coolant (HPC) stems from its ability to alter the thermomechanical balance in the cutting zone. Machining-induced residual stresses are primarily governed by the competition between thermal expansion (which induces tensile stresses upon cooling) and mechanical plastic deformation (which induces compressive stresses). Under dry or conventional cooling conditions, the severe heat generation in near-β titanium alloys such as Ti-5553 leads to steep thermal gradients, promoting a tensile residual stress state. However, applying HPC at 30 MPa effectively penetrates the tool–chip interface, significantly enhancing convective heat transfer and reducing the thermal load. By suppressing the thermal component, the mechanical deformation effect becomes more pronounced, thereby shifting the residual stress distribution towards a more favorable, compressive state. This transition is highly desirable from a mechanical engineering perspective, as compressive residual stresses are known to inhibit fatigue crack initiation and improve the service life of critical aerospace components.
Furthermore, it is important to acknowledge the limitations associated with the residual stress measurements in this study. The X-ray diffraction (XRD) analyses were conducted as single-point surface measurements. While this approach effectively captures the stress state at the immediate machined surface, it does not reveal the in-depth residual stress gradient or the subsurface peak stresses, both of which are relevant to fatigue performance. Nevertheless, the use of single-point surface residual stress as a scalar metric was a methodological necessity for developing the machine learning prediction models. In data-driven manufacturing research, training supervised regression algorithms (such as Extra Trees, Random Forest, and Gradient Boosting) requires a discrete target variable for each experimental condition. Surface residual stress is widely recognized in the literature as a primary indicator of surface integrity, as fatigue cracks predominantly nucleate at the surface. Similar approaches have been successfully adopted in recent studies; for instance, machine learning frameworks and neural networks have been developed using single-point surface residual stress data to optimize the machining of titanium alloys and hard materials [42,43]. Therefore, while the single-point measurement is a physical limitation of the current experimental scope, it provides a robust, representative, and mathematically viable feature for establishing the multi-output predictive framework presented in this work.
Ezugwu et al. compared conventional and high-pressure cooling in finish turning of Ti-6Al-4V with PCD tools and showed that HPC can extend tool life and improve surface integrity [44]. Nevertheless, surface roughness is often a "speed-dominant" metric because cutting speed directly affects surface formation through phenomena such as flank wear, adhesion, and built-up edge. Accordingly, the strong influence of cutting speed on Ra is consistent with the idea that HPC does not solely determine Ra, but rather acts as a "stabilizer" that protects the surface at higher speeds by limiting thermal escalation. Yünlü reported for Ti-5553 that the cooling strategy under HPC is decisive for machinability and residual stress, and that HPC can meaningfully change output metrics by managing thermal and tribological loads [11]. This finding aligns the pronounced HPC effect observed in Ti-5553 with the literature and renders the high importance scores produced by coolant-related variables in the ML model consistent with prior work. To better position HPC's role in thermal management, the cryogenic machining literature provides a useful basis for comparison. Hong and Ding showed in detail that, in cryogenic cooling of Ti-6Al-4V, the application mode is a key determinant of cutting temperature [45]. Bordin et al. compared dry versus cryogenic approaches for the turning of Ti-6Al-4V and reported that cryogenic strategies can improve wear and surface-related performance [46]. Shokrani et al. demonstrated that cryogenic cooling in CNC milling of Ti-6Al-4V can improve tool life and surface quality metrics [47]. The common message across these studies is that, regardless of the cooling strategy, its primary effect is to reduce thermal load and improve interfacial tribology; thus, the dominance of HPC on Tc and Rs in the present study is consistent with the general trend in the literature. In their review of the design of cryogenic machining setups, Khanna et al. emphasized that the main determinant of performance is not only the "fluid type," but also delivery geometry, nozzle orientation, and effective access to the target interface [48]. This emphasis is directly applicable to HPC as well: performance is governed not only by pressure, but also by the jet's ability to penetrate effectively into the tool–chip contact.
Machining data are often nonlinear and interactive, and may exhibit threshold behavior; this becomes particularly pronounced in processes such as HPC, where "penetration/access" is a governing mechanism. For this reason, strong performance from tree-based ensemble approaches is expected. Breiman showed that Random Forests improve generalization through multiple randomized trees [49]. Geurts et al. demonstrated that the Extremely Randomized Trees (Extra Trees) approach can deliver stable performance on small-to-medium datasets [50]. Friedman showed that the Gradient Boosting Machine framework can successfully approximate complex functions via a sequential error-correction process [51]. Finally, Aggogeri et al. reviewed ML applications in machining and emphasized that ensemble methods are widely used and effective for nonlinear relationships and multi-output prediction [52]. This methodological framework supports both the strong performance of ensemble models for outputs such as Ra, Tc, Fc, and Rs in this study and the physically consistent separation of variable importances, consistent with the underlying mechanisms.
It should be noted that the feature importance results should be interpreted with caution due to the relatively small size of the experimental dataset. Although ensemble tree-based models are known to provide robust importance estimates, the obtained values should primarily be considered as indicators of relative parameter influence within the investigated experimental conditions rather than absolute quantitative rankings.
The consistency of importance patterns across different machine learning models, such as Extra Trees and Random Forests, suggests that the dominant parameters—especially cooling pressure—show stable influence patterns within the dataset.
It should also be noted that the high R2 values obtained in this study are mainly related to the controlled nature of the experimental design and the limited number of input variables. The machining parameters considered in the study (cutting speed, feed rate, and cooling pressure) have strong and physically meaningful relationships with the output responses. Under such controlled experimental conditions, tree-based ensemble models can capture these relationships with very high predictive accuracy. Therefore, the high R2 values observed in the models primarily reflect the deterministic relationships between process parameters and machining responses rather than overfitting.

4. Conclusions

This study investigated surface integrity-related outputs, residual stress (Rs), cutting force (Fc), cutting temperature (Tc), and surface roughness (Ra) in high-pressure coolant (HPC) assisted machining of the near-β Ti-5553 (Ti-5Al-5V-5Mo-3Cr) alloy within an integrated framework combining controlled experiments and machine-learning-based predictive modeling. Within a General Full Factorial Design comprising three factors at three levels, 27 experiments were conducted under dry, conventional cooling (0.6 MPa), and HPC (30 MPa) conditions using a Jet Stream (nozzle type) toolholder. The resulting dataset was used for two purposes: (i) to quantitatively determine the relative influence of machining parameters on multiple, interrelated outputs and (ii) to develop reliable predictive models. To strengthen learning from limited experimental data, an 80/20 train-test split was adopted, and continuous-variable data augmentation (Gaussian jittering and interpolation) was applied while preserving the dataset structure.
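To make the augmentation step concrete, the sketch below illustrates Gaussian jittering and pairwise interpolation on a numeric experimental table; the noise scale, number of copies, and interpolation scheme are assumptions, as the exact settings are not reproduced here.

```python
# Minimal sketch of continuous-variable augmentation (Gaussian jittering and
# interpolation). Noise scale, copy count, and scheme are illustrative assumptions.
import numpy as np
import pandas as pd

def gaussian_jitter(df, copies=3, scale=0.01, seed=42):
    """Add zero-mean Gaussian noise scaled to each column's standard deviation."""
    rng = np.random.default_rng(seed)
    sigma = scale * df.std(ddof=0).to_numpy()
    return pd.concat([df + rng.normal(0.0, sigma, size=df.shape) for _ in range(copies)],
                     ignore_index=True)

def midpoint_interpolation(df):
    """Create synthetic samples halfway between consecutive experimental rows."""
    mids = (df.to_numpy()[:-1] + df.to_numpy()[1:]) / 2.0
    return pd.DataFrame(mids, columns=df.columns)

def augment(df):
    """Combine the original data with jittered and interpolated samples."""
    return pd.concat([df, gaussian_jitter(df), midpoint_interpolation(df)],
                     ignore_index=True)
```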
The results consistently show that coolant pressure is the dominant process variable governing surface integrity within the investigated parameter window. Feature-importance analysis revealed that coolant pressure plays a decisive role, particularly for residual stress and cutting temperature (Rs: 0.926; Tc: 0.657). These findings indicate that HPC is not merely an auxiliary condition; by directly controlling heat transport and the tool–chip–workpiece contact environment, it acts as a primary process parameter shaping residual stress formation mechanisms and temperature levels. Consistent with the experimental observations, HPC reduced thermal gradients and promoted more favorable near-surface residual stress states, suggesting potential benefits for fatigue performance.
Regarding cutting force, the effects of coolant pressure and cutting speed were comparable in magnitude (Fc: 0.427 for P; 0.405 for Vc). This points to a coupled mechanism: HPC influences force generation through improved lubrication/cooling and altered tribological conditions at the contact, whereas cutting speed governs force through cutting mechanics and heat generation. In contrast, surface roughness was mainly controlled by cutting speed (Ra: 0.606 for Vc), with coolant pressure providing a secondary yet meaningful contribution (0.274). Feed rate showed relatively low importance across all outputs within the examined ranges (e.g., Rs: 0.012; Tc: 0.030; Ra: 0.119), indicating that, for Ti-5553 under the present conditions, surface integrity sensitivity is primarily driven by cooling effectiveness and cutting speed rather than feed rate.
On the modeling side, tree-based ensemble methods successfully captured nonlinear parameter–response relationships and stood out with low error and high accuracy. Among the evaluated models, Extra Trees delivered the strongest overall performance, particularly for Rs and Tc predictions, suggesting that it can provide a reliable data-driven infrastructure for decision support, process optimization, and digital twin-type applications in HPC-assisted machining.
From a material-behavior standpoint, Ti-5553 is a “difficult-to-machine” alloy characterized by heat accumulation in the cutting zone, strong adhesion/friction tendency, work hardening, and elevated wear risk. In this context, the study quantitatively confirms that thermal and tribological control are decisive for surface integrity. While higher cutting speeds may be advantageous for minimizing Ra, the temperature-sensitive machinability window of Ti-5553 requires effective thermal risk management, which is most robustly provided by HPC. Therefore, a multi-objective optimization approach is necessary, where Ra–Tc–Fc–Rs are considered simultaneously rather than improving a single output in isolation. Overall, the study established a quantitative parameter-importance ranking for multiple surface integrity metrics in HPC machining of near-β titanium alloys and demonstrated that, even with a limited number of experiments, strong predictive performance can be achieved using ensemble learning supported by data augmentation.
The evaluation strategy used in this study ensured that the predictive performance of the models was assessed on independent data, thereby minimizing the risk of data leakage and providing a reliable estimation of model generalization.
From a practical perspective, the proposed machine learning framework offers significant value to the aerospace and defense industries, where near-β titanium alloys such as Ti-5553 are widely used for critical structural components (e.g., landing gear). By integrating these predictive models into digital manufacturing environments, operators can proactively optimize high-pressure coolant parameters. This capability not only minimizes scrap rates for expensive titanium alloys but also ensures the stringent surface integrity required for high-fatigue real-world applications, paving the way for more sustainable, cost-effective machining operations.
Recommendations for future work:
Perform depth- and orientation-resolved residual stress profiling to clarify how the residual stress distribution, and its relative contribution to the surface integrity outputs, evolves with depth and measurement direction.
Strengthen the framework’s generalizability by validating it under various machining conditions, including machine tool, tool/holder geometry, coolant delivery methods, and pressure.
Incorporate tool wear, vibration signals, and chip morphology into the machine learning inputs. These additions can improve both interpretability and predictive robustness, particularly for Fc and Ra.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the author on request.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

Vc: Cutting Speed (m/min)
f: Feed Rate (mm/rev)
HPJC/HPC: High-Pressure Jet-Assisted Cooling (MPa)
DoC: Depth of Cut (mm)
Fc: Cutting Force (N)
Rs: Residual Stress (MPa)
Tc: Cutting Temperature (°C)
Ra: Surface Roughness (µm)
P: Cooling Pressure (MPa)
ANNR: Artificial Neural Network Regression
XRD: X-Ray Diffraction
kNNR: k-Nearest Neighbors Regression
ML: Machine Learning
MLR: Multiple Linear Regression
RFR: Random Forest Regression
RMSE: Root Mean Square Error
SVR: Support Vector Regression
XGBoostR: Extreme Gradient Boosting Regression
GB: Gradient Boosting
R2: Coefficient of Determination
MSE: Mean Squared Error
MAE: Mean Absolute Error

References

1. Ezugwu, E.O.; Wang, Z.M. Titanium alloys and their machinability: A review. J. Mater. Process. Technol. 1997, 68, 262–274.
2. Ulutan, D.; Özel, T. Machining induced surface integrity in titanium and nickel alloys: A review. Int. J. Mach. Tools Manuf. 2011, 51, 250–280.
3. Kolli, R.P.; Devaraj, A. A review of metastable beta titanium alloys. Metals 2018, 8, 506.
4. Loskutova, T.; Scheffler, M.; Pavlenko, I.; Zidek, K.; Pohrebova, I.; Kharchenko, N.; Smokovych, I.; Dudka, O.; Palyukh, V.; Ivanov, V.; et al. Corrosion Resistance of Coatings Based on Chromium and Aluminum of Titanium Alloy Ti-6Al-4V. Materials 2024, 17, 3880.
5. Vukelic, D.; Prica, M.; Ivanov, V.; Jovicic, G.; Budak, I.; Luzanin, O. Optimization of surface roughness based on turning parameters and insert geometry. Int. J. Simul. Model. 2022, 21, 417–428.
6. Kryzhanivskyy, V.; M'saoubi, R.; Bhallamudi, M.; Cekal, M. Machine learning based approach for the prediction of surface integrity in machining. Procedia CIRP 2022, 108, 537–542.
7. Mitra, S.; Rahul. Machinability of cryogenically treated and non-treated Ti-5553 workpiece: A comparative study. Sādhanā 2024, 49, 271.
8. Braham-Bouchnak, T.; Germain, G.; Morel, A.; Furet, B. Influence of high-pressure coolant assistance on the machinability of the titanium alloy Ti-5553. Mach. Sci. Technol. 2015, 19, 134–151.
9. Vukelic, D.; Milosevic, A.; Ivanov, V.; Kočović, V.; Santosi, Z.; Šokac, M.; Simunovic, G. Modelling and optimization of dimensional accuracy and surface roughness in dry turning of Inconel 625 alloy. Adv. Prod. Eng. Manag. 2024, 19, 371–385.
10. Pimenov, D.Y.; Mia, M.; Gupta, M.K.; Machado, A.R.; Tomaz, Í.V.; Sarikaya, M.; Wojtowicz, N.; Kapłonek, W. Improvement of machinability of Ti and its alloys using cooling-lubrication techniques: A review and future prospect. J. Mater. Res. Technol. 2021, 11, 719–753.
11. Yünlü, L. Residual Stress Analysis in Machining of a Near Beta Ti Alloy, Ti-5553 Under High Pressure Cooling and Lubrication. Int. J. Eng. Innov. Res. 2023, 5, 13–22.
12. Sun, Y.; Huang, B.; Puleo, D.A.; Jawahir, I.S. Enhanced machinability of Ti-5553 alloy from cryogenic machining: Comparison with MQL and flood-cooled machining and modeling. Procedia CIRP 2015, 31, 477–482.
13. Zhang, X.; Wang, D.; Peng, Z. Effects of high-pressure coolant on cooling mechanism in high-speed ultrasonic vibration cutting interfaces. Appl. Therm. Eng. 2023, 233, 121125.
14. Umbrello, D.; Ambrogio, G.; Filice, L.; Shivpuri, R. An ANN approach for predicting subsurface residual stresses and the desired cutting conditions during hard turning. J. Mater. Process. Technol. 2007, 189, 143–152.
15. Ugarte, A.; M'Saoubi, R.; Garay, A.; Arrazola, P.J. Machining behaviour of Ti-6Al-4V and Ti-5553 alloys in interrupted cutting with PVD coated cemented carbide. Procedia CIRP 2012, 1, 202–207.
16. Liu, E.; Wang, R.; Zhang, Y.; An, W. Tool wear analysis of cutting Ti-5553 with uncoated carbide tool under liquid nitrogen cooling condition using tool wear maps. J. Manuf. Process. 2021, 68, 877–887.
17. Zhao, X.; Li, R.; Liu, E.; Lan, C. Effect of cryogenic cutting surface integrity on fatigue life of titanium alloy Ti-5553. Ferroelectrics 2022, 596, 115–125.
18. Arrazola, P.J.; Garay, A.; Iriarte, L.M.; Armendia, M.; Marya, S.; Le Maître, F. Machinability of titanium alloys (Ti6Al4V and Ti555.3). J. Mater. Process. Technol. 2009, 209, 2223–2230.
19. Zhuo, L.; Zhan, M.; Xie, Y.; Chen, B.; Ji, K.; Wang, H. Recent Advances in Near-β Titanium Alloys: Microstructure Control, Deformation Mechanisms, and Oxidation Behavior. Adv. Eng. Mater. 2024, 26, 2401837.
20. Dinibutun, S.; Alshammari, Y.; Bolzoni, L. Machine Learning-Based Prediction of Young's Modulus in Ti-Alloys. Metals 2026, 16, 233.
21. Kaur, R.; Kumar, R.; Aggarwal, H. Systematic Review of Artificial Intelligence, Machine Learning, and Deep Learning in Machining Operations: Advancements, Challenges, and Future Directions. Arch. Comput. Methods Eng. 2025, 32, 4983–5036.
22. Zhou, T.; Zhou, T.; Zhang, C.; Sun, C.; Cui, H.; Tian, P.; He, L. Hybrid modeling with finite element—Analysis—Neural network for predicting residual stress in orthogonal cutting of H13. J. Mater. Res. Technol. 2024, 29, 4954–4977.
23. Mu, S.; Yu, C.; Lin, K.; Lu, C.; Wang, X.; Wang, T.; Fu, G. A review of machine learning-based thermal error modeling methods for CNC machine tools. Machines 2025, 13, 153.
24. Makhfi, S.; Dorbane, A.; Harrou, F.; Sun, Y. Prediction of cutting forces in hard turning process using machine learning methods: A case study. J. Mater. Eng. Perform. 2024, 33, 9095–9111.
25. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
26. Webb, G.I.; Pazzani, M.J.; Billsus, D. Machine learning for user modeling. User Model. User-Adapt. Interact. 2001, 11, 19–29.
27. Möhring, H.C.; Eschelbacher, S.; Georgi, P. Machine learning approaches for real-time monitoring and evaluation of surface roughness using a sensory milling tool. Procedia CIRP 2021, 102, 264–269.
28. Shen, X.; He, S.; Zhao, B.; Ma, S.; Jiang, J.; Li, S.; Pan, S. Electrochemical machining (ECM) prediction and control via multi-physics-constrained bi-layer machine learning with alloy electrochemistry knowledge. J. Manuf. Process. 2026, 162, 326–346.
29. Parida, A.K.; Maity, K. Analysis of some critical aspects in hot machining of Ti-5553 superalloy: Experimental and FE analysis. Def. Technol. 2019, 15, 344–352.
30. Headley, C.V.; del Valle, R.J.H.; Ma, J.; Balachandran, P.; Ponnambalam, V.; LeBlanc, S.; Martin, J.B. The development of an augmented machine learning approach for the additive manufacturing of thermoelectric materials. J. Manuf. Process. 2024, 116, 165–175.
31. Ko, J.H.; Yin, C. A review of artificial intelligence application for machining surface quality prediction: From key factors to model development. J. Intell. Manuf. 2025, 37, 775–798.
32. Soori, M.; Arezoo, B.; Dastres, R. Machine learning and artificial intelligence in CNC machine tools, a review. Sustain. Manuf. Serv. Econ. 2023, 2, 100009.
33. Hameed, M.M.; Al-Ansari, N.; Yaseen, Z.M. An Extra Tree Regression Model for Discharge Coefficient Prediction: Novel, Practical Applications in the Hydraulic Sector and Future Research Directions. Math. Probl. Eng. 2021, 2021, 7001710.
34. Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82.
35. Hammid, A.T.; Sulaiman, M.H.B.; Abdalla, A.N. Prediction of small hydropower plant power production in Himreen Lake dam (HLD) using artificial neural network. Alex. Eng. J. 2018, 57, 211–221.
36. Mishra, G.; Sehgal, D.; Valadi, J.K. Quantitative structure activity relationship study of the anti-hepatitis peptides employing random forests and extra-trees regressors. Bioinformation 2017, 13, 60–63.
37. Safari, H.; Sharif, S.; Izman, S.; Jafari, H. Surface integrity characterization in high-speed dry end milling of Ti-6Al-4V titanium alloy. Int. J. Adv. Manuf. Technol. 2015, 78, 651–657.
38. Nandy, A.K.; Gowrishankar, M.C.; Paul, S. Some studies on high-pressure cooling in turning of Ti–6Al–4V. Int. J. Mach. Tools Manuf. 2009, 49, 182–198.
39. Palanisamy, S.; McDonald, S.D.; Dargusch, M.S. Effects of coolant pressure on chip formation while turning Ti6Al4V alloy. Int. J. Mach. Tools Manuf. 2009, 49, 739–743.
40. Mia, M.; Al Bashir, M.A.; Dhar, N.R. High-pressure coolant effects on tool rake/flank in turning Ti-6Al-4V. Int. J. Adv. Manuf. Technol. 2017, 90, 2503–2512.
41. Özel, T.; Ulutan, D. Prediction of machining induced residual stresses in turning of titanium and nickel based alloys with experiments and finite element simulations. CIRP Ann. 2012, 61, 547–550.
42. Farias, A.; Paschoalinoto, N.W.; Bordinassi, E.C.; Leonardi, F.; Delijaicov, S. Predictive modelling of residual stress in turning of hard materials using radial basis function network enhanced with principal component analysis. Eng. Sci. Technol. Int. J. 2024, 55, 101743.
43. Outeiro, J.; Cheng, W.; Chinesta, F.; Ammar, A. Modelling and Optimization of Machining of Ti-6Al-4V Titanium Alloy Using Machine Learning and Design of Experiments Methods. J. Manuf. Mater. Process. 2022, 6, 58.
44. Ezugwu, E.O.; Bonney, J.; Da Silva, R.B.; Cakir, O. Surface integrity of finished turned Ti–6Al–4V alloy with PCD tools using conventional and high pressure coolant supplies. Int. J. Mach. Tools Manuf. 2007, 47, 884–891.
45. Hong, S.Y.; Ding, Y. Cooling approaches and cutting temperatures in cryogenic machining of Ti-6Al-4V. Int. J. Mach. Tools Manuf. 2001, 41, 1417–1437.
46. Bordin, A.; Sartori, S.; Bruschi, S.; Ghiotti, A. Feasibility of dry and cryogenic machining: Turning of Ti6Al4V. J. Clean. Prod. 2017, 142, 4142–4152.
47. Shokrani, A.; Dhokia, V.; Newman, S.T. Comparative investigation on using cryogenic machining in CNC milling of Ti-6Al-4V. Mach. Sci. Technol. 2016, 20, 475–494.
48. Khanna, N.; Shah, P.; Agrawal, C.; Sarikaya, M.; Pimenov, D.Y.; Gupta, M.K. Review on design and development of cryogenic machining setups and their performance. J. Manuf. Process. 2021, 68, 398–422.
49. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
50. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
51. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
52. Aggogeri, F.; Pellegrini, N.; Tagliani, F.L. Recent advances on machine learning applications in machining processes. Appl. Sci. 2021, 11, 8764.
Figure 1. Schematic overview of the experimental workflow and machine learning optimization.
Figure 2. The Ti-5553 specimens employed in the tests.
Figure 3. Average hardness value of the unmachined Ti-5553 specimens employed in the tests.
Figure 4. Schematic views of the machining process and measurement methods.
Figure 5. Representative XRD peak and sin²ψ plots used for residual stress calculation of the near-β titanium alloy under different cooling conditions: (a) dry, (b) conventional cooling, and (c) high-pressure cooling.
Figure 6. Residual stress plot under dry cutting conditions.
Figure 7. Cutting force signals corresponding to the machining conditions of 80 m/min cutting speed, conventional cooling (0.6 MPa), and 0.15 mm/rev feed rate.
Figure 8. Cutting temperature plot under dry cutting conditions.
Figure 9. Surface roughness results under the cutting conditions of 50 m/min cutting speed, dry cutting, and 0.15 mm/rev feed rate.
Figure 10. Dataset augmentation and correlation graph: (a) residual stress, (b) cutting force, (c) cutting temperature, (d) surface roughness.
Figure 11. Dataset PCA analysis: (a) residual stress, (b) cutting force, (c) cutting temperature, (d) surface roughness.
Figure 12. Feature importance analysis: (a) residual stress, (b) cutting force, (c) cutting temperature, (d) surface roughness.
Figure 13. Cutting speed and machining responses relationship analysis: (a) residual stress relationship, (b) cutting force relationship, (c) cutting temperature relationship, (d) surface roughness relationship.
Figure 14. Cooling pressure and machining responses relationship analysis: (a) residual stress relationship, (b) cutting force relationship, (c) cutting temperature relationship, (d) surface roughness relationship.
Figure 15. Feed rate and machining responses relationship analysis: (a) residual stress relationship, (b) cutting force relationship, (c) cutting temperature relationship, (d) surface roughness relationship.
Figure 16. Impact of machining parameters on cutting performance: feature importance analysis.
Figure 17. Performance of the optimized algorithms for all datasets: (a) R2, (b) MSE, (c) MAE, (d) RMSE, (e) Accuracy.
Table 1. The chemical constituents of the Ti-5553 alloy.
Element:    Al | Mo | V | Cr | Fe | Ti
Weight (%): 4.4–5.7 | 4.0–5.5 | 4.0–5.5 | 2.5–3.5 | 0.3–0.5 | Balance
Table 2. Properties of Ti-5553 (Ti–5Al–5Mo–5V–3Cr).
Density: 4650 kg/m³
Melting point: 1933 K
Specific heat: 520 J/kg·K
Thermal conductivity: 6.7 W/m·K
Thermal diffusivity: 2.76 mm²/s
Ultimate tensile strength: 1280 MPa
Young's modulus: 1.15 × 10⁵ MPa
Poisson's ratio: 0.33
Elongation: 8–15%
Brinell hardness: 350–400 HV
Table 3. Technical specifications of the machine tool and the factors and levels of the processing parameters used in the experimental design.
Machine tool: General-purpose ALEX ANL-75 CNC lathe (15 kW, spindle speed range 35–3500 rpm; manufactured in Taichung, Taiwan) with a Fanuc control system
Work material: Ti-5Al-5V-5Mo-3Cr, a near-β Ti alloy (Ti-5553)
Dimensions: Ø80 × 445 mm
Cutting tool and toolholder: Rhombic CNMG 120408 insert, (Ti,Al)N + TiN coated carbide; SECO Jetstream PCLNR tool holder
Cutting speed, Vc (m/min): 50, 80 and 120
Feed, f (mm/rev): 0.15, 0.25 and 0.35
Depth of cut, DoC (mm): 1
Cooling environment (high-pressure-assisted jet stream cooling): Dry, conventional (0.6 MPa) and HPC (30 MPa)
Cooling/lubrication fluid (CLF): Chemical-based, water-soluble oil at 5% concentration, delivered at 21 L/min through a 1.5 mm diameter brass nozzle; cutting tool rake angle of 5–6°
Table 4. Machining parameters, levels and responses.
Machining parameters:
Vc (m/min), cutting speed: Level 1 = 50; Level 2 = 80; Level 3 = 120
f (mm/rev), feed rate: Level 1 = 0.15; Level 2 = 0.25; Level 3 = 0.35
P (MPa), cooling pressure: Level 1 = dry; Level 2 = conventional; Level 3 = 30
Machining responses:
Fc (N), cutting force; Rs (MPa), residual stress; Tc (°C), cutting temperature; Ra (µm), surface roughness.
The four responses obtained from the analyses of each test were recorded against the 27 experimental runs defined by the General Full Factorial Design method (see the sketch after this table).
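As a minimal illustration, the 27-run full factorial design of Table 4 can be enumerated directly; the cooling-pressure levels are encoded here with assumed numerical placeholders (0 MPa for dry, 0.6 MPa for conventional).

```python
from itertools import product

cutting_speeds = [50, 80, 120]        # Vc (m/min)
feed_rates     = [0.15, 0.25, 0.35]   # f (mm/rev)
coolant        = [0.0, 0.6, 30.0]     # P (MPa): dry, conventional, HPC (placeholder encoding)

# 3 x 3 x 3 = 27 experimental runs of the General Full Factorial Design
runs = list(product(cutting_speeds, feed_rates, coolant))
for i, (vc, f, p) in enumerate(runs, start=1):
    print(f"Run {i:02d}: Vc={vc} m/min, f={f} mm/rev, P={p} MPa")
```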
Table 5. Residual Stress dataset test results and error metric values of machine learning algorithms. (A code-level sketch of the hyperparameter search is given after this table.)

Optimized models:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 0.9997 | 0.9989 | 0.9986 | 0.9981 | 0.9863
MSE | 6.8289 | 34.4913 | 46.2887 | 60.8995 | 451.063
MAE | 1.7122 | 2.8227 | 5.3949 | 4.1833 | 16.6629
RMSE | 2.6132 | 5.8729 | 6.8035 | 7.8038 | 21.2382
Accuracy (%) | 98.4792 | 94.9827 | 94.8783 | 93.0728 | 84.3172

Parameter grids for hyperparameter optimization:
Extra Trees: n_estimators [50, 100, 200]; max_depth [None, 10, 20, 30]; min_samples_split [2, 5, 10]; min_samples_leaf [1, 2, 4]
Random Forest: n_estimators [50, 100, 200]; max_depth [2, 4, 8, 16]; min_samples_split [2, 3, 4]; min_samples_leaf [2, 3, 4, 8]
Gradient Boosting: n_estimators [50, 75, 100]; max_depth [2, 4, 8]; min_samples_split [1, 2, 4]; min_samples_leaf [1, 2, 8]; learning_rate [0.05, 0.1, 0.5, 1]
KNN: n_neighbors [1, 2, 4, 8]; p [1, 2, 3, 4]
AdaBoost: n_estimators [30, 50, 75, 100, 200]; learning_rate [0.05, 0.1, 0.5, 1, 2]

Selected hyperparameter values:
Extra Trees: n_estimators 50; max_depth 20; min_samples_split 2; min_samples_leaf 2
Random Forest: n_estimators 50; max_depth 8; min_samples_split 2; min_samples_leaf 2
Gradient Boosting: n_estimators 75; learning_rate 0.05; max_depth 8; min_samples_split 2; min_samples_leaf 8
KNN: n_neighbors 4; p 1
AdaBoost: n_estimators 75; learning_rate 2

Without optimization:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 0.9997 | 0.9975 | 0.9991 | 0.99544 | 0.9820
MSE | 7.0673 | 80.7830 | 27.5253 | 150.3425 | 592.3764
MAE | 2.176 | 4.7259 | 3.913 | 5.4315 | 18.7525
RMSE | 2.6584 | 8.9879 | 5.2464 | 12.2614 | 24.3387
Accuracy (%) | 98.3999 | 93.004 | 96.3664 | 89.4361 | 82.7548
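The grid search summarized in Table 5 can be sketched as below, assuming a scikit-learn implementation; the synthetic placeholder data, train/test split ratio, cross-validation setting, and scoring choice are illustrative assumptions rather than the study's exact configuration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Placeholder data standing in for the augmented dataset: X = [Vc, f, P], y = residual stress (MPa)
rng = np.random.default_rng(0)
X = rng.uniform([50, 0.15, 0.0], [120, 0.35, 30.0], size=(200, 3))
y = 200.0 - 8.0 * X[:, 2] + 0.5 * X[:, 0] + rng.normal(0.0, 5.0, 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Grid taken from the Extra Trees column of Table 5
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

search = GridSearchCV(ExtraTreesRegressor(random_state=42), param_grid,
                      cv=5, scoring="neg_mean_squared_error", n_jobs=-1)
search.fit(X_train, y_train)

best = search.best_estimator_
print(search.best_params_)        # compare with the selected values reported in Table 5
print(best.feature_importances_)  # relative influence of Vc, f, and P (cf. Figure 12)
```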
Table 6. Residual Stress dataset optimized algorithms test results and error metric graphs. For each algorithm (Extra Trees, Random Forest, Gradient Boosting, KNN, AdaBoost), the table presents the prediction graph (train), the fault distribution graph, and the prediction graph (test & train); the graphs are not reproduced here.
Table 7. Cutting force dataset test results and error metric values of machine learning algorithms. (A sketch of the metric computation is given after this table.)

Optimized models:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 0.999 | 0.9941 | 0.999 | 0.9923 | 0.7756
MSE | 0.0010 | 10.8459 | 0.0867 | 12.0678 | 355.8604
MAE | 0.0043 | 1.2966 | 0.1982 | 2.2902 | 14.442
RMSE | 0.0317 | 3.2933 | 0.2946 | 3.4738 | 18.8642
Accuracy (%) | 0.999 | 99.6432 | 99.943 | 99.3442 | 95.972

Parameter grids for hyperparameter optimization:
Extra Trees: n_estimators [50, 100, 200]; max_depth [None, 10, 20, 30]; min_samples_split [2, 5, 10]; min_samples_leaf [1, 2, 4]
Random Forest: n_estimators [50, 100, 200]; max_depth [2, 4, 8, 16]; min_samples_split [2, 3, 4]; min_samples_leaf [2, 3, 4, 8]
Gradient Boosting: n_estimators [50, 75, 100]; max_depth [2, 4, 8]; min_samples_split [1, 2, 4]; min_samples_leaf [1, 2, 8]; learning_rate [0.05, 0.1, 0.5, 1]
KNN: n_neighbors [1, 2, 4, 8]; p [1, 2, 3, 4]
AdaBoost: n_estimators [30, 50, 75, 100, 200]; learning_rate [0.05, 0.1, 0.5, 1, 2]

Selected hyperparameter values:
Extra Trees: n_estimators 50; max_depth 20; min_samples_split 2; min_samples_leaf 2
Random Forest: n_estimators 50; max_depth 8; min_samples_split 2; min_samples_leaf 2
Gradient Boosting: n_estimators 75; learning_rate 0.05; max_depth 8; min_samples_split 2; min_samples_leaf 8
KNN: n_neighbors 4; p 1
AdaBoost: n_estimators 75; learning_rate 2

Without optimization:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 99.999 | 0.9934 | 0.9838 | 0.9880 | 0.7271
MSE | 2.07740 × 10⁻²⁵ | 12.2450 | 30.2194 | 19.009 | 432.919
MAE | 3.953 × 10⁻¹³ | 1.7933 | 4.2865 | 2.4617 | 15.9034
RMSE | 4.5578 × 10⁻¹³ | 3.499 | 5.4972 | 4.3599 | 20.806
Accuracy (%) | 99.999 | 99.4980 | 98.7925 | 99.2951 | 95.6271
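The error metrics reported in these tables can be reproduced from test-set predictions as sketched below; because the tables do not define the percentage "Accuracy", it is assumed here to be 100 minus the mean absolute percentage error.

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

def report_metrics(y_true, y_pred):
    """Return the five metrics used in Tables 5, 7, 9 and 11 for one response."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "R2": r2_score(y_true, y_pred),
        "MSE": mse,
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": float(np.sqrt(mse)),
        # Assumed definition: 100 minus the mean absolute percentage error
        "Accuracy (%)": 100.0 - 100.0 * float(np.mean(np.abs((y_true - y_pred) / y_true))),
    }

# Example with illustrative cutting force values (N)
print(report_metrics(np.array([350.0, 400.0, 520.0]), np.array([348.0, 405.0, 515.0])))
```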
Table 8. Cutting Force dataset optimized algorithms test results and error metric graphs. For each algorithm (Extra Trees, Random Forest, Gradient Boosting, KNN, AdaBoost), the table presents the prediction graph (train), the fault distribution graph, and the prediction graph (test & train); the graphs are not reproduced here.
Table 9. Cutting Temperature dataset test results and error metric values of machine learning algorithms. (A sketch instantiating the tuned models follows this table.)

Optimized models:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 99.999 | 0.9994 | 0.9997 | 0.9982 | 0.9738
MSE | 6.5182 × 10⁻²⁵ | 5.619 | 2.1709 | 14.3638 | 219.8588
MAE | 6.940 × 10⁻¹³ | 1.362 | 1.01962 | 2.3673 | 12.5081
RMSE | 8.0735 × 10⁻¹³ | 2.370 | 1.4734 | 3.7899 | 14.827
Accuracy (%) | 99.999 | 99.721 | 99.8025 | 99.521 | 97.5465

Parameter grids for hyperparameter optimization:
Extra Trees: n_estimators [50, 100, 200]; max_depth [None, 10, 20, 30]; min_samples_split [2, 5, 10]; min_samples_leaf [1, 2, 4]
Random Forest: n_estimators [50, 100, 200]; max_depth [2, 4, 8, 16]; min_samples_split [2, 3, 4]; min_samples_leaf [2, 3, 4, 8]
Gradient Boosting: n_estimators [50, 75, 100]; max_depth [2, 4, 8]; min_samples_split [1, 2, 4]; min_samples_leaf [1, 2, 8]; learning_rate [0.05, 0.1, 0.5, 1]
KNN: n_neighbors [1, 2, 4, 8]; p [1, 2, 3, 4]
AdaBoost: n_estimators [30, 50, 75, 100, 200]; learning_rate [0.05, 0.1, 0.5, 1, 2]

Selected hyperparameter values:
Extra Trees: n_estimators 50; max_depth 20; min_samples_split 2; min_samples_leaf 2
Random Forest: n_estimators 50; max_depth 8; min_samples_split 2; min_samples_leaf 2
Gradient Boosting: n_estimators 75; learning_rate 0.05; max_depth 8; min_samples_split 2; min_samples_leaf 8
KNN: n_neighbors 4; p 1
AdaBoost: n_estimators 75; learning_rate 2

Without optimization:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 99.999 | 0.9984 | 0.9971 | 0.9977 | 0.9685
MSE | 6.518 × 10⁻²⁵ | 16.512 | 30.902 | 19.218 | 264.378
MAE | 6.9405 × 10⁻¹³ | 2.7308 | 4.3286 | 2.4791 | 13.2734
RMSE | 8.0735 × 10⁻¹³ | 4.0636 | 5.55899 | 4.3839 | 16.259
Accuracy (%) | 99.999 | 99.4609 | 99.2055 | 99.4906 | 97.391
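For reference, the tuned configurations reported in Tables 5, 7, and 9 map directly onto scikit-learn estimators; the following sketch instantiates them, with any parameter not listed in the tables left at its library default.

```python
from sklearn.ensemble import (ExtraTreesRegressor, RandomForestRegressor,
                              GradientBoostingRegressor, AdaBoostRegressor)
from sklearn.neighbors import KNeighborsRegressor

# Tuned hyperparameters as reported in the "Selected hyperparameter values" rows
tuned_models = {
    "Extra Trees": ExtraTreesRegressor(n_estimators=50, max_depth=20,
                                       min_samples_split=2, min_samples_leaf=2),
    "Random Forest": RandomForestRegressor(n_estimators=50, max_depth=8,
                                           min_samples_split=2, min_samples_leaf=2),
    "Gradient Boosting": GradientBoostingRegressor(n_estimators=75, learning_rate=0.05,
                                                   max_depth=8, min_samples_split=2,
                                                   min_samples_leaf=8),
    "KNN": KNeighborsRegressor(n_neighbors=4, p=1),
    "AdaBoost": AdaBoostRegressor(n_estimators=75, learning_rate=2.0),
}

for name, model in tuned_models.items():
    print(name, model.get_params())
```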
Table 10. Cutting Temperature dataset: optimized algorithms, test results, and error metric graphs. For each algorithm (Extra Trees, Random Forest, Gradient Boosting, KNN, AdaBoost), the table presents the prediction graph (train), the fault distribution graph, and the prediction graph (test & train); the graphs are not reproduced here.
Table 11. Surface Roughness dataset test results and error metric values of machine learning algorithms.

Optimized models:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 99.999 | 0.9481 | 0.8034 | 0.7627 | 0.7237
MSE | 1.4230 × 10⁻²⁹ | 0.0152 | 0.0577 | 0.0574 | 0.0668
MAE | 3.1752 × 10⁻¹⁵ | 0.1006 | 0.1961 | 0.1924 | 0.21417
RMSE | 3.7722 × 10⁻¹⁵ | 0.1234 | 0.2403 | 0.2396 | 0.2586
Accuracy (%) | 99.999 | 95.9533 | 92.239 | 91.9548 | 91.120

Parameter grids for hyperparameter optimization:
Extra Trees: n_estimators [50, 100, 200]; max_depth [None, 10, 20, 30]; min_samples_split [2, 5, 10]; min_samples_leaf [1, 2, 4]
Random Forest: n_estimators [50, 100, 200]; max_depth [2, 4, 8, 16]; min_samples_split [2, 3, 4]; min_samples_leaf [2, 3, 4, 8]
Gradient Boosting: n_estimators [50, 75, 100]; max_depth [2, 4, 8]; min_samples_split [1, 2, 4]; min_samples_leaf [1, 2, 8]; learning_rate [0.05, 0.1, 0.5, 1]
KNN: n_neighbors [1, 2, 4, 8]; p [1, 2, 3, 4]
AdaBoost: n_estimators [30, 50, 75, 100, 200]; learning_rate [0.05, 0.1, 0.5, 1, 2]

Selected hyperparameter values:
Extra Trees: n_estimators 50; max_depth 20; min_samples_split 2; min_samples_leaf 2
Random Forest: n_estimators 50; max_depth 8; min_samples_split 2; min_samples_leaf 2
Gradient Boosting: n_estimators 75; learning_rate 0.05; max_depth 8; min_samples_split 2; min_samples_leaf 8
KNN: n_neighbors 4; p 1
AdaBoost: n_estimators 75; learning_rate 2

Without optimization:
Metric | Extra Trees | Random Forest | Gradient Boosting | KNN | AdaBoost
R2 | 0.74822 | 0.7219 | 0.7351 | 0.7349 | 0.7002
MSE | 0.07396 | 0.0816 | 0.0778 | 0.0641 | 0.0725
MAE | 0.2269 | 0.2351 | 0.22722 | 0.2030 | 0.2133
RMSE | 0.2719 | 0.2857 | 0.2789 | 0.2533 | 0.2694
Accuracy (%) | 90.9042 | 90.639 | 90.8623 | 91.4726 | 91.0005
Table 12. Surface Roughness dataset optimized algorithms test results and error metric graphs. For each algorithm (Extra Trees, Random Forest, Gradient Boosting, KNN, AdaBoost), the table presents the prediction graph (train), the fault distribution graph, and the prediction graph (test & train); the graphs are not reproduced here.
Table 13. Statistical characteristics of experimental outputs. (A minimal sketch of the computation follows this table.)
Response | Mean | Standard Deviation
Fc (N) | 356.17 | 48.95
Tc (°C) | 525.57 | 106.20
Rs (MPa) | 52.13 | 194.86
Ra (µm) | 2.64 | 0.46
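The statistics in Table 13 are the sample mean and standard deviation of the experimental responses; a minimal pandas sketch is given below, with illustrative placeholder values standing in for the actual 27 measurements.

```python
import pandas as pd

# Illustrative placeholder values standing in for the 27 experimental responses
df = pd.DataFrame({
    "Fc (N)":   [310.5, 356.2, 402.8],
    "Tc (°C)":  [420.0, 525.6, 640.1],
    "Rs (MPa)": [-150.0, 52.1, 260.3],
    "Ra (µm)":  [2.1, 2.6, 3.2],
})

# Sample mean and standard deviation per response, as summarized in Table 13
summary = df.agg(["mean", "std"]).T
print(summary)
```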
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
