Machine Learning Prediction of Turning Precision Using Optimized XGBoost Model

: The present study proposes a machine learning approach for optimizing turning parameters in such a way as to maximize the turning precision. The Taguchi method is ﬁrst employed to optimize the turning parameters, and the experimental results are then used to train three machine learning models to predict the turning precision for any given values of the input parameters. The model which shows the best prediction performance (XGBoost) is further improved through the use of a synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN) and four different optimization algorithms, including center particle swarm optimization (CPSO). Finally, the performances of the various models are evaluated and compared using the leave-one-out cross-validation technique. The experimental results show that the XGBoost model, combined with SMOGN and CPSO, provides the best performance, and is a useful tool for predicting the machining error of turning. The method can also reduce the cost of obtaining the optimized turning parameters corresponding with the predicted machining error.


Introduction
Advances in machine tool technology make possible the creation of parts with tolerances as fine as ±0.0025 mm. Thus, precision machining plays a key role in fabricating many of the critical tight-tolerance components used in the military, electronics, aerospace, and medical industries nowadays. However, the success of precision machining methods still relies on the appropriate determination of the machining parameters. In the past, the optimal processing parameters were determined mainly by operator experience, or through a process of trial-and-error. However, such an approach is not only time-consuming, expensive, and subjective, but also offers no guarantee of finding the optimal solution. Furthermore, the knowledge learned in this way is not easily documented and shared with others. Thus, to realize the goals of high efficiency, high precision, and low cost, precision machining requires faster and more systematic approaches for identifying the optimal processing parameters.
Computer numerical control (CNC) machines are widely used throughout the manufacturing industry. CNC turning machines have many advantages for mass production, including a high speed, high precision, excellent repeatability, minimal manual intervention, and good potential for automation.
The cutting behavior of turning is determined by two components: the toolpath and the feed rate [1]. The "toolpath" is the trace on the workpiece that the cutter moves along, while the "feed rate" is the speed at which the cutter moves over the surface of the workpiece. When cutting metals, the machining quality is affected by two main types of 2 of 19 factors, namely human and environmental. In most cases, the environmental factors, e.g., the age and condition of the machine, the machining temperature, the workpiece-material properties, and so on, are difficult to control. Accordingly, in attempting to improve the quality of the CNC machining process, many studies have focused on the problem of controlling the human factors, in particular, the settings assigned by the operator or designer of the machining parameters, such as the machining depth, rotational speed, feed rate, and cantilever cutting length. The machining depth has a direct effect on the cutting force exerted on the cutter. An excessive force may cause vibration of the cutter, which not only causes tool wear, but also significantly decreases the machining quality. The rotational speed and feed rate govern the speed at which the cutter moves over the workpiece. In general, a higher rotational speed and/or feed rate increases the rate at which the material is removed from the workpiece. However, an excessive speed may increase the cutting force, while a low speed increases the contact time between the cutter and the workpiece, and may thus result in the production of scratches. Finally, an excessive cantilever cutting length reduces the stability of the cutter at the contact point between the tip and the workpiece, and may therefore degrade the precision of the machining process. Hence, it is essential that all four machining parameters be optimized in order to ensure the quality and precision of the turning process.
Su et al. [2] presented a multi-objective optimization framework based on the robust Taguchi experimental design method, the grey relational analysis (GRA), and the response surface methodology (RSM) for determining the optimal cutting parameters in the turning of AISI 304 austenitic stainless steel. It was shown that, given the optimal settings of the turning parameters, the surface roughness (Ra) was reduced by 66.90%, and the material removal rate (MRR) was increased by 8.82%. Furthermore, the specific energy consumption (SEC) of the turning process was reduced by 81.46%. Akkus and Yaka [3] conducted a series of experiments to investigate the optimal cutting speed, feed rate, and depth of cut for the turning of Ti6Al4V titanium alloy. The optimality of the turning parameters was evaluated by examining their effects on surface roughness, vibration energy, and energy consumption. In general, the results revealed that the quality of the machining outcome was governed mainly by the feed rate. Zhou He et al. [4] proposed a genetic-gradient boosting regression tree (GA-GBRT) model, and used cutting speed, feed rate, and depth of cut as inputs to predict the surface roughness of milling, and then matched AISI 304 stainless steel for the cutting experiment. The results showed that the best processing parameters with the highest efficiency and best quality could be obtained based on the GA-GBRT model. Wu et al. [5] predicted the surface roughness when milling Inconel 718 through the Elman neural network according to the cutting conditions, vibration, and current signals during cutting, and then took the maximum cutting efficiency as the consideration and used the particle swarm optimization (PSO) algorithm to optimize cutting parameters. Sivalingam et al. [6] conducted the turning of Hastelloy X materials with PVD Ti-Al-N tools in different machining environments, and evaluated the cutting performances according to the cutting force, surface roughness, and cutting temperature caused by different cutting conditions. They found that the moth-flame optimization (MFO) algorithm could be applied to match the multiple linear regression models (MLRM) model to select the best processing parameters.
In response to the rise of artificial intelligence, this paper uses machine learning methods to predict turning machining error. Before using the machine learning method, a database of machining parameters and corresponding machining errors must be established to train the model. Due to the huge number of permutations and combinations of processing parameters, it is important to improve the efficiency of data collection. For this reason, this paper uses the Taguchi method to design the experiment, which can collect the most comprehensive data with the least number of experiments.
Machine learning [7] has proven effective in solving many problems which require extensive experimentation to obtain reliable results, or which involve such massive volumes of data that they cannot realistically be solved by manual analysis. Supervised machine learning models use a labelled dataset to learn the mapping function which most accurately describes the relationship between the input variables and the output (or outputs). Having trained the model, it is tested using a testing dataset and predicts the output accordingly. Such models have been widely applied for face detection, stock price prediction, spam detection, risk assessment, and so on. They have also been used to solve various problems in the field of smart manufacturing. For example, Lu et al. [8] studied the use of the group method of data handling (GMDH) model to predict the surface roughness of micromilling LF21 antenna. The spindle speed and feed rate were used as model inputs, and the surface roughness was the output. The prediction accuracy could reach an average relative error of 13.92%, the maximum relative error was 21.22%, and finally, the best cutting parameters were found through the prediction results. Chen et al. [9] used a back-propagation neural network (BPNN) to predict the surface roughness of a milling workpiece. The model inputs were the depth of cut, feed rate, spindle speed and milling distance. The prediction accuracy was expressed by the root-mean-square-error (RMSE), which could reach 0.008, and analyzed the influence of various parameters on the surface roughness. The experimental results showed that the feed rate had the greatest influence on the surface roughness. Benardos and Vosniakos, et al. [10] used an artificial neural network (ANN) model based on the Levenberg-Marquardt (LM) training algorithm to predict the surface roughness in CNC face milling. In the proposed approach, the training and testing data were obtained using the Taguchi design of experiment (DoE) method based on seven control factors, namely the depth of cut, the feed rate per tooth, the cutting speed, the engagement and wear of the cutting tool, the cutting fluid, and the cutting force. It was shown that the trained model achieved a mean square error (MSE) of just 1.86% for the surface roughness of the milled component. Cus and Zuperl [11] developed an ANN for predicting the optimal cutting conditions subject to technological, economic, and organizational constraints. Asiltürk and Çunkaş [12] combined an ANN model and a regression analysis technique to predict the surface roughness of turned AISI 1040 steel. The model was trained using three different algorithms, namely back-propagation, scaled conjugate gradient (SCG), and Levenberg-Marquardt (LM). The results showed that the ANN model trained with the SCG algorithm provided a significantly better prediction accuracy than that obtained from a traditional, regression-based model. Pontes et al. [13] used the Taguchi DOE method to optimize the parameters of a radial base function (RBF) model for predicting the mean value of the surface roughness (Ra) of AISI 52,100 hardened steel in hand-turning processes. In general, the results confirmed that the use of the DOE approach to obtain the training data required to optimize the RBF model was far more efficient than traditional trial-and-error methods. The above studies focus mainly on the effects of the depth of cut, feed rate, and cutting speed on the surface roughness of machined components. However, besides the surface roughness, the machining error of the machining outcome is also extremely important, particularly for tight-tolerance components. The application of machine learning to the optimization of the cutting parameters is generally utilized by an ANN model [11,12]. However, such models, which have a simple structure consisting of an input layer, one or more hidden layers, and an output layer, achieve only a relatively poor prediction performance for machining precision. Accordingly, the present study examines the performance of three different tree-based machine learning (ML) models in predicting the precision of the turning process, namely random forest, XGBoost, and decision tree. For each model, the data required for training purposes are obtained using the Taguchi DOE method [14] based on four turning parameters: the machining depth, the rotational speed, the feed rate, and the cantilever cutting length. The model which shows the best prediction performance is further enhanced through the use of an over-sampling technique and four different optimization algorithms, namely genetic algorithm (GA), grey wolf (GW), PSO, and center particle swarm optimization (CPSO). Finally, the performances of the various models are evaluated and compared using the leave-one-out cross-validation technique.
In response to the rapid development of present technology, the requirements for product quality are also increasing, and the quality is closely related to the processing parameters. At present, there is no systematic method for the optimization of processing parameters in the industry, and most of them rely on empirical rules to overcome, but the disadvantages are that they are time-consuming and difficult to inherit techniques. In terms of research methods, in the past, it was necessary to rely on methods such as analyzing the structural characteristics of the machine or analyzing the cutting behavior in order to find suitable parameters more efficiently, but it was difficult and more expensive. Therefore, this paper proposes a kind of parameter optimization method to improve this problem. This method is to use the Taguchi method with the machine learning method for parameter optimization. In this paper, the Taguchi method and oversampling techniques are applied to improve the efficiency of data collection. Machine learning methods and optimization algorithms are then used to estimate the turning accuracy. The remainder of this paper is organized as follows: Section 2 briefly describes the setup of the Taguchi DOE method employed in the present study; Section 3 introduces the details of each of the major components of the proposed methodology, including the data preprocessing step, the ML models, the oversampling technique, and the optimization algorithms; Section 4 presents and discusses the experimental results; and finally, Section 5 provides some brief concluding remarks. Figure 1 illustrates the four machining parameters considered in the present study, namely the depth of machining, the rotational speed, the feed rate, and the length of cantilever workpiece. The machining error is defined as the difference between the actual size and ideal size of diameter of a machined workpiece. The parameters have both individual and interactive effects on the precision of the machining process. Consequently, determining their respective impacts on the machining precision is challenging when using traditional trial-and-error methods. Accordingly, in the present study, the individual and interactive effects of the four parameters are investigated using the Taguchi DOE method, which allows for the optimal solution to be obtained with the minimal number of experiments. In the Taguchi method, the experiments are arranged in an orthogonal array (OA), where the configuration of this array depends on the number of control factors and level settings to be considered in the optimization process. In the present study, the aim is to investigate the effects of four machining parameters on the machining precision. Hence, the Taguchi experiments involve four control factors. Furthermore, each control factor is assigned three level settings, where these level settings are determined in accordance with the capabilities of the CNC machine used in the DOE trials (see Table 1). Accordingly, the experiments were configured in an L9 (3 4 ) OA containing nine experimental runs.

Proposed Method
In the present study, the effects of the four machining parameters on the turning precision were predicted using a supervised MK model. Most supervised-learning problems

Proposed Method
In the present study, the effects of the four machining parameters on the turning precision were predicted using a supervised MK model. Most supervised-learning problems can be categorized as either "classification" problems or "regression" problems. Classification problems involve predicting a discrete valued output, such as a number or a certain species. By contrast, regression problems involve predicting a continuous valued output, such as stock market movement or the price of goods. The present study considered a regression-type problem, in which the aim was to predict the machining error of the turning process, given a knowledge of the settings assigned to the four machining parameters. The data required to train the ML model were obtained from the Taguchi DOE factorial design described in the previous section, where the level settings for the four machining parameters are listed in Table 2. The machining error is calculated by the following equation: where D target is the diameter of the turning target size and D measured is the diameter of the measured value after the turning process.
The turning experiments were performed on a Feeler turning machine (model: FTC-10), as shown in Figure 2, using cylindrical S45C medium carbon steel specimens. The specifications of the workpiece and cutting tool are shown in Table 3.    Prior to constructing the ML model, the experimental data obtained from the Taguchi experiments were pre-processed in order to accelerate the convergence of the training process and improve the prediction accuracy. Data pre-processing was performed using the min-max normalization [15] technique. In other words, the minimum and maximum values of each variable in the dataset were mapped to 0 and 1, respectively, and all the other values in between were scaled accordingly within the range of [0, 1]. In other words, the normalization process was performed as where X normalize is the value after normalization and x max , x min , and x are the maximum, minimum, and original values, respectively. The normalization process is illustrated schematically in Figure 3.  Prior to constructing the ML model, the experimental data obtained from the Taguchi experiments were pre-processed in order to accelerate the convergence of the training process and improve the prediction accuracy. Data pre-processing was performed using the min-max normalization [15] technique. In other words, the minimum and maximum values of each variable in the dataset were mapped to 0 and 1, respectively, and all the other values in between were scaled accordingly within the range of [0, 1]. In other words, the normalization process was performed as where is the value after normalization and xmax, xmin, and x are the maximum, minimum, and original values, respectively. The normalization process is illustrated schematically in Figure 3. Following data processing, the model was trained and tested using the leave-one-out cross-validation technique [16]. Briefly, one data sample in the dataset was taken as an assessment set (i.e., a testing set), and all the other data samples were taken as a training set. The process was repeated until every data sample in the dataset had been taken as the assessment set once. As shown in Table 2, the Taguchi DOE design involved nine experimental runs. In other words, the Taguchi experiments generated nine data samples, with each sample consisting of the values assigned to the four machining parameters and the corresponding machining precision. Accordingly, the leave-one-out procedure was repeated nine times, as shown in Figure 4, with the prediction accuracy calculated each time. Following data processing, the model was trained and tested using the leave-one-out cross-validation technique [16]. Briefly, one data sample in the dataset was taken as an assessment set (i.e., a testing set), and all the other data samples were taken as a training set. The process was repeated until every data sample in the dataset had been taken as the assessment set once. As shown in Table 2, the Taguchi DOE design involved nine experimental runs. In other words, the Taguchi experiments generated nine data samples, with each sample consisting of the values assigned to the four machining parameters and the corresponding machining precision. Accordingly, the leave-one-out procedure was repeated nine times, as shown in Figure 4, with the prediction accuracy calculated each time. The present study considered three ML models: random forest (RF), XGBoost (XGB), and decision tree (DT). Having evaluated the prediction performance of all three models, the model with the best performance was further improved using a synthetic minority  The present study considered three ML models: random forest (RF), XGBoost (XGB), and decision tree (DT). Having evaluated the prediction performance of all three models, the model with the best performance was further improved using a synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN) and four different optimization algorithms: genetic algorithm (GA), grey wolf (GW), particle swarm optimization (PSO), and center particle swarm optimization (CPSO). The prediction performances of the various models were then evaluated and compared. The details of the ML models, oversampling technique, and optimization algorithms are provided in the following sections.

Random Forest
Random forest (RF) [17] combines the strengths of the bagging and decision tree algorithms, as shown in Figure 5. Bagging is used first to generate randomly distributed training data, and this data is then randomly assigned to multiple decision trees for training purposes. Since each decision tree uses different training data, each trained decision tree is different from the others. The weight of each decision tree is thus obtained via majority voting. As the training and testing process proceeds, the weaker decision trees are gradually combined to construct a stronger model with a better prediction performance. The present study considered three ML models: random forest (RF), XGBoost (XGB), and decision tree (DT). Having evaluated the prediction performance of all three models, the model with the best performance was further improved using a synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN) and four different optimization algorithms: genetic algorithm (GA), grey wolf (GW), particle swarm optimization (PSO), and center particle swarm optimization (CPSO). The prediction performances of the various models were then evaluated and compared. The details of the ML models, oversampling technique, and optimization algorithms are provided in the following sections.

Random Forest
Random forest (RF) [17] combines the strengths of the bagging and decision tree algorithms, as shown in Figure 5. Bagging is used first to generate randomly distributed training data, and this data is then randomly assigned to multiple decision trees for training purposes. Since each decision tree uses different training data, each trained decision tree is different from the others. The weight of each decision tree is thus obtained via majority voting. As the training and testing process proceeds, the weaker decision trees are gradually combined to construct a stronger model with a better prediction performance.

XGBoost
Extreme gradient boosting (XGBoost or XGB) [18] combines the strengths of bagging and boosting, as shown in Figure 6. In particular, an additive model is first constructed, consisting of multiple base models. A new tree is then added to the model in order to

XGBoost
Extreme gradient boosting (XGBoost or XGB) [18] combines the strengths of bagging and boosting, as shown in Figure 6. In particular, an additive model is first constructed, consisting of multiple base models. A new tree is then added to the model in order to correct the error produced by the previous tree, thereby improving the overall strength of the model. The process proceeds iteratively in this way until either no further improvement is obtained over a specified number of epochs, or the number of trees in the XGBoost structure reaches the maximum prescribed number. correct the error produced by the previous tree, thereby improving the overall strength of the model. The process proceeds iteratively in this way until either no further improvement is obtained over a specified number of epochs, or the number of trees in the XGBoost structure reaches the maximum prescribed number.
In the present study, the maximum depth of the tree structure was set as 500, the number of samples was set as 1000, the learning rate was set as 0.3, and the booster was set as "gbtree".

Decision Tree
The decision tree (DT) model [19] can be used to solve both classification and regression problems. In the present study, the model was used to solve the regression problem  In the present study, the maximum depth of the tree structure was set as 500, the number of samples was set as 1000, the learning rate was set as 0.3, and the booster was set as "gbtree".

Decision Tree
The decision tree (DT) model [19] can be used to solve both classification and regression problems. In the present study, the model was used to solve the regression problem of predicting the turning precision based on the values of the four machining parameters. As shown in Figure 7, the DT structure consisted of three types of nodes: a root node, multiple interior nodes, and several output nodes. The root node represented the entire sample, while the interior nodes represented the features of the data set and the leaf nodes represented the regression outputs. Moreover, the branches between the nodes represented the decision rules (formulated as True or False). Briefly, each data point in the dataset was run through the tree until it reached an output node. The final prediction result was then obtained by computing the average value of the dependent variable at each node.

Decision Tree
The decision tree (DT) model [19] can be used to solve both classification and regression problems. In the present study, the model was used to solve the regression problem of predicting the turning precision based on the values of the four machining parameters. As shown in Figure 7, the DT structure consisted of three types of nodes: a root node, multiple interior nodes, and several output nodes. The root node represented the entire sample, while the interior nodes represented the features of the data set and the leaf nodes represented the regression outputs. Moreover, the branches between the nodes represented the decision rules (formulated as True or False). Briefly, each data point in the dataset was run through the tree until it reached an output node. The final prediction result was then obtained by computing the average value of the dependent variable at each node.
In this study, the depth of the DT structure was set as 10, the parameter controlling the randomness of the estimator was set as 0, and the feature split randomness criterion was set as "best" (meaning that the DT chose the split preferably based on the most important feature).  In this study, the depth of the DT structure was set as 10, the parameter controlling the randomness of the estimator was set as 0, and the feature split randomness criterion was set as "best" (meaning that the DT chose the split preferably based on the most important feature).

Oversampling
As described above, each of the three models was trained and tested using the leaveone-out cross-validation method. The best model was then further improved using the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN) [20]. In general, the purpose of oversampling is to overcome the problem of data imbalance in the dataset and/or insufficient available data. To preserve the original data features in the dataset, the data instances with few but important features are first identified. Synthetic data with similar features are then created, such that the final dataset contains a balanced number of features. In implementing the oversampling method, the sampling factor is set in accordance with the imbalance proportion of the data. In addition, the k-NN algorithm is applied to each sample X in the minority class. In particular, the distances between the sample and all the other samples in the same minority class are calculated in order to find the k minority class samples that are the closest to X. One of the minority class examples is then randomly selected and inserted into the following equation: where δ is a random factor with a value in the interval of [0, 1] andX i is the minority class example selected (see Figure 8).
between the sample and all the other samples in the same minority class are calculated in order to find the k minority class samples that are the closest to X. One of the minority class examples is then randomly selected and inserted into the following equation: where is a random factor with a value in the interval of [0, 1] and is the minority class example selected (see Figure 8).

Model Optimization
Following the over-sampling process, optimization algorithms [21] were applied to further tune the parameters of the ML model. Four algorithms were considered, namely GA, GW, PSO and CPSO. The details of each algorithm are described in the following.
The genetic algorithm (GA) optimizer [22] was a randomized search algorithm which imitated the processes of selection and reproduction in nature. The algorithm commenced by constructing an initial population of potential candidate solutions, where each candidate was encoded in the form of a string. A set of these candidates was then selected as the initial guessed values, and selection, crossover, and mutation operations were performed based on the fitness of these guessed values in order to create a new population of improved candidate solutions. The algorithm iterated in this way until the specified termination criteria were satisfied, at which point the candidate with the best fit was decoded and taken as the optimal solution. Figure 9 illustrates the basic workflow of the GA algorithm.

Model Optimization
Following the over-sampling process, optimization algorithms [21] were applied to further tune the parameters of the ML model. Four algorithms were considered, namely GA, GW, PSO and CPSO. The details of each algorithm are described in the following.
The genetic algorithm (GA) optimizer [22] was a randomized search algorithm which imitated the processes of selection and reproduction in nature. The algorithm commenced by constructing an initial population of potential candidate solutions, where each candidate was encoded in the form of a string. A set of these candidates was then selected as the initial guessed values, and selection, crossover, and mutation operations were performed based on the fitness of these guessed values in order to create a new population of improved candidate solutions. The algorithm iterated in this way until the specified termination criteria were satisfied, at which point the candidate with the best fit was decoded and taken as the optimal solution. Figure 9 illustrates the basic workflow of the GA algorithm.  The grey wolf (GW) optimizer [23] is an intelligent optimization algorithm that mimics the group hunting behavior of grey wolves, and involves three main steps, namely, establishing hierarchy, encircling, and attacking. As shown in Figure 10, three solutions (alpha, beta, and delta) are first selected as the best solutions among the initial population. In each iteration, the position of the prey (i.e., the optimal solution) is updated according to the positions of these wolves, based on = ⋅ − (4) The grey wolf (GW) optimizer [23] is an intelligent optimization algorithm that mimics the group hunting behavior of grey wolves, and involves three main steps, namely, establishing hierarchy, encircling, and attacking. As shown in Figure 10, three solutions (alpha, beta, and delta) are first selected as the best solutions among the initial population. In each iteration, the position of the prey (i.e., the optimal solution) is updated according to the positions of these wolves, based on where t is the current iteration; A and C are the auxiliary coefficient vectors; X p (t) is the position vector of the prey; and X(t) is the current position vector of the wolves. α drops from 2 to 0 as the iteration process proceeds, and r 1 and r 2 are random vectors in the interval of [0, 1]. In the hunting process, when → A > 1, the wolves diverge. By contrast, when → A < 1, they converge, and search a more localized area in order to pinpoint the prey. The particle swarm optimization (PSO) algorithm [24] encodes each candidate solution as a particle within the feasible search space, and gradually converges toward the optimal solution based on the individual and collective experiences of all the members of the swarm. As shown in Figure 11, the optimization process involves three iterative steps: (one) assigning the initial positions of all the particles and recording the position of the best solution among them, (two) calculating the acceleration vector for each particle and moving the particle to a new position, and (three) updating the personal best solutions of the individual particles and the global best solution. The related PSO equations are expressed as follows:  The particle swarm optimization (PSO) algorithm [24] encodes each candidate solution as a particle within the feasible search space, and gradually converges toward the optimal solution based on the individual and collective experiences of all the members of the swarm. As shown in Figure 11, the optimization process involves three iterative steps: (one) assigning the initial positions of all the particles and recording the position of the best solution among them, (two) calculating the acceleration vector for each particle and moving the particle to a new position, and (three) updating the personal best solutions of the individual particles and the global best solution. The related PSO equations are expressed as follows: where V k i is the velocity of individual i in the k-th iteration, ω is the weight, c 1 and c 2 are the acceleration constants, and r 1 and r 2 are random values in the interval [0, 1]. In addition, X k i is the position of individual i in the k-th iteration, Pbest k i is the best position of individual i in each iteration, and Gbest k is the best solution in the entire domain. Appl. Sci. 2022, 12, x FOR PEER REVIEW 12 of 20 Figure 11. Conceptual representation of the PSO algorithm.
Center particle swarm optimization (CPSO) [25] is an extension of the PSO algorithm, and aims to reach the optimal solution more rapidly through the introduction of a center particle located in the middle of all the particles in the swarm (see Figure 12). The rationale for this approach lies in the fact that the center of the personal best solutions of all the particles is located closer to the best solution than the global best solution. Thus, in the CPSO algorithm, the position of the center particle is taken in place of the global best position in the original PSO algorithm in order to improve the convergence speed. The position of the center particle is computed as where is the center position in the (k + 1)-th iteration, N is the number of particles in the swarm, and is the position of the i-th particle in the (k + 1)-th iteration. Figure 11. Conceptual representation of the PSO algorithm.
Center particle swarm optimization (CPSO) [25] is an extension of the PSO algorithm, and aims to reach the optimal solution more rapidly through the introduction of a center particle located in the middle of all the particles in the swarm (see Figure 12). The rationale for this approach lies in the fact that the center of the personal best solutions of all the particles is located closer to the best solution than the global best solution. Thus, in the CPSO algorithm, the position of the center particle is taken in place of the global best position in the original PSO algorithm in order to improve the convergence speed. The position of the center particle is computed as where X (k+1) cd is the center position in the (k + 1)-th iteration, N is the number of particles in the swarm, and X k i is the position of the i-th particle in the (k + 1)-th iteration. Figures 13 and 14 show the training procedures for the single RF, XGBoost, and DT models, and the optimized models, respectively. As shown in Figure 13, the training process for the single models involves the following steps: (one) loading the data, (two) data normalization, (three) cross validation (separating the training set and test set), (four) model training, (five) prediction, and (six) prediction accuracy evaluation. Similarly, the main steps in the training process for the optimized models include: (one) loading the data, (two) data normalization, (three) cross validation, (four) oversampling, (five) model optimization, (six) model training using the optimized parameters, (seen) prediction, and (eight) prediction accuracy evaluation (see Figure 14).  Figures 13 and 14 show the training procedures for the single RF, XGBoost, and DT models, and the optimized models, respectively. As shown in Figure 13, the training process for the single models involves the following steps: (one) loading the data, (two) data normalization, (three) cross validation (separating the training set and test set), (four) model training, (five) prediction, and (six) prediction accuracy evaluation. Similarly, the main steps in the training process for the optimized models include: (one) loading the data, (two) data normalization, (three) cross validation, (four) oversampling, (five) model optimization, (six) model training using the optimized parameters, (seen) prediction, and (eight) prediction accuracy evaluation (see Figure 14).

Performances of Different ML Models
As described in Section III, the ML models were trained and tested using the leaveone-out cross-validation technique with nine data samples. For each testing process, the prediction performance of the model was evaluated using three different metrics: the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination ( ).
The RMSE computes the square root of the ratio of the square of the errors between the predicted values and the actual values over the number of data instances. The square relationship in the RMSE formula renders the metric sensitive to very large or very small errors. A larger RMSE indicates a poorer prediction performance, and vice versa. The mean absolute error (MAE) computes the sum of the absolute values of the differences between the actual values of the machining error and the predicted values. In other words, a smaller MAE indicates a better prediction accuracy, and vice versa.
Finally, the coefficient of determination ( or the score) is a statistical measure representing the proportion of variance in the dependent variable. In other words, it provides an indication of how well (or otherwise) the prediction model fits the actual data. The score has a value in the interval of [0, 1], where a value closer to 1 indicates a better fit of the prediction model.
The leave-one-out cross-validation process revealed that all of the trained single models (RF, XGBoost, and DT) had a poor prediction performance. However, among the

Performances of Different ML Models
As described in Section 3, the ML models were trained and tested using the leaveone-out cross-validation technique with nine data samples. For each testing process, the prediction performance of the model was evaluated using three different metrics: the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R 2 ).
The RMSE computes the square root of the ratio of the square of the errors between the predicted values and the actual values over the number of data instances. The square relationship in the RMSE formula renders the metric sensitive to very large or very small errors. A larger RMSE indicates a poorer prediction performance, and vice versa. The mean absolute error (MAE) computes the sum of the absolute values of the differences between the actual values of the machining error and the predicted values. In other words, a smaller MAE indicates a better prediction accuracy, and vice versa.
Finally, the coefficient of determination (R 2 or the R 2 score) is a statistical measure representing the proportion of variance in the dependent variable. In other words, it provides an indication of how well (or otherwise) the prediction model fits the actual data. The R 2 score has a value in the interval of [0, 1], where a value closer to 1 indicates a better fit of the prediction model.
The leave-one-out cross-validation process revealed that all of the trained single models (RF, XGBoost, and DT) had a poor prediction performance. However, among the three models, the best performance was obtained by the XGBoost model, with a RMSE of 0.007887, a MAE of 0.0067, and an R 2 score of 0.554. Accordingly, the XGBoost model was selected for further improvement using the SMOGN over-sampling method and four optimization algorithms. Figure 15 shows the learning curves of the four optimizers. Note that in compiling the results, the RMSE values of the optimizers were taken as the fitness value, and once a better fitness value was found in the model optimization process, it was used in place of the best fitness value. In other words, a lower learning curve in Figure 15 indicates an improved optimization performance, and shows that the CPSO optimizer yields the best learning performance.
improved optimization performance, and shows that the CPSO optimizer yields the best learning performance. Figures 16-22 show the prediction results obtained for the seven ML models (i.e., the three single models and the four over-sampled and optimized XGBoost models). For each model, the figures show both a direct comparison of the actual and predicted machining errors and the corresponding residual plot.  three single models and the four over-sampled and optimized XGBoost models). For each model, the figures show both a direct comparison of the actual and predicted machining errors and the corresponding residual plot.           For ease of visualization and comparison, the prediction errors with a value greater than 0.008 in Figures 16-22 are shown in orange, while those with a value between 0.003 and 0.008 are shown in green, and those with a value less than 0.003 are shown in blue. Thus, the prediction accuracy of each model can be crudely evaluated by the number of points of each color in the respective plots. A more quantitative evaluation of the performance of each model can be obtained from the corresponding RMSE, MAE, and R 2 values. Figure 23 compares the predicted machining errors of the seven models with the actual machining error for each of the nine runs in the Taguchi OA. Overall, the results show that the SMOGN-CPSO-XGB model and SMOGN-GWO-XGB model yield the best fits with the actual prediction error in each run. By contrast, the RF model yields the poorest fit with the experimental data. Table 4 lists the MAE, RMSE, and R 2 values of the seven models. In general, the results confirm the effectiveness of the SMOGN technique and optimization algorithms in improving the prediction performance, compared with that of the single models. Moreover, the results confirm that, among all of the models, the XGBoost model optimized with SMOGN and CPSO achieves the lowest MAE and RMSE values and the highest R 2 score.
The values of R 2 for SMOGN-CPSO-XGB and SMOGN-GWO-XGB are 0.661 and 0.64, respectively. It can be seen that SMOGN-CPSO-XGB and SMOGN-GWO-XGB have better prediction effects than the others and perform a good performance, as shown in Table 4. If the two models of SMOGN-CPSO-XGB and SMOGN-GWO-XGB are compared by evaluating the three indicators of MAE, RMSE, and R 2 , the performances of MAE and R 2 for SMOGN-CPSO-XGB model are significantly better than the SMOGN-GWO-XGB model. Accordingly, the SMOGN-CPSO-XGB model has the best predictive ability and effect for turning process. It is also shown that from the R 2 value, the prediction effect of SMOGN-CPSO-XGB model is significantly higher than that of the decision tree model by 23%. In other words, the SMOGN-CPSO-XGB model is the best fitted model among the considered alternatives.  Table 4 lists the MAE, RMSE, and R values of the seven models. In general, the results confirm the effectiveness of the SMOGN technique and optimization algorithms in improving the prediction performance, compared with that of the single models. Moreover, the results confirm that, among all of the models, the XGBoost model optimized with SMOGN and CPSO achieves the lowest MAE and RMSE values and the highest score. The values of R 2 for SMOGN-CPSO-XGB and SMOGN-GWO-XGB are 0.661 and 0.64, respectively. It can be seen that SMOGN-CPSO-XGB and SMOGN-GWO-XGB have better prediction effects than the others and perform a good performance, as shown in Table 4. If the two models of SMOGN-CPSO-XGB and SMOGN-GWO-XGB are compared by evaluating the three indicators of MAE, RMSE, and R 2 , the performances of MAE and R 2 for SMOGN-CPSO-XGB model are significantly better than the SMOGN-GWO-XGB model. Accordingly, the SMOGN-CPSO-XGB model has the best predictive ability and effect for turning process. It is also shown that from the R 2 value, the prediction effect of SMOGN-CPSO-XGB model is significantly higher than that of the decision tree model by 23%. In other words, the SMOGN-CPSO-XGB model is the best fitted model among the considered alternatives.  Figure 23 and Table 4, it can be seen that the SMOGN-CPSO-XGB model can accurately predict the surface accuracy of the workpiece after the turning process.

Confirmation Experiments for the Prediction of the SMOGN-CPSO-XGB Model
In Table 2, the optimized combination of parameters is A3B1C1D2, which means the depth of machining, rotational speed, feed rate, and length of cantilever workpiece are 1.0 mm, 2000 rpm, 0.15 mm/rev, and 60 mm, respectively. This set of parameters is then applied to be the input of the SMOGN-CPSO-XGB model, and the obtained predicted value of machining error is 0.005163 mm within 0.002 s. The result is the best of all values of  From Figure 23 and Table 4, it can be seen that the SMOGN-CPSO-XGB model can accurately predict the surface accuracy of the workpiece after the turning process.

Confirmation Experiments for the Prediction of the SMOGN-CPSO-XGB Model
In Table 2, the optimized combination of parameters is A3B1C1D2, which means the depth of machining, rotational speed, feed rate, and length of cantilever workpiece are 1.0 mm, 2000 rpm, 0.15 mm/rev, and 60 mm, respectively. This set of parameters is then applied to be the input of the SMOGN-CPSO-XGB model, and the obtained predicted value of machining error is 0.005163 mm within 0.002 s. The result is the best of all values of machining error in Table 2. It is confirmed that the SMOGN-CPSO-XGB model is suitable for the prediction in the turning process.
If a set of experimental factors and levels designed by the Taguchi method are input arbitrarily, the predicted machining error of the workpiece can be accurately obtained through the SMOGN-CPSO-XGB model. For the industry, it is very important that when a certain processing parameter changes, the machining error change of the workpiece needs to be quickly understood, which can be used as a reference for parameter adjustment. It can also instantly understand the impact of machining parameter changes on the workpiece accuracy.

Conclusions
This study has proposed a machine learning approach for predicting the machining precision of the turning process based on the values assigned to the machining depth, rotational speed, feed rate, and cantilever cutting length, respectively. Based on the experimental results obtained from the Taguchi factorial design experiment, the contributions of this study are summarized as follows.